ImageNet is an extensive image database that has been instrumental in advancing computer vision and deep learning research. It contains more than 14 million hand-annotated images classified into more than 20,000 categories. For at least one million of the images, bounding boxes are also provided as detection labels.
Since 2010, an annual computer vision contest based on ImageNet has been held, the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), in which software programs compete to correctly classify and detect objects and scenes. The challenge uses a trimmed list of one thousand non-overlapping classes.
On 30 September 2012, a convolutional neural network (CNN) called AlexNet achieved a top-5 error of 15.3% in the ImageNet 2012 Challenge, more than 10.8 percentage points lower than that of the runner-up, drawing worldwide attention to the potential of deep learning for computer vision applications [1].
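The top-5 error counts a prediction as correct when the true class appears among the five highest-scoring classes. The sketch below illustrates that metric in plain Python. The function name `top_k_error` and the toy scores are assumptions made for illustration and are not part of any official ILSVRC evaluation code.

```python
def top_k_error(scores, labels, k=5):
    """Fraction of samples whose true label is NOT among the k highest-scoring classes.

    scores: list of per-sample lists, one score per class (hypothetical model outputs).
    labels: list of true class indices, one per sample.
    """
    misses = 0
    for row, label in zip(scores, labels):
        # Class indices ordered from highest to lowest score, truncated to k.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        if label not in top_k:
            misses += 1
    return misses / len(labels)


# Toy example with 4 classes and 2 samples (made-up numbers).
toy_scores = [
    [0.5, 0.3, 0.1, 0.1],  # true class 1 is the second-highest score
    [0.1, 0.2, 0.3, 0.4],  # true class 0 is the lowest score
]
toy_labels = [1, 0]

print(top_k_error(toy_scores, toy_labels, k=2))
```

With `k=5` and the 1,000 challenge classes, the same function would give the top-5 error quoted above, provided the scores come from a model evaluated on the challenge images.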
Pretrained weights for several CNNs that won or placed highly in the competition (such as AlexNet and VGG-16) are distributed with most deep learning platforms, which makes transfer learning fast and easy to implement.
The data is available for free to researchers for non-commercial use.