Summary of the ANIMAL-10N Dataset

Number of Training Images: 50,000    Attribute Characteristics: Real          Missing Values: No
Number of Testing Images: 5,000      Data Set Characteristics: Multivariate   Date Created: April 2019
Number of Image Labels: 10           Resolution: 64x64 (RGB)                  Area: Animal

The ANIMAL-10N dataset contains 5 pairs of easily confused animals, with a total of 55,000 images. The 5 pairs are as follows: (cat, lynx), (jaguar, cheetah), (wolf, coyote), (chimpanzee, orangutan), (hamster, guinea pig).
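
As a small illustration of the pair structure, the sketch below encodes the five pairs and looks up the class that a given label is most easily confused with. The names CONFUSING_PAIRS, CLASS_NAMES, PARTNER and confusing_partner are hypothetical helpers, and the ordering of the classes is illustrative only; it is not claimed to match the label indices used in the dataset files.

```python
# The five confusing pairs from the summary. The order here is illustrative
# and is NOT the dataset's official label-index mapping (an assumption).
CONFUSING_PAIRS = [
    ("cat", "lynx"),
    ("jaguar", "cheetah"),
    ("wolf", "coyote"),
    ("chimpanzee", "orangutan"),
    ("hamster", "guinea pig"),
]

# Flatten the pairs into the list of 10 class names.
CLASS_NAMES = [name for pair in CONFUSING_PAIRS for name in pair]

# Map each class to the other member of its pair.
PARTNER = {}
for first, second in CONFUSING_PAIRS:
    PARTNER[first] = second
    PARTNER[second] = first


def confusing_partner(label: str) -> str:
    """Return the class that `label` is paired with."""
    return PARTNER[label]


if __name__ == "__main__":
    print(len(CLASS_NAMES))
    print(confusing_partner("lynx"))
```

A lookup like this may be handy when inspecting a confusion matrix, since most label noise is expected to fall within a pair rather than across pairs.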
The images were crawled from several online search engines, including Bing and Google, using the predefined labels as the search keywords. The images were then classified by 15 recruited participants (10 undergraduate and 5 graduate students); each participant annotated a total of 6,000 images, with 600 images per class.
After irrelevant images were removed, the training set contains 50,000 images and the test set contains 5,000 images. The noise rate (mislabeling ratio) of the dataset is about 8%. For more information, please refer to the paper.
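
The figures above can be cross-checked with a few lines of arithmetic. This is only a rough sketch: the constant names are hypothetical, the 8% noise rate is approximate, and applying it to the training split alone is an assumption, since the summary states the rate for "the dataset" without further detail.

```python
# Figures taken from the summary above.
NUM_TRAIN = 50_000
NUM_TEST = 5_000
NUM_CLASSES = 10
NUM_ANNOTATORS = 15
IMAGES_PER_ANNOTATOR = 6_000
IMAGES_PER_CLASS_PER_ANNOTATOR = 600
NOISE_RATE = 0.08  # "about 8%"; an approximate value

# Total number of images across both splits.
total_images = NUM_TRAIN + NUM_TEST

# Per-annotator workload: 600 images for each of the 10 classes.
per_annotator = IMAGES_PER_CLASS_PER_ANNOTATOR * NUM_CLASSES

# Total number of annotations produced by all participants.
total_annotations = NUM_ANNOTATORS * IMAGES_PER_ANNOTATOR

# Rough count of mislabeled training images, ASSUMING the noise rate
# applies to the training split.
approx_noisy_train = round(NUM_TRAIN * NOISE_RATE)

if __name__ == "__main__":
    print(total_images, per_annotator, total_annotations, approx_noisy_train)
```

Under these assumptions, roughly 4,000 of the 50,000 training labels would be expected to be wrong. The summary does not say how the participants' annotations were distributed across images, so the total annotation count is reported here without further interpretation.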