[Paper Notes] IP102: A Large-Scale Benchmark Dataset for Insect Pest Recognition

Dataset Overview
The IP102 dataset contains more than 75,000 images belonging to 102 categories and exhibits a natural long-tailed distribution. In addition, we annotate 19,000 images with bounding boxes for object detection. IP102 has a hierarchical taxonomy: insect pests that mainly affect one specific agricultural product are grouped into the same upper-level category.

Download:
https://github.com/xpwu95/IP102

Dataset characteristics:
It exhibits a natural long-tailed distribution.
It poses the challenges of inter- and intra-class variance and data imbalance.
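
To make "long-tailed" and "data imbalance" concrete, here is a minimal sketch of how one might summarize the class distribution of a labeled image list. The function name `imbalance_summary` and the toy labels are assumptions for illustration only; the numbers are not IP102 statistics.

```python
from collections import Counter


def imbalance_summary(labels):
    """Summarize how unevenly samples are spread across classes."""
    counts = Counter(labels)
    ordered = counts.most_common()  # sorted from most to least frequent
    largest = ordered[0][1]
    smallest = ordered[-1][1]
    return {
        "num_classes": len(counts),
        "largest_class": largest,
        "smallest_class": smallest,
        # A ratio far above 1 indicates a head-heavy, long-tailed split.
        "imbalance_ratio": largest / smallest,
    }


# Toy labels made up for illustration (not taken from IP102).
toy_labels = [0] * 50 + [1] * 20 + [2] * 5 + [3] * 1
summary = imbalance_summary(toy_labels)
print(summary)
```

Sorting the per-class counts in descending order and plotting them is a common way to see the tail visually; the ratio above is just a one-number shortcut.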

Insect Pest Dataset

The dataset is built in four stages: 1) taxonomic system establishment, 2) image collection, 3) preliminary data filtering, and 4) professional data annotation.

Taxonomic System Establishment
We invite several agricultural experts to discuss the common categories of insect pests that exist in daily life.
Figure 4. Taxonomy of the IP102 dataset. The ‘FC’ and ‘EC’ denote the field and economic crops, respectively. On the sub-class level, only 35 classes are shown. The full list of each sub-class can be found in the released IP102 dataset.
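
The hierarchy described above (super-class FC/EC, then crop, then pest sub-class) could be held in a nested mapping. The sketch below is only an illustration: the crop and pest names are hypothetical placeholders, not entries from the released dataset, and `build_parent_index` is a helper name introduced here.

```python
# Hypothetical miniature taxonomy: all names are placeholders,
# not actual IP102 crops or pest classes.
TAXONOMY = {
    "FC": {  # field crops
        "crop_a": ["pest_1", "pest_2"],
        "crop_b": ["pest_3"],
    },
    "EC": {  # economic crops
        "crop_c": ["pest_4", "pest_5"],
    },
}


def build_parent_index(taxonomy):
    """Map each pest sub-class to its (super_class, crop) ancestors."""
    index = {}
    for super_class, crops in taxonomy.items():
        for crop, pests in crops.items():
            for pest in pests:
                index[pest] = (super_class, crop)
    return index


parents = build_parent_index(TAXONOMY)
print(parents["pest_4"])
```

A reverse index like this makes it easy to evaluate a classifier at the crop or super-class level as well as at the sub-class level.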

Image Collection
We utilize the Internet as the primary source to collect images.

As a consequence, we collect more than 300,000 candidate images for the IP102 dataset.

Preliminary Data Filtering
We organize 6 volunteers to manually filter the candidate images.
Before filtering, the volunteers are trained on:

  1. common knowledge about insect pests, provided by agricultural experts,
  2. the taxonomic system of the IP102,
  3. different forms of insect pests.

The volunteers delete images that contain no insect pest or more than one insect pest category, as illustrated in Fig. 2.
Then, we convert the format of filtered images to JPEG and delete the images which are repeated or damaged.
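
The notes do not say how repeated or damaged images are detected, so the following is only a rough sketch under stated assumptions: exact duplicates are found by a SHA-256 content hash, and a file is treated as damaged when it lacks the JPEG start/end markers. Both rules are assumptions made here, `looks_damaged` and `clean_folder` are hypothetical helper names, and the JPEG conversion step itself is omitted because it needs an imaging library.

```python
import hashlib
import tempfile
from pathlib import Path

JPEG_START = b"\xff\xd8"  # start-of-image marker
JPEG_END = b"\xff\xd9"    # end-of-image marker


def looks_damaged(data: bytes) -> bool:
    """Heuristic only: flag files missing the JPEG start or end marker."""
    return not (data.startswith(JPEG_START) and data.endswith(JPEG_END))


def clean_folder(folder: Path):
    """Delete damaged files and exact duplicates; return (kept, removed) names."""
    seen_hashes = set()
    kept, removed = [], []
    for path in sorted(folder.glob("*.jpg")):
        data = path.read_bytes()
        digest = hashlib.sha256(data).hexdigest()
        if looks_damaged(data) or digest in seen_hashes:
            path.unlink()
            removed.append(path.name)
        else:
            seen_hashes.add(digest)
            kept.append(path.name)
    return kept, removed


# Demo on throwaway files with fake contents (not real images).
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    fake_jpeg = JPEG_START + b"payload" + JPEG_END
    (folder / "a.jpg").write_bytes(fake_jpeg)
    (folder / "b.jpg").write_bytes(fake_jpeg)            # exact duplicate of a.jpg
    (folder / "c.jpg").write_bytes(JPEG_START + b"cut")  # truncated file
    kept, removed = clean_folder(folder)

print(kept, removed)
```

A content hash only catches byte-identical copies; resized or re-encoded duplicates would need a perceptual comparison, and a proper damage check would try to decode each image rather than inspect its markers.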

Professional Data Annotation
For each crop, we invite a corresponding agricultural expert who primarily studies it.

We build a Question/Answer (Q/A) system for convenient annotation.

Statistics


Experiments:

Handcrafted features:
Deep learning methods:
Qualitative analysis: