Few-shot learning data sets

2022-06-25 05:03:00 MondayCat111

Article reprinted from: https://blog.csdn.net/qq_36104364/article/details/107508592

This article collects the few-shot (small sample) data sets most commonly used in recent years and gives, for each one, a short introduction, a reference and a download address. All the resources I have are uploaded to Baidu cloud disk; for the other data sets, the official download addresses are given (some may require a VPN to access). A brief summary table of the data sets is given at the end.

1. Omniglot

The Omniglot data set consists of 1,623 handwritten characters from 50 different alphabets, and each character is written by 20 different people. This makes it a few-shot handwritten character data set with a very large number of classes (1,623) but very few samples per class (20). In practice, 1,200 characters are usually used as the training set and the remaining 423 characters as the validation set; the data set is expanded by rotating the images by 90°, 180° and 270°, and every image is resized to a uniform 28*28.
Reference: Lake B, Salakhutdinov R, Gross J, et al. One shot learning of simple visual concepts[C]//Proceedings of the annual meeting of the cognitive science society. 2011, 33(33).
Download address: https://pan.baidu.com/s/19Y5aGfa-lNEZTDUeL1jP4g
Extraction code: 4y3z
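
The rotation-based expansion described above can be sketched roughly as follows. This is only an illustration on random toy arrays, not the original authors' pipeline: the function name `augment_with_rotations` is hypothetical, and treating every rotated character as a new class is an assumption (a common convention, but not something this article states).

```python
import numpy as np

def augment_with_rotations(images, labels, num_classes):
    """Return images and labels extended with 90, 180 and 270 degree rotations.

    images: array of shape (N, H, W); labels: integer array of shape (N,).
    Assumption: each rotation of a character counts as a new class,
    so the number of classes grows by a factor of 4.
    """
    all_images, all_labels = [], []
    for k in range(4):  # k quarter-turns: 0, 90, 180, 270 degrees
        all_images.append(np.rot90(images, k=k, axes=(1, 2)))
        all_labels.append(labels + k * num_classes)
    return np.concatenate(all_images), np.concatenate(all_labels)

# Toy stand-in for Omniglot: 3 characters, 20 drawings each, 28*28 pixels.
rng = np.random.default_rng(0)
toy_images = rng.random((3 * 20, 28, 28))
toy_labels = np.repeat(np.arange(3), 20)

aug_images, aug_labels = augment_with_rotations(toy_images, toy_labels, num_classes=3)
print(aug_images.shape, len(np.unique(aug_labels)))
```

If the rotations should stay in the original class instead, drop the `k * num_classes` offset and keep the labels unchanged.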

2. miniImageNet

The miniImageNet data set consists of 60,000 images selected from ImageNet, covering 100 classes with 600 images per class; each image is 84*84. In practice, the images of 80 classes are usually used as the training set and the images of the remaining 20 classes as the validation set. Some papers instead divide it into base classes (Base Class, 64 classes), validation classes (Validation Class, 16 classes) and novel classes (Novel Class, 20 classes).
Reference: Vinyals O, Blundell C, Lillicrap T, et al. Matching networks for one shot learning[C]//Advances in neural information processing systems. 2016: 3630-3638.
Download address: https://pan.baidu.com/s/1nqBSA1w5mQuhlrQeCY4HgA
Extraction code: ajrz
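
The key property of the 64/16/20 division mentioned above is that the three groups share no classes. A minimal sketch of such a class-level split is shown below; the helper `split_classes` and the placeholder class names are hypothetical, and the random shuffle is an assumption made for illustration (published miniImageNet splits use fixed class lists rather than a random draw).

```python
import random

def split_classes(class_names, sizes, seed=0):
    """Shuffle class names and cut them into disjoint groups of the given sizes."""
    if sum(sizes.values()) != len(class_names):
        raise ValueError("split sizes must add up to the number of classes")
    shuffled = sorted(class_names)
    random.Random(seed).shuffle(shuffled)  # local RNG, reproducible for a given seed
    splits, start = {}, 0
    for name, size in sizes.items():
        splits[name] = shuffled[start:start + size]
        start += size
    return splits

# Placeholder names standing in for the 100 miniImageNet classes.
classes = [f"class_{i:03d}" for i in range(100)]
splits = split_classes(classes, {"base": 64, "validation": 16, "novel": 20})
print({name: len(members) for name, members in splits.items()})
```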

3. tieredImageNet

The tieredImageNet data set is also selected from ImageNet. It contains 34 broad categories (Categories), each of which contains 10-30 classes (Classes), and each class has a varying number of image samples: 608 classes and 779,165 images in total (about 1,281 images per class on average). The 34 categories are divided into a training set (20 categories), a validation set (6 categories) and a test set (8 categories).
[Figure: division of the tieredImageNet data set]

Reference: Ren M, Triantafillou E, Ravi S, et al. Meta-learning for semi-supervised few-shot classification[J]. arXiv preprint arXiv:1803.00676, 2018.
Download address: https://drive.google.com/uc?export=download&confirm=_SLS&id=1g1aIDy2Ar_MViF2gDXFYDBTR-HYecV07

4. CUB-200

The full name of the CUB-200 data set is Caltech-UCSD Birds-200-2011. It is a bird image database provided by the California Institute of Technology, containing 11,788 images of 200 bird species. In practice, it is usually divided into a training set (100 classes), a validation set (50 classes) and a test set (50 classes), and the images are uniformly cropped to 84*84.
Reference: Catherine Wah, Steve Branson, Peter Welinder, Pietro Perona, and Serge Belongie. The Caltech-UCSD Birds-200-2011 dataset. 2011.
Download address: https://pan.baidu.com/s/1DEmLxePvDuJX1goSzM9r6Q
Extraction code: f1l5

5. CIFAR-FS

The full name of the CIFAR-FS data set is CIFAR100 Few-Shots. It is derived from the CIFAR 100 data set and contains 100 classes with 600 images per class, 60,000 images in total. In practice, it is usually divided into a training set (64 classes), a validation set (16 classes) and a test set (20 classes), and all images are 32*32.
Reference: Bertinetto L, Henriques J F, Torr P H S, et al. Meta-learning with differentiable closed-form solvers[J]. arXiv preprint arXiv:1805.08136, 2018.
Download address: https://pan.baidu.com/s/1HqRUw3dmsMBInt_Fh3J_Uw
Extraction code: ub38

6. ImageNet-1K Challenge

The ImageNet-1K Challenge data set is also derived from ImageNet and contains 1,000 classes. In practice, it is usually divided into base classes (389 classes) and novel classes (611 classes).
Reference: Hariharan B, Girshick R. Low-shot visual recognition by shrinking and hallucinating features[C]//Proceedings of the IEEE International Conference on Computer Vision. 2017: 3018-3027.
Download address: http://www.image-net.org/

7. FC100

The full name of the FC100 data set is Few-shot CIFAR100. Like CIFAR-FS above, it is derived from the CIFAR100 data set and contains 100 classes with 600 images per class, 60,000 images in total. The difference is that FC100 is not divided by class (Class) but by superclass (Superclass). It contains 20 superclasses (100 classes): the training set has 12 superclasses (60 classes), the validation set 4 superclasses (20 classes) and the test set 4 superclasses (20 classes).
Reference: Oreshkin B, López P R, Lacoste A. Tadam: Task dependent adaptive metric for improved few-shot learning[C]//Advances in Neural Information Processing Systems. 2018: 721-731.
Download address: https://pan.baidu.com/s/1Wnlp1-obKsMLcHITYQ1CLg
Extraction code: kcd6
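
A superclass-level division like the one described above can be sketched as follows. CIFAR100 groups its 100 classes into 20 superclasses of 5 classes each, so assigning 12, 4 and 4 whole superclasses yields 60, 20 and 20 classes. The function `split_by_superclass`, the placeholder names and the choice of which superclasses go to which split are all assumptions made for illustration; the actual FC100 assignment is fixed by its authors.

```python
def split_by_superclass(superclass_to_classes, counts):
    """Assign whole superclasses to splits and return the classes of each split."""
    names = sorted(superclass_to_classes)
    if sum(counts.values()) != len(names):
        raise ValueError("split counts must add up to the number of superclasses")
    result, start = {}, 0
    for split, n in counts.items():
        chosen = names[start:start + n]
        # A class follows its superclass, so no superclass spans two splits.
        result[split] = [c for s in chosen for c in superclass_to_classes[s]]
        start += n
    return result

# Placeholder hierarchy: 20 superclasses, 5 classes each (as in CIFAR100).
hierarchy = {
    f"super_{s:02d}": [f"class_{5 * s + j:03d}" for j in range(5)]
    for s in range(20)
}
fc100_like = split_by_superclass(hierarchy, {"train": 12, "validation": 4, "test": 4})
print({split: len(members) for split, members in fc100_like.items()})
```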

Summary table of few-shot data sets

Data set          Source       Number of classes   Number of images   Image size
Omniglot          -            1623                32,460             28*28
miniImageNet      ImageNet     100                 60,000             84*84
tieredImageNet    ImageNet     608                 779,165            84*84
ImageNet 1K       ImageNet     1000                -                  -
CIFAR-FS          CIFAR 100    100                 60,000             32*32
FC100             CIFAR 100    100                 60,000             32*32
CUB-200           -            200                 11,788             84*84
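
As a quick sanity check, the totals in the table can be recomputed from the per-class figures given in the sections above. This is just an arithmetic sketch; the variable names are illustrative, and the average for tieredImageNet is rounded down.

```python
# (number of classes, images per class, total images) as stated in the article
per_class_figures = {
    "Omniglot": (1623, 20, 32460),
    "miniImageNet": (100, 600, 60000),
    "CIFAR-FS": (100, 600, 60000),
    "FC100": (100, 600, 60000),
}

# True where classes * images per class reproduces the stated total
consistent = {
    name: classes * per_class == total
    for name, (classes, per_class, total) in per_class_figures.items()
}
print(consistent)

# tieredImageNet: average number of images per class, rounded down
tiered_average = 779165 // 608
print(tiered_average)
```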

8. FewRel data set

FewRel is a relation extraction data set released by Tsinghua University. It contains 100 relations (Relation) and 44,800 instances (Instance, i.e. sentences), and it is a supervised data set.

Download address: https://thunlp.github.io/fewrel.html

GitHub address: https://github.com/thunlp/FewRel

9. Stanford Dogs data set

Download address: https://www.kesci.com/mw/dataset/5d22e94e688d36002c55105f

10. Stanford Cars data set

Download address: http://ai.stanford.edu/~jkrause/cars/car_dataset.html
