【CVPR 2021】DatasetGAN: Efficient Labeled Data Factory with Minimal Human Effort
2022-06-26 09:24:00 【_ Summer tree】

Quick summary
DatasetGAN:
An automatic procedure for generating massive datasets of high-quality semantically segmented images with minimal human effort. Only a handful of labeled examples are needed to train a decoder that generalizes to the rest of the latent space, which yields an infinite annotated-data generator. The resulting datasets can then be used to train any computer vision architecture.
Annotation cost is the bottleneck for scaling up datasets.
Our goal is to synthesize large, high-quality labeled datasets from only a handful of labeled examples.
In our work, we show that the latest state-of-the-art image generation models learn extremely powerful latent representations that can be leveraged for complex pixel-level tasks.
We introduce DatasetGAN, which generates massive datasets of high-quality semantically segmented images with minimal human effort.
Key to our approach is the observation that GANs trained to synthesize images must acquire rich semantic knowledge in order to render diverse and realistic examples of objects.
Our key insight is that training a successful decoder requires only a small number of labeled images, which yields an infinite annotated dataset generator.
Because only a few examples need to be labeled, we annotate the images in extreme detail and generate datasets with rich object and part segmentations.
We generate datasets for 7 image segmentation tasks, including pixel-level labels for 34 human face parts and 32 car parts. Our approach significantly outperforms all semi-supervised baselines and is on par with fully supervised methods, while in some cases requiring two orders of magnitude less annotated data.
In our work, we showcase 3D reconstruction of animatable objects, where we use our method to generate detailed part labels.

Figure 1: DATASETGAN synthesizes image-annotation pairs and can produce large, high-quality datasets with detailed pixel-level labels. The figure shows the 4 steps. (1, 2) Leverage StyleGAN and annotate only a handful of synthesized images, then train a highly effective branch to generate labels. (3) Automatically generate a huge synthetic dataset of annotated images. (4) Train your favorite approach on the synthetic dataset and test it on real images.
Figure 2: Overall architecture of DATASETGAN. We upsample the feature maps from StyleGAN to the highest resolution to construct pixel-wise feature vectors for all pixels of the synthesized image. An ensemble of MLP classifiers is then trained to interpret the semantic knowledge in each pixel's feature vector into its part label.
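The feature construction described in the Figure 2 caption can be illustrated with a small, dependency-free sketch. This is not the authors' code: the `fmap[c][y][x]` layout, the function names, and the use of nearest-neighbor upsampling are all assumptions made for illustration (the caption only says the features are upsampled to the highest resolution).

```python
def upsample_nearest(fmap, out_h, out_w):
    """Nearest-neighbor upsample of one layer's feature map, stored as fmap[c][y][x]."""
    in_h, in_w = len(fmap[0]), len(fmap[0][0])
    return [
        [[chan[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
         for y in range(out_h)]
        for chan in fmap
    ]


def pixel_feature_vectors(feature_maps, out_h, out_w):
    """Upsample every layer and concatenate all channels per pixel.

    Returns feats[y][x], a list whose length is the total channel count.
    """
    upsampled = [upsample_nearest(f, out_h, out_w) for f in feature_maps]
    return [
        [[chan[y][x] for layer in upsampled for chan in layer]
         for x in range(out_w)]
        for y in range(out_h)
    ]


# Two hypothetical generator layers: 1 channel at 1x1 and 2 channels at 2x2.
coarse = [[[5.0]]]
fine = [[[1.0, 2.0], [3.0, 4.0]], [[0.1, 0.2], [0.3, 0.4]]]
feats = pixel_feature_vectors([coarse, fine], out_h=2, out_w=2)
print(feats[1][0])  # feature vector of the pixel at row 1, column 0
```

Each pixel ends up with one vector that mixes coarse and fine generator features, which is what a per-pixel classifier would take as input.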
Figure 3: Small human-annotated face and car datasets. Most datasets used for semantic segmentation (MS-COCO [33], ADE [56], Cityscapes [11]) are too large for a user to inspect every training image. In this figure, we show all labeled training examples for face (a-c) and car (d-f) segmentation. a) shows an example segmentation mask and its associated labels, b) shows the full collection of training images (GAN samples), c) shows the list of annotated parts and the number of instances in the dataset. As a fun fact, note that there are more labels in a single image than there are images in the dataset.

Figure 4: Examples of synthesized images and labels for faces and cars from DATASETGAN. The StyleGAN backbone was trained on CelebA-HQ (faces) images at 1024×1024 resolution and LSUN CAR (cars) images at 512×384 resolution. DATASETGAN was trained on 16 annotated examples. // Which labels were annotated here?

Figure 5: Examples of synthesized images and labels for birds, cats, and bedrooms from DATASETGAN. StyleGAN was trained on NABirds (1024×1024 images), LSUN CAT (256×256 images), and LSUN Bedroom (256×256 images). DATASETGAN was trained on 30 annotated bird examples, 30 cats, and 40 bedrooms.

Figure 6: Number of training examples vs. mIOU. We compare against baselines on the ADE-Car-12 test set. The red dashed line represents the fully supervised method, which uses 2.6k training examples from ADE20k. // What is mIOU?
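On the question in the caption above: mIOU is mean intersection-over-union, the standard segmentation metric. For each class, you divide the number of pixels where prediction and ground truth both have that class by the number of pixels where either does, then average over classes. The sketch below is a plain-Python illustration. The function name and the choice to skip classes absent from both masks are assumptions, not necessarily the exact evaluation protocol used in the paper.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over classes present in pred or target.

    pred and target are flat sequences of integer class ids, one per pixel.
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes that appear in neither mask
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
print(mean_iou(pred, target, num_classes=3))
```

In this toy example the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5.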

Method
The key insight of DATASETGAN is that generative models such as GANs that are trained to synthesize highly realistic images must acquire semantic knowledge in their high dimensional latent space.
DATASETGAN aims to utilize these powerful properties of image GANs. Intuitively, if a human provides a labeling corresponding to one latent code, we expect to be able to effectively propagate this labeling across the GAN's latent space.
Specifically, we synthesize a small number of images by utilizing a GAN architecture, StyleGAN in our paper, and record their corresponding latent feature maps.
By sampling latent codes z and passing each through the entire architecture, we have an infinite dataset generator!
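The "infinite dataset generator" idea can be sketched as a Python generator that keeps sampling latent codes and emitting image-label pairs. Everything here is a toy stand-in rather than the paper's implementation: `toy_generator` is a hypothetical placeholder for StyleGAN, the threshold lambdas stand in for trained MLP classifiers, and combining the ensemble by majority vote is an assumption of this sketch.

```python
import random
from collections import Counter
from itertools import islice


def toy_generator(z):
    """Hypothetical stand-in for a GAN: maps a latent code to (image, features).

    The 'image' is a flat list of pixel values and each pixel's feature
    vector is a single number; a real generator would return feature maps.
    """
    image = [round(abs(v), 3) for v in z]
    features = [[v] for v in z]
    return image, features


def majority_vote(classifiers, feature):
    """Label one pixel by majority vote over an ensemble of classifiers."""
    votes = Counter(clf(feature) for clf in classifiers)
    return votes.most_common(1)[0][0]


def dataset_stream(classifiers, latent_dim, seed=0):
    """Yield an endless stream of (image, label_mask) pairs."""
    rng = random.Random(seed)
    while True:
        z = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]  # sample a latent code
        image, features = toy_generator(z)
        mask = [majority_vote(classifiers, f) for f in features]
        yield image, mask


# Three toy threshold "classifiers" standing in for trained MLPs.
ensemble = [lambda f, t=t: int(f[0] > t) for t in (-0.1, 0.0, 0.1)]
samples = list(islice(dataset_stream(ensemble, latent_dim=4), 3))
for image, mask in samples:
    print(image, mask)
```

Because the stream never ends, the consumer decides how many pairs to draw (here three, via `islice`), mirroring how a fixed-size synthetic training set would be cut from an unbounded generator.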
This video explanation is quite good: https://www.bilibili.com/video/av502581865/