[Paper reading] Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks
2022-07-25 20:24:00 【xiongxyowo】
[Paper][Code][ICCV 17]
Abstract
Image-to-image translation is a class of vision and graphics problems where the goal is to learn the mapping between an input image and an output image using a training set of aligned image pairs. However, for many tasks, paired training data is not available. We present an approach for learning to translate an image from a source domain $X$ to a target domain $Y$ in the absence of paired examples. Our goal is to learn a mapping $G: X \rightarrow Y$ such that, under an adversarial loss, the distribution of images from $G(X)$ is indistinguishable from the distribution $Y$. Because this mapping is highly under-constrained, we couple it with an inverse mapping $F: Y \rightarrow X$ and introduce a cycle consistency loss to push $F(G(X)) \approx X$ (and vice versa). Qualitative results are presented on several tasks where paired training data does not exist, including collection style transfer, object transfiguration, season transfer, photo enhancement, and so on. Quantitative comparisons against several prior methods demonstrate the superiority of our approach.
Method
This paper is the well-known CycleGAN. The core idea of the method is as follows:
CycleGAN consists of two generators $(G, F)$ and two discriminators $(D_X, D_Y)$. An input source-domain image $X$ is sent to the first generator $G$, which produces a fake target-domain image $G(X)$. The discriminator $D_Y$ must distinguish real target-domain images $Y$ from fake target-domain images $G(X)$, which pushes the style features of the generated $G(X)$ to be more realistic. At the same time, with the roles of the two domains swapped, a target-domain image $Y$ is sent to the generator $F$, which produces a fake source-domain image $F(Y)$. The discriminator $D_X$ must distinguish real source-domain images $X$ from fake source-domain images $F(Y)$, which pushes the style features of the generated $F(Y)$ to be more realistic.
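The data flow through the four networks can be sketched as follows. This is only an illustration under loud assumptions: images are flat lists of brightness values, domain X is "dark" images and domain Y is "bright" ones, and `G`, `F`, `D_X`, `D_Y` are hand-written stand-ins (hypothetical, not the convolutional networks used in the paper).

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def mean(img):
    return sum(img) / len(img)

# Hypothetical stand-ins for the two generators.
def G(img):
    """Generator X -> Y: brighten every pixel (clipped to 1.0)."""
    return [min(1.0, p + 0.5) for p in img]

def F(img):
    """Generator Y -> X: darken every pixel (clipped to 0.0)."""
    return [max(0.0, p - 0.5) for p in img]

# Hypothetical stand-ins for the two discriminators.
def D_Y(img):
    """Probability that img is a real bright (domain Y) image."""
    return sigmoid(10.0 * (mean(img) - 0.5))

def D_X(img):
    """Probability that img is a real dark (domain X) image."""
    return sigmoid(10.0 * (0.5 - mean(img)))

def forward_pass(x, y):
    """Run one unpaired sample from each domain through all four networks."""
    fake_y = G(x)  # fake target-domain image
    fake_x = F(y)  # fake source-domain image
    return {
        "fake_y": fake_y,
        "fake_x": fake_x,
        "d_y_real": D_Y(y),
        "d_y_fake": D_Y(fake_y),
        "d_x_real": D_X(x),
        "d_x_fake": D_X(fake_x),
    }

x = [0.0, 0.25]   # a "dark" source-domain image
y = [0.75, 1.0]   # an unrelated "bright" target-domain image
out = forward_pass(x, y)
```

Note that `x` and `y` are never matched to each other: each discriminator only compares real and generated images within its own domain.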
The advantage of this design is as follows. The image style translation task in this paper is "unsupervised": there are no matched source-target image pairs, so the adversarial setup alone can only constrain whether the generated image matches the new style, and cannot constrain whether its content stays consistent. With the cycle structure, an input image $X$ first becomes $G(X)$ and then $F(G(X))$. Constraining $F(G(X))$ to be as similar as possible to $X$ forces the network to preserve as much detail as possible while it learns to change the style, which amounts to a form of "self-supervision".
The loss function consists of two parts. One is the adversarial loss, which constrains the image style so that the translation is completed:
$$\mathcal{L}_{\text{GAN}}(G, D_Y, X, Y) = \mathbb{E}_{y\sim p_{\text{data}}(y)}[\log D_Y(y)] + \mathbb{E}_{x\sim p_{\text{data}}(x)}[\log(1 - D_Y(G(x)))]$$
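On a mini-batch, the two expectations in the adversarial loss become sample means. The sketch below is a minimal illustration, assuming the discriminator outputs are already available as plain lists of probabilities; the function name `gan_loss` and the `eps` stabiliser are assumptions made for this example, not part of the paper.

```python
import math

def gan_loss(d_real, d_fake, eps=1e-12):
    """Batch estimate of L_GAN(G, D_Y, X, Y).

    d_real: values D_Y(y) for real target-domain images, each in [0, 1].
    d_fake: values D_Y(G(x)) for translated images, each in [0, 1].
    eps:    small constant (an assumption here) that avoids log(0).
    """
    real_term = sum(math.log(p + eps) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p + eps) for p in d_fake) / len(d_fake)
    return real_term + fake_term

# A discriminator that cannot tell real from fake outputs 0.5 everywhere.
confused = gan_loss([0.5, 0.5], [0.5, 0.5])
# A perfect discriminator scores real images 1.0 and fake images 0.0.
perfect = gan_loss([1.0, 1.0], [0.0, 0.0])
```

The discriminator $D_Y$ tries to increase this value (its maximum is 0, reached by the perfect discriminator), while the generator $G$ tries to decrease it by making $D_Y(G(x))$ large.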
Any GAN-based style translation needs this loss, so it requires no further comment. The other is the cycle consistency loss, which constrains the content to stay consistent:
$$\mathcal{L}_{\text{cyc}}(G, F) = \mathbb{E}_{x\sim p_{\text{data}}(x)}[\Vert F(G(x)) - x \Vert_1] + \mathbb{E}_{y\sim p_{\text{data}}(y)}[\Vert G(F(y)) - y \Vert_1]$$
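The cycle consistency loss can be sketched in the same way. The example assumes images are flat lists of floats and uses toy generators (`G`, `F_inverse`, `F_collapse` are hypothetical names for this illustration only) to show that the loss vanishes when $F$ undoes $G$ and grows when $F$ discards the content.

```python
def l1_distance(a, b):
    """L1 norm of the difference between two flattened images."""
    return sum(abs(u - v) for u, v in zip(a, b))

def cycle_loss(G, F, xs, ys):
    """Batch estimate of L_cyc(G, F) over unpaired batches xs and ys."""
    forward = sum(l1_distance(F(G(x)), x) for x in xs) / len(xs)
    backward = sum(l1_distance(G(F(y)), y) for y in ys) / len(ys)
    return forward + backward

# Toy generators (assumptions for illustration only).
def G(img):
    return [p + 1.0 for p in img]      # X -> Y: shift every pixel up

def F_inverse(img):
    return [p - 1.0 for p in img]      # Y -> X: exactly undoes G

def F_collapse(img):
    return [0.0 for _ in img]          # Y -> X: throws the content away

xs = [[0.5, 1.0]]
ys = [[2.0, 3.0]]
consistent = cycle_loss(G, F_inverse, xs, ys)
inconsistent = cycle_loss(G, F_collapse, xs, ys)
```

In the CycleGAN paper the full objective adds the adversarial loss for each direction, $\mathcal{L}_{\text{GAN}}(G, D_Y, X, Y)$ and $\mathcal{L}_{\text{GAN}}(F, D_X, Y, X)$, to this cycle term weighted by a factor $\lambda$.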
For this kind of "unsupervised" image style translation, the upper bound on quality is the "supervised" form, such as Pix2Pix. One of CycleGAN's main problems is that it cannot handle geometric changes: the cycle consistency loss pushes the image content to stay as unchanged as possible during translation to the target domain, so the model tends toward "cat => cat => cat" and finds "cat => dog => cat" hard.