Learning Pyramid-Context Encoder Network for High-Quality Image Inpainting: paper notes
2022-07-24 05:00:00 【Magic__ Conch】
IEEE Conference Proceedings arXiv: Computer Vision and Pattern Recognition Jan 2019
Problems solved and improvement
Existing methods fail to combine low-level visual information with deep semantic information.
- Patch-search methods lack an understanding of high-level semantic coherence.
- Generative models built from stacked convolutions and pooling layers tend to produce over-smoothed results that lack visual realism.
Model structure
With U-Net as the backbone, the missing region is filled at both the image level and the feature level.
Pyramid-context encoder: uses a cross-layer attention transfer mechanism together with pyramid filling.

The reconstructed feature $\psi$ at each level is obtained by passing that level's feature map $\phi$ and the reconstructed $\psi$ of the next-higher level through the ATN (written $f$ in the formula).
Attention Transfer Network (ATN) (the $f$ above)
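The top-down recursion described here can be sketched in a few lines. This is only an illustration: `fill_pyramid` and the `atn` callable are hypothetical names, and starting the recursion from the deepest encoder feature is an assumption, since the notes do not say what guides the first step.

```python
from typing import Callable, List, Sequence, TypeVar

Feature = TypeVar("Feature")


def fill_pyramid(
    encoder_features: Sequence[Feature],
    atn: Callable[[Feature, Feature], Feature],
) -> List[Feature]:
    """Return the reconstructed features [psi^1, ..., psi^(L-1)].

    encoder_features holds [phi^1, ..., phi^L], shallowest level first.
    atn(lower, higher) is a hypothetical stand-in for the ATN (f).
    """
    if len(encoder_features) < 2:
        raise ValueError("need at least two encoder levels")
    # Assumption: the deepest encoder feature guides the first filling step.
    higher = encoder_features[-1]
    reconstructed: List[Feature] = []
    # Walk from phi^(L-1) down to phi^1, handing each result to the next level.
    for lower in reversed(encoder_features[:-1]):
        higher = atn(lower, higher)
        reconstructed.append(higher)
    reconstructed.reverse()  # shallowest first, matching the input order
    return reconstructed
```

With a symbolic `atn` such as `lambda lower, higher: f"f({lower},{higher})"`, the nesting of the calls becomes visible: the shallowest level depends on every level above it.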
1. Use the feature map reconstructed from high-level semantics, $\psi^L$, to fill the next-lower feature map $\phi^{L-1}$, which gives that level's reconstructed feature map $\psi^{L-1}$.
First, patches are extracted from $\psi^l$, and the cosine similarity between patches inside and outside the missing region is computed.

A softmax is then applied to the similarities to obtain the attention score of each patch.

Once the attention scores $\alpha_{i,j}^l$ have been obtained from the high-level semantic feature, the missing region of the next-lower feature map is filled with context weighted by those scores.
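Written out, the three steps can be sketched as follows. The index convention is an assumption chosen to match the $\alpha_{i,j}^l$ notation used in these notes: $p_i^l$ is a patch outside the missing region, $p_j^l$ is a patch inside it, and $N$ is the number of outside patches.

```latex
\begin{aligned}
s_{i,j}^{l} &= \left\langle \frac{p_i^{l}}{\lVert p_i^{l} \rVert_2},\; \frac{p_j^{l}}{\lVert p_j^{l} \rVert_2} \right\rangle
  && \text{(cosine similarity)} \\
\alpha_{i,j}^{l} &= \frac{\exp\left(s_{i,j}^{l}\right)}{\sum_{k=1}^{N} \exp\left(s_{k,j}^{l}\right)}
  && \text{(softmax over outside patches)} \\
p_j^{l-1} &= \sum_{i=1}^{N} \alpha_{i,j}^{l}\, p_i^{l-1}
  && \text{(weighted context filling)}
\end{aligned}
```

The scores are computed on level $l$ but applied to the patches of level $l-1$, which is what makes the attention "cross-layer".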

Once all patches have been computed, $\psi^{l-1}$ is obtained (all of the operations above can be formulated as convolutions, which allows end-to-end training).
2. Refinement
Multi-scale context information is aggregated by four groups of dilated convolutions with different rates. This design keeps the structure of the final reconstructed feature consistent with its surroundings and improves inpainting results at test time.
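A minimal sketch of this attention transfer on flattened patches, using plain Python lists to stay dependency-free. The function names are hypothetical, and the explicit loops stand in for the convolutional formulation mentioned above; this is not the authors' implementation.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector, eps: float = 1e-8) -> float:
    """Cosine similarity between two flattened patches."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + eps)


def softmax(scores: Sequence[float]) -> List[float]:
    """Softmax with the maximum subtracted for numerical stability."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_transfer(
    high_outside: Sequence[Vector],
    high_inside: Sequence[Vector],
    low_outside: Sequence[Vector],
) -> List[List[float]]:
    """Fill every missing low-level patch from the known low-level patches.

    Attention weights are computed on the high-level patches and then
    applied to the low-level patches at the same outside locations.
    """
    dim = len(low_outside[0])
    filled = []
    for hole_patch in high_inside:
        scores = [cosine_similarity(ctx, hole_patch) for ctx in high_outside]
        weights = softmax(scores)
        filled.append([
            sum(w * patch[d] for w, patch in zip(weights, low_outside))
            for d in range(dim)
        ])
    return filled
```

When a missing patch is equally similar to every context patch, the result reduces to a plain average of the low-level context patches.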
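To illustrate why different dilation rates capture context at different scales, here is a 1-D sketch in plain Python. The rates `(1, 2, 4, 8)`, the 3-tap kernel, and returning one response per rate are all assumptions for illustration; the notes only state that four groups with different rates are used.

```python
from typing import List, Sequence


def dilated_conv1d(
    signal: Sequence[float], kernel: Sequence[float], rate: int
) -> List[float]:
    """Zero-padded 1-D cross-correlation whose taps sit `rate` samples apart."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = i + (k - half) * rate
            if 0 <= j < len(signal):  # positions outside the signal count as zero
                acc += weight * signal[j]
        out.append(acc)
    return out


def multi_scale_context(
    signal: Sequence[float],
    kernel: Sequence[float],
    rates: Sequence[int] = (1, 2, 4, 8),  # hypothetical rates
) -> List[List[float]]:
    """Return one response per dilation rate, to be aggregated downstream."""
    return [dilated_conv1d(signal, kernel, rate) for rate in rates]
```

Feeding a single impulse through each branch shows the effect: with rate `r`, the three taps respond at offsets `-r`, `0`, and `+r`, so larger rates see further context with the same number of weights.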
multi-scale decoder
- The multi-scale decoder takes the features reconstructed by the ATN and the encoder's latent feature as input.
- The decoder feature maps $\varphi^{L-1}$, $\varphi^{L-2}$, etc. are computed by the following formula.

Here, the reconstructed features generated by the ATN encode lower-level information for the missing region, which helps produce visually realistic results with fine-grained detail. The compact latent feature extracted by convolution can synthesize novel objects when no suitable object can be found outside the missing region.
Semantic consistency relies on the deep convolutional features, while texture consistency relies on the shallow features reconstructed by the ATN.
- Pyramid L1 losses

An adversarial training loss
The adversarial loss consists of a generator term and a discriminator term.
- PatchGAN (from Image-to-Image Translation with Conditional Adversarial Networks) is used as the discriminator, with spectral normalization to stabilize training.
- The pyramid-context encoder and the multi-scale decoder together form the generator.
Definition of the loss functions:
The generator's final prediction $z$ is defined as:
$$z = G(x \odot (1 - M), M) \odot M + x \odot (1 - M)$$
The discriminator's adversarial loss can be expressed as:

The generator's adversarial loss is:

PEN-Net is optimized by minimizing the adversarial loss together with the pyramid L1 loss (introduced at the end of the previous section); the overall objective combines the two.
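A minimal sketch of these definitions in plain Python. The composition step follows the formula for $z$ given above, with $M = 1$ inside the missing region. The hinge form of the two adversarial losses is an assumption (it is a common pairing with spectral normalization), and the weights `lambda_adv` and `lambda_pyramid` are hypothetical parameters, not values from the notes.

```python
from typing import List, Sequence

Image = Sequence[Sequence[float]]


def compose(generated: Image, original: Image, mask: Image) -> List[List[float]]:
    """z = G(...) * M + x * (1 - M): keep known pixels, paste generated ones."""
    return [
        [g * m + x * (1.0 - m) for g, x, m in zip(g_row, x_row, m_row)]
        for g_row, x_row, m_row in zip(generated, original, mask)
    ]


def mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


def discriminator_hinge_loss(
    real_scores: Sequence[float], fake_scores: Sequence[float]
) -> float:
    """Assumed hinge form: push real scores above +1 and fake scores below -1."""
    real_term = mean([max(0.0, 1.0 - s) for s in real_scores])
    fake_term = mean([max(0.0, 1.0 + s) for s in fake_scores])
    return real_term + fake_term


def generator_hinge_loss(fake_scores: Sequence[float]) -> float:
    """Assumed hinge form: the generator raises the scores of its outputs."""
    return -mean(fake_scores)


def total_generator_loss(
    adversarial: float,
    pyramid_l1: float,
    lambda_adv: float = 1.0,      # hypothetical weight
    lambda_pyramid: float = 1.0,  # hypothetical weight
) -> float:
    """Weighted sum of the adversarial term and the pyramid L1 term."""
    return lambda_adv * adversarial + lambda_pyramid * pyramid_l1
```

Because of the composition step, pixels outside the mask are copied from the input unchanged, so the losses only ever reward or penalize what the generator paints inside the missing region.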

Model analysis
This section analyzes the roles of two network components: the pyramid L1 loss and the ATN.
Pyramid L1 Loss
The pyramid L1 loss progressively refines the prediction at each scale, which helps decode the compact latent feature layer by layer.
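One way to picture a pyramid L1 loss is to compare the prediction at every scale against the ground truth resized to that scale. The sketch below is an assumption-laden illustration in plain Python: average pooling for downsampling, a mean absolute error per scale, and an unweighted sum over scales are all choices made here, not details taken from the notes.

```python
from typing import List, Sequence

Image = Sequence[Sequence[float]]


def downsample(image: Image, factor: int) -> List[List[float]]:
    """Average-pool a 2-D image by an integer factor."""
    rows, cols = len(image) // factor, len(image[0]) // factor
    return [
        [
            sum(
                image[r * factor + dr][c * factor + dc]
                for dr in range(factor)
                for dc in range(factor)
            ) / (factor * factor)
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def l1_distance(a: Image, b: Image) -> float:
    """Mean absolute difference between two images of equal size."""
    diffs = [abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)


def pyramid_l1_loss(predictions: Sequence[Image], target: Image) -> float:
    """Sum of per-scale L1 distances.

    Each prediction is compared with the target pooled to its resolution;
    every prediction's height must divide the target's height.
    """
    total = 0.0
    for prediction in predictions:
        factor = len(target) // len(prediction)
        total += l1_distance(prediction, downsample(target, factor))
    return total
```

Supervising the coarse scales as well as the final one gives each decoder level its own target, which matches the layer-by-layer refinement described above.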
ATN
The cross-layer attention transfer mechanism improves on the U-Net backbone.
The first row is a plain U-Net without any attention mechanism, the second row is the CA method without deeper guidance, and the third row is the result of applying the ATN to the U-Net architecture.