
Learning Pyramid-Context Encoder Network for High-Quality Image Inpainting: paper notes

2022-07-24 05:00:00 【Magic__ Conch】

IEEE Conference Proceedings arXiv: Computer Vision and Pattern Recognition Jan 2019

Problems addressed and improvements

Existing methods cannot combine low-level visual information with deep semantic information.

Patch-search methods lack an understanding of high-level semantic consistency.
Generative models built from stacked convolutions and poolings produce over-smooth results that lack visual realism, among other problems.

Model structure

The model uses U-Net as its backbone and fills the missing region at both the image level and the feature level.

Pyramid-context encoder: cross-layer attention transfer and pyramid filling


At each level, the reconstructed feature map $\psi$ is obtained by passing that level's feature map $\phi$ and the next-higher level's $\psi$ through the ATN (written $f$), i.e. $\psi^{l-1} = f(\phi^{l-1}, \psi^{l})$.

Attention Transfer Network (ATN) (the $f$ above)

1. Use the reconstructed high-level semantic feature map $\psi^L$ to fill the next layer's feature map $\phi^{L-1}$, obtaining that layer's reconstructed feature map $\psi^{L-1}$.

First, extract patches from $\psi^l$ and compute the cosine similarity between patches inside and outside the missing region.
Then apply a softmax to the similarities to obtain the attention score of each patch.
Once the attention scores of the high-level semantic features are available (denoted $\alpha_{i,j}^l$), the missing region in the next level's feature map can be filled with context weighted by those attention scores.

After all patches have been computed, we obtain $\psi^{l-1}$ (all of the computations above can be formulated as convolutions, so the network can be trained end to end).
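The attention transfer steps described here (cosine similarity, softmax, weighted fill) can be sketched in NumPy. This is a minimal illustration rather than the paper's implementation: it assumes patches have already been extracted and flattened, that high-level and low-level patches are listed in the same order, and that attention runs from missing patches to known patches. The `temperature` parameter and all function names are hypothetical.

```python
import numpy as np

def attention_transfer(high_patches, low_patches, missing, temperature=10.0):
    """Fill missing low-level patches using attention computed on high-level patches.

    high_patches: (N, Dh) flattened patches taken from psi^l
    low_patches:  (N, Dl) flattened patches taken from phi^{l-1}, same patch order
    missing:      (N,) boolean array, True where a patch lies in the missing region
    temperature:  hypothetical softmax sharpening factor (an assumption)
    """
    # Cosine similarity between every missing patch and every known patch.
    norms = np.linalg.norm(high_patches, axis=1, keepdims=True) + 1e-8
    normed = high_patches / norms
    sim = normed[missing] @ normed[~missing].T        # (N_missing, N_known)

    # Softmax over the known patches gives the attention scores alpha.
    logits = temperature * sim
    logits -= logits.max(axis=1, keepdims=True)       # numerical stability
    alpha = np.exp(logits)
    alpha /= alpha.sum(axis=1, keepdims=True)

    # Each missing low-level patch becomes an attention-weighted sum of known ones.
    filled = low_patches.astype(float).copy()
    filled[missing] = alpha @ filled[~missing]
    return filled, alpha

# Toy usage: the third patch is missing and resembles the first patch at the high level.
high = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.1]])
low = np.array([[10.0], [20.0], [0.0]])
missing = np.array([False, False, True])
filled, alpha = attention_transfer(high, low, missing)
```

In the toy usage, the missing patch is filled almost entirely from the first known patch, because their high-level features point in nearly the same direction.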

2. Refinement

Multi-scale context information is aggregated by four groups of dilated convolutions with different rates. This design keeps the structure of the final reconstructed feature consistent with its surrounding context, which improves inpainting results at test time.
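To make the multi-scale aggregation concrete, here is a small NumPy sketch of running several dilated convolutions over one feature map. It is an illustration under stated assumptions: the rates `(1, 2, 4, 8)`, the single-channel input, and stacking the branch outputs along a new axis are all assumptions, since these notes only say that four groups with different rates are used.

```python
import numpy as np

def dilated_conv2d(x, kernel, rate):
    """Zero-padded, same-size 2D dilated convolution (cross-correlation), one channel."""
    k = kernel.shape[0]                  # assumes a square, odd-sized kernel
    pad = rate * (k // 2)
    xp = np.pad(x, pad)
    h, w = x.shape
    out = np.zeros((h, w), dtype=float)
    for i in range(k):
        for j in range(k):
            # Kernel tap (i, j) reads the input shifted by rate * (i, j).
            out += kernel[i, j] * xp[i * rate:i * rate + h, j * rate:j * rate + w]
    return out

def aggregate_multiscale(x, kernels, rates=(1, 2, 4, 8)):
    """Run one dilated branch per rate and stack the outputs along a new first axis."""
    return np.stack([dilated_conv2d(x, k, r) for k, r in zip(kernels, rates)])

# Toy usage: four 3x3 box filters, each seeing a progressively larger neighbourhood.
feature = np.ones((16, 16))
box = np.ones((3, 3))
branches = aggregate_multiscale(feature, [box, box, box, box])
```

Larger rates spread the same nine kernel taps over a wider area, so each branch summarizes context at a different scale without adding parameters.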

Multi-scale decoder

The multi-scale decoder takes as input the features reconstructed by the ATN and the latent feature from the encoder.
The decoder feature maps $φ^{L-1}$, $φ^{L-2}$, etc. are computed by the following formula.

[Equation image: decoder feature maps]
Here, the reconstructed features generated by the ATN encode lower-level information for the missing region, which helps produce visually realistic results with fine-grained details. The compact latent feature extracted by convolution lets the decoder synthesize novel objects when no matching object can be found outside the missing region.
Semantic consistency relies on the deep convolutional features, while texture consistency relies on the shallow features reconstructed by the ATN.

Pyramid L1 loss

Adversarial training loss

The overall loss has two parts: generator + discriminator

The paper uses PatchGAN (from Image-to-Image Translation with Conditional Adversarial Networks) as its discriminator, and applies spectral normalization to stabilize training.
The pyramid-context encoder and the multi-scale decoder together form the generator.

Definition of the loss function:

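The generator's final prediction $z$ is defined as follows, where $x$ is the original image, $M$ is the mask (1 inside the missing region, 0 elsewhere), and $\odot$ is element-wise multiplication:
$z = G(x \odot (1 - M), M) \odot M + x \odot (1 - M)$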
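The composition formula can be checked with a few lines of NumPy. This is a sketch with a stand-in generator: `compose_prediction` and `dummy_generator` are hypothetical names, and the real $G$ is the encoder-decoder network described above.

```python
import numpy as np

def compose_prediction(x, mask, generator):
    """Compute z = G(x * (1 - M), M) * M + x * (1 - M)."""
    masked_input = x * (1.0 - mask)            # hide the missing region from G
    generated = generator(masked_input, mask)  # G predicts a full image
    # Keep G's output only inside the hole; copy known pixels from x elsewhere.
    return generated * mask + x * (1.0 - mask)

def dummy_generator(masked_input, mask):
    """Stand-in for G that paints every pixel 0.5 (for illustration only)."""
    return np.full_like(masked_input, 0.5)

x = np.arange(16, dtype=float).reshape(4, 4)
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1.0                           # 1 marks the missing region
z = compose_prediction(x, mask, dummy_generator)
```

Because of the final blend, pixels outside the mask in `z` are always identical to `x`, so the generator is only ever judged on the region it had to fill.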
The discriminator's adversarial loss can be expressed as:
The generator's adversarial loss is:
PEN-Net is optimized by minimizing the adversarial loss and the pyramid L1 loss (given at the end of the previous section). The overall objective is:
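These notes do not reproduce the loss formulas, so the sketch below is only an assumption about their shape: it uses the hinge formulation that is commonly paired with spectral normalization, and combines the generator's adversarial term with a pyramid L1 term using hypothetical weights `lambda_adv` and `lambda_pyr`. Check the paper for the exact expressions and weights.

```python
import numpy as np

def discriminator_hinge_loss(d_real, d_fake):
    """Hinge loss for D: push scores on real above +1 and scores on fake below -1."""
    return (np.mean(np.maximum(0.0, 1.0 - d_real))
            + np.mean(np.maximum(0.0, 1.0 + d_fake)))

def generator_hinge_loss(d_fake):
    """Hinge-style loss for G: raise the discriminator's score on generated images."""
    return -np.mean(d_fake)

def generator_objective(d_fake, pyramid_l1, lambda_adv=1.0, lambda_pyr=1.0):
    """Weighted sum of adversarial and pyramid L1 terms (weights are hypothetical)."""
    return lambda_adv * generator_hinge_loss(d_fake) + lambda_pyr * pyramid_l1

# Toy usage with per-patch discriminator scores, as a PatchGAN would produce.
d_real = np.array([2.0, 0.5])
d_fake = np.array([-2.0, 0.0])
loss_d = discriminator_hinge_loss(d_real, d_fake)
loss_g = generator_objective(d_fake, pyramid_l1=0.3)
```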

Model analysis

This section analyzes the roles of two network components: the pyramid L1 loss and the ATN.

Pyramid L1 Loss

The pyramid L1 loss progressively refines the predictions at each scale, which helps decode the compact latent feature layer by layer.
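A pyramid L1 loss of this kind can be sketched as an L1 error summed over scales. The version below is an illustration under assumptions: the ground truth is resized by average pooling, each scale contributes a mean absolute error with equal weight, and the function names are hypothetical.

```python
import numpy as np

def downsample(img, factor):
    """Average-pool an (H, W, C) image by an integer factor; H and W must divide evenly."""
    h, w, c = img.shape
    return img.reshape(h // factor, factor, w // factor, factor, c).mean(axis=(1, 3))

def pyramid_l1_loss(predictions, target):
    """Sum over scales of the mean absolute error against the resized ground truth."""
    loss = 0.0
    for pred in predictions:
        factor = target.shape[0] // pred.shape[0]   # how much smaller this scale is
        loss += np.mean(np.abs(pred - downsample(target, factor)))
    return loss

# Toy usage: predictions at 2x2, 4x4 and 8x8 against an 8x8 ground-truth image.
target = np.ones((8, 8, 3))
predictions = [np.zeros((2, 2, 3)), np.ones((4, 4, 3)), np.full((8, 8, 3), 0.5)]
loss = pyramid_l1_loss(predictions, target)
```

Supervising the coarse scales as well as the final one gives every decoder level its own training signal, which matches the layer-by-layer refinement described above.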

ATN

The cross-layer attention transfer mechanism improves the U-Net backbone.
[Figure: comparison of attention mechanisms on the U-Net backbone]

The first row is a plain U-Net without any attention mechanism, the second row is the CA method without guidance from deeper layers, and the third row is the result of applying the ATN to the U-Net architecture.

Original site

Copyright notice
This article was written by [Magic__ Conch]. Please include a link to the original when reposting. Thanks.
https://yzsam.com/2022/204/202207221819328578.html
