Transpose convolution explanation
2022-06-24 16:07:00 【Full stack programmer webmaster】
Hello everyone, good to see you again. I'm your friend Quan Jun.
The previous article explained convolution; since I am reorganizing my notes anyway, I might as well sort out this whole family of concepts as a series (and show off everything I know along the way). Transposed convolution (Transposed Convolution) is the later name; at first everyone called it deconvolution (Deconvolution). The concept was proposed for the task of image segmentation, which operates pixel by pixel: each pixel must be classified into one of the objects. It is natural to use a convolutional neural network to extract features for this task, but the two main components of a CNN, convolution layers and downsampling layers, both shrink the image. That conflicts with pixel-wise classification, which requires the output to have the same size as the input. The proposed answer was to extract features layer by layer with convolution kernels and then upsample the feature map step by step back to the size of the original image; at first this upsampling was implemented with deconvolution. If convolution and downsampling make the feature map smaller, then upsampling should make it larger again. We should all be familiar with the convolution output-size formula

    out = (F − K + 2P) / S + 1,

where F is the size of the input feature map, K the size of the convolution kernel, P the padding, and S the stride. This is the formula we always use to compute the size of a convolution's output feature map.
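As a quick sanity check, the formula can be written as a tiny helper (a sketch in plain Python; the function name is my own):

```python
def conv_out_size(f, k, p, s):
    """Convolution output size: out = (F - K + 2P) // S + 1."""
    return (f - k + 2 * p) // s + 1

# A 7x7 input, 3x3 kernel, padding 1, stride 2 -> 4x4 output
print(conv_out_size(7, 3, 1, 2))  # 4
```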
An example: take a 4×4 input feature map and a 3×3 convolution kernel, with no padding and a stride of 1. Plugging into the formula gives out = (4 − 3)/1 + 1 = 2. In the introduction to the im2col algorithm we explained that this convolution is carried out as a multiplication of two matrices, which we may write as y = Cx. If we want to upsample, we would like to multiply the output feature map by some parameter matrix that restores the size. From basic linear algebra, left-multiplying the feature map y by C^T gives C^T y = C^T C x; the number of columns of C equals the number of rows of x, so C^T C has as many rows and columns as x has rows, and after reshaping the result has the same shape as x. This is where the name "transposed convolution" comes from, and some early work really did it exactly this way. A natural conclusion follows: we do not have to left-multiply by C^T itself. Any matrix of the same shape makes the output the same size as the original feature map, and that operation can itself be realized as a convolution. As long as the shapes line up, the weights can simply be trained: the size problem is solved, the correspondence between features is preserved, and the layer is learnable. Two birds with one stone. In im2col terms, convolution multiplies a (C_out, C_in·K_h·K_w) kernel matrix by a (C_in·K_h·K_w, H_N·W_N) feature matrix to obtain a (C_out, H_N·W_N) result.
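The y = Cx view can be made concrete for the 4×4 / 3×3 example above. The sketch below (plain Python, no framework; all names are my own) builds the 4×16 matrix C for a single-channel, stride-1, unpadded convolution and checks that C^T y has as many elements as x, i.e. the 4×4 shape is restored:

```python
def conv_matrix(kernel, in_size, k, out_size):
    """Build the matrix C such that y = C @ flatten(x) for a stride-1,
    no-padding convolution: one row per output position, with the kernel
    weights scattered over that position's 3x3 input window."""
    rows = []
    for oi in range(out_size):
        for oj in range(out_size):
            row = [0.0] * (in_size * in_size)
            for ki in range(k):
                for kj in range(k):
                    row[(oi + ki) * in_size + (oj + kj)] = kernel[ki][kj]
            rows.append(row)
    return rows

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def transpose(m):
    return [list(col) for col in zip(*m)]

kernel = [[1, 0, -1], [1, 0, -1], [1, 0, -1]]
x = list(range(16))           # flattened 4x4 input
C = conv_matrix(kernel, 4, 3, 2)
y = matvec(C, x)              # forward conv:  4 values (a 2x2 map)
up = matvec(transpose(C), y)  # C^T y:        16 values (a 4x4 map)
print(len(y), len(up))        # 4 16
```

Note that `up` has the right shape but not the original values of `x` — exactly the point made below: only the shape is inverted.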
Now transpose the kernel matrix: a (C_in·K_h·K_w, C_out) matrix times the (C_out, H_N·W_N) feature matrix yields a (C_in·K_h·K_w, H_N·W_N) feature matrix. A few things are worth adding here. Caffe, for example, has not only the im2col function but also col2im, its inverse; Caffe uses col2im to convert the result above back into a feature map. However, col2im is the inverse of im2col only with respect to shape: running im2col on a feature map and then col2im does not reproduce the original, because overlapping positions are summed. TensorFlow and PyTorch do this differently: both implement transposed convolution by expanding the feature map, filling it with zeros and then convolving, possibly followed by a crop operation. The filling is needed because we want to realize transposed convolution directly as an ordinary convolution; inserting values makes the convolved feature map naturally larger. Neither approach restores the original input of the forward convolution; only its shape is restored. Finally, we can discuss the shape calculation. Transposed convolution is the shape inverse of convolution, so its output size is obtained by solving out = (F − K + 2P)/S + 1 for F:

    F = (out − 1)·S + K − 2P + mod,

where mod is the remainder of the numerator (F − K + 2P) divided by S.
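Solving the size formula for F can itself be written as a helper. In the sketch below (names are my own; the extra term plays the same role as the `output_padding` argument of PyTorch's ConvTranspose2d), a round trip conv → transposed conv recovers the original size once mod is supplied:

```python
def conv_out_size(f, k, p, s):
    """Forward direction: out = (F - K + 2P) // S + 1."""
    return (f - k + 2 * p) // s + 1

def tconv_out_size(n, k, p, s, output_padding=0):
    """Inverse direction: F = (out - 1)*S + K - 2P + mod."""
    return (n - 1) * s - 2 * p + k + output_padding

# A 6x6 input convolved with K=3, P=1, S=2 gives a 3x3 map...
n = conv_out_size(6, 3, 1, 2)            # 3
# ...and mod = (6 - 3 + 2*1) % 2 = 1, so output_padding=1 recovers 6
print(tconv_out_size(n, 3, 1, 2, 1))     # 6
```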
This formula computes the output size of a transposed convolution. Because mod is a remainder modulo S, it is always less than S. When S = 1, mod can only be 0 and the solution is unique; when S > 1 there are several possible solutions, and that is why, once the parameters are specified, PyTorch and TensorFlow pad the feature map accordingly and then convolve. Where does the multiplication by S in the formula come from? From expanding the feature map: between every two adjacent cells, S − 1 zeros are inserted, after which an input of size out becomes (out − 1)·S + 1; the border is then padded further according to mod. When the convolution is finally applied, still more padding is added to match the requested output size. The full calculation is fairly involved, but my main purpose here is to make the expansion itself clear; the diagram below illustrates the expansion for S = 2.
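The zero-insertion step described above (just the internal expansion, ignoring the border padding) can be sketched in a few lines of plain Python:

```python
def dilate(feature, s):
    """Insert s-1 zeros between adjacent cells: an n x n map
    becomes (n - 1)*s + 1 on each side."""
    n = len(feature)
    m = (n - 1) * s + 1
    out = [[0] * m for _ in range(m)]
    for i in range(n):
        for j in range(n):
            out[i * s][j * s] = feature[i][j]
    return out

# The s=2 expansion of a 2x2 map, as in the diagram:
for row in dilate([[1, 2], [3, 4]], 2):
    print(row)
# [1, 0, 2]
# [0, 0, 0]
# [3, 0, 4]
```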
You can match this formula against the parameters of the TensorFlow or PyTorch interfaces yourself; since I do not want to go into the interfaces here, I will not repeat them. To aid understanding, I also drew the following picture.
Publisher: Full-stack programmer webmaster. Please credit the source when reprinting: https://javaforall.cn/151928.html Original link: https://javaforall.cn