Transposed Convolution Explained
2022-06-24 16:07:00 【Full stack programmer webmaster】
Hello everyone, good to see you again. I'm your friend Quan Jun.
The previous article explained convolution. Since I am reorganizing my notes anyway, I might as well sort out this whole family of concepts, and show off everything I know while I am at it.

Transposed convolution (Transposed Convolution) is the later name; at first everyone called it deconvolution (Deconvolution). The concept was proposed for the task of image segmentation, which operates pixel by pixel: every pixel must be classified into an object. It is natural to use a convolutional neural network for this, which means extracting features with convolutions. But the two main components of a CNN, the convolution layers and the downsampling layers, both shrink the image. That conflicts with pixel-by-pixel classification, which requires the output to be the same size as the input. The proposed solution was to extract features layer by layer with convolution kernels, then gradually restore the feature map to the original size by upsampling, and at first this upsampling was implemented with deconvolution. If the feature map shrinks during downsampling by convolution, then it should grow again during upsampling.

We should all be familiar with the output-size formula of convolution:

out = (F − K + 2P)/s + 1

where F is the size of the input feature map, K is the size of the convolution kernel, P is the padding, and s is the stride. This is the formula everyone uses to compute the size of a convolution's output feature map.
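As a quick sanity check, the output-size formula can be written as a one-line helper (a minimal sketch of my own, not any framework's API):

```python
def conv_output_size(f, k, p=0, s=1):
    """Convolution output size: out = (f - k + 2p) / s + 1 (integer division)."""
    return (f - k + 2 * p) // s + 1

# The article's example: 4x4 input, 3x3 kernel, no padding, stride 1
print(conv_output_size(4, 3))            # 2
# A strided case: 7x7 input, 3x3 kernel, padding 1, stride 2
print(conv_output_size(7, 3, p=1, s=2))  # 4
```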
As an example, take a 4×4 input feature map and a 3×3 kernel, with no padding and a stride of 1. Substituting into the formula gives out = (4 − 3)/1 + 1 = 2.

The introduction to the im2col algorithm already explained how convolution is implemented: the whole step is a multiplication of two matrices, which we may write as y = Cx. If we want to upsample, we would like to multiply the output feature map by some parameter matrix that restores the original size. From basic linear algebra, left-multiplying the feature map y by C^T gives C^T y = C^T C x. The number of columns of C equals the number of rows of x, so the numbers of rows and columns of C^T C both equal the number of rows of x; after the multiplication, the result has the same shape as x. This is the origin of the name "transposed convolution", and some early work really did upsample this way.

A natural further conclusion is that we do not have to left-multiply the output feature map by C^T specifically: any matrix of the same shape makes the output the same size as the original feature map, and this operation can itself be realized as a convolution. So as long as the shapes match, the matrix can simply be learned during training. The size problem is solved, the spatial correspondence of features is preserved, and the operation is trainable: two birds with one stone.

In the terms of the im2col explanation: convolution multiplies a (C_out, C_in·K_h·K_w) kernel matrix by a (C_in·K_h·K_w, H_N·W_N) feature matrix to obtain a (C_out, H_N·W_N) result.
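The shape argument y = Cx, C^T y = C^T C x can be checked numerically. Below is a minimal NumPy sketch for a single-channel 4×4 input and 3×3 kernel; the helper `conv_matrix` is my own illustration, not code from any framework. Note that C^T y only restores the shape of x, not its values.

```python
import numpy as np

def conv_matrix(kernel, in_size):
    """Build the matrix C such that C @ x.ravel() is a valid (no-padding,
    stride-1) convolution of the flattened in_size x in_size input x."""
    k = kernel.shape[0]
    out = in_size - k + 1
    C = np.zeros((out * out, in_size * in_size))
    for i in range(out):            # output row
        for j in range(out):        # output column
            for di in range(k):
                for dj in range(k):
                    C[i * out + j, (i + di) * in_size + (j + dj)] = kernel[di, dj]
    return C

kernel = np.arange(9.0).reshape(3, 3)
x = np.random.rand(16)        # flattened 4x4 input
C = conv_matrix(kernel, 4)    # shape (4, 16): maps 16 inputs to 4 outputs
y = C @ x                     # the flattened 2x2 output
restored = C.T @ y            # shape (16,): same shape as x, different values
print(C.shape, y.shape, restored.shape)  # (4, 16) (4,) (16,)
```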
Now transpose the kernel matrix: the (C_in·K_h·K_w, C_out) matrix times the (C_out, H_N·W_N) result gives a (C_in·K_h·K_w, H_N·W_N) feature matrix.

Beyond the above, a few points deserve mention. In Caffe, for example, besides the im2col function there is a col2im function, the inverse operation of im2col, and Caffe converts the result above back into a feature map through col2im. However, col2im is the inverse of im2col only with respect to shape: if you run im2col on a feature map and then col2im, the result is not equal to the original, because the entries of overlapping patches are summed.

TensorFlow and PyTorch take a different route: both implement transposed convolution by expanding the feature map, inflating it with zero filling, possibly followed by a crop operation afterwards. The filling is needed because we want to realize transposed convolution directly through an ordinary convolution; by filling in some values, the feature map after convolution naturally becomes larger. But neither approach undoes the original convolution; only the shape is restored.

Finally, we can discuss the shape calculation. Transposed convolution is the shape inverse of convolution, so its output size comes from inverting out = (F − K + 2P)/s + 1 and solving for F:

F = (out − 1)·s + K − 2P + mod

where mod is the remainder of (F − K + 2P) divided by s.
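The remark that col2im is only a shape-level inverse can be verified directly. The toy single-channel implementations below are my own sketch, not Caffe's actual code: col2im scatter-adds the patch columns back, so every pixel covered by several patches is counted once per patch, and the round trip does not reproduce the original.

```python
import numpy as np

def im2col(x, k):
    """Extract all k x k patches of a square map x as columns."""
    f = x.shape[0]
    out = f - k + 1
    cols = np.zeros((k * k, out * out))
    for i in range(out):
        for j in range(out):
            cols[:, i * out + j] = x[i:i + k, j:j + k].ravel()
    return cols

def col2im(cols, f, k):
    """Scatter-add the columns back; overlapping entries accumulate."""
    out = f - k + 1
    x = np.zeros((f, f))
    for i in range(out):
        for j in range(out):
            x[i:i + k, j:j + k] += cols[:, i * out + j].reshape(k, k)
    return x

x = np.arange(16.0).reshape(4, 4)
roundtrip = col2im(im2col(x, 3), 4, 3)
print(np.allclose(roundtrip, x))  # False
```

A corner pixel, covered by a single 3×3 patch, survives unchanged, while each interior pixel, covered by four patches, comes back four times larger.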
This formula computes the output size of a transposed convolution. Since mod is the remainder after division by s, it is always less than s. When s = 1 it can only be 0 and the solution is unique; when s > 1 there are multiple solutions. In that case, once the parameters are specified, PyTorch and TensorFlow pad the feature map around the edges and then perform an ordinary convolution.

How is the multiplication by s in the formula realized? By expanding the feature map: between every two cells, s − 1 zeros are inserted, so the expanded map has size (out − 1)·s + 1. The outside is then padded according to mod, and for the convolution itself further padding is added according to the desired output size. The full calculation is somewhat involved, but my main purpose here is to make the expansion clear; the diagram below illustrates the expansion for s = 2.
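To make the expansion concrete, here is a minimal NumPy sketch of a single-channel transposed convolution built exactly this way: insert s − 1 zeros between cells, pad the outside, then run an ordinary stride-1 convolution with the flipped kernel. This is my own illustration (the mod/output-padding term is omitted for simplicity), not the actual PyTorch or TensorFlow implementation.

```python
import numpy as np

def transposed_conv2d(x, kernel, s=1, p=0):
    f, k = x.shape[0], kernel.shape[0]
    # 1) expansion: insert s-1 zeros between every two cells -> size (f-1)*s + 1
    dilated = np.zeros(((f - 1) * s + 1, (f - 1) * s + 1))
    dilated[::s, ::s] = x
    # 2) pad the outside with k-1-p zeros on every side
    padded = np.pad(dilated, k - 1 - p)
    # 3) ordinary stride-1 convolution with the flipped kernel
    kf = kernel[::-1, ::-1]
    out_size = padded.shape[0] - k + 1   # equals (f-1)*s + k - 2p
    out = np.zeros((out_size, out_size))
    for i in range(out_size):
        for j in range(out_size):
            out[i, j] = np.sum(padded[i:i + k, j:j + k] * kf)
    return out

x = np.ones((2, 2))
y = transposed_conv2d(x, np.ones((3, 3)), s=2)
print(y.shape)  # (5, 5): each input cell scatters a 3x3 block, stride 2 apart
```

With s = 2 the scattered 3×3 blocks overlap by one row and column, so the center of y is 4, the sum of all four overlapping contributions, while each corner is 1.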
You can match this formula against the parameters of the TensorFlow or PyTorch interfaces yourself; since I do not want to go into the interfaces here, I will not elaborate. To aid understanding, I also drew the picture below.
Publisher: Full Stack Programmer. Please indicate the source when reprinting: https://javaforall.cn/151928.html Original link: https://javaforall.cn