Transpose convolution learning notes
2022-06-24 16:06:00 【Full stack programmer webmaster】
Hello everyone, nice to meet you again. I'm your friend, Quan Jun.
List of articles
- 1. Definition of transposed convolution
- 2. Custom transposed convolution
- 3. Transposed convolution with padding and stride
- 4. Transposed convolution from the perspective of convolution [key]
- 5. Initialization of transposed convolution
- 6. Applying transposed convolution to images
1. Definition of transposed convolution
In semantic segmentation we need to classify every pixel. A two-dimensional convolutional neural network progressively compresses the input image and ends in a compact prediction, but to label every pixel we must map that prediction back to the resolution of the input. For example, when recognizing cats and dogs, it is not enough to know roughly where the cat is; every pixel belonging to the cat must be identified. This is where transposed convolution comes in: it enlarges feature maps, so the generated output can be brought back to the same size as the original image, which makes per-pixel semantic segmentation straightforward.
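As a minimal sketch of this idea (not part of the original post; the channel count and sizes here are illustrative), a convolution that halves the spatial size can be undone, shape-wise, by a transposed convolution with the same hyperparameters:

```python
import torch
from torch import nn

# A hypothetical single-channel feature map; sizes are illustrative
x = torch.randn(1, 1, 16, 16)
# Ordinary convolution downsamples: 16x16 -> 8x8
down = nn.Conv2d(1, 1, kernel_size=4, stride=2, padding=1)
# Transposed convolution with the same hyperparameters upsamples: 8x8 -> 16x16
up = nn.ConvTranspose2d(1, 1, kernel_size=4, stride=2, padding=1)
print(down(x).shape)      # torch.Size([1, 1, 8, 8])
print(up(down(x)).shape)  # torch.Size([1, 1, 16, 16])
```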
2. Custom transposed convolution
Concretely, with the kernel K = torch.tensor([[0, 1], [2, 3]]), each element x_{i,j} of the input is multiplied by the entire kernel, the scaled kernel is placed at offset (i, j) in the output, and overlapping values are summed to produce the transposed convolution output.
- Code
# 1. Import libraries
import torch
from torch import nn

# 2. Define the transposed convolution function
def tran_conv(x, k):
    h, w = k.shape
    y = torch.zeros((x.shape[0] + h - 1, x.shape[1] + w - 1))
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            # Each input element scales the whole kernel; overlaps are summed
            y[i:i + h, j:j + w] += x[i, j] * k
    return y

# 3. Define the input tensor X and the transposed convolution kernel K
X = torch.Tensor([[0, 1], [2, 3]])
K = torch.Tensor([[0, 1], [2, 3]])

# 4. Compute the output Y
Y = tran_conv(X, K)
print(f'Y={Y}')

# 5. Reshape X and K into four-dimensional tensors (batch, channel, h, w)
X_conv = X.reshape(1, 1, 2, 2)
K_conv = K.reshape(1, 1, 2, 2)

# 6. Define a 2-D transposed convolution: 1 input channel, 1 output channel,
#    2 x 2 kernel, no bias
tconv = nn.ConvTranspose2d(1, 1, kernel_size=2, bias=False)

# 7. Assign K_conv to the transposed convolution's weight
tconv.weight.data = K_conv

# 8. Input tensor (X_conv) -> transposed convolution (tconv) -> output tensor
Y_conv = tconv(X_conv)

# 9. Squeeze out the batch and channel dimensions (both 1) so the result
#    can be compared with the custom implementation
Y_conv_squeeze = Y_conv.squeeze()
print(f'Y_conv={Y_conv}')
print(f'Y_conv_squeeze={Y_conv_squeeze}')

# 10. Check that the custom function matches the built-in implementation
print(f'Y == Y_conv_squeeze:{Y == Y_conv_squeeze}')
- result
Y=tensor([[ 0.,  0.,  1.],
        [ 0.,  4.,  6.],
        [ 4., 12.,  9.]])
Y_conv=tensor([[[[ 0.,  0.,  1.],
          [ 0.,  4.,  6.],
          [ 4., 12.,  9.]]]], grad_fn=<SlowConvTranspose2DBackward>)
Y_conv_squeeze=tensor([[ 0.,  0.,  1.],
        [ 0.,  4.,  6.],
        [ 4., 12.,  9.]], grad_fn=<SqueezeBackward0>)
Y == Y_conv_squeeze:tensor([[True, True, True],
        [True, True, True],
        [True, True, True]])

3. Transposed convolution with padding and stride
- padding: acts on the output tensor; padding rows and columns are trimmed from each border of the output
- stride: acts on the intermediate placement; each scaled kernel is stamped onto the output at offsets of stride, which enlarges the result
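These two effects can be checked from the output shapes alone (a quick sketch, using the same 2 x 2 input and kernel sizes as the code below):

```python
import torch
from torch import nn

x = torch.randn(1, 1, 2, 2)
# No padding, stride 1: output size is n + k - 1 = 2 + 2 - 1 = 3
base = nn.ConvTranspose2d(1, 1, kernel_size=2, bias=False)(x)
# padding=1 trims one row/column from each border: 3 - 2 * 1 = 1
pad1 = nn.ConvTranspose2d(1, 1, kernel_size=2, padding=1, bias=False)(x)
# stride=2 spreads the kernel placements apart: (2 - 1) * 2 + 2 = 4
stride2 = nn.ConvTranspose2d(1, 1, kernel_size=2, stride=2, bias=False)(x)
print(base.shape)     # torch.Size([1, 1, 3, 3])
print(pad1.shape)     # torch.Size([1, 1, 1, 1])
print(stride2.shape)  # torch.Size([1, 1, 4, 4])
```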
- Code
# 1. Import libraries
import torch
from torch import nn

# 2. Define the transposed convolution function
def tran_conv(x, k):
    h, w = k.shape
    y = torch.zeros((x.shape[0] + h - 1, x.shape[1] + w - 1))
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            y[i:i + h, j:j + w] += x[i, j] * k
    return y

# 3. Define the input tensor X and the transposed convolution kernel K
X = torch.Tensor([[0, 1], [2, 3]])
K = torch.Tensor([[0, 1], [2, 3]])

# 4. Compute the output Y
Y = tran_conv(X, K)
print(f'Y={Y}')

# 5. Reshape X and K into four-dimensional tensors (batch, channel, h, w)
X_conv = X.reshape(1, 1, 2, 2)
K_conv = K.reshape(1, 1, 2, 2)

# 6. Define a 2-D transposed convolution: 1 input channel, 1 output channel,
#    2 x 2 kernel, no bias
tconv = nn.ConvTranspose2d(1, 1, kernel_size=2, bias=False)

# 7. Assign K_conv to the transposed convolution's weight
tconv.weight.data = K_conv

# 8. Input tensor (X_conv) -> transposed convolution (tconv) -> output tensor
Y_conv = tconv(X_conv)

# 9. Squeeze out the batch and channel dimensions for comparison
Y_conv_squeeze = Y_conv.squeeze()
print(f'Y_conv={Y_conv}')
print(f'Y_conv_squeeze={Y_conv_squeeze}')

# 10. Check that the custom function matches the built-in implementation
print(f'Y == Y_conv_squeeze:{Y == Y_conv_squeeze}')

# 11. padding=1: the transposed convolution trims padding=1 rows and columns
#     from each border of the output
tconv_padding_1 = nn.ConvTranspose2d(1, 1, kernel_size=2, padding=1, bias=False)
tconv_padding_1.weight.data = K_conv
Y_conv_padding_1 = tconv_padding_1(X_conv)
print(f'Y_conv_padding_1={Y_conv_padding_1}')

# 12. stride=2: the kernel is placed on the output at offsets of 2.
#     Input is 2 x 2, kernel is 2 x 2, stride=2, so the output size is
#     (2 - 1) * 2 + 2 = 4
tconv_stride_2 = nn.ConvTranspose2d(1, 1, kernel_size=2, stride=2, bias=False)
tconv_stride_2.weight.data = K_conv
Y_conv_stride_2 = tconv_stride_2(X_conv)
print(f'Y_conv_stride_2={Y_conv_stride_2}')
- result
Y=tensor([[ 0.,  0.,  1.],
        [ 0.,  4.,  6.],
        [ 4., 12.,  9.]])
Y_conv=tensor([[[[ 0.,  0.,  1.],
          [ 0.,  4.,  6.],
          [ 4., 12.,  9.]]]], grad_fn=<SlowConvTranspose2DBackward>)
Y_conv_squeeze=tensor([[ 0.,  0.,  1.],
        [ 0.,  4.,  6.],
        [ 4., 12.,  9.]], grad_fn=<SqueezeBackward0>)
Y == Y_conv_squeeze:tensor([[True, True, True],
        [True, True, True],
        [True, True, True]])
Y_conv_padding_1=tensor([[[[4.]]]], grad_fn=<SlowConvTranspose2DBackward>)
Y_conv_stride_2=tensor([[[[0., 0., 0., 1.],
          [0., 0., 2., 3.],
          [0., 2., 0., 3.],
          [4., 6., 6., 9.]]]], grad_fn=<SlowConvTranspose2DBackward>)

4. Transposed convolution from the perspective of convolution [key]
4.1 Explanation
Transposed convolution is itself a kind of convolution:
- It rearranges the input and the kernel
- Whereas ordinary convolution usually downsamples, transposed convolution is usually used for upsampling
- If a convolution maps an input from (h, w) to (h', w'), a transposed convolution with the same hyperparameters maps (h', w') back to (h, w)
4.2 Padding 0, stride 1: the output size is n + k - 1
4.3 Padding p, stride 1: the output size is n + k - 1 - 2p
4.4 Padding p, stride s: the output size is (n - 1)s + k - 2p
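One way to make section 4.1 concrete (a sketch added here, not from the original post): write an ordinary 2 x 2 convolution over a 3 x 3 input as a matrix multiplication y = Wx with a 4 x 9 matrix W. Then the transposed convolution of the 2 x 2 result with the same kernel is exactly multiplication by the transpose, W^T y, which maps (2, 2) back to (3, 3):

```python
import torch

K = torch.tensor([[0., 1.], [2., 3.]])

# Build the 4 x 9 matrix W that applies a 2 x 2 convolution to a flattened 3 x 3 input
W = torch.zeros(4, 9)
for idx, (i, j) in enumerate([(0, 0), (0, 1), (1, 0), (1, 1)]):  # the 4 output positions
    patch = torch.zeros(3, 3)
    patch[i:i + 2, j:j + 2] = K      # kernel placed at output position (i, j)
    W[idx] = patch.reshape(-1)

X = torch.arange(9, dtype=torch.float32).reshape(3, 3)
Y = (W @ X.reshape(-1)).reshape(2, 2)        # ordinary convolution as a matmul: (3,3) -> (2,2)
Z = (W.t() @ Y.reshape(-1)).reshape(3, 3)    # transposed convolution as W^T: (2,2) -> (3,3)
print(Z)
```

Z here equals tran_conv(Y, K) from section 2, which is where the name "transposed" convolution comes from.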
5. Initialization of transposed convolution
Like an ordinary convolution, a transposed convolution needs its kernel initialized. Bilinear interpolation is commonly used to initialize the kernel; the concrete code is as follows:
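To see where the kernel values come from (a short sketch, not in the original code): for an even kernel_size the 1-D bilinear weight at position i is 1 - |i - center| / factor, and the 2-D kernel is the outer product of that vector with itself, which reproduces the 0.0625 / 0.1875 / 0.5625 values printed below:

```python
import torch

k = 4
factor = (k + 1) // 2                 # 2
center = factor - 0.5                 # 1.5 for an even kernel size
# 1-D bilinear interpolation weights
w = 1 - torch.abs(torch.arange(k) - center) / factor
print(w)                  # tensor([0.2500, 0.7500, 0.7500, 0.2500])
# The 2-D kernel is the outer product of the 1-D weights
print(torch.outer(w, w))  # 4 x 4 matrix with entries 0.0625, 0.1875, 0.5625
```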
- Code
# -*- coding: utf-8 -*-
# @Project: zc
# @Author: zc
# @File name: os_test
# @Create time: 2022/1/4 8:38
import torch

def bilinear_kernel(in_channels, out_channels, kernel_size):
    """
    :param in_channels: number of input channels
    :param out_channels: number of output channels
    :param kernel_size: convolution kernel size
    :return: weight tensor of shape (in_channels, out_channels, kernel_size, kernel_size)
    """
    # For kernel_size=4: factor = 2
    factor = (kernel_size + 1) // 2
    # For kernel_size=4: center = 1.5
    if kernel_size % 2 == 1:
        center = factor - 1
    else:
        center = factor - 0.5
    # Create a tuple: og[0] has shape (kernel_size, 1), og[1] has shape (1, kernel_size)
    og = (torch.arange(kernel_size).reshape(-1, 1),
          torch.arange(kernel_size).reshape(1, -1))
    # Bilinear interpolation weights: a kernel_size x kernel_size matrix
    filt = (1 - torch.abs(og[0] - center) / factor) * (1 - torch.abs(og[1] - center) / factor)
    # Create an all-zero tensor of shape (in_channels, out_channels, kernel_size, kernel_size)
    # and put the initialized filt values into it
    weight = torch.zeros((in_channels, out_channels,
                          kernel_size, kernel_size))
    # Place filt along the channel diagonal of weight:
    # [[filt,  0,    0  ],
    #  [ 0,   filt,  0  ],
    #  [ 0,    0,   filt]]
    weight[range(in_channels), range(out_channels), :, :] = filt
    return weight

# y has shape (3, 3, 4, 4)
y = bilinear_kernel(3, 3, 4)
print(f'y={y}')
print(f'y_shape={y.shape}')
print(f'y0={y[0]}')
print(f'y1={y[1]}')
print(f'y2={y[2]}')
- result
y=tensor([[[[0.0625, 0.1875, 0.1875, 0.0625],
          [0.1875, 0.5625, 0.5625, 0.1875],
          [0.1875, 0.5625, 0.5625, 0.1875],
          [0.0625, 0.1875, 0.1875, 0.0625]],
         [[0.0000, 0.0000, 0.0000, 0.0000],
          [0.0000, 0.0000, 0.0000, 0.0000],
          [0.0000, 0.0000, 0.0000, 0.0000],
          [0.0000, 0.0000, 0.0000, 0.0000]],
         [[0.0000, 0.0000, 0.0000, 0.0000],
          [0.0000, 0.0000, 0.0000, 0.0000],
          [0.0000, 0.0000, 0.0000, 0.0000],
          [0.0000, 0.0000, 0.0000, 0.0000]]],
        [[[0.0000, 0.0000, 0.0000, 0.0000],
          [0.0000, 0.0000, 0.0000, 0.0000],
          [0.0000, 0.0000, 0.0000, 0.0000],
          [0.0000, 0.0000, 0.0000, 0.0000]],
         [[0.0625, 0.1875, 0.1875, 0.0625],
          [0.1875, 0.5625, 0.5625, 0.1875],
          [0.1875, 0.5625, 0.5625, 0.1875],
          [0.0625, 0.1875, 0.1875, 0.0625]],
         [[0.0000, 0.0000, 0.0000, 0.0000],
          [0.0000, 0.0000, 0.0000, 0.0000],
          [0.0000, 0.0000, 0.0000, 0.0000],
          [0.0000, 0.0000, 0.0000, 0.0000]]],
        [[[0.0000, 0.0000, 0.0000, 0.0000],
          [0.0000, 0.0000, 0.0000, 0.0000],
          [0.0000, 0.0000, 0.0000, 0.0000],
          [0.0000, 0.0000, 0.0000, 0.0000]],
         [[0.0000, 0.0000, 0.0000, 0.0000],
          [0.0000, 0.0000, 0.0000, 0.0000],
          [0.0000, 0.0000, 0.0000, 0.0000],
          [0.0000, 0.0000, 0.0000, 0.0000]],
         [[0.0625, 0.1875, 0.1875, 0.0625],
          [0.1875, 0.5625, 0.5625, 0.1875],
          [0.1875, 0.5625, 0.5625, 0.1875],
          [0.0625, 0.1875, 0.1875, 0.0625]]]])
y_shape=torch.Size([3, 3, 4, 4])
y0=tensor([[[0.0625, 0.1875, 0.1875, 0.0625],
         [0.1875, 0.5625, 0.5625, 0.1875],
         [0.1875, 0.5625, 0.5625, 0.1875],
         [0.0625, 0.1875, 0.1875, 0.0625]],
        [[0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000]],
        [[0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000]]])
y1=tensor([[[0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000]],
        [[0.0625, 0.1875, 0.1875, 0.0625],
         [0.1875, 0.5625, 0.5625, 0.1875],
         [0.1875, 0.5625, 0.5625, 0.1875],
         [0.0625, 0.1875, 0.1875, 0.0625]],
        [[0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000]]])
y2=tensor([[[0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000]],
        [[0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000]],
        [[0.0625, 0.1875, 0.1875, 0.0625],
         [0.1875, 0.5625, 0.5625, 0.1875],
         [0.1875, 0.5625, 0.5625, 0.1875],
         [0.0625, 0.1875, 0.1875, 0.0625]]])

6. Applying transposed convolution to images
We can double the height and width of an image with a transposed convolution.
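The doubling can be verified from the hyperparameters alone (a sketch with a random tensor; the full code below does the same with an actual image): with kernel_size=4, stride=2, padding=1 the output size is (n - 1) * 2 - 2 * 1 + 4 = 2n.

```python
import torch
from torch import nn

# kernel_size=4, stride=2, padding=1 maps an n x n input to 2n x 2n
conv_trans = nn.ConvTranspose2d(3, 3, kernel_size=4, padding=1, stride=2, bias=False)
x = torch.randn(1, 3, 256, 256)
out = conv_trans(x)
print(out.shape)  # torch.Size([1, 3, 512, 512])
```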
- Code
# -*- coding: utf-8 -*-
# @Project: zc
# @Author: zc
# @File name: os_test
# @Create time: 2022/1/4 8:38
import os

import matplotlib.pyplot as plt
import torch
import torchvision.transforms
from torch import nn
from d2l import torch as d2l

def bilinear_kernel(in_channels, out_channels, kernel_size):
    """
    :param in_channels: number of input channels
    :param out_channels: number of output channels
    :param kernel_size: convolution kernel size
    :return: weight tensor of shape (in_channels, out_channels, kernel_size, kernel_size)
    """
    factor = (kernel_size + 1) // 2
    if kernel_size % 2 == 1:
        center = factor - 1
    else:
        center = factor - 0.5
    # og[0] has shape (kernel_size, 1); og[1] has shape (1, kernel_size)
    og = (torch.arange(kernel_size).reshape(-1, 1),
          torch.arange(kernel_size).reshape(1, -1))
    # Bilinear interpolation weights
    filt = (1 - torch.abs(og[0] - center) / factor) * (1 - torch.abs(og[1] - center) / factor)
    weight = torch.zeros((in_channels, out_channels,
                          kernel_size, kernel_size))
    # Place filt along the channel diagonal of weight
    weight[range(in_channels), range(out_channels), :, :] = filt
    return weight

# Transposed convolution that doubles height and width, initialized with the bilinear kernel
conv_trans = nn.ConvTranspose2d(3, 3, kernel_size=4, padding=1, stride=2, bias=False)
conv_trans.weight.data.copy_(bilinear_kernel(3, 3, 4))

path = os.path.join(os.getcwd(), 'img', 'banana.jpg')  # e.g. path='D:\\zc\\img\\banana.jpg'
print(f'path={path}')

# in_img has shape (3, 256, 256)
in_img = torchvision.transforms.ToTensor()(d2l.Image.open(path))
# X has shape (1, 3, 256, 256)
X = in_img.unsqueeze(0)
# Y has shape (1, 3, 512, 512): the stride=2 transposed convolution doubles the size
Y = conv_trans(X)
# out_img has shape (512, 512, 3)
out_img = Y[0].permute(1, 2, 0).detach()
d2l.set_figsize()
# in_img.permute(1, 2, 0) has shape (256, 256, 3)
print('input image shape:', in_img.permute(1, 2, 0).shape)
# d2l.plt.imshow(in_img.permute(1, 2, 0))
print('output_image_shape:', out_img.shape)
d2l.plt.imshow(out_img)
print('output_image_shape_after:', out_img.shape)
plt.show()
- result
path=D:\zc\img\banana.jpg
input image shape: torch.Size([256, 256, 3])
output_image_shape: torch.Size([512, 512, 3])
output_image_shape_after: torch.Size([512, 512, 3])

Publisher: Full stack programmer stack length. Please indicate the source when reprinting: https://javaforall.cn/151948.html Link to the original text: https://javaforall.cn