Self-taught neural network series - 9: Convolutional neural network (CNN)
2022-06-26 09:10:00 【ML_ python_ get√】
Convolutional neural networks
1 Convolutional neural network background
- A CNN (Convolutional Neural Network) is a feedforward neural network built from convolutional layers and pooling layers; it is mainly used to process image and text data.
- Convolutional neural networks were designed for image processing and aim to solve the problem of fully connected networks having too many parameters. For example, flattening a 1000×1000-pixel color image gives a 1000×1000×3 = 3M-dimensional vector; if the hidden layer has 1000 units, the first fully connected layer alone has 3M × 1000 = 3B parameters to train, which makes the network very hard to train. Convolution kernels are therefore introduced so that adjacent layers are only locally connected, reducing the number of parameters.
- Images are translation invariant: a human can recognize a target no matter where it appears in the picture, so location should not affect recognition of the target pixels. It is therefore reasonable to use the same convolution kernel to extract information from every position in the image.
- A convolutional layer consists of multiple convolution kernels and involves the concepts of padding, stride, and channels. Common convolutional neural networks include LeNet, AlexNet, VGG, GoogleNet, ResNet, DenseNet, etc. This article is a personal study note and only reviews the basic concepts of convolutional neural networks; it does not go into the characteristics of the individual architectures.
2 Basic knowledge of convolutional neural network
2.1 Convolution
- Convolution can be seen as an operator; the result is a function of x, generally defined by the following integral:

$$(f * g)(x) = \int_\tau f(\tau)\, g(x-\tau)\, d\tau$$

- In signal processing, convolution is often used to compute the accumulated, delayed response of a signal, where $f(\tau)$ is the signal-generating function and $g(\cdot)$ is the signal attenuation function, also called the filter or convolution kernel; the farther $x$ is from $\tau$, the stronger the attenuation. When the signal passes through the filter, it is filtered, and the result measures how much of the true signal is present.
- Unlike traditional one-dimensional convolution, the kernels in a convolutional neural network use two-dimensional convolution to extract the real information in an image. Two-dimensional convolution is defined as follows:

$$y_{ij} = \sum_{u=1}^{U}\sum_{v=1}^{V} w_{uv}\, x_{i+1-u,\; j+1-v}$$

- Working through the two-dimensional convolution in Figure 1 shows that convolution sweeps over the data from bottom to top and from right to left. What a convolutional neural network actually computes is two-dimensional convolution without flipping the kernel; this operation is also called cross-correlation and is often used with time series to analyze the correlation of different variables at different times. In particular, when the kernel parameters are learnable, it makes no difference whether two-dimensional convolution or cross-correlation is used: the network will learn appropriate parameters either way. A minimal code sketch follows the list below.
- Feature extraction: the main function of a convolution kernel (filter) is feature extraction
  - Gaussian filter: denoises and smooths the image
  - Edge filter: extracts edge features
  - etc.
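To make the cross-correlation described above concrete, here is a minimal NumPy sketch (the helper names and the toy image are illustrative choices, not from the original note). It computes a "valid" two-dimensional cross-correlation, shows that flipping the kernel turns it into a true convolution, and applies a simple edge filter to a toy image:

```python
import numpy as np

def cross_correlate2d(x, k):
    """Valid 2D cross-correlation: slide the kernel over x without flipping it.
    This is the operation actually computed by convolutional layers."""
    nh, nw = x.shape
    kh, kw = k.shape
    out = np.zeros((nh - kh + 1, nw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def convolve2d(x, k):
    """True 2D convolution = cross-correlation with the kernel flipped in both axes."""
    return cross_correlate2d(x, k[::-1, ::-1])

# A toy image with a vertical edge: left half dark (0), right half bright (1).
img = np.zeros((6, 6))
img[:, 3:] = 1.0

# A simple horizontal-difference kernel that responds to vertical edges.
edge_kernel = np.array([[-1.0, 1.0]])

print(cross_correlate2d(img, edge_kernel))  # non-zero only at the 0 -> 1 transition
print(convolve2d(img, edge_kernel))         # same magnitude, opposite sign (kernel flipped)
```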

2.2 Convolutional neural network structure

- Convolutional layer
  - Performs the convolution operation with convolution kernels
  - Input: e.g. a 36×36 grayscale image, or the output of a previous convolutional layer
  - Convolution kernel: e.g. a 9×9 kernel (different convolutional layers may use different kernel sizes, and kernels within the same layer may also differ)
  - Output: the output of a convolution kernel is called a feature map; different kernels extract different features of the image
- Pooling layer
  - Performs feature selection on the output of the convolution kernels
  - Because each kernel is much smaller than the image, the number of neurons in a convolutional layer is still very large, which makes overfitting easy
  - Feature selection with a max-pooling or average-pooling layer reduces the risk of overfitting
  - The pooling windows that scan a feature map may or may not overlap
  - The pooling layer not only downsamples but also reduces dimensionality via max() or mean(); dimensionality reduction can also be achieved with a nonlinear activation function (a pooling sketch follows this list)
- Receptive field: the set of input elements that influence an element of a feature map (or of the output) is called its receptive field. As the network propagates forward, each element corresponds to a larger and larger receptive field, i.e. it "sees" a larger region of the original image.
- The outputs of the convolution kernels become smaller and smaller, so the deeper the network, the more convolution kernels per layer, which prevents information loss.
- Convolutional neural networks therefore become wider and wider while their outputs become smaller and smaller.
- The final output of a convolutional neural network is generally flattened into a one-dimensional vector.
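A minimal pooling sketch in NumPy, as mentioned above (the helper name and the non-overlapping 2×2 window are illustrative assumptions): it downsamples a feature map with either max or average pooling.

```python
import numpy as np

def pool2d(x, size=2, stride=2, mode="max"):
    """Non-overlapping max or average pooling over a single 2D feature map.
    Real frameworks also handle padding, overlap, batches and channels."""
    nh, nw = x.shape
    oh, ow = (nh - size) // stride + 1, (nw - size) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            window = x[i * stride:i * stride + size, j * stride:j * stride + size]
            out[i, j] = window.max() if mode == "max" else window.mean()
    return out

feature_map = np.arange(16, dtype=float).reshape(4, 4)
print(pool2d(feature_map, mode="max"))   # 2×2 map keeping the largest value per window
print(pool2d(feature_map, mode="mean"))  # 2×2 map averaging each window
```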
2.3 Padding and stride of the convolution kernel
Padding: to keep the output the same size as the input, the input matrix is often padded (typically with zeros).
Stride: the number of rows and columns the convolution kernel moves across the input matrix at each step; the vertical and horizontal strides are usually equal.
To compute the output size of a convolution, the padding and stride must be fixed. Suppose the input size is $n_h \times n_w$, the kernel size is $k_h \times k_w$, the total padding added to the top and bottom is $p_h$, the total padding added to the left and right is $p_w$, the vertical stride is $s_h$, and the horizontal stride is $s_w$. The output shape is:

$$shape = \lfloor (n_h - k_h + p_h)/s_h + 1 \rfloor \times \lfloor (n_w - k_w + p_w)/s_w + 1 \rfloor$$

Doubling the stride roughly halves the output size.
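This formula can be wrapped in a small helper, as a sketch (the function name is illustrative); the examples reuse the 36×36 input and 9×9 kernel mentioned earlier.

```python
def conv_output_shape(n_h, n_w, k_h, k_w, p_h=0, p_w=0, s_h=1, s_w=1):
    """Output (height, width) of a convolution, following the formula above.
    p_h and p_w are the *total* padding added along each dimension."""
    return (n_h - k_h + p_h) // s_h + 1, (n_w - k_w + p_w) // s_w + 1

print(conv_output_shape(36, 36, 9, 9))                               # (28, 28): no padding, stride 1
print(conv_output_shape(36, 36, 9, 9, p_h=8, p_w=8))                 # (36, 36): "same" padding
print(conv_output_shape(36, 36, 9, 9, p_h=8, p_w=8, s_h=2, s_w=2))   # (18, 18): doubling the stride halves the output
```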
3 Learning in convolutional neural networks
3.1 Derivative of convolution
- The gradient of the loss function with respect to the convolution kernel parameters equals the convolution of the input with the gradient of the loss with respect to the output.
- Consistency between an operation and its derivative: a multiplication in the forward pass corresponds to a multiplication in the gradient, and a convolution corresponds to a convolution.

$${\partial L(y, \hat y) \over \partial w_{uv}} = \sum_{i=1}^{n_h-k_h+1}\sum_{j=1}^{n_w-k_w+1}{\partial L(y, \hat y) \over \partial y_{ij}}\, {\partial y_{ij}\over \partial w_{uv}} = \sum_{i=1}^{n_h-k_h+1}\sum_{j=1}^{n_w-k_w+1}{\partial L(y, \hat y) \over \partial y_{ij}}\, x_{u+i-1,\; v+j-1}$$

  where the input size is $n_h \times n_w$ and the kernel size is $k_h \times k_w$.
- The convolution of the input with the kernel in the forward pass thus turns into a convolution of the input with the layer's error term in the backward pass.
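This identity can be checked numerically. Below is a hedged sketch (the helper and the toy loss $L = \sum_{ij} y_{ij} g_{ij}$ are illustrative choices): the analytic gradient, i.e. the cross-correlation of the input with $\partial L/\partial Y$, is compared against central finite differences.

```python
import numpy as np

def cross_correlate2d(x, k):
    """Valid 2D cross-correlation (same helper as in the earlier sketch)."""
    nh, nw = x.shape
    kh, kw = k.shape
    out = np.zeros((nh - kh + 1, nw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 5))   # input patch
w = rng.normal(size=(3, 3))   # convolution kernel
g = rng.normal(size=(3, 3))   # stand-in for dL/dY, i.e. the loss is L = sum(Y * g)

# Analytic gradient: dL/dW is the cross-correlation of the input with dL/dY.
grad_w = cross_correlate2d(x, g)

# Numerical check of the same gradient by central finite differences.
eps = 1e-6
num = np.zeros_like(w)
for u in range(3):
    for v in range(3):
        wp, wm = w.copy(), w.copy()
        wp[u, v] += eps
        wm[u, v] -= eps
        num[u, v] = (np.sum(cross_correlate2d(x, wp) * g)
                     - np.sum(cross_correlate2d(x, wm) * g)) / (2 * eps)

print(np.allclose(grad_w, num, atol=1e-5))  # True: the formula matches finite differences
```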
3.2 Back propagation algorithm

- Forward propagation: input layer → convolutional layer → activation function → pooling layer
- Error backpropagation therefore passes through the pooling layer first, and then back-propagates through the activation function and the convolution operation
- Backpropagation algorithm
  - Because the pooling layer performs downsampling in the forward pass, the error term must be upsampled when propagated backward
  - Max-pooling layer: in the forward pass only one element (the maximum) is passed on, so in the backward pass the error term is routed to the neuron that produced the maximum value in the previous layer and all other positions are set to 0
  - Average-pooling layer: in the forward pass all elements contribute to the output, so the error term is distributed evenly over all neurons in the window (see the sketch at the end of this subsection)
  - The $l$-th layer is a convolutional layer:

$${\partial L(y, \hat y) \over \partial w_{uv}} = \sum_{i=1}^{n_h-k_h+1}\sum_{j=1}^{n_w-k_w+1}{\partial L(y, \hat y) \over \partial y_{ij}}\, {\partial y_{ij}\over \partial w_{uv}} = \sum_{i=1}^{n_h-k_h+1}\sum_{j=1}^{n_w-k_w+1}{\partial L(y, \hat y) \over \partial y_{ij}}\, x_{u+i-1,\; v+j-1}$$

    i.e. the gradient with respect to the kernel is the convolution of the input with the error term of this layer.
  - The $(l+1)$-th layer is a pooling layer:

$$\delta^{l,p} = {\partial L(y, \hat y) \over \partial y^{l,p}} = {\partial L(y, \hat y) \over \partial y^{l+1,p}}\, {\partial y^{l+1,p} \over \partial h^{l,p}}\, {\partial h^{l,p} \over \partial y^{l,p}} = \delta^{l+1,p}\, up(\cdot)\, {\partial h^{l,p} \over \partial y^{l,p}}$$

    where $up(\cdot) = {\partial X^{l+1,p} \over \partial h^{l,p}}$ denotes the upsampling operation from the pooling layer back to the hidden layer.
  - The $(l+1)$-th layer is a convolutional layer:

$$\delta^{l} = {\partial L(y, \hat y) \over \partial y^{l}} = {\partial L(y, \hat y) \over \partial h^{l}}\, {\partial h^{l} \over \partial y^{l}} = {\partial L(y, \hat y) \over \partial x^{l+1}}\, {\partial h^{l} \over \partial y^{l}}$$

    By the symmetry between $W$ and $X$, ${\partial L(y, \hat y) \over \partial x^{l+1}}$ is the convolution of the $(l+1)$-th layer's error term with the $(l+1)$-th layer's convolution kernel. We can therefore compute ${\partial L(y, \hat y) \over \partial y^{l}}$ recursively and from it obtain ${\partial L(y, \hat y) \over \partial w_{uv}}$.
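The sketch below illustrates how the error term is routed back through the two pooling types discussed above (a minimal example with illustrative helper names, assuming non-overlapping 2×2 windows and a single feature map).

```python
import numpy as np

def max_pool_backward(x, grad_out, size=2):
    """Route each output error back to the position of its window's maximum; other positions get 0."""
    grad_in = np.zeros_like(x)
    for i in range(grad_out.shape[0]):
        for j in range(grad_out.shape[1]):
            window = x[i * size:(i + 1) * size, j * size:(j + 1) * size]
            u, v = np.unravel_index(np.argmax(window), window.shape)
            grad_in[i * size + u, j * size + v] = grad_out[i, j]
    return grad_in

def avg_pool_backward(x, grad_out, size=2):
    """Spread each output error evenly over the elements of its pooling window."""
    grad_in = np.zeros_like(x)
    for i in range(grad_out.shape[0]):
        for j in range(grad_out.shape[1]):
            grad_in[i * size:(i + 1) * size, j * size:(j + 1) * size] = grad_out[i, j] / size ** 2
    return grad_in

x = np.array([[1., 3., 2., 0.],
              [4., 2., 1., 5.],
              [0., 1., 3., 2.],
              [2., 6., 1., 1.]])
grad_out = np.ones((2, 2))              # error term arriving from the layer above
print(max_pool_backward(x, grad_out))   # 1s only where the window maxima were
print(avg_pool_backward(x, grad_out))   # every element receives 0.25
```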
4 Other convolutions
- Transposed convolution: used to increase the spatial resolution (upsample); achieved by padding the input with more zeros.
- Fractionally strided convolution: used to upsample; equivalent to using a stride smaller than 1.
- Dilated (atrous) convolution: increases the receptive field by inserting holes (zeros) between the elements of the convolution kernel, which effectively enlarges the kernel.
- Factors that influence the receptive field: the number of layers, the kernel size, and pooling.
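A short sketch of how these operations change output shapes, assuming PyTorch is available (the layer sizes below are arbitrary examples, not from the original note).

```python
import torch
import torch.nn as nn

x = torch.randn(1, 1, 8, 8)  # (batch, channels, height, width)

# Transposed convolution: increases spatial resolution (here it doubles it).
up = nn.ConvTranspose2d(1, 1, kernel_size=2, stride=2)
print(up(x).shape)   # torch.Size([1, 1, 16, 16])

# Dilated (atrous) convolution: a 3x3 kernel with dilation=2 covers a 5x5 region,
# enlarging the receptive field without adding parameters.
dil = nn.Conv2d(1, 1, kernel_size=3, dilation=2)
print(dil(x).shape)  # torch.Size([1, 1, 4, 4]), since the effective kernel is 5x5
```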