Convolutional neural networks for machine learning -- an introduction to CNN

2022-06-28 16:13:00 Hua Weiyun

Convolutional neural networks (CNN)

1. Introduction to convolutional neural networks

Convolutional neural networks (CNNs) are a class of feedforward neural networks that use convolution operations and have a deep structure. They are one of the representative algorithms of deep learning.
Common CNN architectures include LeNet-5, VGGNet, GoogLeNet, ResNet, DenseNet, and MobileNet.
The main application scenarios for CNNs include image classification, image segmentation, object detection, and natural language processing.

2. Basic structure and principle of convolutional neural network

The basic structure of convolutional neural network

The basic structure of a CNN: INPUT -> convolution -> activation -> pooling -> fully connected -> OUTPUT
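
The pipeline above can be sketched as a tiny PyTorch model. This is only a minimal illustration: the layer sizes (8 kernels, 10 output classes, 32x32 RGB input) are assumptions chosen for the example, not values from the article.

```python
import torch
from torch import nn

# Hypothetical toy model; every size below is an illustrative assumption.
model = nn.Sequential(
    nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3, padding=1),  # convolution
    nn.ReLU(),                                                           # activation
    nn.MaxPool2d(kernel_size=2),                                         # pooling: 32x32 -> 16x16
    nn.Flatten(),                                                        # (N, 8, 16, 16) -> (N, 2048)
    nn.Linear(8 * 16 * 16, 10),                                          # fully connected
)

x = torch.randn(4, 3, 32, 32)  # INPUT: a batch of 4 random RGB images
logits = model(x)              # OUTPUT: one score per class
print(logits.shape)            # torch.Size([4, 10])
```

Real networks usually stack several convolution, activation, and pooling stages before the fully connected part.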

Convolution layer

The input image is convolved with the convolution kernels to extract higher-level features of the image.
Several parameters control the convolution process:
1. Depth: the number of convolution kernels, also called the number of neurons. It determines the number of output feature maps.

2. Stride: the distance the convolution kernel moves on each slide. It determines how many steps the kernel takes to reach the edge.

3. Padding: the number of layers of zeros added around the outer edge of the input.
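
Together, these three parameters fix the shape of the output. For an input of size n, kernel size k, stride s, and padding p, the standard output size along one dimension is floor((n + 2p - k) / s) + 1 (with no dilation). The helper names below are made up for this sketch:

```python
def conv_output_size(n: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output size along one spatial dimension (no dilation)."""
    return (n + 2 * padding - kernel) // stride + 1


def conv_output_shape(height, width, kernel, stride=1, padding=0, depth=1):
    """(depth, out_height, out_width) of the output feature maps."""
    return (
        depth,
        conv_output_size(height, kernel, stride, padding),
        conv_output_size(width, kernel, stride, padding),
    )


# 6 kernels of size 5x5, stride 1, no padding, on a 32x32 input
print(conv_output_shape(32, 32, kernel=5, depth=6))                        # (6, 28, 28)
# 16 kernels of size 3x3, stride 2, one layer of zero padding
print(conv_output_shape(32, 32, kernel=3, stride=2, padding=1, depth=16))  # (16, 16, 16)
```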

Convolution process
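
As a rough illustration of the convolution process, the sketch below slides a single kernel over a single-channel image in plain Python. It is a teaching sketch, not how a library implements it; the function names and the example values are invented here. Like deep learning frameworks, it does not flip the kernel.

```python
def zero_pad(image, padding):
    """Surround a list-of-lists image with `padding` layers of zeros."""
    width = len(image[0]) + 2 * padding
    border = [[0] * width for _ in range(padding)]
    middle = [[0] * padding + list(row) + [0] * padding for row in image]
    return border + middle + [row[:] for row in border]


def conv2d_single(image, kernel, stride=1, padding=0):
    """Slide one kernel over one single-channel image and return the feature map."""
    image = zero_pad(image, padding)
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = (len(image) - k_h) // stride + 1
    out_w = (len(image[0]) - k_w) // stride + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Multiply the kernel with the window under it and sum the products.
            total = 0
            for di in range(k_h):
                for dj in range(k_w):
                    total += image[i * stride + di][j * stride + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


image = [
    [1, 2, 3, 0],
    [0, 1, 2, 3],
    [3, 0, 1, 2],
    [2, 3, 0, 1],
]
kernel = [
    [1, 0],
    [0, 1],
]
print(conv2d_single(image, kernel))            # [[2, 4, 6], [0, 2, 4], [6, 0, 2]]
print(conv2d_single(image, kernel, stride=2))  # [[2, 6], [6, 2]]
```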

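Two main characteristics of convolutional networks:
1. Local connectivity (local receptive fields): each output value depends only on a small local region of the input.
2. Weight sharing: the same kernel weights are reused at every position of the input.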
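
To see what these two properties buy, the following sketch compares the number of learnable parameters in a convolution layer with a fully connected layer that maps the same input to an output of the same size. The layer sizes are assumptions picked for illustration.

```python
def conv_params(in_channels, out_channels, kernel_size, bias=True):
    """Parameters of a conv layer: one small kernel per (input, output) channel pair."""
    weights = out_channels * in_channels * kernel_size * kernel_size
    return weights + (out_channels if bias else 0)


def dense_params(in_features, out_features, bias=True):
    """Parameters of a fully connected layer: one weight per input/output pair."""
    return in_features * out_features + (out_features if bias else 0)


# A 3x32x32 input mapped to 16 feature maps of 30x30 (3x3 kernel, no padding).
conv = conv_params(in_channels=3, out_channels=16, kernel_size=3)
dense = dense_params(in_features=3 * 32 * 32, out_features=16 * 30 * 30)

print(conv)   # 448
print(dense)  # 44251200
```

The convolution layer needs far fewer parameters because each kernel is small (local connectivity) and is reused across all positions (weight sharing).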

Activation layer (ReLU function)

Pooling layer

Pooling downsamples, that is, it compresses the input feature map.
On the one hand, it makes the feature map smaller, which reduces the computational complexity of the network and helps control overfitting.
On the other hand, it compresses the features and keeps the main ones.
The pooling window is usually 2*2, and there are generally two kinds of operation:

  • Max pooling: take the maximum of the 4 points. This is the most common pooling method.
  • Mean pooling (also called average pooling): take the mean of the 4 points.
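
Both operations can be sketched in a few lines of plain Python. The function name and the example feature map are invented for illustration; the window is a non-overlapping 2*2 block as described above.

```python
def pool2x2(feature_map, mode="max"):
    """Apply non-overlapping 2x2 pooling to a list-of-lists feature map."""
    if mode == "max":
        reducer = max
    else:
        reducer = lambda window: sum(window) / len(window)
    pooled = []
    for i in range(0, len(feature_map) - 1, 2):
        row = []
        for j in range(0, len(feature_map[0]) - 1, 2):
            # The 4 points covered by the current window.
            window = [
                feature_map[i][j], feature_map[i][j + 1],
                feature_map[i + 1][j], feature_map[i + 1][j + 1],
            ]
            row.append(reducer(window))
        pooled.append(row)
    return pooled


feature_map = [
    [1, 3, 2, 4],
    [5, 7, 6, 8],
    [4, 2, 1, 0],
    [6, 8, 3, 5],
]
print(pool2x2(feature_map, "max"))   # [[7, 8], [8, 5]]
print(pool2x2(feature_map, "mean"))  # [[4.0, 5.0], [5.0, 2.25]]
```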

Fully connected layer

Connects all the features and sends the output to a classifier, which performs the classification.

3. Implementing convolution in PyTorch

Convolution layer

torch.nn.Conv2d()
Parameter description
in_channels: number of input channels (depth)
out_channels: number of output channels (depth), i.e. the number of convolution kernels
kernel_size: size of the filter (convolution kernel)
stride: step size of the sliding filter; the default is 1
padding: amount of zero padding added to each side of the input; the default is 0
bias: whether to add a learnable bias; the default is True
groups: controls grouped convolution; the default is 1, meaning no grouping
dilation: spacing between kernel elements; the default is 1
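
A short sketch of how these parameters affect the output shape. The channel counts and the 32x32 input are assumptions chosen for the example.

```python
import torch
from torch import nn

x = torch.randn(8, 3, 32, 32)  # batch of 8 images, 3 channels, 32x32

# 16 kernels of size 3x3; padding=1 keeps the spatial size at stride 1.
conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, stride=1, padding=1)
print(conv(x).shape)      # torch.Size([8, 16, 32, 32])
print(conv.weight.shape)  # torch.Size([16, 3, 3, 3])

# stride=2 halves the spatial size.
strided = nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1)
print(strided(x).shape)   # torch.Size([8, 16, 16, 16])

# dilation=2 spreads a 3x3 kernel over a 5x5 area, so 32 - 5 + 1 = 28.
dilated = nn.Conv2d(3, 16, kernel_size=3, dilation=2)
print(dilated(x).shape)   # torch.Size([8, 16, 28, 28])
```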

Activation layer

torch.nn.ReLU()
Parameter description
inplace: whether to modify the input tensor in place instead of returning a new one; the default is False
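
A minimal sketch of the difference between the default and inplace=True. ReLU keeps positive values and replaces negative values with 0; the example values are arbitrary.

```python
import torch
from torch import nn

relu = nn.ReLU()
x = torch.tensor([-2.0, -0.5, 0.0, 1.5])
y = relu(x)
print(y.tolist())  # [0.0, 0.0, 0.0, 1.5]
print(x.tolist())  # [-2.0, -0.5, 0.0, 1.5]  (input left untouched)

relu_inplace = nn.ReLU(inplace=True)
z = torch.tensor([-1.0, 2.0])
relu_inplace(z)    # overwrites z itself
print(z.tolist())  # [0.0, 2.0]
```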

Pooling layer

torch.nn.MaxPool2d()
torch.nn.AvgPool2d()
Parameter description
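kernel_size: size of the pooling window
stride: step size of the window; defaults to kernel_size
padding: amount of implicit padding added to each side; the default is 0
dilation: spacing between window elements (MaxPool2d only; AvgPool2d has no such parameter); the default is 1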
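
A small sketch applying both layers to one 4x4 feature map. The values are arbitrary; the tensor is reshaped to (batch, channels, height, width) because that is the layout these layers expect.

```python
import torch
from torch import nn

x = torch.tensor([
    [1.0, 3.0, 2.0, 4.0],
    [5.0, 7.0, 6.0, 8.0],
    [4.0, 2.0, 1.0, 0.0],
    [6.0, 8.0, 3.0, 5.0],
]).reshape(1, 1, 4, 4)

max_pool = nn.MaxPool2d(kernel_size=2)  # stride defaults to kernel_size
avg_pool = nn.AvgPool2d(kernel_size=2)

max_out = max_pool(x)
avg_out = avg_pool(x)
print(max_out.shape)               # torch.Size([1, 1, 2, 2])
print(max_out.squeeze().tolist())  # [[7.0, 8.0], [8.0, 5.0]]
print(avg_out.squeeze().tolist())  # [[4.0, 5.0], [5.0, 2.25]]
```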

Fully connected layer

torch.nn.Linear()
Parameter description
in_features: number of input features
out_features: number of output features
bias: whether to add a learnable bias; the default is True
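
Feature maps have to be flattened before they enter a fully connected layer, so in_features equals channels * height * width. A minimal sketch, with sizes assumed for the example:

```python
import torch
from torch import nn

features = torch.randn(4, 16, 5, 5)          # 4 samples, 16 feature maps of 5x5
flat = torch.flatten(features, start_dim=1)  # keep the batch dimension -> (4, 400)

fc = nn.Linear(in_features=16 * 5 * 5, out_features=10)
logits = fc(flat)

print(flat.shape)       # torch.Size([4, 400])
print(logits.shape)     # torch.Size([4, 10])
print(fc.weight.shape)  # torch.Size([10, 400])
```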

4. Introduction to classical convolutional neural networks

LeNet-5

LeNet-5 comes from the paper Gradient-based Learning Applied to Document Recognition, published by Yann LeCun and colleagues in 1998. It is a convolutional neural network for handwritten digit recognition.
LeNet-5 is one of the best-known models among CNN architectures and the pioneering work of convolutional neural networks.

AlexNet

In 2012, AlexNet burst onto the scene. AlexNet uses a convolutional neural network, and it won the ImageNet 2012 image recognition challenge by a large margin.
The AlexNet model consists of 5 convolutional layers, 3 pooling layers, and 3 fully connected layers. AlexNet is similar in structure to LeNet, but it uses more convolutional layers and a larger parameter space to fit the large-scale ImageNet dataset. It marks the dividing line between shallow and deep neural networks.

Introduction to the CIFAR-10 dataset

CIFAR-10 is a small dataset for recognizing general objects, compiled by Hinton's students Alex Krizhevsky and Ilya Sutskever. It contains RGB color images in 10 categories: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck. Each image is 32×32 pixels, and the dataset has 50000 training images and 10000 test images.

VGGNet

VGGNet is a deep convolutional network architecture proposed by the Visual Geometry Group (VGG) at Oxford University. With an error rate of 7.32%, it was the runner-up in the ILSVRC 2014 classification task.
VGGNet explored the relationship between the depth of a convolutional neural network and its performance. The group successfully built networks 16 to 19 layers deep and showed that increasing the depth can, to some extent, improve the final performance and substantially reduce the error rate. The architecture is also easy to extend and generalizes well when transferred to other image data. VGG is still used today to extract image features.
VGG can be seen as a deeper version of AlexNet: both consist of convolutional layers followed by fully connected layers.

GoogLeNet

GoogLeNet is a deep learning architecture proposed by a Google team in 2014. It won the ILSVRC 2014 classification task.
GoogLeNet was the first classical model to use a parallel network structure, which was of pioneering significance in the development of deep learning.
The basic building block of GoogLeNet is Inception, a parallel network block. Through continued iteration it developed into 5 versions: Inception-v1, Inception-v2, Inception-v3, Inception-v4, and Inception-ResNet.
The logic behind each iteration of the Inception family is to improve the model's generalization ability and reduce its parameter count through structural optimization.
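
The idea of a parallel network block can be sketched as follows. The article does not describe the branches, so the four-branch layout (1x1, 3x3, 5x5, and pooling paths) follows the commonly described Inception-v1 module, and the class name and channel counts are made up for illustration. Activations are left out to keep the sketch short.

```python
import torch
from torch import nn


class ParallelBlock(nn.Module):
    """Simplified Inception-style block: parallel branches concatenated on channels."""

    def __init__(self, in_channels):
        super().__init__()
        self.branch1 = nn.Conv2d(in_channels, 16, kernel_size=1)
        self.branch3 = nn.Sequential(
            nn.Conv2d(in_channels, 8, kernel_size=1),    # 1x1 conv reduces channels first
            nn.Conv2d(8, 16, kernel_size=3, padding=1),
        )
        self.branch5 = nn.Sequential(
            nn.Conv2d(in_channels, 4, kernel_size=1),
            nn.Conv2d(4, 8, kernel_size=5, padding=2),
        )
        self.branch_pool = nn.Sequential(
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
            nn.Conv2d(in_channels, 8, kernel_size=1),
        )

    def forward(self, x):
        # Every branch keeps the spatial size, so outputs can be stacked on dim 1.
        branches = [self.branch1(x), self.branch3(x), self.branch5(x), self.branch_pool(x)]
        return torch.cat(branches, dim=1)


block = ParallelBlock(in_channels=32)
out = block(torch.randn(2, 32, 28, 28))
print(out.shape)  # torch.Size([2, 48, 28, 28])  (16 + 16 + 8 + 8 channels)
```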

ResNet

ResNet (residual network) was proposed in 2015 by Kaiming He and colleagues at Microsoft Research. That year it took first place in both the classification and object detection tasks of the ImageNet competition, and first place in both object detection and image segmentation on the COCO dataset.
It uses a connection called a "shortcut connection". As the name suggests, a shortcut "takes a shortcut": the input of a block skips over the block's layers and is added to their output.
There are two kinds of ResNet block: a two-layer structure and a three-layer structure.
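
A minimal sketch of the two-layer block with a shortcut connection. The class name is invented, and the use of batch normalization and equal input and output channels are assumptions of this example rather than details from the article.

```python
import torch
from torch import nn


class BasicResidualBlock(nn.Module):
    """Two 3x3 convolutions whose output is added to the block's input."""

    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU()

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)  # shortcut connection: add the input back


block = BasicResidualBlock(channels=16)
x = torch.randn(2, 16, 8, 8)
y = block(x)
print(y.shape)  # torch.Size([2, 16, 8, 8])
```

Because the input is added back unchanged, the shapes of the input and of the convolution output must match; that is why this sketch keeps the channel count and spatial size fixed.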

MobileNet

MobileNet is a lightweight CNN proposed by Google in 2017 that targets mobile and embedded devices.
The basic unit of MobileNet is the depthwise separable convolution, which can be decomposed into two smaller operations: depthwise convolution and pointwise convolution.
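
In PyTorch, one way to sketch this decomposition is with the groups parameter of Conv2d: setting groups equal to the number of input channels gives a depthwise convolution, and a 1x1 convolution serves as the pointwise step. The class name and channel counts are assumptions for this example.

```python
import torch
from torch import nn


class DepthwiseSeparableConv(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size=3):
        super().__init__()
        # Depthwise: one filter per input channel (groups=in_channels).
        self.depthwise = nn.Conv2d(
            in_channels, in_channels, kernel_size,
            padding=kernel_size // 2, groups=in_channels,
        )
        # Pointwise: 1x1 convolution that mixes the channels.
        self.pointwise = nn.Conv2d(in_channels, out_channels, kernel_size=1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))


def count_params(module):
    return sum(p.numel() for p in module.parameters())


separable = DepthwiseSeparableConv(32, 64)
standard = nn.Conv2d(32, 64, kernel_size=3, padding=1)

x = torch.randn(1, 32, 16, 16)
print(separable(x).shape)       # torch.Size([1, 64, 16, 16])
print(count_params(separable))  # 2432
print(count_params(standard))   # 18496
```

Both layers produce the same output shape, but the separable version uses far fewer parameters, which is what makes it attractive on small devices.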

Copyright notice
This article was written by [Hua Weiyun]. Please include the original link when reposting. Thank you.
https://yzsam.com/2022/179/202206281549221913.html