
Classic Models: NiN & GoogLeNet



NiN

The problem with fully connected layers: they contain a huge number of parameters, which makes them easy to overfit.

The parameter count of a fully connected layer is usually about $\text{input channels} \times \text{image size} \times \text{output size}$.
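For example, in VGG the first fully connected layer maps a 512×7×7 feature map to 4096 units, which by the formula above is $512 \times 7 \times 7 \times 4096 \approx 102$ million parameters for that single layer.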

The idea of NiN: do away with fully connected layers entirely.

A NiN block:
[Figure: structure of a NiN block]

The convolutional layer is followed by two 1×1 convolutions with stride 1 and no padding, so their output shape matches the convolution's output. They act as per-pixel fully connected layers.
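A minimal PyTorch sketch of such a block (the function name and signature are my own, following the d2l-style formulation):

```python
import torch
from torch import nn

def nin_block(in_channels, out_channels, kernel_size, stride, padding):
    # One ordinary convolution, then two 1x1 convolutions that act as
    # per-pixel fully connected layers (with nonlinearity after each).
    return nn.Sequential(
        nn.Conv2d(in_channels, out_channels, kernel_size, stride, padding),
        nn.ReLU(),
        nn.Conv2d(out_channels, out_channels, kernel_size=1), nn.ReLU(),
        nn.Conv2d(out_channels, out_channels, kernel_size=1), nn.ReLU())
```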

The NiN architecture:

  1. No fully connected layers;
  2. Alternate NiN blocks with stride-2 max pooling layers (gradually shrinking the height and width while increasing the number of channels);
  3. Finally, use a global average pooling layer to produce the output (its number of input channels equals the number of classes);

If we want 1000 classes, the network ends with 1000 channels, and global average pooling over each channel yields that class's confidence.
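A tiny shape check of this idea (a sketch reusing the torch/nn imports from the block above; the tensor is random, just to show the shapes):

```python
x = torch.rand(1, 1000, 5, 5)       # (batch, channels = classes, height, width)
gap = nn.AdaptiveAvgPool2d((1, 1))  # average each 5x5 map down to one value
out = nn.Flatten()(gap(x))
print(out.shape)                    # torch.Size([1, 1000]): one score per class
```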

Summary:

  1. A NiN block uses one convolutional layer plus two 1×1 convolutional layers; the latter add per-pixel nonlinearity;
  2. Global average pooling replaces the fully connected layers of VGG and AlexNet; it has few parameters and is less prone to overfitting.

[Figure: the full NiN architecture]
The hyperparameters follow AlexNet's settings, with some 1×1 convolutions added.
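A sketch of the whole network under those AlexNet-style hyperparameters, reusing nin_block from above (the single input channel and 10 classes are assumptions for illustration, not part of the figure):

```python
net = nn.Sequential(
    nin_block(1, 96, kernel_size=11, stride=4, padding=0),
    nn.MaxPool2d(3, stride=2),
    nin_block(96, 256, kernel_size=5, stride=1, padding=2),
    nn.MaxPool2d(3, stride=2),
    nin_block(256, 384, kernel_size=3, stride=1, padding=1),
    nn.MaxPool2d(3, stride=2), nn.Dropout(0.5),
    # the last NiN block outputs as many channels as there are classes
    nin_block(384, 10, kernel_size=3, stride=1, padding=1),
    nn.AdaptiveAvgPool2d((1, 1)),   # global average pooling
    nn.Flatten())                   # (batch, 10) class scores
```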

GoogLeNet

How do we choose the best hyperparameters: the convolution kernel size, the pooling method, the number of channels?

Inception block: run all the convolution options in parallel, then concatenate at the end (height and width unchanged; the outputs are joined along the channel dimension).

[Figure: structure of an Inception block]
As you can see, the white blocks serve to reduce model complexity (i.e., the parameter count) by changing the number of channels, while the blue blocks extract information.
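A sketch of such a block in PyTorch (the class name and the c1–c4 argument convention are my own; it follows the four-path structure described above):

```python
from torch.nn import functional as F

class Inception(nn.Module):
    """Four parallel paths; c1-c4 give each path's output channels."""
    def __init__(self, in_channels, c1, c2, c3, c4):
        super().__init__()
        # Path 1: a single 1x1 convolution
        self.p1_1 = nn.Conv2d(in_channels, c1, kernel_size=1)
        # Path 2: 1x1 convolution (channel reduction) then 3x3 convolution
        self.p2_1 = nn.Conv2d(in_channels, c2[0], kernel_size=1)
        self.p2_2 = nn.Conv2d(c2[0], c2[1], kernel_size=3, padding=1)
        # Path 3: 1x1 convolution (channel reduction) then 5x5 convolution
        self.p3_1 = nn.Conv2d(in_channels, c3[0], kernel_size=1)
        self.p3_2 = nn.Conv2d(c3[0], c3[1], kernel_size=5, padding=2)
        # Path 4: 3x3 max pooling then 1x1 convolution
        self.p4_1 = nn.MaxPool2d(kernel_size=3, stride=1, padding=1)
        self.p4_2 = nn.Conv2d(in_channels, c4, kernel_size=1)

    def forward(self, x):
        p1 = F.relu(self.p1_1(x))
        p2 = F.relu(self.p2_2(F.relu(self.p2_1(x))))
        p3 = F.relu(self.p3_2(F.relu(self.p3_1(x))))
        p4 = F.relu(self.p4_2(self.p4_1(x)))
        # Height and width are unchanged, so concatenate along channels
        return torch.cat((p1, p2, p3, p4), dim=1)
```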

The design of first reducing and then increasing the number of channels has a bottleneck flavor.

Compared with a single 3×3 or 5×5 convolution, an Inception block has fewer parameters and lower computational complexity.
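As a rough worked example (using the channel allocation of GoogLeNet's first Inception block, with 192 input channels, and ignoring biases): a direct 5×5 convolution producing 32 channels costs $192 \times 32 \times 25 \approx 153.6\text{k}$ parameters, while the Inception path of a 1×1 reduction to 16 channels followed by a 5×5 convolution costs $192 \times 16 + 16 \times 32 \times 25 \approx 15.9\text{k}$.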

At the same time, Inception blocks also increase the diversity of the information the network learns.

[Figure: GoogLeNet stages 1 and 2]
Stage 1 and stage 2 are similar to VGG. GoogLeNet borrows heavily from the NiN idea, using many 1×1 convolutions to cut the number of parameters.
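A sketch of those two stages (again assuming a single-channel input purely for illustration, reusing nn from above):

```python
# Stage 1: 7x7 convolution stem; Stage 2: 1x1 then 3x3 (NiN-style reduction)
b1 = nn.Sequential(nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3),
                   nn.ReLU(),
                   nn.MaxPool2d(kernel_size=3, stride=2, padding=1))
b2 = nn.Sequential(nn.Conv2d(64, 64, kernel_size=1), nn.ReLU(),
                   nn.Conv2d(64, 192, kernel_size=3, padding=1), nn.ReLU(),
                   nn.MaxPool2d(kernel_size=3, stride=2, padding=1))
```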

[Figure: GoogLeNet compared with AlexNet]
Compared with AlexNet, GoogLeNet's convolution kernels are relatively small, so spatial information is not compressed too quickly, which supports learning as the number of channels grows in later layers.

At the same time, I think compressing the spatial information is an unavoidable trade-off that comes with increasing the number of channels; the goal is to keep the parameter count down.

[Figure: GoogLeNet stage 3]
In the third stage, the number of channels keeps increasing, but each Inception block allocates its channels differently. Notably, the 3×3 convolution always gets the largest allocation, because its parameter count is modest while it still extracts information well.
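A sketch of stage 3 reusing the Inception class above, with the per-block channel allocations from the original GoogLeNet paper (inception 3a and 3b); note the 3×3 path gets the largest share in both blocks:

```python
b3 = nn.Sequential(Inception(192, 64, (96, 128), (16, 32), 32),    # -> 256 ch
                   Inception(256, 128, (128, 192), (32, 96), 64),  # -> 480 ch
                   nn.MaxPool2d(kernel_size=3, stride=2, padding=1))
```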

The Inception block has many later variants: V2 added BN, V3 modified the convolution sizes, and V4 added residual connections.

Original post: https://yzsam.com/2022/177/202206260240556143.html