TensorFlow introductory tutorial (38) -- U2-Net
2022-07-24 17:33:00 【51CTO】
Today we look at U2-Net, an improved version of U-Net. The model comes from the 2020 paper "U2-Net: Going Deeper with Nested U-Structure for Salient Object Detection". Once the idea behind the model is understood, the same improvement can be applied on top of VNet.
I. Advantages of U2-Net
1. U2-Net is a simple but powerful deep network architecture for salient object detection. It is a two-level nested U-structure. The design has two advantages: (1) the residual U-block (RSU) mixes receptive fields of different sizes, so it can capture more contextual information at different scales; (2) because pooling operations are used inside the RSU blocks, the depth of the whole network can be increased without significantly increasing the computational cost.
2. U2-Net was designed to answer two questions: (1) Can a new network be trained from scratch and still match or outperform models built on existing pre-trained backbones? (2) Can the network go deeper while keeping high-resolution feature maps and low memory and computation costs?
II. U2-Net network structure
1. Residual U-block (RSU)
The figure in the paper shows several common convolution blocks. Blocks (a) to (c) can only capture local features, because their convolution kernels are too small to capture global ones. To obtain global information from high-resolution feature maps, the most direct idea is to enlarge the receptive field. Block (d) does this with dilated (atrous) convolutions, which extract both local and non-local features, but this requires more computation and memory. Following the idea of U-Net, the paper proposes the residual U-block (RSU), which consists of three parts:
a. An input convolution layer. This is a plain convolution layer used to extract local features.
b. A U-Net-like encoder-decoder structure whose input is the output of the input convolution layer. It extracts and encodes multi-scale context information. L denotes the depth of the block: the larger L is, the larger the receptive field and the richer the local and global features. The input feature map is progressively downsampled by pooling and convolved to extract multi-scale features, which are then encoded back into a high-resolution feature map by progressive upsampling, concatenation and convolution.
c. A residual connection that fuses the local features with the multi-scale features by summation.
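To make the receptive-field argument concrete, here is a small framework-free sketch. The function name `receptive_field` and the three layer stacks are illustrative choices, not taken from the paper. It applies the standard recurrence for the receptive field of stacked convolution and pooling layers.

```python
def receptive_field(layers):
    """Receptive field (in input pixels) of a stack of layers.

    Each layer is a tuple (kernel_size, stride, dilation).
    """
    rf = 1    # a single output pixel sees one input pixel before any layer
    jump = 1  # distance in input pixels between two adjacent outputs
    for kernel, stride, dilation in layers:
        rf += (kernel - 1) * dilation * jump
        jump *= stride
    return rf


conv = (3, 1, 1)  # 3x3 convolution, stride 1, no dilation
pool = (2, 2, 1)  # 2x2 pooling, stride 2

plain = [conv, conv, conv]                    # three plain 3x3 convolutions
dilated = [(3, 1, 1), (3, 1, 2), (3, 1, 4)]   # same kernels, growing dilation
pooled = [conv, pool, conv, pool, conv]       # U-style: pool between convolutions

print(receptive_field(plain))    # 7
print(receptive_field(dilated))  # 15
print(receptive_field(pooled))   # 18
```

With the same three 3x3 convolutions, both dilation and pooling widen the receptive field. The difference is that pooling also shrinks the feature map that the later convolutions process, whereas dilated convolutions keep working at full resolution.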
The design idea is to extract multi-scale features directly inside each residual block. The U-structure adds little computational overhead, because most of its operations run on downsampled feature maps, so the block is efficient.
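The efficiency claim can be checked with rough arithmetic. The sketch below assumes a simplified RSU encoder layout (a 2x2 pooling after every convolution except the last two, with a dilated convolution at the bottom) and the same channel count in every layer, so pixel counts stand in for cost. The function names, the depth of 7 and the 288x288 input are illustrative assumptions.

```python
def rsu_encoder_sizes(height, width, depth):
    """Spatial size seen by each of the `depth` encoder convolutions.

    Simplified layout (assumption): a 2x2 pooling follows every convolution
    except the last two; the bottom convolution is dilated instead of pooled.
    """
    sizes = []
    h, w = height, width
    for level in range(depth - 1):
        sizes.append((h, w))
        if level < depth - 2:
            h, w = (h + 1) // 2, (w + 1) // 2  # pooling with ceil rounding
    sizes.append((h, w))  # bottom dilated convolution keeps the resolution
    return sizes


def relative_cost(height, width, depth):
    """Pixels processed, relative to `depth` convolutions at full resolution."""
    sizes = rsu_encoder_sizes(height, width, depth)
    processed = sum(h * w for h, w in sizes)
    return processed / (depth * height * width)


sizes = rsu_encoder_sizes(288, 288, 7)
print(sizes)
print(round(relative_cost(288, 288, 7), 3))
```

Under these assumptions the seven encoder convolutions touch roughly a fifth of the pixels that seven full-resolution convolutions would, which is the intuition behind running most operations on downsampled maps.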

2. U2-Net architecture
U2-Net is a U-shaped network made up of 11 stages. Each stage is a residual U-block (RSU) with its own configuration. The nested U-structure can therefore extract multi-scale features within each stage more effectively and aggregate multi-level features across stages. U2-Net consists of three parts:
a. A six-stage encoder built from RSU blocks, where L is chosen according to the resolution of the input feature map. The first four stages use the pooling version of the RSU, and the last two use the dilated-convolution version, because the resolution is already low in the later stages and further pooling would lose information.
b. A five-stage decoder. Each decoder stage has a structure similar to that of its corresponding encoder stage, and its input is the concatenation of the upsampled output of the previous stage and the output of the encoder stage at the same level.
c. A saliency map fusion module attached to the decoder stages and the last encoder stage. Six saliency probability maps are produced by a 3x3 convolution followed by a sigmoid function. The six maps are then upsampled to the size of the input image and concatenated, and a 1x1 convolution followed by a sigmoid function generates the fused probability map.
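The fusion step in part c can be sketched without any framework. A 1x1 convolution over concatenated single-channel maps is a per-pixel weighted sum plus a bias, so plain lists are enough to show it. This is a toy illustration under stated assumptions: it uses three small hypothetical maps instead of six, made-up weights and bias, and nearest-neighbour upsampling for brevity. The helper names are illustrative.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def upsample_nearest(grid, factor):
    """Enlarge a 2-D list by an integer factor (nearest neighbour)."""
    out = []
    for row in grid:
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def fuse(side_maps, weights, bias=0.0):
    """1x1 convolution over concatenated single-channel maps, then sigmoid."""
    h, w = len(side_maps[0]), len(side_maps[0][0])
    return [[sigmoid(bias + sum(wk * m[i][j] for wk, m in zip(weights, side_maps)))
             for j in range(w)]
            for i in range(h)]


# Toy side outputs at full, half and quarter resolution (hypothetical values).
full = [[0.9, 0.8, 0.1, 0.0],
        [0.9, 0.7, 0.2, 0.1],
        [0.2, 0.1, 0.0, 0.0],
        [0.1, 0.0, 0.0, 0.0]]
half = [[0.8, 0.1],
        [0.1, 0.0]]
quarter = [[0.3]]

# Bring every map to the input size, then fuse them pixel by pixel.
resized = [full, upsample_nearest(half, 2), upsample_nearest(quarter, 4)]
fused = fuse(resized, weights=[1.0, 1.0, 1.0], bias=-1.5)

for row in fused:
    print([round(v, 2) for v in row])
```

In a trained network the weights and bias of the 1x1 convolution are learned, so the model decides how much each side output contributes to the fused map.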

III. Experimental details and results
1. Evaluation metrics: the precision-recall (PR) curve, the maximal F-measure, the mean absolute error (MAE), the weighted F-measure, the structure measure, and the boundary-related F-measure.
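Two of these metrics are simple enough to sketch directly. The code below is a per-image illustration, assuming predictions in [0, 1] and a binary ground-truth mask. The value beta_sq=0.3 is the usual convention in salient object detection and is not stated in this article, and the function names and toy data are hypothetical. Benchmark evaluations typically aggregate precision and recall over a whole dataset, which this sketch does not do.

```python
def mean_absolute_error(pred, target):
    """Mean absolute difference between two equally sized 2-D maps."""
    diffs = [abs(p - t) for pr, tr in zip(pred, target) for p, t in zip(pr, tr)]
    return sum(diffs) / len(diffs)


def f_measure(pred, target, threshold, beta_sq=0.3):
    """F-measure of `pred` binarised at `threshold` against a binary mask."""
    tp = fp = fn = 0
    for pr, tr in zip(pred, target):
        for p, t in zip(pr, tr):
            hit = p >= threshold
            if hit and t == 1:
                tp += 1
            elif hit:
                fp += 1
            elif t == 1:
                fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


def max_f_measure(pred, target, steps=255):
    """Best F-measure over evenly spaced thresholds in (0, 1]."""
    return max(f_measure(pred, target, k / steps) for k in range(1, steps + 1))


pred = [[0.9, 0.6],
        [0.4, 0.1]]
target = [[1, 1],
          [0, 0]]

print(mean_absolute_error(pred, target))
print(f_measure(pred, target, threshold=0.5))
print(max_f_measure(pred, target))
```

A lower MAE is better, while a higher F-measure is better. The maximal F-measure removes the dependence on a single binarisation threshold.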
2. Training process: each original image is first resized to 320x320, then randomly flipped and cropped to 288x288. All convolution layer weights are initialized with Xavier initialization. The model is trained with deep supervision: a cross-entropy loss is computed between the ground truth and each side output as well as the final fused output, and the weighted sum of these terms is the loss function. In the paper, the authors set all loss weights to 1 and use the Adam optimizer with its default parameters. Images are resized with bilinear interpolation.
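The deep supervision loss described above can be sketched as a weighted sum of binary cross-entropy terms. This is a minimal framework-free illustration, assuming that all outputs are probability maps already resized to the ground-truth size. The function names, the toy maps and the use of only two side outputs instead of six are assumptions made for brevity.

```python
import math


def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy between a probability map and a binary mask."""
    total, count = 0.0, 0
    for pr, tr in zip(pred, target):
        for p, t in zip(pr, tr):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
            count += 1
    return total / count


def deep_supervision_loss(side_outputs, fused_output, target,
                          side_weights=None, fuse_weight=1.0):
    """Weighted sum of the side-output losses plus the fused-output loss."""
    if side_weights is None:
        side_weights = [1.0] * len(side_outputs)  # all weights 1, as in the paper
    side_term = sum(w * bce(s, target) for w, s in zip(side_weights, side_outputs))
    return side_term + fuse_weight * bce(fused_output, target)


target = [[1, 0],
          [0, 1]]
side_a = [[0.8, 0.3],
          [0.2, 0.7]]
side_b = [[0.6, 0.4],
          [0.5, 0.6]]
fused_map = [[0.9, 0.1],
             [0.1, 0.9]]

loss = deep_supervision_loss([side_a, side_b], fused_map, target)
print(round(loss, 4))
```

Because every side output contributes its own term, the intermediate stages receive a direct training signal instead of relying only on gradients that flow back from the fused output.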
3. Comparison of results
U2-Net is compared with 20 other methods on six datasets, and it is reported to achieve the best results in both qualitative and quantitative evaluation.