SaccadeNet: refining predicted boxes with corner features in a two-stage fashion | CVPR 2020

2022-06-24 08:08:00 【VincentLee】

SaccadeNet first localizes objects coarsely from center-point features, then refines each preliminary box using features sampled at its corners and center. The overall idea resembles a two-stage object detector, except that the second stage replaces the region features of the predicted box with point features. SaccadeNet performs well in both accuracy and speed, and the overall design is well thought out.

Source: the WeChat official account "Xiaofei's Algorithm Engineering Notes"

Paper: SaccadeNet: A Fast and Accurate Object Detector

Paper URL: https://arxiv.org/abs/2003.12125
Code: https://github.com/voidrank/SaccadeNet

Introduction

In neurology, humans do not stare fixedly at a scene when locating an object. Instead, they glance around at information-rich regions that help localize it. Inspired by this mechanism, the paper proposes SaccadeNet, which efficiently attends to the information-rich keypoints of an object and localizes it from coarse to fine.

The structure of SaccadeNet is shown in Figure 2. It first makes a preliminary prediction of each object's center and corner positions, then uses the features at the four corners and the center to refine the box by regression. SaccadeNet has four modules:

Center Attentive Module (Center-Attn): predicts the center position and category of each object.
Attention Transitive Module (Attn-Trans): makes a preliminary prediction of the corner positions for each center position.
Aggregation Attentive Module (Aggregation-Attn): uses the features at the center and corner positions to refine the predicted box.
Corner Attentive Module (Corner-Attn): strengthens the object-boundary features of the backbone network.

The overall idea of SaccadeNet is appealing. It is somewhat like a two-stage object detection scheme in which the second-stage box regression uses point features instead of region features.

Center Attentive Module

The Center-Attn module consists of two simple convolution layers that transform the backbone's output feature map into a center-point heat map. The heat map is used to predict the center position and category of every object in the image. The GT is built the same way as in CornerNet: each GT position is spread out with the Gaussian kernel $e^{-\frac{||X-X_k||^2}{2{\sigma}^2}}$, where $\sigma$ is 1/3 of the radius. The radius depends on the object size and is chosen so that any point within it yields a predicted box with IoU of at least 0.3. The module is trained with a variant of focal loss defined over the heat map, in which $p_{i,j}$ is the score at location $(i,j)$ of the heat map and $y_{i,j}$ is the corresponding GT value.
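To make the heat-map target and its loss concrete, here is a minimal pure-Python sketch. It assumes the CornerNet-style penalty-reduced focal loss with `alpha=2` and `beta=4`; the function names, the default hyperparameters, and the single-class grid layout are illustrative assumptions and are not taken from the SaccadeNet code.

```python
import math


def gaussian_heatmap(height, width, center, sigma):
    """Build a height x width grid with an unnormalised Gaussian peak at `center`.

    `center` is (cx, cy) in grid coordinates; the peak value is exactly 1.
    """
    cx, cy = center
    return [
        [math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
         for x in range(width)]
        for y in range(height)
    ]


def focal_loss(pred, target, alpha=2.0, beta=4.0, eps=1e-12):
    """Penalty-reduced focal loss (assumed CornerNet form), averaged over peaks.

    Cells where the target equals 1 are positives; all other cells are
    negatives whose penalty is reduced by (1 - y) ** beta near a peak.
    """
    pos_loss, neg_loss, num_pos = 0.0, 0.0, 0
    for pred_row, target_row in zip(pred, target):
        for p, y in zip(pred_row, target_row):
            p = min(max(p, eps), 1.0 - eps)  # keep log() finite
            if y == 1.0:
                pos_loss += (1.0 - p) ** alpha * math.log(p)
                num_pos += 1
            else:
                neg_loss += (1.0 - y) ** beta * p ** alpha * math.log(1.0 - p)
    return -(pos_loss + neg_loss) / max(num_pos, 1)


# Hypothetical example: radius 3 gives sigma = radius / 3 = 1.
target = gaussian_heatmap(5, 5, center=(2, 2), sigma=1.0)
uniform_guess = [[0.5] * 5 for _ in range(5)]
print(focal_loss(uniform_guess, target))
```

A prediction that matches the target closely should score a lower loss than the uniform guess above, which is what the `(1 - y) ** beta` down-weighting near the peak is meant to allow.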

Attention Transitive Module

The output of the Attn-Trans module has size $w_f\times h_f\times 2$ and predicts, for each position, the width and height of the corresponding box. The corner positions are then computed from the center point $(i,j)$ as $(i-w_{i,j}/2, j-h_{i,j}/2)$, $(i-w_{i,j}/2, j+h_{i,j}/2)$, $(i+w_{i,j}/2, j-h_{i,j}/2)$ and $(i+w_{i,j}/2, j+h_{i,j}/2)$. The module is trained with an L1 regression loss. With the Center-Attn and Attn-Trans modules alone, SaccadeNet can already produce a preliminary detection result. In addition, the paper's source code also predicts a center-point offset in this module to compensate for the misalignment caused by downsampling. The offset is trained with an L1 regression loss as well and is enabled by default.
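The corner computation above is simple arithmetic, sketched below in pure Python. The function names are hypothetical, and the L1 loss is shown as a plain mean of absolute errors, which is an assumption about the reduction rather than something the article states.

```python
def corners_from_center(i, j, w, h):
    """Return the four corners of a box centred at (i, j) with width w, height h.

    The order follows the article: (i - w/2, j - h/2), (i - w/2, j + h/2),
    (i + w/2, j - h/2), (i + w/2, j + h/2).
    """
    half_w, half_h = w / 2.0, h / 2.0
    return [
        (i - half_w, j - half_h),
        (i - half_w, j + half_h),
        (i + half_w, j - half_h),
        (i + half_w, j + half_h),
    ]


def l1_loss(pred, target):
    """Mean absolute error between two equally long sequences of numbers."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


# Hypothetical example: a centre at (10, 20) with predicted size 4 x 6.
print(corners_from_center(10, 20, 4, 6))
# Predicted (w, h) = (4, 6) against a ground truth of (5, 4).
print(l1_loss([4.0, 6.0], [5.0, 4.0]))
```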

Aggregation Attentive Module

Aggregation-Attn is a lightweight module that refines the predicted box and outputs a more accurate one. It takes the object's corners and center from the Attn-Trans and Center-Attn modules, samples the features at those positions from the backbone's output feature map using bilinear interpolation, and finally regresses correction values for the width and height. The whole module is trained with an L1 loss.
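The sampling step can be sketched as follows. This is a simplified illustration under stated assumptions: the feature map is a single-channel 2-D list indexed as `feature[row][col]`, out-of-range points are clamped to the border, and the helper names are hypothetical. The real module works on multi-channel feature maps and feeds the sampled features to a learned regression head, which is not shown here.

```python
import math


def bilinear_sample(feature, x, y):
    """Sample a 2-D grid at fractional position (x = column, y = row)."""
    height, width = len(feature), len(feature[0])
    # Clamp so that points slightly outside the map still return a value.
    x = min(max(x, 0.0), width - 1.0)
    y = min(max(y, 0.0), height - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    dx, dy = x - x0, y - y0
    # Interpolate along x on the two neighbouring rows, then along y.
    top = feature[y0][x0] * (1 - dx) + feature[y0][x1] * dx
    bottom = feature[y1][x0] * (1 - dx) + feature[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def gather_point_features(feature, center, width, height):
    """Sample the centre and the four corners of a box, centre first."""
    cx, cy = center
    half_w, half_h = width / 2.0, height / 2.0
    points = [
        (cx, cy),
        (cx - half_w, cy - half_h),
        (cx - half_w, cy + half_h),
        (cx + half_w, cy - half_h),
        (cx + half_w, cy + half_h),
    ]
    return [bilinear_sample(feature, x, y) for x, y in points]


# Hypothetical 2 x 2 feature map and a box centred between its four cells.
feature = [[0.0, 1.0],
           [2.0, 3.0]]
print(gather_point_features(feature, center=(0.5, 0.5), width=1.0, height=1.0))
```

Sampling five points instead of pooling a whole region is what keeps this stage lightweight compared with the region-feature second stage of a conventional two-stage detector.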

Corner Attentive Module in Training

To extract information-rich corner features, the paper adds an extra Corner-Attn branch during training. It transforms the backbone features into a four-channel heat map, with one channel for each of the four corners of an object. As with the center heat map, the branch is trained with focal loss against Gaussian heat maps, and it is class-agnostic. This module can perform the refinement iteratively over several rounds, similar to Cascade R-CNN, and the paper includes a comparison of this in its experiments.

Relation to existing methods

Current keypoint-based object detection methods can be divided into edge-keypoint-based detectors and center-keypoint-based detectors. SaccadeNet combines the advantages of both.

Edge-keypoint-based detectors usually detect corners or extreme points first and then group the keypoints to locate the object. However, this kind of algorithm usually cannot capture global information about the object: a) corner features themselves contain little information about the object, so additional center features are needed to enhance them; b) corners usually fall on background pixels and carry less information than other keypoints. Although SaccadeNet also uses corners for prediction, it predicts objects directly from the center keypoint. This gives it access to global object information and avoids the time-consuming keypoint grouping step.

Center-keypoint-based detectors predict objects from center keypoints: they output a center-point heat map and regress the box boundaries directly. But the center point is usually far from the object boundary, so it can be hard to predict an accurate boundary, especially for large objects. Corner keypoints, by contrast, are closest to the boundary and contain a lot of accurate local information, so the lack of corner information may hurt the prediction. SaccadeNet fills exactly this gap and predicts boundaries more accurately.

Experiments

Comparison with SOTA object detection algorithms.

Ablation experiments on the Attn-Trans and Aggregation-Attn modules.

Comparison of the number of Corner-Attn module iterations.

Conclusion

SaccadeNet first localizes objects coarsely from center-point features, then refines each preliminary box using features sampled at its corners and center. The overall idea resembles a two-stage object detector in which the second stage replaces the region features of the predicted box with point features. SaccadeNet performs well in both accuracy and speed, and the overall design is well thought out.

Copyright notice
This article was written by [VincentLee]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2021/06/20210628163530220y.html