SaccadeNet: refining predicted boxes in two stages with corner features | CVPR 2020
2022-06-24 08:08:00 【VincentLee】
SaccadeNet first localizes objects coarsely from center-point features, then fine-tunes each preliminary box using the features at its corners and center. The overall idea resembles a two-stage object detector, except that the second stage uses point features in place of the region features of the predicted box. SaccadeNet performs well in both accuracy and speed, and the overall idea is a good one.
Source: Xiaofei's Algorithm Engineering Notes (WeChat official account)
Paper: SaccadeNet: A Fast and Accurate Object Detector
- Paper link: https://arxiv.org/abs/2003.12125
- Paper code: https://github.com/voidrank/SaccadeNet
Introduction
From a neurological perspective, humans do not stare at the whole scene when locating an object. Instead, they glance around at information-rich regions that help pin the object down. Inspired by this mechanism, the paper proposes SaccadeNet, which efficiently attends to informative object keypoints and localizes objects from coarse to fine.
The structure of SaccadeNet is shown in Figure 2. The center position and corner positions of each object are first predicted coarsely, and the features at the four corner positions and the center position are then used to refine the regression. SaccadeNet has four modules:
- Center Attentive Module (Center-Attn): predicts the center position and category of each object.
- Attention Transitive Module (Attn-Trans): makes a preliminary prediction of the corner positions that correspond to each center position.
- Aggregation Attentive Module (Aggregation-Attn): uses the features at the center and corner positions to refine the predicted box.
- Corner Attentive Module (Corner-Attn): strengthens the object-boundary features of the backbone network.
The overall idea of SaccadeNet is a good one. It is somewhat like a two-stage object detection scheme in which the second-stage box regression switches from region features to point features.
Center Attentive Module
The Center-Attn module contains two simple convolution layers that transform the feature map output by the backbone network into a center-point heat map. The heat map is used to predict the center position and category of every object in the image. The ground truth (GT) follows the same settings as CornerNet: each GT position is spread out with the Gaussian kernel $e^{-\frac{||X-X_k||^2}{2{\sigma}^2}}$, where $\sigma$ is 1/3 of the radius. The radius is determined by the object size and is chosen so that a box generated from any point within the radius has an IoU of at least 0.3 with the GT box. The module is trained with a loss function based on focal loss.
In that loss, $p_{i,j}$ is the score at position $(i,j)$ on the heat map and $y_{i,j}$ is the corresponding GT value.
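As a rough illustration of the GT construction and loss described above, the sketch below spreads a single center with a decaying Gaussian and evaluates a CornerNet-style penalty-reduced focal loss. The function names, the exponents `alpha=2` and `beta=4`, and the exact form of the loss are assumptions borrowed from CornerNet, since the article does not reproduce the formula.

```python
import numpy as np

def gaussian_heatmap(height, width, center, sigma):
    """Spread one GT center (row i, column j) over a height x width map."""
    ci, cj = center
    ii, jj = np.mgrid[0:height, 0:width]
    dist_sq = (ii - ci) ** 2 + (jj - cj) ** 2
    return np.exp(-dist_sq / (2.0 * sigma ** 2))

def penalty_reduced_focal_loss(pred, gt, alpha=2.0, beta=4.0, eps=1e-6):
    """Assumed CornerNet-style focal loss, normalized by the number of centers."""
    pred = np.clip(pred, eps, 1.0 - eps)
    pos = gt == 1.0  # exact GT center locations
    pos_loss = ((1.0 - pred) ** alpha * np.log(pred))[pos].sum()
    # Negatives close to a center are down-weighted by (1 - gt)^beta.
    neg_loss = ((1.0 - gt) ** beta * pred ** alpha * np.log(1.0 - pred))[~pos].sum()
    num_pos = max(int(pos.sum()), 1)
    return -(pos_loss + neg_loss) / num_pos

radius = 6.0  # hypothetical radius derived from the object size
gt = gaussian_heatmap(32, 32, center=(10, 20), sigma=radius / 3.0)
loss_good = penalty_reduced_focal_loss(gt, gt)                      # prediction matches GT
loss_bad = penalty_reduced_focal_loss(np.full_like(gt, 0.5), gt)    # uninformative prediction
```

A prediction that matches the heat map should give a much smaller loss than a flat 0.5 map, which is what makes this loss usable for dense center prediction.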
Attention Transitive Module
The output of the Attn-Trans module has size $w_f\times h_f\times 2$: it predicts the width and height of the box corresponding to each position. The corner positions are then computed from the center point $(i,j)$ as $(i-w_{i,j}/2, j-h_{i,j}/2)$, $(i-w_{i,j}/2, j+h_{i,j}/2)$, $(i+w_{i,j}/2, j-h_{i,j}/2)$ and $(i+w_{i,j}/2, j+h_{i,j}/2)$, and the module is trained with an L1 regression loss. With the Center-Attn and Attn-Trans modules alone, SaccadeNet can already produce a preliminary detection result. In addition, the paper's source code also predicts a center-point offset in this module to compensate for the misalignment caused by downsampling. The offset is likewise trained with an L1 regression loss and is enabled by default.
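The corner computation above can be sketched in a few lines. This is a minimal illustration that follows the article's convention of pairing $i$ with the width and $j$ with the height; the function names, the optional `offset` argument and the mean-reduced L1 loss are assumptions, not the repository's API.

```python
import numpy as np

def corners_from_center(center, size, offset=(0.0, 0.0)):
    """Return the four corners implied by a center (i, j) and a size (w, h).

    `offset` is the optional sub-pixel correction of the center.
    """
    i = center[0] + offset[0]
    j = center[1] + offset[1]
    w, h = size
    return np.array([
        [i - w / 2, j - h / 2],
        [i - w / 2, j + h / 2],
        [i + w / 2, j - h / 2],
        [i + w / 2, j + h / 2],
    ])

def l1_loss(pred, target):
    """Mean absolute error between predicted and GT width/height."""
    return np.abs(np.asarray(pred, dtype=float) - np.asarray(target, dtype=float)).mean()

corners = corners_from_center(center=(10, 20), size=(4, 6))
size_loss = l1_loss(pred=(4.0, 6.0), target=(5.0, 6.0))
```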
Aggregation Attentive Module
Aggregation-Attn is a lightweight module that fine-tunes the predicted box to output a more accurate one. It takes the corner and center positions of each object from the Attn-Trans and Center-Attn modules, samples the features at those positions from the backbone feature map with bilinear interpolation, and finally regresses correction values for the width and height. The whole module is trained with an L1 loss.
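The sampling step can be illustrated as follows. This is a simplified NumPy sketch, assuming a feature map of shape `(C, H, W)` and points given as `(x, y)` in feature-map coordinates; the function names are hypothetical, and the regression head that would map the gathered features to width and height corrections is left out because the article does not describe its layers.

```python
import numpy as np

def bilinear_sample(feature, x, y):
    """Sample a (C, H, W) feature map at a continuous point (x, y)."""
    _, h, w = feature.shape
    x = float(np.clip(x, 0, w - 1))
    y = float(np.clip(y, 0, h - 1))
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    dx, dy = x - x0, y - y0
    # Interpolate along x on the two neighbouring rows, then along y.
    top = (1 - dx) * feature[:, y0, x0] + dx * feature[:, y0, x1]
    bottom = (1 - dx) * feature[:, y1, x0] + dx * feature[:, y1, x1]
    return (1 - dy) * top + dy * bottom

def gather_point_features(feature, center, corners):
    """Concatenate the features at the center and the four corners."""
    points = [center] + list(corners)
    return np.concatenate([bilinear_sample(feature, x, y) for x, y in points])

# Toy feature map: channel 0 stores x, channel 1 stores y.
ys, xs = np.mgrid[0:8, 0:8]
feature = np.stack([xs, ys]).astype(float)

sampled = bilinear_sample(feature, 2.5, 1.25)
point_features = gather_point_features(
    feature, center=(4.0, 4.0),
    corners=[(2.5, 2.5), (5.5, 2.5), (2.5, 5.5), (5.5, 5.5)],
)
```

With `C` channels, the gathered vector has `5 * C` entries, one group per sampled point, which keeps the second stage much cheaper than pooling a whole region.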
Corner Attentive Module in Training
To extract information-rich corner features, the paper adds an extra Corner-Attn branch during training. It transforms the backbone features into a four-channel heat map, with one channel for each of the four corners of an object. As with the center heat map, the branch is trained with focal loss against Gaussian heat maps, and it is class-agnostic. This module can perform the fine-tuning iteratively several times, in a manner similar to Cascade R-CNN, and the paper compares this in the experimental section.
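A possible way to build the four-channel, class-agnostic corner target is sketched below. The channel order (top-left, top-right, bottom-left, bottom-right), the fixed `sigma`, and the function name are assumptions made for illustration; the article only states that each channel corresponds to one corner and that Gaussian heat maps are used.

```python
import numpy as np

def corner_heatmaps(height, width, boxes, sigma):
    """Build a (4, H, W) corner target from boxes given as (x1, y1, x2, y2).

    All object classes share the same four channels (class-agnostic).
    """
    ys, xs = np.mgrid[0:height, 0:width]
    maps = np.zeros((4, height, width))
    for x1, y1, x2, y2 in boxes:
        corners = [(x1, y1), (x2, y1), (x1, y2), (x2, y2)]
        for channel, (cx, cy) in enumerate(corners):
            g = np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))
            # Keep the strongest response where Gaussians from several boxes overlap.
            maps[channel] = np.maximum(maps[channel], g)
    return maps

targets = corner_heatmaps(16, 16, boxes=[(2, 3, 10, 12), (5, 5, 14, 9)], sigma=1.5)
```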
Relation to existing methods
Current keypoint-based object detection methods can be divided into edge-keypoint-based detectors and center-keypoint-based detectors. SaccadeNet combines the advantages of both.
Edge-keypoint-based detectors usually detect corners or extreme points first and then group the keypoints to locate the object. However, this kind of algorithm usually cannot capture global information about the object: a) corner features by themselves carry little information about the object, so extra center features are needed to enhance them; b) corners usually fall on background pixels and carry less information than other keypoints. Although SaccadeNet also uses corners for prediction, it predicts objects directly from the center keypoint, which provides global information about the object and avoids the time-consuming keypoint grouping step.
Center-keypoint-based detectors predict objects through center keypoints: they output a center-point heat map and regress the boundary directly. But the center point is usually far from the object boundary, so it may be hard to predict an accurate boundary, especially for large objects. Corner keypoints, on the other hand, are closest to the boundary and carry a lot of accurate local information, so lacking corner information may hurt the prediction. SaccadeNet fills exactly this gap and achieves more accurate boundary prediction.
Experiments
Comparison with state-of-the-art object detection algorithms.
Ablation experiments on the Attn-Trans and Aggregation-Attn modules.
Comparison of different numbers of Corner-Attn module iterations.
Conclusion
SaccadeNet first localizes objects coarsely from center-point features, then fine-tunes each preliminary box using the features at its corners and center. The overall idea resembles a two-stage object detector, with the region features of the second-stage box regression replaced by point features. SaccadeNet performs well in both accuracy and speed, and the overall idea is a good one.