ExtremeNet: Object Detection via Extreme Points for Tighter Object Regions | CVPR 2019
2022-06-24 10:53:00 【VincentLee】
ExtremeNet detects the four extreme points of an object and groups them geometrically to produce detections, reaching performance comparable to other conventional detection algorithms. The approach is distinctive, but it relies on a good deal of post-processing, so there is plenty of room for improvement. Interested readers can look at the error analysis in the paper's experiments.
Source: the WeChat official account Xiaofei's Algorithm Engineering Notes
The paper : Bottom-up Object Detection by Grouping Extreme and Center Points
- Paper: https://arxiv.org/abs/1901.08043
- Code: https://github.com/xingyizhou/ExtremeNet
Introduction
In object detection, an object is usually represented by a rectangular box, which tends to include a lot of background that hampers detection. The paper therefore proposes ExtremeNet, which localizes an object by detecting its four extreme points (topmost, leftmost, bottommost, rightmost), as shown in Figure 1. The algorithm builds on the ideas of CornerNet: five heatmaps predict the four extreme points and the center region of each object. Extreme points from the different heatmaps are then combined, and a combination is accepted or rejected according to the value of the center heatmap at the combination's geometric center. In addition, the extreme points detected by ExtremeNet can be fed to a DEXTR network to predict instance segmentation masks.
ExtremeNet for Object detection
ExtremeNet uses HourglassNet for class-aware keypoint detection and follows CornerNet's training procedure, loss functions, and offset prediction. Offset prediction is class-agnostic, and the center point has no offset. The backbone outputs $5\times C$ heatmaps and $4\times 2$ offset maps, where $C$ is the number of classes. The overall architecture and outputs are shown in Figure 3. Once the extreme points have been extracted, they are grouped based on geometry.
Center Grouping
The extreme points lie on different sides of the object, which makes grouping them difficult. The paper argues that grouping with embedding vectors, as CornerNet does, would lack global information, so it proposes Center Grouping instead.
The Center Grouping procedure is given in Algorithm 1. First, peaks are extracted from each of the four extreme-point heatmaps. A peak must satisfy two conditions: 1) its value is greater than the threshold $\tau_p$; 2) it is a local maximum, i.e. its value is greater than those of its eight neighbors. This peak extraction step is called ExtrectPeak. Once the peaks of each heatmap are available, all combinations of peaks are enumerated. For each combination ($t$, $b$, $r$, $l$) that satisfies the geometric constraints, the geometric center $c=(\frac{l_x+r_x}{2}, \frac{t_y+b_y}{2})$ is computed, and the combination is accepted if the center heatmap satisfies $\hat{Y}^{(c)}_{c_x, c_y} \ge \tau_c$.
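The grouping steps above can be sketched in plain Python. This is a minimal, brute-force illustration and not the authors' implementation: the function names `extract_peaks` and `center_grouping`, the threshold defaults of 0.1, the handling of ties in the local-maximum test, and the averaged box score are all assumptions of this sketch. Heatmaps are nested lists indexed as `heatmap[y][x]` for a single class.

```python
from itertools import product


def extract_peaks(heatmap, tau_p=0.1):
    """Return (x, y, score) for points above tau_p that are 3x3 local maxima."""
    h, w = len(heatmap), len(heatmap[0])
    peaks = []
    for y in range(h):
        for x in range(w):
            v = heatmap[y][x]
            if v <= tau_p:
                continue
            # Values of the (up to) eight neighbours inside the image.
            neighbours = [
                heatmap[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
                if (nx, ny) != (x, y)
            ]
            # Assumption: ties with a neighbour still count as a peak.
            if all(v >= n for n in neighbours):
                peaks.append((x, y, v))
    return peaks


def center_grouping(top, left, bottom, right, center, tau_p=0.1, tau_c=0.1):
    """Enumerate peak combinations and keep those confirmed by the center map."""
    peak_lists = [extract_peaks(hm, tau_p) for hm in (top, left, bottom, right)]
    boxes = []
    for t, l, b, r in product(*peak_lists):
        # Geometric constraints (image coordinates, y grows downward):
        # the top point is highest, the bottom lowest, left/right bound x.
        if not (t[1] <= l[1] <= b[1] and t[1] <= r[1] <= b[1]):
            continue
        if not (l[0] <= t[0] <= r[0] and l[0] <= b[0] <= r[0]):
            continue
        cx = (l[0] + r[0]) // 2
        cy = (t[1] + b[1]) // 2
        c_score = center[cy][cx]
        if c_score >= tau_c:
            # Assumption: box score is the mean of the five heatmap values.
            score = (t[2] + l[2] + b[2] + r[2] + c_score) / 5
            boxes.append((l[0], t[1], r[0], b[1], score))
    return boxes
```

The enumeration is quartic in the number of peaks per heatmap, which is workable only because the thresholded peak lists are short.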
Ghost box suppression
When three objects of the same size are equally spaced in a line, Center Grouping can produce a high-confidence false positive. The middle object then has two possible outcomes: it is predicted correctly, or it is wrongly merged with neighboring objects into one output. The paper calls the box in the second case a ghost box. To handle this, the paper adds a soft-NMS style post-processing step: if the sum of the scores of all boxes contained in a box exceeds three times the score of that box, its score is divided by two. NMS is then applied as usual.
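The rescoring rule can be sketched as follows. This is an illustrative sketch only: the helper names, the `(x1, y1, x2, y2)` box layout, and the reading of "contained" as full containment are assumptions, while the factor of three and the division by two come from the description above.

```python
def contains(outer, inner):
    """True if box `inner` lies fully inside box `outer`; boxes are (x1, y1, x2, y2)."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


def suppress_ghost_boxes(boxes, scores, ratio=3.0, penalty=2.0):
    """Return new scores with potential ghost boxes penalised."""
    new_scores = []
    for i, (box, score) in enumerate(zip(boxes, scores)):
        # Sum the scores of all other boxes fully contained in this one.
        inner_sum = sum(
            s for j, (b, s) in enumerate(zip(boxes, scores))
            if j != i and contains(box, b)
        )
        if inner_sum > ratio * score:
            new_scores.append(score / penalty)
        else:
            new_scores.append(score)
    return new_scores
```

All penalties are computed from the original scores, so the result does not depend on the order of the boxes.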
Edge aggregation
An extreme point is not always unique. If an object has a horizontal or vertical boundary, every point on that edge is an extreme point. The network's response to each point along such an edge is weaker, which may cause the extreme point to be missed.
The paper addresses this with edge aggregation. For local maxima in the left and right heatmaps, scores are aggregated in the vertical direction; for local maxima in the top and bottom heatmaps, scores are aggregated in the horizontal direction. Monotonically decreasing scores are aggregated along the corresponding direction until a local minimum is reached. Let $m$ be a local maximum and $N^{(m)}_i=\hat{Y}_{m_x+i, m_y}$ the points along the horizontal direction. Define $i_0 < 0$ and $0 < i_1$ as the nearest local minima on the two sides, i.e. $N^{(m)}_{i_0-1} > N^{(m)}_{i_0}$ and $N^{(m)}_{i_1} < N^{(m)}_{i_1+1}$. The aggregated peak value is then $\tilde{Y}_m=\hat{Y}_m+\lambda_{aggr}\sum_{i=i_0}^{i_1}N^{(m)}_i$, where $\lambda_{aggr}$ is the aggregation weight, set to 0.1. The overall effect is shown in Figure 4.
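As a rough sketch of the aggregation formula, the code below walks outward from a local maximum along one line of the heatmap until the score stops decreasing, then adds the weighted sum to the peak. The function names and the choice to stop at the image border are assumptions of this sketch; the stopping rule and $\lambda_{aggr}=0.1$ follow the definitions above.

```python
def aggregate_1d(scores, m, lam=0.1):
    """Aggregate scores around index m until a local minimum on each side.

    `scores` is one row or column of a heatmap and `m` the index of a local
    maximum in it. The sum includes the peak itself, as in the formula.
    """
    i0 = m
    # Walk left while the score does not increase (stop at the border).
    while i0 > 0 and scores[i0 - 1] <= scores[i0]:
        i0 -= 1
    i1 = m
    # Walk right under the same rule.
    while i1 < len(scores) - 1 and scores[i1 + 1] <= scores[i1]:
        i1 += 1
    return scores[m] + lam * sum(scores[i0:i1 + 1])


def edge_aggregate(heatmap, mx, my, direction, lam=0.1):
    """Aggregate along a row ("horizontal") or a column ("vertical")."""
    if direction == "horizontal":  # top and bottom heatmaps
        return aggregate_1d(heatmap[my], mx, lam)
    column = [row[mx] for row in heatmap]  # left and right heatmaps
    return aggregate_1d(column, my, lam)
```

For example, with the row `[0.5, 0.1, 0.3, 0.6, 0.4, 0.2, 0.7]` and the peak at index 3, the walk stops at indices 1 and 5, so the values 0.1 through 0.2 are summed and the scores 0.5 and 0.7 beyond the local minima are left out.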
Extreme Instance Segmentation
Extreme points carry more information about an object than a bounding box, since they come with twice as many annotated values (8 vs. 4). Based on the four extreme points and the bounding box, the paper proposes a simple way to obtain a mask for the object. First, each extreme point is extended in both directions along its bounding-box edge into a segment whose length is 1/4 of that edge, truncated where it would go past the bounding box. The endpoints of the four segments are then connected end to end to form an octagon, as shown in Figure 1. Finally, DEXTR (Deep Extreme Cut) is used to refine the mask: a DEXTR network converts extreme points into a segmentation mask, so the octagon crop can be fed directly into a pretrained DEXTR network.
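The octagon construction can be sketched as below. This is a minimal illustration under stated assumptions: the function names are hypothetical, points are `(x, y)` tuples in image coordinates, and the "1/4 of the edge" segment is read as extending 1/8 of the edge length to each side of the extreme point before clipping to the box.

```python
def _clip(value, low, high):
    """Clamp value to the closed interval [low, high]."""
    return max(low, min(high, value))


def extreme_points_to_octagon(t, l, b, r):
    """Build the octagon vertices from the top, left, bottom, right points.

    Vertices are returned in clockwise order for image coordinates
    (y grows downward), starting on the top edge.
    """
    x1, y1, x2, y2 = l[0], t[1], r[0], b[1]  # bounding box from the extremes
    dx = (x2 - x1) / 8.0  # half of a segment spanning 1/4 of the width
    dy = (y2 - y1) / 8.0  # half of a segment spanning 1/4 of the height
    return [
        (_clip(t[0] - dx, x1, x2), y1), (_clip(t[0] + dx, x1, x2), y1),
        (x2, _clip(r[1] - dy, y1, y2)), (x2, _clip(r[1] + dy, y1, y2)),
        (_clip(b[0] + dx, x1, x2), y2), (_clip(b[0] - dx, x1, x2), y2),
        (x1, _clip(l[1] + dy, y1, y2)), (x1, _clip(l[1] - dy, y1, y2)),
    ]
```

When an extreme point sits near a corner, the clipped segment is shorter than 1/4 of the edge and two vertices can coincide, so the polygon may degenerate to fewer than eight distinct corners.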
Experiments
Ablation experiments for each module. The paper also performs an error analysis of ExtremeNet: when the output of each module is replaced with ground truth, the result reaches 86.0 AP.
Comparison with other state-of-the-art methods.
Instance segmentation results.
Conclusion
ExtremeNet detects the four extreme points of an object and groups them geometrically to produce detections, reaching performance comparable to other conventional detection algorithms. The approach is distinctive, but it relies on a good deal of post-processing, so there is plenty of room for improvement. Interested readers can look at the error analysis in the paper's experiments.