DetectoRS: Detecting Objects with Recursive Feature Pyramid and Switchable Atrous Convolution
2022-07-23 16:48:00 【TJMtaotao】
Abstract
Many modern object detectors achieve excellent performance by using the mechanism of looking and thinking twice. This paper explores this mechanism in the backbone design for object detection. At the macro level, we propose Recursive Feature Pyramid, which incorporates extra feedback connections from the Feature Pyramid Network into the bottom-up backbone layers. At the micro level, we propose Switchable Atrous Convolution, which convolves the features with different atrous rates and gathers the results using switch functions. Combining them results in DetectoRS, which significantly improves the performance of object detection. On COCO test-dev, DetectoRS achieves state-of-the-art 54.7% box AP for object detection, 47.1% mask AP for instance segmentation, and 49.6% PQ for panoptic segmentation. Code: https://github.com/joe-siyuan-qiao/DetectoRS
1. Introduction
To detect objects, human visual perception selectively enhances and suppresses neuron activation by passing high-level semantic information through feedback connections [2,19,20]. Inspired by the human visual system, the mechanism of looking and thinking twice has been instantiated in computer vision and has shown excellent performance [5,6,58]. Many popular two-stage object detectors, such as Faster R-CNN [58], first output object proposals and then extract regional features based on those proposals to detect objects. Following the same direction, Cascade R-CNN [5] develops a multi-stage detector in which the subsequent detector heads are trained with more selective examples. The success of this design philosophy motivates us to explore it in the neural network backbone design for object detection. In particular, we apply the mechanism at both the macro and micro levels. The resulting detector, DetectoRS, greatly improves the performance of the state-of-the-art object detector HTC [7] while keeping a similar inference speed, as shown in Table 1.

At the macro level, our proposed Recursive Feature Pyramid (RFP) builds on top of the Feature Pyramid Network (FPN) [44] by incorporating extra feedback connections from the FPN layers into the bottom-up backbone layers, as shown in Figure 1a. Unrolling the recursive structure into a sequential implementation, we obtain a backbone for an object detector that looks at the image twice or more. Similar to the cascaded detector heads in Cascade R-CNN, our RFP recursively enhances FPN to generate increasingly powerful representations. Resembling Deeply-Supervised Nets [36], the feedback connections bring the features that directly receive gradients from the detector heads back to the low levels of the bottom-up backbone, to speed up training and improve performance. Our proposed RFP implements a sequential design of looking and thinking twice: the bottom-up backbone and FPN are run multiple times, and their output features depend on those of the previous steps.
At the micro level, we propose Switchable Atrous Convolution (SAC), which convolves the same input feature with different atrous rates [11,30,53] and gathers the results using switch functions. Figure 1b illustrates the concept of SAC. The switch functions are spatially dependent, i.e., each location of the feature map may have different switches to control the outputs of SAC. To use SAC in the detector, we convert all the standard 3x3 convolutional layers in the bottom-up backbone to SAC, which improves the detector performance by a large margin. Some previous methods adopt conditional convolution, e.g. [39,74], which also combines the results of different convolutions into a single output. Unlike those methods, whose architectures must be trained from scratch, SAC provides a mechanism to easily convert pretrained standard convolutional networks (e.g., ImageNet-pretrained [59] checkpoints). Moreover, SAC uses a new weight locking mechanism in which the weights of the different atrous convolutions are the same except for a trainable difference.
Combining the proposed RFP and SAC results in our DetectoRS. To demonstrate its effectiveness, we incorporate DetectoRS into the state-of-the-art HTC [7] on the challenging COCO dataset [47]. On COCO test-dev, we report box AP for object detection [22], mask AP for instance segmentation [26], and PQ for panoptic segmentation [34]. DetectoRS with ResNet-50 [28] as the backbone improves HTC [7] by 7.7% box AP and 5.9% mask AP. Additionally, equipping DetectoRS with ResNeXt-101-32x4d [71] achieves state-of-the-art 54.7% box AP and 47.1% mask AP. Together with the stuff prediction from DeepLabv3+ [14] with Wide-ResNet-41 [10] as the backbone, DetectoRS sets a new record of 49.6% PQ for panoptic segmentation.

2. Related Works
Object detection. There are two main categories of object detection methods: one-stage methods, e.g. [45,50,56,60,80,81], and multi-stage methods, e.g. [5,7,9,25,27,58]. Multi-stage detectors are usually more flexible and more accurate than one-stage detectors, but also more complicated. In this paper, we use the multi-stage detector HTC [7] as our baseline and compare with both categories of detectors.
Multi-scale features. Our Recursive Feature Pyramid is based on the Feature Pyramid Network (FPN) [44], an effective object detection system that exploits multi-scale features. Previously, many object detectors directly used the multi-scale features extracted from the backbone [4,50], while FPN incorporates a top-down path to sequentially combine features at different scales. PANet [49] adds another bottom-up path on top of FPN. STDL [82] proposes to exploit cross-scale features with a scale-transfer module. G-FRNet [1] adds feedback with gating units. NAS-FPN [24] and Auto-FPN [73] use neural architecture search [87] to find the optimal FPN structure. EfficientDet [66] proposes to repeat a simple BiFPN layer. Unlike them, our proposed Recursive Feature Pyramid enriches the representation power of FPN by going through the bottom-up backbone repeatedly. Additionally, we incorporate Atrous Spatial Pyramid Pooling (ASPP) [13,14] into FPN to enrich features, similar to the mini-DeepLab design in Seamless [55].
Recursive convolutional network. Many recursive methods have been proposed to address different types of computer vision problems, e.g. [32,42,65]. Recently, CBNet [51] proposed a recursive method for object detection that cascades multiple backbones and uses their output features as the input of FPN. By contrast, our RFP performs recursive computations with an ASPP-enriched FPN that includes effective fusion modules.
Conditional convolution. Conditional convolutional networks adopt dynamic kernels, widths, or depths, e.g. [16,39,43,48,74,77]. Unlike them, our proposed Switchable Atrous Convolution (SAC) allows an effective conversion mechanism from standard convolutions to conditional convolutions without changing any pretrained models. SAC is therefore a plug-and-play module for many pretrained backbones. Moreover, SAC uses global context information and a new weight locking mechanism to make it more effective.
3. Recursive Feature Pyramid
3.1 Feature Pyramid Networks


where x_0 is the input image and f_{S+1} = 0. FPN-based object detectors use f_i for the detection computations.
3.2 Recursive Feature Pyramid
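The FPN computation described in this subsection can be sketched in plain Python. This is a toy illustration under stated assumptions, not the authors' code: feature maps are single floats, `backbone_stages` and `fpn_ops` are hypothetical stand-ins for the bottom-up stages B_i and the top-down operations F_i, and the recursion is assumed to take the form f_i = F_i(f_{i+1}, x_i) with x_i = B_i(x_{i-1}).

```python
from typing import Callable, List

Stage = Callable[[float], float]            # stand-in for a backbone stage B_i
TopDown = Callable[[float, float], float]   # stand-in for an FPN operation F_i


def fpn_forward(x0: float, backbone_stages: List[Stage],
                fpn_ops: List[TopDown]) -> List[float]:
    """Return [f_1, ..., f_S] for a toy FPN whose features are plain floats."""
    # Bottom-up pass: x_i = B_i(x_{i-1}), starting from the input x_0.
    xs = []
    x = x0
    for stage in backbone_stages:
        x = stage(x)
        xs.append(x)

    # Top-down pass: f_i = F_i(f_{i+1}, x_i), with f_{S+1} = 0.
    fs = [0.0] * len(xs)
    f_above = 0.0
    for i in reversed(range(len(xs))):
        f_above = fpn_ops[i](f_above, xs[i])
        fs[i] = f_above
    return fs


# Toy example: every stage doubles its input, every top-down op adds its inputs.
stages = [lambda x: 2.0 * x] * 3
ops = [lambda f_up, x: f_up + x] * 3
features = fpn_forward(1.0, stages, ops)
```

In a real detector each f_i would be a multi-channel feature map and each F_i would upsample f_{i+1} before merging it with x_i; the sketch keeps only the order of the computation.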

We modify the ResNet [28] backbone B so that it accepts both x and R(f) as input. ResNet has four stages, each composed of several similar blocks. We change only the first block of each stage, as shown in Figure 3. This block computes a 3-layer feature and adds it to the feature computed by the shortcut. To use the feature R(f), we add another convolutional layer with its kernel size set to 1. The weight of this layer is initialized to 0, so that it has no real effect when the weights are loaded from a pretrained checkpoint.
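The effect of the zero initialization can be checked with a small sketch. It is a simplification under stated assumptions: features are channel vectors at a single spatial position, so a 1x1 convolution reduces to a matrix-vector product; the activation that follows the addition in a real ResNet block is omitted; and all names and numbers are hypothetical.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def conv1x1(weight: Matrix, x: Vector) -> Vector:
    """A 1x1 convolution at one spatial position is a matrix-vector product."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]


def first_block(residual: Vector, shortcut: Vector,
                feedback: Vector, feedback_weight: Matrix) -> Vector:
    """Toy version of the modified first block of a stage.

    `residual` stands for the block's 3-layer feature, `shortcut` for the
    shortcut feature, and `feedback` for R(f) from the previous RFP step.
    """
    extra = conv1x1(feedback_weight, feedback)
    return [r + s + e for r, s, e in zip(residual, shortcut, extra)]


residual = [0.5, -1.0]
shortcut = [1.0, 2.0]
feedback = [3.0, 4.0, 5.0]   # R(f) may have a different channel count

# Zero initialization: the feedback branch contributes nothing at first.
zero_weight = [[0.0] * 3 for _ in range(2)]
at_init = first_block(residual, shortcut, feedback, zero_weight)

# Hypothetical weights after some training: the feedback now changes the output.
trained_weight = [[0.1, 0.0, 0.0], [0.0, 0.0, 0.2]]
after_training = first_block(residual, shortcut, feedback, trained_weight)
```

With `zero_weight`, the block returns exactly `residual + shortcut`, which is what the pretrained network would compute, so training starts from the pretrained behavior.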
3.3. ASPP as the Connecting Module

We do not add a convolutional layer after the concatenated features, because R here does not generate the final output used in dense prediction tasks. Note that each of the four branches yields a feature whose number of channels is 1/4 that of the input feature, so concatenating them produces a feature of the same size as the input feature of R. In Sec. 5, we show the performance of RFP with and without the ASPP module.
3.4 Output Update by the Fusion Module
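The channel bookkeeping in the paragraph above can be written out as a short sketch. The function names and the 256-channel input are assumptions chosen for illustration; the branch features are placeholders, not real ASPP outputs.

```python
from typing import List


def aspp_branch_channels(in_channels: int, num_branches: int = 4) -> List[int]:
    """Channels produced by each branch when every branch outputs an equal share."""
    if in_channels % num_branches != 0:
        raise ValueError("in_channels must be divisible by num_branches")
    return [in_channels // num_branches] * num_branches


def concat(features: List[List[float]]) -> List[float]:
    """Concatenate per-branch feature vectors along the channel axis."""
    return [value for feature in features for value in feature]


in_channels = 256  # assumed example value
branch_channels = aspp_branch_channels(in_channels)
branch_outputs = [[0.0] * c for c in branch_channels]  # placeholder branch features
merged = concat(branch_outputs)
```

Because the four branches each output a quarter of the channels, `merged` has as many channels as the input, which is why no extra convolution is needed to restore the size.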

4. Switchable Atrous Convolution
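As the introduction describes, SAC convolves the same input with different atrous rates, shares the weights between the branches up to a trainable difference, and gathers the results with a spatially dependent switch. The following is a minimal 1-D sketch of that idea, not the authors' implementation. It assumes that the switch blends the two branches as a convex combination, that the larger rate is 3, and that the switch values are passed in directly rather than computed from the input; the global context modules are left out.

```python
from typing import List


def atrous_conv1d(x: List[float], w: List[float], rate: int) -> List[float]:
    """1-D convolution with a 3-tap kernel, zero padding, and an atrous rate."""
    n = len(x)
    out = []
    for i in range(n):
        acc = 0.0
        for k, wk in enumerate(w):
            j = i + (k - 1) * rate   # taps at i - rate, i, i + rate
            if 0 <= j < n:
                acc += wk * x[j]
        out.append(acc)
    return out


def sac1d(x: List[float], w: List[float], delta_w: List[float],
          switch: List[float], rate: int = 3) -> List[float]:
    """Blend a rate-1 branch and a rate-`rate` branch with a per-position switch."""
    small = atrous_conv1d(x, w, 1)
    # Weight locking: the second branch reuses w plus a trainable difference.
    w_large = [a + b for a, b in zip(w, delta_w)]
    large = atrous_conv1d(x, w_large, rate)
    return [s * a + (1.0 - s) * b for s, a, b in zip(switch, small, large)]


x = [1.0] * 7
w = [1.0, 1.0, 1.0]
no_difference = [0.0, 0.0, 0.0]

only_small = sac1d(x, w, no_difference, switch=[1.0] * 7)  # rate-1 branch only
only_large = sac1d(x, w, no_difference, switch=[0.0] * 7)  # rate-3 branch only
```

Setting every switch value to 1 reproduces the ordinary convolution, which is how a pretrained standard convolution can be carried over unchanged; intermediate switch values mix the two receptive fields position by position.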

