CVPR 2022 Oral | A new motion-centric paradigm for single object tracking in point clouds
2022-06-22 10:55:00 【Zhiyuan community】

Project page:
https://ghostish.github.io/MM-Track/
Paper:
https://arxiv.org/abs/2203.01730
GitHub:
https://github.com/Ghostish/Open3DSOT
Reading guide
In this paper, a research team from the Chinese University of Hong Kong proposes a single object tracking paradigm based on motion modeling. Unlike the Siamese paradigm widely used in LiDAR scenes, the method is:
- Fast: 1.67 times faster, reaching 57 FPS;
- Accurate: it remains robust when many distractors (similar-looking interfering objects) are present and when most of the target's appearance is missing;
- Strong: using only the most basic vanilla PointNet as its backbone, it substantially advances the state of the art (SOTA) on three large-scale datasets.
It offers a new perspective for subsequent LiDAR SOT work: consider the relative motion of the target.
Contribution
Single object tracking in LiDAR point clouds has made good progress in recent years.
Existing point cloud single object tracking methods all follow a Siamese-based paradigm:

The Siamese-based paradigm uses appearance matching to search for potential targets within a specific region. In 2D scenes, appearance matching is an important tool for single object tracking. LiDAR point clouds, however, lack texture information, and because of occlusion and other factors the appearance of the same object often changes dramatically between two adjacent frames. This seriously degrades the accuracy of appearance matching, so trackers built on it are very sensitive to distractors. As shown in the figure below, when distractors surround the target, the appearance-matching tracker BAT mistakes a distractor for the target.

In fact, the presence of distractors is a major factor limiting the performance of point cloud single object tracking. We found that on the Car category of KITTI, the latest methods already achieve considerable performance. On the Pedestrian category, which has a comparable amount of data, the same methods perform far worse than on Car. Their performance also drops significantly in complex scenes such as the NuScenes dataset.
The following figure shows how the distribution of distractors differs between KITTI and NuScenes. For vehicles, distractors are rare on KITTI, whereas on NuScenes a vehicle is often surrounded by distractors with very similar shapes.

On the training sets of three different datasets, we counted the distractors within 2 m around each vehicle, and the statistics confirm this observation. On KITTI, most vehicles have almost no distractors around them, but this is not the case on NuScenes and Waymo.
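A statistic of this kind can be approximated in a few lines. The sketch below is not the authors' script. It assumes that a distractor is any other object of the same category whose center falls inside the target's bird's-eye-view box enlarged by a 2 m margin; the `Box` class and `count_distractors` function are hypothetical names introduced for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Box:
    x: float       # center x in metres
    y: float       # center y in metres
    length: float  # extent along the heading direction
    width: float   # extent across the heading direction
    yaw: float     # heading angle in radians


def count_distractors(target, others, margin=2.0):
    """Count boxes whose center lies inside the target box grown by `margin` metres.

    Assumption: the test is done in bird's-eye view, in the target's own frame.
    """
    cos_y, sin_y = math.cos(target.yaw), math.sin(target.yaw)
    count = 0
    for other in others:
        dx, dy = other.x - target.x, other.y - target.y
        # Rotate the offset into the target's box frame.
        local_x = cos_y * dx + sin_y * dy
        local_y = -sin_y * dx + cos_y * dy
        inside_x = abs(local_x) <= target.length / 2 + margin
        inside_y = abs(local_y) <= target.width / 2 + margin
        if inside_x and inside_y:
            count += 1
    return count


target = Box(0.0, 0.0, 4.0, 2.0, 0.0)
neighbours = [
    Box(0.0, 2.5, 4.0, 2.0, 0.0),   # parked alongside: inside the margin
    Box(3.5, 0.0, 4.0, 2.0, 0.0),   # directly ahead: inside the margin
    Box(10.0, 0.0, 4.0, 2.0, 0.0),  # far ahead: outside
]
print(count_distractors(target, neighbours))
```

Running the function over every annotated target in a training set and building a histogram of the counts would give a per-dataset distribution comparable in spirit to the one described above.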

In addition, we collected statistics on the distribution of distractors around pedestrians on KITTI:

As can be seen, distractors are very common for the Pedestrian category.
Method
Since distractors are widespread in practice and appearance matching is so sensitive to them, how can existing single object tracking methods be improved? This article offers a new idea: consider the relative motion of the target.
Although the target may look very similar to a distractor, it usually follows its own distinct trajectory. We believe that modeling the target's motion is a very effective way to counter distractors. In 2D scenes, camera projection makes it difficult to model an object's 3D motion explicitly. In 3D, however, this is straightforward: in this paper we simply use the relative motion between the target's bounding boxes to approximate the target's own 3D motion. Based on this observation, we propose a point cloud single object tracking paradigm based on 3D motion transformation:

The input to this paradigm is the point clouds of two adjacent frames together with the position of the target in the previous frame. By explicitly learning from large amounts of data how targets move between consecutive frames, the paradigm predicts the relative motion of the target across the two frames. It then uses this motion to transform the target's previous-frame position into its current-frame position, which completes the tracking step.
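The final step of this paradigm, moving the previous box by a predicted relative motion, can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes the motion is a 4-tuple (dx, dy, dz, dyaw) whose translation is expressed in the previous box's own frame; `Box3D`, `apply_relative_motion` and `track` are hypothetical names, and the list of motions stands in for what a trained network would predict from each pair of frames.

```python
import math
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Box3D:
    x: float
    y: float
    z: float
    yaw: float                      # heading around the vertical axis, radians
    size: tuple = (4.0, 2.0, 1.5)   # length, width, height (kept fixed while tracking)


def wrap_angle(angle):
    """Map an angle to the interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def apply_relative_motion(prev, motion):
    """Move `prev` by a relative motion (dx, dy, dz, dyaw).

    Assumption: dx and dy are given in the previous box's frame, so they are
    rotated by the previous heading before being added to the center.
    """
    dx, dy, dz, dyaw = motion
    c, s = math.cos(prev.yaw), math.sin(prev.yaw)
    return replace(
        prev,
        x=prev.x + c * dx - s * dy,
        y=prev.y + s * dx + c * dy,
        z=prev.z + dz,
        yaw=wrap_angle(prev.yaw + dyaw),
    )


def track(first_box, motions):
    """Chain per-frame relative motions into a sequence of boxes."""
    boxes = [first_box]
    for motion in motions:
        boxes.append(apply_relative_motion(boxes[-1], motion))
    return boxes


start = Box3D(10.0, 5.0, 0.0, math.pi / 2)
for box in track(start, [(1.0, 0.0, 0.0, 0.0), (1.0, 0.0, 0.0, 0.1)]):
    print(round(box.x, 3), round(box.y, 3), round(box.yaw, 3))
```

Because each box is obtained from the previous one, no appearance template is needed at inference time in this sketch; only the first-frame box and the per-frame motions are used.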
Based on this new paradigm, we propose a concrete two-stage single object tracker: \( M^2-Track \).

Through joint spatio-temporal learning, \( M^2-Track \) first coarsely segments the target points. In the first stage, it explicitly predicts the relative motion of the target between the two frames and obtains the target's position in the current frame through a 3D transformation.

In the second stage, to mitigate the incompleteness of LiDAR point clouds, \( M^2-Track \) transforms the target point cloud of the previous frame into the current frame according to the predicted relative motion, yielding a more complete target point cloud. It then refines the target position obtained in the first stage using this completed point cloud.
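The point cloud completion in this stage amounts to a rigid transform followed by a merge. The sketch below illustrates that idea only and is not the authors' code. It assumes the relative motion is summarized by the previous and current box centers plus a yaw change, and that points are (x, y, z) tuples in a common sensor frame; `warp_points` and `complete_target` are hypothetical names.

```python
import math


def warp_points(points, prev_center, cur_center, dyaw):
    """Rigidly move points attached to the previous box onto the current box.

    Each point is expressed relative to the previous center, rotated by `dyaw`
    around the vertical axis, then placed relative to the current center.
    """
    c, s = math.cos(dyaw), math.sin(dyaw)
    warped = []
    for x, y, z in points:
        px, py, pz = x - prev_center[0], y - prev_center[1], z - prev_center[2]
        warped.append((
            cur_center[0] + c * px - s * py,
            cur_center[1] + s * px + c * py,
            cur_center[2] + pz,
        ))
    return warped


def complete_target(prev_points, cur_points, prev_center, cur_center, dyaw):
    """Merge warped previous-frame target points with current-frame target points."""
    return warp_points(prev_points, prev_center, cur_center, dyaw) + list(cur_points)


prev_pts = [(1.0, 0.0, 0.5), (-1.0, 0.0, 0.5)]   # e.g. front and rear of the target
cur_pts = [(2.0, -1.0, 0.5)]                      # the current frame sees a different part
merged = complete_target(prev_pts, cur_pts, (0.0, 0.0, 0.0), (2.0, 0.0, 0.0), math.pi / 2)
print(len(merged))
```

The merged cloud covers parts of the object seen in either frame, which is what makes a subsequent refinement step better conditioned than one that relies on the current frame alone.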

Unlike previous methods, which rely on complex backbones to extract point cloud features, \( M^2-Track \) uses only the most basic vanilla PointNet as its backbone. As a result, users do not need to perform elaborate hyperparameter tuning (for example, the number of sampled points or the range of 3D convolutions). \( M^2-Track \) also runs at 57 FPS, faster than all previously published methods.
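The core idea of a vanilla PointNet is a small network shared across all points, followed by a symmetric pooling step, which makes the resulting feature independent of point order. The sketch below shows only that idea with a single random layer in plain Python. It is not the backbone used in \( M^2-Track \); the layer size, the random weights and the function names are assumptions made for illustration.

```python
import random


def shared_layer(point, weights, biases):
    """Apply one linear layer plus ReLU to a single point (same weights for every point)."""
    return [
        max(0.0, sum(w * v for w, v in zip(row, point)) + b)
        for row, b in zip(weights, biases)
    ]


def global_feature(points, weights, biases):
    """Per-point features followed by channel-wise max pooling."""
    per_point = [shared_layer(p, weights, biases) for p in points]
    return [max(channel) for channel in zip(*per_point)]


rng = random.Random(0)
weights = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(8)]  # 3 inputs, 8 channels
biases = [rng.uniform(-1, 1) for _ in range(8)]
cloud = [(rng.uniform(-2, 2), rng.uniform(-2, 2), rng.uniform(-1, 1)) for _ in range(32)]

shuffled = cloud[:]
rng.shuffle(shuffled)

# Max pooling is symmetric, so reordering the points leaves the feature unchanged.
print(global_feature(cloud, weights, biases) == global_feature(shuffled, weights, biases))
```

Such a design has few structural hyperparameters: there is no neighbourhood radius or voxel size to choose, which is consistent with the low tuning burden described above.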
Experiments
On NuScenes and the Waymo Open Dataset, where distractors are numerous, \( M^2-Track \) improves very significantly over appearance-matching methods.

On KITTI, the advantage of \( M^2-Track \) is also clear, especially for the Pedestrian category, which has many distractors:

For vehicle targets, which have few distractors, the improvement from \( M^2-Track \) is not significant. To further verify the robustness of \( M^2-Track \) to distractors, we randomly added varying numbers of distractor vehicles (60~100) to KITTI scenes and used this synthetic KITTI dataset to re-evaluate \( M^2-Track \) against previous appearance-matching methods. As the figure below shows, the advantage of \( M^2-Track \) over previous methods grows as the number of distractors increases.

We also found that appearance-matching methods can often show higher tracking precision than \( M^2-Track \). We therefore tried simply cascading \( M^2-Track \) with previous methods. Experiments show that combining it with an appearance-matching method further improves the tracking performance of \( M^2-Track \). We believe this offers more ideas for future research on single object tracking.

Visualization results :


Conclusion
This paper presents a point cloud single object tracking paradigm based on target motion. The paradigm has a clear physical meaning and is both intuitive and effective. In both speed and accuracy, \( M^2-Track \), which is built on this paradigm, substantially surpasses the existing SOTA. We believe this paradigm can be combined with the Siamese paradigm in the future to further improve the performance of point cloud single object tracking.