
PointNet/PointNet++ learning

2022-06-26 02:07:00 【Master Ma】

I. Applications of point clouds

II. Representations of point clouds

III. PointNet

IV. PointNet++


PointNet++ summary
Although the paper is called PointNet++, it is a substantial improvement over PointNet. Its core contribution is a hierarchical (multi-level) feature extraction structure. Specifically, some points of the input point set are selected as center points, and the points surrounding each center are gathered into a region. Each region is then fed to PointNet as one input sample, and the resulting feature vector is the feature of that region. Next, the region around each retained center is enlarged, and the features obtained in the previous step are fed to PointNet again, and so on. The process repeatedly extracts local features and then widens the local scope until a set of global features is obtained, which is finally used for classification. The paper also proposes multi-scale methods to cope with non-uniform sampling density. These methods do not improve classification accuracy, but they make the model more robust when the samples are very sparse.

1. Abstract

One drawback of PointNet is that it cannot capture local features, which makes it hard to analyze complex scenes. In PointNet++ the authors improve on this in two main ways so that the network extracts local features better. First, they exploit metric space distances: PointNet is applied iteratively to local regions of the point set, so the network learns local features at progressively larger scales. Second, because point sets are often non-uniformly distributed, assuming a uniform density degrades performance, so the authors propose a density-adaptive feature extraction method. With these two methods the network learns features more efficiently and more robustly.

2. Introduction

In PointNet++ the authors use the distance metric of the space to partition the point set into overlapping local regions (which can be thought of as patches). On this basis, local features are first extracted from the geometric structure at a small scale (shallow features). The scope is then enlarged and higher-level features are extracted from these local features, until global features of the whole point set are obtained. This is similar to feature extraction in a CNN: low-level features are extracted first, and as the receptive field grows, the level of the extracted features rises.

PointNet++ has to solve two key problems: first, how to partition a point set into regions; second, how to use a feature extractor to obtain the local features of each region. The two problems are related: to apply one feature extractor to different regions, every partition needs to have the same structure. The analogy with CNNs helps here as well. In a CNN the convolution block is the basic feature extractor, and the region it operates on is an n*n block of pixels. A 3D point set likewise needs sub-regions with a common structure and a feature extractor that matches them.

In the paper the authors use PointNet as the feature extractor. The remaining problem is how to partition the point set into regions with the same structure. The authors define a partition as a neighborhood ball, which can also be called a patch; each region is determined by its center coordinates and its radius. The centers are chosen with the farthest point sampling (FPS) algorithm. Choosing the radius is harder, because the input point set is not uniform, and regions may overlap or leave points uncovered. VGG showed that small convolution kernels work better for CNNs, but that holds because images lie on a regular grid where every region is well structured; with a small radius, PointNet++ performs poorly. Intuitively, a neighborhood ball that is too small may not contain enough points to show a complete local feature. The grouping can also be implemented with KNN.

3. Network structure

PointNet++ is an extension of PointNet: it adds a hierarchical structure on top of PointNet, which lets the network produce higher-level features over larger and larger regions.


Each set abstraction layer of the network consists of 3 parts: the Sampling layer, the Grouping layer and the PointNet layer.

·Sampling layer: samples the input points and selects several of them as center points (question: how are they chosen, and how many?);
·Grouping layer: partitions the point set into several regions around the center points obtained in the previous step;
·PointNet layer: encodes each of the regions obtained above into a feature vector.

The input to each set abstraction layer is N×(d+C), where N is the number of input points, d is the coordinate dimension and C is the feature dimension. The output is N’×(d+C’), where N’ is the number of output points, d is the unchanged coordinate dimension and C’ is the new feature dimension. The function and implementation of each layer are described in detail below.

1). Sampling layer

Farthest point sampling is used to select N’ points. It is preferred over random sampling because it covers the whole point set better. The number of center points to select is specified by hand.
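
The selection loop can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `farthest_point_sampling`, the `start` argument and the random test cloud are assumptions made for the example.

```python
import numpy as np

def farthest_point_sampling(xyz, n_centers, start=0):
    """Pick n_centers indices from xyz (N, 3) so that the picks are spread out."""
    n_points = xyz.shape[0]
    chosen = np.zeros(n_centers, dtype=np.int64)
    # Squared distance from every point to its nearest already-chosen center.
    min_dist = np.full(n_points, np.inf)
    current = start  # hypothetical choice: begin from a fixed index
    for i in range(n_centers):
        chosen[i] = current
        dist = np.sum((xyz - xyz[current]) ** 2, axis=1)
        min_dist = np.minimum(min_dist, dist)
        # The next center is the point farthest from all chosen centers.
        current = int(np.argmax(min_dist))
    return chosen

rng = np.random.default_rng(0)
cloud = rng.uniform(-1.0, 1.0, size=(1024, 3))
centers = farthest_point_sampling(cloud, 512)
print(centers.shape)  # (512,)
```

Each new center is the point with the largest distance to the centers picked so far, which is why the result covers the cloud more evenly than a random draw of the same size.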

2). Grouping layer

This layer uses ball query to generate N’ local regions. As the paper describes it, there are two parameters: the number of points per region, K, and the radius of the ball. The radius takes priority: points are searched within a ball of the given radius, up to a maximum of K. Both the radius and the number of points per region are specified by hand. This step can also be done with KNN, and the choice between the two has little effect on the results.
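
A small NumPy sketch of radius-first grouping is given below. It is an illustration under assumptions: the function name `ball_query`, the brute-force distance matrix and the padding rule (short groups are filled by repeating the first point found, so that every group has exactly K entries) are choices made for this example.

```python
import numpy as np

def ball_query(xyz, centers, radius, k):
    """Return idx (S, k) and counts (S,) for the points of xyz within radius of each center."""
    # Squared distances between every center and every point: shape (S, N).
    d2 = np.sum((centers[:, None, :] - xyz[None, :, :]) ** 2, axis=-1)
    n_centers = centers.shape[0]
    idx = np.zeros((n_centers, k), dtype=np.int64)
    counts = np.zeros(n_centers, dtype=np.int64)
    for s in range(n_centers):
        # The radius decides membership first; k only caps the group size.
        inside = np.flatnonzero(d2[s] <= radius ** 2)[:k]
        counts[s] = inside.size
        if inside.size > 0:
            idx[s] = inside[0]              # assumed padding: repeat the first hit
            idx[s, :inside.size] = inside   # then write the real members
    return idx, counts

xyz = np.array([[0.0, 0, 0], [0.1, 0, 0], [0.15, 0, 0], [1.0, 0, 0]])
centers = xyz[[0, 3]]
idx, counts = ball_query(xyz, centers, radius=0.2, k=3)
print(idx.tolist())     # [[0, 1, 2], [3, 3, 3]]
print(counts.tolist())  # [3, 1]
```

The second center has only itself inside the ball, so its group is padded to length 3 while `counts` records that only one entry is a distinct point.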

3). PointNet layer

This layer is a PointNet. It takes an input of size N’×K×(d+C) and produces an output of size N’×(d+C’). Note that before the points of a region enter the network, their coordinates are converted to coordinates relative to the center point of the region. The authors say that this captures the relationships between points (I have some doubts about this; the effect feels limited, perhaps similar to Batch Norm?).
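
The three steps (relative coordinates, a shared per-point MLP, max pooling over the K points) can be sketched as follows. This is a simplified illustration with random weights and no batch normalization; the name `pointnet_layer` and the weight initialization are assumptions, and only xyz coordinates are used as input features.

```python
import numpy as np

def pointnet_layer(grouped_xyz, centers, weights):
    """grouped_xyz: (S, K, 3), centers: (S, 3) -> one feature vector per region, (S, C_out)."""
    # Express every point relative to the center of its region.
    feats = grouped_xyz - centers[:, None, :]
    # Shared MLP: the same weights are applied to every point independently.
    for w, b in weights:
        feats = np.maximum(feats @ w + b, 0.0)  # linear layer followed by ReLU
    # Max pooling over the K points makes the result independent of point order.
    return feats.max(axis=1)

rng = np.random.default_rng(0)
dims = [3, 64, 64, 128]  # feature widths 64, 64, 128 on top of 3 input coordinates
weights = [(rng.normal(0.0, 0.1, size=(i, o)), np.zeros(o))
           for i, o in zip(dims[:-1], dims[1:])]
grouped = rng.normal(size=(512, 32, 3))
centers = rng.normal(size=(512, 3))
region_features = pointnet_layer(grouped, centers, weights)
print(region_features.shape)  # (512, 128)
```

Because the pooling is a maximum over the point axis, shuffling the 32 points inside a region leaves its feature vector unchanged.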

4). Handling non-uniform point clouds

When the point cloud is non-uniform, using the same ball radius for every sub-region leaves some sparse regions with too few sampled points. A thought of my own here: from one point of view, can the density of the point cloud be regarded as an attribute of the sample? If so, it is not a shortcoming that needs to be overcome. And if the worry is that some regions contain too few points, could a lower threshold be added instead?

The authors state that this problem needs to be solved and propose two methods: multi-scale grouping (MSG) and multi-resolution grouping (MRG). The paper illustrates both with a schematic diagram.

The two methods work as follows. The first is multi-scale grouping (MSG). If 3 different scales are used, 3 regions are found around each center point, each with its own radius and its own number of points. For the same center, the regions of different scales are fed into different PointNets for feature extraction, and the results are concatenated to form the feature of that center. In other words, MSG amounts to running several hierarchical structures in parallel: the number of center points is the same in each, but the region size differs (which can be understood as the receptive field?), and so do the PointNet input and output sizes. The outputs of the PointNets at the different scales are then concatenated.
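
A compact sketch of the idea is shown below: one small PointNet-like encoder per scale, with the per-scale features concatenated for each center. It is a toy version under assumptions: each encoder is a single linear layer with ReLU and max pooling, the `(radius, max points)` pairs and output widths are illustrative values, and the centers are taken from the cloud itself so that no ball is empty.

```python
import numpy as np

def region_feature(xyz, center, radius, k, w):
    """Max-pooled feature of up to k points of xyz within radius of one center."""
    d2 = np.sum((xyz - center) ** 2, axis=1)
    inside = np.flatnonzero(d2 <= radius ** 2)[:k]
    local = xyz[inside] - center                     # relative coordinates
    return np.maximum(local @ w, 0.0).max(axis=0)    # one-layer shared MLP + max pool

def msg_features(xyz, centers, scales, weights):
    """Concatenate the features of all scales for every center."""
    per_center = []
    for c in centers:
        parts = [region_feature(xyz, c, r, k, w)
                 for (r, k), w in zip(scales, weights)]
        per_center.append(np.concatenate(parts))
    return np.stack(per_center)

rng = np.random.default_rng(0)
cloud = rng.uniform(-1.0, 1.0, size=(256, 3))
centers = cloud[:16]                          # centers are members of the cloud
scales = [(0.1, 16), (0.2, 32), (0.4, 128)]   # (radius, max points), illustrative
out_dims = [32, 64, 64]                       # one output width per scale, illustrative
weights = [rng.normal(0.0, 0.1, size=(3, d)) for d in out_dims]
features = msg_features(cloud, centers, scales, weights)
print(features.shape)  # (16, 160)
```

The output width is the sum of the per-scale widths (32 + 64 + 64 = 160), which shows why MSG costs more than a single scale: every center is encoded once per scale.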
The other is multi-resolution grouping (MRG). MSG clearly slows down computation, which is why MRG was proposed. This method concatenates the groupings of different levels; because their scales differ, the low-level grouping is first processed by a PointNet and then concatenated with the high-level one. It feels a bit like the skip connections in ResNet.

In this part the authors also describe random input dropout (DP): before a point cloud is fed to the network, points are randomly dropped from it. The paper sets the maximum dropout ratio to 95%; for each training cloud a ratio is drawn at random up to that limit, and each point is dropped with that probability. It works somewhat like data augmentation and is another way to improve the model's robustness. How effective these methods are is examined next.
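
Random input dropout can be sketched as below, assuming the scheme in which a dropout ratio is drawn uniformly between 0 and a maximum (0.95 here) and each point is then dropped with that probability. The function name `random_input_dropout` is hypothetical, and overwriting dropped slots with the first point (to keep the array shape fixed) is a convenience chosen for this example.

```python
import numpy as np

def random_input_dropout(points, max_ratio=0.95, rng=None):
    """Drop each point with probability theta, where theta ~ U(0, max_ratio)."""
    rng = np.random.default_rng() if rng is None else rng
    theta = rng.uniform(0.0, max_ratio)
    drop = rng.random(points.shape[0]) <= theta
    out = points.copy()
    # Keep the array shape fixed: dropped slots are overwritten with the first point.
    out[drop] = points[0]
    return out

cloud = np.random.default_rng(0).normal(size=(1024, 3))
sparse = random_input_dropout(cloud, rng=np.random.default_rng(1))
print(sparse.shape)  # (1024, 3)
```

Because the ratio changes from one training cloud to the next, the network sees the same shapes at many different densities.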

The classification results in the paper show that the multi-scale variants (MSG, MRG) do not improve classification accuracy over single-scale grouping (SSG). Their advantage is that MSG stays robust when the point cloud is very sparse. For robustness, random input dropout (DP) actually contributes more.


In the segmentation results in the paper, MSG+DP does improve on SSG, and the gap is larger on non-uniform point clouds. However, the authors do not report the individual contributions of MSG and DP, so it is hard to tell which of the two is responsible for the improvement.

4. Understanding the core structure through code

This section uses the core code to show how the hierarchical structure in PointNet++ (also called the set abstraction layers) works. The network uses 3 set abstraction layers (taking SSG, single-scale grouping, as the example).

Take the first set abstraction layer as an example; it corresponds to line 9 of the code (PointNet Set Abstraction (SA) Module). Suppose the input point cloud data has shape (16,1024,3), that is, each sample has 1024 points with xyz coordinates only. It is fed to the first set abstraction layer with the following parameters:

l0_xyz: <points containing only coordinates>
l0_points: <contains not only the coordinates but also the features extracted for each point by the previous layer, so the first layer has none>
npoint = 512: <the Sampling layer picks 512 points as center points; the number is chosen by hand, from experience or by experiment>
radius = 0.2: <the radius of the ball query sphere in the Grouping layer is 0.2; note that this is on the scale of the normalized coordinates>
nsample = 32: <points are sampled around each center point within the sphere of the given radius, up to a maximum of 32; the radius takes priority>
mlp = [64,64,128]: <the PointNet layer has 3 layers, with feature dimensions 64, 64, 128>
# There are other parameters, but they are not important here and are not discussed.

Let's take a closer look at how each layer is implemented, focusing on the shape of the data as it passes through.

SA(512,0.2,[64,64,128]) -> SA(128,0.4,[128,128,256]) -> SA([256,512,1024]) ->
FC1 -> FC2 -> FC(K)
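
The tensor shapes along this chain can be traced with a few lines of Python. This is bookkeeping only, no learning happens. The values for the first layer come from the parameters above; `nsample=64` for the second layer and the treatment of the last layer as a single group containing all 128 remaining points are assumptions made for the illustration, as is the helper name `trace_set_abstraction`.

```python
def trace_set_abstraction(batch, c_in, npoint, nsample, mlp, d=3):
    """Return the tensor shapes one set abstraction layer passes through."""
    grouped = (batch, npoint, nsample, d + c_in)   # after sampling + grouping
    pooled = (batch, npoint, mlp[-1])              # after the PointNet layer
    return grouped, pooled

batch, c_in = 16, 0   # input (16, 1024, 3): xyz only, no extra features
layers = [
    (512, 32, [64, 64, 128]),      # SA(512, 0.2, [64,64,128])
    (128, 64, [128, 128, 256]),    # SA(128, 0.4, [128,128,256]); nsample assumed
    (1, 128, [256, 512, 1024]),    # SA([256,512,1024]); assumed to group all points
]
shapes = []
for npoint, nsample, mlp in layers:
    grouped, pooled = trace_set_abstraction(batch, c_in, npoint, nsample, mlp)
    shapes.append((grouped, pooled))
    print(grouped, "->", pooled)
    c_in = mlp[-1]   # the next layer receives these features
```

The grouped tensor of each layer carries d + C channels (coordinates plus incoming features), and the last layer leaves one 1024-dimensional vector per sample for the fully connected layers.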

The data first goes through sampling and grouping. Let's see how that function is implemented.

The inputs of this function are the parameters described above; its outputs are explained here.

new_xyz: the coordinates of the 512 center points obtained after sampling
idx: the indices of the points in each region
grouped_xyz: the grouped point set, a four-dimensional tensor (batch_size, 512 regions, 32 points per region, 3 coordinates per point)
new_points: also the grouped point set, but with features included; for the first layer it is equal to grouped_xyz. The coordinates can optionally be concatenated with the features before the convolution.

After sampling, the important step is partitioning, which is what the two functions above do. With KNN, a fixed number of points (32, as above) is taken around each center point, idx holds the indices of those points, and the total number of entries is 32*512. The original point cloud has only 1024 points, so the regions inevitably overlap. That is not a problem; it is the intended effect.
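
The KNN variant can be sketched with NumPy as follows. The function name `knn_group` is hypothetical, the distance matrix is computed by brute force, and the first 512 points stand in for the sampled centers to keep the example short.

```python
import numpy as np

def knn_group(xyz, centers, k):
    """Return idx (S, k): the k nearest points of xyz (N, 3) for each center."""
    # Squared distances between every center and every point: shape (S, N).
    d2 = np.sum((centers[:, None, :] - xyz[None, :, :]) ** 2, axis=-1)
    # argpartition places the k smallest distances of each row in the first k slots.
    return np.argpartition(d2, k - 1, axis=1)[:, :k]

rng = np.random.default_rng(0)
cloud = rng.uniform(-1.0, 1.0, size=(1024, 3))
centers = cloud[:512]            # stand-in for the 512 sampled centers
idx = knn_group(cloud, centers, 32)
print(idx.shape)  # (512, 32)
print(idx.size)   # 16384
```

The 16384 index entries refer to only 1024 distinct points, so many points necessarily belong to several regions.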

Ball query returns idx and pts_cnt. Because the partition is driven primarily by the radius, the number of points per region is not fixed (at most 32), so pts_cnt counts how many points each region contains, which makes it possible to split idx.

The schematic compares ball query (a) with KNN (b): one is driven mainly by the radius, the other by the number of points.

Once the indices of the points in each region are known, the points are grouped. The result is a four-dimensional tensor (batch_size, 512 regions, 32 points per region, 3 coordinates per point). There is a gap here: if idx comes from ball query, the number of points in each group is not necessarily 32, yet pts_cnt is not used at this step, so how does the code know how to split idx?
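
The gather step that turns idx into the four-dimensional tensor can be sketched with NumPy indexing. The name `group_points` and the random indices are assumptions for the example. If idx is padded to a fixed width of 32 (for instance by repeating a point in groups that found fewer), the gather needs no per-group count, which would be one way to resolve the question above.

```python
import numpy as np

def group_points(xyz, idx):
    """xyz: (B, N, 3), idx: (B, S, K) integer indices -> grouped_xyz: (B, S, K, 3)."""
    # The batch index is broadcast against idx so each sample gathers its own points.
    batch = np.arange(xyz.shape[0])[:, None, None]
    return xyz[batch, idx]

rng = np.random.default_rng(0)
xyz = rng.normal(size=(16, 1024, 3))
idx = rng.integers(0, 1024, size=(16, 512, 32))   # stand-in for real grouping indices
grouped_xyz = group_points(xyz, idx)
print(grouped_xyz.shape)  # (16, 512, 32, 3)
```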
Next comes the PointNet layer.

The grouped point set is then convolved. As mentioned above, new_points is a 4-dimensional tensor (batch_size, 512, 32, 3), that is, (batch_size, 512 regions, 32 points per region, features of each point). The 32 points of each region pass through PointNet's convolution and pooling and are merged into one feature vector, which belongs to the center point of that region.

Reference: https://zhuanlan.zhihu.com/p/88238420
