PointNet/PointNet++ learning
2022-06-26 02:07:00 【Master Ma】
I. Applications of point clouds







II. Representations of point clouds

















III. PointNet























IV. PointNet++


















PointNet++ summary
Although the paper is named PointNet++, it is a substantial improvement over PointNet. Its core contribution is a hierarchical (multi-level) feature extraction structure. Specifically, some points of the input point set are selected as center points, and the points surrounding each center are gathered to form a region. Each region is then fed to PointNet as one input sample, and the resulting feature vector is the feature of that region. After that, the center points are kept, the regions are enlarged, and the features from the previous step are fed to PointNet again, and so on. The process repeatedly extracts local features and then enlarges the local scope until a global feature is obtained, which is then used for classification. The paper also proposes multi-scale methods to deal with non-uniformly sampled point sets. These methods do not improve classification accuracy, but they make the model more robust when the samples are very sparse.
1. Abstract
One drawback of PointNet is that it cannot capture local features, which makes it hard to analyze complex scenes. PointNet++ improves on this in two main ways so that the network extracts local features better. First, it exploits metric space distances and applies PointNet iteratively to local regions of the point set, so that it learns local features at progressively larger scales. Second, because point sets are often non-uniformly distributed, assuming uniform density degrades network performance, so the authors propose a density-adaptive feature extraction method. With these two methods the network learns features more efficiently and more robustly.
2. Introduction
In PointNet++, the authors use the distance metric of the space to partition the point set into overlapping local regions (which can be thought of as patches). On this basis, local features are first extracted from small-scale geometric structures (shallow features). The scope is then enlarged and higher-level features are extracted from these local features, until the global feature of the whole point set is obtained. This process is similar to feature extraction in a CNN: low-level features are extracted first, and as the receptive field grows, the level of the extracted features gets higher and higher.
PointNet++ has to address two key issues: first, how to divide the point set into regions; second, how to use a feature extractor to obtain the local features of each region. The two issues are related: to apply one feature extractor to different regions, every partition needs to have the same structure. The CNN analogy helps here as well. In a CNN, the convolution block is the basic feature extractor and the region it covers is an n*n block of pixels. In a 3D point set we likewise need sub-regions with the same structure and a matching region feature extractor.
In this paper the authors use PointNet as the feature extractor, so the remaining problem is how to divide the point set into regions with the same structure. The authors define each partition as a neighborhood ball (again, a patch), which is determined by its center coordinates and its radius. The centers are selected with the farthest point sampling (FPS) algorithm. Choosing the radius is the harder part, because the input point set is non-uniform and regions may overlap or leave points uncovered. VGG argues that small convolution kernels work better in CNNs, but that holds because images lie on a regular grid where every region is equally regular. If PointNet++ uses a small radius, network performance is poor. Intuitively, a neighborhood ball that is too small may not contain enough points to show a complete local pattern. The grouping can also be done with KNN.
3. Network structure
PointNet++ is an extension of PointNet: it adds a hierarchical structure on top of PointNet, which lets the network extract higher-level features over larger and larger regions.

Each set abstraction layer of the network consists of 3 parts: the Sampling layer, the Grouping layer and the PointNet layer.
- Sampling layer: samples the input points and selects a number of center points from them (questions: how are they chosen, and how many?);
- Grouping layer: divides the point set into several regions around the center points obtained in the previous step;
- PointNet layer: encodes each of the regions obtained above into a feature vector.
The input of each set abstraction layer is N×(d+C), where N is the number of input points, d is the coordinate dimension and C is the feature dimension. The output is N'×(d+C'), where N' is the number of output points, d is the unchanged coordinate dimension and C' is the new feature dimension. The function and implementation of each layer are described in detail below.
1). Sampling layer
Farthest point sampling is used to select N' points. The reason for choosing this method is that, compared with random sampling, it covers the whole point set better. The number of center points is a hyperparameter set by hand.
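To make the sampling step concrete, here is a minimal NumPy sketch of farthest point sampling. It is an illustration only, not the paper's implementation: the function name `farthest_point_sampling`, the `start` argument and the random test cloud are assumptions made for this example.

```python
import numpy as np

def farthest_point_sampling(xyz, npoint, start=0):
    """Pick npoint indices from xyz (N, 3) so that the chosen points are spread out."""
    n = xyz.shape[0]
    chosen = np.zeros(npoint, dtype=np.int64)
    # Squared distance from every point to its nearest already-chosen center.
    min_dist = np.full(n, np.inf)
    current = start  # assumption: start from a fixed index instead of a random one
    for i in range(npoint):
        chosen[i] = current
        d = np.sum((xyz - xyz[current]) ** 2, axis=1)
        min_dist = np.minimum(min_dist, d)
        # Next center: the point farthest from all centers chosen so far.
        current = int(np.argmax(min_dist))
    return chosen

rng = np.random.default_rng(0)
cloud = rng.random((1024, 3))            # hypothetical input cloud
centers = farthest_point_sampling(cloud, 512)
print(centers.shape)  # (512,)
```

Each new center is the point whose distance to the nearest existing center is largest, which is why the result covers the cloud more evenly than random sampling.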
2). Grouping layer
This layer uses the ball query method to generate N' local regions. According to the paper there are two parameters: the maximum number of points per region K and the radius of the ball. The radius dominates: points are searched within a ball of the given radius, up to an upper limit of K. Both the radius and the number of points per region are specified by hand. This step can also be done with KNN, and the choice between the two has little effect on the results.
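The ball query described here can be sketched in a few lines of NumPy. This is a simplified, unbatched illustration with assumed names (`ball_query`, `pts_cnt`). Padding unfilled slots with the first hit is an assumption about how a fixed output shape can be kept when a ball contains fewer than K points.

```python
import numpy as np

def ball_query(xyz, centers, radius, nsample):
    """For each center, return up to nsample indices of points within radius.

    Returns idx (M, nsample) and pts_cnt (M,), the true number of hits per ball.
    Unfilled slots repeat the first hit, so idx always has a fixed shape.
    """
    m = centers.shape[0]
    idx = np.zeros((m, nsample), dtype=np.int64)
    pts_cnt = np.zeros(m, dtype=np.int64)
    for j in range(m):
        dist = np.linalg.norm(xyz - centers[j], axis=1)
        # Radius comes first, then the cap of nsample (K) points.
        hits = np.flatnonzero(dist < radius)[:nsample]
        pts_cnt[j] = len(hits)
        if len(hits) > 0:
            idx[j, :] = hits[0]            # pad with the first hit (assumption)
            idx[j, :len(hits)] = hits
    return idx, pts_cnt

xyz = np.array([[0.0, 0, 0], [0.1, 0, 0], [5.0, 0, 0]])
idx, pts_cnt = ball_query(xyz, xyz[:1], radius=0.2, nsample=4)
print(idx, pts_cnt)  # [[0 1 0 0]] [2]
```

When the centers are taken from the cloud itself, every ball contains at least its own center, so no region is empty.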

3). PointNet layer
This layer is a PointNet. It accepts an input of size N'×K×(d+C) and outputs N'×(d+C'). Note that before being fed to the network, the points of each region are converted to coordinates relative to the center point. The authors say this captures the point-to-point relations within the region (I have my doubts; to me it feels more like a normalization, somewhat like Batch Norm?).
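The following sketch shows the data flow of such a layer for coordinates only (d=3, C=0): relative coordinates, a shared per-point MLP, then max pooling over the K points of each region. The random weights, the omitted biases and the name `pointnet_layer` are assumptions for illustration; a real layer is trained and usually includes batch normalization.

```python
import numpy as np

def pointnet_layer(grouped_xyz, new_xyz, weights):
    """grouped_xyz: (M, K, 3), new_xyz: (M, 3), weights: list of (in, out) matrices."""
    # Express each region in coordinates relative to its center point.
    feat = grouped_xyz - new_xyz[:, None, :]
    # Shared MLP: the same weights are applied to every point independently.
    for w in weights:
        feat = np.maximum(feat @ w, 0.0)   # linear layer + ReLU (bias omitted)
    # Max pooling over the K points gives one feature vector per region.
    return feat.max(axis=1)

rng = np.random.default_rng(0)
dims = [3, 64, 64, 128]                    # mlp=[64,64,128] on xyz input
weights = [rng.standard_normal((i, o)) * 0.1 for i, o in zip(dims[:-1], dims[1:])]
grouped = rng.random((512, 32, 3))         # 512 regions, 32 points each
centers = grouped[:, 0, :]                 # stand-in for the sampled centers
out = pointnet_layer(grouped, centers, weights)
print(out.shape)  # (512, 128)
```

Because of the subtraction, shifting a region and its center by the same offset leaves the output unchanged, and because of the max pooling the order of the K points does not matter.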
4). Handling non-uniform point clouds
When the point cloud is non-uniform and the same ball radius is used for every region, some sparse regions end up with too few sampled points. A thought of my own here: from one point of view, could the density of the point cloud be regarded as an attribute of the sample? In that sense it is not a defect to be overcome. If the concern is that some regions contain too few points, could a lower threshold be added instead?
The authors consider this a problem that needs solving and propose two methods: Multi-scale grouping (MSG) and Multi-resolution grouping (MRG). The following is the schematic diagram from the paper.

Here are the two methods. The first is multi-scale grouping (MSG). For the same center point, if 3 different scales are used, 3 regions are built around each center, each with a different radius and a different number of points. The regions of different scales around a center are sent to different PointNets for feature extraction, and the results are concatenated to form the feature of that center point. In other words, MSG is equivalent to running several hierarchical structures in parallel: each has the same center points but regions of a different extent (which can be understood as the receptive field?), so the PointNet input and output sizes also differ, and the PointNet outputs of the different scales are concatenated at the end.
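A toy sketch of the MSG idea for a single center point is given below. The radii, point limits and output widths are hypothetical values picked for the example, and each "PointNet" is reduced to one random linear layer with ReLU and max pooling, so this shows only the structure (one extractor per scale, then concatenation), not the real network.

```python
import numpy as np

def region_feature(xyz, center, radius, nsample, w):
    """Max-pooled feature of the points within radius of center (toy one-layer PointNet)."""
    dist = np.linalg.norm(xyz - center, axis=1)
    hits = np.flatnonzero(dist < radius)[:nsample]
    local = xyz[hits] - center                      # relative coordinates
    return np.maximum(local @ w, 0.0).max(axis=0)

def msg_feature(xyz, center, radii, nsamples, weights):
    """Concatenate the features of one center computed at several scales."""
    parts = [region_feature(xyz, center, r, k, w)
             for r, k, w in zip(radii, nsamples, weights)]
    return np.concatenate(parts)

rng = np.random.default_rng(0)
cloud = rng.random((1024, 3))
# Hypothetical scales: three radii, three point limits, three output widths.
radii, nsamples, out_dims = [0.1, 0.2, 0.4], [16, 32, 128], [32, 64, 128]
weights = [rng.standard_normal((3, d)) for d in out_dims]
feat = msg_feature(cloud, cloud[0], radii, nsamples, weights)
print(feat.shape)  # (224,)
```

The output width is the sum of the per-scale widths (32 + 64 + 128), which is also why MSG costs more than a single scale: every center is processed once per radius.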
The other is multi-resolution grouping (MRG). MSG obviously slows down computation, so MRG is proposed. As I understand it, this method concatenates the grouping features of different levels; because the scales differ, the low-level points first go through a PointNet and are then concatenated with the high-level feature. It feels a bit like the skip connections in ResNet.
In this part the authors also mention random input dropout (DP): before the point cloud is fed in, points are randomly dropped from the point set, with a ratio of 95% (as I understand it, this is the maximum dropout ratio). It is somewhat like data augmentation and is another way to improve the robustness of the model. How effective are these methods? Let's look at the results.
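Random input dropout can be sketched as follows. The sketch assumes that the dropout ratio is itself drawn at random up to a maximum of 0.95 and that each point is then dropped independently with that probability. This is one reading of the paper, so treat the details as an assumption. The function name and the rule of always keeping the first point are also choices made for this example.

```python
import numpy as np

def random_input_dropout(xyz, max_ratio=0.95, rng=None):
    """Randomly thin out a point cloud xyz (N, 3) before it is fed to the network."""
    rng = np.random.default_rng() if rng is None else rng
    ratio = rng.random() * max_ratio           # dropout ratio drawn from [0, max_ratio)
    keep = rng.random(xyz.shape[0]) >= ratio   # drop each point with probability `ratio`
    keep[0] = True                             # never return an empty cloud
    return xyz[keep]

rng = np.random.default_rng(0)
cloud = rng.random((1024, 3))
sparse = random_input_dropout(cloud, rng=rng)
print(sparse.shape[0] <= 1024)  # True
```

Training on clouds of varying density in this way is what lets the model cope with sparse inputs at test time.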
The classification results in the paper show that multi-scale grouping (MSG, MRG) does not improve classification accuracy over single-scale grouping (SSG). Its one advantage is that MSG stays robust when the point cloud is very sparse. For robustness, random input dropout (DP) actually contributes more.

The segmentation results in the paper show that (MSG+DP) does improve on SSG, and the gap is larger on non-uniform point clouds. However, the authors do not report the individual contributions of MSG and DP, so it is hard to tell which of the two is responsible for the improvement.
4. Understanding the core structure through code
Let's use the core code to see how the hierarchical structure in PointNet++ (also called set abstraction layers) works. The code above has 3 set abstraction layers (taking SSG, single scale, as the example).
Take the first set abstraction layer as an example, which corresponds to line 9 of the code (PointNet Set Abstraction (SA) Module). Suppose the input point cloud has shape (16,1024,3), that is, a batch of 16 samples, each with 1024 points that carry only xyz coordinates. It is fed to the first set abstraction layer with the following parameters:
- l0_xyz: <points that contain only coordinates>
- l0_points: <contains not only the coordinates but also the features extracted for each point by the previous layer, so the first layer has none>
- npoint=512: <the Sampling layer picks 512 points as center points; this number is chosen by hand, by experience or by experiment>
- radius=0.2: <the radius of the ball in the Grouping layer's ball query is 0.2; note that this is on the normalized coordinate scale>
- nsample=32: <points are sampled around each center point within the ball of the given radius, up to 32; the radius dominates>
- mlp=[64,64,128]: <the PointNet layer has 3 layers, with feature dimensions 64, 64 and 128>
The remaining parameters are not important and are not discussed here.
Let's look more closely at how each layer is implemented, focusing on how the shape of the data changes.
SA(512,0.2,[64,64,128]) -> SA(128,0.4,[128,128,256]) -> SA([256,512,1024]) ->
FC1 -> FC2 -> FC(K)
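The shapes along this pipeline can be traced with a small helper. It is a bookkeeping sketch only: `sa_output_shape` is a hypothetical function, and it assumes that the third SA layer, which lists no npoint or radius, groups all remaining points into a single region.

```python
def sa_output_shape(in_shape, npoint, mlp):
    """Feature shape after one SSG set abstraction layer.

    in_shape is (batch, N, C); npoint=None means "group all points into one region".
    """
    batch, _, _ = in_shape
    n_out = 1 if npoint is None else npoint
    # One feature vector of width mlp[-1] per center point.
    return (batch, n_out, mlp[-1])

shape = (16, 1024, 3)
layers = [(512, [64, 64, 128]), (128, [128, 128, 256]), (None, [256, 512, 1024])]
for npoint, mlp in layers:
    shape = sa_output_shape(shape, npoint, mlp)
    print(shape)
# (16, 512, 128)
# (16, 128, 256)
# (16, 1, 1024)
```

The final (16, 1, 1024) tensor is the global feature that the fully connected layers FC1, FC2 and FC(K) turn into K class scores.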
The data first goes through sampling and grouping, which corresponds to the following code. Let's see how this function is implemented.

The inputs of this function are passed in from above; the outputs are explained here.
- new_xyz: the coordinates of the 512 center points obtained by sampling
- idx: the indices of the points in each region
- grouped_xyz: the grouped point set, a four-dimensional tensor (batch_size, 512 regions, 32 points per region, 3 coordinates per point)
- new_points: also the grouped point set, but carrying features. For the first layer it equals grouped_xyz; you can choose to concatenate the coordinates with the features before the convolution.

An important step after sampling is partitioning, done by one of the two functions above. With KNN partitioning, a fixed number of points (32, as above) is taken around each center point, and idx holds the indices of these points, (32*512) in total. Since the original point cloud has only 1024 points, the regions inevitably overlap. That is not a problem; it is the desired effect.
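The overlap argument can be checked numerically with a brute-force KNN grouping. The sketch below is unbatched and uses randomly chosen centers instead of farthest point sampling to stay short; `knn_group` is an assumed name.

```python
import numpy as np

def knn_group(xyz, centers, k):
    """Return idx (M, k): indices of the k nearest points to each center."""
    # Pairwise squared distances between centers (M, 3) and points (N, 3).
    d2 = np.sum((centers[:, None, :] - xyz[None, :, :]) ** 2, axis=2)
    return np.argsort(d2, axis=1)[:, :k]

rng = np.random.default_rng(0)
cloud = rng.random((1024, 3))
centers = cloud[rng.choice(1024, 512, replace=False)]   # random centers (assumption)
idx = knn_group(cloud, centers, 32)
print(idx.shape, idx.size)          # (512, 32) 16384
print(len(np.unique(idx)) <= 1024)  # True
```

There are 16384 index slots but at most 1024 distinct points, so on average each point appears in many regions.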
ball query returns idx and pts_cnt. Because it partitions by radius first, the number of points per region is not fixed (at most 32), so pts_cnt records how many points each region contains, which makes it possible to split idx by region.
The picture below is a schematic of ball query (a) and KNN (b): one is driven by the radius, the other by the number of points.

With the indices of the points in each region, the points are grouped. The result is a four-dimensional tensor (batch_size, 512 regions, 32 points per region, 3 coordinates per point), as in the picture below. There is a gap here: if the indices come from ball query, the number of points in a group is not necessarily 32, yet pts_cnt is not used at this point, so how does the code know how to split idx?
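The grouping itself is just a gather by index, which NumPy's advanced indexing can express directly. In the sketch below `group_points` is an assumed name and the indices are random stand-ins for the output of ball query or KNN. One possible answer to the question above, offered as an assumption rather than a confirmed detail, is that idx is padded to exactly 32 entries per region (for example by repeating a point already in the ball), so the gather never needs pts_cnt.

```python
import numpy as np

def group_points(xyz, idx):
    """xyz: (B, N, 3), idx: (B, M, K) -> grouped points of shape (B, M, K, 3)."""
    # The batch index broadcasts against idx so each sample gathers from itself.
    batch = np.arange(xyz.shape[0])[:, None, None]
    return xyz[batch, idx]

rng = np.random.default_rng(0)
xyz = rng.random((16, 1024, 3))
idx = rng.integers(0, 1024, size=(16, 512, 32))   # stand-in for ball query / KNN output
grouped_xyz = group_points(xyz, idx)
print(grouped_xyz.shape)  # (16, 512, 32, 3)
```

With a fixed-shape idx, repeated indices simply produce repeated points, which the later max pooling ignores.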
Next comes the PointNet layer.
The grouped point set is then convolved. As noted above, new_points is a 4-dimensional tensor with shape (batch_size, 512, 32, 3), i.e. (batch_size, 512 regions, 32 points per region, features of each point). The 32 points of each region go through PointNet's convolutions and pooling and are merged into one feature vector, which belongs to the center point of that region.
Reference: https://zhuanlan.zhihu.com/p/88238420