SIFT feature point extraction

2022-06-23 05:34:00 Everything wins L

SIFT Algorithm

Introduction to the algorithm

SIFT stands for scale-invariant feature transform.
1. It is used to detect and describe local features in images. It looks for extreme points in scale space and extracts their location, scale, and rotation invariants.

2. Describing and detecting local image features helps identify objects. SIFT features are based on the local appearance of interest points on the object and are independent of image size and rotation.

3. The essence of the SIFT algorithm is to find keypoints (feature points) in different scale spaces and to compute their orientation. The keypoints SIFT finds are highly salient points that do not change with illumination, affine transformation, noise, and similar factors, such as corner points, edge points, bright spots in dark regions, and dark spots in bright regions.

Algorithm operation steps

Image pyramid

Gaussian pyramid

The Gaussian pyramid of an image is obtained by blurring the image with a Gaussian function and downsampling it.
The image is processed into several groups (octaves) of 6 images each. The images within a group have the same size but different blur coefficients. Assuming the usual construction, the blur coefficients within a group are σ, kσ, k²σ, ..., with k = 2^(1/n).
Downsampling:
σ: convolve at every pixel.
2σ: take every other pixel, then convolve.
Downsampling starts whenever the coefficient of σ reaches 2, 4, 8, ...
Successive Gaussian blurs combine as the square root of the sum of squares. Convolving first with a Gaussian kernel of 0.5 and then with one of 1.52 therefore has the same effect as convolving once with a Gaussian kernel of 1.6, since sqrt(0.5² + 1.52²) ≈ 1.6.
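The 0.5 / 1.52 / 1.6 relationship can be checked numerically. The sketch below is only an illustration; the function name `incremental_sigma` is made up for this example and does not come from any library.

```python
import math

def incremental_sigma(sigma_target, sigma_current):
    """Sigma of the extra blur that takes an image from sigma_current to sigma_target."""
    if sigma_target < sigma_current:
        raise ValueError("target blur must not be smaller than the current blur")
    # Gaussian blurs add in variance: sigma_target^2 = sigma_current^2 + extra^2
    return math.sqrt(sigma_target ** 2 - sigma_current ** 2)

extra = incremental_sigma(1.6, 0.5)
print(round(extra, 2))  # 1.52
```

So an image that already carries a blur of 0.5 only needs an extra blur of about 1.52 to reach 1.6.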
S: the number of layers in each group.
n: the number of images from which features are to be extracted.
The vertical direction of the pyramid is the scale direction.
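A minimal sketch of the blur schedule inside one group, assuming the common convention S = n + 3 and k = 2^(1/n) (with n = 3 this gives the 6 images per group mentioned above). The base value 1.6 and the helper names `octave_sigmas` and `downsample` are assumptions made for this illustration.

```python
def octave_sigmas(sigma0=1.6, n=3):
    """Blur levels of the S = n + 3 Gaussian layers in one group (octave)."""
    k = 2.0 ** (1.0 / n)          # ratio between neighbouring layers
    layers = n + 3                # assumed relation between S and n
    return [sigma0 * k ** i for i in range(layers)]

def downsample(image):
    """Keep every other pixel in both directions (image is a list of rows)."""
    return [row[::2] for row in image[::2]]

sigmas = octave_sigmas()
print(len(sigmas))                      # 6 layers per group
print([round(s, 3) for s in sigmas])    # layer n has blur 2 * sigma0

small = downsample([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]])
print(small)
```

The layer whose blur has doubled (index n) is the one that gets downsampled to start the next group.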

Gaussian function and image convolution

According to the 3σ rule, an N×N template is applied at every pixel of the image, where N = 6σ + 1, rounded up to the nearest odd number.
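The template size rule can be written as a small helper. This is just a sketch of the rule as stated here; the name `gaussian_kernel_size` is hypothetical.

```python
import math

def gaussian_kernel_size(sigma):
    """N = 6 * sigma + 1, rounded up to the nearest odd integer."""
    n = math.ceil(6 * sigma + 1)
    return n if n % 2 == 1 else n + 1

for sigma in (0.5, 1.0, 1.6):
    print(sigma, gaussian_kernel_size(sigma))
```

For σ = 1.6 this gives an 11×11 template.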
[Figure: two-dimensional Gaussian convolution]

Separable Gaussian convolution

Convolving the image directly with the two-dimensional template is slow, and image edge information is also lost badly. A separable Gaussian convolution can be used instead: first convolve the image once along the X direction with a 1×N template, then convolve the result once along the Y direction with an N×1 template, where N = 6σ + 1, rounded up to the nearest odd number. This cuts the work per pixel from N² to 2N multiplications and reduces the loss of image edge information caused by direct convolution.
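To see that the two passes really give the same result as the full N×N template, here is a small NumPy sketch. It assumes zero padding at the border, and the function names are made up for this example.

```python
import numpy as np

def gaussian_kernel_1d(sigma, size):
    """Normalised 1-D Gaussian template of odd length `size`."""
    half = size // 2
    x = np.arange(-half, half + 1, dtype=np.float64)
    g = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return g / g.sum()

def blur_separable(image, sigma, size):
    """1xN pass along X, then Nx1 pass along Y (zero padding at the border)."""
    g = gaussian_kernel_1d(sigma, size)
    rows = np.apply_along_axis(lambda r: np.convolve(r, g, mode="same"), 1, image)
    return np.apply_along_axis(lambda c: np.convolve(c, g, mode="same"), 0, rows)

def blur_direct(image, sigma, size):
    """Full NxN template, written out as a sum of shifted copies of the image."""
    g = gaussian_kernel_1d(sigma, size)
    kernel = np.outer(g, g)
    half = size // 2
    padded = np.pad(image, half, mode="constant")
    out = np.zeros(image.shape, dtype=np.float64)
    h, w = image.shape
    for dy in range(size):
        for dx in range(size):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

rng = np.random.default_rng(0)
image = rng.random((32, 32))
a = blur_separable(image, sigma=1.6, size=11)
b = blur_direct(image, sigma=1.6, size=11)
print(np.allclose(a, b))
```

The two-dimensional Gaussian is the outer product of two one-dimensional Gaussians, which is why the split works.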

Gaussian pyramid source code analysis

[Figure: Gaussian pyramid source code]

Difference-of-Gaussians (DoG) pyramid

Translation and scale normalization of an image uses a transformation to eliminate the influence of translation and scaling on the image.

Building the DoG pyramid

The DoG pyramid is built on top of the Gaussian pyramid. Within each group of the Gaussian pyramid, every pair of adjacent layers is subtracted (the next layer minus the previous one), which produces the DoG pyramid.

Question: each group has six layers before the difference is taken, so why does it end up with 5 layers?
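One way to look at the six-versus-five question: subtracting adjacent layers uses pairs, and 6 layers contain only 5 adjacent pairs. The sketch below uses placeholder arrays instead of real blurred images, and the function name is hypothetical.

```python
import numpy as np

def difference_of_gaussians(gaussian_octave):
    """Next layer minus previous layer, for every adjacent pair in one group."""
    return [upper - lower
            for lower, upper in zip(gaussian_octave[:-1], gaussian_octave[1:])]

# Six stand-in layers of equal size (placeholder values, not real blurred images).
octave = [np.full((4, 4), float(i)) for i in range(6)]
dog = difference_of_gaussians(octave)
print(len(dog))  # 5
```

Of these 5 DoG layers, only the middle 3 have a neighbour both above and below, which would match n = 3 if S = n + 3 is assumed.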

DoG pyramid source code analysis

[Figure: DoG pyramid source code]

Scale-space extremum (keypoint) detection (the most critical step)

Keypoints are the local extreme points of DoG space. The initial search for keypoints is done by comparing each DoG image with its two adjacent layers within the same group.

Extreme point detection process

[Figure: schematic diagram of extreme point detection]
n: the number of images from which features are to be extracted.
If the absolute value of the response is too small, the point may be noise, so such points are not kept.
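A rough sketch of the comparison, assuming the usual 3×3×3 neighbourhood (8 neighbours in the same layer plus 9 in each adjacent layer, 26 in total). The threshold value and the strict comparison are choices made for this example, not taken from the original source code.

```python
import numpy as np

def is_scale_space_extremum(dog, s, y, x, threshold=0.0):
    """dog has shape (layers, height, width); (s, y, x) must be an interior sample."""
    value = dog[s, y, x]
    if abs(value) <= threshold:
        return False  # response too weak, probably noise
    cube = dog[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2]
    neighbours = np.delete(cube.ravel(), 13)  # index 13 is the centre of the cube
    return bool(value > neighbours.max() or value < neighbours.min())

dog = np.zeros((3, 5, 5))
dog[1, 2, 2] = 1.0
print(is_scale_space_extremum(dog, 1, 2, 2, threshold=0.5))
```

The top and bottom DoG layers of a group cannot be tested this way because one of their neighbours is missing.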

Extreme point detection source code analysis

[Figure: extreme point detection source code]

Keypoint localization

The extreme points detected by the method above are extreme points of the discrete space.
Fitting a three-dimensional quadratic function determines the location and scale of each keypoint precisely. It also removes low-contrast keypoints and unstable edge response points (the DoG operator produces a strong edge response), which improves matching stability and noise resistance.

Precise positioning of keypoints

An extreme point of the discrete space is not the true extreme point. Using interpolation over the known discrete samples to obtain the extreme point of the continuous space is called sub-pixel interpolation.

To improve the stability of the keypoints, a curve is fitted to the DoG function in scale space, using the Taylor expansion of the DoG function in scale space as the interpolation function. (I could not follow the derivation from this point on.)
(The Taylor expansion is truncated at the second-order term, so it acts as a local quadratic fit to the sampled DoG values. It approximates the function near the sample point rather than reducing the error without limit.)
Solve for the offset of the extreme point. When the offset is greater than 0.5 in any dimension (x, y, or σ), the interpolation centre has shifted to a neighbouring sample, so the position of the current keypoint must be changed and the interpolation repeated at the new position until it converges. If the set number of iterations is exceeded or the point moves outside the image boundary, the point is deleted.
Points whose interpolated response is too small in absolute value are deleted as well, because they are probably noise.

Eliminate edge response

Some notes on edge response:
A poorly defined extremum of the DoG operator has a large principal curvature across the edge and a small principal curvature along the edge. The DoG operator produces a strong edge response, so unstable edge response points need to be eliminated.
With H the 2×2 Hessian of D at the keypoint and α = rβ, the ratio used is Tr(H)² / Det(H) = (α + β)² / (αβ) = (r + 1)² / r. Note that the numerator of the last expression is squared.
α and β are the eigenvalues of the Hessian matrix H.
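The ratio test can be sketched in a few lines. The default r = 10 is the value commonly quoted for SIFT and is an assumption here, as is the function name.

```python
def passes_edge_test(dxx, dyy, dxy, r=10.0):
    """Keep a point only if Tr(H)^2 / Det(H) < (r + 1)^2 / r."""
    trace = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:
        return False  # curvatures of opposite sign, not a well-defined extremum
    return trace * trace / det < (r + 1.0) ** 2 / r

print(passes_edge_test(-2.0, -2.0, 0.0))    # blob-like: similar curvature both ways
print(passes_edge_test(-20.0, -0.5, 0.0))   # edge-like: one curvature dominates
```

Using the trace and determinant avoids computing the eigenvalues α and β explicitly, because only their ratio matters.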

Taylor interpolation source code analysis for precise positioning

[Figure: Taylor interpolation source code]

Keypoint orientation assignment

To make the descriptor rotation invariant, the local features of the image are used to assign a reference orientation to each keypoint. The image gradient is used to obtain a stable orientation of the local structure.
Note that the gradients are taken from the Gaussian pyramid.

Compute the gradient direction and gradient magnitude of the pixels around the keypoint, then let them vote in a histogram of gradient directions. (In practice one direction is counted every 10°, giving 36 bins. The bin with the most votes is the main direction and the next highest is the secondary direction. A secondary direction is kept only if it exceeds 80% of the main direction.)
A feature point with two directions is handled as two feature points.
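A simplified sketch of the voting, under a few stated assumptions: gradients come from central differences on an already smoothed patch, votes are weighted by gradient magnitude only (no Gaussian window), and a secondary direction must be a local peak in the histogram. The function names are invented for this example.

```python
import numpy as np

def orientation_histogram(patch, bins=36):
    """patch: 2-D array of Gaussian-smoothed intensities around a keypoint."""
    dx = patch[1:-1, 2:] - patch[1:-1, :-2]
    dy = patch[2:, 1:-1] - patch[:-2, 1:-1]
    magnitude = np.hypot(dx, dy)
    angle = np.degrees(np.arctan2(dy, dx)) % 360.0
    bin_index = (angle // (360.0 / bins)).astype(int) % bins
    hist = np.zeros(bins)
    np.add.at(hist, bin_index, magnitude)  # each pixel votes with its magnitude
    return hist

def dominant_orientations(hist, ratio=0.8):
    """Angles (degrees) of histogram peaks that reach `ratio` of the highest peak."""
    bins = len(hist)
    peak = hist.max()
    result = []
    for i in range(bins):
        left, right = hist[(i - 1) % bins], hist[(i + 1) % bins]
        if hist[i] > left and hist[i] > right and hist[i] >= ratio * peak:
            result.append(i * 360.0 / bins)
    return result

ramp = np.tile(np.arange(9.0), (9, 1))   # intensity grows along x only
print(dominant_orientations(orientation_histogram(ramp)))
```

Each returned angle would become its own keypoint with the same location and scale. Angles here follow image coordinates, with y pointing down.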

Feature point descriptor

Each keypoint carries three pieces of information: location, scale, and orientation. The next step is to build a descriptor for each keypoint so that it does not change under various changes, such as changes in illumination or viewing angle. The descriptor should also be highly distinctive, to raise the probability that feature points are matched correctly.

Keypoint matching algorithm: KNN.
Keypoints are matched by comparing their descriptors.
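A brute-force sketch of KNN matching with k = 2. The ratio test (accept a match only when the nearest descriptor is clearly closer than the second nearest) and the value 0.8 are common practice that I am assuming here; this section itself only names KNN. The function name is hypothetical.

```python
import numpy as np

def match_descriptors(query, train, ratio=0.8):
    """Return (query_index, train_index) pairs; train needs at least two rows."""
    matches = []
    for qi, q in enumerate(query):
        dist = np.linalg.norm(train - q, axis=1)   # Euclidean distance to every row
        nearest, second = np.argsort(dist)[:2]     # the 2 nearest neighbours
        if dist[nearest] < ratio * dist[second]:
            matches.append((qi, int(nearest)))
    return matches

train = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
query = np.array([[0.1, 0.0], [5.0, 0.0]])
print(match_descriptors(query, train))
```

The second query point sits halfway between two candidates, so it is rejected as ambiguous. Real SIFT descriptors would be 128-dimensional rows instead of these 2-D toy vectors.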

128-dimensional vector:

The gradients in eight directions are computed for each seed point by interpolation.
SIFT descriptors are rotation invariant: the coordinate axes are first rotated to the keypoint's main direction, and only then are the descriptor statistics collected.
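To make the number 128 concrete, here is a heavily simplified sketch assuming the usual layout of 4 × 4 seed points with 8 direction bins each (4 × 4 × 8 = 128). It leaves out the rotation to the main direction, the Gaussian weighting, and the trilinear interpolation. The clipping at 0.2 followed by renormalisation is the commonly described illumination step and is an assumption here, as is the name `simple_descriptor`.

```python
import numpy as np

def simple_descriptor(patch):
    """patch: 18x18 array, so central differences give a 16x16 gradient field."""
    dx = patch[1:-1, 2:] - patch[1:-1, :-2]
    dy = patch[2:, 1:-1] - patch[:-2, 1:-1]
    magnitude = np.hypot(dx, dy)
    angle = np.degrees(np.arctan2(dy, dx)) % 360.0
    bin_index = (angle // 45.0).astype(int) % 8      # 8 direction bins of 45 degrees
    descriptor = np.zeros((4, 4, 8))
    for cy in range(4):
        for cx in range(4):
            cell = (slice(4 * cy, 4 * cy + 4), slice(4 * cx, 4 * cx + 4))
            # Each 4x4 pixel cell is one seed point with its own 8-bin histogram.
            np.add.at(descriptor[cy, cx], bin_index[cell], magnitude[cell])
    vec = descriptor.ravel()
    norm = np.linalg.norm(vec)
    if norm > 0:
        vec = np.minimum(vec / norm, 0.2)            # damp very large gradients
        vec = vec / np.linalg.norm(vec)              # back to unit length
    return vec

rng = np.random.default_rng(0)
vec = simple_descriptor(rng.random((18, 18)))
print(vec.shape)
```

Normalising to unit length makes the vector insensitive to a uniform change in contrast, since that scales every gradient by the same factor.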
[Figure: a classmate's explanation of the descriptor]

Questions in this chapter

Question 1: each group has six layers before the difference is taken, so why does it end up with 5 layers?
Question 2: I could not follow the Taylor expansion of a function of a vector.
Question 3: trilinear interpolation.

Copyright notice
This article was written by [Everything wins L]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/174/202206230227497925.html