Binocular 3D perception (I): preliminary understanding of binocular
2022-06-25 15:36:00 【anthony-36】
Advantages:
- Monocular 3D perception depends on prior knowledge and geometric constraints.
- Deep learning algorithms depend heavily on the size, quality, and diversity of the dataset.
- A binocular (stereo) system resolves the ambiguity caused by perspective projection.
- Binocular perception does not depend on object detection results, so it works for any obstacle.
Disadvantages:
- Hardware: the two cameras must be accurately registered (calibrated), and that registration must stay valid while the vehicle is running.
- Software: the algorithm has to process data from both cameras at the same time, so the computational complexity is high.
Binocular depth estimation
Basic principle
1. Concepts and formulas
B: baseline length (the distance between the two cameras)
f: focal length of the camera
d: disparity (the offset between the projections of the same 3D point in the left and right images)
f and B are constants, so to solve for the depth z we only need to estimate the disparity $d = x_l - x_r$.
By similar triangles:
$$
\begin{cases}
f/z = x_l/x \\
f/z = x_r/(x - B)
\end{cases}
$$
Only x and z are unknown. Eliminating x (the two equations give $x_l - x_r = fB/z$) yields:
$$
z = fB/d
$$
2. Disparity estimation: for each pixel in the left image, find its matching point in the right image.
- For each candidate disparity (within a limited range), compute a matching error. The resulting three-dimensional array of errors is called the cost volume.
- The matching error is computed over a local neighborhood of the pixel, for example by summing the differences of all corresponding pixel values in that neighborhood.
- From the cost volume, take the disparity with the minimum matching error at each pixel; the depth value then follows from that disparity.
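The matching procedure above can be sketched on a single rectified scanline. This is a minimal illustration, not the author's implementation: the function names, the window size, the toy pixel rows, and the camera parameters (f = 700 px, B = 0.5 m) are all assumptions chosen for the example. It builds a sum-of-absolute-differences cost volume, picks the minimum-cost disparity per pixel, and converts it to depth with z = fB/d.

```python
def cost_volume_row(left, right, max_disp, half_win=1):
    """SAD matching cost for one rectified scanline.

    Returns cost[x][d]: the sum of absolute differences between the window
    centred at x in the left row and the window centred at x - d in the
    right row. Candidates that would leave the image keep an infinite cost.
    """
    width = len(left)
    inf = float("inf")
    cost = [[inf] * (max_disp + 1) for _ in range(width)]
    for x in range(half_win, width - half_win):
        for d in range(max_disp + 1):
            if x - d - half_win < 0:  # right-image window would leave the row
                break
            cost[x][d] = sum(
                abs(left[x + k] - right[x - d + k])
                for k in range(-half_win, half_win + 1)
            )
    return cost


def winner_take_all(cost):
    """Per pixel, pick the disparity with the minimum matching cost.

    Ties (for example in flat, textureless regions) resolve to the
    smallest disparity, which is arbitrary rather than correct.
    """
    return [min(range(len(c)), key=c.__getitem__) for c in cost]


def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """z = f * B / d; zero disparity corresponds to a point at infinity."""
    if disparity_px <= 0:
        return float("inf")
    return focal_px * baseline_m / disparity_px


# Toy scanline: the left row is the right row shifted by 2 pixels,
# so the textured pixels should come out with disparity 2.
right = [0, 0, 0, 10, 50, 90, 20, 0, 0, 0, 0, 0]
left = [0, 0] + right[:-2]

cost = cost_volume_row(left, right, max_disp=3)
disparity = winner_take_all(cost)
# Assumed camera: f = 700 px, B = 0.5 m (hypothetical values).
depth = [depth_from_disparity(700.0, 0.5, d) for d in disparity]

print(disparity[5:9])  # prints [2, 2, 2, 2]
print(depth[5:9])      # prints [175.0, 175.0, 175.0, 175.0]
```

Only the textured pixels (indices 5 to 8) are printed. In the flat regions of the toy row every candidate disparity matches equally well, so the winner-take-all result there is not meaningful.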
PSMNet
1. A shared (weight-sharing) convolutional network extracts features from the left and right images.
- It includes downsampling; a pyramid structure and dilated (atrous) convolution extract multi-resolution information and enlarge the receptive field.
2. The left and right feature maps are used to build the cost volume.
3. 3D convolution extracts information across the left and right feature maps and across disparity levels.
4. The result is upsampled to the original resolution, and the disparity with the smallest matching error is selected.
5. Pipeline
6. Result analysis (KITTI dataset)
- Errors at the boundary between objects and background
Cause: although the features contain neighborhood information, there is no supervision from high-level semantic information, so the network cannot understand the scene.
Improvement: use object detection and semantic segmentation results for post-processing, or train with multi-task learning.
- Errors at long range
distance | 0-10m | 10-30m | 30-60m | 60-inf | 0-inf |
---|---|---|---|---|---|
Depth error (RMSE) | 0.268 | 1.203 | 6.056 | 16.604 | 2.605 |
Cause: at long range the disparity is small, so it is hard to resolve with discrete image pixels.
$$
z = fB/d
$$
Improvement: ① increase the spatial resolution of the image (longer focal length), so that distant objects cover more pixels;
② increase the baseline length, which enlarges the disparity range.
- Large depth errors in low-texture or low-light regions
Cause: effective features for computing the matching error cannot be extracted in these regions.
Improvement: increase the dynamic range of the camera, or use a sensor that measures distance directly.
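The growth of error with distance follows from differentiating z = fB/d: a disparity error Δd produces a depth error of roughly z²/(fB)·Δd. The sketch below evaluates this first-order estimate under assumed values (f = 700 px, B = 0.5 m, Δd = 1 px; none of these come from the KITTI results above, and the function name is hypothetical). It also shows why the two long-range remedies help: doubling either the focal length or the baseline halves the estimate.

```python
def depth_error(z_m, focal_px, baseline_m, disp_err_px=1.0):
    """First-order depth error implied by z = f * B / d.

    |dz/dd| = f * B / d**2 = z**2 / (f * B), so an error of disp_err_px
    in disparity maps to about z**2 / (f * B) * disp_err_px in depth.
    """
    return z_m ** 2 / (focal_px * baseline_m) * disp_err_px


# Assumed camera parameters, for illustration only.
focal_px, baseline_m = 700.0, 0.5

for z in (5.0, 20.0, 45.0, 80.0):
    base = depth_error(z, focal_px, baseline_m)
    longer_focal = depth_error(z, 2 * focal_px, baseline_m)
    wider_baseline = depth_error(z, focal_px, 2 * baseline_m)
    print(
        f"z = {z:5.1f} m | error = {base:6.2f} m | "
        f"2x focal = {longer_focal:6.2f} m | 2x baseline = {wider_baseline:6.2f} m"
    )
```

The quadratic dependence on z is the point of the example: the same one-pixel disparity error costs far more depth accuracy at 80 m than at 5 m. Passing a smaller `disp_err_px` models sub-pixel disparity estimation.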
The detailed simulation process is covered in the next chapter.