[Paper notes] Street-View Change Detection with Deconvolutional Networks
2022-06-25 15:14:00 【m0_ sixty-one million eight hundred and ninety-nine thousand on】
The paper

Paper title: Street-View Change Detection with Deconvolutional Networks
Paper link: https://www.researchgate.net/publication/304533064_Street-View_Change_Detection_with_Deconvolutional_Networks
Main content
The paper proposes a system for detecting structural changes in street-view video captured by a vehicle-mounted monocular camera. A multi-sensor fusion SLAM system is combined with a fast dense 3D reconstruction pipeline to provide roughly registered image pairs to a deep deconvolutional network for pixel-level change detection.
It also proposes CDNet, an efficient CNN architecture built from stacked contraction and expansion blocks, to detect changes between image pairs. With about 1.4M parameters the network is relatively compact: it balances performance against model size, suits mobile platforms, and is less prone to overfitting on small datasets.
Contributions:
- A deep deconvolutional architecture that performs strongly on the street-view change detection task (outperforming hand-crafted descriptors) while staying lightweight enough for embedded devices (1.4M parameters).
- A new dataset for urban scene change detection that contains challenging seasonal and lighting changes.
- A multi-sensor fusion SLAM system, combined with a fast dense reconstruction pipeline, that roughly aligns image pairs so that changes can be detected across time.
Process Overview


- (a) A multi-sensor fusion SLAM system processes the video sequences from times t1 and t2, using GPS, inertial odometry, and RGB image data to estimate the vehicle motion and a sparse 3D reconstruction;
- (b) The sequences are registered across time using approximate GPS positions, robust feature matching, and bundle adjustment;
- (c) The reconstruction is efficiently densified with a new slanted-plane smoother; the depth maps are used for reprojection (π) to align the images;
- (d) A deconvolutional network predicts the changes between the aligned RGB images;
- (e) The changes predicted by the network are shown in red. Nuisance variation due to lighting and seasonal changes is handled correctly.
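Step (c) can be illustrated with a minimal sketch of depth-based reprojection. It assumes a simple pinhole camera with shared intrinsics and a known rigid transform between the two views; the function name `reproject` and the parameter layout are hypothetical and are not taken from the paper.

```python
def reproject(u, v, depth, K, R, t):
    """Map pixel (u, v) with known depth from camera 1 into camera 2.

    K = (fx, fy, cx, cy) are assumed pinhole intrinsics shared by both views;
    R (3x3 nested list) and t (length 3) are an assumed rigid transform
    from the first camera frame to the second.
    """
    fx, fy, cx, cy = K
    # Back-project the pixel to a 3D point in the first camera's frame.
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    p1 = (x, y, depth)
    # Move the point into the second camera's frame.
    p2 = tuple(sum(R[i][j] * p1[j] for j in range(3)) + t[i] for i in range(3))
    if p2[2] <= 0:
        return None  # the point lies behind the second camera
    # Project with the same pinhole intrinsics.
    return (fx * p2[0] / p2[2] + cx, fy * p2[1] / p2[2] + cy)


K = (500.0, 500.0, 320.0, 240.0)
I3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
print(reproject(320.0, 240.0, 5.0, K, I3, (1.0, 0.0, 0.0)))  # (420.0, 240.0)
```

Applying this to every pixel of a dense depth map warps one image into the viewpoint of the other, which is the rough alignment the network relies on.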
Network architecture (CDNet)

CDNet consists of 4 contraction blocks, each made up of CONV, BNORM, ReLU, and max-pooling layers, followed by 4 expansion blocks, each made up of an unpooling layer (driven by the pooling indices stored by the corresponding contraction block), CONV, BNORM, and ReLU layers. The last layer is a linear operator followed by a softmax classifier. As a preprocessing step, the input RGB images are normalized per channel.
The 4 blocks of the contraction network build a rich representation; the 4 blocks of the expansion network refine the localization and delineation of the changed regions. The final change decision is made by a softmax linear classifier applied densely at every pixel.
- Each contraction block starts with a 7×7 convolutional layer with a fixed number of 64 features. Before the nonlinear activation, the output is batch normalized (batch normalization, BN) to reduce internal covariate shift during training and improve convergence. The BN parameters are not computed from statistics; they are learned as additional parameters. The nonlinearity is a standard rectified linear unit (ReLU), followed by a 2×2 max-pooling layer with stride 2 that reduces the spatial dimensions. After this operation, the indices of the maximum responses are stored so that the corresponding expansion block can later use them to perform a clean upsampling of the data.
- Each expansion block first upsamples its input with an unpooling layer. This layer uses the previously stored indices to produce an upsampled version of the input that preserves activations at edge locations and other high-frequency features. It is followed by a 7×7 convolution with a fixed number of 64 features. As before, the pre-activations are batch normalized before the ReLU. Stacking expansion and contraction blocks in this way makes the network completely symmetric in the number of features.
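The interplay between max-pooling and unpooling described above can be sketched in plain Python on a single 2D feature map. This is only an illustration of the index-storing idea, not the authors' implementation; the function names are hypothetical.

```python
def max_pool_2x2(x):
    """2x2 max-pooling with stride 2; returns pooled values and argmax positions."""
    h, w = len(x), len(x[0])
    pooled, indices = [], []
    for i in range(0, h - 1, 2):
        prow, irow = [], []
        for j in range(0, w - 1, 2):
            # Collect the four values of the window with their coordinates.
            window = [(x[i + di][j + dj], (i + di, j + dj))
                      for di in (0, 1) for dj in (0, 1)]
            value, pos = max(window, key=lambda item: item[0])
            prow.append(value)
            irow.append(pos)  # remembered for the matching expansion block
        pooled.append(prow)
        indices.append(irow)
    return pooled, indices


def max_unpool_2x2(pooled, indices, out_shape):
    """Put each pooled value back at its stored position; zeros elsewhere."""
    h, w = out_shape
    out = [[0] * w for _ in range(h)]
    for prow, irow in zip(pooled, indices):
        for value, (i, j) in zip(prow, irow):
            out[i][j] = value
    return out


x = [[1, 3, 2, 0],
     [4, 2, 1, 5],
     [0, 1, 9, 2],
     [7, 3, 4, 6]]
pooled, idx = max_pool_2x2(x)
print(pooled)                          # [[4, 5], [7, 9]]
print(max_unpool_2x2(pooled, idx, (4, 4)))
# [[0, 0, 0, 0], [4, 0, 0, 5], [0, 0, 9, 0], [7, 0, 0, 0]]
```

Because the maxima return to their original positions, the upsampled map is sparse but spatially exact, and the following 7×7 convolution fills in the zeros.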
Notes:
1. A two-channel network training approach is used.
2. The unpooling (upsampling) layers in the expansion network reuse the indices stored by the max-pooling layers.
3. The batch normalization parameters are optimized together with the rest of the network parameters.
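As a rough sanity check on the 1.4M parameter figure, the layer sizes described above can be tallied. The sketch below makes several assumptions that the notes do not state explicitly: the image pair is stacked into a 6-channel input, every convolution has a bias, each BN layer learns one scale and one shift per feature, and the final linear classifier is a 1×1 convolution over 2 classes.

```python
def conv_params(kernel, c_in, c_out, bias=True):
    """Number of weights (and optional biases) in a kernel x kernel convolution."""
    return kernel * kernel * c_in * c_out + (c_out if bias else 0)


def cdnet_param_count(in_channels=6, features=64, kernel=7, blocks=4, classes=2):
    """Tally parameters for 4 contraction + 4 expansion blocks plus a classifier.

    in_channels=6 and classes=2 are assumptions (stacked RGB pair,
    change / no-change); they are not confirmed by the notes.
    """
    total = 0
    c_in = in_channels
    for _ in range(2 * blocks):
        total += conv_params(kernel, c_in, features)  # 7x7 convolution
        total += 2 * features                         # BN scale and shift
        c_in = features
    total += conv_params(1, features, classes)        # per-pixel linear classifier
    return total


print(cdnet_param_count())  # 1425410
```

Under these assumptions the count comes to about 1.43 million, which is consistent with the 1.4M quoted for the network. Pooling and unpooling layers contribute no learned parameters.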
Dense 3D Reconstruction

Training:
Both the convolution and deconvolution blocks are initialized randomly. The network is trained with the Adam optimizer using its default parameters.
Convergence is fast: within 200 epochs, with 150 batches per epoch and a batch size of 10 image pairs.
Loss function: weighted cross-entropy, with the weights chosen according to the inverse frequency of each class in the training set.
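The loss can be illustrated with a small pure-Python sketch. The exact weight normalization is an assumption (here `total / (num_classes * count)`); the notes only say that the weights follow the inverse class frequency. All function names are hypothetical.

```python
import math


def inverse_frequency_weights(labels, num_classes=2):
    """Weight each class by the inverse of its frequency in the training labels."""
    counts = [0] * num_classes
    for y in labels:
        counts[y] += 1
    total = len(labels)
    # Assumed normalization: a perfectly balanced set gives weight 1.0 per class.
    return [total / (num_classes * c) if c else 0.0 for c in counts]


def softmax(logits):
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]


def weighted_cross_entropy(logits_per_pixel, labels, weights):
    """Weighted mean of -log p(true class) over all pixels."""
    num, den = 0.0, 0.0
    for logits, y in zip(logits_per_pixel, labels):
        p = softmax(logits)[y]
        num += -weights[y] * math.log(p)
        den += weights[y]
    return num / den


# Toy example: 3 "no change" pixels (class 0) and 1 "change" pixel (class 1).
labels = [0, 0, 0, 1]
weights = inverse_frequency_weights(labels)
logits = [[2.0, 0.0], [1.5, 0.5], [0.0, 1.0], [0.2, 0.1]]
print(weights)
print(weighted_cross_entropy(logits, labels, weights))
```

The rare "change" class receives the larger weight, so mistakes on changed pixels cost more than mistakes on the far more common unchanged pixels.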
Experiments

VL-CMU-CD dataset:

PCD dataset:

Visualization:
