1+1<2?! An Interpretation of the HESIC Paper
2022-06-26 23:56:00 【Shengsi MindSpore】
01
Research background
HESIC targets the joint compression of binocular (stereo) images. It exploits the content correlation between the two views: the main view is encoded and decoded first and then guides the coding of the other view, which reduces the repeated encoding of redundant information and achieves the "1+1<2" effect.
The research team is Xu Mai's group at the Beijing University of Aeronautics and Astronautics, whose research centers on computer vision and low-level directions such as image and video compression coding.
02
Brief introduction to the paper
Joint compression of binocular images requires two things: an optimized image compression network, and the extraction and use of the mutual information between the two views. Only by combining the two organically can the "1+1<2" effect be fully realized. The HESIC network is an end-to-end, deep-learning-based compression algorithm for binocular images that makes full and effective use of the mutual information between the two views to reduce the storage cost of each image pair. To suit the characteristics of binocular images, HESIC borrows the homography transformation from traditional image processing to improve coding efficiency and save storage bits, and it adopts an autoencoder-based network as its basic architecture. For entropy coding, the framework can work with two different entropy models, each with its own advantages and disadvantages: one based on a Gaussian mixture distribution and one based on autoregression. It achieves better results on the InStereo2K and KITTI datasets.
03
Code and paper links
Code links:
https://github.com/ywz978020607/HESIC
https://gitee.com/ywzsunny/HESIC-Mindspore-Migration
Paper link:
https://openaccess.thecvf.com/content/CVPR2021/papers/Deng_Deep_Homography_for_Efficient_Stereo_Image_Compression_CVPR_2021_paper.pdf
04
Key technical points of the algorithm framework
The main framework is shown above. Each view has its own encoding and decoding networks, which provide the basic codec functions. At the input and output ends, the left view serves as the main view and is encoded and decoded independently. The left view is then warped toward the right view by a homography transformation, so that information shared by the two views does not have to be coded twice. In addition, once the homography matrix has been decoded, the left and right images can be warped in both directions. Each warped image is merged with the other view along the channel dimension and passed through simple convolutions for cross quality enhancement, which further improves the results.
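To make the homography step concrete, here is a minimal sketch in plain Python of how a 3x3 homography matrix maps pixel coordinates and how one view can be warped toward the other by inverse mapping. It is an illustration only, not the code from the HESIC repository: the function names `apply_homography` and `warp_image`, the nearest-neighbour sampling, and the toy 2x3 image are all assumptions made for this example.

```python
def apply_homography(H, x, y):
    """Map pixel (x, y) through a 3x3 homography H given as nested lists."""
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity")
    # Divide by the homogeneous coordinate to return to pixel coordinates.
    return xh / w, yh / w


def warp_image(image, H_inv, fill=0):
    """Warp a 2D grayscale image (list of rows) by inverse mapping.

    H_inv maps each output pixel back to its source location.
    Nearest-neighbour sampling is used (an assumption for brevity);
    pixels that fall outside the source are set to `fill`.
    """
    height, width = len(image), len(image[0])
    out = [[fill] * width for _ in range(height)]
    for v in range(height):
        for u in range(width):
            sx, sy = apply_homography(H_inv, u, v)
            ix, iy = int(round(sx)), int(round(sy))
            if 0 <= ix < width and 0 <= iy < height:
                out[v][u] = image[iy][ix]
    return out


# A pure one-pixel horizontal shift is the simplest possible homography:
# output pixel (u, v) samples the source at (u + 1, v).
H_inv_shift = [[1.0, 0.0, 1.0],
               [0.0, 1.0, 0.0],
               [0.0, 0.0, 1.0]]
left = [[1, 2, 3],
        [4, 5, 6]]
warped = warp_image(left, H_inv_shift)
```

In a real pipeline the warp would normally use bilinear interpolation and operate on full-colour tensors; the shift matrix here only stands in for whatever homography is estimated between the two views.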
In the entropy model, HESIC uses a model based on a Gaussian mixture distribution, which improves prediction accuracy while remaining amenable to parallel computation. In addition, we also use an autoregressive joint binocular entropy coding structure that further improves performance; this variant is denoted HESIC+. Compared with HESIC, its disadvantage is that it does not parallelize well, and its advantage is that it makes better use of already encoded/decoded information and so improves coding efficiency.
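The idea of a Gaussian-mixture entropy model can be sketched as follows: the probability of each quantized (integer) symbol is the mixture's probability mass over the interval of width 1 centred on that symbol, and the estimated code length is the negative base-2 logarithm of that probability. This is a simplified illustration, not the paper's network: in a learned codec the mixture parameters would be predicted per element, whereas here a single shared set of hypothetical `weights`, `means`, and `scales` is passed in by hand.

```python
import math


def gaussian_cdf(x, mean, scale):
    """Cumulative distribution function of a Gaussian."""
    return 0.5 * (1.0 + math.erf((x - mean) / (scale * math.sqrt(2.0))))


def gmm_symbol_probability(y, weights, means, scales):
    """Probability mass of integer symbol y under a Gaussian mixture.

    Each component is integrated over [y - 0.5, y + 0.5], which is the
    range of values that rounding would quantize to y.
    """
    p = 0.0
    for w, m, s in zip(weights, means, scales):
        p += w * (gaussian_cdf(y + 0.5, m, s) - gaussian_cdf(y - 0.5, m, s))
    return p


def estimated_bits(symbols, weights, means, scales, eps=1e-9):
    """Estimated code length in bits for a sequence of integer symbols."""
    total = 0.0
    for y in symbols:
        p = max(gmm_symbol_probability(y, weights, means, scales), eps)
        total += -math.log2(p)
    return total


# Hypothetical two-component mixture, chosen only for illustration.
weights = [0.7, 0.3]
means = [0.0, 3.0]
scales = [1.0, 0.5]
bits_likely = estimated_bits([0, 0, 3], weights, means, scales)
bits_unlikely = estimated_bits([8, -7, 9], weights, means, scales)
```

Symbols close to a mixture component cost few bits and symbols far from every component cost many, which is why a more accurate entropy model translates directly into a lower bit rate.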
05
Experimental results
The paper reports experimental and comparative results on the InStereo2K and KITTI datasets, including comparisons of PSNR and SSIM at different compression ratios.
Figure: Averaged objective results of HESIC on InStereo2K and KITTI
Figure: BD-BR comparison
Subjective visual results
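For readers unfamiliar with the PSNR metric used in the comparisons above, the following sketch computes it from the mean squared error between an original and a decoded image. It uses the standard definition, PSNR = 10 * log10(MAX^2 / MSE); the 8-bit peak value of 255 and the tiny 2x2 images are assumptions for the example and are not data from the paper.

```python
import math


def mean_squared_error(original, decoded):
    """Mean squared error between two equally sized 2D images (lists of rows)."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(original, decoded):
        for a, b in zip(row_a, row_b):
            total += (a - b) ** 2
            count += 1
    return total / count


def psnr(original, decoded, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mean_squared_error(original, decoded)
    if err == 0.0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / err)


# Toy images, invented for illustration only.
original = [[10, 20], [30, 40]]
decoded = [[11, 19], [30, 42]]
quality_db = psnr(original, decoded)
```

Higher PSNR at the same bit rate means better reconstruction, so rate-distortion curves plot PSNR (or SSIM) against bits per pixel.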
06
MindSpore code implementation
https://gitee.com/ywzsunny/HESIC-Mindspore-Migration
The code is divided into four main parts: binocular image homography estimation (this part can be replaced by traditional feature matching with little effect on the results), feature transformation, quantization plus entropy-model prediction of bpp, and feature reconstruction. The main structure of the codec is still feature extraction and inverse transformation. Because the entropy model predicts the number of codeword bits directly during the network's forward pass, no actual serialization is needed, which speeds up training. The training loss combines the estimated bit rate with an image distortion term (for example, one based on PSNR). The two are weighted by lambda, which sets the compression ratio, so that models can be trained and tested at different compression rates.
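The lambda-weighted loss described above can be sketched as a rate term (estimated bits per pixel) plus lambda times a distortion term. This is a generic rate-distortion formulation rather than the repository's exact loss: using MSE as the distortion, applying lambda to the distortion rather than the rate, and the names `bits_per_pixel` and `rate_distortion_loss` are all assumptions for this example.

```python
def bits_per_pixel(total_bits, height, width):
    """Estimated bit rate normalized by the number of image pixels."""
    return total_bits / (height * width)


def rate_distortion_loss(total_bits, height, width, distortion, lam):
    """Rate-distortion objective: bpp + lam * distortion.

    total_bits is the entropy model's estimate for everything that would be
    transmitted; distortion is assumed here to be an MSE-style image loss.
    """
    return bits_per_pixel(total_bits, height, width) + lam * distortion


# Hypothetical numbers: 1000 estimated bits for a 20x25 image, MSE 0.01.
loss_low_lambda = rate_distortion_loss(1000.0, 20, 25, 0.01, lam=100.0)
loss_high_lambda = rate_distortion_loss(1000.0, 20, 25, 0.01, lam=1000.0)
```

A larger `lam` penalizes distortion more heavily, so training settles at a higher bit rate and higher quality; training one model per `lam` value yields the points of a rate-distortion curve.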
07
Summary and outlook
For binocular image compression, making better use of mutual information and integrating it deeply with the compression network can further improve compression efficiency. Looking ahead, the homography between binocular images and the relationship between consecutive video frames each have their own characteristics. Homography transformation can roughly register image content at low cost, and this could be integrated into other tasks.