Research on deep image compression in the YUV420 color space
2022-06-25 21:52:00 【User 1324186】
Source: SPIE Optical Engineering + Applications, 2021. Speaker: Changyue Ma. Compiled by: Feng Donghui.
In this paper, the authors propose two ways of adapting a deep image compression framework designed for RGB images so that it can compress YUV420 images. Based on the more lightweight of the two frameworks, they further study how adjusting the training distortion weights of the Y, U and V channels affects coding performance.
Contents
- Introduction
- Proposed method
- Training and testing details
- Experimental results
- Conclusion
Introduction
At present, most deep image compression methods are designed to compress images in the RGB color space, whereas traditional video coding standards are mainly designed to compress images in the YUV420 color space. In this study, the authors first examine how to adapt a deep compression framework designed for RGB images so that it compresses YUV420 images. They then study how adjusting the training distortion weights of the YUV channels affects coding performance, and compare the experimental results with the All Intra (AI) configurations of HEVC and VVC. The proposed methods apply both to image compression and to intra coding in video compression.
Image compression plays a key role in image storage and transmission systems. Over the last few decades, many companies and institutions around the world have worked on image compression and released several well-known image coding standards, such as the widely used JPEG [1] and JPEG2000 [2], as well as the Main Still Picture profiles of video coding standards such as H.265/HEVC [3] and the recently finalized H.266/VVC [4], to support efficient image compression. All of these standards use a hybrid coding framework consisting of intra prediction, transform, quantization and entropy coding, which achieves efficient compression by exploiting the various redundancies in an image. However, because the modules of the hybrid coding framework are usually designed separately, it is becoming increasingly difficult to improve coding performance within this basic framework.
Recently, deep image compression has developed rapidly and achieved encouraging results. Unlike traditional image compression methods, deep image compression can optimize all modules of its framework jointly, in an end-to-end manner. Among current deep image compression methods, those that combine transform coding with a context-adaptive entropy model are the most representative and achieve the best coding performance. However, most deep compression frameworks are designed only for images in the RGB color space and pay no attention to compressing images in the YUV color space.
Because many images are stored in the YUV color space, and the Main Still Picture profiles of video coding standards such as H.265/HEVC and H.266/VVC support compressing YUV images, there have been some attempts to apply deep compression frameworks to images in the YUV color space. Proposal JVET-T0122 applies the same deep compression framework to images in the RGB and YUV444 color spaces and studies how the coding performance changes relative to the VVC AI configuration. Proposal JVET-T0123 studies how to use a deep compression framework designed for RGB images to compress images in the YUV420 color space; it proposes three different deep image compression frameworks and compares their coding performance with the HEVC and VVC AI configurations.
In this paper, the authors study how to adapt a deep compression framework designed for RGB images so that it compresses images in the YUV420 color space. Starting from the cheng2020-attn model in the deep image compression platform CompressAI, they propose two deep image compression frameworks for coding YUV420 images. In addition, they study how coding performance relative to the VVC and HEVC AI configurations changes when the training distortion weights of the Y, U and V channels are adjusted.
Proposed method
Based on the mbt2018 model in the deep image compression platform CompressAI, proposal JVET-T0123 puts forward three different frameworks for compressing YUV420 content. In the first method, the luma and chroma channels pass through separate convolution and GDN layers and are merged before the second convolution layer. In the second method, the luma and chroma channels are coded separately, each by an independent mbt2018 neural network codec. In the third method, the luma channel is subsampled by a factor of 2 in each dimension to obtain 4 luma channels, which are stacked with the 2 chroma channels (a 6-channel input) and processed by the mbt2018 codec.
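The 6-channel packing used by the third JVET-T0123 method can be sketched as a 2×2 space-to-depth rearrangement of the luma plane. The snippet below is only an illustration in plain Python with planes stored as nested lists; the function names and the ordering of the four luma phases are assumptions, not details taken from the proposal.

```python
def space_to_depth_2x(plane):
    """Split an H x W plane (H and W even) into four H/2 x W/2 planes,
    one per position inside each 2x2 block. The phase order (top-left,
    top-right, bottom-left, bottom-right) is an assumption."""
    h, w = len(plane), len(plane[0])
    if h % 2 or w % 2:
        raise ValueError("plane dimensions must be even")
    return [
        [row[dx::2] for row in plane[dy::2]]
        for dy in (0, 1)
        for dx in (0, 1)
    ]


def pack_yuv420_to_6ch(y, u, v):
    """Stack 4 subsampled luma planes with the U and V planes (hypothetical helper)."""
    luma = space_to_depth_2x(y)
    ch, cw = len(luma[0]), len(luma[0][0])
    for chroma in (u, v):
        if len(chroma) != ch or len(chroma[0]) != cw:
            raise ValueError("chroma planes must be half the luma size in each dimension")
    return luma + [u, v]
```

Each of the four luma planes contains only every second pixel in each direction, which is consistent with the explanation that this packing weakens the correlation between neighbouring luma samples within a channel.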
Their experimental results show that, of the three methods, the first achieves the best coding performance. A likely reason is that the second method cannot exploit the correlation between the Y and UV channels, because Y and UV are optimized separately, while in the third method the subsampling reduces the correlation between adjacent pixels within the luma channels.
In this paper, the authors optimize the Y and UV channels jointly within a single deep image compression framework and keep the resolution of the Y and UV channels unchanged. Figure 1 shows the two proposed deep compression frameworks, both built on the cheng2020-attn model from CompressAI, for compressing images in the YUV420 color space. In the first framework, the luma and chroma channels pass through separate convolution and activation layers and are merged before downsampling. In the second framework, the chroma channels are first upsampled by a simple convolution layer and then merged with the luma channel.
Figure 1: The two proposed deep image compression frameworks for YUV420.
The deep image compression frameworks are trained to minimize a weighted sum of distortion and bit rate. For the distortion term, the authors try different distortion weights for the Y, U and V channels, such as 1:1:1, 2:1:1, 4:1:1, 6:1:1 and 8:1:1, so that the distortion is a weighted combination of the per-channel distortions of Y, U and V.
Training and testing details
The DIV2K and UCID datasets are used as the training set, with images randomly cropped to 256×256 patches. The network is trained with the Adam optimizer and a batch size of 16. The initial learning rate is 1e-4 for about 7e5 iterations; it is then reduced to 5e-5 for about 3e5 further iterations. MSE is used as the distortion measure during training. Four models are trained, with λ set to 0.005, 0.01, 0.025 and 0.1; the corresponding numbers of latent channels are 128, 128, 192 and 192.
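The schedule and model settings above can be written down as a small configuration sketch. The constant and function names are hypothetical, the iteration counts are the approximate figures quoted in the text, and the hard switch between the two learning rates is an assumption about how the reduction is applied.

```python
# Settings quoted in the text; names are illustrative only.
BATCH_SIZE = 16
PATCH_SIZE = 256
STAGE1_STEPS = 700_000   # "about 7e5" iterations at the initial learning rate
STAGE2_STEPS = 300_000   # "about 3e5" iterations at the reduced learning rate

# lambda value -> number of latent channels for each of the four models
LATENT_CHANNELS = {0.005: 128, 0.01: 128, 0.025: 192, 0.1: 192}


def learning_rate(step):
    """Two-stage step schedule: 1e-4 first, then 5e-5 (assumed hard switch)."""
    if step < 0:
        raise ValueError("step must be non-negative")
    return 1e-4 if step < STAGE1_STEPS else 5e-5
```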
The Kodak dataset, which contains 24 uncompressed 768×512 images, is converted to YUV420 format and used as the test set. To evaluate rate-distortion performance, the bit rate is measured in bits per pixel (bpp) and the distortion in PSNR. Rate-distortion (RD) curves are used to compare the coding performance of the different methods, and BD-rate reduction is used to quantify the difference in coding performance.
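The two basic metrics can be sketched as follows. This assumes 8-bit samples (peak value 255) and that bpp is normalized by the luma resolution of the image; neither convention is spelled out in the text, and the function names are illustrative.

```python
import math


def bits_per_pixel(num_bytes, width, height):
    """Bit rate in bpp: total coded bits divided by the number of luma pixels."""
    return 8.0 * num_bytes / (width * height)


def psnr(orig, recon, max_value=255.0):
    """PSNR in dB between two equally sized planes given as nested lists."""
    flat_o = [x for row in orig for x in row]
    flat_r = [x for row in recon for x in row]
    if len(flat_o) != len(flat_r) or not flat_o:
        raise ValueError("planes must be non-empty and have the same size")
    err = sum((a - b) ** 2 for a, b in zip(flat_o, flat_r)) / len(flat_o)
    if err == 0:
        return math.inf  # identical planes
    return 10.0 * math.log10(max_value ** 2 / err)
```

For a 768×512 Kodak image, a bitstream of 49152 bytes corresponds to exactly 1 bpp under this convention. PSNR is computed per plane, so Y, U and V each get their own RD curve.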
Experimental results
First, the authors compare the coding performance of the two proposed deep compression frameworks on YUV420 images. Both frameworks are trained with YUV distortion weights of 8:1:1. As Figure 2 shows, the coding performance of the two frameworks is very similar on all channels. Relative to the second framework, the first framework achieves BD-rate gains of 0.7%, 1.24% and -0.36% on the Y, U and V channels. However, this minor improvement of the first framework comes at the cost of 17% more network parameters and 28% more test time. The second framework is therefore chosen as the baseline for studying different distortion weights of the YUV channels.
Figure 2: RD curves of the two proposed frameworks on the Kodak dataset.
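For readers unfamiliar with BD-rate, the following sketch shows the idea: it averages the difference in log bit rate between two RD curves over their common PSNR range. Note that it swaps in piecewise-linear interpolation for the polynomial fit used in the usual Bjøntegaard calculation, so its numbers will not match the reported values exactly; the function names are illustrative.

```python
import math


def _interp(x, xs, ys):
    """Piecewise-linear interpolation; xs must be strictly increasing."""
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            t = (x - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + t * (ys[i + 1] - ys[i])
    raise ValueError("x lies outside the interpolation range")


def bd_rate_linear(anchor, test, samples=1000):
    """Average bit-rate change (%) of `test` relative to `anchor` at equal PSNR.
    Each curve is a list of (bpp, psnr) points with distinct PSNR values.
    Negative values mean `test` needs fewer bits for the same quality."""
    a = sorted(anchor, key=lambda p: p[1])
    t = sorted(test, key=lambda p: p[1])
    a_q, a_r = [p[1] for p in a], [math.log(p[0]) for p in a]
    t_q, t_r = [p[1] for p in t], [math.log(p[0]) for p in t]
    lo, hi = max(a_q[0], t_q[0]), min(a_q[-1], t_q[-1])
    if hi <= lo:
        raise ValueError("the curves do not overlap in PSNR")
    step = (hi - lo) / samples
    total = 0.0
    for i in range(samples):  # midpoint rule over the common PSNR range
        q = lo + (i + 0.5) * step
        total += _interp(q, t_q, t_r) - _interp(q, a_q, a_r)
    return (math.exp(total / samples) - 1.0) * 100.0
```

A curve that reaches every PSNR level with 10% fewer bits than the anchor yields a value of about -10, matching the convention that negative BD-rate denotes a coding gain.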
Figure 3 shows the RD curves of the second proposed framework on the Y, U and V channels for different distortion weights, compared with the VVC test software VTM-11.0 and the HEVC test software HM-16.22. As the distortion weight of the Y channel is gradually increased, the coding performance of the proposed framework improves on the Y channel and degrades on the U and V channels, which is consistent with intuition. Tables 1 and 2 list the BD-rate gains of the proposed framework over VTM-11.0 and HM-16.22 on the YUV channels, where negative numbers denote a coding gain. The tables show that, in the YUV420 color space, the deep image compression framework still falls short of VTM-11.0 in coding performance, but outperforms HM-16.22 on all of the Y, U and V channels.
Figure 3: RD curves on the Kodak dataset for different distortion weights of the YUV channels.
Table 1: Overall performance of Framework2 relative to VTM-11.0 under different YUV distortion weights.
Table 2: Overall performance of Framework2 relative to HM-16.22 under different YUV distortion weights.
In addition, different YUV distortion weights can be used at different bit-rate points. Figure 3 shows that Framework2-611 has a large coding performance gap to VTM-11.0 at its two lowest bit-rate points on the U and V channels. One can therefore combine the lowest bit-rate point of Framework2-211, the second-lowest bit-rate point of Framework2-411 and the two highest bit-rate points of Framework2-611, and compare the resulting curve with VTM-11.0 and HM-16.22. The corresponding RD curves and BD-rate gains are shown in Figure 4 and Table 3.
Figure 4: Envelope curves on the Kodak dataset.
Table 3: Overall performance of the envelope curve relative to VTM-11.0 and HM-16.22.
Conclusion
In this paper, the authors propose two ways of adapting a deep image compression framework designed for RGB images so that it compresses YUV420 images. The proposed methods apply both to image compression and to intra coding in video compression. Based on the more lightweight framework, they further study how adjusting the training distortion weights of the YUV channels affects coding performance. The experimental results show that, in the YUV420 color space, a state-of-the-art deep image compression framework achieves better coding performance than the H.265/HEVC test model but still falls short of the H.266/VVC test model. Deep image compression needs more advanced techniques to surpass VVC, the latest video coding standard, in the YUV420 color space.
Finally, a video of the talk is attached:
http://mpvideo.qpic.cn/0bc3qeab6aaaieahftqdw5rfbaodd6aqahya.f10002.mp4?dis_k=9831d5787faa089145ae4db57f15fb7e&dis_t=1645153536&vid=wxv_2261562038395289603&format_id=10002&support_redirect=0&mmversion=false