Mastering quantization technology is the key to video compression
2022-06-25 21:50:00 【User 1324186】
Source: IBC2021. Speaker: J. Le Tanou @ MediaKind. Translator: Zhong Hongcheng. The lecture first reviews the basics of scalar quantization and rate-distortion theory, then discusses how to jointly optimize quantization at different levels to improve coding efficiency. It introduces two quantization techniques, spatio-temporal adaptive quantization (STAQ) and local QP refinement (LQR), which can bring about a 30% compression gain in HM and x265.
Contents
- Motivation and purpose
- Quantization basics
- Quantization in modern video codec standards
- Picture/Slice Level
- Block/CU Level
- Coefficient Level
- Summary of this chapter
- Two quantization techniques
- Spatio-Temporal Adaptive Quantization (STAQ)
- Local QP Refinement (LQR)
- Summary
Motivation and purpose
The challenge: delivering video over limited network bandwidth and storage.
- From on-set capture or content production to final delivery to the customer, video content goes through several stages of transformation.
- The entire delivery chain relies on video compression to reduce its cost in bandwidth and storage.
- Compression is the main lossy process in the chain, trading video quality against bit cost.
Video coding exploits the information redundancy of the signal to reduce the data rate. Lossless coding relies on differential predictive coding, transform and entropy coding. Lossy coding adds a quantization step to further improve compression efficiency.
Figure 1: Hybrid video coding framework
In the hybrid video coding framework, different modules have different degrees of freedom. Fixed processing steps:
- Entropy coding
- Inverse quantization, inverse transform
Restricted processing steps:
- Motion prediction
- In-loop filtering
Completely free processing steps:
- Transform
- Quantization
Quantization basics
The concept of quantization is simple: the goal is to map a set of values to a smaller set. Quantization is an irreversible process because it introduces data loss. This is what lossy compression means: one representative is chosen for several values.
Vector quantization (VQ) is not covered here. Common video coding schemes use only scalar quantization (SQ), on the assumption that the transform stage already outputs decorrelated samples.
In a video encoder, scalar quantization is implemented as an integer division with rounding, where the divisor is the quantization step (QStep). For example, dividing by 10 and rounding to the nearest integer quantizes \{0,1,2,3,4\} to 0 and \{5,6,...,13,14\} to 1. The larger the quantization step, the stronger the quantization and the greater the signal loss. Inverse quantization is defined by the standard and essentially multiplies the quantized value by the quantization step.
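The divide-by-10 example can be reproduced with a few lines of Python. This is only a minimal sketch: the function names `quantize` and `dequantize` are illustrative and do not come from any codec, and real encoders work on scaled integer arithmetic rather than a plain division.

```python
def quantize(value: int, qstep: int) -> int:
    """Scalar quantization: integer division with round-to-nearest."""
    sign = -1 if value < 0 else 1
    # Adding half a step before the integer division rounds to nearest.
    return sign * ((abs(value) + qstep // 2) // qstep)


def dequantize(level: int, qstep: int) -> int:
    """Inverse quantization: multiply the level by the quantization step."""
    return level * qstep


if __name__ == "__main__":
    for v in (0, 4, 5, 14, 57):
        level = quantize(v, 10)
        print(v, "->", level, "->", dequantize(level, 10))
```

The gap between the input and the dequantized value is the quantization error, which is the irreversible loss discussed above.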
Figure 2: Scalar quantization / inverse quantization
VVC brings a recent development in the coding of quantization syntax. By adding memory to the coefficient coding syntax, Dependent Quantization uses a single syntax element to encode two possible quantized values. Inverse quantization at the decoder depends on the path of previously decoded coefficients, tracked by a state machine. Even so, the underlying mechanism is still scalar quantization.
Quantization in modern video codec standards
The quantization process in modern video coding standards has several levels of control, comprising a quantization parameter (QP) and several optional refinement steps. In AVC/H.264 and HEVC/H.265 the QP range is [0, 51]; in VVC/H.266 it extends to [0, 63]. In these standards QP is used as an index to derive the quantization step: each time QP increases by 6, QStep doubles.
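The "QStep doubles every 6 QP" rule can be written as a one-line mapping. The sketch below uses the commonly cited approximation QStep = 2^((QP - 4) / 6) for AVC and HEVC; the standards actually derive the step from integer scaling tables, so treat the function `qstep_from_qp` as an illustrative assumption rather than normative behaviour.

```python
def qstep_from_qp(qp: int) -> float:
    """Approximate quantization step for a given QP (AVC/HEVC-style mapping)."""
    # QP = 4 corresponds to a step of 1.0; every +6 in QP doubles the step.
    return 2.0 ** ((qp - 4) / 6.0)


if __name__ == "__main__":
    for qp in (4, 10, 22, 28, 34, 51):
        print(f"QP {qp:2d} -> QStep {qstep_from_qp(qp):8.3f}")
```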
Figure 3: Quantization control hierarchy
QP can be adapted at the picture level, and also with finer granularity at the block (or coding unit, CU) level, as shown in Figure 3. Picture-level QP adaptation is typically used to meet a global rate constraint (the target bit rate) over a GOP, and to adjust QP within the GOP to optimize RD performance. For example, for a given GOP structure it may be appropriate to spend more bits on reference frames, because their errors propagate forward through prediction.
Picture/Slice Level
The picture/slice QP is used to compute the QP of the first block of the slice. Without further refinement, this QP value applies to the whole slice (a whole picture or part of one). Quantization matrices allow frequency-dependent quantization as a complement to QP, and are usually signaled at the sequence or picture level. For each transform block size, the quantizer step can be adjusted according to the position of the frequency coefficient. For example, to better match the perception of the human visual system, low-frequency coefficients can be quantized with finer quantization steps than high-frequency coefficients. In the MPEG video standards this is handled by scaling matrices, which are optionally transmitted in the sequence and picture parameter sets (SPS and PPS) and referenced (or not) at the picture level.
Block/CU Level
Adjusting QP at the coding unit (CU) level is a cornerstone of compression efficiency, and most video codecs and standards allow block-level QP adaptation. To reduce the amount of data transmitted per coded block, local QP values are usually coded with differential predictive coding. QP is predicted from neighbouring blocks and from the previous QP in decoding order, so only the delta QP, that is, the difference between the predicted value and the current QP, has to be signaled.
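The delta-QP mechanism can be sketched as follows. The predictor here is loosely modelled on the HEVC approach (rounded average of the left and above neighbours, falling back to the previous QP in decoding order when a neighbour is unavailable), but it is a simplification; the function names are hypothetical and the exact availability rules of each standard are not reproduced.

```python
from typing import Optional


def predict_qp(left: Optional[int], above: Optional[int], prev: int) -> int:
    """Predict the block QP from its neighbours (simplified, HEVC-like)."""
    a = left if left is not None else prev    # fall back to previous QP
    b = above if above is not None else prev
    return (a + b + 1) >> 1                   # rounded average


def encode_delta_qp(qp: int, left: Optional[int], above: Optional[int], prev: int) -> int:
    """Encoder side: only the difference to the prediction is signaled."""
    return qp - predict_qp(left, above, prev)


def decode_qp(delta: int, left: Optional[int], above: Optional[int], prev: int) -> int:
    """Decoder side: rebuild the QP from the same prediction plus the delta."""
    return predict_qp(left, above, prev) + delta


if __name__ == "__main__":
    delta = encode_delta_qp(33, left=30, above=32, prev=28)
    print("delta QP:", delta, "decoded QP:", decode_qp(delta, 30, 32, 28))
```

Because encoder and decoder compute the same prediction, a smooth QP field yields deltas near zero, which are cheap to signal.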
Again, only the QP prediction and signaling mechanisms are standardized. This leaves encoder designers free to develop adaptive QP (AQP) algorithms with various optimization criteria, scopes (block level, frame level, GOP level) and complexities. Typically, block QP values can be adapted to the local characteristics of the signal, or to the dependencies between blocks in the prediction scheme, to provide better visual quality. Using RDO to adapt the local QP to local features is one way to maximize compression efficiency. As mentioned earlier, the aim is to limit distortion for a given rate. Distortion is tied to QStep: as QStep increases, distortion increases. The relationship between rate and quantization step is slightly more complicated, because the rate depends on the delta QP cost and on the number and magnitude of the quantized coefficients. The higher the QStep, the fewer and smaller the quantized coefficients, but the delta QP cost may be higher. The ideal solution determines, for every block in the GOP, the combination of QPs that gives the best global RDO.
AQP algorithms usually aim to determine, a priori, the best QP for each block so as to deliver the best overall subjective or objective quality while meeting the rate constraint. Such algorithms can be designed to use only spatial information (intra-frame or intra-block statistics) to estimate the QP. Better algorithms usually also consider temporal information (inter-frame statistics), for example by trying to measure how long blocks persist in the GOP. Spatio-temporal AQP algorithms of this kind estimate all the temporal and spatial dependencies between blocks within a GOP and thereby achieve a better global R-D trade-off.
As a complement to a priori (estimation-based) AQP algorithms, local QPs can be refined a posteriori to adjust the RDO. Such a posteriori algorithms, for example local QP refinement (LQR) or "multi-QP optimization", evaluate a set of QP candidates for a given block and keep the one that minimizes the local R-D cost. If implemented carefully, LQR can significantly improve coding efficiency without harming the global RDO. The added coding efficiency is not based on estimates but on real distortion and rate measurements, in which all dependencies between blocks are accurately accounted for.
Coefficient Level
A final quantization adjustment for each transform coefficient is also possible. It can help improve objective scores by minimizing a given R-D criterion. Additional perceptual criteria (for example noise shaping, coefficient filtering or discarding) can also be used to reduce specific visual artifacts (for example banding and ringing). The main advantage of coefficient-level quantization optimization is that it introduces no extra syntax bit cost: only the quantized values are adjusted, while the quantization parameter stays unchanged.
For each coefficient, the rounding in the integer division sets the thresholds that map a set of values to a single value. Returning to the earlier example with a quantization step of 10, we can move the rounding threshold so that \{0,1,2,3,4,5,6,7,8\} is quantized to 0, \{9,10,...,17,18\} to 1, and so on. Adjusting the rounding provides a great deal of freedom in the quantization process. In this example it is merely a change of the dead zone, but smarter strategies can be designed.
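Moving the rounding threshold amounts to changing the offset added before the integer division. The sketch below expresses that offset directly in sample units; the name `quantize_with_offset` and its parameter are illustrative assumptions, not encoder API.

```python
def quantize_with_offset(value: int, qstep: int, rounding_offset: int) -> int:
    """Scalar quantization with an adjustable rounding offset (dead zone).

    rounding_offset = qstep // 2 gives round-to-nearest;
    a smaller offset widens the dead zone around zero.
    """
    sign = -1 if value < 0 else 1
    return sign * ((abs(value) + rounding_offset) // qstep)


if __name__ == "__main__":
    # Offset 5: round to nearest. Offset 1: thresholds move to 9, 19, ...
    for v in (4, 5, 8, 9, 18, 19):
        print(v, quantize_with_offset(v, 10, 5), quantize_with_offset(v, 10, 1))
```

With a step of 10 and an offset of 1, values 0 to 8 map to 0 and values 9 to 18 map to 1, which matches the shifted example in the text.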
Trellis quantization, for example RDO-Q, is one such smart coefficient-level strategy. In a typical configuration, two possible reconstruction values are tested for each coefficient (rounding down and rounding up), and the better one is kept according to a given R-D criterion. For example, given a coefficient of 57 and a quantization step of 10, the possible quantized values around 5.7 are 5 and 6, and the possible reconstruction values are 50 and 60. Every coefficient in the block has the same two options, which produces a trellis structure. This defines a minimum-path problem that is solved with the Viterbi algorithm to identify the best combination of roundings.
We will not develop this part further. An important remark, however, is that coefficient-level quantization optimization does not interfere with the optimization of the quantization step and parameter, which is the focus of this article.
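The two-candidate decision for a single coefficient can be sketched as below. This is a deliberate simplification: each coefficient is decided independently with a made-up rate model (`rate_bits`), whereas a real RDO-Q implementation runs a trellis search because the entropy coder makes the cost of one level depend on its neighbours. All names are hypothetical.

```python
def rate_bits(level: int) -> int:
    """Hypothetical rate model: cost grows with the bit length of the level."""
    return 1 if level == 0 else 2 * abs(level).bit_length()


def choose_level(coeff: int, qstep: int, lam: float) -> int:
    """Pick round-down or round-up, whichever minimizes D + lambda * R."""
    sign = -1 if coeff < 0 else 1
    magnitude = abs(coeff)
    lower = magnitude // qstep
    # Two candidates unless the coefficient sits exactly on the grid.
    candidates = [lower, lower + 1] if magnitude % qstep else [lower]

    def cost(level: int) -> float:
        distortion = (magnitude - level * qstep) ** 2
        return distortion + lam * rate_bits(level)

    return sign * min(candidates, key=cost)


if __name__ == "__main__":
    print(choose_level(57, 10, 0.0))   # candidates 5 and 6 (reconstructions 50 and 60)
    print(choose_level(6, 10, 30.0))   # a large lambda favours the cheaper level
```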
Summary of this chapter
The quantization process can be optimized at different levels of granularity, and most techniques can be combined. Once the distortion D is defined, the optimization problem is to minimize D under a constraint on the rate R. Unfortunately, in practical implementations, computational complexity and resource consumption are additional constraints to be weighed against compression efficiency.
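The constrained problem stated here is commonly relaxed into an unconstrained Lagrangian cost, which is the form most encoder RDO loops minimize. The notation below is an assumption for illustration: $R_{\text{target}}$, $\lambda$ and the per-block parameters $QP_b$ are not named in the talk.

```latex
\min_{\{QP_b\}} D \quad \text{subject to} \quad R \le R_{\text{target}}
\qquad\Longrightarrow\qquad
\min_{\{QP_b\}} J = D + \lambda R, \quad \lambda \ge 0
```

A larger $\lambda$ penalizes rate more heavily and pushes the solution towards higher QP values; a smaller $\lambda$ favours lower distortion.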
Two quantization techniques
- Spatio-Temporal Adaptive Quantization (STAQ)
- Optimizes a global R-D criterion, taking coding dependencies across the GOP into account
- Models temporal distortion propagation over the GOP
- Outputs the "best" delta-QP for each block, based on an a priori R(D) model
- Local QP Refinement (LQR)
- Optimizes the local R-D without considering coding dependencies
- Iterative RDO method that refines each block's delta-QP value a posteriori
- Helps compensate for shortcomings of the R(D) model or simplifications of STAQ
Spatio-Temporal Adaptive Quantization (STAQ)
STAQ is a global R-D optimization algorithm that minimizes D subject to R over the whole GOP and provides the best local quantizer for each block. In practice, the algorithm is a deep evolution of the MBTree algorithm in which all mechanisms have been revisited and the R-D criterion is modeled more accurately. Most notably, distortion modeling in STAQ makes it easy to introduce perceptual criteria, which significantly improves subjective quality compared with simpler MSE-based models.
STAQ is based on a single principle: distortion propagates over time. The quantization applied to each block produces distortion. Through motion-compensated prediction, part of the distortion generated in each reference block is propagated to the blocks that are coded next. As coding proceeds picture by picture, block distortion therefore accumulates over time. In general, temporal distortion propagation (from one picture to the next) is greatest in Skip mode, and propagation stops at intra coding (where there is no motion compensation). The essence of the algorithm is to identify the sample areas that are referenced most in prediction, code those areas as well as possible (low distortion, low QP), and then copy them as much as possible (at a bit rate close to zero).
Figure 4: Blocks that are reused as references and blocks that are not
As shown in Figure 4, the green area (or block) in the upper-left corner of the first picture persists in the following pictures of the sequence and is frequently used as a prediction reference. This region is given a lower quantization step when the first picture is encoded. Because the region is relatively static over time, successive motion compensation will tend towards skip mode (a copy of the sample area), and the encoder will need almost no bits to obtain minimal distortion. The same principle applies to any well-predicted moving region. This has a desirable side effect: copying does not cause quality fluctuations, so video quality stays stable over time. Conversely, when occlusion occurs in a future picture (red area in Figure 4), the next block is likely to be intra coded, which breaks the temporal dependency. There is therefore no need to spend many bits on areas with a low probability of being used as a prediction reference.
STAQ builds a weighted dependency network that connects all the blocks of the same GOP, taking into account motion estimation, coding mode probabilities, other information estimated by the look-ahead module, and the target bit rate of the GOP. Spatial (intra-frame) distortion also propagates, usually from the upper-left corner of the picture down to the lower-right corner (depending on the standard). STAQ integrates both spatial and temporal distortion propagation into its R-D optimization.
STAQ delivers impressive objective gains. In the paper "Optimal Adaptive Quantization based on Temporal Distortion Propagation model for HEVC" we presented a simplified STAQ model, called the rate-distortion spatio-temporal quantization (RDSTQ) algorithm. With RDSTQ implemented in the HEVC reference model (HM), we reported average bit rate savings of up to -26.9% and -15.6% at equal SSIM-based and PSNR-based quality respectively, compared with no adaptive quantization. In the HM context, the proposed algorithm clearly outperforms state-of-the-art related methods. Table 1 gives the coding efficiency results under the JCT-VC test conditions.
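STAQ itself is not public, but the temporal propagation idea it evolves from can be illustrated with an MBTree-style sketch. The code below is not the STAQ algorithm: it assumes zero motion (each block references the co-located block of the previous frame), takes per-block intra and inter cost estimates as given inputs, and uses an arbitrary `strength` parameter. All names are hypothetical.

```python
import math


def propagation_qp_offsets(intra_costs, inter_costs, strength=2.0):
    """Return per-frame, per-block QP offsets from backward cost propagation.

    intra_costs[t][b], inter_costs[t][b]: estimated costs of block b in frame t.
    Frame t is assumed to predict block b from block b of frame t - 1.
    """
    n_frames, n_blocks = len(intra_costs), len(intra_costs[0])
    propagated = [[0.0] * n_blocks for _ in range(n_frames)]

    # Walk backwards so each reference collects the importance of its successors.
    for t in range(n_frames - 1, 0, -1):
        for b in range(n_blocks):
            intra = intra_costs[t][b]
            inter = min(inter_costs[t][b], intra)
            # Fraction of the block's information inherited from its reference.
            inherited = (intra - inter) / intra if intra > 0 else 0.0
            propagated[t - 1][b] += (intra + propagated[t][b]) * inherited

    # Heavily referenced blocks receive a negative offset (finer quantization).
    return [
        [
            -strength * math.log2((intra_costs[t][b] + propagated[t][b]) / intra_costs[t][b])
            if intra_costs[t][b] > 0 else 0.0
            for b in range(n_blocks)
        ]
        for t in range(n_frames)
    ]


if __name__ == "__main__":
    intra = [[100.0, 100.0] for _ in range(3)]
    # Block 0 is perfectly predicted over time; block 1 is never well predicted.
    inter = [[0.0, 0.0], [0.0, 100.0], [0.0, 100.0]]
    for frame_offsets in propagation_qp_offsets(intra, inter):
        print([round(o, 2) for o in frame_offsets])
```

In this toy input the persistent block gets a progressively lower QP the earlier it appears in the GOP, while the block that is never reused keeps its base QP, which mirrors the green and red areas of Figure 4.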
Table 1: RD performance of RDSTQ
Beyond the comparison of objective metric scores, several subjective quality assessment sessions were held with non-expert MediaKind employees, using the paired comparison method. Analysis of the results shows that STAQ consistently improves both spatial and temporal quality. A very important and inherent benefit of STAQ is improved stability of video quality over time, a property that neither SSIM nor PSNR can measure.
Finally, as mentioned earlier, STAQ relies on a pre-analysis module, called the look-ahead, to estimate various signal statistics. Look-ahead modules are a sub-process available in most efficient commercial encoders. Compared with no adaptive quantization, adding STAQ modeling to MediaKind's optimized software encoder increases the overall encoding run time by no more than 3% (with optimization and multithreading). Given this small run-time increase, the significant video quality gain makes STAQ one of the most powerful adaptive quantization algorithms.
Local QP Refinement (LQR)
LQR stands for local QP refinement. Roughly speaking, each block or CU is fully encoded with several sets of local quantization parameters, which compete exhaustively on the trade-off between measured distortion (from the reconstruction loop) and rate (from entropy coding estimation). This brute-force concept is not new, but it takes considerable know-how to implement it efficiently in real-time software encoding and to combine it with global R-D optimization. The motivation for LQR is that refining a set of local quantizer candidates a posteriori helps capture two favorable situations: a local "distortion drop" (at almost the same rate) or a local "rate drop" (at almost the same distortion).
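The brute-force idea can be sketched as a search over a few QP candidates around the a priori QP of a block. This is only an illustration under stated assumptions: the block is represented by a list of transform coefficients, the rate model in `estimate_bits` is invented (a real encoder would use its CABAC estimator), the QP-to-QStep mapping is the usual approximation, and none of the names come from MediaKind's implementation.

```python
import math


def qstep_from_qp(qp: int) -> float:
    """Approximate QStep: doubles every 6 QP (assumed AVC/HEVC-style mapping)."""
    return 2.0 ** ((qp - 4) / 6.0)


def quantize(coeff: float, step: float) -> int:
    """Round-to-nearest scalar quantization of one coefficient."""
    level = math.floor(abs(coeff) / step + 0.5)
    return level if coeff >= 0 else -level


def estimate_bits(levels, delta_qp: int) -> int:
    """Hypothetical rate model: coefficient bits plus delta-QP signaling bits."""
    coeff_bits = sum(2 * abs(level).bit_length() for level in levels)
    return coeff_bits + 1 + 2 * abs(delta_qp)


def refine_qp(coeffs, base_qp: int, lam: float, search_range: int = 2,
              qp_min: int = 0, qp_max: int = 51) -> int:
    """Try every QP near base_qp and keep the one with the lowest D + lambda * R."""
    best_qp, best_cost = base_qp, math.inf
    low = max(qp_min, base_qp - search_range)
    high = min(qp_max, base_qp + search_range)
    for qp in range(low, high + 1):
        step = qstep_from_qp(qp)
        levels = [quantize(c, step) for c in coeffs]
        # Distortion is measured on the reconstruction, not estimated.
        distortion = sum((c - level * step) ** 2 for c, level in zip(coeffs, levels))
        cost = distortion + lam * estimate_bits(levels, qp - base_qp)
        if cost < best_cost:
            best_qp, best_cost = qp, cost
    return best_qp


if __name__ == "__main__":
    block = [8.0, 16.0, -24.0, 3.0]
    print("refined QP:", refine_qp(block, base_qp=22, lam=4.0))
```

Including the delta-QP bits in the cost is what lets the search weigh a cheaper coefficient payload against more expensive QP signaling.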
The a priori RD modeling in STAQ has some known limitations:
- Shannon entropy is used to estimate the bit rate, but CABAC does not necessarily match Shannon entropy
- The cost of side syntax signaling, such as the delta-QP syntax, is not considered
- The theoretical analysis relies on some assumptions, such as the high bit rate assumption
- The model contains some simplifications, such as the inter probability.
In this paper we show that a posteriori local QP optimization, exploiting local distortion or rate drops, can bring additional compression efficiency without compromising any a priori global R-D optimization. In fact, the LQR add-on complements, in terms of coding efficiency, adaptive QP algorithms driven by global RDO, such as STAQ. One explanation is that it helps compensate for the estimation errors of a priori models such as STAQ by using real, a posteriori measurements of distortion and rate, for example to better optimize the delta-QP syntax cost. In addition, as with STAQ, the R-D optimization performed by LQR can be driven by various distortion criteria, such as MSE or any HVS-based metric.
Figure 5: Distortion as a function of the reconstruction grid
Distortion appears after inverse quantization. A slight change in the quantization step also slightly rescales the set of possible reconstruction values: a possible reconstruction value slides either towards the value being quantized or away from it. Figure 5 illustrates this displacement (offset) of the possible reconstruction values as a function of the HEVC quantization parameter QP. Equal-reconstruction curves are drawn for a given quantized value and appear in logarithmic form. For a given quantized value, the QP-to-QStep scaling defines the reconstruction values on the grid. As shown in Figure 5, for a given coefficient and certain QP values, a reconstruction value on the grid falls closer to the coefficient; we call this favorable situation a "distortion drop". Although the probability of this case decreases with the number of non-zero transform coefficients in the block, a "distortion drop" effect remains possible for a large proportion of the non-zero transform coefficients across blocks.
Conversely, for some selected QStep values, the rate of a block can be advantageously reduced at almost no increase in distortion. By design, when QP increases, the magnitude of the quantized values decreases and the rate decreases as expected; in most cases, quantization error and distortion increase accordingly. Interestingly, it can be observed that for some distributions of transform coefficients the rate reduction (or "drop") is significant relative to the increase in local distortion, which improves the local R-D trade-off.
We also observed that, for some distributions of transform coefficients and in the CABAC context, a small decrease in QP may cause almost no rate increase. Two facts can explain this. First, when QP changes, the CABAC contexts may fit better, making the entropy coding of the quantized coefficients more efficient. Second, a slight increase in the bit rate of the quantized coefficients can be offset by a reduction in the bit rate of the delta-QP syntax. In both cases the rate is lower than expected, which is what we call a "rate drop".
Overall, locally refining a set of QP candidates can benefit both rate and distortion without harming any global RDO, and this is what LQR does.
Finally, thanks to additional heuristics and optimizations (for example distortion estimation in the transform domain), the LQR implementation keeps a reasonable computational overhead: in MediaKind's optimized software HEVC encoder, the overall encoding run-time increase is less than 10%.
Table 2 summarizes the compression efficiency of the LQR algorithm combined with STAQ (RDSTQ), implemented in x265 and using the JCT-VC common test conditions. As a complement to the STAQ algorithm, this coding efficiency add-on saves about 6% of bit rate at the same PSNR-based or SSIM-based quality.
Table 2: RD performance of adding LQR to RDSTQ (x265)
Summary
By giving an overview of the hybrid video coding scheme common to most modern video compression standards, we emphasized the key role of quantization in optimizing the trade-off between video quality and bit rate, as (almost) the only tunable lossy processing step in any coding system. We then detailed the various levels of granularity and control points available for quantization optimization, in particular QP adaptation at the block or CU level.
As practical examples, we introduced and shared some insights on two adaptive quantization algorithms, STAQ and LQR. A careful implementation of these two complementary algorithms can improve an HEVC-based real-time software encoder, with bit rate savings of more than 25% at equal SSIM-based quality. These algorithms would benefit any standard that supports local QP adaptation (for example MPEG-2, H.264/AVC, H.266/VVC).
In general, the software reference encoder models used for video standard development do not implement look-ahead modules or advanced encoder-side quantization techniques, and therefore underestimate to some extent the compression efficiency a given standard can deliver. As a result, most commercial encoder vendors compete on these missing performance optimizations.
Finally, the video of the talk is attached: http://mpvideo.qpic.cn/0b2ejyaacaaadiaekihss5qvatwdafhaaaia.f10002.mp4?dis_k=f528acceca5c03ffdff4fb0391e1d552&dis_t=1645152642&vid=wxv_2242692425712615431&format_id=10002&support_redirect=0&mmversion=false