ByteDance Interviewer: talk about the principle of audio and video synchronization. Can audio and video be absolutely synchronized?
2022-06-24 09:44:00 【Cattle within yards】

Interviewer's perspective: audio and video synchronization is hard in its own right, and most projects leave it to a third-party player such as ijkplayer. Live streaming and video calls both depend on it, though, so it is a fair interview topic. There are three ways to implement it: use the audio clock as the reference, use the video clock as the reference, or use a custom (external) clock as the reference.
Candidate's perspective: if you are asked this question, stay calm and answer what you know. After reading this article, you should be able to answer it.
A live audio and video streaming system is a complex piece of engineering. Achieving very low latency takes system-wide optimization and a thorough command of every component. Below we look at one common building block, audio and video synchronization:
The audio and video synchronization process in ffplay
ffplay's main scheme synchronizes video to audio: if the video is running ahead, it repeats the previous frame to wait for the audio; if the video is falling behind, it drops frames to catch up with the audio.
This logic is implemented in the video output function video_refresh. Before analyzing the code, let's first review the flow chart of this function:

In this flow, the step "calculate the display duration of the previous frame" is crucial. Let's look at the code first:
static void video_refresh(void *opaque, double *remaining_time)
{
    //……
    // lastvp: the previous frame, vp: the current frame, nextvp: the next frame
    last_duration = vp_duration(is, lastvp, vp); // nominal duration of the previous frame
    delay = compute_target_delay(last_duration, is); // actual duration of the previous frame, adjusted against the audio clock
    time= av_gettime_relative()/1000000.0; // current system time, in seconds
    if (time < is->frame_timer + delay) { // the previous frame has not been displayed long enough yet: keep showing it
        *remaining_time = FFMIN(is->frame_timer + delay - time, *remaining_time);
        goto display;
    }
    is->frame_timer += delay; // frame_timer now marks the end of the previous frame, which is also the start of the current frame
    if (delay > 0 && time - is->frame_timer > AV_SYNC_THRESHOLD_MAX)
        is->frame_timer = time; // drifted too far from the system time: reset to the system time
    // Update the video clock
    // (it does not drive playback when video is synced to audio)
    SDL_LockMutex(is->pictq.mutex);
    if (!isnan(vp->pts))
        update_video_pts(is, vp->pts, vp->pos, vp->serial);
    SDL_UnlockMutex(is->pictq.mutex);
    //……
    // Frame-dropping logic
    if (frame_queue_nb_remaining(&is->pictq) > 1) {
        Frame *nextvp = frame_queue_peek_next(&is->pictq);
        duration = vp_duration(is, vp, nextvp); // display duration of the current frame
        if(time > is->frame_timer + duration){ // the system time is already past the end of the current frame: drop the current frame
            is->frame_drops_late++;
            frame_queue_next(&is->pictq);
            goto retry; // go back to the start of the function and retry (a plain while loop that keeps dropping frames will not do, because the audio clock may have advanced in the meantime, so delay has to be recalculated)
        }
    }
}
The flowchart above covers the logic of this code. The main idea is the one stated at the beginning: if the video is running ahead, repeat the previous frame to wait for the audio; if it is falling behind, drop frames to catch up with the audio. To do this, the code consults the audio clock to work out how much longer the previous frame (the picture currently on screen) should stay displayed, including that frame's own duration, and then compares the result with the system time to decide whether it is time to display the next frame.
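The excerpt calls vp_duration without showing it. As a rough sketch of what such a helper might do, the version below assumes the nominal duration is the PTS gap between two consecutive frames, with a fallback when that gap looks implausible. SketchFrame, sketch_frame_duration and the max_frame_duration parameter are hypothetical stand-ins for illustration, not ffplay's actual types or signatures.

```c
#include <math.h>

/* Hypothetical, trimmed-down frame record: only the fields this sketch needs. */
typedef struct {
    double pts;      /* presentation timestamp, in seconds */
    double duration; /* duration estimated from the frame rate, in seconds */
    int serial;      /* playback sequence id, assumed to change after a seek */
} SketchFrame;

/* Nominal time `vp` should stay on screen before `nextvp` replaces it. */
double sketch_frame_duration(const SketchFrame *vp, const SketchFrame *nextvp,
                             double max_frame_duration)
{
    if (vp->serial != nextvp->serial)
        return 0.0; /* frames from different sequences: do not wait */

    double gap = nextvp->pts - vp->pts;
    if (isnan(gap) || gap <= 0.0 || gap > max_frame_duration)
        return vp->duration; /* implausible gap: fall back to the frame's own duration */
    return gap;
}
```

With two frames of the same sequence at 1.00 s and 1.04 s, this returns roughly 0.04 s. If the timestamps go backwards, it falls back to the stored per-frame duration.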
The comparison with the system time introduces another concept: frame_timer. Think of it as the time at which a frame starts being displayed. Before the update it is the display time of the previous frame; after the update (is->frame_timer += delay) it is the display time of the current frame.
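The frame_timer update can be isolated into a small pure function to make the bookkeeping easier to follow. This is a sketch, not ffplay's code: sketch_advance_frame_timer and SKETCH_RESYNC_THRESHOLD are hypothetical names, and the 0.1 s threshold is an assumed value standing in for AV_SYNC_THRESHOLD_MAX.

```c
/* Assumed stand-in for AV_SYNC_THRESHOLD_MAX, in seconds. */
#define SKETCH_RESYNC_THRESHOLD 0.1

/* Returns the new frame_timer once the previous frame's `delay` has elapsed.
 * `now` is the current system time in seconds. */
double sketch_advance_frame_timer(double frame_timer, double delay, double now)
{
    frame_timer += delay; /* end of the previous frame = start of the current one */
    if (delay > 0.0 && now - frame_timer > SKETCH_RESYNC_THRESHOLD)
        frame_timer = now; /* fell too far behind the system clock: resync */
    return frame_timer;
}
```

For example, with frame_timer = 10.0 and delay = 0.04, a system time of 10.05 yields about 10.04, while a system time of 10.5 is more than 0.1 s past that point and resets the timer to 10.5.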
The previous frame's display time plus delay (how long it should stay on screen, including the frame's own duration) gives the time at which the previous frame should stop being displayed. The following schematic diagram illustrates the principle:

The diagram shows three cases:
- time1: the system time is earlier than the time at which lastvp should stop being displayed (frame_timer + delay, the dashed circle). lastvp should stay on screen.
- time2: the system time is later than the end of lastvp's display time but earlier than the end of vp's display time (vp's display interval starts at the dashed circle and ends at the black circle). lastvp is not repeated and vp is not dropped; vp should be displayed.
- time3: the system time is later than the end of vp's display time (the black circle, which is also the expected start time of nextvp). vp should be dropped.
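The three cases can be written as a small decision function. This is a simplified sketch under stated assumptions: it ignores the queue-length check and the frame_timer resync that the real function performs, and SketchAction and sketch_classify are hypothetical names introduced only for illustration.

```c
typedef enum {
    SKETCH_REPEAT_LAST,  /* time1: keep showing lastvp */
    SKETCH_SHOW_CURRENT, /* time2: show vp */
    SKETCH_DROP_CURRENT  /* time3: drop vp */
} SketchAction;

/* frame_timer: when lastvp started being displayed (seconds)
 * delay:       how long lastvp should stay on screen
 * duration:    how long vp should stay on screen
 * now:         current system time */
SketchAction sketch_classify(double frame_timer, double delay,
                             double duration, double now)
{
    if (now < frame_timer + delay)
        return SKETCH_REPEAT_LAST;

    frame_timer += delay; /* now marks the start of vp */
    if (now > frame_timer + duration)
        return SKETCH_DROP_CURRENT;

    return SKETCH_SHOW_CURRENT;
}
```

With frame_timer = 10.0 and both delay and duration set to 0.04, a system time of 10.02 falls in the time1 case, 10.06 in the time2 case, and 10.10 in the time3 case.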
Calculating delay
Next, let's look at the most critical part: how delay, the display duration of lastvp, is calculated.
This is implemented in the function compute_target_delay:
static double compute_target_delay(double delay, VideoState *is)
{
    double sync_threshold, diff = 0;
    /* update delay to follow master synchronisation source */
    if (get_master_sync_type(is) != AV_SYNC_VIDEO_MASTER) {
        /* if video is slave, we try to correct big delays by
           duplicating or deleting a frame */
        diff = get_clock(&is->vidclk) - get_master_clock(is);
        /* skip or repeat frame. We take into account the
           delay to compute the threshold. I still don't know
           if it is the best guess */
        sync_threshold = FFMAX(AV_SYNC_THRESHOLD_MIN, FFMIN(AV_SYNC_THRESHOLD_MAX, delay));
        if (!isnan(diff) && fabs(diff) < is->max_frame_duration) {
            if (diff <= -sync_threshold)
                delay = FFMAX(0, delay + diff);
            else if (diff >= sync_threshold && delay > AV_SYNC_FRAMEDUP_THRESHOLD)
                delay = delay + diff;
            else if (diff >= sync_threshold)
                delay = 2 * delay;
        }
    }
    av_log(NULL, AV_LOG_TRACE, "video: delay=%0.3f A-V=%f\n",
            delay, -diff);
    return delay;
}
All the comments in the code above come from the original source. The code is short and comments make up nearly half of it, which shows how important this function is.
The hardest part of this code to understand is sync_threshold. A picture helps:

The axis in the figure is the value of diff. diff = 0 means the video clock and the audio clock are identical, i.e. perfectly in sync. The colored blocks along the bottom are the values returned. The delay in those blocks is the argument passed in, which, per the code in the previous section, is the display duration of lastvp.
The figure shows that sync_threshold defines a band within which lastvp's display duration needs no adjustment, so delay is returned as-is. Inside this band, audio and video are considered quasi-synchronized.
If diff is less than or equal to -sync_threshold, the video is behind and frames need to be dropped where appropriate. Concretely, the function returns delay + diff, clamped to a minimum of 0. According to the earlier frame_timer diagram, the screen should at least be updated to vp.
If diff is greater than or equal to sync_threshold, the video is ahead and lastvp should be shown for longer. Concretely, the function returns 2 * delay, twice lastvp's display duration, so lastvp is shown for one more frame.
If diff is greater than or equal to sync_threshold and delay also exceeds AV_SYNC_FRAMEDUP_THRESHOLD, the function returns delay + diff, so diff decides how much longer the frame is shown. (The intent of the code is not very clear here. As I understand it, returning 2 * delay in both cases, or delay + diff in both cases, would do, and there is no need to distinguish.)
That covers most of how ffplay synchronizes video to audio. A brief summary:
- The basic strategy: if the video is running ahead, repeat the previous frame to wait for the audio; if the video is falling behind, drop frames to catch up with the audio.
- The implementation of this strategy: introduce the frame_timer concept to mark when a frame is displayed and when its display should end, then compare with the system time to decide whether to repeat or drop frames.
- The time at which lastvp's display should end depends not only on the frame's own display duration but also on the difference between the video clock and the audio clock.
- Audio and video are not synchronized at every instant; instead there is a "quasi-synchronous" band of tolerated difference.

