Research on natural transition dubbing processing scheme based on MATLAB
2022-06-26 16:28:00 【Zhuoqing】

Abstract: Starting from the need to modify video dubbing after the fact, this article sets the goal of blending modified audio naturally into the original material. By analyzing the factors that make up environmental sound, it settles on a processing scheme that records an impulse response and convolves it with the audio. Practical tests in MATLAB then explore the feasibility of the convolution scheme, and after improvement the result is acceptable. Finally, a feasible workflow is proposed and the problems encountered in the experiment are discussed.
Keywords: convolution, audio processing, impulse response
I. Background and Objectives
1. Background
Alongside his studies, the author runs a personal account on Bilibili (B站) and publishes unboxing and review videos of digital products. During video production the audio of the original material occasionally has to be modified, for example to fix a slip of the tongue, correct a mistake or add some content. For material recorded outside the dormitory, dubbing directly in the dormitory produces a strong mismatch between picture and sound because the recording scenes differ, and the transition is not natural.
In December 2021, the author took part in the post-production editing of a class drama. In one scene the female lead's voice was clearly quieter than the male lead's, so post-dubbing was required. Because outdoor shooting was not possible at the time, the dubbing was done in the dormitory. The recording itself was clean, but it lacked the outdoor background sound, so it blended poorly into the video.

▲ Figure 1.1 Class play clip that needs post-processing
2. Objective
Use MATLAB-based audio processing so that audio recorded in a quiet, echo-free dormitory can blend naturally into other environments. This reduces the workload of later modifications and improves the overall look and feel of the final video.
II. Theoretical Analysis
1. Sources of environmental sound
When shooting outdoors, the environment strongly affects the audio recording. Besides the direct sound from the people in the video, the microphone also picks up sound reflected from walls and the ground, noise from passing motor vehicles, and background noise such as air flow and the microphone's own noise.

▲ Figure 2.1 Schematic of the sounds a microphone receives during location shooting
These sounds have a considerable effect on the final recording and cannot be ignored. The sound received by the microphone is therefore equivalent to the target sound after it has been processed by a specific environmental system. To simulate a location recording from a quiet environment, a way to describe that system is needed.
2. Impulse response
The impulse response is defined as the time-domain response of a system under test when a pulse excitation signal is applied to its input. In acoustic analysis the impulse response is regarded as the acoustic signature of a system. It contains a wealth of information about the system, including arrival time, frequency content, reverberation decay characteristics and overall frequency response. Measuring the impulse response therefore yields a description of the system.

▲ Figure 2.2 Composition of an impulse response (source: Internet)
3. Audio processing by convolution
As noted above, the sound received by the microphone is equivalent to the original sound after processing by a specific environmental system. Let Y be the sound received by the microphone, X the original sound, and H the transfer function of the system. In the s domain:
$$Y(s) = H(s) \cdot X(s)$$
Because the system transfer function H(s) is the Laplace transform of the system's impulse response h(t), H(s) can be obtained by actually measuring the impulse response. Since multiplication in the s domain corresponds to convolution in the time domain:
$$y(t) = h(t) * x(t)$$
Therefore, convolving audio recorded in a quiet environment with the impulse response recorded in a specific environment should, in theory, give audio similar to a recording made in that environment, which can then blend smoothly into the material that needs to be changed.
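The time-domain relation above can be tried on synthetic data before any recording is made. The following is a minimal sketch, not the article's experiment: the tone burst `x` and the three-tap impulse response `h` (direct sound plus two reflections) are invented purely for illustration.

```matlab
% Minimal sketch with synthetic data (not the recordings used in this article).
fs = 48000;                          % sampling rate in Hz, same as in the experiment
t  = (0:fs/2-1)' / fs;               % 0.5 s time axis (column vector)
x  = sin(2*pi*440*t) .* (t < 0.1);   % "dry" signal: a 0.1 s tone burst, then silence

% Toy impulse response: direct sound plus two weaker, delayed reflections.
h = zeros(round(0.05*fs), 1);        % 50 ms long
h(1)                   = 1.0;        % direct sound
h(1 + round(0.010*fs)) = 0.5;        % reflection 10 ms later
h(1 + round(0.030*fs)) = 0.25;       % reflection 30 ms later

y = conv(x, h);                      % y(t) = h(t) * x(t)

% The output is longer than the input: length(x) + length(h) - 1 samples.
fprintf('input: %d samples, output: %d samples\n', length(x), length(y));
```

Playing `x` and `y` with `sound(x, fs)` and `sound(y, fs)` should make the added echoes audible; a measured impulse response plays the same role as `h`, only with far more taps.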
III. Test Practice
1. Test plan
(1) Test scenario
Balancing practical conditions against ease of operation, the dormitory room serves as the quiet recording environment and the bathroom as the specific environment to be simulated. Because the bathroom is small and well sealed, recordings made there have strong reverberation, which makes the experiment easier.

▲ Figure 3.1 The bathroom chosen as the specific recording environment
(2) Recording device
The recording device is a condenser USB microphone. Compared with a mobile phone, the microphone offers some noise reduction, and it records mono audio, which simplifies the subsequent experiments.

▲ Figure 3.2 The USB microphone used in the experiment
(3) Test audio
The text recorded in the bathroom is "Tsinghua University, Class Zi 02", the text recorded in the dormitory is "Department of Automation", and the aim is to merge them into "Tsinghua University, Department of Automation, Class Zi 02". For the impulse signal, a series of triggering methods were tried and the best-sounding one was used to record the impulse response of the bathroom.
2. Test procedure
(1) Record audio in the bathroom
Record the "Tsinghua University, Class Zi 02" audio in the bathroom and name it "bathvoice".
(2) Measure the impulse response of the bathroom
Snap the fingers in the bathroom to capture the impulse response of the bathroom environment, and name it "pulse".
(3) Record audio in the dormitory
Record the "Department of Automation" audio in the dormitory and name it "roomvoice".
(4) Check the audio waveforms
Use the GoldWave audio software to check whether each recording is mono. The results were normal, as shown in the figure below:

▲ Figure 3.3 Waveform of each audio file; all are mono
The narrow strip below each waveform is the overall progress bar, not a second channel. All three audio files are mono.
(5) Convolution in MATLAB
Import each audio file into MATLAB, as shown below:

▲ Figure 3.4 Audio files after import into MATLAB; the sampling rate fs is 48000
Convolve roomvoice with pulse and name the result ans1:
>> ans1 = conv(roomvoice,pulse)
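The import and convolution steps can also be scripted rather than done through the MATLAB interface. The sketch below is one possible way, under stated assumptions: it first writes two stand-in files to the temporary folder so that it runs on its own, and the file names `roomvoice_demo.wav` and `pulse_demo.wav` are hypothetical. With real recordings only the `audioread` lines onward would be needed.

```matlab
% Sketch: stand-in files are written first so the example runs on its own.
fs = 48000;
t  = (0:fs-1)' / fs;
voiceFile = fullfile(tempdir, 'roomvoice_demo.wav');   % hypothetical file name
pulseFile = fullfile(tempdir, 'pulse_demo.wav');       % hypothetical file name
audiowrite(voiceFile, 0.5*sin(2*pi*220*t), fs);        % 1 s stand-in for the voice
audiowrite(pulseFile, [0.9; zeros(fs/10 - 1, 1)], fs); % 0.1 s stand-in impulse

% Read the samples and the sampling rate of each file.
[roomvoice, fsVoice] = audioread(voiceFile);
[pulse, fsPulse]     = audioread(pulseFile);

% Convolution assumes both signals share one sampling rate.
assert(fsVoice == fsPulse, 'Sampling rates differ; resample one file first.');

% Keep only the first channel in case a file turns out to be stereo.
roomvoice = roomvoice(:, 1);
pulse     = pulse(:, 1);

ans1 = conv(roomvoice, pulse);   % trailing semicolon suppresses the long printout
```

Checking the sampling rates up front matters because `conv` works on sample indices only; two files recorded at different rates would convolve without error but sound wrong.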
IV. Analysis of Test Results
1. Analysis of ans1
Listening to ans1 reveals that the convolution result is noisy and the response lasts very long, so the effect is unsatisfactory.
>> sound(ans1, fs)
>> audiowrite('ans1.wav', ans1, 48000)
To find the cause, plot roomvoice, pulse and ans1 with the plot command:



▲ Figure 4.1 Waveforms of roomvoice, pulse and ans1
The plots show that the roomvoice audio decays to nearly 0 at about 3.8 seconds, whereas the ans1 waveform only begins to decay at close to 4 seconds. In addition, the first 4 seconds of ans1 are much denser than roomvoice, and the quiet passages do not drop close to 0, which explains the heavy noise and long reverberation. Looking at the impulse response pulse, its onset is not at time zero, which introduces a transmission delay, and its shallow reverberation decay slope and high background noise lead to the unsatisfactory convolution result. The next step is therefore to improve the impulse response pulse.
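The delay caused by an impulse response that does not start at time zero can be seen on a toy example. The short vectors below are invented for illustration only; they show that leading silence in the impulse response shifts the whole convolution result by the same number of samples.

```matlab
% Toy signals, chosen only to make the delay visible.
x = [1; 2; 3];                 % short "dry" signal
hClean   = [1; 0.5];           % impulse response that starts at sample 1
hDelayed = [0; 0; 1; 0.5];     % same response preceded by two samples of silence

yClean   = conv(x, hClean);    % [1; 2.5; 4; 1.5]
yDelayed = conv(x, hDelayed);  % [0; 0; 1; 2.5; 4; 1.5], i.e. yClean shifted by 2
```

At 48000 samples per second, every 48 leading samples of silence in the recorded impulse response delay the processed voice by one millisecond, which is one reason to remove them before convolving.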
2. Truncating the impulse response
Zoom in on the waveform of the impulse response pulse, as shown:

▲ Figure 4.2 Impulse response after zooming in
Comparing with the schematic of impulse response composition (Figure 2.2), the span from the direct sound to the end of the early decay accounts for only a small part of the recorded impulse response; the rest is reverberation build-up and reverberation decay. Given that the bathroom affects the sound mainly through first reflections, wall absorption and reverberation, and given the limitations of the recording equipment, only the part from the arrival of the direct sound through the reverberation build-up is kept and the rest is discarded. This ensures that the bathroom environment itself dominates the processing of the sound and minimizes the influence of other factors.
The procedure is as follows. First re-import pulse.m4a and name it pulse2. Double-click pulse2 to open it and delete all the 0 values at the start, as shown:

▲ Figure 4.3 Deleting all leading 0 values of pulse2
Then find the region where the impulse response starts to show small values over a wide range and delete everything after it, for example samples 22290~74304. Several adjustments may be needed to get the best convolution result:

▲ Figure 4.4 Deleting the small values in the second half of pulse2
The processed impulse response, drawn with the plot command, is shown in Figure 4.5.
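The two trimming steps are done by hand in the variable editor in this article, with the cut point chosen by inspection. As an alternative, they could be scripted roughly as follows. This is a sketch under assumptions: the decaying tone standing in for `pulse2` is synthetic, and the automatic tail rule with `relThreshold` is a swapped-in heuristic, not the author's method.

```matlab
% Synthetic stand-in for the recorded impulse response (assumption, not real data).
fs = 48000;
n  = (0:fs-1)';                                   % 1 s of samples
decay  = exp(-n / (0.02*fs));                     % envelope with a 20 ms time constant
pulse2 = [zeros(1000, 1); decay .* cos(2*pi*1000*n/fs)];  % 1000 leading zeros

% Step 1: drop everything before the first non-zero sample.
firstIdx = find(pulse2 ~= 0, 1, 'first');
pulse2   = pulse2(firstIdx:end);

% Step 2: cut the tail after the last sample that still reaches a small
% fraction of the peak value.
relThreshold = 0.01;                              % hypothetical value; tune by ear
lastIdx = find(abs(pulse2) >= relThreshold * max(abs(pulse2)), 1, 'last');
pulse2  = pulse2(1:lastIdx);
```

To reproduce a manual cut at a fixed index instead, such as the example value given above, a single line like `pulse2(22290:end) = [];` has the same effect as deleting those rows in the editor. On a real recording the leading samples may be small rather than exactly 0, in which case step 1 would also need a threshold.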

▲ Figure 4.5 Zoomed waveform of the first half of the processed pulse2
3. Analysis of ans2
Convolve roomvoice with pulse2, name the result ans2, export it to a wav file and listen to it.
>> ans2 = conv(roomvoice,pulse2);
>> audiowrite('ans2.wav', ans2, 48000)
>> sound(ans2,fs)
The convolution result is very good, close to bathvoice, which was recorded directly in the bathroom. The waveform drawn with the plot command is shown below. The reverberation length is much improved, and the noise problem is also reduced to some extent:

▲ Figure 4.6 Waveform of ans2
Next, Adobe Premiere Pro is used to splice bathvoice and ans2, simulating the modification and replacement of audio in a real application, and the result is exported as final.mp3. For comparison, bathvoice is spliced in the same way with the unprocessed audio roomvoice and exported as normal.mp3. final.mp3, normal.mp3 and the Adobe Premiere Pro project file "audio editing.prproj" are all included in the paper's attachment package.

▲ Figure 4.7 Simulated splicing and replacement as in a real application
4. Comparison of processed and unprocessed audio
Listening to final.mp3 and normal.mp3, the splice in the processed audio is still easy to hear. Compared with the unprocessed audio, however, the processed segment has added reverberation that simulates the bathroom environment, giving the impression that the inserted part was also recorded in the bathroom. The unprocessed audio has no such effect.
The comparison shows that the MATLAB-based audio processing in this article is effective: it reduces the sense of incongruity in post-dubbing and achieves a more natural transition.
V. Practical Application and Supplementary Notes
1. A feasible workflow
When recording video on location, record one or more impulse responses in each scene. If the dubbing needs to be modified later, re-record the correct audio, convolve it with the processed impulse response of the corresponding scene, and select the result closest to the original audio for splicing. This gives a natural transition.
2. Supplementary notes on problems in the experiment
The parts of this article that lacked the conditions for implementation and the points that need attention are summarized below:
(1) Use high-quality recording equipment where possible
The key step in this article is the convolution of signals. If the impulse response is not captured correctly, or the noise in the post-dubbing is too loud, the accuracy of the convolved audio can suffer badly. This experiment used a condenser microphone in the 100-yuan price range. With noise minimized (doors and windows closed), the recordings in the bathroom and dormitory still contained some noise floor, but the result was satisfactory. A higher-grade microphone and a professional studio should give a better result.
(2) Watch the microphone distance during post-dubbing
In the author's experience of dubbing videos, the distance between the mouth and the microphone greatly affects the recording. In post-dubbing, keep the microphone distance the same as in the original recording and, if conditions permit, keep the angle and relative position consistent as well, to reproduce the original sound as closely as possible. In this experiment roomvoice was recorded with the microphone too close because of dormitory constraints, so a strong sense of incongruity remains after processing.
(3) Read a complete sentence to reproduce the tone
Impulse response convolution can process the audio but cannot change the tone of voice. In post-dubbing, avoid reading isolated words; read the whole sentence instead so that the tone and intonation match the original material and the result blends better. In this experiment roomvoice contains only the four characters of "Department of Automation", whose tone differs greatly from the original sentence "Tsinghua University, Class Zi 02", making it hard to blend. This should be avoided in practice.
(4) Adjust loudness and other parameters with other software
Convolution changes the loudness of the audio, usually raising it slightly. When splicing, MATLAB can be used to adjust the amplitude of the convolved audio to match the original audio as closely as possible. Audio and video software such as Adobe Premiere Pro can also be used to adjust loudness. In final.mp3 from this experiment, the gains of the three audio segments are +3.8dB, -3.3dB and +5.9dB.
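One possible way to do the amplitude adjustment in MATLAB is to match RMS levels, which is only a rough proxy for perceived loudness. The sketch below rests on assumptions: `reference` and `processed` are synthetic sine waves standing in for the original and the convolved audio, and RMS matching is an illustrative choice, not the procedure used for final.mp3.

```matlab
% Synthetic stand-ins (assumptions): a reference clip and a louder processed clip.
fs = 48000;
t  = (0:fs-1)' / fs;
reference = 0.2 * sin(2*pi*220*t);     % plays the role of the original audio
processed = 0.7 * sin(2*pi*220*t);     % plays the role of the convolved audio

% Match RMS levels: scale the processed clip to the level of the reference.
rmsRef  = sqrt(mean(reference.^2));
rmsProc = sqrt(mean(processed.^2));
matched = processed * (rmsRef / rmsProc);

% Equivalent gain in decibels, comparable to the dB values set in an editor.
gainDb = 20 * log10(rmsRef / rmsProc);

% Keep the peak at or below 1 so that audiowrite does not clip the file.
peak = max(abs(matched));
if peak > 1
    matched = matched / peak;
end
```

A gain given in dB converts back to a linear factor with `10^(gainDb/20)`, so a value read from an editor can be applied in MATLAB as `signal * 10^(gainDb/20)`. The peak check is also worth applying to any convolution result before `audiowrite`, since sample values outside the range -1 to 1 are clipped when writing.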

▲ Figure 5.1 Adjusting loudness for better blending
References:
[1] Signals and Systems, fourth assignment, spring semester 2022, https://zhuoqing.blog.csdn.net/article/details/123550045
[2] 26 Class play - Video-Export, https://www.bilibili.com/video/BV1da411r7uM
[3] Explanation of acoustic concepts: figure out what impulse response is, https://blog.csdn.net/qq_28350219/article/details/114096751
[4] Sound changing principle: convolution and transfer function, https://www.csdn.net/tags/MtTaAgzsNTgzNTM1LWJsb2cO0O0O.html
[5] matlab: process audio signals 33, https://www.csdn.net/tags/OtTaAgysODM2MDUtYmxvZwO0O0OO0O0O.html