Audio knowledge (I)
2022-06-24 16:41:00 【languageX】
I have worked on many audio projects, and each time I have to review what I learned before. This is a systematic summary of those knowledge points.
This article summarizes the basics of audio, the common terms, and some mathematical background needed for later feature extraction.
To learn about audio, first understand sound: sound is a wave produced by the vibration of an object.
Audio Basics
1. Three elements of sound
Loudness: the human ear's subjective perception of sound intensity. Loudness is related to the amplitude of the sound wave's vibration.
Pitch: the human ear's perception of how high or low a sound is.
Pitch is mainly determined by the frequency of the sound wave, but it is not simply proportional to frequency; it also depends on the intensity and the waveform of the sound.
Timbre: the ear's combined response to sound waves of various frequencies and intensities. It is the characteristic quality of a sound, and it is related to the material and structure of the sounding object.
2. Analog-to-digital conversion
The sounds heard by the human ear are continuous; such a continuous, smooth signal is called an analog signal. The audio data processed in a computer is discrete; such a discontinuous signal is called a digital signal. Converting an analog signal into a digital signal is called analog-to-digital conversion, and it takes three steps: sampling, quantization and encoding.
Sampling: take values from the analog signal at a fixed time interval. For example, the commonly mentioned 16 kHz audio has 16000 sample points per second.
Quantization: represent each sampled amplitude with one of a finite set of values. The resolution is usually given in bits. For example, 16-bit audio has a quantization depth of 16 bits, so values range from -32768 to 32767, 65536 values in total.
Encoding: record the sampled and quantized data in a certain format. Raw audio data generally means pulse code modulation (PCM) data. The encoded binary data is the digital signal.
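The three steps above can be sketched in a few lines of Python. This is only an illustration: the function names, the 440 Hz test tone and the little-endian byte order are assumptions made for the example, not something the article prescribes.

```python
import math
import struct

def sample_sine(freq_hz, sample_rate, duration_s):
    """Sampling: evaluate a sine wave every 1/sample_rate seconds."""
    n_samples = int(sample_rate * duration_s)
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]

def quantize_16bit(samples):
    """Quantization: map floats in [-1.0, 1.0] to integers in [-32768, 32767]."""
    quantized = []
    for s in samples:
        v = int(round(s * 32767))
        quantized.append(max(-32768, min(32767, v)))  # clamp out-of-range input
    return quantized

def encode_pcm16le(quantized):
    """Encoding: pack the integers as little-endian signed 16-bit PCM bytes."""
    return struct.pack("<%dh" % len(quantized), *quantized)

# One second of a (hypothetical) 440 Hz tone at 16 kHz, 16 bit, mono.
samples = sample_sine(freq_hz=440.0, sample_rate=16000, duration_s=1.0)
pcm = encode_pcm16le(quantize_16bit(samples))
print(len(samples), len(pcm))  # number of samples, number of PCM bytes
```

Each 16-bit sample takes two bytes, so the byte count is twice the sample count.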
3. Terminology
Sampling rate: the sampling frequency, i.e. how many points are sampled per second. Common values are 16 kHz and 44.1 kHz.
Sample size: the number of bits per sample point. Common values are 16 bit and 24 bit.
Number of channels: how many sets of sound data the signal produces at a time. Mono and two-channel (stereo) are common.
Bit rate: also called data rate, the number of bits transmitted per second, in bps (bits per second). The higher the bit rate, the more data is transferred per second and, in general, the better the sound quality.
Bit rate formula: bit rate = sampling rate * sample size * number of channels
PCM file size (MB): bit rate (in kbps) * 1000 * number of seconds / 1024 / 1024 / 8
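As a quick worked example of the two formulas, here is a small sketch. The function names are made up for illustration, and the file-size function assumes the bit rate is given in kbps, which is what the factor of 1000 in the formula suggests.

```python
def bit_rate_bps(sample_rate_hz, sample_size_bits, channels):
    """Bit rate = sampling rate * sample size * number of channels."""
    return sample_rate_hz * sample_size_bits * channels

def pcm_size_mb(bit_rate_kbps, seconds):
    """PCM file size in MB, following the formula in the text."""
    return bit_rate_kbps * 1000 * seconds / 1024 / 1024 / 8

# 16 kHz, 16 bit, mono audio lasting 60 seconds.
rate = bit_rate_bps(16000, 16, 1)    # bits per second
size = pcm_size_mb(rate / 1000, 60)  # megabytes
print(rate, round(size, 3))
```

For CD-style audio (44.1 kHz, 16 bit, two channels) the same function gives 1411200 bps.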
Signal Basics
1. Continuous and discrete signals
Sampling the continuous signal x(t) uniformly at interval T gives the discrete signal x(nT). Quantizing each sample to a fixed number of bits then gives the digital signal.
Signals are also divided into periodic and aperiodic signals. The following figure shows an aperiodic continuous signal, a periodic continuous signal, an aperiodic discrete signal and a periodic discrete signal.
2. Fourier analysis
Fourier said: any continuous periodic signal can be composed of a suitable set of sinusoids. So why sinusoids? Because the sine wave is how the frequency domain is described: it is the only waveform that exists in the frequency domain. Fourier analysis is a method for transforming signals between the time domain and the frequency domain.
(This article explains Fourier analysis very well: https://zhuanlan.zhihu.com/p/19759362)
2.1 Fourier series
According to Fourier, a periodic signal f(t) can be expressed as a sum of sine functions, in which t is time, A_n is the amplitude of the n-th term, \omega=2\pi/T is the angular frequency, and \psi_n is the initial phase of the n-th term.
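Written out in standard notation, this expansion reads as follows. It is presumably the formula later cited as (1); the per-term index on A_n and \psi_n is an assumption, chosen to match the a_n and b_n defined afterwards.

```latex
\[
f(t) = A_0 + \sum_{n=1}^{\infty} A_n \sin(n\omega t + \psi_n)
\]
```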
Then, using the trigonometric identity \sin(\alpha \pm \beta) = \sin\alpha \cos\beta \pm \cos\alpha \sin\beta, the formula can be rewritten as
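Applying the identity to each term with \alpha = n\omega t and \beta equal to that term's phase (written \psi_n here, an assumed notation), the expansion would read:

```latex
\[
f(t) = A_0 + \sum_{n=1}^{\infty}
  \left[ A_n \sin\psi_n \cos(n\omega t) + A_n \cos\psi_n \sin(n\omega t) \right]
\]
```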
Let a_n = A_n\sin\psi_n and b_n = A_n\cos\psi_n, so that formula (1) becomes
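With these substitutions the series takes the familiar cosine and sine form, sketched here in standard notation:

```latex
\[
f(t) = A_0 + \sum_{n=1}^{\infty}
  \left[ a_n \cos(n\omega t) + b_n \sin(n\omega t) \right]
\]
```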
Because the integral of a trigonometric function over one period is 0, and because trigonometric functions are orthogonal, we can solve for A_0, a_n and b_n
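The standard result of that calculation is shown below. It assumes the integrals are taken over one period centred on zero; any interval of length T would give the same values.

```latex
\[
A_0 = \frac{1}{T}\int_{-T/2}^{T/2} f(t)\,dt, \qquad
a_n = \frac{2}{T}\int_{-T/2}^{T/2} f(t)\cos(n\omega t)\,dt, \qquad
b_n = \frac{2}{T}\int_{-T/2}^{T/2} f(t)\sin(n\omega t)\,dt
\]
```

For example, multiplying the series by \cos(m\omega t) and integrating over one period removes every term except the one with n = m, which is how a_m is isolated.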
To unify the form, write a_0=2A_0 and take the period T=2\pi, then transform formula 4 as follows
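Under those two choices (so that \omega = 1), the series and its coefficients take the usual textbook form, which is presumably what the text calls formula 5:

```latex
\[
f(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty}\left[ a_n \cos(nt) + b_n \sin(nt) \right],
\qquad
a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\cos(nt)\,dt, \quad
b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\sin(nt)\,dt
\]
```

The expression for a_n also covers n = 0, which is the reason for introducing a_0 = 2A_0.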
Formula 5 is the Fourier series formula.
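To see the series at work, here is a small numerical sketch. It assumes the period-2\pi form with a_0/2 as the constant term, approximates the coefficient integrals with a midpoint Riemann sum, and uses a square wave as a made-up test signal; none of these choices come from the article.

```python
import math

def fourier_coefficients(f, n_max, steps=4000):
    """Approximate a_0, a_1..a_n_max, b_1..b_n_max of a 2*pi-periodic f."""
    dt = 2 * math.pi / steps
    ts = [-math.pi + (k + 0.5) * dt for k in range(steps)]  # midpoints
    a0 = sum(f(t) for t in ts) * dt / math.pi
    a = [sum(f(t) * math.cos(n * t) for t in ts) * dt / math.pi
         for n in range(1, n_max + 1)]
    b = [sum(f(t) * math.sin(n * t) for t in ts) * dt / math.pi
         for n in range(1, n_max + 1)]
    return a0, a, b

def partial_sum(a0, a, b, t):
    """Evaluate a_0/2 + sum of a_n*cos(n*t) + b_n*sin(n*t) at time t."""
    total = a0 / 2
    for n, (an, bn) in enumerate(zip(a, b), start=1):
        total += an * math.cos(n * t) + bn * math.sin(n * t)
    return total

def square(t):
    """Square wave: +1 on (0, pi), -1 on (-pi, 0)."""
    return 1.0 if math.sin(t) >= 0 else -1.0

a0, a, b = fourier_coefficients(square, n_max=9)
approx = partial_sum(a0, a, b, math.pi / 2)
print(round(b[0], 4), round(approx, 3))
```

For this odd signal the cosine coefficients vanish and b_1 comes out close to 4/\pi, the known value for a square wave. Adding more terms brings the partial sum closer to 1 at t = \pi/2.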
2.2 Fourier transform (FT)
The Fourier series above is in trigonometric form; let us convert it to exponential form.
From Euler's formula e^{i\theta} = \cos(\theta) + i\sin(\theta) we get
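Adding and subtracting Euler's formula for \theta and -\theta gives the two identities below, which are presumably what the text refers to as formula 6:

```latex
\[
\cos\theta = \frac{e^{i\theta} + e^{-i\theta}}{2}, \qquad
\sin\theta = \frac{e^{i\theta} - e^{-i\theta}}{2i}
\]
```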
Substitute formula 6 into f(t) in formula 5,
and then substitute the expressions for a_n, b_n and a_0 from formula 5 to obtain
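Carrying out this substitution for period 2\pi leads to the standard complex form of the series. The coefficient name c_n is introduced here only for illustration, and the original formula may have been arranged differently.

```latex
\[
f(t) = \sum_{n=-\infty}^{\infty} c_n e^{int}, \qquad
c_n = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)\,e^{-int}\,dt
\]
```

For n > 0 this corresponds to c_n = (a_n - i b_n)/2 and c_{-n} = (a_n + i b_n)/2, with c_0 = a_0/2.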
Let N tend to \infty, so that \omega=\frac{2\pi}{N}, and let \omega_x=\frac{2\pi}{N}n=\omega n
Next, define F(\omega_x) as the Fourier transform of f(t)
Formula 8 can then be rewritten as
As defined above, the step is \omega=\frac{2\pi}{N}. By the Riemann-sum expression of the integral (an integral can be seen as dividing the area under a curve into very small intervals and then summing them),
the formula becomes
Finally, letting T\rightarrow N gives
f(t)=\frac{1}{2\pi}\int^{+\infty}_{-\infty}F(\omega_x)e^{i\omega_x t}\,d\omega_x\tag{12}
Formulas 12 and 9 are the formulas of the Fourier transform.
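For reference, the forward transform that pairs with formula 12 has the standard form below. It is presumably what the text calls formula 9; that is an assumption, since only formula 12 is written out.

```latex
\[
F(\omega_x) = \int_{-\infty}^{+\infty} f(t)\,e^{-i\omega_x t}\,dt
\]
```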
2.3 Discrete Fourier transform (DFT)
The Fourier transform is an integral computed over a continuous signal. In a computer we only have discrete signals, so we need the DFT.
The DFT converts the integral of the FT into a sum. In the FT the step \omega\rightarrow 0; here we instead substitute \omega_x=n\frac{2\pi}{N} into formula 10.
Letting T\rightarrow N and adapting formulas 13 and 9 accordingly gives the DFT formula
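In standard notation, and assuming N samples f(0), ..., f(N-1) with the frequency index n used above, the DFT pair can be written as follows. The placement of the 1/N factor is a convention and may differ from the original formula.

```latex
\[
F(n) = \sum_{t=0}^{N-1} f(t)\,e^{-i\frac{2\pi}{N}nt}, \qquad
f(t) = \frac{1}{N}\sum_{n=0}^{N-1} F(n)\,e^{i\frac{2\pi}{N}nt}
\]
```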
2.4 Fast Fourier transform (FFT)
The DFT and the FFT compute the same thing; the FFT is simply a fast algorithm for the DFT.
Computing the DFT directly requires a sum over all N samples for each output F(n), so the time complexity is O(N^2), whereas the time complexity of the FFT is only O(N log2 N).
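The difference can be illustrated with a direct DFT next to a textbook recursive radix-2 FFT. This is a minimal sketch for power-of-two lengths, not an optimized implementation, and the test signal is made up for the example.

```python
import cmath

def dft(x):
    """Direct DFT: N outputs, each a sum over N inputs, hence O(N^2)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])  # DFT of the even-indexed samples
    odd = fft(x[1::2])   # DFT of the odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

# Two cycles of a sine wave over 8 samples.
signal = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
slow, fast = dft(signal), fft(signal)
max_diff = max(abs(p - q) for p, q in zip(slow, fast))
print(max_diff < 1e-9)
```

Both functions return the same spectrum up to rounding error. The FFT gets there by splitting the problem in half at every level of recursion, which is where the log2 N factor comes from.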
2.5 Discrete cosine transform (DCT)
In the Fourier series expansion, if the expanded function is a real even function, the series contains only cosine terms. Discretizing it (as in the DFT) leads to a cosine-only transform, which is why it is called the discrete cosine transform (DCT). The DCT can be viewed as a special case of the DFT.
The discrete cosine transform is in effect the discrete Fourier transform of a new signal obtained by processing the original signal. The construction of the new signal from the original signal is shown in the figure below.
The original signal is first mirrored symmetrically and then shifted by 1/2 unit to give the new signal. If the original signal is f(x), the new signal is g(x)=f(x-\frac{1}{2})+f(-x-\frac{1}{2})
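The mirror-and-shift construction can be checked numerically. The sketch below assumes the unnormalized DCT-II, mirrors a length-N signal into a length-2N one, takes its DFT and applies the half-sample shift as a phase factor. The function names are made up for the example.

```python
import cmath
import math

def dct2_direct(x):
    """Unnormalized DCT-II: X[k] = sum_n x[n] * cos(pi * (n + 1/2) * k / N)."""
    n = len(x)
    return [sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
            for k in range(n)]

def dct2_via_dft(x):
    """The same DCT-II, computed as the DFT of the mirrored signal."""
    n = len(x)
    mirrored = list(x) + list(reversed(x))  # symmetric extension, length 2N
    m = 2 * n
    out = []
    for k in range(n):
        y_k = sum(mirrored[t] * cmath.exp(-2j * cmath.pi * k * t / m)
                  for t in range(m))
        shifted = cmath.exp(-1j * cmath.pi * k / m) * y_k  # half-sample shift
        out.append(0.5 * shifted.real)
    return out

x = [1.0, 2.0, 3.0, 4.0]
direct = dct2_direct(x)
via_dft = dct2_via_dft(x)
dct_diff = max(abs(p - q) for p, q in zip(direct, via_dft))
print(dct_diff < 1e-9)
```

After the shift the mirrored spectrum is purely real, which is the discrete counterpart of "a real even function has only cosine terms".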
The DCT formula is:
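A common way to write it is the orthonormal DCT-II below. This is an assumption about which variant is meant; other texts use different normalization factors.

```latex
\[
F(u) = c(u)\sum_{x=0}^{N-1} f(x)\cos\!\left[\frac{(x+\frac{1}{2})\pi u}{N}\right],
\qquad
c(u) =
\begin{cases}
\sqrt{1/N}, & u = 0 \\
\sqrt{2/N}, & u \neq 0
\end{cases}
\]
```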
The inverse transform is:
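Assuming the orthonormal DCT-II as the forward transform, with c(0)=\sqrt{1/N} and c(u)=\sqrt{2/N} for u \neq 0, the inverse takes the following form:

```latex
\[
f(x) = \sum_{u=0}^{N-1} c(u)\,F(u)\cos\!\left[\frac{(x+\frac{1}{2})\pi u}{N}\right]
\]
```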
That is all for today. Next we will cover MFCC feature extraction for audio and its code implementation.
Reference articles:
https://zhuanlan.zhihu.com/p/75521342
https://blog.csdn.net/qq_39546227/article/details/99686160