Voiceprint Technology (V): voiceprint segmentation and clustering technology

2022-06-25 09:05:00 【u013250861】

5.1　Segmentation and clustering: better understanding conversational speech

5.1.1　On the name and its history

Voiceprint segmentation and clustering (speaker diarization) is second only to voiceprint recognition as a topic in the voiceprint field, and it is considerably harder than voiceprint recognition. The problem that voiceprint recognition solves can be summarized as "who said it". This rests on an assumption: the utterance to be recognized contains the voice of only one speaker. In the segmentation and clustering problem, that assumption is dropped. In other words, a single recording may contain the voices of several speakers taking turns. The problem that voiceprint segmentation and clustering solves can therefore be summarized as "who spoke when".
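To make "who spoke when" concrete, the following is a minimal sketch of one way to represent a diarization result: a list of time segments, each tagged with a speaker label. The names Segment, speakers_at and talk_time, as well as the timestamps and labels, are hypothetical and chosen only for illustration; they do not come from any particular toolkit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One stretch of speech attributed to a single speaker (hypothetical type)."""
    start: float   # start time in seconds
    end: float     # end time in seconds
    speaker: str   # anonymous label; diarization does not know real identities


def speakers_at(segments, t):
    """Return the labels of all speakers whose segment covers time t."""
    return [s.speaker for s in segments if s.start <= t < s.end]


def talk_time(segments):
    """Return the total speaking time in seconds for each speaker label."""
    totals = {}
    for s in segments:
        totals[s.speaker] = totals.get(s.speaker, 0.0) + (s.end - s.start)
    return totals


# Illustrative result for a 12-second recording with two alternating speakers.
diarization = [
    Segment(0.0, 4.0, "speaker_A"),
    Segment(4.0, 9.5, "speaker_B"),
    Segment(9.5, 12.0, "speaker_A"),
]

print(speakers_at(diarization, 5.0))
print(talk_time(diarization))
```

The labels are deliberately generic: answering "who spoke when" only requires that the same speaker receive the same label throughout the recording, not that the speaker be identified by name.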

The English word diarization derives from diary, that is, a diary or journal. Going from diary to the verb diarize and then to the noun diarization, the word can be read literally as "to turn something into a log", or "logging". A journal usually records, following the course of the day, who did what at what time. Extended to speaker diarization, it can naturally be understood as "who said what at what time".

The earliest origin of the name speaker diarization is hard to trace. Some early literature referred to the problem directly as speaker segmentation and clustering [114,115], which is why many Chinese-language sources translate it as "voiceprint segmentation and clustering" [116]. As the field has developed, however, and especially with the recent appearance of supervised methods (see Section 5.5) and even end-to-end models (see Section 5.5.6), the name "segmentation and clustering" is no longer quite appropriate: both the segmentation step and the clustering step can be replaced by other methods. Another Chinese translation, which I prefer, is "voiceprint time-sharing archiving".

Copyright notice
This article was written by [u013250861]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/176/202206250736417658.html