[Speech synthesis] TensorFlowTTS Chinese text-to-speech
2022-07-24 00:09:00 【Wang Xiaoxi WW】
Introduction
This project is a Chinese speech synthesis demo built on TensorFlowTTS. TensorFlowTTS is an offline, open-source speech synthesis (text-to-speech) toolkit. It supports a range of cutting-edge models and achieves SOTA-level results.
The source project is at: https://gitee.com/sherlocking_755/tts-demo
Reference material for the project: "An article teaches you how to get started with speech synthesis, train a Chinese voice TTS"
Environment setup
1. Windows (attempt failed)
First, run pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple to set up the TensorFlowTTS environment.
Running the TensorFlowTTS project on Windows then raises an error:
Traceback (most recent call last):
  File "D:\programSoftware\python\anaconda\envs\temp_env\lib\tempfile.py", line 258, in _mkstemp_inner
    fd = _os.open(file, flags, 0o600)
PermissionError: [Errno 13] Permission denied: 'D:\\programSoftware\\python\\anaconda\\envs\\temp_env\\lib\\site-packages\\librosa\\util\\__pycache__\\tmpyrv1bpb4'
During handling of the above exception, another exception occurred:
Suggestions found online say to change the temporary-file call to:
f = tempfile.NamedTemporaryFile(mode='w+', delete=False)
But I made that change and it still didn't work.
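For context, here is a minimal sketch of what the suggested delete=False change does. The traceback above fails inside _mkstemp_inner, at the point where the temporary file is created under librosa's __pycache__ directory, whereas delete=False only controls whether the file is removed when it is closed. That may be why the change made no difference here. The helper name write_then_reopen is hypothetical and not part of librosa or this project.

```python
import os
import tempfile


def write_then_reopen(text):
    """Write text to a temp file, close it, then read it back by name."""
    # delete=False keeps the file on disk after close(), so it can be
    # reopened by path (on Windows an open temp file cannot be reopened).
    f = tempfile.NamedTemporaryFile(mode='w+', delete=False)
    try:
        f.write(text)
        f.close()
        with open(f.name) as fh:
            return fh.read()
    finally:
        f.close()          # no-op if already closed
        os.remove(f.name)  # with delete=False, cleanup is our job


if __name__ == "__main__":
    print(write_then_reopen("hello"))
```

If file creation itself is denied, as in the PermissionError above, this call would fail in the same place regardless of the delete flag.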
2. Ubuntu (works)
This setup uses WSL2 + Docker Desktop: pull the Anaconda image with Docker and start an anaconda container from it.
docker run -it --name="anaconda" -p 8888:8888 continuumio/anaconda3 /bin/bash
Next, set up the TensorFlowTTS environment inside this container. Setting it up directly on a Linux system should be much the same.
Simply run pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple.
The following problems may come up during setup. If you hit them, see the links below:
- Error when installing llvmlite: see https://blog.csdn.net/qq_41977618/article/details/119572879 and https://www.cnblogs.com/kele-dad/p/12955804.html
- Error when installing pyaudio: see https://www.csdn.net/tags/MtjaYgysNjcxODQtYmxvZwO0O0OO0O0O.html
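Since the problems above are tied to specific packages, it can help to see which versions actually got installed after pip finishes. Below is a minimal sketch that uses only the standard library. The package list is an assumption drawn from the packages named in this post plus tensorflow; it is not the project's actual requirements.txt.

```python
from importlib import metadata

# Assumed list: packages mentioned in this post, plus tensorflow.
PACKAGES = ["tensorflow", "librosa", "numba", "llvmlite", "nltk", "pyaudio"]


def installed_versions(names):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name}: {version or 'not installed'}")
```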
Running the program
1. Load data
On Windows, put the decompressed nltk_data folder in C:\Users\<username>\AppData\Roaming
On Ubuntu, put the decompressed nltk_data folder in /root
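The placement step can be sketched in Python as follows. This is an illustration only: both function names are hypothetical, the Windows branch assumes the APPDATA environment variable points at AppData\Roaming, and the Linux branch assumes the script runs as root so that the home directory is /root. The check looks for corpora/cmudict, the resource named in the NLTK error shown later in this post.

```python
import os
import shutil
import sys
from pathlib import Path


def default_nltk_data_dir(platform=sys.platform, env=os.environ):
    """Return the per-user nltk_data location described in this post."""
    if platform.startswith("win"):
        # Assumption: APPDATA is C:\Users\<username>\AppData\Roaming
        return Path(env.get("APPDATA", str(Path.home()))) / "nltk_data"
    # For the root user inside the container this is /root/nltk_data
    return Path.home() / "nltk_data"


def install_nltk_data(src, dst):
    """Copy src to dst unless the cmudict corpus is already there."""
    src, dst = Path(src), Path(dst)
    corpora = dst / "corpora"
    if (corpora / "cmudict").exists() or (corpora / "cmudict.zip").exists():
        return False  # nothing to do
    shutil.copytree(src, dst, dirs_exist_ok=True)  # needs Python 3.8+
    return True


if __name__ == "__main__":
    source = Path("nltk_data")  # the decompressed folder from the project
    if source.is_dir():
        print(install_nltk_data(source, default_nltk_data_dir()))
```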
2. Load the models
First, extract the model files from tacotron2.part1.rar into the project root directory. These are loaded as the tacotron2 model.
The MelGAN model file is already in the project root directory.
Then edit the configuration files for the tacotron2 and MelGAN models, tacotron2.baker.v1.yaml and TensorFlowTTS/examples/multiband_melgan/conf/multiband_melgan.baker.v1.yaml, respectively. The default configuration works here.
Finally, run python tts-demo.py from the project path.
3. Possible problems
You may hit the following problems when running the project:
- numba raises the error below at runtime; see https://blog.csdn.net/qq_41590635/article/details/112499219
TypeError: create_target_machine() got an unexpected keyword argument 'jitdebug'
Installing numba from the Tsinghua mirror is not recommended, because the package it downloaded was wrong. Installing from the original pip source fixes it.
- A runtime error reports that nltk_data was not found under /root. Copying the project's nltk_data folder to /root fixes it:
  Resource cmudict not found.
  Please use the NLTK Downloader to obtain the resource:
  >>> import nltk
  >>> nltk.download('cmudict')
  For more information see: https://www.nltk.org/data.html
  Attempted to load corpora/cmudict
  Searched in:
    - '/root/nltk_data'
    - '/opt/conda/nltk_data'
    - '/opt/conda/share/nltk_data'
    - '/opt/conda/lib/nltk_data'
    - '/usr/share/nltk_data'
    - '/usr/local/share/nltk_data'
    - '/usr/lib/nltk_data'
    - '/usr/local/lib/nltk_data'
**********************************************************************
Test results
The following 4 Chinese sentences are used for testing (given here in translation):
"Withered vines, old trees, crows at dusk; a small bridge, flowing water, a cottage; an ancient road, the west wind, a lean horse. The sun sets in the west; the heartbroken one is at the end of the world",
"This is an open-source end-to-end Chinese speech synthesis system",
"Harden took the initiative to cut his salary, 2 years 14.5 million dollars, to stay with the 76ers",
"Wu Aping hasn't been found yet, but the monk Chuanzhen of Xuanzang Temple stunned me"
Phoneme sequence and synthesis time for each sentence, followed by the mean time:
phoneme seq: sil k u1 #0 t eng2 #0 l ao3 #0 sh u4 #0 h uen1 #0 ^ ia1 #0 x iao3 #0 q iao2 #0 l iou2 #0 sh uei3 #0 r en2 #0 j ia1 #0 g u3 #0 d ao4 #0 x i1 #0 f eng1 #0 sh ou4 #0 m a3 #0 x i1 #0 ^ iang2 #0 x i1 #0 x ia4 #0 d uan4 #0 ch ang2 #0 r en2 #0 z ai4 #0 t ian1 #0 ^ ia2 sil
index = 0, cost = 2.2670693397521973
phoneme seq: sil zh e4 #0 sh iii4 #0 ^ i2 #0 g e4 #0 k ai1 #0 ^ van2 #0 d e5 #0 d uan1 #0 d ao4 #0 d uan1 #0 zh ong1 #0 ^ uen2 #0 ^ v3 #0 ^ in1 #0 h e2 #0 ch eng2 #0 x i4 #0 t ong3 sil
index = 1, cost = 1.7127163410186768
phoneme seq: sil h a1 #0 d eng1 #0 zh u3 #0 d ong4 #0 j iang4 #0 x in1 #0 n ian2 #0 ^ uan4 #0 m ei3 #0 j in1 #0 x v4 #0 l iou2 #0 r en2 #0 d uei4 sil
index = 2, cost = 1.4303524494171143
phoneme seq: sil ^ u2 #0 ^ a5 #0 p ing2 #0 h ai2 #0 m ei2 #0 zh ao3 #0 d ao4 #0 x van2 #0 z ang4 #0 s ii4 #0 ch uan2 #0 zh en1 #0 h e2 #0 sh ang4 #0 q ve4 #0 r ang4 #0 ^ uo3 #0 j ing1 #0 d ai1 #0 l e5 sil
index = 3, cost = 1.7882094383239746
mean cost = 1.7995868921279907
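The log above has the shape of a simple timing loop. Below is a minimal sketch of how such per-sentence costs and their mean could be measured. The synthesize callable is a hypothetical stand-in for the demo's text-to-audio step, and the use of time.perf_counter is an assumption; the original script may use a different timer. The last lines recompute the mean from the four costs reported above.

```python
import time


def time_synthesis(sentences, synthesize):
    """Time synthesize(sentence) for each sentence; return (costs, mean)."""
    if not sentences:
        raise ValueError("need at least one sentence")
    costs = []
    for index, sentence in enumerate(sentences):
        start = time.perf_counter()
        synthesize(sentence)  # hypothetical: text -> audio
        cost = time.perf_counter() - start
        print(f"index = {index}, cost = {cost}")
        costs.append(cost)
    mean_cost = sum(costs) / len(costs)
    print(f"mean cost = {mean_cost}")
    return costs, mean_cost


# Costs copied from the log above.
REPORTED_COSTS = [
    2.2670693397521973,
    1.7127163410186768,
    1.4303524494171143,
    1.7882094383239746,
]
reported_mean = sum(REPORTED_COSTS) / len(REPORTED_COSTS)
print(f"mean of reported costs = {reported_mean}")
```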
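Known limitations: numbers are not recognized, and only Chinese text is supported. In the third phoneme sequence above, for example, the digits from the input sentence are missing.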
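One possible workaround for the missing digits, sketched under assumptions rather than taken from the project, is to spell out each Arabic digit as a Chinese numeral before passing the text to the synthesizer. The function name spell_out_digits is hypothetical. This reads numbers digit by digit, so "1450" becomes "一四五零" rather than "一千四百五十"; proper text normalization would need more than this.

```python
# Map each Arabic digit to the corresponding Chinese numeral.
DIGIT_TABLE = str.maketrans("0123456789", "零一二三四五六七八九")


def spell_out_digits(text):
    """Replace every Arabic digit in text with a Chinese numeral."""
    return text.translate(DIGIT_TABLE)


if __name__ == "__main__":
    print(spell_out_digits("2年1450万"))
```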