VoxCeleb1 Dataset Download
2022-07-25 10:30:00 【Haulyn5】
Preface
VoxCeleb1 is a widely used dataset for speaker recognition and verification. Because the audio is extracted from YouTube videos, it contains a wide variety of noise. (A fuller introduction will be added when I have time.)
If you can use Google Forms and translation software, you should be able to download it without trouble. Distributing the dataset privately risks infringement.
Main text
The official website is:
VoxCeleb
https://www.robots.ox.ac.uk/~vgg/data/voxceleb/
Surprisingly, as of 2022-7-12, all download links on this website have been removed.
VoxCeleb
https://www.robots.ox.ac.uk/~vgg/data/voxceleb/vox1.html
As you can see, only the metadata can be downloaded there; the audio files are temporarily unavailable.
After a long search I found that the link below still works. At first I worried that it was not an official site, but it turned out to belong to a South Korean laboratory that organized the fourth VoxCeleb Speaker Recognition Challenge (VoxSRC).
VoxCeleb
https://mm.kaist.ac.kr/datasets/voxceleb/
Before downloading, you need to fill in a Google Form, including the name of your organization. The process is automated, so shortly after submitting the form you should find an email in your inbox containing a user name and password.
The instructions there state that the credentials are valid for only 1 month.
Once you have the user name and password, the rest is easy. On Windows, you can find the dataset downloads in a browser at the link below. Because the files are large, the maintainers split them into pieces; the exact steps are described on the website. When you click a download link, you are prompted for the user name and password, and the download starts once you enter them.
VoxCeleb
https://mm.kaist.ac.kr/datasets/voxceleb/
As an extra, here is the download command for a Linux environment.
wget http://cnode01.mm.kaist.ac.kr/voxceleb/vox1a/vox1_test_wav.zip --http-user=username --http-passwd=password
Replace the link `http://cnode01.mm.kaist.ac.kr/voxceleb/vox1a/vox1_test_wav.zip` with the file you need to download, and replace username and password with your own credentials.
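As noted earlier, the largest archive is served in pieces. A minimal sketch of joining them before the checksum and unzip steps follows. It assumes the pieces share a common prefix such as vox1_dev_wav_part, which is an assumption here, so use the names and procedure given on the download page. join_parts is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Join split archive pieces into a single zip file.
# Usage: join_parts <piece_prefix> <output_file>
# The piece prefix below is an assumption; check the download page for real names.
join_parts() {
    local prefix="$1" output="$2"
    # The shell expands the glob in sorted order, so pieces ending in
    # aa, ab, ac, ... are concatenated in sequence.
    cat "$prefix"* > "$output"
}

# Example call (hypothetical piece names):
# join_parts vox1_dev_wav_part vox1_dev_wav.zip
```

The output name must not start with the piece prefix, otherwise the glob would also match the file being written.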
The website lists MD5 checksums, so you can easily verify the download.
md5sum vox1_dev_wav.zip
Then decompress the archive with the unzip command.
unzip -d vox1_dev_wav vox1_dev_wav.zip
With that, the main work is done. For how to use the dataset, search GitHub for voxceleb trainer. PyTorch users can also refer to torchaudio.datasets.voxceleb1 in the Torchaudio nightly documentation. This API is relatively new, so older versions may not include it.
Addendum
A note for readers who plan to use the dataset to train a model: training for the identification task also requires downloading the test data.
Reading the dataset directly with https://mm.kaist.ac.kr/datasets/voxceleb/meta/iden_split.txt raises an error, because the data for id10270-id10309 is missing. However, iden_split marks some data from speakers in this range as training data. I assumed I only needed the training data (since I was not doing ASV), so I did not download the test data. That turned out to be a mistake: the audio files could not be found.
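One way to catch this problem before training is to check every path in the split file against the extracted audio. The sketch below assumes each line of iden_split.txt has the form "<label> <speaker_id>/<video_id>/<file>.wav" and that both archives were extracted under the same wav directory. Both points are assumptions, so adjust them to your layout. find_missing_wavs is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Print the split-file entries whose audio file does not exist under wav_root.
# Usage: find_missing_wavs <split_file> <wav_root>
# Assumed line format: "<label> <speaker_id>/<video_id>/<file>.wav"
find_missing_wavs() {
    local split_file="$1" wav_root="$2" label rel_path
    # The extra test after read keeps a final line that lacks a trailing newline.
    while read -r label rel_path || [ -n "$label" ]; do
        # Skip blank or malformed lines.
        [ -n "$rel_path" ] || continue
        [ -f "$wav_root/$rel_path" ] || echo "$label $rel_path"
    done < "$split_file"
}

# Example call (hypothetical directory layout):
# find_missing_wavs iden_split.txt vox1_dev_wav/wav
```

If the output lists only speakers from id10270 to id10309, the test archive has probably not been downloaded or was extracted to a different directory.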








