Choose a PyTorch container from the NVIDIA PyTorch Container Versions page and create a Dockerfile at /text2speech/docker/Dockerfile
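A minimal Dockerfile sketch (the 23.05-py3 tag and the librosa dependency are assumptions; substitute the container version you chose and whatever extra packages your notebooks need):

# Base image from the NGC catalog; pick the tag matching your CUDA/driver setup.
FROM nvcr.io/nvidia/pytorch:23.05-py3

# The NGC PyTorch images already ship with Jupyter; install extra project
# dependencies here (librosa is assumed for pyin pitch extraction).
RUN pip install --no-cache-dir librosa

WORKDIR /workspace
EXPOSE 8888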
- Build and run the Docker container
$ docker build --no-cache -t torcht2s .
$ docker run -it --rm --gpus all -p 2222:8888 -v /your/working/directory/text-to-speech/text2speech:/your/working/directory/text-to-speech/text2speech torcht2s
- Register the environment as a Jupyter kernel and launch Jupyter Notebook
$ python -m ipykernel install --user --name=torcht2s
$ jupyter notebook --ip=0.0.0.0 --port=8888 --no-browser --allow-root
- Open a browser on your local machine and navigate to
127.0.0.1:2222/?token=${TOKEN}
replacing ${TOKEN} with the token printed in your terminal.
Follow these steps to use a custom dataset.
- Prepare a directory with .wav files and filelists (the training/validation split of the data) containing transcripts and paths to the .wav files under
text2speech/Fastpitch/dataset/
Each filelist should contain a single utterance per line in the format:
<audio file path>|<transcript>
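For example (hypothetical path and transcript):
wavs/utt_0001.wav|Hello world, this is a sample utterance.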
- Preprocess the data
- Run the pre-processing script from text2speech/Fastpitch/data_preperation.ipynb to calculate pitch and mel spectrograms:
python prepare_dataset.py \
    --wav-text-filelists dataset/tts_data_train.txt \
                         dataset/tts_data_val.txt \
    --n-workers 0 \
    --batch-size 1 \
    --dataset-path dataset \
    --extract-pitch \
    --f0-method pyin \
    --extract-mels
- Prepare filelists with paths to the pre-calculated pitch by running
create_picth_text_file()
from text2speech/Fastpitch/data_preperation.ipynb
Each filelist should contain a single utterance per line in the format:
<audio file path>|<audio pitch .pt file path>|<transcript>
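A minimal Python sketch of what such a helper can look like (the name build_pitch_filelist and the assumption that each pitch .pt file shares its basename with the corresponding .wav are illustrative; the notebook's create_picth_text_file() may differ):

from pathlib import Path

def build_pitch_filelist(src_filelist, dst_filelist, pitch_dir="dataset/pitch"):
    # Read "<wav path>|<transcript>" lines and write
    # "<wav path>|<pitch .pt path>|<transcript>" lines, assuming each pitch
    # file shares its basename with the corresponding .wav file.
    with open(src_filelist) as src, open(dst_filelist, "w") as dst:
        for line in src:
            wav_path, transcript = line.rstrip("\n").split("|", 1)
            pitch_path = Path(pitch_dir) / (Path(wav_path).stem + ".pt")
            dst.write(f"{wav_path}|{pitch_path}|{transcript}\n")

build_pitch_filelist("dataset/tts_data_train.txt", "dataset/tts_data_pitch_train.txt")
build_pitch_filelist("dataset/tts_data_val.txt", "dataset/tts_data_pitch_val.txt")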
The complete dataset has the following structure:
./dataset
├── mels
├── pitch
├── wavs
├── tts_data_train.txt
├── tts_data_val.txt
├── tts_data_pitch_train.txt
└── tts_data_pitch_val.txt