Text to Speech with FastSpeech2

See the FastSpeech2 article and the FastSpeech article.

Example

The inference result is audio, but GitHub supports only video+audio formats, so the example below is uploaded as a video.

https://github.com/tgritsaev/fastspeech2/assets/34184267/80b357d5-6a8f-492d-a550-d8c83645e2f2

You can also download a folder with TTS results from Google Drive. It contains 27 audio files with different length, pitch, and energy settings for the first three inputs from test_model/input.txt.

Installation guide

Use Python 3.9

conda create -n fastspeech2 python=3.9 && conda activate fastspeech2

Install libraries

pip3 install -r requirements.txt

Download data

bash scripts/download_data.sh

Preprocess data: save pitch and energy

python3 scripts/preprocess_data.py

Download my final FastSpeech2 checkpoint

python3 scripts/download_checkpoint.py
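
The preprocessing step above saves pitch and energy for each utterance. As a rough illustration of the energy feature: the FastSpeech2 paper defines frame-level energy as the L2 norm of the STFT magnitude of each frame. The sketch below computes that on a toy magnitude matrix in plain Python. The frame_energy helper and the toy values are hypothetical, and scripts/preprocess_data.py may compute energy differently.

```python
import math
from typing import List


def frame_energy(magnitudes: List[List[float]]) -> List[float]:
    # Hypothetical helper, not taken from scripts/preprocess_data.py.
    # magnitudes[t][f] is the STFT magnitude of frame t at frequency bin f.
    # The energy of a frame is the L2 norm over its frequency bins.
    return [math.sqrt(sum(m * m for m in frame)) for frame in magnitudes]


if __name__ == "__main__":
    # Toy input: three frames with two frequency bins each.
    toy = [[3.0, 4.0], [0.0, 0.0], [6.0, 8.0]]
    print(frame_energy(toy))  # [5.0, 0.0, 10.0]
```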

Train

To train the model, run

python3 train.py -c configs/train.json

The final model was trained with the configs/train.json config.

Test

To test the model, run

python3 test.py

test.py accepts the following arguments:

Config path: -c, --config, default="configs/test.json"
Create multiple audio variants with different length, pitch and energy: -t, --test, default=False
Increase or decrease audio speed: -l, --length-control, default=1
Increase or decrease audio pitch: -p, --pitch-control, default=1
Increase or decrease audio energy: -e, --energy-control, default=1
Checkpoint path: -cp, --checkpoint, default="test_model/tts-checkpoint.pth"
Input texts path: -i, --input, default="test_model/input.txt"
Waveglow weights path: -w, --waveglow, default="waveglow/pretrained_model/waveglow_256channels.pt"
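
As a rough sketch, the arguments above could be declared with argparse as follows. The flag names and defaults come from the list. The argument types, the switch-style behavior of --test, and the build_parser helper name are assumptions, not the actual contents of test.py.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical helper: mirrors the documented flags, not copied from test.py.
    parser = argparse.ArgumentParser(description="FastSpeech2 inference (sketch)")
    parser.add_argument("-c", "--config", default="configs/test.json", type=str)
    # Assumed to be a boolean switch, since its documented default is False.
    parser.add_argument("-t", "--test", default=False, action="store_true")
    # Control values are assumed to be floats; 1.0 is the documented default of 1.
    parser.add_argument("-l", "--length-control", default=1.0, type=float)
    parser.add_argument("-p", "--pitch-control", default=1.0, type=float)
    parser.add_argument("-e", "--energy-control", default=1.0, type=float)
    parser.add_argument("-cp", "--checkpoint", default="test_model/tts-checkpoint.pth", type=str)
    parser.add_argument("-i", "--input", default="test_model/input.txt", type=str)
    parser.add_argument(
        "-w",
        "--waveglow",
        default="waveglow/pretrained_model/waveglow_256channels.pt",
        type=str,
    )
    return parser


if __name__ == "__main__":
    # Parse an example command line instead of sys.argv.
    args = build_parser().parse_args(["-l", "1.2", "-p", "0.9"])
    print(args.length_control, args.pitch_control, args.energy_control)
```

argparse turns the dashes in long flags into underscores, so --length-control is read back as args.length_control.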

Results are saved in test_model/results. That folder already contains an example.
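
To compare several control settings in one go, a small wrapper can build the test.py command lines. This is a minimal sketch: the build_command helper and the chosen control values are hypothetical, and only the -l, -p and -e flags documented for test.py are used.

```python
import subprocess
from typing import List


def build_command(length: float = 1.0, pitch: float = 1.0, energy: float = 1.0) -> List[str]:
    # Hypothetical helper: assembles a test.py call from the documented flags.
    return [
        "python3", "test.py",
        "-l", str(length),
        "-p", str(pitch),
        "-e", str(energy),
    ]


if __name__ == "__main__":
    # Illustrative speed values only; run from the repository root.
    for length in (0.8, 1.0, 1.2):
        subprocess.run(build_command(length=length), check=True)
```

Whether successive runs overwrite earlier files in test_model/results is not stated here, so check the output folder before running a sweep like this.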

Wandb Report

https://api.wandb.ai/links/tgritsaev/rkir8sp9 (English only)

Credits

This repository is based on a heavily modified fork of the pytorch-template repository. The FastSpeech2 implementation is based on code from the HSE "Deep Learning in Audio" course seminar and the official FastSpeech2 repository.
