FastSpeech 2 - PyTorch Implementation

This repository is an extended PyTorch implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech". It was initially based on xcmyz's implementation, and its core code structure is derived from ming024's FastSpeech2 implementation.
We modify it to train and run inference on phonological feature vectors instead of phoneme IDs, targeting cross-lingual and low-resource speech synthesis. Because phonological features are shared across languages, this input representation is intended to make training more linguistically informed and to generalize better across languages. With this version, we trained a German baseline TTS model and then applied transfer learning with a small amount of English data to obtain an English model.

Our method is inspired by the use of cross-lingual phonological information described in the paper:

"Cross-lingual Transfer of Phonological Features for Low-resource Speech Synthesis"
SSW11 Paper PDF

We also refer to the PHOIBLE database for phonological feature definitions and mappings.
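As a rough illustration of what a phonological feature input looks like, the sketch below turns phonemes into fixed-length vectors. The feature names follow PHOIBLE's naming, but the six-feature subset, the hand-written table `TOY_TABLE`, the `+1/-1/0` encoding, and all function names are assumptions made for this example; the mappings used in this repository's text/ folder may differ.

```python
# Feature names follow PHOIBLE's naming; the subset is chosen for illustration.
FEATURES = [
    "syllabic",
    "consonantal",
    "sonorant",
    "nasal",
    "labial",
    "periodicGlottalSource",  # PHOIBLE's name for voicing
]

# Hand-written toy table (assumption: not copied from PHOIBLE or this repo).
# Each string holds one value per entry in FEATURES, in the same order.
TOY_TABLE = {
    "p": "-+--+-",
    "b": "-+--++",
    "m": "-+++++",
    "a": "+-+--+",
}

# Assumed numeric encoding of the ternary feature values.
VALUE_MAP = {"+": 1.0, "-": -1.0, "0": 0.0}


def phoneme_to_vector(phoneme):
    """Return the feature vector for one phoneme as a list of floats."""
    return [VALUE_MAP[value] for value in TOY_TABLE[phoneme]]


def encode(phonemes):
    """Encode a phoneme sequence as a list of feature vectors."""
    return [phoneme_to_vector(p) for p in phonemes]


def differing_features(a, b):
    """List the feature names on which two phonemes disagree."""
    va, vb = phoneme_to_vector(a), phoneme_to_vector(b)
    return [name for name, x, y in zip(FEATURES, va, vb) if x != y]


sequence = encode(["m", "a", "p"])
print(len(sequence), len(sequence[0]))  # 3 6
print(differing_features("p", "b"))     # ['periodicGlottalSource']
```

Unlike phoneme IDs, these vectors make related sounds look related: /p/ and /b/ differ in a single feature here, which is the property that cross-lingual transfer relies on.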

The overall training and synthesis pipeline follows the structure of ming024's FastSpeech2 implementation. We made the following key modifications to support phonological feature-based modeling:

text/ folder: contains modified files that prepare phonological feature data.
transformer/models.py: updated to accept phonological feature vectors as model input instead of phoneme IDs.
synthesize.py: modified to support inference with phonological features as input.
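One common way to let a Transformer encoder accept feature vectors instead of phoneme IDs is to swap the ID embedding lookup for a linear projection. The sketch below shows that idea only; the class name `FeatureInputLayer`, its arguments, and the dimensions are hypothetical, and the actual change in transformer/models.py may be implemented differently.

```python
import torch
import torch.nn as nn


class FeatureInputLayer(nn.Module):
    """Hypothetical stand-in for an nn.Embedding over phoneme IDs.

    Projects each phonological feature vector to the encoder dimension.
    """

    def __init__(self, n_features, d_model):
        super().__init__()
        self.proj = nn.Linear(n_features, d_model)

    def forward(self, features, lengths):
        # features: (batch, max_len, n_features); lengths: (batch,)
        hidden = self.proj(features)
        max_len = features.size(1)
        positions = torch.arange(max_len, device=features.device)
        # True where a position lies beyond the real sequence length.
        pad_mask = positions[None, :] >= lengths[:, None]
        # Zero out padded positions so the bias term does not leak into them.
        hidden = hidden.masked_fill(pad_mask.unsqueeze(-1), 0.0)
        return hidden, pad_mask


layer = FeatureInputLayer(n_features=6, d_model=8)
batch = torch.zeros(2, 4, 6)     # two padded sequences of feature vectors
lengths = torch.tensor([4, 2])   # true lengths before padding
hidden, pad_mask = layer(batch, lengths)
print(hidden.shape)              # torch.Size([2, 4, 8])
```

Because the projection weights are tied to features rather than to a language-specific phoneme inventory, the same layer can process phonemes from a new language, provided they can be described with the same feature set.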

Repository: DongJiashu/FastSpeech2_phonological_features