speech_embeddings

Commits

List of commits on branch master.
e02b330c8d37b0be54ad0ac8d9c03bf5aea6467f
less phones folding
ssyhw committed 10 years ago

e909baa7517f0f3ad0708e6a657fef92c75a8183
phn2vec biclustering
ssyhw committed 11 years ago

db146af39eea746d57764af52941814906852814
serializing word2vec models of both CBOW and SG
ssyhw committed 11 years ago

98ac2f4a7b2fe3c0fcac8841779c85f534ddef2b
Merge branch 'master' of github.com:SnippyHolloW/speech_embeddings
ssyhw committed 11 years ago

c02bf4f817d2d7885af2983240f49e96b9924b15
having a little fun with TIMIT phonetics and translations
ssyhw committed 11 years ago

0543a2a54086c22298c54daf9d2fc1acd95f60ca
add vq for IFA
mmwv committed 11 years ago

Speech Embeddings

Using embedding-based loss functions for phonetics/speech recognition.

ABX-distance based embeddings:

emb_from_ab_dist.py

TODO write doc

"phn2vec" embeddings:

Phonetic annotations

There is no silver bullet: you need phonetically annotated speech corpora (e.g. TIMIT or the Buckeye corpus).

Phonemic annotations

You can also work on phonemic annotations. For that, you need to transform words into phonemes. I did a hack job using the CMU Pronouncing Dictionary:

python timit_words_to_phonemes.py

You need the TIMIT corpus with a train.scp that points to *.xyz files, each with a corresponding *.wrd file holding the word-level annotation (see the constant at the top of timit_words_to_phonemes.py).
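
To illustrate the word-to-phoneme step, here is a minimal sketch of a dictionary lookup. It is not the code of timit_words_to_phonemes.py. The helper names (load_cmudict, words_to_phonemes), the "<unk>" placeholder, and the tiny dictionary excerpt are assumptions made for the example. It assumes the plain-text CMUdict layout (one "WORD PH1 PH2 ..." entry per line, stress digits on vowels, ";;;" comment lines).

```python
import re

# Illustrative entries in CMUdict's plain-text layout (not a real excerpt).
CMUDICT_EXCERPT = """\
;;; comment lines start with three semicolons
CAT  K AE1 T
BUTTER  B AH1 T ER0
BUTTER(1)  B AH1 T AH0
"""


def load_cmudict(text):
    """Parse CMUdict-style text into {word: [phonemes]}.

    Hypothetical helper: keeps only the first pronunciation of each word
    and strips the stress digits from the vowels.
    """
    lexicon = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(";;;"):
            continue
        word, *phones = line.split()
        word = re.sub(r"\(\d+\)$", "", word).lower()  # drop "(1)" variant marker
        phones = [re.sub(r"\d", "", p).lower() for p in phones]  # drop stress
        lexicon.setdefault(word, phones)  # first pronunciation wins
    return lexicon


def words_to_phonemes(words, lexicon, unknown="<unk>"):
    """Replace each word by its phonemes; unknown words become a placeholder."""
    out = []
    for w in words:
        out.extend(lexicon.get(w.lower(), [unknown]))
    return out


lexicon = load_cmudict(CMUDICT_EXCERPT)
print(" ".join(words_to_phonemes(["cat", "butter"], lexicon)))
```

Keeping only the first pronunciation is the simplest choice. Real speech often uses the other variants, which is one reason the phonemic version differs from the phonetic one.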

How to train the embedding? (Using word2vec from gensim)

python mlf_to_text.py < ~/postdoc/datasets/TIMIT_train_dev_test/train/train.mlf >> timit_train_from_phones.txt

or

python mlf_to_text.py --forcealigned --timitfoldings < ~/postdoc/datasets/TIMIT_train_dev_test/aligned_train.mlf >> timit_train_from_phones.txt
python train_word2vec.py timit_train_from_phones.txt

Same for the Buckeye corpus.
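
The MLF-to-text conversion used above can be sketched roughly as follows. This is not the code of mlf_to_text.py and does not reproduce its --forcealigned or --timitfoldings options. It assumes the usual HTK master label file layout: a "#!MLF!#" header, a quoted label-file pattern opening each utterance, one label per line (optionally preceded by start and end times), and a lone "." closing the utterance. The function name mlf_to_sentences and the sample data are made up for the example.

```python
import io


def mlf_to_sentences(lines):
    """Yield one list of labels per utterance from HTK-style MLF lines.

    Hypothetical helper, assuming the layout described in the text above.
    """
    current = None
    for raw in lines:
        line = raw.strip()
        if not line or line == "#!MLF!#":
            continue
        if line.startswith('"'):   # e.g. "*/utt1.lab" opens a new utterance
            current = []
        elif line == ".":          # a lone dot closes the utterance
            if current is not None:
                yield current
            current = None
        elif current is not None:
            fields = line.split()
            # Aligned lines look like "start end label [...]";
            # plain lines hold only the label.
            if len(fields) >= 3 and fields[0].isdigit() and fields[1].isdigit():
                current.append(fields[2])
            else:
                current.append(fields[0])


# Made-up sample: one plain utterance and one with time stamps.
SAMPLE = '''#!MLF!#
"*/utt1.lab"
sil
k
ae
t
sil
.
"*/utt2.lab"
0 100000 sil
100000 400000 b
400000 900000 ah
.
'''

# One utterance per line, labels separated by spaces: each line then
# plays the role of a "sentence" and each phone the role of a "word".
for sentence in mlf_to_sentences(io.StringIO(SAMPLE)):
    print(" ".join(sentence))
```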

Comparing two embeddings is as simple as:

python train_word2vec.py timit_train_from_phones.txt timit_train_from_words.txt

or

python train_word2vec.py timit_train_from_phones.txt buckeye_train_from_phones.txt

Notes on the phone(me) annotations:

For the Buckeye corpus, "tq" (glottal stop in "cat") is folded to "sil".

For the TIMIT corpus, "dx" (flap in "butter") does not exist in the "words" (phonemic annotation) version.
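
A folding such as the "tq" to "sil" one can be expressed as a simple lookup table. The sketch below is only an illustration: the names BUCKEYE_FOLDINGS and fold_phones are assumptions, and the table holds only the single folding mentioned in the notes above, not the full set used by the scripts.

```python
# Only the "tq" -> "sil" entry comes from the notes above;
# any other folding would have to be added by hand.
BUCKEYE_FOLDINGS = {"tq": "sil"}


def fold_phones(phones, foldings):
    """Map each phone through the folding table, leaving unknown phones as-is."""
    return [foldings.get(p, p) for p in phones]


print(fold_phones(["k", "ae", "tq"], BUCKEYE_FOLDINGS))
```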