
tensorflow-speech_commands

Public · 6 stars · 5 forks · 1 issue

Commits

List of commits on branch master.
2e5017155d234d0eb685914374a1067037560b60 - Update README.md (rrwightman committed 7 years ago, Verified)
33a28dc1e7ab145385814625285b37b912601582 - Update README.md (rrwightman committed 7 years ago, Verified)
eaa9d8430a45430f04bc6b28ac65e12208351cd2 - README (rrwightman committed 7 years ago, Unverified)
8717aecc4827ba095503134b87fbf662c13121cc - Update README (rrwightman committed 7 years ago, Unverified)
8049357fdfa852406430a9b44227567320fa515d - The last changes before moving on to PyTorch... (rrwightman committed 7 years ago, Unverified)
e25927ddbebb4209d2b102052ee5cc711eac57ae - Fix some silly slim layer params and tweak some other net architectures (rrwightman committed 7 years ago, Unverified)

README


Kaggle Speech Recognition Challenge

This was my first crack at the Kaggle TensorFlow Speech Recognition Challenge (https://www.kaggle.com/c/tensorflow-speech-recognition-challenge).

It is a TensorFlow-based solution that combines the TensorFlow speech_commands example (as the base, https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/speech_commands) with the VGGish model from the TensorFlow models AudioSet research code (https://github.com/tensorflow/models/tree/master/research/audioset) and some custom models that I built.

The models and training/validation code were updated to the TF Slim style.
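As a rough illustration of what "TF Slim style" means here (this is a minimal sketch, not the actual model code from this repo; the layer sizes and the 12-class output are assumptions), a model definition in that style looks roughly like:

```python
# Illustrative sketch only -- not one of the actual models in this repo.
# Shows the TF Slim style: slim.arg_scope for shared layer defaults and
# slim layer helpers instead of raw tf.nn ops (TF 1.x).
import tensorflow as tf

slim = tf.contrib.slim


def speech_model(spectrogram, num_classes=12, is_training=True):
    """Small 2D conv net over an MFCC/log-mel 'image' [batch, time, freq, 1]."""
    with slim.arg_scope([slim.conv2d, slim.fully_connected],
                        activation_fn=tf.nn.relu,
                        weights_regularizer=slim.l2_regularizer(1e-4)):
        net = slim.conv2d(spectrogram, 64, [3, 3], scope='conv1')
        net = slim.max_pool2d(net, [2, 2], scope='pool1')
        net = slim.conv2d(net, 128, [3, 3], scope='conv2')
        net = slim.max_pool2d(net, [2, 2], scope='pool2')
        net = slim.flatten(net)
        net = slim.dropout(net, 0.5, is_training=is_training, scope='dropout')
        logits = slim.fully_connected(net, num_classes,
                                      activation_fn=None, scope='logits')
    return logits
```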

I was never able to get great results with this code base: 0.86 on the competition LB was the max, and 0.82-0.85 was more typical for most training rounds. I suspect most of that was due to the default data handling and augmentation I was relying on; there wasn't a whole lot of difference from model to model. In the dying days of the challenge I quickly surpassed all the results I achieved with these models after a few days of hacking around in PyTorch (https://github.com/rwightman/pytorch-commands). NOTE: At the point I moved on I was experimenting with a new librosa-based input data pipeline; it was in place but slow and had some multiprocessing hangups on shutdown/ctrl-c.
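For context, a librosa-based feature pipeline of the kind mentioned above typically boils down to something like the following sketch. This is not the pipeline from this repo; the 16 kHz sample rate, 1 s clips, 40 MFCCs, and the specific augmentation parameters are assumptions roughly matching the speech_commands defaults.

```python
# Sketch of a librosa-based feature/augmentation pipeline -- not the
# experimental pipeline referenced above, just the general idea.
import numpy as np
import librosa


def load_clip(path, sample_rate=16000, duration=1.0):
    """Load a clip and pad/trim it to a fixed length."""
    samples, _ = librosa.load(path, sr=sample_rate)
    target_len = int(sample_rate * duration)
    if len(samples) < target_len:
        samples = np.pad(samples, (0, target_len - len(samples)))
    return samples[:target_len]


def augment(samples, bg_noise=None, shift_max=1600, noise_vol=0.1):
    """Random time shift plus optional background-noise mixing."""
    shift = np.random.randint(-shift_max, shift_max)
    samples = np.roll(samples, shift)
    if bg_noise is not None:
        start = np.random.randint(0, len(bg_noise) - len(samples))
        samples = samples + noise_vol * bg_noise[start:start + len(samples)]
    return samples


def mfcc_features(samples, sample_rate=16000, n_mfcc=40):
    """Compute an MFCC 'image' [time, n_mfcc] for the 2D models."""
    mfcc = librosa.feature.mfcc(y=samples, sr=sample_rate, n_mfcc=n_mfcc)
    return mfcc.T.astype(np.float32)
```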

One noteworthy thing I did experiment with here before moving to PyTorch was 1D convolution models. The performance of several custom 1D models was at or just below that of the 2D MFCC spectrogram based models, with 'conv1d_basic3' being the most promising of them.
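For reference, the 1D models operate on the raw waveform rather than a 2D spectrogram. A minimal sketch of that idea in TF 1.x is below; it is not the actual 'conv1d_basic3' definition, and the filter counts, kernel sizes, and strides are assumptions.

```python
# Sketch of a raw-waveform 1D conv classifier in the spirit of the custom
# 1D models described above -- not the actual 'conv1d_basic3' network.
# Input is the raw 16k-sample waveform shaped [batch, samples, 1].
import tensorflow as tf


def conv1d_model(waveform, num_classes=12, is_training=True):
    net = tf.layers.conv1d(waveform, 128, kernel_size=80, strides=4,
                           padding='same', activation=tf.nn.relu)
    net = tf.layers.max_pooling1d(net, pool_size=4, strides=4)
    net = tf.layers.conv1d(net, 256, kernel_size=3,
                           padding='same', activation=tf.nn.relu)
    net = tf.layers.max_pooling1d(net, pool_size=4, strides=4)
    net = tf.reduce_mean(net, axis=1)               # global average pool over time
    net = tf.layers.dropout(net, rate=0.5, training=is_training)
    return tf.layers.dense(net, num_classes)        # logits
```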