cs230

Commits

List of commits on branch master.

711a974da99face18c72fbdbb121233732d5bbae

some wav files to make graphs from

ddavidsgrogan committed 6 years ago

5448cd015c873ffe3abd5d15c92f7043c8baf738

Added speaker_20_file_17_cumulative_minute_1101_NOISEplastic.npy

ddavidsgrogan committed 6 years ago

a757052e0c9da107e8150d76e4362551d8f8328a

"Last-ditch" model with 94 train, 86 dev, haven't run a test set yet

ddavidsgrogan committed 6 years ago

0bf94c322269581e1f4bf25ff3cded85071b4a31

Results of large model from previous commit

ddavidsgrogan committed 6 years ago

32d3be984513ef366e33794f496ca3fa00c12866

Running a huge model on noisy data overnight.

ddavidsgrogan committed 6 years ago

d043933882b503d8bdcbac16fca8ccdd2bfa361a

94/90 when running this on the clean data

ddavidsgrogan committed 6 years ago

README

Convert the original 22050 Hz MP3 to an 8000 Hz WAV file:

ffmpeg -i upperroom_16_ryle_64kb.mp3 -acodec pcm_s16le -ar 8000 upperroom_16_ryle_8khz.wav

Convert it to a raw binary file containing nothing but the samples, 8000 16-bit little-endian integers per second of audio:

ffmpeg -i upperroom_16_ryle_64kb.mp3 -f s16le -acodec pcm_s16le -ar 8000 upperroom_16_ryle_8khz.raw

The two files are the same size apart from the WAV header, which records the sampling rate, bit depth, and similar metadata.
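The header overhead can be checked without ffmpeg. The sketch below is only an illustration: it builds one second of synthetic 8000 Hz, 16-bit audio (the sample values are made up, and mono is an assumption, since the commands above do not set a channel count), writes it through Python's standard `wave` module, and compares the result with the bare samples. A minimal PCM WAV header is 44 bytes; a file written by ffmpeg may carry extra metadata chunks and so have a somewhat larger header.

```python
import io
import struct
import wave

SAMPLE_RATE = 8000   # Hz, matching -ar 8000
SAMPLE_WIDTH = 2     # bytes per sample, matching pcm_s16le

# One second of made-up samples (a repeating ramp), packed as
# little-endian signed 16-bit integers, i.e. the layout of the .raw file.
values = [(i % 200) - 100 for i in range(SAMPLE_RATE)]
raw_bytes = struct.pack(f"<{len(values)}h", *values)

# Write the same samples as a WAV file into an in-memory buffer.
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(1)            # assumption: mono audio
    w.setsampwidth(SAMPLE_WIDTH)
    w.setframerate(SAMPLE_RATE)
    w.writeframes(raw_bytes)
wav_bytes = buf.getvalue()

# The only difference between the two is the header in front of the samples.
header_size = len(wav_bytes) - len(raw_bytes)
duration_seconds = len(raw_bytes) / (SAMPLE_RATE * SAMPLE_WIDTH)

print("raw size:", len(raw_bytes), "bytes")
print("wav size:", len(wav_bytes), "bytes")
print("header size:", header_size, "bytes")
print("duration:", duration_seconds, "s")
```

The same arithmetic gives the duration of a real .raw file: its size in bytes divided by 16000 (8000 samples per second times 2 bytes per sample), again assuming mono.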

To move 5% of each speaker's data into the test set, I ran the following from the data_np_save/ directory:

for i in $(seq 2 20); do ls -1 speaker_${i}_*npy | shuf -n 55 | xargs -I{} mv {} test_only; done

The 55 comes from each speaker having 1102 files: 5% of 1102 is 55.1, rounded down to 55.
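For reference, the same selection can be sketched in Python. This is not the script used for the project, just an illustration of the arithmetic and the per-speaker sampling: the function name `pick_test_files`, the simplified file names, and the fixed seed are all hypothetical, and the sketch only selects names without moving any files.

```python
import random
import re
from collections import defaultdict


def pick_test_files(filenames, fraction=0.05, seed=0):
    """Return {speaker_id: [file names chosen for the test set]}.

    Hypothetical helper: groups names of the form speaker_<id>_... by
    speaker and samples a fixed fraction from each group.
    """
    by_speaker = defaultdict(list)
    for name in filenames:
        match = re.match(r"speaker_(\d+)_", name)
        if match:
            by_speaker[int(match.group(1))].append(name)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    picked = {}
    for speaker, files in sorted(by_speaker.items()):
        # Round down: 1102 files * 0.05 = 55.1 -> 55 test files.
        n_test = int(len(files) * fraction)
        picked[speaker] = rng.sample(sorted(files), n_test)
    return picked


# Made-up names standing in for the real .npy files: speakers 2..20,
# 1102 files each, as in the shell loop.
names = [
    f"speaker_{s}_file_{i}.npy"
    for s in range(2, 21)
    for i in range(1102)
]
picked = pick_test_files(names)
print(len(picked), "speakers,", len(picked[2]), "test files each")
```

Unlike `shuf`, the seeded generator produces the same split on every run, which can be convenient if the test set ever has to be rebuilt.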