
PyTorch Implementation of Onsets and Frames

This is a PyTorch implementation of Google's Onsets and Frames model, using the Maestro dataset for training and the Disklavier portion of the MAPS database for testing.

Instructions

This project is quite resource-intensive; 32 GB or more of system memory and 8 GB or more of GPU memory are recommended.

Downloading Dataset

The data subdirectory already contains the MAPS database. To download the Maestro dataset, first make sure that the ffmpeg executable is available, then run the prepare_maestro.sh script:

ffmpeg -version
cd data
./prepare_maestro.sh

This will download the full Maestro dataset from Google's server, then automatically unzip it and encode the audio as FLAC files to save storage. However, you'll still need about 200 GB of free space for intermediate storage.

Training

All package requirements are contained in requirements.txt. To train the model, run:

pip install -r requirements.txt
python train.py

train.py is written using sacred, and accepts configuration options such as:

python train.py with logdir=runs/model iterations=1000000

Trained models will be saved in the specified logdir, or in a timestamped directory under runs/ if logdir is not given.
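
Sacred's with key=value syntax overrides entries of the experiment's config. As a rough sketch of the override semantics only (this is not sacred's actual implementation, and the default values below are hypothetical):

```python
# Minimal sketch of sacred-style "with key=value" overrides.
# NOT sacred's implementation; it only illustrates the semantics.
import ast

def apply_overrides(config, args):
    """Return a copy of `config` updated with "key=value" override strings."""
    updated = dict(config)
    for arg in args:
        key, _, raw = arg.partition("=")
        try:
            value = ast.literal_eval(raw)  # parse numbers, booleans, lists, ...
        except (ValueError, SyntaxError):
            value = raw                    # fall back to a plain string
        updated[key] = value
    return updated

# Hypothetical defaults, for illustration only
defaults = {"logdir": None, "iterations": 500000}
config = apply_overrides(defaults, ["logdir=runs/model", "iterations=1000000"])
```

Values are parsed as Python literals where possible, which is why iterations=1000000 arrives as an integer rather than a string.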

Testing

To evaluate the trained model using the MAPS database, run the following command to calculate the note and frame metrics:

python evaluate.py runs/model/model-100000.pt

Specifying --save-path will output the transcribed MIDI file along with the piano roll images:

python evaluate.py runs/model/model-100000.pt --save-path output/

In order to test on the Maestro dataset's test split instead of the MAPS database, run:

python evaluate.py runs/model/model-100000.pt Maestro test
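
The frame metrics treat transcription as a per-frame multi-label problem: precision, recall, and F1 are computed over the binarized piano roll. A minimal, dependency-free sketch of that computation (illustrative only; evaluate.py's actual implementation may differ in details):

```python
# Sketch of frame-level precision/recall/F1 on binarized piano rolls,
# represented here as one set of active MIDI pitches per frame.

def frame_metrics(reference, estimate):
    """Compute precision, recall, and F1 over aligned frames."""
    tp = fp = fn = 0
    for ref, est in zip(reference, estimate):
        tp += len(ref & est)   # pitches active in both
        fp += len(est - ref)   # predicted but not in the reference
        fn += len(ref - est)   # in the reference but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

ref = [{60, 64}, {60}, set()]   # reference roll: 3 frames
est = [{60}, {60, 62}, set()]   # estimated roll
p, r, f = frame_metrics(ref, est)
```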

Implementation Details

This implementation contains a few of the additional improvements to the model reported in the Maestro paper, including:

  • Offset head
  • Increased model capacity, making it 26M parameters by default
  • Gradient stopping of inter-stack connections
  • Per-parameter L2 gradient clipping at 3
  • Using the HTK mel frequencies
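
The HTK mel scale mentioned above maps a frequency f in Hz to mel via m = 2595 · log10(1 + f / 700); this is the htk=True variant in librosa, as opposed to the Slaney-style scale librosa uses by default. A quick sketch of the conversion:

```python
# HTK mel scale: m = 2595 * log10(1 + f / 700), and its inverse.
import math

def hz_to_mel_htk(f):
    """Convert a frequency in Hz to the HTK mel scale."""
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz_htk(m):
    """Convert an HTK mel value back to Hz."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)
```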

Meanwhile, this implementation does not include the following features:

  • Variable-length input sequences that are sliced at silence or zero crossings
  • Harmonically decaying weights on the frame loss

Despite these omissions, this implementation achieves performance comparable to what the Maestro paper reports without data augmentation.