This is the code to reproduce the experiments in the "Training Hybrid Language Models by Marginalizing over Segmentations" paper.
The code requires:
- A C++ compiler with good C++11 support (e.g. g++ >= 4.8)
- cmake (>= 3.5.1) and make
- ArrayFire (>= 3.6.1)
- flashlight
Please follow the flashlight installation instructions to build flashlight as well as this project.
The following command line can be used to reproduce experiments on the MWC datasets:
./transformer --data PATH/TO/DATA --bsz 8 --bptt 512 --d_model 512 --d_ff 2048 --n_blocks 12 --n_heads 4 --warmup 8000 --warmup_loss 20 --ngram 4 --threshold 200 --lr 0.025 --clip 0.1 --dropout 0.3 --use_cache --nepoch 30
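For repeated runs, the command above can be wrapped in a small shell function. This is only a sketch: `run_training`, `TRANSFORMER_BIN` and `DRY_RUN` are illustrative names that are not part of this project, and the flags are copied unchanged from the command above.

```shell
#!/bin/sh
# Sketch only: run_training, TRANSFORMER_BIN and DRY_RUN are hypothetical
# helper names, not part of the project. Flags are copied from the README.
TRANSFORMER_BIN="${TRANSFORMER_BIN:-./transformer}"

run_training() {
    data_dir="$1"
    # Store the command in the positional parameters so that a data path
    # containing spaces is passed as a single argument.
    set -- "$TRANSFORMER_BIN" --data "$data_dir" \
        --bsz 8 --bptt 512 \
        --d_model 512 --d_ff 2048 --n_blocks 12 --n_heads 4 \
        --warmup 8000 --warmup_loss 20 \
        --ngram 4 --threshold 200 \
        --lr 0.025 --clip 0.1 --dropout 0.3 \
        --use_cache --nepoch 30
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"    # print the command instead of running it
    else
        "$@"         # run the training binary
    fi
}
```

Calling `DRY_RUN=1 run_training PATH/TO/DATA` prints the full command without starting training, which is a quick way to check the arguments before a long run.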
The code is licensed under CC-BY-NC, as found in the LICENSE file.