hct


Commits

List of commits on branch main.
e2716c41fd7f34dd0e92de136f140b3890457a65
Camera-ready fixes and updated models
llisjin committed 2 years ago

bcd19ecfa30d049f1b409a427a009548e2fcd363
Give tokenized inputs to corpus_bleu
llisjin committed 3 years ago

a092d5f301234a4ca6511bd1c7f5f06bf3279e96
SRL analysis fix and English version
llisjin committed 3 years ago

6a3956df1b1bd4433a81c5bae2185d712706a80f
Reorganize slightly
llisjin committed 3 years ago

7a14d1b31d666556cd3f5da25ee561916e82b838
Update README.md
llisjin committed 3 years ago

da50492a5f4e62d9d6d556b6e7d76d9ba2177514
Add data and checkpoint download instructions
llisjin committed 3 years ago

README

The README file for this repository.

Hierarchical Context Tagging for Utterance Rewriting

Data Preprocessing

To extract labels and rules automatically, run the following commands. For the Chinese Rewrite dataset, replace data_preprocess_en with data_preprocess_zh. You can replace pipeline_canard.sh with pipeline_mudoco.sh or, for Rewrite, pipeline.sh.

Download preprocessed data

Download RaST_data.tar.gz from Google Drive, then extract it and move each dataset into its preprocessing directory:

tar -xzvf RaST_data.tar.gz
mv RaST_data/canard* data_preprocess_en
mv RaST_data/mudoco* data_preprocess_en
mv RaST_data/rewrite* data_preprocess_zh
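
After the files are moved, a quick sanity check can confirm that each preprocessing directory received its datasets. The following is a minimal sketch that assumes only the directory names and file prefixes used in the commands above. The helpers has_prefix and check_data are hypothetical and not part of the repository.

```shell
#!/bin/sh
# has_prefix DIR PREFIX: succeed if DIR contains at least one entry
# whose name starts with PREFIX. Hypothetical helper, not from the repo.
has_prefix() {
    for entry in "$1"/"$2"*; do
        # If the glob matched nothing, the pattern stays literal and -e fails.
        [ -e "$entry" ] && return 0
    done
    return 1
}

# check_data: report which of the expected datasets are present,
# relative to the current directory. Returns non-zero if any is missing.
check_data() {
    status=0
    for pair in data_preprocess_en:canard data_preprocess_en:mudoco data_preprocess_zh:rewrite; do
        dir=${pair%%:*}      # text before the colon
        prefix=${pair#*:}    # text after the colon
        if has_prefix "$dir" "$prefix"; then
            echo "ok: $dir/$prefix*"
        else
            echo "missing: $dir/$prefix*"
            status=1
        fi
    done
    return $status
}
```

Calling check_data from the directory that holds data_preprocess_en and data_preprocess_zh prints one line per dataset, so a failed or partial mv is easy to spot.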

Training

Return to the root directory and edit line 22 of train.sh so that it points to the model directory under experiments/ that contains params.json. Then run sh train.sh <dataset>. The top two checkpoints are saved in that model directory, each in a subdirectory named after its epoch number (e.g., experiments/canard/05).
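
Because training depends on params.json being present in the model directory, it can help to check for the file before launching a run. The following is a minimal sketch. The helper require_params is hypothetical, and the directory experiments/canard in the commented example is taken from the path example above rather than from train.sh itself.

```shell
#!/bin/sh
# require_params MODEL_DIR: succeed only if MODEL_DIR holds a params.json file.
# Hypothetical helper, not part of the repository.
require_params() {
    if [ -f "$1/params.json" ]; then
        return 0
    fi
    echo "error: $1/params.json not found" >&2
    return 1
}

# Example (not executed here): guard a CANARD training run.
# require_params experiments/canard && sh train.sh canard
```

The guard only checks the directory passed to it. Line 22 of train.sh still has to be edited by hand to name the same directory.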

Model checkpoints

Download from these links:

tar -xzvf <checkpoint_tar>  # e.g., canard21_03-16.tar.gz
mv <checkpoint_dir> experiments  # e.g., canard21_03-16

Evaluation

Modify line 16 of test.sh to point to the correct model directory. Then run the following command:

sh test.sh <dataset> <epoch_number>  # e.g., sh test.sh canard 05

From checkpoints

The best-performing model for each dataset is listed below.

Dataset   Path
CANARD    experiments/canard21_03-16/05
MuDoCo    experiments/mudoco21_03-16/19
Rewrite   experiments/rewrite21_03_19/19

To evaluate an existing checkpoint, modify line 16 of test.sh to point to the correct checkpoint directory. The best epoch number for each checkpoint is the final component of its path in the table above (e.g., 05 for CANARD).
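
Each path in the table combines the checkpoint directory with the best epoch, so it can be split into the two values that evaluation needs. The following is a minimal sketch, assuming the final path component is always the epoch number. The helper split_checkpoint is hypothetical and not part of the repository.

```shell
#!/bin/sh
# split_checkpoint PATH: print "<model_dir> <epoch>" for a path from the table.
# Assumes the last path component is the epoch number (hypothetical helper).
split_checkpoint() {
    path=${1%/}                      # drop one trailing slash, if any
    echo "${path%/*} ${path##*/}"    # everything before / after the last slash
}

# Example (not executed here):
# split_checkpoint experiments/canard21_03-16/05
```

The first value is the directory that line 16 of test.sh should point to, and the second is the epoch passed on the command line, as in sh test.sh canard 05.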