PixieVGModel

Commits

List of commits on branch main.
Verified
8907373b5c0bf33f6cb6b1fc4c08dcea26310d7f

Update README.md

williamLyh committed 3 years ago
Unverified
5bbd8edc282687af494f48b07a55929077b36abc

remove files

committed 3 years ago
Unverified
1f3bcd8e41aa1b8e44fb0a8be86546b679c41a73

Merge branch 'main' of https://github.com/williamLyh/PixieVGModel into main

committed 3 years ago
Unverified
f3129161626f9559881ebb6f9b6f877c6138a430

2nd commit

committed 3 years ago
Verified
0f1171870304dbd12e9777ea1b546fe7208bf3a8

Add data address

williamLyh committed 3 years ago
Unverified
93b9626ea28f0e6ff763a87e5adfe1484c4df46f

finishing code

committed 4 years ago

README

I'm still cleaning up this repo. Will release the final version soon.

PixieModel

Evaluation datasets

MEN: https://aclweb.org/aclwiki/MEN_Test_Collection_(State_of_the_art)

SimLex-999: https://fh295.github.io/simlex.html

Generate dataset from Visual Genome

--min_freq: Minimum number of occurrences a triple needs in order to be kept.

--pixie_dim: Dimension of the pixie, which is 100.

--data_path: Path of the filtered, transformed data.

--pca_path: Path of the PCA-transformed data.

python3 preprocessing.py \
    --VG_path=/the_path_of_your_VG_zip_data/ \
    --pixie_dim=100 \
    --data_path=/the_path_of_your_filtered_transformed_data/ \
    --pca_path=/the_path_of_your_pca_transformed_data/ \
    --pca_only=True \
    --min_freq=100
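
As an illustration of what a --min_freq threshold does, here is a minimal sketch. It assumes triples are plain 3-tuples of strings and that "minimum threshold" means "occurs at least this many times"; neither is confirmed by the repository, and filter_triples and the toy data are hypothetical names, not part of preprocessing.py.

```python
from collections import Counter


def filter_triples(triples, min_freq):
    """Keep only triples that occur at least min_freq times (assumed semantics)."""
    counts = Counter(triples)
    return [t for t in triples if counts[t] >= min_freq]


# Hypothetical toy data: one triple seen three times, another seen once.
triples = [("dog", "chase", "ball")] * 3 + [("cat", "sit_on", "mat")]
kept = filter_triples(triples, min_freq=2)
print(len(kept), "of", len(triples), "triples kept")
```

With the README's --min_freq=100, a triple would need 100 or more occurrences to survive under this reading.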

Train the PixieModel

--pixie_dim

--data_path

--parameter_path

--pca_path

--lr

--dr

--epoch_num

python3 pixie_model.py \
    --pixie_dim=100 \
    --pca_path=/the_path_of_your_pca_transformed_data/ \
    --parameter_path='parameters/' \
    --lr=0.01 \
    --dr=5e-8 \
    --epoch_num=20
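
The flags above could be parsed along the following lines. This is only a sketch of a command-line interface with the same flag names: the types and defaults are assumptions taken from the example command, the path data_pca/ is a placeholder, and pixie_model.py itself may be written differently.

```python
import argparse


def build_parser():
    # Flag names come from the README; types and defaults are assumptions.
    parser = argparse.ArgumentParser(description="Hypothetical training CLI sketch")
    parser.add_argument("--pixie_dim", type=int, default=100)
    parser.add_argument("--data_path", type=str, default=None)
    parser.add_argument("--pca_path", type=str, default=None)
    parser.add_argument("--parameter_path", type=str, default="parameters/")
    parser.add_argument("--lr", type=float, default=0.01)
    parser.add_argument("--dr", type=float, default=5e-8)
    parser.add_argument("--epoch_num", type=int, default=20)
    return parser


# Parse the same style of arguments as the command above (placeholder path).
args = build_parser().parse_args(
    ["--pixie_dim=100", "--pca_path=data_pca/", "--lr=0.01", "--dr=5e-8", "--epoch_num=20"]
)
print(args)
```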

Evaluate the PixieModel

--pixie_dim: Dimension of the pixie, default is 100.

--dataset: One of 'MEN', 'Simlek', 'RELPRON' or 'GS2011'.

--data_path

--pca_path

--parameter_path: The path of the trained parameters 'world_parameters.p' and 'Lexical_parameters.p'.

--use_EVA_vocab: Boolean; if set, only the vocabulary covered by the EVA work is used.

python3 evaluation.py --dataset=Simlek --pixie_dim=100 --data_path=pixie_data_2/ --pca_path=data_pca_2/
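
Word-similarity benchmarks such as MEN and SimLex-999 are conventionally scored by Spearman's rank correlation between the model's similarity scores and the human ratings. Whether evaluation.py computes it this way is an assumption; the sketch below only illustrates the metric, and average_ranks, spearman and the toy numbers are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of values tied with order[i].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks.

    Undefined (division by zero) if either input is constant.
    """
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Hypothetical toy scores for three word pairs.
model_scores = [0.9, 0.1, 0.5]
human_ratings = [9.0, 1.5, 4.0]
rho = spearman(model_scores, human_ratings)
print(rho)
```

Only the ordering of the scores matters for this metric, so the model's similarities do not need to be on the same scale as the human ratings.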