I'm still cleaning up this repo. Will release the final version soon.
MEN: https://aclweb.org/aclwiki/MEN_Test_Collection_(State_of_the_art)
SimLex-999: https://fh295.github.io/simlex.html
to set the minimum threshold of triple occurrence.
to set the dimension of pixie, which is 100.
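As a rough illustration of what a minimum occurrence threshold does, the sketch below keeps only the triples that appear at least a given number of times. The function name `filter_triples`, the `(subject, predicate, object)` tuple layout, and the toy data are assumptions made for this example; the filtering in `preprocessing.py` may work differently.

```python
from collections import Counter


def filter_triples(triples, min_count):
    """Keep only triples whose total occurrence count is at least min_count.

    Hypothetical helper for illustration; not taken from preprocessing.py.
    """
    counts = Counter(triples)
    return [t for t in triples if counts[t] >= min_count]


# Toy data: each triple is assumed to be (subject, predicate, object).
triples = [
    ("dog", "chase", "ball"),
    ("dog", "chase", "ball"),
    ("cat", "sit_on", "mat"),
]

# With a threshold of 2, the single "cat" triple is dropped.
kept = filter_triples(triples, min_count=2)
print(kept)
```

A higher threshold removes rare triples and shrinks the vocabulary; a lower one keeps more data at the cost of noisier counts.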
python3 preprocessing.py \
--VG_path=/the_path_of_your_VG_zip_data/ \
--pixie_dim=100 \
--data_path=/the_path_of_your_filtered_transformed_data/ \
--pca_path=/the_path_of_your_pca_transformed_data/ \
--pca_only=True
python3 pixie_model.py \
--pixie_dim=100 \
--pca_path=/the_path_of_your_pca_transformed_data/ \
--parameter_path='parameters/' \
--lr=0.01 \
--dr=5e-8
: Dimension of the pixie, default is 100.
: The dataset to evaluate on; one of 'MEN', 'Simlek', 'RELPRON', or 'GS2011'.
: Path to the trained parameter files 'world_parameters.p' and 'Lexical_parameters.p'.
: Boolean; if true, use only the vocabulary covered by the EVA work.
python3 evaluation.py --dataset=Simlek --pixie_dim=100 --data_path=pixie_data_2/ --pca_path=data_pca_2/
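Results on word-similarity datasets such as MEN and SimLex-999 are conventionally reported as Spearman's rank correlation between human ratings and model scores. The sketch below shows that computation with the standard library only. Whether `evaluation.py` computes its score this way is an assumption, and the function names and the numbers in the example are made up for illustration.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Made-up human ratings and model similarity scores for four word pairs.
human = [9.0, 7.5, 3.0, 1.2]
model = [0.81, 0.77, 0.40, 0.05]
rho = spearman(human, model)
print(rho)
```

Because only the ordering of the scores matters, the model's similarity values do not need to be on the same scale as the human ratings.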