HAIS18

Python project that implements the proposal described in the paper:

A Hybrid Approach to Mining Conditions - HAIS, 2018 - Fernando O. Gallego and Rafael Corchuelo

Repository contents

datasets/ Datasets' folder
- dataset-en
- dataset-en-lite
- dataset-es
- dataset-es-lite
models/ Word2vec models' folder
- w2v-modelv2-en
- w2v-modelv2-en
LICENCE
candidates_creator.py
main.py
model_factory.py
README
validation.py
word_preprocessing.py
word_vectorizer.py

main.py is the entry point for our experiments. It takes the following positional parameters, in this order:

- relative path of the dataset file
- language of the dataset (e.g. en)
- relative path of the word2vec model file
- number of folds for k-fold cross-validation
- deep learning model to use in the experiment (name of a class inside model_factory.py)
- relative path of the output CSV file with the performance results
- score threshold used to decide whether a candidate is a condition

Example:

python main.py datasets/dataset-en en models/w2v-modelv2-en 4 ModelA results/results-ModelA-en.csv 0.75
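
The sketch below shows one way the seven positional parameters could be read with the standard argparse module. It is only an illustration and not the actual contents of main.py: the argument names (dataset_path, w2v_model_path, and so on) and the is_condition helper are hypothetical, and whether the real script treats a score equal to the threshold as a condition is an assumption.

```python
import argparse


def build_parser():
    """Build a parser for the seven positional parameters, in README order.

    All argument names here are hypothetical; main.py may name them differently.
    """
    parser = argparse.ArgumentParser(description="Run a condition-mining experiment.")
    parser.add_argument("dataset_path", help="relative path of the dataset file")
    parser.add_argument("language", help="language of the dataset, e.g. 'en'")
    parser.add_argument("w2v_model_path", help="relative path of the word2vec model file")
    parser.add_argument("folds", type=int, help="number of folds for k-fold cross-validation")
    parser.add_argument("model_name", help="name of a class inside model_factory.py")
    parser.add_argument("output_csv", help="relative path of the output CSV file")
    parser.add_argument("threshold", type=float, help="score threshold for conditions")
    return parser


def is_condition(score, threshold):
    # Assumption: a score at or above the threshold counts as a condition.
    return score >= threshold


# Parse the same values as the example command, passed as an explicit list
# so the sketch does not depend on how the script is launched.
example = [
    "datasets/dataset-en", "en", "models/w2v-modelv2-en",
    "4", "ModelA", "results/results-ModelA-en.csv", "0.75",
]
args = build_parser().parse_args(example)
print(args.folds, args.model_name, args.threshold)  # prints: 4 ModelA 0.75
```

Declaring folds as int and threshold as float lets argparse reject malformed values before any dataset or model is loaded.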