HAIS18

Python implementation of the proposal described in the paper:

A Hybrid Approach to Mining Conditions - HAIS, 2018 - Fernando O. Gallego and Rafael Corchuelo

Repository contents

  • datasets/ Datasets folder
    • dataset-en
    • dataset-en-lite
    • dataset-es
    • dataset-es-lite
  • models/ Word2vec models folder
    • w2v-modelv2-en
    • w2v-modelv2-en
  • LICENCE
  • candidates_creator.py
  • main.py
  • model_factory.py
  • README
  • validation.py
  • word_preprocessing.py
  • word_vectorizer.py

Requirements

  • Python 3.5.4 or above
  • Theano 0.9.0
  • Keras 2.0.8
  • NLTK 3.2.4 with punkt and SnowballData models installed.
  • Numpy 1.13.1
  • Scikit-learn 0.19.0
  • Gensim 2.3.0
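
The pinned versions above can be checked against the active environment before running the experiments. The following is only a sketch and not part of the repository: the function name check_versions and the PyPI distribution names in PINNED are assumptions, and importlib.metadata needs Python 3.8 or later, which is newer than the minimum version listed above.

```python
from importlib import metadata  # standard library from Python 3.8 onwards

# Versions copied from the Requirements list; the distribution names
# (as published on PyPI) are an assumption of this sketch.
PINNED = {
    "Theano": "0.9.0",
    "Keras": "2.0.8",
    "nltk": "3.2.4",
    "numpy": "1.13.1",
    "scikit-learn": "0.19.0",
    "gensim": "2.3.0",
}


def check_versions(pinned):
    """Return {package: (expected, installed)} for every mismatch.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


# Report every package whose installed version differs from the pinned one.
for name, (expected, installed) in check_versions(PINNED).items():
    print("{}: expected {}, found {}".format(name, expected, installed))
```

An empty report means every listed package is installed at exactly the pinned version. The NLTK punkt and Snowball data are downloaded separately and are not covered by this check.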

Usage

main.py is the entry point of our experiments. It takes the following positional command-line arguments, in this order:

  1. relative path to the dataset file
  2. language code of the dataset (for example, en)
  3. relative path to the word2vec model file
  4. number of folds for k-fold cross-validation
  5. deep learning model to use in the experiment (name of a class defined in model_factory.py)
  6. relative path to the output CSV file where the performance results are written
  7. score threshold used to decide whether a candidate is a condition
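
To make the order and types of the seven arguments explicit, they can be sketched with argparse. This is an illustrative sketch, not the parsing code in main.py: the function name build_parser and the argument names (dataset_path, language, and so on) are hypothetical, and the actual script may read sys.argv directly.

```python
import argparse


def build_parser():
    """Build a parser mirroring the seven positional arguments listed above.

    All argument names are hypothetical; only their order and meaning
    come from the README.
    """
    parser = argparse.ArgumentParser(
        description="Run a k-fold cross-validation experiment (sketch)."
    )
    parser.add_argument("dataset_path", help="relative path to the dataset file")
    parser.add_argument("language", help="language code of the dataset")
    parser.add_argument("w2v_model_path", help="relative path to the word2vec model file")
    parser.add_argument("n_folds", type=int, help="number of folds for cross-validation")
    parser.add_argument("model_name", help="name of a class in model_factory.py")
    parser.add_argument("output_csv", help="relative path to the output CSV file")
    parser.add_argument("score_threshold", type=float,
                        help="minimum score for a candidate to count as a condition")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch can run
# on its own.
args = build_parser().parse_args([
    "datasets/dataset-en", "en", "models/w2v-modelv2-en",
    "4", "ModelA", "results/results-ModelA-en.csv", "0.75",
])
print(args.n_folds, args.score_threshold)  # prints: 4 0.75
```

Declaring n_folds as int and score_threshold as float means malformed values are rejected before any dataset or model is loaded.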

Example of use:

python main.py datasets/dataset-en en models/w2v-modelv2-en 4 ModelA results/results-ModelA-en.csv 0.75
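
The final argument (0.75 in the example) is the score threshold that decides whether a candidate is considered a condition. How the score itself is computed is not described here, so the sketch below only illustrates the thresholding step. The function name classify_candidates is hypothetical, and treating a score equal to the threshold as a condition (>= rather than >) is an assumption.

```python
def classify_candidates(scores, threshold=0.75):
    """Label each candidate score as a condition (True) or not (False).

    Assumption of this sketch: a candidate whose score equals the
    threshold counts as a condition.
    """
    return [score >= threshold for score in scores]


# Three made-up scores, chosen only to show behaviour around the threshold.
print(classify_candidates([0.9, 0.75, 0.3], threshold=0.75))
# prints: [True, True, False]
```

Raising the threshold marks fewer candidates as conditions, and lowering it marks more.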