people-occupations-classifier

public
5 stars
3 forks
1 issue

Commits

List of commits on branch master.
Unverified
a32c393f9c12e1c96cbd8acfc329057ac92ad084

remove ref tags and process data again (fixes #7)

sdabhi23 committed 6 years ago
Verified
d0668d02e6331a0a328320b4f5cf1cb15915b7dc

Merge pull request #5 from sdabhi23/fix/stemmed-words

sdabhi23 committed 6 years ago
Unverified
e4da290bc8dd854d96b6945fb1aa8fcbd0af0947

Stem data files

sdabhi23 committed 6 years ago
Verified
f54942e2e1a7addb9a662c6885e1ea83dcb6fb27

Merge pull request #4 from sdabhi23/feature/stop-words

sdabhi23 committed 6 years ago
Unverified
bc5edcbf9c45e5872e190bd5c52287d1558e9dee

Removed stop words

sdabhi23 committed 6 years ago
Unverified
770c5f2f3e6fbc04747b0f8fdc2f38956631ffa2

Dataset cleaned & converted to raw text

sdabhi23 committed 6 years ago

README

people-occupations-classifier

This project was undertaken as part of our undergraduate coursework to understand dataset generation and classification algorithms.

Getting started

  1. Clone this repo

    $ git clone https://github.com/sdabhi23/people-occupations-classifier.git
  2. Set up a virtual environment (optional)

    $ virtualenv .venv
    # for Windows
    $ .venv\Scripts\activate
    # for *nix
    $ source .venv/bin/activate
  3. Install the required libraries

    $ pip install -r requirements.txt
  4. Download the NLTK stop-word corpus (run in a Python shell)

    >>> import nltk
    >>> nltk.download('stopwords')
  5. Configure the IPython kernel

    Required only if you are using a virtual environment.

    $ ipython kernel install --user --name=people_classifier

    Then change the kernel in the Jupyter interface to people_classifier.
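
The setup above can be sanity-checked from a notebook cell or a Python shell. The sketch below is not taken from this repository: the helper names `in_virtualenv`, `load_english_stop_words`, and `remove_stop_words` are hypothetical, and the whitespace tokenization is an assumption made for illustration. It checks whether the interpreter runs inside a virtual environment, loads the stop-word corpus from step 4, and shows one simple way stop words might be filtered out of raw text.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a virtual environment."""
    # Inside an environment, sys.prefix points at the environment while
    # sys.base_prefix still points at the system installation.
    base = getattr(sys, "base_prefix", sys.prefix)
    # Older virtualenv releases set sys.real_prefix instead.
    legacy = getattr(sys, "real_prefix", None)
    return legacy is not None or sys.prefix != base


def load_english_stop_words():
    """Load NLTK's English stop words, downloading the corpus if it is missing."""
    # Imported lazily so the other helpers work even without NLTK installed.
    import nltk
    from nltk.corpus import stopwords

    try:
        return set(stopwords.words("english"))
    except LookupError:
        # The corpus has not been downloaded yet (step 4 was skipped).
        nltk.download("stopwords")
        return set(stopwords.words("english"))


def remove_stop_words(text, stop_words):
    """Lowercase the text, split on whitespace, and drop stop words."""
    return [token for token in text.lower().split() if token not in stop_words]
```

For example, `remove_stop_words("She was a painter", load_english_stop_words())` filters a sentence against the downloaded corpus. If `in_virtualenv()` returns `False` inside Jupyter, the notebook is probably still using the default kernel rather than people_classifier.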

Maintainers

References

  • Classifying Wikipedia People Into Occupations by Aleksander Gabrovski (.pdf)
  • Multi-Class Text Classification with Scikit-Learn (article)
  • Using jupyter notebooks with a virtual environment (article)