
NLP-tutorials

public
2 stars
1 forks
2 issues

Commits

List of commits on branch master.
Verified
cfb16ff6f2a649b72271d1bfa7275be31e21003a

Merge pull request #5 from davidalbertonogueira/dependabot/pip/ujson-5.4.0

davidalbertonogueira committed 2 years ago
Verified
80a9dcd3e411b27954869035e023f2aae2b4d339

Bump ujson from 5.2.0 to 5.4.0

dependabot[bot] committed 2 years ago
Verified
571706db8ed2ddc8b06ad0bab4a866eb06c5f748

Merge pull request #4 from davidalbertonogueira/dependabot/pip/numpy-1.22.0

davidalbertonogueira committed 2 years ago
Verified
4f3c4ad8abcaf4c58d39eebf9abe57a0524a1f2f

Bump numpy from 1.21.0 to 1.22.0

dependabot[bot] committed 2 years ago
Verified
e464ca64d5de3d85265360a9af9c51e3e242b1ba

Merge pull request #3 from davidalbertonogueira/dependabot/pip/numpy-1.21.0

davidalbertonogueira committed 2 years ago
Verified
6fe4cc63bc1f82bbeadab3964ef4749d554eeebb

Bump numpy from 1.17.2 to 1.21.0

dependabot[bot] committed 2 years ago

README

The README file for this repository.

NLP-tutorials

Slides and code for the intro to NLP class at the Data Science Retreat @ Berlin, 28 Oct 2019

Installation

Clone this repo with

git clone --recursive https://github.com/davidalbertonogueira/NLP-tutorials
git submodule update --depth 10 --recursive --init

Requirements

Besides Python >= 3.7 and pip, the required libraries are listed in requirements.txt.

The code is cross-platform and has been tested on Windows and Ubuntu 18.04.

Setting up a virtual environment

Linux:

sudo apt-get install virtualenv / pip install virtualenv
virtualenv --python /usr/bin/python3.7 venv
source venv/bin/activate
pip install -r requirements.txt
pip install http://download.pytorch.org/whl/cpu/torch-0.4.1-cp37-cp37m-linux_x86_64.whl (1)*
pip install torchvision
pip install torchtext==0.2.3
python -m spacy download en_core_web_sm

Windows:

pip install virtualenv
virtualenv venv
venv\Scripts\activate.bat
pip install -r requirements.txt
pip install http://download.pytorch.org/whl/cpu/torch-0.4.1-cp37-cp37m-win_amd64.whl (1)*
pip install torchvision
pip install torchtext==0.2.3
python -m spacy download en_core_web_sm

(1)* Replace "cpu" in the link if you plan to use a GPU: "cu80" for CUDA 8, "cu90" for CUDA 9.0, "cu92" for CUDA 9.2, ...

If you require a newer version, please visit http://pytorch.org/ and follow the instructions there to install the relevant PyTorch binary.

Text Classification

Text Classification with BoW + TF-IDF + Naive Bayes and SVM, based on http://github.com/Gunjitbedi/Text-Classification
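
To give a feel for the BoW + TF-IDF + Naive Bayes pipeline, here is a minimal standard-library sketch. It is not the code from this repo or from the linked project: the function names (`fit_idf`, `tfidf`, `train_nb`, `predict`), the toy sentences, and the smoothing choices are all assumptions made for illustration, and the SVM variant is left out.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Naive whitespace tokenizer; a real pipeline would do more cleaning.
    return text.lower().split()


def fit_idf(docs):
    # Smoothed inverse document frequency for every term seen in training.
    n = len(docs)
    df = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}


def tfidf(doc, idf):
    # Bag-of-words counts reweighted by IDF; unseen terms are dropped.
    counts = Counter(tokenize(doc))
    return {t: c * idf[t] for t, c in counts.items() if t in idf}


def train_nb(docs, labels, idf, alpha=1.0):
    # Multinomial Naive Bayes over TF-IDF weights with additive smoothing.
    class_docs = Counter(labels)
    weights = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        weights[label].update(tfidf(doc, idf))
    vocab_size = len(idf)
    model = {}
    for label, n_docs in class_docs.items():
        total = sum(weights[label].values())
        log_prior = math.log(n_docs / len(docs))
        log_like = {
            t: math.log((weights[label][t] + alpha) / (total + alpha * vocab_size))
            for t in idf
        }
        model[label] = (log_prior, log_like)
    return model


def predict(doc, idf, model):
    # Pick the class with the highest log posterior (up to a constant).
    features = tfidf(doc, idf)

    def score(label):
        log_prior, log_like = model[label]
        return log_prior + sum(w * log_like[t] for t, w in features.items())

    return max(model, key=score)


# Toy training set (hypothetical data, for illustration only).
train_docs = [
    "great movie loved it",
    "wonderful acting great plot",
    "terrible movie hated it",
    "awful plot boring acting",
]
train_labels = ["pos", "pos", "neg", "neg"]

idf = fit_idf(train_docs)
model = train_nb(train_docs, train_labels, idf)
print(predict("great wonderful movie", idf, model))
print(predict("boring terrible plot", idf, model))
```

In practice a library such as scikit-learn would handle the vectorizing and the classifier; the sketch only makes the individual steps visible.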

Text Classification with word embeddings + neural network

Sentiment Analysis

Sentiment Analysis with task-specific word embeddings + PyTorch neural network

The implemented model is a Deep Bidirectional LSTM model with Attention, based on the work of Baziotis et al., 2017: DataStories at SemEval-2017 Task 4: Deep LSTM with Attention for Message-level and Topic-based Sentiment Analysis.
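
As a rough illustration of the attention step that sits on top of the LSTM outputs, the sketch below pools a sequence of hidden vectors into a single vector: each state gets a scalar score, the scores are normalized with a softmax, and the states are averaged with those weights. It is plain Python, not the repo's PyTorch model; the scoring form (tanh of a learned projection), the name `attention_pool`, and the hand-picked numbers are assumptions.

```python
import math


def attention_pool(hidden_states, w, b=0.0):
    # One scalar score per time step: tanh(w . h_i + b).
    scores = [
        math.tanh(sum(wi * hi for wi, hi in zip(w, h)) + b)
        for h in hidden_states
    ]
    # Softmax over time steps turns the scores into attention weights.
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the hidden states gives the sentence representation.
    dim = len(hidden_states[0])
    pooled = [
        sum(a * h[d] for a, h in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return pooled, weights


# Three hypothetical 2-dimensional hidden states and a fixed projection vector.
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
projection = [1.0, -1.0]

pooled, weights = attention_pool(states, projection)
print(weights)
print(pooled)
```

In a trained model the projection would be learned together with the LSTM, so the weights end up highlighting the words that matter most for the sentiment label.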

Recommendation

Recommendation using collaborative filtering
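
A small sketch of one common flavor of collaborative filtering, user-based with cosine similarity, may help here. The repo may well use a different variant; the `ratings` data, user names, and function names below are hypothetical.

```python
import math

# Hypothetical user -> {item: rating} data, for illustration only.
ratings = {
    "ana": {"A": 5, "B": 4, "C": 1},
    "ben": {"A": 4, "B": 5, "D": 2},
    "cat": {"A": 1, "C": 5, "D": 4},
}


def cosine(u, v):
    # Cosine similarity between two sparse rating vectors.
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)


def predict_rating(user, item, ratings):
    # Similarity-weighted average of other users' ratings for the item.
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine(ratings[user], other_ratings)
        num += sim * other_ratings[item]
        den += abs(sim)
    return num / den if den else None


predicted = predict_rating("ana", "D", ratings)
print(predicted)
```

Because "ana" rates items much like "ben" does, the prediction for item "D" lands closer to ben's rating than to cat's.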

Named Entity Recognition

Transition-based NER system (in C++ DyNet) (PyTorch version)

Paper: https://arxiv.org/pdf/1603.01360.pdf
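
The transition system described in the paper works with a buffer of unread words, a stack, and an output list, and uses three kinds of actions: SHIFT, OUT, and REDUCE with a label. The sketch below only replays a given action sequence; in the real system a neural model predicts each action. The function name `run_transitions` and the `REDUCE-<label>` string encoding are assumptions for illustration.

```python
from collections import deque


def run_transitions(tokens, actions):
    buffer = deque(tokens)  # words not yet processed
    stack = []              # words of the entity currently being built
    output = []             # finished (text, label) segments
    for action in actions:
        if action == "SHIFT":
            # Move the next word onto the stack as part of an entity.
            stack.append(buffer.popleft())
        elif action == "OUT":
            # The next word is not part of any entity.
            output.append((buffer.popleft(), None))
        elif action.startswith("REDUCE-"):
            # Pop the whole stack and emit it as one labeled entity.
            label = action.split("-", 1)[1]
            output.append((" ".join(stack), label))
            stack.clear()
        else:
            raise ValueError(f"unknown action: {action}")
    if buffer or stack:
        raise ValueError("action sequence did not consume the whole sentence")
    return output


sentence = ["Mark", "Watney", "visited", "Mars"]
gold_actions = ["SHIFT", "SHIFT", "REDUCE-PER", "OUT", "SHIFT", "REDUCE-LOC"]
entities = run_transitions(sentence, gold_actions)
print(entities)
```

Segments labeled `None` are words outside any entity; the others carry the entity type chosen by the REDUCE action.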

Bonus material

Transformers: "Attention Is All You Need", copied from http://nlp.seas.harvard.edu/2018/04/03/attention.html

Paper: https://arxiv.org/abs/1706.03762
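
The core operation of the paper is scaled dot-product attention, Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V. Below is a minimal plain-Python sketch of that formula on nested lists, without masking, batching, or multiple heads. It is not the code from the linked notebook, and the input values are made up.

```python
import math


def softmax(xs):
    # Subtract the max for numerical stability before exponentiating.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Each output row is a weighted average of the value vectors.
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs


keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[1.0, 2.0], [3.0, 4.0]]

# A query aligned with the first key attends almost only to the first value.
focused = scaled_dot_product_attention([[10.0, 0.0]], keys, values)
# A zero query scores all keys equally, so the output is the mean of the values.
uniform = scaled_dot_product_attention([[0.0, 0.0]], keys, values)
print(focused, uniform)
```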