nlp-class

Commits

List of commits on branch master.
ec6d68acca4d827fc4eddfbe934f4153ce430ea9
correct typo
tomtung committed 12 years ago

0d192bb6c5edffb2eee6504fcb0e1cdc4c46f8a9
update final report
tomtung committed 12 years ago

8d75c2db0c561e143d408e9c91d92076db487c77
ignore external bin
tomtung committed 12 years ago

586744e323f0c10cf0652b2d0ff0f764a4cc5f71
add gizapp result
tomtung committed 12 years ago

fc25bb3872e99d6aea1d3972ded3522ffdb5383d
eval bugfix
tomtung committed 12 years ago

ad6f197939c23e04b11523e5cc63da4086a1e266
Merge branch 'master' of https://github.com/tomtung/nlp-class
tomtung committed 12 years ago

README

This repository contains the code I wrote for the class Empirical Methods in Natural Language Processing.

Hopefully this can be useful to other people who are also learning NLP techniques. The code that I wrote is licensed under WTFPL version 2.

The details are described in the report.pdf file in each sub-folder. Here's a brief summary of what each one covers:

  1. hw1: Restore vowels to vowelless text using Finite-State Machines
  2. hw2: Trigram language model with improved Kneser-Ney smoothing (best performance in class)
  3. hw3: Unsupervised part-of-speech tagging using the Baum–Welch algorithm
  4. hw4: Part-of-speech tagging using Conditional Random Fields (3rd best performance in class)
  5. hw5: Context-free grammar parsing using the CKY algorithm, with vertical markovization
  6. final project: a modern-Chinese-to-classical-Chinese machine translation system
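
As a rough illustration of the technique named in item 5, here is a minimal CKY recognizer sketch in Python. It is not the code from hw5: the function name `cky_recognize`, the rule-table layout, and the toy grammar are all assumptions made for this example, the grammar is assumed to be in Chomsky normal form, and vertical markovization is not shown.

```python
from collections import defaultdict


def cky_recognize(words, lexical_rules, binary_rules, start="S"):
    """Return True if `words` can be derived from `start` under a CNF grammar.

    lexical_rules: dict mapping a word to the set of nonterminals A with A -> word
    binary_rules:  dict mapping a pair (B, C) to the set of nonterminals A with A -> B C
    (Both table layouts are hypothetical, chosen for this sketch.)
    """
    n = len(words)
    if n == 0:
        return False

    # chart[(i, j)] holds every nonterminal that can span words[i:j].
    chart = defaultdict(set)

    # Width-1 spans come straight from the lexical rules.
    for i, word in enumerate(words):
        chart[(i, i + 1)] = set(lexical_rules.get(word, ()))

    # Build wider spans bottom-up by trying every split point k.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for left in chart[(i, k)]:
                    for right in chart[(k, j)]:
                        chart[(i, j)] |= binary_rules.get((left, right), set())

    return start in chart[(0, n)]


# Toy grammar (hypothetical): S -> NP VP, VP -> V NP
toy_lexical = {"she": {"NP"}, "eats": {"V"}, "fish": {"NP"}}
toy_binary = {("NP", "VP"): {"S"}, ("V", "NP"): {"VP"}}
accepted = cky_recognize("she eats fish".split(), toy_lexical, toy_binary)
```

The chart is filled from short spans to long ones, so each cell only reads cells that are already complete. A full parser would also store back-pointers (and rule probabilities, for a PCFG) in each cell in order to recover the best tree.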

Most of the code is written in Scala. The exception is hw5, which is written in Python: the professor kindly provided a few hundred lines of useful Python code and I was too lazy to reimplement them, so I just learnt the language instead. You may need sbt or the sbt script runner to run the Scala code.