TPlag

Commits

List of commits on branch master.
5a523cf2c13f468badb80c82ab09cf2e68d85aaa
Added better UI
kkjain1810 committed 4 years ago

44b7833454ff9f3bc0b775eef6f3eea47056bc6c
Update .gitignore
kkjain1810 committed 4 years ago

f5bb92b71ab370fd3333eb77804fb82d082f29d9
Update README.md
kkjain1810 committed 4 years ago

540a03fb9cf17513d5230f27a634f7c81b35c775
Add option to web server
kkjain1810 committed 4 years ago

c5ef12d6c79ac901b5a6f3c53f7b29df7557ab6a
Update README.md
kkjain1810 committed 4 years ago

2393f64d92da00f8b09acfef08cb4bdd74d35da6
Adding download script
kkjain1810 committed 4 years ago

TPlag

TPlag uses NLP techniques to detect plagiarism in text files. It can be used to find plagiarism in essays and other exam responses with about 80% accuracy.

Usage

The app can be used as a command line interface or a web interface.

The web interface supports uploading multiple files and runs a pair-wise plagiarism check across all of them. You can choose whether files are classified with the threshold that optimizes the F score or the one that optimizes accuracy. This is the recommended method for checking plagiarism in exams with a large number of answers.
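As a rough illustration of what a pair-wise check involves, the sketch below scores every unordered pair of uploaded files exactly once, so n files yield n(n-1)/2 comparisons. The function names and the `toy_score` measure are hypothetical placeholders and are not taken from TPlag; the real scoring relies on GloVe embeddings.

```python
from itertools import combinations
from typing import Callable


def pairwise_scores(
    files: dict[str, str],
    score: Callable[[str, str], float],
) -> list[tuple[str, str, float]]:
    """Score every unordered pair of files exactly once."""
    results = []
    # Sorting by file name makes the order of the pairs deterministic.
    for (name_a, text_a), (name_b, text_b) in combinations(sorted(files.items()), 2):
        results.append((name_a, name_b, score(text_a, text_b)))
    return results


def toy_score(a: str, b: str) -> float:
    """Placeholder measure (NOT TPlag's): number of words the texts do not share."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    return float(len(words_a ^ words_b))


if __name__ == "__main__":
    uploads = {
        "answer1.txt": "the cat sat on the mat",
        "answer2.txt": "the cat sat on a mat",
        "answer3.txt": "dogs chase cars all day",
    }
    for name_a, name_b, value in pairwise_scores(uploads, toy_score):
        print(name_a, name_b, value)
```

Because the number of pairs grows quadratically, 95 files already mean 4465 comparisons, which is worth keeping in mind for large exams.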

Meanwhile, the command line interface is recommended for exploratory purposes. It lets you optimize either the F score (threshold around 5.64) or the accuracy (threshold around 2.27) of the classifications, or provide a custom threshold!
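The three choices can be pictured as a small lookup, sketched below. The threshold values are the approximate ones quoted above; the function name, the `"custom"` mode name and the error handling are assumptions made for illustration and may not match `cli.py`. The README does not say how a score is compared with the threshold, so the sketch stops at selecting the value.

```python
from typing import Optional

# Approximate presets quoted in the README ("around" 5.64 and 2.27).
PRESET_THRESHOLDS = {"f_score": 5.64, "accuracy": 2.27}


def resolve_threshold(mode: str, custom: Optional[float] = None) -> float:
    """Return the threshold for a mode; "custom" is a hypothetical mode name."""
    if mode == "custom":
        if custom is None:
            raise ValueError("custom mode needs an explicit threshold")
        return float(custom)
    try:
        return PRESET_THRESHOLDS[mode]
    except KeyError:
        raise ValueError(f"unknown mode: {mode!r}") from None


if __name__ == "__main__":
    print(resolve_threshold("f_score"))
    print(resolve_threshold("accuracy"))
    print(resolve_threshold("custom", custom=4.0))
```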

Both the server and the CLI require GloVe embeddings. You can download them using the following command:

bash dl-script.sh # This will take some time
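For orientation, GloVe embeddings are normally distributed as plain text, one token per line followed by its space-separated vector components. The sketch below parses that layout into a dictionary; it assumes the standard text format and is not TPlag's own loader, and `parse_glove` is a hypothetical name. Which file `dl-script.sh` fetches is not stated here, so no path is hard-coded.

```python
from typing import Iterable


def parse_glove(lines: Iterable[str]) -> dict[str, list[float]]:
    """Parse GloVe text lines ("word v1 v2 ...") into a word -> vector mapping."""
    vectors: dict[str, list[float]] = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


if __name__ == "__main__":
    sample = ["the 0.1 0.2 0.3", "cat -0.4 0.5 0.6"]
    embeddings = parse_glove(sample)
    print(len(embeddings), len(embeddings["cat"]))
```

An open file handle can be passed directly, for example `parse_glove(open(path, encoding="utf-8"))`, where `path` points at whichever embeddings file was downloaded.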

To run the web server, install dependencies using the command:

pip3 install -r requirements.txt

Then, run the server using:

rm -rf files
mkdir files
export FLASK_APP=main.py
# uncomment the following line to run the server in debug mode
# export FLASK_ENV=development
flask run

To run the CLI, install dependencies using:

pip3 install -r requirements_2.txt

Then, to run the CLI, use:

python3 cli.py

Accuracy

The classifier was tested against the corpus from here. Of the 95 files, 80% were classified correctly as plagiarised or not plagiarised in accuracy mode, while an F score of 0.95 was achieved in f_score mode!
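To make the two metrics concrete, the sketch below computes accuracy and an F score from predicted and true labels. It assumes the F score is the usual F1 (the harmonic mean of precision and recall), which the README does not spell out. The labels are made-up illustrative data, not TPlag's results.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives; 1 means plagiarised."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth and pred:
            tp += 1
        elif pred:
            fp += 1
        elif truth:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Fraction of all files classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); true negatives play no role."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


if __name__ == "__main__":
    y_true = [1, 1, 1, 0, 0]  # illustrative labels only
    y_pred = [1, 1, 0, 1, 0]
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    print(accuracy(tp, fp, fn, tn), f1_score(tp, fp, fn))
```

Since F1 ignores true negatives while accuracy counts them, the threshold that maximizes one need not maximize the other, which is consistent with the two modes using different thresholds.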

To Do

  1. Explore better models
  2. Create UI