

Status: Archive (code is provided as-is, no updates expected)

Generating Reviews and Discovering Sentiment

Code for Learning to Generate Reviews and Discovering Sentiment (Alec Radford, Rafal Jozefowicz, Ilya Sutskever).

Right now the code supports using the language model as a feature extractor.

from encoder import Model

model = Model()                        # loads the pretrained mLSTM language model
text = ['demo!']
text_features = model.transform(text)  # one feature vector per input string

A demo of using the features for sentiment classification, as reported in the paper for the binary version of the Stanford Sentiment Treebank (SST), is included as sst_binary_demo.py. This demo also visualizes the distribution of the sentiment unit, as in Figure 3 of the paper.
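The demo's classification step can be sketched as follows. This is a minimal illustration, not the repo's actual sst_binary_demo.py: random vectors stand in for the extracted features (the real ones are 4,096-dimensional and come from model.transform), and a plain gradient-descent logistic regression stands in for the L1-regularized classifier used in the paper.

```python
import numpy as np

# Stand-in for model.transform(...) output: one feature vector per
# review. Random 16-dim vectors here, purely for illustration.
rng = np.random.RandomState(0)
X = rng.randn(200, 16)
true_w = rng.randn(16)
y = (X @ true_w > 0).astype(float)  # synthetic binary sentiment labels

# Minimal logistic regression fit by gradient descent, standing in
# for the regularized classifier trained on top of the features.
w = np.zeros(16)
b = 0.0
lr = 0.1
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid probabilities
    w -= lr * (X.T @ (p - y)) / len(y)
    b -= lr * np.mean(p - y)

# Training accuracy on the synthetic, linearly separable data.
accuracy = np.mean(((X @ w + b) > 0) == (y > 0.5))
```

In the real demo the only trained component is this linear classifier; the language model's features are frozen, which is what makes the single "sentiment unit" visible as a dominant weight.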

Sentiment Unit Visualization

There is also a PyTorch port by @guillitte, which demonstrates how to train a model from scratch.

This repo also contains the parameters of the multiplicative LSTM model with 4,096 units we trained on the Amazon product review dataset introduced in McAuley et al. (2015) [1]. The dataset in de-duplicated form contains over 82 million product reviews from May 1996 to July 2014 amounting to over 38 billion training bytes. Training took one month across four NVIDIA Pascal GPUs, with our model processing 12,500 characters per second.
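A multiplicative LSTM (Krause et al., 2016) differs from a standard LSTM in that the gates are fed an intermediate multiplicative state rather than the previous hidden state directly. A minimal single-step sketch, with illustrative weight names and a tiny hidden size (the trained model uses 4,096 units):

```python
import numpy as np

def mlstm_step(x, h_prev, c_prev, params):
    """One step of a multiplicative LSTM cell (illustrative sketch)."""
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    # Multiplicative state: elementwise product of two projections,
    # one of the input and one of the previous hidden state.
    m = (params['Wmx'] @ x) * (params['Wmh'] @ h_prev)
    # Standard LSTM gates, computed from x and m instead of h_prev.
    i = sigmoid(params['Wix'] @ x + params['Wim'] @ m)  # input gate
    f = sigmoid(params['Wfx'] @ x + params['Wfm'] @ m)  # forget gate
    o = sigmoid(params['Wox'] @ x + params['Wom'] @ m)  # output gate
    u = np.tanh(params['Wux'] @ x + params['Wum'] @ m)  # candidate
    c = f * c_prev + i * u
    h = o * np.tanh(c)
    return h, c

# Tiny example: 8-dim input, 4 hidden units.
rng = np.random.RandomState(0)
n_in, n_h = 8, 4
params = {k: rng.randn(n_h, n_in if k.endswith('x') else n_h) * 0.1
          for k in ['Wmx', 'Wmh', 'Wix', 'Wim', 'Wfx', 'Wfm',
                    'Wox', 'Wom', 'Wux', 'Wum']}
h, c = mlstm_step(rng.randn(n_in), np.zeros(n_h), np.zeros(n_h), params)
```

The per-timestep input factor in m is what lets the transition dynamics vary with each character, which the paper credits for the model's strong byte-level performance.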

[1] McAuley, Julian, Pandey, Rahul, and Leskovec, Jure. Inferring networks of substitutable and complementary products. In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 785–794. ACM, 2015.