
doc2date

public

Commits

List of commits on branch master.
Verified
ecdc62f40bc980394b5c18ee3536b6ed49a8933f

took down doc2date

bbencwallace committed 4 years ago
Unverified
4a8732acbc594bb47cc63e8d8c8775364b2ad396

fixed scaling

bbencwallace committed 5 years ago
Unverified
6f8ea4b338b82ce524b364ab44edac987ad5081b

comments

bbencwallace committed 5 years ago
Unverified
036ac3f1c64ef0df7e9bfb99ef757b01a3afd3c2

fixed broken link

bbencwallace committed 5 years ago
Unverified
55e7829846de0d16e1b0adf52e41cb99fcfb5dd6

deleted

bbencwallace committed 5 years ago
Unverified
6da4a2da334db005958362a506f1c75830ee0ecd

added pipeline diagram

bbencwallace committed 5 years ago

README

The README file for this repository.

doc2date: A Study in Document Regression

Document classification is a common application of machine learning techniques. Examples include sentiment analysis, in which texts are classified into a (typically small) number of moods (such as positive and negative), and authorship attribution in stylometry, in which texts are grouped according to their original author.

Unsupervised learning methods have also been applied to the analysis of documents. For instance, doc2vec is a dimensionality reduction technique that extends word embeddings to documents.

But what about document regression? In this notebook, we investigate the problem of learning the date of a publication from the text contained therein. Since the target space, a range of years, can be viewed as a continuum, this problem presents a natural test case for applying regression techniques to document analysis.
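To make the idea of document regression concrete, here is a minimal sketch in plain Python. It is not the doc2date pipeline: the bag-of-words features, the ridge regression fitted by gradient descent, the toy corpus, and every function name (`tokenize`, `fit_vocabulary`, `vectorize`, `fit_ridge`, `predict`) are assumptions chosen purely for illustration.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def fit_vocabulary(docs):
    """Map every token seen in the training documents to a column index."""
    vocab = sorted({tok for doc in docs for tok in tokenize(doc)})
    return {tok: i for i, tok in enumerate(vocab)}

def vectorize(doc, vocab):
    """L2-normalized bag-of-words vector; unknown tokens are ignored."""
    counts = Counter(tok for tok in tokenize(doc) if tok in vocab)
    vec = [0.0] * len(vocab)
    for tok, c in counts.items():
        vec[vocab[tok]] = float(c)
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def fit_ridge(X, y, alpha=0.1, lr=0.5, epochs=2000):
    """Ridge regression by batch gradient descent (intercept unpenalized)."""
    n, d = len(X), len(X[0])
    mean_y = sum(y) / n          # center the target so the intercept starts near 0
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [alpha * wj for wj in w]   # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sum(wj * xj for wj, xj in zip(w, xi)) + b - (yi - mean_y)
            grad_b += err / n
            for j, xj in enumerate(xi):
                if xj:
                    grad_w[j] += err * xj / n
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b + mean_y

def predict(doc, vocab, w, b):
    """Predicted publication year for a single document."""
    x = vectorize(doc, vocab)
    return sum(wj * xj for wj, xj in zip(w, x)) + b

# Toy corpus (invented for illustration): (text, publication year).
corpus = [
    ("thou art a noble knight and thy sword is true", 1600),
    ("thy king hath sent thee a letter", 1650),
    ("the steam engine and the railway changed the town", 1850),
    ("the telegraph and the railway carried news", 1870),
    ("the computer and the internet changed the office", 1995),
    ("the internet and the smartphone carried news", 2010),
]
texts = [text for text, _ in corpus]
years = [float(year) for _, year in corpus]

vocab = fit_vocabulary(texts)
w, b = fit_ridge([vectorize(t, vocab) for t in texts], years)

old_guess = predict("thy sword and thy king", vocab, w, b)
new_guess = predict("the smartphone and the computer", vocab, w, b)
print(round(old_guess), round(new_guess))
```

The output is a real-valued year rather than a class label, which is what distinguishes this setup from the classification tasks described earlier. On a corpus this small the predictions are only meaningful in their ordering, not in their exact values.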

Read more about it

Try it out (Update 2021.02: I've taken doc2date down)