haddit


Commits

List of commits on branch master.
843898dd9a9275dbf3fd08d4a76435fc746245bf

using clojure 1.2; clojure-hadoop 1.2.1 (hadoop 0.20.2)

cchoas committed 15 years ago
56b4a61dba3152c7abdbd2a7f940d2e58b7949af

initialize; modifications for clojure-contrib 1.1.0 and clojure-hadoop 1.2.0

cchoas committed 15 years ago

README


haddit

For a more detailed write-up, see the article "Hadoop - Feeding Reddit to Hadoop".

This version has the following modifications:

  • JSON support from clojure-contrib 1.1.0

  • clojure-hadoop 1.2.0 defjob (input-format)

  • more job definition options: output-format and compress-output

building and running

To build the uberjar with Leiningen, just type:

lein uberjar

To execute haddit with some example data:

./run.sh

There is already some example data in the data folder, but I think it contains duplicate entries, and the reddit data download needs some modification ...
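
Since the example data may contain duplicated entries, one way to clean it before running the job is to drop repeated lines. The following is only a minimal sketch and not part of haddit: the function name dedupe_lines and the file names are hypothetical, and it assumes the data holds one record per line, so only byte-identical lines are treated as duplicates.

```shell
#!/bin/sh
# Hypothetical helper, not part of haddit.
# Reads records from stdin and writes them to stdout, keeping only the
# first occurrence of each line and preserving the original order.
# Assumption: one record per line; records that differ in any byte
# (for example in whitespace or key order) are NOT treated as duplicates.
dedupe_lines() {
    # seen[$0] counts how often the current line has appeared so far;
    # the line is printed only while that count is still zero.
    awk '!seen[$0]++'
}

# Example usage (file names are placeholders, adjust to the data folder):
#   dedupe_lines < data/input.txt > data/input.dedup.txt
```

If the records are JSON objects whose duplicates differ in formatting, a line-based filter like this will miss them, and deduplicating on a record field inside the job would be the more robust choice.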