gocrawler

Commits

List of commits on branch master.
287a0c7e86408ca0ab06a98951ee108728f9c2f8
Rename webcrawler.go to gocrawler.go
marcvanzee committed 9 years ago

cb1b89275d4382baff23a14e0f8d299811687935
Create README.md
marcvanzee committed 9 years ago

31f807a39f44c8019f2384b2c00a847ab9290046
Initial commit
marcvanzee committed 9 years ago

README

gocrawler

A simple web crawler written in Go.

Input syntax:

$ ./webcrawler --url=<url> --depth=<depth> --max_urls=<max_urls>

<url> The URL to start crawling from (default: http://www.marcvanzee.nl)

<depth> Recursive depth of the crawl (default: 2)

<max_urls> Maximum number of URLs to crawl (default: 150)
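
As an illustration of the options above, here is a minimal sketch of how the three flags could be declared with Go's standard flag package. It is not taken from gocrawler.go: the Config type and the parseArgs function are hypothetical names, and only the flag names (as spelled in the input syntax line) and the defaults come from this README.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// Config holds the three options. The type and its field names are
// hypothetical; only the flag names and defaults come from the README.
type Config struct {
	URL     string
	Depth   int
	MaxURLs int
}

// parseArgs parses the command-line arguments (without the program name)
// and falls back to the documented defaults for any flag that is omitted.
func parseArgs(args []string) (Config, error) {
	var cfg Config
	fs := flag.NewFlagSet("webcrawler", flag.ContinueOnError)
	fs.StringVar(&cfg.URL, "url", "http://www.marcvanzee.nl", "the URL to start crawling from")
	fs.IntVar(&cfg.Depth, "depth", 2, "recursive depth of the crawl")
	fs.IntVar(&cfg.MaxURLs, "max_urls", 150, "maximum number of URLs to crawl")
	if err := fs.Parse(args); err != nil {
		return Config{}, err
	}
	return cfg, nil
}

func main() {
	cfg, err := parseArgs(os.Args[1:])
	if err != nil {
		os.Exit(2) // the flag package has already printed the error and usage
	}
	fmt.Printf("url=%s depth=%d max_urls=%d\n", cfg.URL, cfg.Depth, cfg.MaxURLs)
}
```

The flag package accepts both -depth=1 and --depth=1, so the double-dash form shown in the input syntax works without extra handling.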

Installing

Requires the golang.org/x/net/html package from the Go subrepositories. Install all of the net packages (including html) as follows:

go get golang.org/x/net/...

Get this package as follows:

go get github.com/marcvanzee/gocrawler

Examples

$ ./webcrawler --url=http://www.golang.org --depth=2 --max_urls=200
$ ./webcrawler --url=http://www.golang.org --max_urls=1000
$ ./webcrawler --url=http://www.golang.org --depth=1