Commits

List of commits on branch master.
6a48f8fd49066085689244d2cf8aad6f58449b00

Add note about psyco.

bbleach committed 14 years ago
620fb69ad04317b04e59ab257623ff396e44e409

Basic readme.

bbleach committed 14 years ago
28eb71cd21221db77a174d1e180874ed96ac2e9f

Only makes sense to print either lines or summary or manually selected

bbleach committed 14 years ago
0c3265897f0eab38c81dd92227b74e9a68488a31

Be a filter by default - don't print any line counts unless asked.

bbleach committed 14 years ago
c5aeec13df9d28b7bea6c2572c1213e8cea8848f

Add vhost_combined format.

bbleach committed 14 years ago
203aafc02f2fff79de02901f6020b00ce7309e4e

Don't error if the files are empty.

bbleach committed 14 years ago

README

Alf - A log filter

This is my log filter. There are many like it, but this one is mine.

Alf is intended to be a convenient way of performing common extraction and processing tasks on webserver log files:

  • Selecting lines based on field contents such as URL or status code.
  • Extracting only certain fields from logfiles.
  • Counting the number of lines that match a pattern.

It currently parses some of the more commonly used Apache log formats, and it could parse many others with a little work.
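The three tasks listed above can be sketched in a few lines of Python. This is an illustration only, not alf's own code: the regex is an assumed pattern for the Apache "combined" format, and the names `COMBINED`, `parse_line`, `select` and `SAMPLE` are hypothetical.

```python
import re

# Assumed regex for the Apache "combined" format; alf's own pattern may differ.
COMBINED = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    m = COMBINED.match(line)
    return m.groupdict() if m else None

def select(lines, field, pattern):
    """Yield parsed records whose `field` matches `pattern` (re.search)."""
    rx = re.compile(pattern)
    for line in lines:
        rec = parse_line(line)
        if rec is not None and rx.search(rec[field]):
            yield rec

# Made-up sample lines, for illustration only.
SAMPLE = [
    '127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /robots.txt HTTP/1.0" 404 209 "-" "ExampleBot/1.0"',
    '127.0.0.1 - frank [10/Oct/2000:13:55:37 -0700] "GET /index.html HTTP/1.0" 200 2326 "http://example.com/" "Mozilla/4.08"',
]

not_found = list(select(SAMPLE, "status", r"404"))  # select lines by field
urls = [rec["url"] for rec in not_found]            # extract a single field
count = len(not_found)                              # count matching lines
```

Lines that do not match the pattern are skipped here; how alf itself treats invalid lines is not shown in this sketch.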

I am not happy with the current command-line options, so both their names and their defaults are likely to change in future commits.

Installing

You will need Python >= 2.4 installed.

If you have the psyco Python module installed, alf will use it in an attempt to improve performance.
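Optional use of psyco is usually written as a guarded import. The snippet below is a minimal sketch of that pattern, not necessarily how alf does it; the `HAVE_PSYCO` flag is a hypothetical name added for illustration.

```python
# psyco was only ever available for Python 2 and is absent from most
# installations, so a failed import is simply ignored.
try:
    import psyco
    psyco.full()        # ask psyco to compile as much of the program as it can
    HAVE_PSYCO = True
except ImportError:
    HAVE_PSYCO = False  # carry on at normal interpreter speed
```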

Examples

  1. Extract all the logged requests which had the status code 404:

    alf --status-regex='404' logfile

  2. Extract all the requests to /robots.txt:

    alf --url-regex='^/robots.txt' logfile
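The values passed to `--status-regex` and `--url-regex` are regular expressions, so an unescaped `.` matches any character. The sketch below assumes the pattern is applied to the URL field with Python's `re.search` semantics, which the README does not confirm; the candidate URLs are made up.

```python
import re

pattern = re.compile(r'^/robots.txt')

# Assumption: the regex is searched against the URL field only.
candidates = ['/robots.txt', '/robots.txt?x=1', '/robots-txt', '/a/robots.txt']
matched = [u for u in candidates if pattern.search(u)]
# matched == ['/robots.txt', '/robots.txt?x=1', '/robots-txt']

# Escaping the dot and anchoring the end restricts the match to the exact path.
strict = re.compile(r'^/robots\.txt$')
exact = [u for u in candidates if strict.search(u)]
# exact == ['/robots.txt']
```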

TODO

Tech debt:

  • Make a log format a more structured object, including:
    • the regex for parsing the log
    • knowledge of how to translate timestamp fields into a Python datetime
  • Address the confusion when you use --print-invalid and --print-lines at the same time
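One possible shape for a structured log format is sketched below. The `LogFormat` class, its attributes, and the `COMMON` instance are hypothetical, and the sketch assumes Python 3, because `%z` is not supported by `strptime` on the Python 2 versions alf currently targets.

```python
import re
from datetime import datetime

class LogFormat:
    """Hypothetical container pairing a line regex with a timestamp format."""

    def __init__(self, name, pattern, time_format):
        self.name = name
        self.regex = re.compile(pattern)
        self.time_format = time_format

    def parse(self, line):
        """Return a dict of fields with 'time' as a datetime, or None."""
        m = self.regex.match(line)
        if m is None:
            return None
        fields = m.groupdict()
        fields["time"] = datetime.strptime(fields["time"], self.time_format)
        return fields

# Assumed pattern for the Apache "common" format.
COMMON = LogFormat(
    "common",
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)',
    "%d/%b/%Y:%H:%M:%S %z",
)

record = COMMON.parse(
    '127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET / HTTP/1.0" 200 2326'
)
```

Keeping the timestamp format next to the regex means each format carries everything needed to produce comparable datetime values, which would also help with sorting output by date.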

Features:

  • Detect log formats automatically.
  • Query on fields which are in non-default log formats.
  • Add an option to sort output by date.
  • Configuration file to add more log formats.
  • Make it easier to call alf from other scripts.
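Automatic format detection could work by trying each known pattern on a sample of lines and keeping the one that matches most often. The following is a rough sketch under that assumption; `FORMATS`, `detect_format` and the regexes are illustrative and are not taken from alf.

```python
import re

# Assumed, simplified patterns; each is anchored at both ends so that a
# "combined" line is not mistaken for a "common" one.
FORMATS = {
    "vhost_combined": re.compile(
        r'\S+:\d+ \S+ \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \S+ "[^"]*" "[^"]*"$'),
    "combined": re.compile(
        r'\S+ \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \S+ "[^"]*" "[^"]*"$'),
    "common": re.compile(
        r'\S+ \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \S+$'),
}

def detect_format(lines, sample_size=20):
    """Return the name of the format matching the most sampled lines, or None."""
    sample = [line.rstrip("\n") for line in lines[:sample_size]]
    best_name, best_hits = None, 0
    for name, rx in FORMATS.items():
        hits = sum(1 for line in sample if rx.match(line))
        if hits > best_hits:
            best_name, best_hits = name, hits
    return best_name
```

For example, `detect_format(open("logfile").readlines())` would return a format name, or `None` when no sampled line matches any known pattern.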

Internals:

  • Testing:
    • Unit tests.
    • Test runner to run tests on all supported Python versions.
    • Simple integration tests against sample logfiles.
  • Performance benchmarks.