b

alf


Commits

List of commits on branch master.
63314473e90123e6a859053ba39385c0aeda9c87

Add problems identified when adding support for ELB format

committed 8 years ago
65bbf944638ee8edd16b173d903561ddc8903d53

Very hacky support for ELB access logs

committed 8 years ago
c62b2589836337deef621c3b6b2d9527d3792621

Moar testing.

bbleach committed 14 years ago
ceb24a2926b87b6c6a1e467698e77247c08ceea5

Need a way to query on fields not present in "combined" format.

bbleach committed 14 years ago
a3ce318c734abaf9744a313b06bfeac6f6a2da92

Add TODO.

bbleach committed 14 years ago
7bb5e3bf8452b69940e84f665482cbe8c54acdd6

Remove debugging print line.

bbleach committed 14 years ago

README

The README file for this repository.

Alf - A log filter

This is my log filter. There are many like it, but this one is mine.

Alf is intended to be a convenient way of performing common extraction and processing tasks on webserver log files:

  • Selecting lines based on field contents such as URL or status code.
  • Extracting only certain fields from logfiles.
  • Counting the number of lines that match a pattern.

It currently parses some of the more commonly used Apache formats, but it could parse many others with a little work.
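As a rough illustration of what parsing and field extraction involve, here is a minimal sketch for the Apache "combined" format. It is not alf's actual code: the regular expression, the field names (`host`, `url`, `status`, and so on) and the helpers `parse_line` and `extract` are all assumptions made for this example.

```python
import re

# Apache "combined" format. The group names are illustrative, not alf's own.
COMBINED = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    m = COMBINED.match(line)
    return m.groupdict() if m else None

def extract(line, fields):
    """Return only the requested fields, in order, or None for invalid lines."""
    rec = parse_line(line)
    if rec is None:
        return None
    return [rec[name] for name in fields]

sample = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
          '"GET /robots.txt HTTP/1.0" 404 2326 '
          '"http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"')

print(extract(sample, ['status', 'url']))  # ['404', '/robots.txt']
```

Lines whose request part is malformed (for example a bare `-`) will not match this pattern and come back as `None`; a real tool would need to decide how to report them.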

I am currently not very happy with the command line options, so it is likely that I will change both their names and the defaults in future commits.

Installing

You will need Python >= 2.4 installed.

If you have the Psyco Python module installed, alf will use it in an attempt to improve performance.
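An optional dependency like this is usually handled with a guarded import. The snippet below is a sketch of that common pattern, not a copy of what alf does; the `HAVE_PSYCO` flag is a name invented for this example.

```python
# Optional accelerator: use Psyco when it is importable, carry on without it otherwise.
try:
    import psyco
    psyco.full()        # ask Psyco to compile as much of the program as it can
    HAVE_PSYCO = True   # hypothetical flag, handy for debugging output
except ImportError:
    HAVE_PSYCO = False
```

On interpreters where Psyco is not available, the import fails and the program simply runs at normal speed.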

Examples

  1. Extract all the logged requests which had the status code 404:

    alf --status-regex='404' logfile

  2. Extract all the requests to /robots.txt:

    alf --url-regex='^/robots\.txt' logfile
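Conceptually, both examples apply a regular expression to one field of each parsed line and keep the lines that match. The sketch below shows that idea on already-parsed records. The `select` helper and the record layout are assumptions for illustration, and whether alf uses search or match semantics internally is not stated here; the sketch uses `re.search`.

```python
import re

def select(records, field, pattern):
    """Keep the records whose `field` matches `pattern` (search semantics)."""
    regex = re.compile(pattern)
    return [rec for rec in records if regex.search(rec.get(field, ''))]

# Hypothetical parsed log lines.
records = [
    {'url': '/robots.txt', 'status': '404'},
    {'url': '/index.html', 'status': '200'},
    {'url': '/robots.txt', 'status': '200'},
]

not_found = select(records, 'status', '404')
# The dot is escaped so that it matches a literal "." only.
robots = select(records, 'url', r'^/robots\.txt')

print(len(not_found), len(robots))  # 1 2
```

Counting matching lines, the third task listed earlier, is then just `len()` of the selected records.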

TODO

Tech debt:

  • Make a log format a more structured thing including:
    • regex for parsing the log
    • knowledge of how to translate timestamp fields into a Python datetime
  • Address the confusion when you use --print-invalid and --print-lines at the same time
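One possible shape for a more structured log format is an object that bundles the parsing regex with the timestamp layout. This is only a sketch of the TODO item, not existing alf code: the `LogFormat` class, its attributes, and the `COMMON` instance are hypothetical. It also assumes Python 3, because `strptime` support for `%z` is not available in the older Python 2 releases the README targets, and it assumes an English locale for the `%b` month name.

```python
import re
from datetime import datetime

class LogFormat(object):
    """Hypothetical bundle of a line regex and its timestamp layout."""

    def __init__(self, name, pattern, time_field, time_format):
        self.name = name
        self.regex = re.compile(pattern)
        self.time_field = time_field
        self.time_format = time_format

    def parse(self, line):
        """Return a dict of fields with the timestamp as a datetime, or None."""
        m = self.regex.match(line)
        if m is None:
            return None
        fields = m.groupdict()
        fields[self.time_field] = datetime.strptime(
            fields[self.time_field], self.time_format)
        return fields

# Apache "common" format, described with the structure above.
COMMON = LogFormat(
    name='common',
    pattern=(r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
             r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'),
    time_field='time',
    time_format='%d/%b/%Y:%H:%M:%S %z',
)

rec = COMMON.parse(
    '127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET / HTTP/1.0" 200 2326')
print(rec['time'].isoformat())
```

Keeping the timestamp layout next to the regex means a new format can be added as data, without touching the parsing code.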

Features:

  • Detect log formats automatically.
  • Query on fields which are in non-default log formats.
  • Add an option to sort output by date.
  • Configuration file to add more log formats.
  • Make it easier to call alf from other scripts.
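Automatic format detection could be approached by trying each known format against the first few lines and picking the one that matches most often. The following is a sketch under that assumption; `FORMATS`, `detect_format`, and the `sample_size` parameter are names invented for this example.

```python
import re
from itertools import islice

_COMMON = r'\S+ \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \S+'

# Candidate formats, most specific first. Earlier entries win ties.
FORMATS = [
    ('combined', re.compile(_COMMON + r' "[^"]*" "[^"]*"')),
    ('common', re.compile(_COMMON)),
]

def detect_format(lines, sample_size=20):
    """Return the name of the format matching most sampled lines, or None."""
    sample = [line.rstrip('\n') for line in islice(lines, sample_size)]
    best_name, best_hits = None, 0
    for name, regex in FORMATS:
        # fullmatch keeps a "common" pattern from claiming "combined" lines.
        hits = sum(1 for line in sample if regex.fullmatch(line))
        if hits > best_hits:
            best_name, best_hits = name, hits
    return best_name

common_line = '127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET / HTTP/1.0" 200 2326'
combined_line = common_line + ' "http://www.example.com/" "Mozilla/4.08"'

print(detect_format([common_line]))
print(detect_format([combined_line]))
```

Sampling only the first lines keeps detection cheap on large files, at the cost of misjudging logs whose format changes partway through.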

Internals:

  • Testing:
    • Unit tests.
    • Test runner to run tests on all supported Python versions.
    • Simple integration tests against sample logfiles.
  • Performance benchmarks.