
DontGetKicked


Commits

List of commits on branch master.
f8bb1f7ea36816724bfe597573453b4095200b63

Update README.markdown

committed 13 years ago
2305f8ed9eb8811969181a5183a1a090f2f0de60

Simplify config and folder structure

committed 13 years ago
a8efe5f38b58c3ae565ad87daf92279c92889166

Add '--full' option to train.py

committed 13 years ago
312fb9ffcc65b629b231f88f9fcc2c5d397d17d0

Fix debug output and small errors in load.py

committed 13 years ago
1718f841727afc5628a1bc2299f5ee92dc7451d9

Make load.py create a neuron description file

committed 13 years ago
cdde44a18bccf971a99dd3548338c804fd32ddec

Change '--range' argument to load.py

committed 13 years ago

README

The README file for this repository.

DontGetKicked

This is a project for the Kaggle competition Don't Get Kicked!.

The project uses:

  • Python (tested with 2.6 and 2.7)
  • numpy (tested with 1.6.1, does not work with 1.5.1)
  • FANN (2.1.0)
  • argparse (tested with 1.1)
  • matplotlib (tested with 1.0.1)
    • only needed for: zipcodes.py, states.py

The project currently consists of the following files:

Main files

  • load.py -- Loads the raw data from training.csv and transforms it into training data for FANN. (Step 1)
  • train.py
    • Trainer. Runs FANN on generated training data and generates neural network. (Step 2)
    • Tester. Runs generated neural network on generated testing data. (Step 3)

Note: Training data and testing data have the same format, but it is recommended to keep them in separate files so you don't test on the same data you trained on.

Auxiliary files

  • zipcodes.py -- Plots the IsBadBuy score for each zipcode.
  • states.py -- Plots the IsBadBuy score for each state.
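
As a rough illustration of what such a per-group score could look like, the sketch below averages IsBadBuy per state using only the standard library. It assumes the score is the mean of IsBadBuy within each group and that the state column is named VNST. Neither is confirmed by this README, and the function name bad_buy_rate is hypothetical. The actual scripts may compute and plot the score differently.

```python
import csv
import io
from collections import defaultdict


def bad_buy_rate(csv_file, group_column):
    """Return {group value: mean IsBadBuy} for the rows in csv_file."""
    totals = defaultdict(lambda: [0, 0])  # group -> [bad buys, rows seen]
    for row in csv.DictReader(csv_file):
        entry = totals[row[group_column]]
        entry[0] += int(row["IsBadBuy"])
        entry[1] += 1
    return {group: bad / float(count) for group, (bad, count) in totals.items()}


# Tiny in-memory stand-in for training.csv (column names are assumptions).
sample = io.StringIO("IsBadBuy,VNST\n1,TX\n0,TX\n0,FL\n")
rates = bad_buy_rate(sample, "VNST")
```

The resulting dictionary could then be handed to matplotlib for plotting, for example as a bar chart with one bar per state.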

Basic usage

  1. Make training file (reads from training.csv, outputs to data/train.dat)

     python load.py
    
  2. Make test file (reads from training.csv, outputs to data/test.dat)

     python load.py --test-data --range 2001,4000
    
    • The training data is by default made from lines 1-2000 in training.csv; here we choose lines 2001-4000 so the test data does not overlap with it
  3. Train the neural network (reads from data/train.dat and outputs the neural network to networks/default.net)

     python train.py
    
  4. Test the neural network (reads from data/test.dat using the neural network in networks/default.net)

     python train.py --test
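
The '--range' values used above can be read with argparse roughly as follows. This is a minimal sketch, not the code in load.py: the helper names parse_range and select_lines are hypothetical, and it assumes the range is inclusive and counts data lines from 1.

```python
import argparse


def parse_range(text):
    """Turn 'first,last' into a (first, last) tuple of 1-based line numbers."""
    first, last = (int(part) for part in text.split(","))
    if first < 1 or last < first:
        raise argparse.ArgumentTypeError("invalid range: %r" % text)
    return first, last


def select_lines(lines, first, last):
    """Keep the data lines numbered first..last (inclusive, 1-based)."""
    return lines[first - 1:last]


parser = argparse.ArgumentParser()
# Assumed default, mirroring the README: lines 1-2000.
parser.add_argument("--range", type=parse_range, default=(1, 2000))

args = parser.parse_args(["--range", "2001,4000"])
first, last = args.range
```

Choosing non-overlapping ranges for the training and test files is what keeps the test run from scoring the network on data it has already seen.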
    

Advanced usage

  1. Make training file (reads lines 5000-6000 from training.csv, outputs to data/minimal_train.dat)

    The training file's input neurons are specified in config/datamapping/minimal.py

     python load.py --data-mapping minimal --range=5000,6000 --train-file 'data/minimal_train.dat'
    
  2. Make test file (reads lines 3700-4700 from training.csv, outputs to data/minimal_test.dat)

    The test file's input neurons are specified in config/datamapping/minimal.py

     python load.py --data-mapping minimal --range=3700,4700 --test-file 'data/minimal_test.dat'
    
  3. Train the neural network (reads from data/minimal_train.dat and outputs the neural network to networks/minimal.net)

    The neural network's learning rate, hidden layer etc. are read from config/learn/minimal.py

     python train.py --train-file 'data/minimal_train.dat' --neural-config minimal --network-file 'networks/minimal.net'
    
  4. Test the neural network (reads from data/minimal_test.dat using the neural network in networks/minimal.net). The '--full' option also gives line-by-line output of the test results

     python train.py --test-file 'data/minimal_test.dat' --network-file 'networks/minimal.net' --full
    

load.py

The main file. Loads the training data and transforms it into training data for FANN.
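
For orientation, FANN's plain-text training format starts with a header line holding the number of training pairs, the number of inputs and the number of outputs, followed by alternating lines of input values and output values. The sketch below writes such a file. It is an illustration only: write_fann_data is a hypothetical helper, not a function from load.py, and the values are made up.

```python
import os
import tempfile


def write_fann_data(path, pairs):
    """Write (inputs, outputs) pairs in FANN's plain-text training format."""
    num_inputs = len(pairs[0][0])
    num_outputs = len(pairs[0][1])
    with open(path, "w") as handle:
        # Header: number of pairs, inputs per pair, outputs per pair.
        handle.write("%d %d %d\n" % (len(pairs), num_inputs, num_outputs))
        for inputs, outputs in pairs:
            # One line of input neuron values, then one line of target values.
            handle.write(" ".join(str(value) for value in inputs) + "\n")
            handle.write(" ".join(str(value) for value in outputs) + "\n")


# Two made-up examples with two input neurons and one output (IsBadBuy).
path = os.path.join(tempfile.mkdtemp(), "train.dat")
write_fann_data(path, [([0.5, 1.0], [1]), ([0.0, 0.25], [0])])
```

Because training and test files share this format, the same writer would serve for both data/train.dat and data/test.dat.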

zipcodes.py

Move along -- nothing to see here...

states.py

Move along -- nothing to see here...