
rustdca

Public repository · 1 star · 0 forks · 0 issues

Commits

List of commits on branch master.
54cea6fb9c1a9bf5f250b010909dd6eb45f2de0f - Correct naming of call duration/rate parameters (ttsoernes, 6 years ago)
2a1e9f01d4302a8b5127ac6c9a1a83247ca409c8 - Update links and description in README (ttsoernes, 6 years ago)
0eaf2bc05199b51eb7564d18c66b41aab88a43b1 - Add basic run instructions (ttsoernes, 6 years ago)
d718019813382c9ddee214a51a7a29832a95efc9 - Finish basic implementation (ttsoernes, 6 years ago)
3a7593ed388360d1b9afd1cedb380cef761f345f - move readme to root (ttsoernes, 7 years ago)
30e670144f7936e8098d43a96475bb9307ef01fb - add readme (ttsoernes, 7 years ago)

README


Dynamic Channel Allocation by Reinforcement Learning

This project implements an RL (reinforcement learning) agent for dynamic channel allocation in a simulated mobile-caller environment.

The implementation is written in Rust and uses the ndarray crate for numerical work.

It is a near-complete Rust port of the best-performing agent (AA-VNet) from https://github.com/tsoernes/dca. The agent uses a linear neural network as a state-value function approximator, which is updated with a newly proposed variant of TDC (temporal-difference learning with gradient correction), originally introduced in Sutton et al. 2009: "Fast gradient-descent methods for temporal-difference learning with linear function approximation."
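For orientation, the standard (discounted) TDC update from Sutton et al. 2009 can be sketched as below. This is a minimal illustration using plain slices rather than ndarray, and it shows the original algorithm, not the average-reward variant this repository proposes; all names here are hypothetical.

```rust
// Sketch of the standard TDC update (Sutton et al. 2009). Hypothetical
// illustration only — not code from this repository, which uses ndarray
// and an average-reward variant of these updates.

/// Dot product of two equal-length slices.
fn dot(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// One TDC step. `theta` holds the value-function weights, `w` the
/// auxiliary gradient-correction weights. Returns the TD error.
fn tdc_update(
    theta: &mut [f64],
    w: &mut [f64],
    phi: &[f64],      // features of current state
    phi_next: &[f64], // features of next state
    reward: f64,
    gamma: f64,       // discount factor
    alpha: f64,       // learning rate for theta
    beta: f64,        // learning rate for w
) -> f64 {
    // TD error: delta = r + gamma * theta^T phi' - theta^T phi
    let delta = reward + gamma * dot(theta, phi_next) - dot(theta, phi);
    let w_phi = dot(w, phi);
    for i in 0..theta.len() {
        // theta <- theta + alpha * (delta * phi - gamma * phi' * (w^T phi))
        theta[i] += alpha * (delta * phi[i] - gamma * phi_next[i] * w_phi);
        // w <- w + beta * (delta - w^T phi) * phi
        w[i] += beta * (delta - w_phi) * phi[i];
    }
    delta
}
```

The second weight vector `w` estimates the expected TD error for the current features; the correction term it feeds into `theta` is what makes TDC stable under off-policy training with linear function approximation.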

See also the versions written in Haskell and Python.

How to build

cargo build --release

How to run

cargo run --release -- --n_events 100_000

or, to see the help menu for all command line arguments:

cargo run --release -- --help
USAGE:
    rustdca [FLAGS] [OPTIONS]

FLAGS:
    -h, --help           Prints help information
    -V, --version        Prints version information
    -v, --verbose        Log level: '-v' for debug, '-vv' for trace
        --verify_grid    Verify channel reuse constraint each iteration

OPTIONS:
    -l, --alpha <alpha>                    Learning rate for neural network [default: 2.52e-6]
    -a, --alpha_avg <alpha_avg>            Learning rate for average reward [default: 0.06]
    -g, --alpha_grad <alpha_grad>          Learning rate for TDC gradient corrections [default: 5e-6]
        --call_dur <call_dur>              Call duration, in minutes [default: 3]
    -r, --call_rate <call_rate_ph>         Call rate, in calls per hour [default: 200]
        --hoff_call_dur <hoff_call_dur>    Call duration for hand-offs, in minutes [default: 1]
        --log_iter <log_iter>              Show blocking probability every 'log_iter' iterations [default: 5000]
    -i, --n_events <n_events>              Simulation duration [default: 10000]
    -p, --p_handoff <p_hoff>               Hand-off probability [default: 0.0]
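As a rough guide to how `--call_rate` and `--call_dur` plausibly parameterize the caller simulation, the sketch below models arrivals as a Poisson process (exponential inter-arrival times) and call durations as exponentially distributed. This is an assumed model for illustration, not code taken from the repository.

```rust
// Hypothetical sketch of an event-driven caller model. Assumes exponential
// inter-arrival times and call durations; not taken from the repository.

/// Inverse-transform sample from an exponential distribution with the
/// given mean, from a uniform random number u in (0, 1).
fn exp_sample(mean: f64, u: f64) -> f64 {
    -mean * (1.0 - u).ln()
}

/// Mean inter-arrival time in minutes for a call rate given in calls/hour.
fn mean_interarrival_min(call_rate_ph: f64) -> f64 {
    60.0 / call_rate_ph
}
```

With the defaults (`--call_rate 200`, `--call_dur 3`), this model would have a new call arriving every 0.3 minutes on average, each lasting 3 minutes on average.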