
tensorflow-deepq

Commits

List of commits on branch master.
149e69e5340984d75df3ff1a374920d870517fb9

Update README.md

siemanko committed 8 years ago

97f8a6eb630b7b2ff8dca0c9e5f3a85cc3c2d083

Merge pull request #39 from saturnism/patch-2

siemanko committed 8 years ago

e32bd9b585ab86d0d0ab89e5ab0ab7396f1382f2

Merge pull request #38 from saturnism/patch-1

siemanko committed 8 years ago

fb8c1472d22fcbde869ba0c56293ddbc98f22dfe

Updated lambda to receive partition_info

saturnism committed 8 years ago

f4b362c6c2f2dae284caafd08e42e94152c5a3a4

Fixed DiscreteDeepQ initializer in game.ipynb

saturnism committed 8 years ago

e269a4f0a13701e6b7f722ee42a8881aa85852f5

Added second license

siemanko committed 8 years ago

README

This repository is now obsolete!

Check out the new, simpler, better-performing, and more complete implementation that we released at OpenAI:

https://github.com/openai/baselines

(scroll down for documentation of the obsolete version)

[animated GIF: a strategy learned by the DeepQ controller playing the Karpathy game]

Reinforcement Learning using TensorFlow

Quick start

Check out the Karpathy game in the notebooks folder.

The image above depicts a strategy learned by the DeepQ controller. Available actions are accelerating up, down, left, or right. The reward signal is +1 for the green fellas, -1 for the red ones, and -5 for the orange ones.

Requirements

  • future==0.15.2
  • euclid==0.1
  • inkscape (for animation GIF creation)

How does this all fit together?

tf_rl has controllers and simulators, which can be pieced together using the simulate function, as sketched below.
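To give a feel for the wiring, here is a rough sketch. The controller and simulation classes are the illustrative ones defined later in this README, and the import path and keyword arguments of simulate are assumptions, not the verified API; check tf_rl/simulate.py for the real signature.

# Hypothetical wiring sketch: the import path and the fps keyword
# argument are assumptions about the tf_rl API, not verified facts.
from tf_rl import simulate

simulation = CoinFlipSimulation()   # implements the simulation interface below
controller = RandomController(2)    # implements the controller interface below

# simulate drives the loop: observe -> action -> perform_action ->
# collect_reward -> store -> training_step, over and over.
simulate(simulation, controller, fps=60)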

Using a human controller

Want to have some fun controlling the simulation by yourself? You got it! Use tf_rl.controller.HumanController in your simulation.

To issue commands, run in a terminal:

python3 tf_rl/controller/human_controller.py

For it to work, you also need a Redis server running locally.
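If you want to confirm the Redis server is reachable before launching the controller, a small sanity check with the third-party redis package looks like this (the package is not listed in the requirements above, so treat it as an optional extra):

# Optional sanity check that a local Redis server is up.
# Requires the third-party `redis` package (pip install redis).
import redis

r = redis.StrictRedis(host="localhost", port=6379)
r.ping()  # raises redis.exceptions.ConnectionError if the server is down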

Writing your own controller

To write your own controller, define a controller class with three functions (see the sketch after this list):

  • action(self, observation): given an observation (usually a tensor of numbers), returns the action to perform.
  • store(self, observation, action, reward, newobservation): called each time a transition from observation to newobservation is observed. The transition is a consequence of action and has the associated reward.
  • training_step(self): if your controller requires training, this is the place to do it. It should not take too long, because it is called roughly once per action execution.
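For example, a minimal controller satisfying this interface is a random one. RandomController below is purely illustrative and not part of tf_rl:

import random

class RandomController(object):
    """Illustrative controller: picks actions uniformly at random."""

    def __init__(self, num_actions):
        self.num_actions = num_actions

    def action(self, observation):
        # Ignore the observation, return a random action index.
        return random.randint(0, self.num_actions - 1)

    def store(self, observation, action, reward, newobservation):
        # A learning controller would record the transition here;
        # a random one has nothing to remember.
        pass

    def training_step(self):
        # No model to train. Keep this cheap: it runs roughly once
        # per action execution.
        pass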

Writing your own simulation

To write your own simulation, define a simulation class with five functions (see the sketch after this list):

  • observe(self): returns the current observation.
  • collect_reward(self): returns the reward accumulated since the last time the function was called.
  • perform_action(self, action): updates internal state to reflect the fact that action was executed.
  • step(self, dt): updates internal state as if dt of simulation time has passed.
  • to_html(self, info=[]): generates an HTML visualization of the game. info is an optional list of strings that should be displayed along with the visualization.
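A minimal simulation satisfying the interface could look like the sketch below. CoinFlipSimulation is purely illustrative; the only contract is the five methods above:

import random

class CoinFlipSimulation(object):
    """Illustrative simulation: guess which way the coin landed.
    Actions: 0 = guess heads, 1 = guess tails."""

    def __init__(self):
        self.coin = 0
        self.accumulated_reward = 0.0

    def observe(self):
        # The current observation: just the last coin outcome.
        return [self.coin]

    def collect_reward(self):
        # Reward accumulated since the last call, then reset.
        reward, self.accumulated_reward = self.accumulated_reward, 0.0
        return reward

    def perform_action(self, action):
        # +1 for a correct guess, -1 otherwise.
        self.accumulated_reward += 1.0 if action == self.coin else -1.0

    def step(self, dt):
        # Advance simulation time by dt: flip the coin again.
        self.coin = random.randint(0, 1)

    def to_html(self, info=[]):
        # HTML visualization plus any extra strings passed via info.
        lines = ["coin: %d" % self.coin] + list(info)
        return "<pre>%s</pre>" % "\n".join(lines)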

Creating GIFs based on simulation

The simulate method accepts a save_path argument, which is a folder where all the consecutive images will be stored. To turn them into a GIF, use scripts/make_gif.sh PATH, where PATH is the same path you passed to the save_path argument.
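Putting the pieces together, a hedged end-to-end sketch (only the save_path argument comes from the text above; the import path and the fps argument are assumptions):

# Record frames while simulating, then stitch them into a GIF.
# Import path and fps are assumptions about the tf_rl API.
from tf_rl import simulate

simulate(CoinFlipSimulation(), RandomController(2),
         fps=60, save_path="/tmp/frames")

# Afterwards, from the shell:
#   scripts/make_gif.sh /tmp/frames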