contra_3_rl

Commits

List of commits on branch master.
  • f398408a7713c197339bdaa294a7a92919c46828: Add files via upload (AurelianTactics, committed 6 years ago, verified)
  • b97d12bbd0a1a7d995dcf62d52c1104c6956ae5a: Add files via upload (AurelianTactics, committed 6 years ago, verified)
  • 214d5f9b06b4d46f61141807b66046ad9d76a9d8: Add files via upload (AurelianTactics, committed 6 years ago, verified)
  • fe24c5ef58745ae21b920e2c4c6a13ba756aae75: Add files via upload (AurelianTactics, committed 6 years ago, verified)
  • 019d9d001665d17cc8e4fb2b1331e1fc484a6fe1: Create blank.md (AurelianTactics, committed 6 years ago, verified)
  • 56dc1923db338b685d47d356a21f93b4c721cc6c: Create blank.md (AurelianTactics, committed 6 years ago, verified)

README

contra_3_rl

  • Reinforcement Learning attempts to beat Contra 3 for the SNES
  • Run using Retro Gym (https://github.com/openai/retro)
  • Levels 1-6 cleared using OpenAI Baselines PPO implementation

To reproduce the easy-difficulty clears of levels 1-6:

  • Install OpenAI Baselines (https://github.com/openai/baselines). These runs use the November 11th, 2018 version of PPO with some slight modifications (see the misc directory)
  • Place the ContraIII-Snes directory in the corresponding Retro Gym directory and modify scenario.json and data.json if necessary (see readme.md)
  • Run ppo2_contra_baselines_agent (...).py from the command line
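
The second step above can be sketched roughly as follows. The snippet copies a game integration folder into a Retro Gym data directory and reports whether scenario.json and data.json are present afterwards. The helper name `install_integration` and the destination path are assumptions made for illustration; the real data directory depends on how Retro Gym was installed (commonly `retro/data/stable` inside the installed package).

```python
import shutil
from pathlib import Path

# Files the README says may need editing after the copy.
REQUIRED_FILES = ("scenario.json", "data.json")


def install_integration(src_dir, retro_stable_dir):
    """Copy a game integration folder into a Retro Gym data directory.

    Hypothetical helper: returns the destination path and a list of
    required files that are missing from the copied folder.
    """
    src = Path(src_dir)
    dest = Path(retro_stable_dir) / src.name
    # dirs_exist_ok (Python 3.8+) lets a repeated install overwrite an older copy.
    shutil.copytree(src, dest, dirs_exist_ok=True)
    missing = [name for name in REQUIRED_FILES if not (dest / name).is_file()]
    return dest, missing
```

Once the folder is in place, Retro Gym's `retro.make` accepts `game`, `state`, and `scenario` arguments, which would correspond to the `--game`, `--state`, and `--scenario` flags if the launch script passes them through unchanged (an assumption, not confirmed here).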

Explanation of directories:

  • /log: stores TensorBoard data for successful level clears
  • /logs: stores .csv files with basic stats like reward and timesteps
  • /videos: videos of cleared runs
  • /ppo_save: saved models

Explanation of files:

  • ppo2_contra_baselines_agent (...).py: launches the runs
  • render.py: view videos of episodes
  • retro_movie_debugger.py: view videos of episodes and examine env outputs such as rewards, info, and Lua script print messages. Useful for debugging
  • monitor_graphs.ipynb: Jupyter notebook that shows data from the /logs files to see results
  • image_test.ipynb: visual testing of observation space so you can see what your agent sees
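
For readers who want to inspect the /logs files without the notebook, here is a minimal sketch. It assumes the .csv files follow the OpenAI Baselines Monitor layout: a first line starting with `#` that holds JSON metadata, followed by a CSV table with columns `r` (episode reward), `l` (episode length), and `t` (elapsed time). The function names and the sample data are made up for illustration and are not taken from the repository.

```python
import csv
import io
import json

# Made-up sample in the assumed Baselines Monitor layout.
SAMPLE = (
    '#{"t_start": 0.0, "env_id": "ContraIII-Snes"}\n'
    "r,l,t\n"
    "1.5,200,3.2\n"
    "2.5,300,7.9\n"
)


def read_monitor_csv(text):
    """Parse Monitor-style CSV text into (metadata, episodes)."""
    lines = text.splitlines()
    metadata = {}
    # The optional first line is a '#'-prefixed JSON header.
    if lines and lines[0].startswith("#"):
        metadata = json.loads(lines[0][1:])
        lines = lines[1:]
    reader = csv.DictReader(io.StringIO("\n".join(lines)))
    episodes = [
        {"reward": float(row["r"]), "length": int(row["l"]), "time": float(row["t"])}
        for row in reader
    ]
    return metadata, episodes


def mean_reward(episodes, last_n=100):
    """Average reward over the most recent `last_n` episodes."""
    recent = episodes[-last_n:]
    return sum(e["reward"] for e in recent) / len(recent) if recent else 0.0
```

Calling `read_monitor_csv(open(path).read())` on a real log file would give the same structure, provided the file matches the assumed layout.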

Command Line Arguments for Level 1-6 Clears:

  • python3 ppo2_contra_medium.py --game ContraIII-Snes --state level4.1player.hard.100lives --num_env 21 --seed 28 --scenario_number 1 --stochastic_frame_skip 4 --scale_reward 0.01 --skip_prob 0.25 --scenario scenario_medium_lvl1_run0 --time_limit 20000 --network contra_mixed_cnn
  • python3 ppo2_contra_medium.py --game ContraIII-Snes --state level3.1player.medium.100lives --num_env 21 --seed 644 --scenario_number 1 --stochastic_frame_skip 4 --scale_reward 0.01 --skip_prob 0.25 --scenario scenario_medium_lvl6_run0 --time_limit 20000 --network contra_mixed_cnn
  • python3 ppo2_contra_medium.py --game ContraIII-Snes --state level1.1player.hard.100lives --num_env 21 --seed 203 --scenario_number 1 --stochastic_frame_skip 4 --scale_reward 0.01 --skip_prob 0.25 --scenario scenario_medium_lvl1_run0 --time_limit 20000 --network contra_mixed_cnn
  • python3 ppo2_contra_medium.py --game ContraIII-Snes --state level1.1player.hard.100lives --num_env 21 --seed 7262 --scenario_number 1 --stochastic_frame_skip 4 --scale_reward 0.01 --skip_prob 0.25 --scenario scenario_medium_lvl1_run0 --time_limit 20000 --network cnn
  • python3 ppo2_contra_medium.py --game ContraIII-Snes --state level1.1player.medium.100lives --num_env 21 --seed 5242 --scenario_number 1 --stochastic_frame_skip 4 --scale_reward 0.01 --skip_prob 0.25 --scenario scenario_medium_lvl1_run0 --time_limit 20000 --network cnn
  • python3 ppo2_contra_baselines_agent.py --game ContraIII-Snes --state level1.1player.easy.100lives --num_env 21 --seed 112 --scenario_number 3 --stochastic_frame_skip 4 --scale_reward 0.01 --skip_prob 0.25 --scenario scenario16
  • python3 ppo2_contra_baselines_agent.py --game ContraIII-Snes --state level2.1player.easy.100lives --num_env 21 --seed 443 --scenario_number 7 --stochastic_frame_skip 4 --scale_reward 0.01 --skip_prob 0.25 --scenario scenario_lvl2_run2
  • python3 ppo2_contra_baselines_agent.py --game ContraIII-Snes --state level3.1player.easy.100lives --num_env 21 --seed 279 --scenario_number 1 --stochastic_frame_skip 4 --scale_reward 0.01 --skip_prob 0.25 --scenario scenario_lvl3_run12 --time_limit 20000
  • python3 ppo2_contra_baselines_agent.py --game ContraIII-Snes --state level4.1player.easy.100lives --num_env 22 --seed 11143 --scenario_number 2 --stochastic_frame_skip 4 --scale_reward 0.01 --skip_prob 0.25 --scenario scenario22
  • python3 ppo2_contra_baselines_agent.py --game ContraIII-Snes --state level5.1player.easy.100lives --num_env 21 --seed 46 --scenario_number 1 --stochastic_frame_skip 4 --scale_reward 0.01 --skip_prob 0.25 --scenario scenario_lvl5_run6
  • python3 ppo2_contra_baselines_agent.py --game ContraIII-Snes --state level6.1player.easy.100lives --num_env 21 --seed 222372 --scenario_number 1 --stochastic_frame_skip 4 --scale_reward 0.01 --skip_prob 0.25 --scenario scenario18
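
The flags `--stochastic_frame_skip 4`, `--skip_prob 0.25`, and `--scale_reward 0.01` appear in every command above. The sketch below shows one plausible reading of them, modeled on the `StochasticFrameSkip` and reward-scaling wrappers that ship with OpenAI Baselines. It is dependency-free and uses a toy environment; whether the launch scripts here implement the flags exactly this way is an assumption, and `CountingEnv` is purely hypothetical.

```python
import random


class StochasticFrameSkip:
    """Repeat each action for n frames; a new action may start one frame late."""

    def __init__(self, env, n=4, stickprob=0.25, seed=None):
        self.env = env
        self.n = n
        self.stickprob = stickprob
        self.rng = random.Random(seed)
        self.current_action = None

    def reset(self):
        self.current_action = None
        return self.env.reset()

    def step(self, action):
        total_reward = 0.0
        for i in range(self.n):
            if self.current_action is None:
                # First step after reset: use the requested action immediately.
                self.current_action = action
            elif i == 0:
                # First frame: keep the previous action with probability stickprob.
                if self.rng.random() > self.stickprob:
                    self.current_action = action
            elif i == 1:
                # Second frame: the new action always takes effect.
                self.current_action = action
            obs, reward, done, info = self.env.step(self.current_action)
            total_reward += reward
            if done:
                break
        return obs, total_reward, done, info


class RewardScaler:
    """Multiply every reward by a constant factor (here 0.01)."""

    def __init__(self, env, scale=0.01):
        self.env = env
        self.scale = scale

    def reset(self):
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        return obs, reward * self.scale, done, info


class CountingEnv:
    """Toy stand-in for a Retro environment: every frame yields reward 1."""

    def __init__(self, episode_frames=10):
        self.episode_frames = episode_frames
        self.frame = 0
        self.actions = []

    def reset(self):
        self.frame = 0
        self.actions = []
        return 0

    def step(self, action):
        self.frame += 1
        self.actions.append(action)
        return self.frame, 1.0, self.frame >= self.episode_frames, {}
```

Wrapping would look like `env = RewardScaler(StochasticFrameSkip(CountingEnv(), n=4, stickprob=0.25), scale=0.01)`. Scaling rewards down keeps value targets in a small range for PPO, and the randomized one-frame delay discourages the agent from memorizing a frame-perfect input sequence.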