
ppo-ewma


Commits

List of commits on branch master:

625dac7aceb9fb8bfdc3eff9dba597ec8f685ac7: ppo_ewma (jacobhilton, committed 3 years ago)
7295473f0185c82f9eb9c1e17a373135edd8aacc: n_epoch_vf typo bugfix (kcobbe, committed 4 years ago)
c789b00be58aa704f7223b6fc8cd28a5aaa2e101: initial commit (kcobbe, committed 4 years ago)

README

Status: Archive (code is provided as-is, no updates expected)

PPO-EWMA

This is code for training agents using PPO-EWMA and PPG-EWMA, introduced in the paper Batch size-invariance for policy optimization (see the Citation section below). It is based on the code for Phasic Policy Gradient.
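
The core change relative to PPO, per the paper, is that the proximal (old) policy used in the PPO objective is replaced by an exponential moving average of the current policy network's weights, which helps decouple the proximal policy from the batch size. Below is a minimal sketch of that parameter EWMA, assuming PyTorch; the names (ewma_update, proximal_policy, beta) are illustrative only and are not this repository's API:

import copy
import torch

def ewma_update(avg_model, model, beta=0.99):
    # In-place EWMA of parameters: avg <- beta * avg + (1 - beta) * current.
    with torch.no_grad():
        for p_avg, p in zip(avg_model.parameters(), model.parameters()):
            p_avg.mul_(beta).add_(p, alpha=1.0 - beta)

policy = torch.nn.Linear(4, 2)           # stand-in for the policy network
proximal_policy = copy.deepcopy(policy)  # EWMA copy used to compute probability ratios
# ... after each optimizer step on `policy`:
ewma_update(proximal_policy, policy, beta=0.99)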

Installation

Supported platforms: macOS and Ubuntu, Python 3.7

Installation using Miniconda:

git clone https://github.com/openai/ppo-ewma.git
conda env update --name ppo-ewma --file ppo-ewma/environment.yml
conda activate ppo-ewma
pip install -e ppo-ewma

Alternatively, install the dependencies from environment.yml manually.
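
After installing, training is presumably launched the same way as in the Phasic Policy Gradient codebase this repository is based on. The module path below is an assumption carried over from that codebase's python -m phasic_policy_gradient.train entry point, not something documented here:

python -m ppo_ewma.train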

Visualize results

Results are stored in blob storage at https://openaipublic.blob.core.windows.net/rl-batch-size-invariance/, and can be visualized as in the paper using this Colab notebook.
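
For inspecting the raw results outside of Colab, the public container can be enumerated with Azure Blob Storage's standard anonymous listing endpoint. A minimal sketch using only the Python standard library, assuming the container permits anonymous listing (the list_results helper is ours, not part of this repository):

import urllib.request
import xml.etree.ElementTree as ET

def list_results(prefix=""):
    # List blob names in the public results container via the Azure
    # REST API (restype=container&comp=list); returns up to one page.
    url = ("https://openaipublic.blob.core.windows.net/rl-batch-size-invariance"
           "?restype=container&comp=list&prefix=" + prefix)
    with urllib.request.urlopen(url) as resp:
        tree = ET.parse(resp)
    return [name.text for name in tree.iter("Name")]

for blob_name in list_results()[:10]:
    print(blob_name)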

Citation

Please cite using the following BibTeX entry:

@article{hilton2021batch,
  title={Batch size-invariance for policy optimization},
  author={Hilton, Jacob and Cobbe, Karl and Schulman, John},
  journal={arXiv preprint arXiv:2110.00641},
  year={2021}
}