vwxyzjn

RLHF @allenai, CS Ph.D. from Drexel University in RL.

193 repositories

1411 followers

Philadelphia, PA

https://costa.sh

@vwxyzjn

Repositories

public

cleanrl

High-quality single-file implementations of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)

ppo-implementation-details

The source code for the blog post "The 37 Implementation Details of Proximal Policy Optimization"

portwarden

Create Encrypted Backups of Your Bitwarden Vault with Attachments

604

Updated 2 days ago

public

lm-human-preference-details

RLHF implementation details of OpenAI's 2019 codebase

invalid-action-masking

Source code for "A Closer Look at Invalid Action Masking in Policy Gradient Algorithms"

summarize_from_feedback_details

cleanba

CleanRL's implementation of DeepMind's Podracer Sebulba Architecture for Distributed DRL

PPO-Implementation-Deep-Dive

DEPRECATED - please visit https://github.com/vwxyzjn/ppo-implementation-details

Python

Updated 2 months ago

public

gym-microrts-paper

The source code for the gym-microrts paper.

Python

Updated 2 months ago

public

a2c_is_a_special_case_of_ppo

A2C is a special case of PPO!

Python

Updated 5 months ago

public

jupyter_disqus

Add Disqus to your Jupyter notebook.

Python

Updated 5 months ago

public

SC2AI

Integrated Tensorforce and OpenAI Gym to train SC II game agents.

Jupyter Notebook

Updated 5 months ago