
Neighbourhood Models: a humble approach towards World Models

Authors: Michele Conti and Luca Iezzi.

Codebase for the Deep Learning and Applied AI course project at Sapienza University of Rome (a.y. 2020/2021). It is a reimplementation of World Models, adapted to Plunder, one of the environments from the Procgen Benchmark.

[GIF: agent playing]
[GIF: controller run]

Usage

This implementation is based on Python 3.8. To install all the requirements:

$ pip install -r requirements.txt

Hyperparameters

VAE

Hyperparameter              Value
Num. rollouts               1000
Setting                     easy
Num. levels                 20
Image size                  (64, 64, 3)
Batch size                  32
Optimizer                   Adam
Learning rate               0.001
Learning rate scheduler     ReduceLROnPlateau
Latent size                 64
Epochs                      153

MDN-RNN

Hyperparameter              Value
Num. rollouts               1000
Setting                     easy
Num. levels                 20
Image size                  (64, 64, 3)
Batch size                  32
Optimizer                   Adam
Learning rate               0.001
Learning rate scheduler     ReduceLROnPlateau
Latent size                 64
LSTM hidden units           256
Sequence length             32
Epochs                      147

Controller

Hyperparameter              Value
Image size                  (64, 64, 3)
Setting                     easy
Num. levels                 20
Evolution algorithm         CMA-ES
Learning rate scheduler     ReduceLROnPlateau
Latent size                 64
LSTM hidden units           256
Population size             64
Num. samples                16
Target return               20
Evaluation frequency        2
Epochs                      300
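For reference, the easy setting, 20 levels, and (64, 64, 3) observations shared by the three tables correspond to how Plunder is instantiated through Procgen's gym interface. A minimal sketch (variable names are ours):

import gym

# Plunder from the Procgen Benchmark: "easy" distribution, 20 training levels.
# Observations are 64x64 RGB frames, matching the image size in the tables.
env = gym.make(
    "procgen:procgen-plunder-v0",
    num_levels=20,
    start_level=0,
    distribution_mode="easy",
)
obs = env.reset()  # numpy array of shape (64, 64, 3)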

Running

The complete pipeline to train the three model components (Vision, Memory, and Controller) is as follows:

1. Generate dataset

First, we generate 1000 rollouts from a random policy:

$ PYTHONPATH=. python3 src/generate_data.py --rollouts 1000
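Under the hood, a rollout from a random policy amounts to sampling actions and storing the transitions. The sketch below is illustrative only: the file layout and field names are assumptions, not necessarily the format used by generate_data.py.

import os
import gym
import numpy as np

env = gym.make("procgen:procgen-plunder-v0", num_levels=20, distribution_mode="easy")

def random_rollout(max_steps=1000):
    obs = env.reset()
    frames, actions, rewards, dones = [], [], [], []
    for _ in range(max_steps):
        action = env.action_space.sample()  # random policy
        next_obs, reward, done, _ = env.step(action)
        frames.append(obs)
        actions.append(action)
        rewards.append(reward)
        dones.append(done)
        obs = next_obs
        if done:
            break
    return map(np.array, (frames, actions, rewards, dones))

os.makedirs("data", exist_ok=True)
for i in range(1000):
    frames, actions, rewards, dones = random_rollout()
    np.savez_compressed(f"data/rollout_{i:04d}.npz",
                        frames=frames, actions=actions, rewards=rewards, dones=dones)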

We then split the data into train, validation, and test sets:

$ PYTHONPATH=. python3 src/split_data.py
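The split itself is a plain file shuffle; something along these lines (directory names and ratios are illustrative, not necessarily those used by split_data.py):

import random
import shutil
from pathlib import Path

files = sorted(Path("data").glob("rollout_*.npz"))
random.seed(0)
random.shuffle(files)

n = len(files)
splits = {
    "train": files[: int(0.8 * n)],
    "val": files[int(0.8 * n) : int(0.9 * n)],
    "test": files[int(0.9 * n) :],
}

for name, subset in splits.items():
    out_dir = Path("data") / name
    out_dir.mkdir(parents=True, exist_ok=True)
    for f in subset:
        shutil.move(str(f), str(out_dir / f.name))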

Finally, we reorganize the data in order to train the MDN-RNN:

$ PYTHONPATH=. python3 src/pl_data/new_dataset.py
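The MDN-RNN consumes fixed-length sequences (32 steps, per the table above), so the reorganization boils down to slicing each rollout into windows. A rough sketch under that assumption, reusing the hypothetical file format from step 1:

import numpy as np
from pathlib import Path

SEQ_LEN = 32

def slice_rollout(rollout_path, out_dir):
    data = np.load(rollout_path)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Non-overlapping windows of SEQ_LEN consecutive steps.
    for i in range(0, len(data["frames"]) - SEQ_LEN + 1, SEQ_LEN):
        np.savez_compressed(
            out_dir / f"{rollout_path.stem}_{i:05d}.npz",
            **{k: data[k][i : i + SEQ_LEN] for k in ("frames", "actions", "rewards", "dones")},
        )

for split in ("train", "val", "test"):
    for path in Path(f"data/{split}").glob("*.npz"):
        slice_rollout(path, Path(f"data/sequences/{split}"))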

2. Train Vision

To train the Vision, set config_name="vae" on line 149 of src/run.py, then run the following:

$ PYTHONPATH=. python3 src/run.py
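run.py builds the actual model from its config; purely as a point of reference, a convolutional VAE for 64x64x3 frames with a 64-dimensional latent (the sizes in the table) looks roughly like this in PyTorch:

import torch
from torch import nn

class ConvVAE(nn.Module):
    def __init__(self, latent_size=64):
        super().__init__()
        # Encoder: 64x64x3 frame -> 1024-dim feature vector.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2), nn.ReLU(),
            nn.Conv2d(128, 256, 4, stride=2), nn.ReLU(),
            nn.Flatten(),
        )
        self.fc_mu = nn.Linear(1024, latent_size)
        self.fc_logvar = nn.Linear(1024, latent_size)
        # Decoder: latent vector -> 64x64x3 reconstruction.
        self.fc_dec = nn.Linear(latent_size, 1024)
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(1024, 128, 5, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(128, 64, 5, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 6, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 6, stride=2), nn.Sigmoid(),
        )

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterization trick
        x_rec = self.decoder(self.fc_dec(z).view(-1, 1024, 1, 1))
        return x_rec, mu, logvar

def vae_loss(x_rec, x, mu, logvar):
    # Reconstruction error plus KL divergence to the standard normal prior.
    rec = nn.functional.mse_loss(x_rec, x, reduction="sum")
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + kld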

3. Train Memory

To train the Memory, set config_name="mdrnn" on line 149 of src/run.py, then run the following:

$ PYTHONPATH=. python3 src/run.py
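For reference only, the Memory is an LSTM with 256 hidden units whose output parameterizes a Gaussian mixture over the next latent vector. A minimal PyTorch sketch; the number of mixture components and the action encoding are assumptions:

import torch
from torch import nn

class MDNRNN(nn.Module):
    def __init__(self, latent_size=64, action_size=1, hidden_size=256, n_gaussians=5):
        super().__init__()
        self.n_gaussians = n_gaussians
        self.latent_size = latent_size
        self.lstm = nn.LSTM(latent_size + action_size, hidden_size, batch_first=True)
        # Per mixture component: one weight, plus a mean and std for each latent dim.
        self.mdn = nn.Linear(hidden_size, n_gaussians * (2 * latent_size + 1))

    def forward(self, z, a, hidden=None):
        # z: (batch, seq, latent_size), a: (batch, seq, action_size)
        out, hidden = self.lstm(torch.cat([z, a], dim=-1), hidden)
        params = self.mdn(out)
        pi, mu, log_sigma = torch.split(
            params,
            [self.n_gaussians,
             self.n_gaussians * self.latent_size,
             self.n_gaussians * self.latent_size],
            dim=-1,
        )
        pi = torch.log_softmax(pi, dim=-1)  # log mixture weights
        mu = mu.reshape(*mu.shape[:2], self.n_gaussians, self.latent_size)
        sigma = log_sigma.reshape_as(mu).exp()
        # Training minimizes the negative log-likelihood of the next latent
        # under this mixture of Gaussians.
        return pi, mu, sigma, hidden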

4. Train Controller

To train the Controller, run the following:

$ PYTHONPATH=. python3 src/controller.py
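controller.py evolves the controller with CMA-ES (population size 64, 16 evaluation rollouts per candidate, target return 20, per the table). The outer loop, using the cma package, is roughly the following; rollout() below is a stand-in for the real episode evaluation, and the controller parameterization is an assumption:

import cma
import numpy as np

# Linear controller from [z, h] (VAE latent + LSTM hidden state) to one score
# per discrete action (Procgen exposes 15 discrete actions).
latent_size, hidden_size, n_actions = 64, 256, 15
n_params = (latent_size + hidden_size + 1) * n_actions

def rollout(params):
    # Stand-in: the real version runs one Plunder episode, feeding VAE latents and
    # MDN-RNN hidden states through the linear controller, and returns the episode return.
    return float(np.random.randn())

def evaluate(params, n_samples=16):
    # Average return of one candidate over n_samples rollouts.
    return np.mean([rollout(params) for _ in range(n_samples)])

es = cma.CMAEvolutionStrategy(np.zeros(n_params), 0.1, {"popsize": 64})
while not es.stop():
    candidates = es.ask()                       # sample a population of parameter vectors
    returns = [evaluate(np.asarray(c)) for c in candidates]
    es.tell(candidates, [-r for r in returns])  # CMA-ES minimizes, so negate the returns
    if max(returns) >= 20:                      # target return from the table above
        break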

Credits

We took inspiration from the implementations of Corentin Tallec and Sebastian Risi.