
Neighbourhood Models: a humble approach towards World Models

Authors: Michele Conti and Luca Iezzi.

Codebase for the Deep Learning and Applied AI course project at Sapienza University of Rome (a.y. 2020/2021). It is a reimplementation of World Models, adapted to Plunder, one of the environments from the Procgen Benchmark.

[GIF: agent playing]
[GIF: controller run]

Usage

This implementation is based on Python 3.8. To install all the requirements:

$ pip install -r requirements.txt

Hyperparameters

VAE

Hyperparameter              Value
Num. rollouts               1000
Setting                     easy
Num. levels                 20
Image size                  (64, 64, 3)
Batch size                  32
Optimizer                   Adam
Learning rate               0.001
Learning rate scheduler     ReduceLROnPlateau
Latent size                 64
Epochs                      153

MDN-RNN

Hyperparameter              Value
Num. rollouts               1000
Setting                     easy
Num. levels                 20
Image size                  (64, 64, 3)
Batch size                  32
Optimizer                   Adam
Learning rate               0.001
Learning rate scheduler     ReduceLROnPlateau
Latent size                 64
LSTM hidden units           256
Sequence length             32
Epochs                      147

Controller

Hyperparameter              Value
Image size                  (64, 64, 3)
Setting                     easy
Num. levels                 20
Evolution algorithm         CMA-ES
Learning rate scheduler     ReduceLROnPlateau
Latent size                 64
LSTM hidden units           256
Population size             64
Num. samples                16
Target return               20
Evaluation frequency        2
Epochs                      300
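For reference, the easy setting, 20 levels, and (64, 64, 3) observations shared by the three tables correspond to how Plunder is instantiated through Procgen's gym interface. A minimal sketch (variable names are ours):

import gym

# Plunder from the Procgen Benchmark: "easy" distribution, 20 training levels.
# Observations are 64x64 RGB frames, matching the image size in the tables.
env = gym.make(
    "procgen:procgen-plunder-v0",
    num_levels=20,
    start_level=0,
    distribution_mode="easy",
)
obs = env.reset()  # numpy array of shape (64, 64, 3)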

Running

The complete pipeline to train the three model components (Vision, Memory, and Controller) is as follows:

1. Generate dataset

First, we generate 1000 rollouts from a random policy:

$ PYTHONPATH=. python3 src/generate_data.py --rollouts 1000
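Under the hood, a rollout from a random policy amounts to sampling actions and storing the transitions. The sketch below is illustrative only: the file layout and field names are assumptions, not necessarily the format used by generate_data.py.

import os
import gym
import numpy as np

env = gym.make("procgen:procgen-plunder-v0", num_levels=20, distribution_mode="easy")

def random_rollout(max_steps=1000):
    obs = env.reset()
    frames, actions, rewards, dones = [], [], [], []
    for _ in range(max_steps):
        action = env.action_space.sample()  # random policy
        next_obs, reward, done, _ = env.step(action)
        frames.append(obs)
        actions.append(action)
        rewards.append(reward)
        dones.append(done)
        obs = next_obs
        if done:
            break
    return map(np.array, (frames, actions, rewards, dones))

os.makedirs("data", exist_ok=True)
for i in range(1000):
    frames, actions, rewards, dones = random_rollout()
    np.savez_compressed(f"data/rollout_{i:04d}.npz",
                        frames=frames, actions=actions, rewards=rewards, dones=dones)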

We then split the data into train, validation, and test sets:

$ PYTHONPATH=. python3 src/split_data.py
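The split itself is a plain file shuffle; something along these lines (directory names and ratios are illustrative, not necessarily those used by split_data.py):

import random
import shutil
from pathlib import Path

files = sorted(Path("data").glob("rollout_*.npz"))
random.seed(0)
random.shuffle(files)

n = len(files)
splits = {
    "train": files[: int(0.8 * n)],
    "val": files[int(0.8 * n) : int(0.9 * n)],
    "test": files[int(0.9 * n) :],
}

for name, subset in splits.items():
    out_dir = Path("data") / name
    out_dir.mkdir(parents=True, exist_ok=True)
    for f in subset:
        shutil.move(str(f), str(out_dir / f.name))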

Finally, we reorganize the data in order to train the MDN-RNN:

$ PYTHONPATH=. python3 src/pl_data/new_dataset.py
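The MDN-RNN consumes fixed-length sequences (32 steps, per the table above), so the reorganization boils down to slicing each rollout into windows. A rough sketch under that assumption, reusing the hypothetical file format from step 1:

import numpy as np
from pathlib import Path

SEQ_LEN = 32

def slice_rollout(rollout_path, out_dir):
    data = np.load(rollout_path)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Non-overlapping windows of SEQ_LEN consecutive steps.
    for i in range(0, len(data["frames"]) - SEQ_LEN + 1, SEQ_LEN):
        np.savez_compressed(
            out_dir / f"{rollout_path.stem}_{i:05d}.npz",
            **{k: data[k][i : i + SEQ_LEN] for k in ("frames", "actions", "rewards", "dones")},
        )

for split in ("train", "val", "test"):
    for path in Path(f"data/{split}").glob("*.npz"):
        slice_rollout(path, Path(f"data/sequences/{split}"))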

2. Train Vision

To train the Vision, set config_name="vae" on line 149 of src/run.py, then run the following:

$ PYTHONPATH=. python3 src/run.py
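run.py builds the actual model from its config; purely as a point of reference, a convolutional VAE for 64x64x3 frames with a 64-dimensional latent (the sizes in the table) looks roughly like this in PyTorch:

import torch
from torch import nn

class ConvVAE(nn.Module):
    def __init__(self, latent_size=64):
        super().__init__()
        # Encoder: 64x64x3 frame -> 1024-dim feature vector.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2), nn.ReLU(),
            nn.Conv2d(128, 256, 4, stride=2), nn.ReLU(),
            nn.Flatten(),
        )
        self.fc_mu = nn.Linear(1024, latent_size)
        self.fc_logvar = nn.Linear(1024, latent_size)
        # Decoder: latent vector -> 64x64x3 reconstruction.
        self.fc_dec = nn.Linear(latent_size, 1024)
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(1024, 128, 5, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(128, 64, 5, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 6, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 6, stride=2), nn.Sigmoid(),
        )

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterization trick
        x_rec = self.decoder(self.fc_dec(z).view(-1, 1024, 1, 1))
        return x_rec, mu, logvar

def vae_loss(x_rec, x, mu, logvar):
    # Reconstruction error plus KL divergence to the standard normal prior.
    rec = nn.functional.mse_loss(x_rec, x, reduction="sum")
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + kld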

3. Train Memory

To train the Memory, set config_name="mdrnn" on line 149 of src/run.py, then run the following:

$ PYTHONPATH=. python3 src/run.py
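For reference only, the Memory is an LSTM with 256 hidden units whose output parameterizes a Gaussian mixture over the next latent vector. A minimal PyTorch sketch; the number of mixture components and the action encoding are assumptions:

import torch
from torch import nn

class MDNRNN(nn.Module):
    def __init__(self, latent_size=64, action_size=1, hidden_size=256, n_gaussians=5):
        super().__init__()
        self.n_gaussians = n_gaussians
        self.latent_size = latent_size
        self.lstm = nn.LSTM(latent_size + action_size, hidden_size, batch_first=True)
        # Per mixture component: one weight, plus a mean and std for each latent dim.
        self.mdn = nn.Linear(hidden_size, n_gaussians * (2 * latent_size + 1))

    def forward(self, z, a, hidden=None):
        # z: (batch, seq, latent_size), a: (batch, seq, action_size)
        out, hidden = self.lstm(torch.cat([z, a], dim=-1), hidden)
        params = self.mdn(out)
        pi, mu, log_sigma = torch.split(
            params,
            [self.n_gaussians,
             self.n_gaussians * self.latent_size,
             self.n_gaussians * self.latent_size],
            dim=-1,
        )
        pi = torch.log_softmax(pi, dim=-1)  # log mixture weights
        mu = mu.reshape(*mu.shape[:2], self.n_gaussians, self.latent_size)
        sigma = log_sigma.reshape_as(mu).exp()
        # Training minimizes the negative log-likelihood of the next latent
        # under this mixture of Gaussians.
        return pi, mu, sigma, hidden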

4. Train Controller

To train the Controller, run the following:

$ PYTHONPATH=. python3 src/controller.py
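controller.py evolves the controller with CMA-ES (population size 64, 16 evaluation rollouts per candidate, target return 20, per the table). The outer loop, using the cma package, is roughly the following; rollout() below is a stand-in for the real episode evaluation, and the controller parameterization is an assumption:

import cma
import numpy as np

# Linear controller from [z, h] (VAE latent + LSTM hidden state) to one score
# per discrete action (Procgen exposes 15 discrete actions).
latent_size, hidden_size, n_actions = 64, 256, 15
n_params = (latent_size + hidden_size + 1) * n_actions

def rollout(params):
    # Stand-in: the real version runs one Plunder episode, feeding VAE latents and
    # MDN-RNN hidden states through the linear controller, and returns the episode return.
    return float(np.random.randn())

def evaluate(params, n_samples=16):
    # Average return of one candidate over n_samples rollouts.
    return np.mean([rollout(params) for _ in range(n_samples)])

es = cma.CMAEvolutionStrategy(np.zeros(n_params), 0.1, {"popsize": 64})
while not es.stop():
    candidates = es.ask()                       # sample a population of parameter vectors
    returns = [evaluate(np.asarray(c)) for c in candidates]
    es.tell(candidates, [-r for r in returns])  # CMA-ES minimizes, so negate the returns
    if max(returns) >= 20:                      # target return from the table above
        break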

Credits

We took inspiration from the implementations of Corentin Tallec and Sebastian Risi.