Official repository for my MSc thesis in Artificial Intelligence at the University of Amsterdam:
Addressing Goal Misgeneralization with Natural Language Interfaces
Find and read the thesis here or here.
Details such as Python and package versions can be found in the generated pyproject.toml and poetry.lock files.
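To check the required Python version before creating an environment, the constraint can be read directly from pyproject.toml. The sketch below assumes the constraint is stored under the `[tool.poetry.dependencies]` table, which is where Poetry 1.x writes it. The function name `required_python` is illustrative and not part of this repository.

```python
try:
    import tomllib  # standard library from Python 3.11 onwards
except ModuleNotFoundError:
    import tomli as tomllib  # assumption: the tomli backport is installed on older Pythons


def required_python(pyproject_text: str) -> str:
    """Return the Python version constraint declared in Poetry's dependency table."""
    config = tomllib.loads(pyproject_text)
    # Assumption: Poetry 1.x layout, i.e. [tool.poetry.dependencies] with a `python` key.
    return config["tool"]["poetry"]["dependencies"]["python"]
```

From the repository root, `required_python(open("pyproject.toml", encoding="utf-8").read())` should return the constraint string to match when creating the environment.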
We recommend using an environment manager such as conda. After setting up your environment with the correct Python version, install the required packages in two steps.
First, install the pre-requirements [1]:
pip install -r requirements/pre-reqs.txt
Then install the rest of the requirements:
pip install -r requirements/requirements-complete.txt
These requirements files are generated by running gen_pip_reqs.sh.
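The two-step installation can also be scripted so that the order is never mixed up. This is a minimal sketch, not a script shipped with the repository: the names `REQUIREMENT_FILES`, `build_install_commands` and `install_all` are hypothetical, and it assumes it is run from the repository root inside the already activated environment.

```python
import subprocess
import sys

# Order matters: the pre-requirements must be installed before everything else.
REQUIREMENT_FILES = [
    "requirements/pre-reqs.txt",
    "requirements/requirements-complete.txt",
]


def build_install_commands(python: str = sys.executable) -> list[list[str]]:
    """Build one `pip install -r` command per requirements file, in order."""
    return [[python, "-m", "pip", "install", "-r", path] for path in REQUIREMENT_FILES]


def install_all() -> None:
    """Run each command in turn, stopping at the first failure."""
    for command in build_install_commands():
        subprocess.run(command, check=True)
```

Calling `install_all()` runs the same two `pip install -r` commands shown above, using the interpreter of the current environment.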
├── LICENSE
├── README.md <- The top-level README
├── data/ <- Datasets
├── checkpoints/ <- Trained and serialized models.
├── notebooks/ <- Jupyter notebooks.
├── documents/ <- Documents as HTML, PDF, LaTeX, etc.
├── pyproject.toml <- Project metadata, handled by poetry.
├── poetry.lock <- Resolving and locking dependencies, handled by poetry.
├── requirements.txt <- For non-poetry users.
├── gen_pip_reqs.sh <- For generating the pip requirements.txt file
├── tests/ <- Tests
├── outputs/ <- Output files. Not committed.
└── src/nlgoals/ <- Source code for use in this project.
    ├── __init__.py <- Makes src a Python module
    ├── babyai/ <- (modified) code from babyai repo
    ├── configs/ <- Miscellaneous configuration
    ├── utils.py <- Miscellaneous utils
    ├── data/ <- Data processing and handling
    ├── interfaces/ <- Logic for connecting datasets and models
    ├── models/ <- Model definitions
    ├── losses/ <- Custom loss function implementations
    ├── trainer/ <- Configurations for training
    └── run/ <- Scripts to train, evaluate and use models
The project structure is largely based on the cookiecutter data-science template. It is purposely opinionated so that paths align across collaborators without having to edit config files. Users may find the cookiecutter data-science opinions page relevant.
The top-level data/ and checkpoints/ directories are in version control only to show structure. Their contents will not be committed and are ignored via .gitignore.
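Because the layout is fixed, code can derive every top-level directory from the repository root instead of reading paths from a config file. The following is a sketch of that idea only: `project_paths` is a hypothetical helper, not a function from `nlgoals`, and the directory names are taken from the tree above.

```python
from pathlib import Path

# Top-level directories listed in the project tree.
TOP_LEVEL_DIRS = ["data", "checkpoints", "notebooks", "documents", "tests", "outputs"]


def project_paths(root: Path) -> dict[str, Path]:
    """Map each top-level directory name to its absolute path under `root`."""
    root = root.resolve()
    return {name: root / name for name in TOP_LEVEL_DIRS}
```

For example, `project_paths(Path.cwd())["checkpoints"]` gives the checkpoints directory when called from the repository root, regardless of where a collaborator cloned the project.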
"turn on the light bulb"
https://github.com/thesofakillers/nlgoals/assets/26286291/e6849007-11f0-44e9-9d66-3b2b575492fc
"go to the key"
https://github.com/thesofakillers/nlgoals/assets/26286291/4d8711f4-e53a-4c7f-bd2d-9fb5a2ee1bdb
[1] This annoying first step is necessary because some of the packages we depend on, such as pyhash and multicoretsne, do not declare their own dependencies properly. See this issue and this issue. These two packages are the main culprits, but other packages have not been very good at defining their dependencies either.