This is a simple notebook you can run on Colab to test submissions and train your own baselines.
%%capture
# Install NLE [~3 mins]
!pip install -U cmake
!apt update -qq && apt install -qq -y flex bison libbz2-dev libglib2.0 libsm6 libxext6 git-lfs
!pip install -U pip
!pip install nle torch wandb
%%capture
# Clone Repos (TorchBeast & Starter Kit) [~3 mins]
!git lfs install
!git clone https://github.com/condnsdmatters/torchbeast.git --recursive
!git clone http://gitlab.aicrowd.com/nethack/neurips-2021-the-nethack-challenge.git
%%capture
# Install TorchBeast [~17 mins]
%env CMAKE_MAX_PARALLEL=20
!cd torchbeast \
&& pip install -r requirements.txt \
&& pip install ./nest \
&& python setup.py install
%%capture
# Install StarterKit [~2 mins]
!cd neurips-2021-the-nethack-challenge && pip install -r requirements.txt && git lfs pull
Testing Submissions
The starter kit comes with pretrained models stored in saved_models/torchbeast/*, and it is set up to submit one of these models by default.
By default the starter kit is configured (in submission_config.py) to:
- submit a TorchBeastAgent (defined in agents/torchbeast_agent.py)
- ... by loading MODEL_DIR=saved_models/torchbeast/pretrained_0.5B

When training your own TorchBeast model, simply set MODEL_DIR to point to the output directory where TorchBeast has saved your model.
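As a concrete illustration, the switch is just a path constant. A hypothetical sketch follows: the name MODEL_DIR and the path come from the text above, while the sanity-check helper is invented for this example and is not part of the real submission_config.py.

```python
# Hypothetical sketch of pointing the submission at a model directory.
# MODEL_DIR is the name used by the starter kit; the helper below is
# purely illustrative.
from pathlib import Path

MODEL_DIR = "saved_models/torchbeast/pretrained_0.5B"


def model_dir_exists(model_dir: str) -> bool:
    """Sanity-check that the directory MODEL_DIR points at exists."""
    return Path(model_dir).is_dir()
```

After training, swapping in your own run is a one-line change to MODEL_DIR.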
In the meantime you can test submissions by running:
# %cd (rather than !cd) persists the working directory for later cells
%cd neurips-2021-the-nethack-challenge/
!python test_submission.py
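The submitted agent handles observations from several parallel environments at once. The toy stand-in below illustrates that batched shape: one observation in per environment, one action out per environment. The class, method, and argument names here are assumptions for illustration, not the starter kit's exact TorchBeastAgent API.

```python
# Toy stand-in for a batched agent: batched_step receives one observation
# per parallel environment and must return one action index per environment.
# This random policy is illustrative only -- the real agent loads a trained
# TorchBeast model from MODEL_DIR.
import random


class RandomBatchedAgent:
    def __init__(self, num_envs: int, num_actions: int):
        self.num_envs = num_envs
        self.num_actions = num_actions

    def batched_step(self, observations, rewards, dones, infos):
        """Return one action index per environment."""
        return [random.randrange(self.num_actions) for _ in observations]


agent = RandomBatchedAgent(num_envs=4, num_actions=23)
actions = agent.batched_step([None] * 4, [0.0] * 4, [False] * 4, [{}] * 4)
```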
Training Your Own Agent
You can train your agent however you wish, but a standard IMPALA setup is provided in nethack_baselines/torchbeast/ to help you get started.
Files
The best place to start is the README.md, which has info and suggestions on how to improve the model. However, a very brief overview of the key files is:
- config.yaml - specifies the main flags used for the agent and training (accessible in the variable flags).
- models/baseline.py - specifies the model in use.
- polybeast_learner.py - specifies the learning step used for training.
- polybeast_env.py - specifies the environments used for training.
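To make the config-to-flags relationship concrete, here is a stdlib-only toy showing how flat yaml-style keys could become attributes on a `flags` object. The keys shown are illustrative assumptions, not the real contents of config.yaml, and the real baseline does this parsing for you rather than with hand-rolled code like this.

```python
# Toy, stdlib-only sketch: flat "key: value" config text -> a flags object
# whose attributes mirror the keys. Keys and values are made up.
from types import SimpleNamespace


def _coerce(value: str):
    """Crudely coerce yaml-style scalars: bool, int, float, else string."""
    low = value.lower()
    if low in ("true", "false"):
        return low == "true"
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            pass
    return value


def load_flat_config(text: str) -> SimpleNamespace:
    flags = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        key, _, value = line.partition(":")
        flags[key.strip()] = _coerce(value.strip())
    return SimpleNamespace(**flags)


config_text = """
total_steps: 10000   # illustrative keys only
num_actors: 128
wandb: false
"""
flags = load_flat_config(config_text)
```

Code in the training scripts can then read settings as `flags.num_actors`, matching how the baseline exposes its configuration.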
Logging
When training, it can be helpful to keep artefacts and logs of useful data. While TorchBeast logs to stdout (and some files) by default, there is also a supported integration with Weights and Biases! We suggest setting this up with the following step:
# Log in to Weights & Biases: this short init/finish cycle prompts for
# your API key on first run and caches the credentials for training runs.
import wandb

run = wandb.init()
run.finish()
Training
Now you can easily get to training! The command is simple.
WARNING: If you wish to end your training run, use ⌘/Ctrl + M I (Interrupt Execution); Colab struggles to complete cleanup on PolyBeast runs.
!python nethack_baselines/torchbeast/polyhydra.py total_steps=10000 num_actors=128 # Set other arguments here, eg wandb=True entity=<wandb-username>
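Those key=value arguments override the defaults from config.yaml before training starts. The stdlib-only toy below shows that merge semantics; the default values and flag names are made up for the example, and in the real run Hydra performs this merging (plus validation and nesting) for polyhydra.py.

```python
# Toy illustration of key=value command-line overrides updating defaults.
# The defaults below are invented for this example.
defaults = {"total_steps": 1_000_000, "num_actors": 256, "wandb": False}


def apply_overrides(defaults, overrides):
    flags = dict(defaults)
    for item in overrides:
        key, _, value = item.partition("=")
        old = flags.get(key)
        # coerce the string to the type of the default it replaces
        # (check bool before int: isinstance(True, int) is True in Python)
        if isinstance(old, bool):
            flags[key] = value.lower() == "true"
        elif isinstance(old, int):
            flags[key] = int(value)
        else:
            flags[key] = value
    return flags


flags = apply_overrides(defaults, ["total_steps=10000", "num_actors=128"])
```

Anything not overridden (here, wandb) keeps its default, so short commands like the one above are enough for quick experiments.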