NeurIPS 2023 - The Neural MMO Challenge

Colab Starter Kit

Train the baseline in Colab

joseph_suarez
In [3]:
# Set up the work directory
import os
assert os.path.exists("/content/drive/MyDrive"), "Google Drive not mounted"

work_dir = "/content/drive/MyDrive/nmmo/"
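The assertion above only passes if Google Drive is already mounted. A minimal sketch of mounting it first, assuming the notebook runs in Colab where `google.colab.drive` is available; the helper name `ensure_drive_mounted` is hypothetical and not part of the starter kit:

```python
import os


def ensure_drive_mounted(mount_point="/content/drive"):
    """Return True if Drive is visible under mount_point, mounting it if possible.

    Hypothetical helper for illustration; it returns False outside Colab,
    where the google.colab package cannot be imported.
    """
    my_drive = os.path.join(mount_point, "MyDrive")
    if os.path.exists(my_drive):
        return True
    try:
        # Only importable inside a Colab runtime.
        from google.colab import drive
    except ImportError:
        return False
    # Opens the Colab authorization prompt, then mounts Drive.
    drive.mount(mount_point)
    return os.path.exists(my_drive)


print("Drive mounted:", ensure_drive_mounted())
```

Running this before the work directory cell lets the assertion act as a final check instead of the first point of failure.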
In [4]:
# Install the nmmo environment and pufferlib, pinned to the versions this notebook was run with
!pip install nmmo==2.0.3 pufferlib==0.4.3 > /dev/null
!pip show nmmo  # should be 2.0.3
!pip show pufferlib  # should be 0.4.3
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
lida 0.0.10 requires fastapi, which is not installed.
lida 0.0.10 requires kaleido, which is not installed.
lida 0.0.10 requires python-multipart, which is not installed.
lida 0.0.10 requires uvicorn, which is not installed.
Name: nmmo
Version: 2.0.3
Summary: Neural MMO is a platform for multiagent intelligence research inspired by Massively Multiplayer Online (MMO) role-playing games. Documentation hosted at neuralmmo.github.io.
Home-page: https://github.com/neuralmmo/environment
Author: Joseph Suarez
Author-email: jsuarez@mit.edu
License: MIT
Location: /usr/local/lib/python3.10/dist-packages
Requires: autobahn, dill, gym, imageio, numpy, ordered-set, pettingzoo, psutil, py, pylint, pytest, pytest-benchmark, scipy, tqdm, Twisted, vec-noise
Required-by: 
Name: pufferlib
Version: 0.4.3
Summary: PufferAI LibraryPufferAI's library of RL tools and utilities
Home-page: https://github.com/PufferAI/PufferLib
Author: Joseph Suarez
Author-email: jsuarez@mit.edu
License: MIT
Location: /usr/local/lib/python3.10/dist-packages
Requires: cython, filelock, gym, numpy, opencv-python, openskill, pettingzoo
Required-by: 
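Reading the versions out of the `pip show` output by eye is easy to get wrong. The same check can be sketched in Python with the standard library's `importlib.metadata`; the expected versions come from the comments in the install cell, and the helper name `find_version_mismatches` is an assumption made for illustration:

```python
from importlib.metadata import PackageNotFoundError, version

# Expected versions, taken from the comments in the install cell.
EXPECTED = {"nmmo": "2.0.3", "pufferlib": "0.4.3"}


def find_version_mismatches(expected):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


for name, (wanted, installed) in find_version_mismatches(EXPECTED).items():
    print(f"{name}: expected {wanted}, found {installed or 'not installed'}")
```

Nothing is printed when both packages match, so any output from this cell signals a version problem before training starts.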
In [5]:
# Create the work directory, download the baselines code
%mkdir -p $work_dir
%cd $work_dir
!git clone https://github.com/neuralmmo/baselines --depth=1
%cd baselines

# Create a requirements_colab.txt
with open(work_dir+'baselines/requirements_colab.txt', "w") as f:
  f.write("""
accelerate==0.21.0
bitsandbytes==0.41.1
dash==2.11.1
openelm
pandas
plotly==5.15.0
psutil==5.9.3
ray==2.6.1
scikit-learn==1.3.0
tensorboard==2.11.2
tiktoken==0.4.0
torch
transformers==4.31.0
wandb==0.13.7
  """)
/content/drive/MyDrive/nmmo
Cloning into 'baselines'...
remote: Enumerating objects: 57, done.
remote: Counting objects: 100% (57/57), done.
remote: Compressing objects: 100% (50/50), done.
remote: Total 57 (delta 2), reused 33 (delta 2), pack-reused 0
Receiving objects: 100% (57/57), 23.54 MiB | 14.61 MiB/s, done.
Resolving deltas: 100% (2/2), done.
/content/drive/MyDrive/nmmo/baselines
In [6]:
# Install libs to run the baselines
!pip install -r requirements_colab.txt > /dev/null
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
llmx 0.0.15a0 requires cohere, which is not installed.
tensorflow 2.13.0 requires tensorboard<2.14,>=2.13, but you have tensorboard 2.11.2 which is incompatible.
In [7]:
# Smoke test: run training in local mode to check that the flow works
!python train.py --runs-dir $work_dir --local-mode true
INFO:root:Training run: nmmo_20231022_225107 (/content/drive/MyDrive/nmmo/nmmo_20231022_225107)
INFO:root:Training args: Namespace(attend_task='none', attentional_decode=True, bptt_horizon=8, checkpoint_interval=30, clip_coef=0.1, death_fog_tick=None, device='cuda', early_stop_agent_num=8, encode_task=True, eval_batch_size=32768, eval_mode=False, eval_num_policies=2, eval_num_rounds=1, eval_num_steps=1000000, explore_bonus_weight=0.01, extra_encoders=True, heal_bonus_weight=0.03, hidden_size=256, input_size=256, learner_weight=1.0, local_mode=True, map_size=128, maps_path='maps/train/', max_episode_length=1024, max_opponent_policies=0, meander_bonus_weight=0.02, num_agents=128, num_buffers=1, num_cores=None, num_envs=1, num_lstm_layers=0, num_maps=128, num_npcs=256, policy_store_dir=None, ppo_learning_rate=0.00015, ppo_training_batch_size=128, ppo_update_epochs=3, resilient_population=0.2, rollout_batch_size=1024, run_name='nmmo_20231022_225107', runs_dir='/content/drive/MyDrive/nmmo/', seed=1, spawn_immunity=20, sqrt_achievement_rewards=False, task_size=4096, tasks_path='reinforcement_learning/curriculum_with_embedding.pkl', track='rl', train_num_steps=10000000, use_serial_vecenv=True, wandb_entity=None, wandb_project=None)
INFO:root:Using policy store from /content/drive/MyDrive/nmmo/nmmo_20231022_225107/policy_store
INFO:root:Generating 128 maps
Allocated 93.30 MB to environments. Only accurate for Serial backend.
PolicyPool sample_weights: [128]
Allocated to storage - Pytorch: 0.00 GB, System: 0.11 GB
INFO:root:PolicyPool: Updated policies: dict_keys(['learner'])
Allocated during evaluation - Pytorch: 0.01 GB, System: 1.53 GB
Epoch: 0 - 1K steps - 0:01:20 Elapsed
	Steps Per Second: Env=759, Inference=185
	Train=430

Allocated during training - Pytorch: 0.07 GB, System: 0.24 GB
INFO:root:Saving policy to /content/drive/MyDrive/nmmo/nmmo_20231022_225107/policy_store/nmmo_20231022_225107.000001
INFO:root:PolicyPool: Updated policies: dict_keys(['learner'])
Allocated during evaluation - Pytorch: 0.00 GB, System: 0.01 GB
Epoch: 1 - 2K steps - 0:01:26 Elapsed
	Steps Per Second: Env=565, Inference=3752
	Train=610

Allocated during training - Pytorch: 0.01 GB, System: 0.03 GB
INFO:root:PolicyPool: Updated policies: dict_keys(['learner'])
Allocated during evaluation - Pytorch: 0.00 GB, System: 0.02 GB
Epoch: 2 - 3K steps - 0:01:31 Elapsed
	Steps Per Second: Env=438, Inference=3722
	Train=651

Allocated during training - Pytorch: 0.01 GB, System: 0.04 GB
INFO:root:PolicyPool: Updated policies: dict_keys(['learner'])
Allocated during evaluation - Pytorch: 0.00 GB, System: -0.04 GB
Epoch: 3 - 4K steps - 0:01:35 Elapsed
	Steps Per Second: Env=736, Inference=5234
	Train=732

Allocated during training - Pytorch: 0.01 GB, System: 0.01 GB
INFO:root:PolicyPool: Updated policies: dict_keys(['learner'])
Allocated during evaluation - Pytorch: 0.00 GB, System: 0.00 GB
Epoch: 4 - 5K steps - 0:01:40 Elapsed
	Steps Per Second: Env=438, Inference=4811
	Train=738

Allocated during training - Pytorch: 0.01 GB, System: 0.01 GB
INFO:root:PolicyPool: Updated policies: dict_keys(['learner'])
Allocated during evaluation - Pytorch: 0.00 GB, System: 0.00 GB
Epoch: 5 - 6K steps - 0:01:44 Elapsed
	Steps Per Second: Env=719, Inference=4496
	Train=637

Allocated during training - Pytorch: 0.01 GB, System: 0.00 GB
INFO:root:PolicyPool: Updated policies: dict_keys(['learner'])
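The per-epoch throughput lines in this output are easier to compare once they are parsed into records. A minimal sketch, assuming the log keeps the exact `Epoch: N - NK steps - H:MM:SS Elapsed` and `Env=…, Inference=…` / `Train=…` layout shown above; the function name `parse_training_log` is hypothetical, and the sample text is copied from the first two epochs of the run:

```python
import re

# Patterns assume the log layout printed by the run above.
EPOCH_RE = re.compile(r"Epoch: (\d+) - (\d+)K steps - (\d+):(\d+):(\d+) Elapsed")
RATE_RE = re.compile(r"(Env|Inference|Train)=(\d+)")


def parse_training_log(text):
    """Collect one dict per epoch with step count, elapsed seconds, and rates."""
    records = []
    current = None
    for line in text.splitlines():
        match = EPOCH_RE.search(line)
        if match:
            epoch, ksteps, hours, minutes, seconds = map(int, match.groups())
            current = {
                "epoch": epoch,
                "steps": ksteps * 1000,
                "elapsed_s": hours * 3600 + minutes * 60 + seconds,
            }
            records.append(current)
        elif current is not None:
            # Rate lines follow their epoch header; attach them to that epoch.
            for key, value in RATE_RE.findall(line):
                current[key.lower()] = int(value)
    return records


SAMPLE = """Epoch: 0 - 1K steps - 0:01:20 Elapsed
\tSteps Per Second: Env=759, Inference=185
\tTrain=430
Epoch: 1 - 2K steps - 0:01:26 Elapsed
\tSteps Per Second: Env=565, Inference=3752
\tTrain=610
"""

records = parse_training_log(SAMPLE)
for record in records:
    print(record)
```

In the run above, the first epoch reports a much lower inference rate (185 steps per second) than later epochs (3752 and up), so it is worth excluding epoch 0 when averaging throughput.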
