NeurIPS 2023 - The Neural MMO Challenge

Train and Submit from Colab

A step-by-step guide for setting up your Colab

kyoung_whan_choe

The "Open in Colab" button doesn't seem to work. Use this URL instead: https://colab.research.google.com/drive/1v4B5h3MANw6PWH4U4oKba5hfYK5DtlVU

Set up your instance - GPU and Google Drive

In [ ]:
# Check if (NVIDIA) GPU is available
import torch
assert torch.cuda.is_available(), "CUDA GPU not available"
In [ ]:
# Set up the work directory
import os
assert os.path.exists("/content/drive/MyDrive"), "Google Drive not mounted"

work_dir = "/content/drive/MyDrive/nmmo/"
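
The assertion above expects Google Drive to be mounted already. A minimal sketch of the mounting step, assuming the notebook runs in a Colab runtime where the `google.colab` package is available. The helper name `mount_drive` is hypothetical; outside Colab it simply returns False.

```python
import os


def mount_drive(mount_point="/content/drive"):
    """Mount Google Drive in Colab; return True if MyDrive is visible afterwards."""
    try:
        # google.colab only exists inside a Colab runtime.
        from google.colab import drive
    except ImportError:
        return False
    drive.mount(mount_point)  # asks for authorization on first use
    return os.path.exists(os.path.join(mount_point, "MyDrive"))


mounted = mount_drive()
print("Drive mounted:", mounted)
```

Run this before the work directory cell so that the `/content/drive/MyDrive` check passes.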

Train your agent

Install nmmo env and pufferlib

In [ ]:
# Install nmmo env and pufferlib
!pip install nmmo pufferlib > /dev/null
!pip show nmmo  # should be 2.0.3
!pip show pufferlib # should be 0.4.3
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
lida 0.0.10 requires fastapi, which is not installed.
lida 0.0.10 requires kaleido, which is not installed.
lida 0.0.10 requires python-multipart, which is not installed.
lida 0.0.10 requires uvicorn, which is not installed.
tensorflow 2.14.0 requires numpy>=1.23.5, but you have numpy 1.23.3 which is incompatible.
Name: nmmo
Version: 2.0.3
Summary: Neural MMO is a platform for multiagent intelligence research inspired by Massively Multiplayer Online (MMO) role-playing games. Documentation hosted at neuralmmo.github.io.
Home-page: https://github.com/neuralmmo/environment
Author: Joseph Suarez
Author-email: jsuarez@mit.edu
License: MIT
Location: /usr/local/lib/python3.10/dist-packages
Requires: autobahn, dill, gym, imageio, numpy, ordered-set, pettingzoo, psutil, py, pylint, pytest, pytest-benchmark, scipy, tqdm, Twisted, vec-noise
Required-by: 
Name: pufferlib
Version: 0.4.3
Summary: PufferAI LibraryPufferAI's library of RL tools and utilities
Home-page: https://github.com/PufferAI/PufferLib
Author: Joseph Suarez
Author-email: jsuarez@mit.edu
License: MIT
Location: /usr/local/lib/python3.10/dist-packages
Requires: cython, filelock, gym, numpy, opencv-python, openskill, pettingzoo
Required-by: 
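
Instead of reading the `pip show` output by eye, the versions can be checked in code. A minimal sketch using the standard-library `importlib.metadata`; the helper name `check_versions` is hypothetical, and the expected versions are the ones noted in the install cell above.

```python
from importlib.metadata import PackageNotFoundError, version


def check_versions(expected):
    """Return {package: installed_version_or_None} for every mismatch."""
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[package] = installed
    return mismatches


# An empty dict means both packages are at the expected versions.
print(check_versions({"nmmo": "2.0.3", "pufferlib": "0.4.3"}))
```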

Install the baselines

In [ ]:
# Create the work directory, download the baselines code
%mkdir $work_dir
%cd $work_dir
!git clone https://github.com/neuralmmo/baselines --depth=1
/content/drive/MyDrive/nmmo
Cloning into 'baselines'...
remote: Enumerating objects: 57, done.
remote: Counting objects: 100% (57/57), done.
remote: Compressing objects: 100% (50/50), done.
remote: Total 57 (delta 2), reused 33 (delta 2), pack-reused 0
Receiving objects: 100% (57/57), 23.54 MiB | 14.61 MiB/s, done.
Resolving deltas: 100% (2/2), done.
/content/drive/MyDrive/nmmo/baselines
In [ ]:
# Install libs to run the baselines
%cd $work_dir
%cd baselines

# Create a requirements_colab.txt
with open(work_dir+'baselines/requirements_colab.txt', "w") as f:
  f.write("""
accelerate==0.21.0
bitsandbytes==0.41.1
dash==2.11.1
openelm
pandas
plotly==5.15.0
psutil==5.9.3
ray==2.6.1
scikit-learn==1.3.0
tensorboard==2.11.2
tiktoken==0.4.0
torch
transformers==4.31.0
wandb==0.13.7
  """)

!pip install -r requirements_colab.txt > /dev/null
/content/drive/MyDrive/nmmo
/content/drive/MyDrive/nmmo/baselines
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
llmx 0.0.15a0 requires cohere, which is not installed.
tensorflow 2.14.0 requires numpy>=1.23.5, but you have numpy 1.23.3 which is incompatible.
tensorflow 2.14.0 requires tensorboard<2.15,>=2.14, but you have tensorboard 2.11.2 which is incompatible.

Run python train.py

In [ ]:
# Just to check if the training flow works. The checkpoints are saved under nmmo/runs
%cd $work_dir
%cd baselines

ckpt_dir = work_dir + "runs"

!python train.py --runs-dir $ckpt_dir --local-mode true
/content/drive/MyDrive/nmmo
/content/drive/MyDrive/nmmo/baselines
INFO:root:Training run: nmmo_20231023_191505 (/content/drive/MyDrive/nmmo/runs/nmmo_20231023_191505)
INFO:root:Training args: Namespace(attend_task='none', attentional_decode=True, bptt_horizon=8, checkpoint_interval=30, clip_coef=0.1, death_fog_tick=None, device='cuda', early_stop_agent_num=8, encode_task=True, eval_batch_size=32768, eval_mode=False, eval_num_policies=2, eval_num_rounds=1, eval_num_steps=1000000, explore_bonus_weight=0.01, extra_encoders=True, heal_bonus_weight=0.03, hidden_size=256, input_size=256, learner_weight=1.0, local_mode=True, map_size=128, maps_path='maps/train/', max_episode_length=1024, max_opponent_policies=0, meander_bonus_weight=0.02, num_agents=128, num_buffers=1, num_cores=None, num_envs=1, num_lstm_layers=0, num_maps=128, num_npcs=256, policy_store_dir=None, ppo_learning_rate=0.00015, ppo_training_batch_size=128, ppo_update_epochs=3, resilient_population=0.2, rollout_batch_size=1024, run_name='nmmo_20231023_191505', runs_dir='/content/drive/MyDrive/nmmo/runs', seed=1, spawn_immunity=20, sqrt_achievement_rewards=False, task_size=4096, tasks_path='reinforcement_learning/curriculum_with_embedding.pkl', track='rl', train_num_steps=10000000, use_serial_vecenv=True, wandb_entity=None, wandb_project=None)
INFO:root:Using policy store from /content/drive/MyDrive/nmmo/runs/nmmo_20231023_191505/policy_store
Allocated 93.62 MB to environments. Only accurate for Serial backend.
PolicyPool sample_weights: [128]
Traceback (most recent call last):
  File "/content/drive/MyDrive/nmmo/baselines/train.py", line 134, in <module>
    trainer = setup_env(args)
  File "/content/drive/MyDrive/nmmo/baselines/train.py", line 39, in setup_env
    trainer = clean_pufferl.CleanPuffeRL(
  File "<string>", line 33, in __init__
  File "/content/drive/MyDrive/nmmo/baselines/reinforcement_learning/clean_pufferl.py", line 182, in __post_init__
    self.optimizer = optim.Adam(
  File "/usr/local/lib/python3.10/dist-packages/torch/optim/adam.py", line 45, in __init__
    super().__init__(params, defaults)
  File "/usr/local/lib/python3.10/dist-packages/torch/optim/optimizer.py", line 266, in __init__
    self.add_param_group(cast(dict, param_group))
  File "/usr/local/lib/python3.10/dist-packages/torch/_compile.py", line 22, in inner
    import torch._dynamo
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/__init__.py", line 2, in <module>
    from . import allowed_functions, convert_frame, eval_frame, resume_execution
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/allowed_functions.py", line 24, in <module>
    from . import config
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/config.py", line 49, in <module>
    torch.onnx.is_in_onnx_export: False,
  File "/usr/local/lib/python3.10/dist-packages/torch/__init__.py", line 1831, in __getattr__
    return importlib.import_module(f".{name}", __name__)
  File "/usr/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/usr/local/lib/python3.10/dist-packages/torch/onnx/__init__.py", line 10, in <module>
    from . import (  # usort:skip. Keep the order instead of sorting lexicographically
  File "/usr/local/lib/python3.10/dist-packages/torch/onnx/symbolic_caffe2.py", line 4, in <module>
    from torch.onnx import symbolic_helper, symbolic_opset9 as opset9
  File "<frozen importlib._bootstrap>", line 1024, in _find_and_load
KeyboardInterrupt
^C
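
Each training run gets its own timestamped directory under the runs directory (for example `nmmo_20231023_191505` in the log above). A minimal sketch for locating the most recent run, assuming all run directories share the `nmmo_` prefix and timestamp format shown in the log; the helper name `latest_run_dir` is hypothetical, and the layout of checkpoint files inside a run is not assumed here.

```python
import os
import tempfile


def latest_run_dir(runs_dir, prefix="nmmo_"):
    """Return the newest run directory under runs_dir, or None if there is none."""
    if not os.path.isdir(runs_dir):
        return None
    runs = [
        name for name in os.listdir(runs_dir)
        if name.startswith(prefix) and os.path.isdir(os.path.join(runs_dir, name))
    ]
    # Timestamped names (YYYYMMDD_HHMMSS) sort chronologically as plain strings.
    return os.path.join(runs_dir, max(runs)) if runs else None


# Demo on a throwaway directory; the second run name is a made-up example.
with tempfile.TemporaryDirectory() as demo:
    for name in ("nmmo_20231023_191505", "nmmo_20231024_080000"):
        os.makedirs(os.path.join(demo, name))
    newest = os.path.basename(latest_run_dir(demo))
    print(newest)  # nmmo_20231024_080000
```

In the notebook, `latest_run_dir(ckpt_dir)` would point at the run whose policy store you want to submit.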

Submit your checkpoint

See https://gitlab.aicrowd.com/Mudou/start-kit

  • Sign up for AICrowd and click Participate on the competition page.
  • Generate your SSH key and paste it to https://gitlab.aicrowd.com/-/profile/keys
  • Clone the start-kit repository: git@gitlab.aicrowd.com:Mudou/start-kit.git. Cloning over HTTP will not work unless you have 2FA configured.
  • Install the requirements with pip install -r requirements.txt

Set up your SSH connection

In [ ]:
# Generating the ssh key with your aicrowd email
# WARNING: Storing your SSH key this way (in Google Drive) is not secure, so avoid using this key for anything else

my_email = "choe.kyoung@gmail.com"  # YOUR AICROWD EMAIL

# See the top for the work_dir, which should be in your google drive
ssh_dir = work_dir + "ssh_key/"
key_file = ssh_dir + "id_rsa"

%cd $work_dir
!mkdir $ssh_dir
!ssh-keygen -t rsa -b 4096 -C $my_email -f $key_file
/content/drive/MyDrive/nmmo
mkdir: cannot create directory ‘/content/drive/MyDrive/nmmo/ssh_key/’: File exists
Generating public/private rsa key pair.
/content/drive/MyDrive/nmmo/ssh_key/id_rsa already exists.
Overwrite (y/n)? n
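
Re-running the cell above triggers the overwrite prompt once the key exists. A minimal sketch of a guard that only builds the `ssh-keygen` command when no key is present, using the same flags as the cell above; the helper name `keygen_command` and the example email are hypothetical.

```python
import os
import tempfile


def keygen_command(key_file, email):
    """Return the ssh-keygen argument list, or None if key_file already exists."""
    if os.path.exists(key_file):
        return None  # keep the existing key; nothing to run
    return ["ssh-keygen", "-t", "rsa", "-b", "4096", "-C", email, "-f", key_file]


# Demo on a throwaway directory so no real key is touched.
with tempfile.TemporaryDirectory() as demo:
    key_path = os.path.join(demo, "id_rsa")
    first = keygen_command(key_path, "you@example.com")
    open(key_path, "w").close()  # pretend the key was generated
    second = keygen_command(key_path, "you@example.com")
    print(first is not None, second)  # True None
```

When the function returns a list, it can be passed to `subprocess.run(cmd, check=True)`; a `None` result means the existing key is reused.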
In [ ]:
# Copy the below text that starts with ssh-rsa to https://gitlab.aicrowd.com/-/profile/keys
ssh_dir = work_dir + "ssh_key/"
key_file = ssh_dir + "id_rsa"

!cat {key_file}.pub
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQDFK3GU203O48HvtFuRYpd4AW4+dszemCWfqbjRP+qiEu9vjpGbj/fHI/sa19+4wYkTNzVEBNl3vYnGUdGXhCWTjs3n7+nBAKN5T7AtD6Hbi1r0V77FZP4FQw/aiP/zu1HDYbGl1xD8OTVJe4/kKsFkbrp90F4C7q/2GlUpUccyMV7PbGp09+eN5XqYl2RLjGi0MFUO6/7AHhcal1FEeXddak9/KfRKcOmqgJUddMFlOq4P8Pf0KKOVupsXN8EFHvkRsEKvmJj2+JgGAA94qttSLcVoRablbXJLomx8yx0OasNvfxR34ZEQyZTEIDqlU9xPUNZYcPUPjtuksJ6xh/KSxQd8mrjis6np2UI8S1r8CMZVcz0eY5BoRLEUa5wGpc8cAWmfMGA5vZtsEPrAlEC0JJP60cDxQhGTR+FOz3oQnOhOmKunRthObi1HFt0wio8MgykrCdXJGAFlEhks3ZYlMirDSsAcCqPi8TxQEWlRqyYviXGJ6Z6+uzve9k+P2zDerk/glozbjGG0btW5ag1M+hN2CIt9Z+yqWqrVKpTaR4WWmiSxsz0/u8FEfjomNwQhXIGr3Le1HmzdiRHx/C177arTcqkEaMAdPiygxQX4rLVzrf/1huOkowCcBmvOOw8lWFIzKDhS14Ap4ENUAiwIFw6qRNKZymtDT8GSYD243Q== choe.kyoung@gmail.com
In [ ]:
# Copy the key to default ssh key path - you should see id_rsa
!mkdir /root/.ssh
!cp {key_file}* /root/.ssh
!ls /root/.ssh
!chmod 700 /root/.ssh

# Add the git server as a ssh known host
!touch /root/.ssh/known_hosts
!ssh-keyscan gitlab.aicrowd.com >> /root/.ssh/known_hosts
!chmod 644 /root/.ssh/known_hosts

# If the key is registered, you should see something like:
# Welcome to GitLab, @kyoung_whan_choe!
# After that you can clone the repo and submit.
!ssh -T git@gitlab.aicrowd.com
mkdir: cannot create directory ‘/root/.ssh’: File exists
id_rsa	id_rsa.pub  known_hosts
# gitlab.aicrowd.com:22 SSH-2.0-OpenSSH_7.6p1 Ubuntu-4ubuntu0.6
# gitlab.aicrowd.com:22 SSH-2.0-OpenSSH_7.6p1 Ubuntu-4ubuntu0.6
# gitlab.aicrowd.com:22 SSH-2.0-OpenSSH_7.6p1 Ubuntu-4ubuntu0.6
# gitlab.aicrowd.com:22 SSH-2.0-OpenSSH_7.6p1 Ubuntu-4ubuntu0.6
# gitlab.aicrowd.com:22 SSH-2.0-OpenSSH_7.6p1 Ubuntu-4ubuntu0.6
Welcome to GitLab, @kyoung_whan_choe!
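
If the `ssh -T` check fails with a complaint about key permissions, the copied private key is probably readable by other users, which OpenSSH rejects. A minimal sketch that restricts a key file to owner read/write; the helper name `restrict_private_key` is hypothetical, and the demo uses an empty stand-in file rather than a real key.

```python
import os
import stat
import tempfile


def restrict_private_key(path):
    """Set the file mode to 0600 and return the resulting permission bits."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
    return stat.S_IMODE(os.stat(path).st_mode)


with tempfile.TemporaryDirectory() as demo:
    fake_key = os.path.join(demo, "id_rsa")
    open(fake_key, "w").close()
    os.chmod(fake_key, 0o644)  # simulate an over-permissive copy
    mode = restrict_private_key(fake_key)
    print(oct(mode))  # 0o600
```

In the notebook this would be called as `restrict_private_key("/root/.ssh/id_rsa")`.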

Prepare for submission with the start-kit

In [ ]:
# Clone the submission kit repo
%cd $work_dir
!git clone git@gitlab.aicrowd.com:Mudou/start-kit.git
/content/drive/MyDrive/nmmo
Cloning into 'start-kit'...
remote: Enumerating objects: 179, done.
remote: Counting objects: 100% (179/179), done.
remote: Compressing objects: 100% (108/108), done.
remote: Total 179 (delta 86), reused 147 (delta 63), pack-reused 0
Receiving objects: 100% (179/179), 712.61 KiB | 4.95 MiB/s, done.
Resolving deltas: 100% (86/86), done.
Updating files: 100% (10/10), done.
fatal: cannot exec '/content/drive/MyDrive/nmmo/start-kit/.git/hooks/post-checkout': Permission denied
In [ ]:
%cd $work_dir
%cd start-kit/

# Fix permissions
!chmod +x .git/hooks/*

# Install requirements
!pip install -r requirements.txt > /dev/null
/content/drive/MyDrive/nmmo
/content/drive/MyDrive/nmmo/start-kit
In [42]:
# Edit the aicrowd.json -- INCLUDE YOUR NAME

with open(work_dir+'start-kit/aicrowd.json', "w") as f:
  f.write("""
{
    "challenge_id" : "neurips-2023-the-neural-mmo-challenge",
    "authors" : ["kyoung_whan_choe"],
    "description" : "Submitting baselines from the submission tutorial colab, take 17"
}
  """)

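Writing the file from a hand-typed string makes it easy to break the JSON with a stray quote or comma. A minimal sketch that builds the same three fields with `json.dump` instead; the helper name `write_aicrowd_json` and the placeholder author and description are hypothetical, while the challenge id is the one used in the cell above.

```python
import json
import os
import tempfile


def write_aicrowd_json(path, authors, description):
    """Write the submission metadata as valid JSON and return the dict written."""
    config = {
        "challenge_id": "neurips-2023-the-neural-mmo-challenge",
        "authors": list(authors),
        "description": description,
    }
    with open(path, "w") as f:
        json.dump(config, f, indent=4)
    return config


# Demo on a throwaway directory; replace the placeholders with your own values.
with tempfile.TemporaryDirectory() as demo:
    target = os.path.join(demo, "aicrowd.json")
    write_aicrowd_json(target, ["your_aicrowd_username"], "Baseline submission")
    with open(target) as f:
        loaded = json.load(f)
    print(loaded["authors"])  # ['your_aicrowd_username']
```

In the notebook the target path would be `work_dir + 'start-kit/aicrowd.json'`.
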
Submit!

In [ ]:
# Click the link to authenticate into aicrowd
%cd $work_dir
%cd start-kit/

!python tool.py submit "track1-submission-tutorial-17"
/content/drive/MyDrive/nmmo
/content/drive/MyDrive/nmmo/start-kit
Please make sure putting all your submission related (code, model, ...) in the my-submission folder.
Current repo size: 17MB
aicrowd_setup done.
Current authors are: ['kyoung_whan_choe']
Enter the authors (seperated by comma(,)). If no change to the authors, just press ENTER.
: 
Current authors are: ['kyoung_whan_choe']
88fa2630752476524a8ad7a6dd4f90bfdb59fcfefc0e49e3a255c76994325b2f
Making submission as "kyoung_whan_choe"
Checking git remote settings...
Using gitlab.aicrowd.com:kyoung_whan_choe/start-kit as the submission repository
Updated git hooks.
Git LFS initialized.
234234
[main 7736be5] Changes for submission-track1-submission-tutorial-17
 1 file changed, 1 insertion(+), 1 deletion(-)
Locking support detected on remote "aicrowd". Consider enabling it with:
  $ git config lfs.https://gitlab.aicrowd.com/kyoung_whan_choe/start-kit.git/info/lfs.locksverify true
Uploading LFS objects: 100% (2/2), 33 MB | 0 B/s, done.
Enumerating objects: 151, done.
Counting objects: 100% (151/151), done.
Delta compression using up to 2 threads
Compressing objects: 100% (80/80), done.
Writing objects: 100% (151/151), 707.10 KiB | 32.14 MiB/s, done.
Total 151 (delta 71), reused 142 (delta 66), pack-reused 0
remote: Resolving deltas: 100% (71/71), done.
remote: 
remote: 
remote: The private project kyoung_whan_choe/start-kit was successfully created.
remote: 
remote: To configure the remote, run:
remote:   git remote add origin git@gitlab.aicrowd.com:kyoung_whan_choe/start-kit.git
remote: 
remote: To view the project, visit:
remote:   http://gitlab.aicrowd.com/kyoung_whan_choe/start-kit
remote: 
remote: 
remote: 
To gitlab.aicrowd.com:kyoung_whan_choe/start-kit.git
 * [new branch]      main -> main
Locking support detected on remote "aicrowd". Consider enabling it with:
  $ git config lfs.https://gitlab.aicrowd.com/kyoung_whan_choe/start-kit.git/info/lfs.locksverify true
Enumerating objects: 1, done.
Counting objects: 100% (1/1), done.
Writing objects: 100% (1/1), 197 bytes | 32.00 KiB/s, done.
Total 1 (delta 0), reused 0 (delta 0), pack-reused 0
remote: 
remote:           #///(            )///#
remote:          ////      ///      ////
remote:         /////   //////////   ////
remote:         /////////////////////////
remote:      /// /////////////////////// ///
remote:    ///////////////////////////////////
remote:   /////////////////////////////////////
remote:     )////////////////////////////////(
remote:      /////                      /////
remote:    (///////   ///       ///    //////)
remote:   ///////////    ///////     //////////
remote: (///////////////////////////////////////)
remote:           /////           /////
remote:             /////////////////
remote:                ///////////
remote: 
To gitlab.aicrowd.com:kyoung_whan_choe/start-kit.git
 * [new tag]         submission-track1-submission-tutorial-17 -> submission-track1-submission-tutorial-17
Check the submission progress in your repository: https://gitlab.aicrowd.com/kyoung_whan_choe/start-kit/issues

If the submission hangs or you get a git lfs error ...

In [ ]:
# If submission hangs or you get a git lfs error, retry after running:

!git lfs fetch --all origin
!git lfs push --all aicrowd
fetch: 3 object(s) found, done.
fetch: Fetching all references...
Locking support detected on remote "aicrowd". Consider enabling it with:
  $ git config lfs.https://gitlab.aicrowd.com/kyoung_whan_choe/start-kit.git/info/lfs.locksverify true
Uploading LFS objects: 100% (2/2), 33 MB | 0 B/s, done.
