
We are excited to announce that the MineRL Diamond challenge has been selected for the NeurIPS 2021 competition track!

🕵️ Introduction

The MineRL 2021 Diamond Competition aims to foster the development of algorithms which can efficiently leverage human demonstrations to drastically reduce the number of samples needed to solve complex, hierarchical, and sparse environments. 

To that end, participants will compete to develop systems that can obtain a diamond in Minecraft from raw pixels using only 8,000,000 samples from the MineRL simulator and 4 days of training on a single GPU machine. Participants will be provided the MineRL-v0 dataset (website, paper), a large-scale collection of over 60 million frames of human demonstrations, enabling them to utilize expert trajectories to minimize their algorithm’s interactions with the Minecraft simulator. 

🆕 What's New!?

This competition is the third iteration of the MineRL Competition, and we have introduced several changes.

The major difference is the inclusion of two tracks: Intro and Research. The Research track follows the setup of the MineRL 2020 competition, where agents are retrained using submitted code, must use the obfuscated environments, and cannot contain hardcoding. The Intro track loosens these rules by using the non-obfuscated environments and allowing hardcoded behaviour.

💎 Task



The task of the competition is solving the MineRLObtainDiamond-v0 environment (or MineRLObtainDiamondVectorObf-v0 for Research track). In this environment, the agent begins in a random starting location without any items, and is tasked with obtaining a diamond. This task can only be accomplished by navigating the complex item hierarchy of Minecraft. 
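MineRL environments expose the standard OpenAI Gym interface, so interacting with the simulator follows the usual reset/step loop. The sketch below shows a minimal random-agent rollout; the `rollout` helper is a hypothetical name and works with any Gym-style environment, while creating the real environment requires the `minerl` package (shown only in the comment):

```python
# Creating the real environment requires the minerl package:
#   import gym
#   import minerl  # registers the MineRL environments with gym
#   env = gym.make("MineRLObtainDiamond-v0")

def rollout(env, max_steps=18000):
    """Run one episode with random actions and return the total reward.

    max_steps is a safety cap on episode length; replace
    env.action_space.sample() with your agent's policy when
    evaluating a trained model.
    """
    obs = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = env.action_space.sample()
        obs, reward, done, info = env.step(action)
        total_reward += reward
        if done:
            break
    return total_reward
```

Observations are dictionaries whose `pov` entry holds the raw RGB frame from which the agent must learn.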

[Figure: the Minecraft item hierarchy leading to a diamond]

The agent receives a high reward for obtaining a diamond as well as smaller, auxiliary rewards for obtaining prerequisite items. In addition to the main environment, we provide a number of auxiliary environments. These consist of tasks which are either subtasks of ObtainDiamond or other tasks within Minecraft.
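Concretely, the milestone rewards documented for MineRLObtainDiamond-v0 are granted the first time each prerequisite item is obtained. The snippet below lists them (values taken from the MineRL environment documentation) and computes the score an episode earns from the items it collected:

```python
# Milestone rewards in MineRLObtainDiamond-v0, each granted once
# the first time the corresponding item is obtained.
MILESTONE_REWARDS = {
    "log": 1, "planks": 2, "stick": 4, "crafting_table": 4,
    "wooden_pickaxe": 8, "cobblestone": 16, "furnace": 32,
    "stone_pickaxe": 32, "iron_ore": 64, "iron_ingot": 128,
    "iron_pickaxe": 256, "diamond": 1024,
}

def episode_score(items_obtained):
    """Total reward for an episode, given the items first obtained."""
    return sum(MILESTONE_REWARDS[item] for item in items_obtained)

# Obtaining every item up to and including the diamond yields the
# maximum episode score of 1571.
```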

📜 Rules

The primary aim of the competition is to develop sample-efficient training algorithms. Therefore, the Research track discourages using environment-specific, hand-engineered features that do not demonstrate fundamental algorithmic improvements. To encourage more participation, the Intro track does not set such strict rules and focuses on obtaining the diamond by any means necessary.

Specific rules can be found in the "Rules" tab on this page. Participants must agree to those rules prior to participating.

🖊 Evaluation

A submission’s score is the average total reward across all of its evaluation episodes.
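In code, the scoring rule is simply the mean of per-episode returns (a trivial sketch; `episode_rewards` is a hypothetical name for the list of total rewards the evaluator records):

```python
def submission_score(episode_rewards):
    """Score = average total reward across all evaluation episodes."""
    if not episode_rewards:
        raise ValueError("need at least one evaluation episode")
    return sum(episode_rewards) / len(episode_rewards)
```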

Intro Track

Evaluation is performed in the MineRLObtainDiamond-v0 environment.

During Round 1, submissions will be evaluated as they are received, and the resulting score will be added to the leaderboard.

Research Track

Evaluation is performed in the MineRLObtainDiamondVectorObf-v0 environment.

During Round 1, submissions will be evaluated as they are received, and the resulting score will be added to the leaderboard. At the end of the round, competitors’ submissions will be retrained, and teams whose scores are significantly lower after retraining will be dropped from the Round 1 leaderboard.

During Round 2, teams can make a number of submissions, each of which will be re-trained and evaluated as they are received. Each team’s leaderboard position is determined by the maximum score across its submissions in Round 2.

📁 Competition Structure

Round 1: General Entry

In this round, teams of up to 6 individuals will do the following:

  1. Register on the AIcrowd competition website and receive the materials listed below. Optionally, form a team using the ‘Create Team’ button on the competition overview (you must be signed in to create a team).
    1. Starter code for running the environments for the competition task. Also see the "Notebooks" section here on AICrowd.
    2. Baseline implementations provided by the competition organizers.
    3. The human demonstration dataset.
    4. Docker Images and a quick-start template that the competition organizers will use to validate the training performance of the competitor’s models.
    5. Scripts for provisioning the standard cloud-compute system used to evaluate the sample efficiency of participants’ submissions.
  2. Use the provided human demonstrations to develop and test procedures for efficiently training models to solve the competition task.
  3. Code and train their agents:
    1. Intro track: code and/or train their models to solve the MineRLObtainDiamond-v0 environment.
    2. Research track: train their models to solve the MineRLObtainDiamondVectorObf-v0 environment using only 8,000,000 environment samples and less than four days of compute. The submission template provides scripts for training and evaluating the agent (the same scripts are used on the evaluation server).
  4. Submit their trained models for evaluation when satisfied with them. The automated evaluation setup will run each submission in the validation environment and report its score on the competition leaderboard.
  5. Repeat 2-4 until Round 1 is complete!
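To stay within the Research track’s 8,000,000-sample budget during local training, one simple approach is a wrapper that counts every simulator step. This is a sketch under the Gym API with hypothetical names; the official submission template enforces the limit on the evaluation server:

```python
class SampleBudgetWrapper:
    """Wrap a Gym-style environment and stop once the step budget is spent."""

    def __init__(self, env, budget=8_000_000):
        self.env = env
        self.budget = budget
        self.steps_used = 0

    def reset(self):
        return self.env.reset()

    def step(self, action):
        # Refuse to draw more simulator samples than the budget allows.
        if self.steps_used >= self.budget:
            raise RuntimeError("8,000,000-sample training budget exhausted")
        self.steps_used += 1
        return self.env.step(action)
```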

Once Round 1 is complete, the organizers will:

  • Examine the code repositories of the top submissions on the leaderboard to ensure compliance with the competition rules.
  • Research track
    • The top submissions which comply with the competition rules will then automatically be re-trained by the competition orchestration platform.
    • Evaluate the resulting models again over several hundred episodes to determine the final ranking.
    • Fork the code repositories associated with the corresponding submissions, and scrub them of any files larger than 30MB to ensure that participants are not using any pre-trained models in the subsequent round.


Round 2

In this round, the top 15 performing teams from the Research track will continue to develop their algorithms. Their work will be evaluated on a confidential, held-out test environment and test dataset to which they will not have access.

Participants will be able to make a limited number of submissions (exact limit TBA) during Round 2. For each submission, the automated evaluator will train their procedure on the held-out test dataset and simulator, evaluate the trained model, and report the score and metrics back to the participants. The final ranking for this round will be based on each team’s best-performing submission during Round 2. Participants may also make several "debug" submissions per day to validate their code on the evaluation server before making a full submission.

💵 Prizes and Funding Opportunities

To be determined. The Research track will have larger prizes than the Intro track!

We are currently in discussion with potential sponsors. We are open to accepting additional sponsors; if interested, please contact smilani@cs.cmu.edu.

📅 Timeline 

  • 9th June: Round 1 begins and submission system opens.
  • 30th September: Round 1 ends (Intro track ends). Round 2 begins for the research track.
  • 26th October: Round 2 ends. Submissions are validated for compliance with rules.
  • December: Winners are invited to NeurIPS 2021 to present their results.

💪 Getting Started

You can find the competition submission starter kit on GitHub here.

Here are some additional resources!

🙋 F.A.Q.

This F.A.Q. is the only official place for clarifications of the competition rules!

Q: Do I need to purchase Minecraft to participate?

 > A: No! MineRL includes a special version of Minecraft provided generously by the folks at Microsoft Research via Project Malmo.

We will be updating the FAQ soon!

Have more questions? Ask on Discord or on the forum.

🤝 Partners

Thank you to our amazing partners!

Carnegie Mellon University and Microsoft

👥 Team

The organizing team consists of:

  • William H. Guss (OpenAI and Carnegie Mellon University)
  • Alara Dirik (Boğaziçi University)
  • Byron V. Galbraith (Talla)
  • Brandon Houghton (OpenAI)
  • Anssi Kanervisto (University of Eastern Finland)
  • Noboru Sean Kuno (Microsoft Research)
  • Stephanie Milani (Carnegie Mellon University)
  • Sharada Mohanty (AIcrowd)
  • Karolis Ramanauskas
  • Ruslan Salakhutdinov (Carnegie Mellon University)
  • Rohin Shah (UC Berkeley)
  • Nicholay Topin (Carnegie Mellon University)
  • Steven H. Wang (UC Berkeley)
  • Cody Wild (UC Berkeley)

The advisory committee consists of:

  • Sam Devlin (Microsoft Research)
  • Chelsea Finn (Stanford University and Google Brain)
  • David Ha (Google Brain)
  • Katja Hofmann (Microsoft Research)
  • Sergey Levine (UC Berkeley)
  • Zachary Chase Lipton (Carnegie Mellon University)
  • Manuela Veloso (Carnegie Mellon University and JPMorgan Chase)
  • Oriol Vinyals (DeepMind) 

📱 Contact

If you have any questions, please feel free to contact us on Discord or through the AIcrowd forum.



