Flatland Challenge
Publishing the Solutions
Almost 5 years ago

All submissions on the leaderboard are valid, as they passed the official submission process.
Now we are only checking that there are no licensing issues or cheats in the code. The order on the leaderboard will not change, except if we have to disqualify solutions for cheating.
Hope this clarifies your questions.
Best regards,
Erik
Competition has ended
Almost 5 years ago

The Flatland Competition has finished!
Thank you for all the great submissions and contributions!
Dear Participants,
After a lot of hard work, many bug fixes, submissions, and amazing solutions, the Flatland Challenge has come to an end. We would like to thank all of you for the great contributions, input, enthusiasm, time, and energy you spent on this challenge.
We are currently looking into the leaderboard solutions and will contact and announce the winners of the competition as soon as possible.
What's next?
After reviewing all solutions, selected participants will be invited to attend AMLD2020, where results from this challenge will be presented. Furthermore, we will prepare a publication containing some of the solutions provided through the challenge (we will reach out to participants in the coming days).
At the AMLD2020 we will also give a larger update about Flatland and what is coming next.
Thank you again for your great contributions and stay tuned…
Best regards,
The Flatland Team
Publishing the Solutions
Almost 5 years ago

We are currently evaluating the top submissions of the challenge. We are planning to publish the results of the challenge together with the top participants.
As soon as we have verified that the participants' solutions are valid, we will reach out to them and set up the publishing process together with them.
At the Applied Machine Learning Days we will also present the results together with the invited participants.
So stay tuned for further updates on this.
Best regards,
The Flatland Team
Is this an error in the local evaluation envs?
Almost 5 years ago

This indeed looks like a bug. You do not need to handle these kinds of situations. Did you ever observe an agent being placed on such a rail segment? We will look into this bug and fix it.
Thanks for bringing this to our attention.
Best regards,
The Flatland Team
Further questions regarding submissions
Almost 5 years ago

The mean percentage of agents done is calculated as you expected: the number of arrived trains divided by the total number of trains, and then we take the mean over all episodes.
If we instead calculated a single mean over all agents across all episodes, the mean would be biased towards the results of the larger environments. In the current setting the bias is towards smaller environments, where a few agents have more influence on the mean of agents finished.
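The scoring rule described above can be sketched in a few lines. This is an illustrative reimplementation, not the official evaluator; the function name and the `(arrived, total)` tuple shape are assumptions made for the example.

```python
# Sketch of the scoring rule described above (illustrative, not the
# official evaluator): each episode contributes the fraction of its
# own trains that arrived, and episodes are averaged with equal
# weight regardless of how many trains they contain.

def mean_percentage_done(episodes):
    """episodes: list of (num_arrived, num_trains) tuples, one per episode."""
    fractions = [arrived / total for arrived, total in episodes]
    return sum(fractions) / len(fractions)

# A 2-train episode and a 100-train episode count equally, so a single
# stuck train in a small episode moves the mean much more:
print(mean_percentage_done([(1, 2), (100, 100)]))  # 0.75
```

This makes the bias concrete: losing one train out of two costs as much as losing fifty out of a hundred.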
Hope this answers your question.
Best regards,
Erik
Fraction of done-agents
Almost 5 years ago

Hi @eli_meirom
The time step limit is per episode. This means it is the total number of environment steps you can take before the environment terminates.
Currently we don't put stricter schedules on the individual trains, so each individual train can run whenever best suited within the total time window.
Hope this answers your questions.
Have fun with the Challenge!
The Flatland Team
Flatland documentation / code inconsistency
Almost 5 years ago

Hi @eli_meirom
Thank you for pointing this out. We only realized this ourselves this week. A workaround is to check out an earlier version of the example. But we will also update the example to be compatible with the earlier version.
Sorry for the caused inconvenience…
[ANNOUNCEMENT] Submission working for Round 2
Almost 5 years ago

Looking at the generated files, we use the default value of 20 agents for the files used for submission scoring. I will, however, still check how to make this more precise in the future. Does this help you with your submissions?
Best regards,
Erik
[ANNOUNCEMENT] Submission working for Round 2
Almost 5 years ago

Sorry for the delay. I will look into this now and will let you know if I can provide a fix. The levels are currently stored in pickle files and don't contain the information about the number of cities. Maybe we will have to regenerate the files with this updated information. I will let you know as soon as I have a fix.
Best regards,
Erik
Spawning of agents seems to work differently in the second step
Almost 5 years ago

This will highly depend on the ordering of the trains.
This is a design choice, and there would be other solutions to this problem. In the case of Flatland, trains can drive behind each other as long as the index of the train in front is lower than the index of the train behind it. For example:
Train 2 can follow train 1 directly, but train 1 CANNOT follow train 2 directly.
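The index rule above can be written as a one-line predicate. This is an illustrative formulation, not the actual Flatland source; the function name is made up for the example.

```python
# Illustrative predicate for the ordering rule above (not the actual
# Flatland source): a train may move directly into the cell that a
# neighbouring train is vacating only if the train in front has the
# lower index, because actions are resolved in increasing-ID order.

def can_follow_directly(front_id: int, behind_id: int) -> bool:
    return front_id < behind_id

print(can_follow_directly(front_id=1, behind_id=2))  # True: train 2 may follow train 1
print(can_follow_directly(front_id=2, behind_id=1))  # False: train 1 may not follow train 2
```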
Does this help?
Which log are you referring to? I'm happy to take a look at it.
Best regards
Erik
Spawning of agents seems to work differently in the second step
Almost 5 years ago

Hi @Lemutar
The check is actually done exactly the same way in all cases. We do not differentiate between spawning and movement in the environment.
Do you have an example where this differs between spawning and movement in the environment? If so, this would be very helpful for finding the bug and fixing it.
Thanks
Erik
Spawning of agents seems to work differently in the second step
Almost 5 years ago

Hi @Lemutar
Looking at the example you provided, it seems that agent 1 wants to enter an occupied cell.
This has to do with the way env.step executes the commands: it takes the agent actions and executes them serially, in order of increasing agent ID.
This means that agent 1 tries to enter cell (6,9), which is still occupied by agent 2, who has not yet moved to cell (6,10). Because moving into an occupied cell is an illegal action, the env does not execute the action and agent 1 stays outside the env, ready to depart.
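The serial resolution described above can be sketched as follows. This is a minimal toy model, not the actual `env.step` implementation; the dictionary shapes and the use of `None` for "outside the env" are assumptions made for the example.

```python
# Minimal sketch of the serial action resolution described above
# (illustrative, not the actual env.step implementation): actions are
# applied in increasing agent ID, and a move into a still-occupied
# cell is rejected, leaving the agent where it was.

def step(positions, targets):
    """positions: agent_id -> current cell (None = outside the env);
    targets: agent_id -> desired cell."""
    occupied = set(positions.values())
    for agent_id in sorted(positions):          # increasing agent ID
        dest = targets[agent_id]
        if dest not in occupied:                # cell is free: move
            occupied.discard(positions[agent_id])
            occupied.add(dest)
            positions[agent_id] = dest
        # else: illegal move, the agent keeps its position
    return positions

# Agent 1 wants (6, 9), still held by agent 2, who will move to (6, 10):
out = step({1: None, 2: (6, 9)}, {1: (6, 9), 2: (6, 10)})
print(out)  # {1: None, 2: (6, 10)}: agent 1's move was rejected
```

Because agent 1 is resolved first, it sees (6, 9) as occupied and stays outside, even though agent 2 vacates that cell in the same step.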
Hope this clarifies your question.
Best regards,
Erik
Training Related Questions
About 5 years ago

Additional information:
You can use git-lfs to store large model files.
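A typical Git LFS setup looks like the following. The commands are from the standard Git LFS CLI; the file patterns are examples, not challenge requirements.

```shell
# Typical Git LFS setup for tracking large model checkpoints
# (the file patterns here are examples, not challenge requirements):
git lfs install                 # one-time per machine
git lfs track "*.pt" "*.ckpt"   # writes the patterns to .gitattributes
git add .gitattributes model.pt
git commit -m "Store model weights via Git LFS"
```

Committing `.gitattributes` is what makes the tracking take effect for everyone cloning the repository.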
Training Related Questions
About 5 years ago

Hi @leventkent
Welcome to the Challenge
Here are the answers to your questions:
- Given the limited time of each submission, we recommend that you submit already trained models.
- In the current round there are actually only 250 envs that you have to solve. They all have different maps and different schedules for the individual agents.
- They have to reside within the repository.
Hope this helps you get started with the challenge, have fun.
The Flatland Team
Tree Observation and Normalisation
About 5 years ago

For the stock observation builder the reasoning is the following:
- Negative values indicate no path is possible
- Values smaller than infinity indicate something has been observed
- Values equal to infinity mean it was not observed here
Then with the normalization we set a certain range of values that we care about. If values are larger than a certain threshold we don't differentiate between them anymore, and thus all of them collapse to 1. The reasoning here is that the exact values for objects far away are irrelevant.
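The clipping idea described above can be sketched like this. It is an illustrative normaliser, not the stock Flatland function; the function name and the `max_depth` threshold are assumptions made for the example.

```python
import math

# Sketch of the normalisation idea described above (illustrative, not
# the stock Flatland observation normaliser): values are clipped to a
# range of interest, everything beyond it collapses to 1, and the
# remaining values are scaled into [0, 1]. Negative values (no path)
# and infinity (not observed) are handled before scaling.

def normalize(value, max_depth=10.0):
    if value < 0:                 # no path possible
        return -1.0
    if math.isinf(value):         # not observed at this node
        return 1.0
    return min(value, max_depth) / max_depth  # clip, then scale to [0, 1]

print(normalize(3.0))        # 0.3
print(normalize(50.0))       # 1.0: far-away objects all collapse to 1
print(normalize(math.inf))   # 1.0
print(normalize(-1.0))       # -1.0
```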
We highly suggest you use your own custom observation builder and normalization function. The stock functions are just meant as examples for you to build upon.
Hope this clarifies your questions.
Have fun with the Challenge
Best regards,
Erik
[ANNOUNCEMENT] Submission working for Round 2
About 5 years ago

The agents' done=True is necessary for training, to indicate that the episode terminated and thus your ML approach should not expect a next observation anymore.
The returned reward reflects the current state of the environment. Thus, if not all agents have reached their target, the reward equals the step reward of each agent. If you need a more negative reward for agents not completing their task in time, you could do reward shaping using the env information: env.agent.status will tell you whether or not the agent has finished its individual task.
Looking at the code, I see that if you continue the environment beyond the time at which it terminated, it will return the positive reward to all agents. This is a bug on our side and we will fix it.
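The reward-shaping idea mentioned above could be sketched as follows. The dict shapes, function name, and penalty value are assumptions for the example, not the exact Flatland API; in practice the done flags would come from each agent's status in the env.

```python
# Hedged sketch of the reward-shaping idea above (the dict shapes and
# penalty value are assumptions, not the exact Flatland API): after the
# episode terminates, add an extra penalty for every agent that never
# reached its target.

def shape_rewards(rewards, agent_done, penalty=-10.0):
    """rewards: agent_id -> final env reward; agent_done: agent_id -> bool."""
    return {
        agent_id: r + (0.0 if agent_done[agent_id] else penalty)
        for agent_id, r in rewards.items()
    }

shaped = shape_rewards({0: 1.0, 1: -1.0}, {0: True, 1: False})
print(shaped)  # {0: 1.0, 1: -11.0}
```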
Hope this clarifies your question.
Best regards,
Erik
[ANNOUNCEMENT] Submission working for Round 2
About 5 years ago

The formula you mentioned is actually correct. There is a problem when loading from files, as we currently don't store the number of cities in the pickle file. Thus it is currently impossible for you to compute the appropriate max_time_steps for a pickle file without the associated generator parameters.
I will open an issue about this on GitLab and address it in the coming days. Sorry for the caused inconvenience.
Best regards,
Erik
Computation budget
About 5 years ago

Thank you for pointing this out to us. We are sorry for the confusion we caused and the lack of clear communication around the changes to malfunctions.
I will try to shed some light on our thought process to explain the current malfunction behavior.
Agents malfunctioning before they enter the environment are equivalent to delayed departures of trains, which are quite common in real life, and thus we need to model them in Flatland as well. Updating the malfunction variables before the trains enter the environment is done to simplify the task. If we were to update the malfunction data only at the moment a train enters the environment, there would be no room for alternative solutions. By giving the controller access to delayed departures we hoped to allow for more flexibility.
In the old malfunction model this was not necessary, as we already gave information about when the next malfunction would occur, and thus planning in advance of the events was easier.
This environment is growing with important input from the community such as yours. We apologize for the inconvenience that some of our changes cause along the way, but we truly believe that with all your great feedback and effort we have come quite far with Flatland.
We hope you all enjoy this challenging task as much as we do and help us develop the necessary tools to make public transport even better in the future.
Best regards,
The Flatland Team
Computation budget
About 5 years ago

Your observation about the malfunction rate was correct. However, we introduced a different method to generate malfunctions from malfunction_generators. Thus, just like next_malfunction, we don't use malfunction_rate anymore for malfunctions.
In this new version any agent can break, so there is no point in leaving agents behind anymore.
For fairness, all submissions on the leaderboard will be re-evaluated with these changes. This is also the reason why some variables are still present: to keep code from crashing.
Yes, when you update to the newest version of Flatland the new malfunction generators are in use. The FAQ has also been updated to include these changes, as well as to explain the range of parameters and how to generate test files locally.
Your last assumption is wrong. They are not disjoint: the rate of malfunctions is low, but it can happen that multiple malfunctions occur at the same time. The malfunctions are independent Poisson processes.
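Independent Poisson processes of this kind can be simulated by drawing exponentially distributed inter-arrival times per agent. This is an illustrative sketch; the function name and the rate/horizon values are assumptions, not the actual Flatland malfunction generator.

```python
import random

# Sketch of independent Poisson malfunction processes as described
# above (illustrative parameters, not the actual Flatland generator):
# each agent draws exponentially distributed gaps between breakdowns,
# so several agents can be broken at the same time even at a low rate.

def malfunction_steps(rate, horizon, rng):
    """Time steps at which one agent breaks down, up to `horizon`."""
    steps, t = [], 0.0
    while True:
        t += rng.expovariate(rate)   # exponential gaps <=> Poisson process
        if t >= horizon:
            return steps
        steps.append(int(t))

rng = random.Random(42)
per_agent = [malfunction_steps(rate=0.01, horizon=500, rng=rng) for _ in range(5)]
print(per_agent)  # independent breakdown times, possibly overlapping
```

Because each agent's process is independent, nothing prevents two lists from containing the same time step, which is exactly the "multiple simultaneous malfunctions" case described above.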
We are aware that this is a challenging task to solve, but that is unfortunately the reality of railway traffic, where such problems have to be solved in real time, where we can't foresee when the next disturbance will appear, and how many we will have at the same time. Thus the objective is to keep traffic as stable as possible even in the presence of disturbances.
Hope this clarifies your issues.
Best regards,
Erik
Competition has ended
Almost 5 years ago

Hi @mugurelionut
We are happy to hear that you enjoyed the competition and that there are still many unexplored ideas.
Yes, we would like to make a formal publication, and yes, the solution authors will be co-authors as well. As soon as we have started the process we will reach out to all participants to be included in the publication.
In the coming days we will reach out to all top participants on the leaderboard to discuss the next steps.
Hope this helps for now.
Best regards,
Erik