I'm Joy from Coder One. This is an early version of our upcoming Bomberman-inspired AI competition. Bomberland is an intentionally challenging environment for ML, featuring non-trivial problems like real-time decision-making, a large search space, and both adversarial and cooperative play.[1]
Longer term, we're building a place where anyone can explore cutting-edge algorithms like deep RL, GANs, and MCTS on meaningful real-world challenges. Think OpenAI Gym, but with active competitions and open-ended multiplayer simulations.[2]
If powerups are randomly placed, it's possible for a "better" player to lose to a "worse" player simply by luck of the draw. How is that accounted for in determining the best AI?
No others (at the moment). We're a small team so Bomberland is our current focus - we want to improve the tooling first so that it's easy for people to dive into ML before we introduce other environments.
Will you make a fast implementation of the environment available? The best AIs right now are model-based (e.g. AlphaGo), so the best bots will probably have to reimplement the environment unless you make a model available to everyone?
I've been trying to create a Slay the Spire AI and am burned out on reimplementing environments: it's rather boring code, but there sure is a lot of it, and it takes a lot of work to figure out the subtle details. It would be nice to be able to spend more than 20% of my time on actual AI stuff, rather than reverse-engineering the game so I can make a good model.
Getting the platform to the point where people can spend most of the time on the actual training and experiments (and less on the infrastructure) is our current goal. We do have a forward model simulator which should let you step through the environment without re-implementing it, but if that's not what you're after, we'd love to chat more on what we could do to make this easier (feel free to ping any of us on Discord https://discord.gg/tRUMgdfC).
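For a rough idea, here's what stepping through the environment with the forward model could look like. This is only a sketch; the method names and action set below are illustrative placeholders, not our actual API:

```python
# Rough sketch of stepping a forward model; method names and the action
# set are illustrative placeholders, not the real API.
import random

ACTIONS = ["up", "down", "left", "right", "bomb", "noop"]

def random_rollout(model, state, depth=10):
    """Advance `state` by `depth` random joint actions and return the result."""
    for _ in range(depth):
        actions = {agent_id: random.choice(ACTIONS) for agent_id in state["agents"]}
        state = model.step(state, actions)  # assumed pure: returns a new state
        if state["game_over"]:
            break
    return state
```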
P.S. Sounds like a cool project! Have you heard of the Hearthstone AI competition (https://hearthstoneai.github.io/)? Might be of interest to you.
Oh, I've been thinking about learning reinforcement learning by trying to make an STS AI too, nice! I eventually gave up, but would still be interested in seeing what can be done. Do you plan on releasing something at some point?
About re-implementing the environment: it's probably worth getting in touch with the major STS modders and even streamers (jorbs comes to mind...), in case you haven't done that already.
I stretched the truth a bit: I'm actually doing something like "hierarchical model-free reinforcement learning". Even so, figuring out how to break the game down to create a hierarchy of agents is a lot of work. Basically, the AI is composed of about 8 different traditional RL agents (neural networks), each deciding a different thing: one chooses which cards to draft, one chooses which actions to take in combat, one chooses which path to take on the map, etc.
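To make that concrete, here's a stripped-down sketch of the dispatch. All names here are made up for illustration, not my actual code:

```python
# Illustrative sketch of a hierarchy of per-decision RL agents.
# Each "agent" is a separately trained policy network; names are hypothetical.

class SpireBot:
    def __init__(self, draft_agent, combat_agent, path_agent):
        # One specialist policy per decision type
        self.agents = {
            "card_draft": draft_agent,
            "combat_action": combat_agent,
            "map_path": path_agent,
        }

    def act(self, observation):
        # The game state says which kind of decision is being asked for;
        # route it to the agent trained for that decision.
        decision_type = observation["decision_type"]
        return self.agents[decision_type].choose(observation)
```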
Simple rules like "play random cards until your energy is used up" alone can sometimes beat the act 1 boss. My AI is barely above that, and still far from solving the game. I'm not convinced even DeepMind or other researchers could solve Slay the Spire right now.
It shows definite signs of improvement, but has only reached a point where it can beat the act 1 boss about 50% of the time. I think that's its limit right now. I'm using policy gradients, which are very sample-inefficient. I'm going to implement Soft Actor-Critic (SAC) and see if it can do better with its improved sample efficiency.
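For context on the sample-efficiency point: vanilla policy gradient (REINFORCE-style) uses each batch of on-policy experience exactly once and then throws it away. Roughly, in simplified PyTorch-style pseudocode:

```python
import torch

def reinforce_update(policy, optimizer, episode):
    """One vanilla policy-gradient (REINFORCE) update.
    `episode` is a list of (state, action, return) tuples; the batch is
    used once and discarded, which is where the sample inefficiency lives."""
    loss = torch.tensor(0.0)
    for state, action, ret in episode:
        probs = policy(state)                         # action probabilities
        loss = loss - torch.log(probs[action]) * ret  # ascend E[log pi(a|s) * G]
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```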
One thing I like about Slay the Spire is that it's an environment to solve, not a competition. Gamers like to talk about PvP and PvE; well, I prefer AI-vs-environment over AI-vs-AI. In the end, some AI will win the competition, no surprise. An AI solving a new kind of environment is much more exciting IMHO.
I feel like a traditional expert system would work a lot better in Slay the Spire at this stage. The choices you make in the game are all highly interrelated so I'm not sure they can be broken down into separate agents like that.
For example, when deciding which cards to play you often need to take into account what's coming up next on the map; it's not sufficient to consider only how to win the current fight. Relics such as Incense Burner carry their turn counters over between fights, so it's a strong strategy to delay the end of the current fight in order to set up an optimal Incense Burner count for the next fight. What that counter should be depends heavily on which enemies/elites/bosses you'll be facing next.
An expert system would have a database of every opponent in the game, along with when they are likely or guaranteed to appear, and would then optimize the various conditions at the end of the current fight so that the next fight goes as smoothly as possible. I don't see how this could be accomplished with separate agents each playing a different component of the game in isolation.
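For instance, a rule-based planner could encode the Incense Burner case above as a small lookahead rule. Everything here is hypothetical and simplified (Incense Burner grants Intangible every 6th turn, and its counter persists between fights):

```python
# Hypothetical expert-system rule: decide how many extra turns to stall
# the current fight so Incense Burner's counter lines up with the next fight.
INCENSE_PERIOD = 6

def turns_to_stall(counter_now, dangerous_turn_next_fight):
    """Return extra turns to play so Intangible fires on the next fight's
    most dangerous turn (e.g. an elite's big attack)."""
    # Counter value we want to be entering the next fight with:
    target = (INCENSE_PERIOD - dangerous_turn_next_fight) % INCENSE_PERIOD
    # Extra turns needed to move the counter from its current value to target:
    return (target - counter_now) % INCENSE_PERIOD
```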
You may be right, but that's a lot of boring work I didn't want to do. It was much more fun to hook up a neural net and watch it learn at least a few things. The combat agent does know where it is on the map, but it was only rewarded for minimizing damage taken in fights, so it would probably never learn to set up Pen Nib.
Matt from Coder One here: we've made a forward model of the environment available by default through the game engine, so there shouldn't be "much" work needed. There are definitely friction points, and some more abstractions can definitely be made; happy to iterate on any feedback provided.
Maybe. All approaches can be tried; that's part of the fun. I'm just saying that the best algorithms we know of for solving games in general require a model, so if a model is made available to everyone it will save people some work.
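To illustrate: even the simplest model-based search (flat Monte Carlo, a stripped-down cousin of the MCTS behind AlphaGo) needs a step function it can call thousands of times. A sketch, assuming a hypothetical forward model with `step` and `score` methods:

```python
import random

def monte_carlo_choose(model, state, actions, n_rollouts=50, depth=20):
    """Pick the action with the best average random-rollout score.
    `model.step` (assumed pure) and `model.score` are hypothetical methods."""
    def rollout(s):
        for _ in range(depth):
            if s["game_over"]:
                break
            s = model.step(s, random.choice(actions))
        return model.score(s)

    best_action, best_value = None, float("-inf")
    for a in actions:
        value = sum(rollout(model.step(state, a)) for _ in range(n_rollouts))
        value /= n_rollouts
        if value > best_value:
            best_action, best_value = a, value
    return best_action
```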
Speaking of Pommerman, we caught up recently with the organizer. While unfortunately Pommerman is no longer running, he was super helpful in giving us some advice for Bomberland.
It'd be awesome if there were previous participants of Pommerman here who could share some feedback on how we could improve Bomberland, since there are some obvious parallels.
As-is, I wouldn't consider participating, because it would require a lot of boring manual work from me. The reason we have good AI models for some games is that those games are easy and fast to evaluate, so you can just let your AI-in-training play 100*1000 rounds of the game to establish a baseline policy score.
For this competition, however, it appears that the gym environment is not available. So to get started, I would need to build my own Bomberman clone while trying to mimic your graphics style... I'll pass on that. The headline on the blog post says "open Bomberland arena" but I couldn't find any way to actually download it. I do like the idea of having an always-on AI competition running online, but that type of competitive AI play is usually only helpful after hundreds of GPU hours of offline training.
So that would be my one big suggestion to you, joooyzee: Put a small TensorFlow / PyTorch script on GitHub that just runs the Bomberman environment with random inputs.
Once I have such a script, I can then quite easily drop in my reinforcement learning research and get started with the actual AI.
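Something on the order of this, assuming a Gym-style interface (`BomberlandEnv` is hypothetical; it's exactly the piece that would need to be provided):

```python
# The kind of minimal script I mean: a Gym-style loop feeding random actions.
# `BomberlandEnv` is hypothetical, and `action_space` is assumed to be a
# plain list of discrete actions here.
import random

env = BomberlandEnv()
obs = env.reset()
for _ in range(1000):
    action = random.choice(env.action_space)
    obs, reward, done, info = env.step(action)
    if done:
        obs = env.reset()
```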
Matt from Coder One here. By gym, do you mean an OpenAI Gym wrapper?
Feel free to reach out to me (@thegalah) on our Discord: https://discord.com/invite/NkfgvRN, or directly via email at matt@gocoder.one.
There are definitely some misses with the environment; we'd be happy to patch it up to get it into a good state. We have the game engine available as a binary (outside of the Docker flow too) here: https://github.com/CoderOneHQ/bomberland/releases
Is the code for these going to be shared after the competition? There is a real dearth of complex reinforcement learning code publicly available for TensorFlow :(
We usually encourage people to open-source their code after the competition so that the community improves over time (but only if they're open to it).
Speaking of TensorFlow, we're working on some ML starter kits and would love some feedback on how to improve the workflow for people using TF, PyTorch, etc. If you do end up trying it out and get stuck anywhere, please feel free to ping either me or Matt (@thegalah) on our Discord (https://discord.gg/tRUMgdfC).
If the game is completely symmetrical, it would be funny if someone came up with an "AI" that just mirrors the opponent's moves. If it can react within a single frame, we would always get a draw.
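Something like this (a toy sketch; the action names and API are made up):

```python
# Toy mirroring "AI": reflect the opponent's previous move through the
# board's center. Action names are hypothetical.
MIRROR = {
    "up": "down", "down": "up",
    "left": "right", "right": "left",
    "bomb": "bomb", "noop": "noop",
}

def mirror_agent(opponent_last_action):
    # On a point-symmetric map, copying the reflected move keeps the two
    # agents in mirrored positions, forcing a draw if we act in time.
    return MIRROR.get(opponent_last_action, "noop")
```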
That is definitely a strategy :) Have seen people try imitation learning in other AI programming challenges too, although the imitators usually perform worse than the original agent they copy.
Whereas CodinGame and TopCoder are great platforms for competitive programming, solvable programming problems, and short-form competitions, our longer-term focus is on open-ended, ongoing sandbox simulations that evolve over time. We think this format lends itself better to challenging real-world simulations and ML approaches (think self-driving cars, drones, and challenging games like StarCraft II).
While Kaggle is great for classification-type problems and even recently started running their own simulation competitions, we feel there's a lot of room for a platform that is 100% purpose-built for simulation-type competitions (e.g. better visualisers, Twitch streams, matchmaking).
For now we'll be running weekly leaderboard matches, with the first one for bots submitted by 11:59 PM UTC this Sunday. After the matches, you'll be able to review your bot replays.
The longer-term goal would be to have an automated system where bots will play matches as soon as they've been submitted.
Thanks for the feedback! We're working on improving the onboarding flow. Sorry about the Docker link issue - it should link you to a copy of the environment binary so that you can play around without Docker (no docs for this workflow just yet unfortunately).
It's essentially a Bomberman-inspired game where you program agents to play, and you can pit them against other users' agents. You can try it out without an account by cloning one of the starter kits here: https://github.com/CoderOneHQ/bomberland and following the usage instructions (but you'll need to create an account to use the visualizer and to submit agents).
We recommend the Docker flow, but if you get stuck feel free to reach out to me (Joy) or @thegalah (Matt) on our Discord: https://discord.gg/NkfgvRN
Prizes are still TBD; most likely a mix of cash prizes and/or merch. We're planning a proper tournament launch with a prize pool and streams, ETA ~December. You can think of this as an early-access playground to get familiar with the environment and share any tooling needs, feedback, or bugs before we officially launch and lock in the major game rules.
We'd love to hear what you think!
[1] Bomberland write-up: https://www.gocoder.one/blog/bomberland-multi-agent-artifici...
[2] About us: https://www.gocoder.one/blog/next-generation-open-ai-gym