We already saw from last year that the AI is much better 1-on-1 against humans, but what about 5v5?
Dendi vs. OpenAI
Note that they will reduce the hero pool during this challenge.
Since people are complaining about the current restrictions (no warding, no Roshan, etc.), to repeat: in the August challenge, the only restriction will be the hero pool.
"While today we play with restrictions, we aim to beat a team of top professionals at The International in August subject only to a limited set of heroes"
Poll: Will OpenAI beat 5 professionals in a team?
Yes (122)
47%
No (136)
53%
258 total votes
Quite a bit of DotA comes down to the draft. Much of the rest comes down to things like communication and team cohesion.
If they want this to be a true test, they should give the human team a better draft of heroes they are familiar with, and choose a current team that is performing well and will take the task seriously, instead of an 'all-star' team that won't play well together and will treat it as a random pub.
I'm honestly not interested in this until these restrictions are gone:
- Mirror match of Necrophos, Sniper, Viper, Crystal Maiden, and Lich
- No warding
- No Roshan
- No invisibility (consumables and relevant items)
- No summons/illusions
- No Divine Rapier, Bottle, Quelling Blade, Boots of Travel, Tome of Knowledge, Infused Raindrop
- 5 invulnerable couriers, no exploiting them by scouting or tanking
- No Scan
I don't want a limited environment that the AI are equipped to handle. Far more interested in seeing the AI getting crushed until it's reached a point where they can excel at the real game.
Even with those limitations, I don't think it's possible at the current moment, if it's an actual team (i.e. Liquid or Fnatic or something; if it's like a Na'vi reunion, I'll be kinda confused).
Yeah I'm in agreement. Especially when the heroes they picked are five of the lowest skill ceiling/highest skill floor heroes in Dota.
The laning phase will favor AI. Teamfight will likely be AI win as well with superior coordination. It's the midgame that may be a slight toss up. Let's see if human ingenuity can beat the odds.
Yeah too many restrictions for me to be that interested honestly. I'd rather see the bots be coerced towards a strategy that they can execute and see if human players can identify it and respond in time. No wards / rosh / smoke and fixed lineup really just glosses over a lot of what makes dota a beautiful game.
"While today we play with restrictions, we aim to beat a team of top professionals at The International in August subject only to a limited set of heroes" https://blog.openai.com/openai-five/
Guys the plan is to only restrict the heroes in August.
I believe the humans will get absolutely crushed in lane, you guys talk about draft and restrictions and stuff but at the core level dota is about microing your one unit and the bots will be... overwhelming.
I think this can be a bit disappointing actually, in that the match will be "over" very fast, instead of showcasing "a great machine mastermind" level of decision making, coordination, and itemization, which would come later in the game and require a somewhat equal footing.
But it will still be funny when the puny humans are reduced to clowny strats in the laning phase, trying to overcome the unfair and uncaring opposition.
Well, my 2 cents, maybe I'm wrong. Very excited and curious to see how it plays out, at any rate :-)
edit : my deepest sympathies to the guy(s) who will lane vs an OpenAI Sniper. Yummy.
Laning is probably even more relevant because of the selected 5 heroes and it being a mirror match. These heroes don't exactly make it easy to outmaneuver the enemy if you are behind: they are not mobile, they are pretty bad farmers, and they have a really hard time initiating fights, instead having to just walk in. All the skill and item builds are also hard coded, which wouldn't be nearly as viable if they had to consider the strengths and weaknesses of their lineup vs. the enemy's.
What I suspect the bots will do is just play a pretty 5-man-heavy strategy. It would be quite enlightening to see the bots play against themselves, to see what sort of strategic understanding they've gained, or whether they just basically do the same thing and whoever does it a bit better wins.
Some people who saw the bots live seemed to claim that the bots were not actually that good at last hitting and so on, and that they instead beat who they beat by grouping up and fighting together better than the humans. Having laning weaknesses would be pretty bad against top-level players, but I'd guess the bots have learned to be better in a month. There may be something in the way their reward functions are shaped that means they don't learn super-efficient laning easily. But if they have trouble, I'd guess the research team will again introduce some sort of semi-manual solutions, like they did with creep blocking last year: they explicitly made the bot learn to do that, rather than letting it realize on its own that doing it brings an advantage in terms of the other rewards it wants 20 seconds later.
Another thing is that imo humans should have the ability to see the bots play before a benchmark happens. Dota can be very snowbally, and if you don't have any understanding of where the enemy is really good, it can be hard to come back from a screw-up. It would have been great if they had released footage of entire games with this press release rather than carefully selected clips.
I'm curious to see how this sort of approach scales for them and what sort of ways they will find to expand to the rest of the game. I saw some machine learning people estimate that in terms of computational resources it costs around 1.1 million dollars to run their training configuration for 19 days (the time they trained the bots to beat whoever they beat this far).
On June 29 2018 16:47 spudde123 wrote:I'm curious to see how this sort of approach scales for them and what sort of ways they will find to expand to the rest of the game. I saw some machine learning people estimate that in terms of computational resources it costs around 1.1 million dollars to run their training configuration for 19 days (the time they trained the bots to beat whoever they beat this far).
Wow.
Now THAT, at least in my mind, would make an actually compelling community goal for a future TI prizepool.
Like, if we get to 30M, then we can give X amount to OpenAI and help them run their training for that much longer, and hopefully get the program ready for the real deal, aka no restrictions.
Well... ok, the current reward/cost for Valve is handing out 10 virtual levels, so I guess... they might not love the idea. Would be compelling to me, anyway
I voted "no", but then two posts down from the OP's I saw the list of restrictions, and I'm feeling hesitant now >_>; And I fully agree that I'm not really interested in this match or its results until these restrictions are gone. Pigeonholing players into that specific a scenario and handicapping them seems to make this match pointless. Why not just make the humans play with their feet? Why not pick 1k scrub Heralds from the attendees and audience, get them drunk, and let them play vs. the AI without restrictions?
The restrictions are there to help the AI learn 5v5 faster. Unlike humans, the AI doesn't have foresight, so in order to get better it needs to try every little thing until it learns. Humans don't need 150+ years to learn Dota because we have foresight; we don't need to try stupid things like suiciding into the tower (although some do, for different reasons lol) to know that's not gonna give you an advantage. Last year it was 1v1, this year a heavily restricted 5v5, but I expect it to be at human level in a few years. The thing I'm most interested in: how is the AI going to respond to major balance changes, and how long will it take to adjust?
I'd personally enjoy this a lot more if there were no restrictions. Yes, no one would expect the bots to win a straight-up 5v5, but it would show how far the AI has come since the previous year. Even ask the pros to go easy on the bots, or invent some silly challenges like win mid as CM for funsies.
Restrictions seem to be in place to help the development team cope, more than anything.
It remains a human-made program, and it's... quite the endeavour. You should go read the blog; I think it gives a better understanding of why restrictions are still needed at this stage.
On June 29 2018 18:37 Latham wrote: I'd personally enjoy this a lot more if there were no restrictions. Yes, no one would expect the bots to win in a straight up 5v5, but it would show how far the AI has come from the previous year. Even ask the pros to go easy on the bots, or event some silly challenges like win mid as CM for funsies.
This just feels like monkeys and typewriters.
Oh sure, it would be much more realistic if there were no restrictions, but I don't think that will affect my enjoyment so much, YMMV. I'm still curious to see how the AI will perform in this heavily restricted scenario.
I'm not a real expert on the methods, but I don't think the way they are currently approaching it is scalable to the rest of the game, and just throwing more computation at it isn't the solution. Sure, they may beat humans with this set of heroes, but it would be even more interesting if they could learn general concepts, move from hero to hero easily, show strategic understanding of various kinds of lineups, etc.
As far as I understood, their main achievement so far is showing that their already available methods can work with pretty heavily delayed rewards when enough computational power is put behind them. Though it's worth noting, in terms of real-world applicability, that they need a simulated environment to practice in, which is hard to get in areas other than games. Also, they have a mechanism that allows teamwork to emerge even though the bots act completely on their own, because each bot is rewarded when their team is doing better than the enemy team on average.
One interesting thing is the reward function they crafted. The bots don't only learn from what wins and what doesn't; rather, they get rewarded for last hits, denies, damage given/taken, buildings being alive/dead, and so on. But the researchers have to pretty carefully and manually define how big a reward the bots should get from each of these things for the learned behavior to actually be good for winning the game. And can the rewards really be the same for all possible lineups? As an example, a push lineup with a worse lategame should put a lot of emphasis on getting buildings down quickly; if they just trade farm and don't try to end, they will eventually lose. Understanding these sorts of longer-term effects is not relevant in a mirror match. Not to mention how all sorts of patch changes affect the way the various rewards should be prioritized: whenever a patch changes how much gold different types of creeps give, or how much towers give, the bots can't just go and learn the game again; the researchers have to manually fine-tune the reward function first.
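To make the two mechanisms above concrete (the hand-tuned shaped reward, and the team-average blending that lets cooperation emerge), here is a minimal sketch. The event names and weights are invented for illustration, not OpenAI's actual coefficients; only the structure follows what the posts describe:

```python
from statistics import mean

# Hypothetical shaping weights -- invented values, not OpenAI's real ones.
WEIGHTS = {"last_hit": 0.16, "deny": 0.15, "hero_damage": 0.01, "tower_kill": 1.0}

def shaped_reward(events):
    """Weighted sum of game events for one hero on one tick."""
    return sum(WEIGHTS[name] * count for name, count in events.items())

def blended_reward(own, teammates, tau=0.5):
    """Blend a hero's own shaped reward with the team average.

    tau = 0 is fully selfish, tau = 1 fully team-oriented. Rewarding
    each bot partly on how the whole team is doing is what allows
    coordination to emerge even though every bot acts on its own.
    """
    return (1 - tau) * own + tau * mean([own] + teammates)
```

With `tau` above zero, a bot that gives up its own farm still profits when its teammates prosper; and as the post notes, every entry in a table like `WEIGHTS` is a manual tuning decision that a patch can invalidate.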
This doesn't mean I'm not interested in it, but I don't think they should try to frame it as "we are beating humans in Dota". It's not yet the same game humans play. It would be great to see several full games of the bots playing against humans, and even bot vs bot matches, to get a good idea of all the things the bots have learned.
For all of those complaining about restrictions, I don't think you understand how difficult it is for bots to play Dota. It was huge when a bot beat the best human player at Go; now let's put that in perspective for Dota (from their blog):
High-dimensional, continuous action space. In Dota, each hero can take dozens of actions, and many actions target either another unit or a position on the ground. We discretize the space into 170,000 possible actions per hero (not all valid each tick, such as using a spell on cooldown); not counting the continuous parts, there are an average of ~1,000 valid actions each tick. The average number of actions in chess is 35; in Go, 250.
So the average number of valid moves in a Go position is about 250, while a single Dota hero has four times that available every single tick. This is not an easy problem to solve, and they are already doing some amazing things with AI. Yet we live in a microwave age where people complain about a bot needing restrictions after only 2 years of work on a stupidly complex game, where it already has a chance against humans.
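The blog's "not all valid each tick" point can be sketched in a few lines. This is a toy illustration with invented action names, cooldowns, and mana costs (the real discretized space has ~170,000 entries per hero); it just shows the masking step that narrows the full action set down to what is legal on one tick:

```python
# Toy slice of a discretized action space. Names and numbers are
# invented for illustration only.
ACTIONS = {
    "move_north":        {"cooldown": 0.0, "mana": 0},
    "attack_creep_12":   {"cooldown": 0.0, "mana": 0},
    "cast_frostbite":    {"cooldown": 4.2, "mana": 140},  # still on cooldown
    "cast_crystal_nova": {"cooldown": 0.0, "mana": 130},
    "use_clarity":       {"cooldown": 0.0, "mana": 0},
}

def valid_actions(actions, current_mana):
    """Mask the action set down to what is legal this tick:
    off cooldown and affordable. A policy would only sample
    from this masked subset."""
    return [name for name, a in actions.items()
            if a["cooldown"] == 0.0 and a["mana"] <= current_mana]

print(valid_actions(ACTIONS, current_mana=100))
# -> ['move_north', 'attack_creep_12', 'use_clarity']
```

Scale this toy mask up to ~170,000 candidate actions per hero and you get the ~1,000 valid actions per tick the blog quotes.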
On June 29 2018 08:33 WolfintheSheep wrote: I'm honestly not interested in this until these restrictions are gone:
- Mirror match of Necrophos, Sniper, Viper, Crystal Maiden, and Lich - No warding - No Roshan - No invisibility (consumables and relevant items) - No summons/illusions - No Divine Rapier, Bottle, Quelling Blade, Boots of Travel, Tome of Knowledge, Infused Raindrop - 5 invulnerable couriers, no exploiting them by scouting or tanking - No Scan
I don't want a limited environment that the AI are equipped to handle. Far more interested in seeing the AI getting crushed until it's reached a point where they can excel at the real game.
For all of those complaining about restrictions, I don't think you understand how difficult it is for bots to play dota. It was huge when a bot beat the best human player at go now lets put that in respect to Dota (from their blog):
High-dimensional, continuous action space. In Dota, each hero can take dozens of actions, and many actions target either another unit or a position on the ground. We discretize the space into 170,000 possible actions per hero (not all valid each tick, such as using a spell on cooldown); not counting the continuous parts, there are an average of ~1,000 valid actions each tick. The average number of actions in chess is 35; in Go, 250.
So total valid actions for an entire game of Go is about 250, meanwhile there is 4 times that for a single hero in a single second in Dota. This is not an easy problem to solve and they are already doing some amazing things with AI yet we live in a microwave age where people complain about a bot needing restrictions after only 2 years of work to play a stupidly complex game and have a chance against humans.
There is nothing wrong with the bot needing certain restrictions. The issue is that those restrictions turn Dota into a harder Go, and if we wanted to watch Go, we wouldn't be here now.
No doubt it's an impressive achievement, and even with the restrictions it's a very difficult problem (as evidenced by the computing power necessary to get results). I don't think people doubt that, and it's interesting from a research perspective. But for it to be really interesting from a player's perspective (at least for me), I think they would have to play more or less the same game humans play. Or if we had access to some tens of games the bots play, even with restrictions, maybe we could see whether there's something in what they do that we can learn from. But for now the material they've released is a post about the methodology, some clips showing where the bots did something nice (no clips of where they lost), and plans for showmatches against humans.
At some point it would be cool if they arranged something similar to what has been done in poker: basically bring a number of the best players to a location and play a lot of human vs. AI games live on stream. I'm not sure we will be able to tell, even after the matches in a month or so, how good the bots actually are, because most game states won't come up in a single game of Dota. I assume they'll play at least a few games, but even that isn't really enough to see what the bots know. It wouldn't be interesting just from the perspective of whether they can beat humans, but out of general curiosity about what weird things they can learn through self-play.
There is nothing wrong in bot needing certain restrictions. The issue is that those restrictions turn Dota into harder Go and if we wanted to play Go, we wouldn't be here now.
Not even close; it's like you didn't read my post at all. It's not even close to comparable with Go. Do the math: 1000 (valid actions per tick) * 5 (players on a team) * 2700 (seconds in a 45-minute game) is the equivalent of playing 13,500,000 games of Go in one Dota match; calling that a harder Go is ignorance. Also, look at how humans learn: in restricted scenarios. Take any sport: tee-ball -> baseball; basketball hoops get higher and the 3-point line moves out as you get older; football (soccer) fields and goals go from smaller to larger. It's the process of how we learn being replicated in machines. The point of this match is to show how far it has grown, so restrictions make sense.
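For what it's worth, the raw multiplication in that post does check out, if you read a "tick" as roughly one second the way the post does (the blog itself doesn't equate the two):

```python
valid_actions_per_tick = 1_000  # average per hero, per the OpenAI blog
heroes_per_team = 5
game_seconds = 45 * 60          # a 45-minute game

decision_points = valid_actions_per_tick * heroes_per_team * game_seconds
print(decision_points)  # 13500000
```

Whether 13.5 million choice-points are really "equivalent to 13.5 million games of Go" is a separate question; the arithmetic is just counting how many valid actions are on the table over one game.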
"Harder" is a statement about ordering; the fact that it is harder than Go by orders upon orders of magnitude is irrelevant. Also, your real-sports comparisons are the embodiment of the "bad sports comparison" trope: the adjustments you describe don't subvert the rules of the game itself, while locking Dota into a fixed mirror 5v5 without wards does. Now, if they are going to throw away most of these restrictions for the showmatches themselves, it may just be watchable.
On June 29 2018 21:40 spudde123 wrote: No doubt it's an impressive achievement and even with the restrictions it's a very difficult problem (as evidenced by the computing power necessary to get results). I don't think people doubt that, and it's interesting from a research perspective. But for it to be really interesting from a player's perspective (at least for me), I think they would have to play more or less the same game as humans play. Or maybe if we had access to some tens of games the bots play even with restrictions, maybe we could see if there's something in what they do that we can learn from.
They mentioned in the article that OpenAI prioritizes early gold/exp on supports more than humans do, to hit faster timings and take fights. It'd be cool to see exactly how they go about that.
Yeah it'll be interesting to see how it actually looks.
I suspect it has something to do with the way their reward function is defined. The bots don't learn what is optimal for winning but rather what the researchers have defined as the proxy for winning (the reward function). Not that there is anything wrong with this really, humans also pursue various intermediate steps rather than all the time thinking what is truly optimal for winning the entire game. Many of us never get past that stage and instead just hide in the jungle farming creeps while our team is losing.
But anyway, every bot is trying to maximize its own reward. The mean reward of both teams is taken into account there, but the bots are also just trying to increase their own rewards, for example by farming creeps. I don't know whether the resource distribution arises because the bots determine that it brings better long-term rewards for the entire team, or because every bot is, to an extent, competing with its teammates for farm.
Theoretically interesting, sure, but the optimizations the AI makes to the economy of the game are, again, totally irrelevant when it's not Dota's economy. No wards/sentries/smokes/courier and only 5 specific heroes means that literally none of the conclusions the AI comes to can be extended to "Dota".
That's the biggest reason why people are bothered by the restrictions. We don't care if the bots can barely beat 2k mmr players, we want them to be playing dota so that the things that they learn, develop and do are relevant and interesting to us.
The ultimate goal is to get the AI 5v5 functional, then ship it to valve so they can literally run a simulation of a proposed patch to see how the meta shakes out.
I know perfectly well that the AI can't handle the entire game yet. And I'd much rather see a limited AI fail to keep up as humans exploit the full game mechanics than watch a gimmick match that handicaps the humans so the AI can look good.
And as the development continues, we'd actually get to see the progress of learning in a real environment.
You can't compare Go to Dota like this: in Go, one wrong move (like not picking the right move at a given point) can lose you the game; that's far from being the case in Dota.
I feel like they'll come up with fewer and fewer restrictions with time. I suspect a lot of the challenge of this system is making it work with machines that aren't giant server farms. Maybe for TI9 we'll see them show up with the ability to play DotA, and possibly draft the year after if everything else works.
edit: last year the system was able to play 1 hero. I'd be curious whether it's 5 independent systems, one master controlling 5 slaves, or some sort of 5-computer setup where the bots communicate with each other. In any of these cases, I'm sure that going from 1v1 SF to 5v5 {fixed heroes} was quite a challenge in that aspect alone. I'm just really curious how they'll continue to grow the system from here, be it a slowly extending hero pool or more features in the game.
On June 30 2018 02:08 Fleetfeet wrote: The ultimate goal is to get the AI 5v5 functional, then ship it to valve so they can literally run a simulation of a proposed patch to see how the meta shakes out.
THEN we'll be the future.
Cool idea, but there's no guarantee human pros will adopt the same meta as the AI.
On June 29 2018 23:03 Nevuk wrote: The only restrictions they are planning on for the actual game are limited heroes
What? Where did you see that?
From both the OP and their blog:
Since people are complaining about current restrictions (no warding, no rosh, etc.), repeat: They will only restrict the heroes in this challenge in August.
"While today we play with restrictions, we aim to beat a team of top professionals at The International in August subject only to a limited set of heroes"
Later on :
Our team is focused on making our August goal. We don’t know if it will be achievable, but we believe that with hard work (and some luck) we have a real shot.
On June 30 2018 05:51 Murlox wrote: Aye but I mean... warding, invisibilty, illusions (bait), scanning, BoT plays... reading the blog, those seem like a big deal, programming-wise.
I'm surprised they're working so fast :-)
Ideally for most of that they wouldn't need to do that much programming in addition to what their system is now. You expose the bots to that information / possibility of that action, put them to play against themselves for 5000 years, and maybe they will have figured it out.
It's hard to evaluate what sort of situations the bots can deal with atm so I don't really know what I should expect in terms of how easily different things can be learned. For example maybe the bots already have some behaviors that somewhat imitate warding, like a hero standing up a hill to give vision to the team. But they might also just be 5 manning relatively brainlessly, in which case I wouldn't have as much faith in them learning warding decently by TI. It's also possible that to some things like scanning they end up hard coding some solutions like they've done with item builds or creep blocking in 1v1.
Hopefully they will lift as many restrictions as possible even if the bots turn out to be bad at dealing with something. I agree with WolfInTheSheep in the sense that I would much rather see the game be as close to regular Dota as possible, even if the bots are bad at some parts. That would make it an interesting thing to follow if the game restrictions stayed pretty much the same from benchmark to benchmark and we could see how the bots are improving. But I could see them keeping stuff banned if it turns out that humans can easily beat the bots using certain things.
The hero restriction is understandable in the sense that this concept of learning purely through self play seems to run into scalability problems if you start allowing all the different heroes. It seems like you have to figure out how to transfer knowledge across heroes to some extent to make it viable.
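The "put them to play against themselves" idea above is, structurally, just a self-play loop. Here's a toy version on rock-paper-scissors with a crude bandit update; nothing like OpenAI's actual large-scale PPO setup, but the shape (play yourself, score the outcome, nudge the shared policy) is the same:

```python
import random

# Toy self-play loop: a rock-paper-scissors policy trains against a
# copy of itself by reinforcing whichever move just won. Invented for
# illustration only; the real system runs massively parallel PPO.
MOVES = ["rock", "paper", "scissors"]
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

def sample(weights):
    """Draw a move with probability proportional to its weight."""
    return random.choices(MOVES, weights=[weights[m] for m in MOVES])[0]

weights = {m: 1.0 for m in MOVES}
for _ in range(10_000):
    a, b = sample(weights), sample(weights)  # both sides share one policy
    if BEATS[a] == b:
        weights[a] += 0.01                   # reinforce the winning move
    elif BEATS[b] == a:
        weights[b] += 0.01

# In a symmetric zero-sum game, self-play should keep the weights
# roughly balanced, though a naive update like this can cycle.
print(weights)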
On June 30 2018 05:48 Murlox wrote: Oh yeah, thank you. I had read it too, guess it went out of my head again after reading some comments here.
But... yeah, that's already happening. Like, those were not small restrictions, that's quite impressive to hope to be able to remove them that fast.
Yeah, I read it too and had to double check since so many people were complaining about restrictions that won't be there
I mean you've kind of pushed it in the opposite end now. They don't say those restrictions will be gone. They say they're aiming to do so, and say that they don't know if that goal is achievable.
True, true. I took them to be meaning "hope to be able to beat pros without restrictions", but I'm unsure where the emphasis is on that sentence - the "beat pros" or "without restrictions"
Have read some comments, but my 2 cents is that even the 1v1 mid SF bot had a bunch of stupid restrictions, like no wand and no bottle. That clearly means they don't even have a bot that masters 1v1, only mid, only SF. I knew 100% they would never be able to play 5v5 without a ton of restrictions, since even their 1v1 is flawed. Their entire methodology needs to change if they want a 5v5 bot even close to fighting pro teams.
AI learning is iterative. You can shoot for the moon from the start and have nothing for a very long time, or you can start with small goals and have steady progress checkpoints.
On June 30 2018 06:47 Nevuk wrote: I took them to be meaning "hope to be able to beat pros without restrictions", but I'm unsure where the emphasis is on that sentence - the "beat pros" or "without restrictions"
Probably both
They're essentially the same, are they not? Getting the stuff advanced enough.
I think "without restrictions" is what they want to focus on next. My intuition tells me they're trying to incorporate a wider range of capabilities first and work on quality later. After all, if you overtrain an AI to do something like play without wards, that AI may have to unlearn stuff once the restriction is lifted.
Meh, it's just brute forcing and ultra powerful computers in a very limited version of dota. It's disappointing but also very reassuring. I don't want to be in the generation with dematerialized humans, though it's probably our best option to get out of the solar system.
I would love to last until those times; living forever in a world where you're limited not by physics but by programming sounds good to me.
What is great is that the playstyle from last year's 1v1 OpenAI bot is now being adopted by the pros in 1v1. Even Arteezy was impressed. I wonder how the bot will decide when to use wards, and all the other tiny details we haven't even thought of.
I look forward to seeing the progress they make until TI, things are progressing so quickly atm. Recent work for a similar problem by deepmind for anyone who's interested https://deepmind.com/blog/capture-the-flag/
On July 07 2018 03:25 0fuulm wrote: Let the AI train AGAINST HUMANS.
BUT DONT let the humans train against AI. Then call the AI smart.
Very scientific
The AI isn't in "training mode" when playing against humans
As in, it doesn't use any results against humans to update its network or tune anything. Basically, playing against humans does not change the bot in any way (similar to playing against regular bots, which have no learning capacity).
In addition, it's actually impossible for a bot to train against humans without those humans in turn "training" against bots. Humans are always learning. Everything we do is indelibly marked on our brains and impacts every future decision in some (potentially infinitesimal) way. Only computers can play without changing their knowledge of the game in any way.
Lastly, the computer plays 180 years of dota per day against itself. Compared to that, training against humans in real time would be incredibly inefficient. The bot learns SLOWLY from individual games, whereas humans can learn very rapidly from individual games. If a weak bot played weak humans and both were trying to learn from each other, the humans would vastly outpace the bot in learning.
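The "not in training mode" point is the standard train/inference split: at exhibition time the network's parameters are frozen, so no game against humans can alter them. A minimal sketch with hypothetical names, not OpenAI's API:

```python
# Minimal train/eval split: during exhibition games against humans the
# policy's parameters are frozen, so no game result can alter the bot.
# Hypothetical class, invented for illustration.
class Policy:
    def __init__(self):
        self.weights = [0.5, 0.5]
        self.training = True     # flipped off for matches vs humans

    def act(self, observation):
        # Inference only ever *reads* the weights: return the index of
        # the best-scored action.
        return max(range(len(self.weights)), key=lambda i: self.weights[i])

    def update(self, gradient):
        if not self.training:
            return               # frozen: human games leave weights untouched
        self.weights = [w - 0.1 * g for w, g in zip(self.weights, gradient)]

bot = Policy()
bot.training = False             # "exhibition mode"
before = list(bot.weights)
bot.update([1.0, -1.0])          # a game outcome arrives...
assert bot.weights == before     # ...and changes nothing
```

So from the bot's side the human games are pure inference; any learning during the show happens only on the human side of the table.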
Looking forward to an 'easy' OpenAI mode perhaps becoming available after TI.
Machine learning is always interesting because they find ways to do/make sense of things that we would never think of. They'll be coming up with their own metas
I think that's because of foresight, which is something the AI doesn't have, at least not yet. Foresight helps us learn faster by simply not trying every single stupid thing, whereas the AI does try every single stupid thing, and that's why it takes so long to learn. But because we use foresight to avoid trying everything, we miss out on some things; those are the things the AI will figure out, and in turn we'll learn from the AI. So the way I see it, we can only benefit from the AI, because unlike humans, the AI has the luxury (time) to try all possible scenarios and analyze them in more depth.
- Mirror match of Necrophos, Sniper, Viper, Crystal Maiden, and Lich: OK.
- No summons/illusions: OK.
- 5 invulnerable couriers, no exploiting them by scouting or tanking: OK, lol.
- No Scan: OK.
- No invisibility (consumables and relevant items): Ehh, how about invis rune?
- No warding
- No Roshan
- No Divine Rapier, Bottle, Quelling Blade, Boots of Travel, Tome of Knowledge, Infused Raindrop: IS THIS EVEN DOTA LMAO
The AI will win because it will train in these stupid conditions nonstop, while there is no incentive whatsoever for humans (especially pro players) to do so; this is completely different from the 1v1 SF challenge (which is limited by its nature). Basically, winning a race run under absurd rules doesn't mean much.
If humans simply avoid fights and farm Burning-style, they will eventually win too; the bots are bad both at last-hitting and at choosing item builds. But that won't make newspaper headlines, so I bet on the AI manhandling pure meatbags once again, so the researchers can secure proper funding for next year's improvements.
How many times does it need to be said? This is why AI will win, because humans can't read.
"While today we play with restrictions, we aim to beat a team of top professionals at The International in August subject only to a limited set of heroes"
Even setting aside the restrictions people are misinterpreting, I don't see the justification for the negative or positive extremes people have expressed. Humans constantly build things that outperform us in one way or another. Doing so here neither invalidates the efforts of human players nor represents some unimaginable feat.
On July 09 2018 08:21 polgas wrote: How many times does it need to be said? This is why AI will win, because humans can't read.
"While today we play with restrictions, we aim to beat a team of top professionals at The International in August subject only to a limited set of heroes"
Yeah, humans really can't read.
Our team is focused on making our August goal. We don’t know if it will be achievable, but we believe that with hard work (and some luck) we have a real shot.
On July 10 2018 23:22 Kyir wrote: Even without this restrictions people are misinterpreting, I don't see the negative or positive extremes people have expressed. Humans constantly build things that can outperform ourselves in one way or another. Doing so here neither invalidates the efforts of human players nor represents some unimaginable feat.
Personally, I just think it's boring having humans play by AI-centric rules, rather than forcing the AI to play in environments where they're handicapped.
If the vision game and Roshan are such a massive advantage, I want to see that in action and see how those advantages play out, and how far the AI still has to go.
Their goal is to have the bot team beat the humans. Meaning, when they play, the only restrictions will be limited set of heroes. The team's goal is to get the bots up to speed to be able to handle things like warding, Rosh, consumables, etc. by August so the AI wins. So unless the team goes back on their word and puts the restrictions back in place because they struggle to implement warding, Rosh, etc. by August, the only limitations are going to be the smaller hero pool.
Nobody is suggesting that OpenAI won't change the rules between now and August; the complaints revolve around the suggestion that what OpenAI has going right now is "dota".
It's merely a case of "Oh, so you say you're going to do X? Great, come back when you've actually done it, instead of some facsimile thereof that is interesting but not the same."
It's annoying when you have stuff like Bill Gates tweeting that AI is beating humans at "dota". That's the kind of falsehood people are forced to decry.
That's perfectly fine. I even think the "AI bots beat professional Dota 2 players" stuff from last year is totally overblown and, when you look below the surface, total BS. There are a lot of comments, though, from people who seem to believe the upcoming match will have the same restrictions as the very first one (no warding, no Rosh, etc.). That's who my post is trying to inform and correct. My post isn't stating anything about whether OpenAI is playing "real Dota"; it's just attempting to clarify misinterpretations of what the rules are supposed to be during the upcoming match.
Again, the exact quote from the devs is "We don’t know if it will be achievable, but we believe that with hard work (and some luck) we have a real shot."
If they can meet that goal then all the power to them, but they make it pretty clear that removing all but the hero restrictions is a very tough grind.
OK, so we're interpreting that quote completely differently. I'm reading the "While today we play with restrictions, we aim to beat a team of top professionals at The International in August subject only to a limited set of heroes." as a statement that they will, no matter what, play against a team of top pros in August with the only restrictions being a limited set of heroes. The "We don’t know if it will be achievable, but we believe that with hard work (and some luck) we have a real shot." to me means they don't know if winning against the pros will be achievable and that's what they're working toward.
Here's a quote from their blog; "While today we play with restrictions, we aim to beat a team of top professionals at The International in August subject only to a limited set of heroes. We may not succeed: Dota 2 is one of the most popular and complex esports games in the world, with creative and motivated professionals who train year-round to earn part of Dota’s annual $40M prize pool (the largest of any esports game)." I view the "not succeed" being "beat a team of top professionals". It's not about what restrictions they'll play with, but whether OpenAI will beat the pros at TI subject only to a limited pool of heroes.
My point is that the July 28 and August games are supposed to be played with restrictions on the hero pool only, but many posters seem to be missing that detail and are complaining about the restrictions that were in place when OpenAI first got notoriety. Now, could the devs back out of what they said and put more restrictions back in place (no warding, no Rosh, etc.) if they don't get the bots to a point they feel comfortable with? They could, but I don't think they will (and hope not).
You have to remember, this is not a competitive DotA project, it's a learning AI project. I think it was mentioned before, but beating pro players without restrictions and playing without restrictions are one and the same goal. As they say:
Most of the restrictions come from remaining aspects of the game we haven’t integrated yet. Some restrictions, in particular wards and Roshan, are central components of professional-level play. We’re working to add these as soon as possible.
Judging from the graphs they've posted, on average it takes a week for the game simulations to crunch the variables and reach close-to-peak performance. If the AI is still (today) unable to handle something like Roshan, the bottleneck is not because of CPU time, but because they haven't created the right conditions for the AI to learn the importance or value of Roshan yet.
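One standard way to "create the right conditions" is reward shaping: paying an immediate bonus for objectives that the sparse win/loss signal alone would let the agent discover far too rarely. A hedged sketch; every weight here is invented, and nothing in the thread specifies OpenAI's actual shaped reward:

```python
# Hypothetical reward shaping: a rare objective like Roshan is nearly
# impossible to stumble into often enough to learn its value from
# win/loss alone, so a shaping term pays out the moment it happens.
# All of these weights are invented for illustration.
SHAPING_WEIGHTS = {
    "win": 1.0,           # the true objective
    "last_hit": 0.002,    # dense early signal
    "tower_kill": 0.05,
    "roshan_kill": 0.2,   # makes Roshan's value visible to the learner
}

def shaped_reward(events):
    """Sum the weighted bonuses for the events observed this tick."""
    return sum(SHAPING_WEIGHTS.get(e, 0.0) for e in events)

print(shaped_reward(["last_hit", "roshan_kill"]))  # roughly 0.202
```

Tuning terms like these (or choosing to drop them, as with the hand-coded item builds) is exactly the kind of "conditions" work that can bottleneck progress independently of raw compute.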
On July 14 2018 04:57 pNRG wrote: Garbage test with so many restrictions. Waste of time. Show me an ACTUAL Dota game with AI vs. a top-tier team.
Why do you consider it garbage? In many areas of mathematics it is standard to attack a problem by first solving subproblems of it.
This is not really about Dota, I think; it's more about gaining experience designing and implementing an AI in a competitive, real-time setting. It is quite likely that much of it can be carried over, with some redesign, to real robots.
Emphasis, via italics, on making. It's not perfect yet, it's not even done, but it's still mucho better than last year. Just take it for what it is, imo. No it's not human level yet, and it still maybe needs some restrictions (rather, the team does need them), but it's getting closer.
The product ain't finished yet, but it's damn cool to see the progress they've made. At least imo.
On July 14 2018 04:49 WolfintheSheep wrote: You have to remember, this is not a competitive DotA project, it's a learning AI project. I think it was mentioned before, but beating pro players without restrictions and playing without restrictions are one and the same goal. As they say:
Most of the restrictions come from remaining aspects of the game we haven’t integrated yet. Some restrictions, in particular wards and Roshan, are central components of professional-level play. We’re working to add these as soon as possible.
Judging from the graphs they've posted, on average it takes a week for the game simulations to crunch the variables and reach close-to-peak performance. If the AI is still (today) unable to handle something like Roshan, the bottleneck is not because of CPU time, but because they haven't created the right conditions for the AI to learn the importance or value of Roshan yet.
You mocked people for not reading correctly, when what was pointed out was that the August game against pros will be with only limited heroes as a restriction as opposed to all the people complaining about the upcoming game being a mirror match with a bunch of rules in place (which again, won't be the case during the August match).
They've set a goal for the August match to be able to beat pros with only a limited hero pool as a restriction. I take that to mean no matter what, they'll play that game with wards and Rosh to show where the AI is at, even if it's not operating at optimum capacity. Yes, their end goal no matter what will be for OpenAI to beat pros without any restrictions, no matter what benchmarks (like the August match with just a limited hero pool) they place along the way.
Maybe if they don't feel confident the AI can beat the pros they'll back out or put more restrictions in place. From what they've stated however, the only restrictions in August's match will be limited heroes, even if OpenAI isn't 100% ready for it.
They're quite clear that those DotA mechanics are not implemented in the AI at all, not merely that they're not working optimally. And thus far humans have only been allowed to play OpenAI with mirrored restrictions (for both the 1v1 bot and the 5v5 bots).
But barring a definitive statement, or until August, it's just splitting hairs right now. You say beating pros is the goal, I say playing without restrictions is.
They announced that picking will be from a pool of 18 heroes, and some restrictions like wards and Roshan are gone. At least this makes it much more interesting to watch, and the humans will be able to draft a sensible lineup instead of playing a mirror match. They'll be playing a team of Cap, Blitz, Fogged, Merlini +1 at the end of this month.
Interesting to see how it looks. It's one thing to lift the restrictions and it should be done to make the game more like actual dota, but it'll be interesting to see how the bots actually use wards for example. I'd be quite surprised if the bots truly excel at all aspects of the game. Also not sure how wards for example are implemented. Previously the research team hardcoded item builds for the bots, so I assume they also hard code when they buy wards or not rather than the bots yet being able to learn it themselves.
Nice! Didn't expect them to expand the hero pool so much. AI must be learning so fast!
August 5 Benchmark Test Restrictions:
- Random Draft using a pool of 18 heroes (Crystal Maiden, Death Prophet, Earthshaker, Gyrocopter, Lich, Lion, Necrophos, Queen of Pain, Razor, Riki, Shadow Fiend, Slark, Sniper, Sven, Tidehunter, Viper, or Witch Doctor), replacing the old mirror match of Necrophos, Sniper, Viper, Crystal Maiden, and Lich
- No Divine Rapier, Bottle, Quelling Blade, Boots of Travel, Tome of Knowledge, Infused Raindrop
- No summons/illusions
- 5 invulnerable couriers, no exploiting them by scouting or tanking
- No Scan
Lifted from the earlier rules: No warding, No Roshan, No invisibility (consumables and relevant items)
On July 19 2018 02:35 Sn0_Man wrote: It'll be interesting to see what the draft looks like.
They have almost 3 weeks to train this version of the bot though, which is a long time with the amount of computing power they throw at it.
Not sure how "Random Draft" works with only a pool of 18 heroes. Either way, will be fun to see which roles are prioritized by AI.
Glad to see warding, Roshan and invisibility have been integrated now. Minus the 5 invulnerable couriers, it'll look close to a real match scenario. Viper and Shadow Fiend are really the only heroes there that might semi-regularly use Manta Style.
I think they might have actually overcompensated with the reaction time. A lot of pros probably have a sub 200ms reaction, even accounting for input lag.
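For reference, a reaction-time handicap like that is usually implemented by only ever showing the agent stale observations. A hypothetical sketch (the 200 ms figure comes from the thread; the tick rate and everything else are my assumptions):

```python
from collections import deque

# Hypothetical reaction-time handicap: the agent only ever sees the
# world as it was `delay_ticks` ago, so it cannot respond faster than
# that. Assuming ~15 observation ticks per second, 3 ticks is ~200 ms.
class DelayedObserver:
    def __init__(self, delay_ticks=3):
        self.buffer = deque(maxlen=delay_ticks + 1)

    def observe(self, world_state):
        """Push the fresh state; return the stale one the agent may see."""
        self.buffer.append(world_state)
        if len(self.buffer) <= self.buffer.maxlen - 1:
            return None            # not enough history yet: no reaction
        return self.buffer[0]      # the state from `delay_ticks` ago

obs = DelayedObserver(delay_ticks=3)
seen = [obs.observe(t) for t in range(6)]
print(seen)  # [None, None, None, 0, 1, 2]
```

Tuning `delay_ticks` is the whole "nerf": the policy itself is unchanged, it just acts on older information, which is why comparing it to human reaction time (input lag included) is tricky.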
I think random draft with 18 heroes implies all 18 heroes are available, with ranked Random Draft selection rules rather than something like Captain's Draft. I wonder whether the bots actually react to the players' picks based on their own data, though.
I'm very interested though with a lot of those restriction lifted. I hope they get a bo3 in against the caster crew too. Wonder who their last guy is gonna be. Lacoste maybe? Or a pro NA player in SF?
I assume the pro players also get their own set of 5 invulnerable couriers?
On July 20 2018 18:49 sertas wrote: Why dont they have liquid or any other pro team play against the bot 5v5?
Considering that this event is just 10 days before TI starts, I imagine Liquid will be a little preoccupied around then (as should any team at TI). I know that Blitz asked some other TI players to join him and got turned down.
I believe at The International itself they intend to have a real showmatch vs. a real professional team (similar to last year's 1v1 vs. a real pro mid, Dendi), although more than that hasn't been announced. Currently we are discussing their most recent update, the showmatch vs. Blitz/Fogged/Cap/Merlini +1 on August 5th.
I'm actually a bit sad they nerfed the bots' reaction time to 200 ms. I wanted to see some nutty plays, like a bot deflecting every Finger of Death from a Lion with a Lotus Orb, or a bot perfectly dropping Tranquils between auto-attacks while fighting some dude.
Seeing some next-level warding by the bots will hopefully be cool though.
I actually think the reaction-time change could have been made so the bots get more practice in; I feel like with less frequent polling they get more meaningful data per observation.
With the original restrictions they had won against some 5.5k mmr stack already. But at least it seems that they can do decently with the new setup too.
It's quite hard to assess what they still have to be able to improve because we have really no idea how the bots won and lost. Simply saying that they won in some matches isn't so interesting to me. It would be nice if they would release these sorts of matches for viewing. The audience event would also be a better benchmark if the humans could see the bots in action before they play them, even if it's a previous version.
This is cool, and I'm very impressed with how quickly they've been crossing restrictions off their list.
The one thing I wish they'd do is play one game without the restrictions, so we can watch how the bot fails when it goes up against something it doesn't know how to handle. I'm sure this would lead to plenty of glitchy, exploitable behavior, but it sounds both entertaining and educational to look at what happens when you turn it loose on the full game.
I'm quite curious to see it, just wish it was more extensive than 3 games like I've said before. Not really sure what I expect as the result. 1-2 weeks ago the bots played against some 5k - 6.5k players and went 2-2 in 4 games. There should still be somewhat of a jump from that to beating this group of players (better players, they have more experience playing properly in a team setting, etc). But it seems conceivable that the bots could win after the latest training run, especially if their play is quite different to humans. Humans may need some time to adjust if the bots are beating them in a way that the humans had a hard time anticipating.
Rather than just the result, I'm most curious about what sort of capabilities the bots show in different areas of the game. And I would be interested to know what parts of the game turn out to be the hardest for the bots to learn. I wouldn't be surprised if the bots are very good at 5 man heavy playstyles and teamfighting in general. But I would be a bit surprised if the bots can beat the humans in a pickoff heavy game with Slark, can abuse Slark's night vision or can abuse fog of war really well with something like Earthshaker to make humans constantly play scared.
Also, hopefully we see the bots on various different heroes out of the pool of 18. If they just end up playing Viper, Necro, Lich, etc., it won't be as exciting as seeing them on Earthshaker, Slark, SF or Sven. Actually, I'm probably most interested in seeing the bots play a draft where they clearly have the worse teamfight. Regardless of whether they'd win in that case, it would be very interesting to see how they approach it.
I think the main point of issue with this showmatch is the 5 invul couriers. They can perpetually ferry mangos and sustain their push easily.
That being said it's still impressive to see some of their laning and teamfight coordination. They're so aggressive and apply so much pressure, it's crazy.
On August 06 2018 04:56 Wineandbread wrote: I think the main point of issue with this showmatch is the 5 invul couriers. They can perpetually ferry mangos and sustain their push easily.
That being said it's still impressive to see some of their laning and teamfight coordination. They're so aggressive and apply so much pressure, it's crazy.
Yeah I didn't expect the couriers to be a big deal, but now that I see how they play with salves it's pretty important.
Top on the list for OpenAI team: removing 5 couriers precondition. On why no illusions and summons: they didn't want the game to be won because OpenAI can micro better than humans, they want the focus to be more on strategy.
Not sure how the draft works, but I hope not every draft is similar to the one against the audience members. A 5 man heavy lineup with superior teamfight and constant salve ferrying with invulnerable couriers is sort of exactly what I was afraid of. Even worse because I also didn't realize that they would use the couriers like that.
Though certainly what the bots did early on with their really quick rotations was quite interesting and I'm curious to see how it works against the top players. Also their spellcasting was clearly good as expected. Not sure about laning in general yet. But after they got ahead in that game it was almost a formality with the regen ferrying and the limited counter push the audience members had.
Feels like the humans need to fight less in the next games. Engaging against the bots seems completely hopeless unless you have first outmaneuvered them or you can somehow surprise them from fog.
On August 06 2018 06:07 plasmidghost wrote: Humans need a hard farmer that can dodge fights
Yeah. Though this game also they could have done that more but they understandably played like they probably would against humans and the bots are really hard to fight against.
Humans probably need to get BKBs to fight.
Alright Fogged with blink now. Let's see Lion insta hex him lol
Very interesting to see how many consumables the bots use. But it's kind of lame that they mostly win just by perfectly executing teamfights and having superior reaction time.
You could almost have regular Dota bots beat pro teams if you gave them 0 reaction time and scripts similar to the ones cheaters use.
The bots' diving and skill usage is near perfect. The human team got caught with their pants down so often due to perfect timing on follow-ups to skills like Assassinate/Hex/stuns.
Razor was unfortunately useless that game; CM and Necro were pretty non-impactful as well.
To think they started a trilane with DP/Gyro/Lion and put Lich in their safelane farm against Necro lol. Hope to see some good adjustments by the casters in the next game.
We'll have to see if the caster crew can stabilize at all and if the bots change up the gameplan, but right now it's really just the VG strat. Sustain to stay aggressive and emphasize levels to win midgame.
Creep pulls should screw them over heavily, they didn't seem to react at all to creeps being anywhere except lane. Observer wards in lane instead of jungles should spot the early rotations, then the players should just hang back.
On August 06 2018 06:18 cecek wrote: Very interesting to see how many consumables the bots use. But it's kind of lame that they mostly win just by perfectly executing teamfights and having superior reaction time.
5 couriers makes it very hard to play a "normal" dota game vs OpenAI, I would love to see that restriction removed
I'm not really following how they can pick Lich in that case. They think Lich gives them 43.8% chance to win, surely there is a hero that they think gives them more than 50% chance to win as 1st pick?
Yeah, pretty clear the bots don't understand lane control or movement blocking yet. Which is weird considering they learned to creep block at the start.
I think the bots are going to have a hard time against top teams that have better coordination/execution. Rushing BKBs seems like a good way to win against them.
Genuinely don't understand why they aren't spamming Hoods/Pipes against pure magic damage teams that completely rely on being able to blow stuff up with spells.
On August 06 2018 06:39 WolfintheSheep wrote: Yeah, pretty clear the bots don't understand lane control or movement blocking yet. Which is weird considering they learned to creep block at the start.
They learned to creep block when given an incentive to do so. The OpenAI team probably decided there wasn't a strong reason to keep it around.
On August 06 2018 06:36 spudde123 wrote: I'm not really following how they can pick Lich in that case. They think Lich gives them 43.8% chance to win, surely there is a hero that they think gives them more than 50% chance to win as 1st pick?
Maybe the bot just thinks that picking second is inherently better?
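The "picking second is inherently better" intuition can be sketched as a max-min calculation: if the opponent always gets to counter-pick, the best first pick maximizes the worst case, which can easily sit below 50% even for the strongest option. All the numbers and matchups below are made up for illustration; they are not the bot's actual win estimates.

```python
# Toy, hypothetical win-rate table: P[first_pick][reply] is the first
# picker's estimated win probability if the opponent answers with `reply`.
P = {
    "Lich":   {"Necrophos": 0.46, "Viper": 0.44},
    "Sniper": {"Necrophos": 0.41, "Viper": 0.43},
}

def best_first_pick(P):
    # The second picker counters, so the first picker maximizes the
    # minimum over the opponent's replies (a one-step max-min).
    return max((min(replies.values()), hero) for hero, replies in P.items())

value, hero = best_first_pick(P)
print(hero, value)  # Lich 0.44 -- below 50%, yet still the best first pick
```

Under this reading, a first pick of Lich at 43.8% would simply mean every available first pick is countered by some reply, which is exactly the second-picker advantage.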
On August 06 2018 07:03 Aerisky wrote: This is best of 5, right? Damn though this is pretty crazy, even if it's kind of cheesy with the couriers and stuff.
On August 06 2018 06:36 spudde123 wrote: I'm not really following how they can pick Lich in that case. They think Lich gives them 43.8% chance to win, surely there is a hero that they think gives them more than 50% chance to win as 1st pick?
Maybe the bot just thinks that picking second is inherently better?
Yeah this game has been quite interesting because the AI has had to play completely differently and they showed some ability to pressure multiple parts of the map and pull the humans apart. Wish we got to see a few more games with different drafts.
They seem to have a rather simplistic concept of the game at this stage: it's mostly pick heroes with long range nukes that enable them to make kills and/or zone out opposing heroes, then push as 5. Not sure how far they can get just by playing bot vs bot matches; perhaps allowing them to analyze top level replays from pro teams might give them more ideas/concepts that they can utilize.
On August 06 2018 08:38 FreakyDroid wrote: They seem to have a rather simplistic concept of the game at this stage, its mostly pick heroes with long range nukes that enables them to make kills and/or zone out opponent heroes and then push as 5. Not sure how far they can get just by playing bot vs bot matches, perhaps allowing them to analyze top level replays from pro teams might give them more ideas/concepts that they can utilize.
True, but I believe the goal of the team is to make a self learning machine. Being extremely good at dota can come second.
So maybe they are willing to accept that the machine is not optimal in some situations because it hasn't had enough time to encounter those situations yet, rather than "help it get good faster" by feeding it data.
But... maybe they'll talk about it on the panel. The team members haven't been very technical so far, though, I found. The blog was much more useful.
All in all, I'd say the humans did pretty badly in the early game; it feels like better and more experienced players would do much better than high MMR casters. The bots certainly played well and would probably beat most random 5 man teams, but this was not the show I feel they advertised it would be; the bots didn't blow my mind with their mechanics.
It was a treat to see how machine learning playing by itself can be this good. It felt like the AI always has the initiative and coupled with its efficiency it just snowballs to a win. Also quite scary to actually see AI's aggression.
This was cool, and the bot is impressive, but I remain unconvinced it would beat a real pro team in its current state. A lot of the advantages it got in those first two games looked like unforced errors from the human players; in particular, they seemed to take a really long time to learn not to try and defend pushes with 2-3 heroes, even after getting dived and dying over and over again.
Of course, the bot will probably also be better by the time it plays a real pro team, so who knows?
On August 06 2018 09:32 polgas wrote: It was a treat to see how machine learning playing by itself can be this good. It felt like the AI always has the initiative and coupled with its efficiency it just snowballs to a win. Also quite scary to actually see AI's aggression.
I think this was the most boring part of the AI strat, IMO. Coordinating spells and calculating damage is fairly basic behaviour, and should be a fairly direct line of learning once kill rewards are input as variables. Ditto to charging down towers, it should just be reward based decision making, since those are the most valuable targets at that point in the game (and that's not even considering abstract value like map control).
How that factors into draft was interesting, but ultimately limited by the hero pool. We saw from the draft probabilities that almost everything for the bots came down to a single gameplan.
The 3rd game split-pushing and tactical feeding was much more interesting overall, as that goes well beyond X > Y decision making. Obviously they need some further development to learn how to not feed, and how to prioritize saving major objectives over staying alive, but it showed some actual reactionary strategy and game-state evaluation.
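The "reward based decision making" described above reads like reward shaping: small dense rewards for intermediate events (last hits, kills, towers) rather than only the sparse win signal. A minimal sketch, with purely hypothetical event names and weights, not OpenAI's actual values:

```python
# Purely hypothetical shaped reward. The weights are illustrative only:
# towers are weighted highest among intermediate objectives, mirroring
# the "charging down towers" behaviour discussed in the thread.
REWARD_WEIGHTS = {
    "last_hit": 0.16,   # small, dense signal
    "kill":     1.0,
    "tower":    2.0,    # most valuable intermediate objective
    "death":   -1.0,
    "win":      5.0,    # the sparse terminal reward
}

def shaped_reward(events):
    # Unknown events contribute nothing.
    return sum(REWARD_WEIGHTS.get(e, 0.0) for e in events)

print(shaped_reward(["kill", "kill", "tower"]))  # 4.0
```

With a table like this, "X > Y decision making" falls out directly: any action sequence is scored by its summed event rewards, which is why dense objectives like towers dominate behaviour before longer-horizon strategy does.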
Yeah, there seem to be some weaknesses that may be exploited by pros. I need to watch the games again, but as far as I recall the humans did very well in terms of last hits in solo lanes. Laning seems like an area where the bots may be exposed more. Though the bots seemed very good at playing their aggro lane, so the humans have to be pretty careful with how they play that lane so they don't get run over. Because of the limited pool and the way the bots like to pick, the game can snowball pretty badly if the humans screw up early on. And the bots were indeed very good at keeping the pressure up and immediately jumping on heroes that were even a bit out of position.
For pros it's also very useful that they've now seen the bots play so they have some idea of what to expect even if it's not the same version. Pretty clearly today the human team was taken by surprise by a lot of things and they played various situations as they would against humans and just got owned by the spell casting from the bots.
I'm interested to see what the bots do once they don't all get a courier. How do they prioritize couriers? Is it similar to the way human teams do it? All the little innovations that they do are pretty slick.
On August 06 2018 09:32 polgas wrote: It was a treat to see how machine learning playing by itself can be this good. It felt like the AI always has the initiative and coupled with its efficiency it just snowballs to a win. Also quite scary to actually see AI's aggression.
I think this was the most boring part of the AI strat, IMO. Coordinating spells and calculating damage is fairly basic behaviour, and should be a fairly direct line of learning once kill rewards are input as variables. Ditto to charging down towers, it should just be reward based decision making, since those are the most valuable targets at that point in the game (and that's not even considering abstract value like map control).
How that factors into draft was interesting, but ultimately limited by the hero pool. We saw from the draft probabilities that almost everything for the bots came down to a single gameplan.
The 3rd game split-pushing and tactical feeding was much more interesting overall, as that goes well beyond X > Y decision making. Obviously they need some further development to learn how to not feed, and how to prioritize saving major objectives over staying alive, but it showed some actual reactionary strategy and game-state evaluation.
Agree 3rd game was interesting but in the end it felt like the AI was playing to delay the inevitable rather than to win. I believe if the bots were not scripted to buy items based on a guide it might have been more challenging.
all the games were draft wins, the computer came up with the best strategy/draft for that tiny 18 hero draft pool. It's impossible to counterpick from such a small hero pool that people don't have experience choosing from.
On August 06 2018 09:32 polgas wrote: It was a treat to see how machine learning playing by itself can be this good. It felt like the AI always has the initiative and coupled with its efficiency it just snowballs to a win. Also quite scary to actually see AI's aggression.
I think this was the most boring part of the AI strat, IMO. Coordinating spells and calculating damage is fairly basic behaviour, and should be a fairly direct line of learning once kill rewards are input as variables. Ditto to charging down towers, it should just be reward based decision making, since those are the most valuable targets at that point in the game (and that's not even considering abstract value like map control).
How that factors into draft was interesting, but ultimately limited by the hero pool. We saw from the draft probabilities that almost everything for the bots came down to a single gameplan.
The 3rd game split-pushing and tactical feeding was much more interesting overall, as that goes well beyond X > Y decision making. Obviously they need some further development to learn how to not feed, and how to prioritize saving major objectives over staying alive, but it showed some actual reactionary strategy and game-state evaluation.
Agree 3rd game was interesting but in the end it felt like the AI was playing to delay the inevitable rather than to win. I believe if the bots were not scripted to buy items based on a guide it might have been more challenging.
Probably a lot less challenging for human opponents. If the bots were scripted to buy recommended items, then they probably have no idea how to evaluate item buys, or the recommended guide is actually superior to their current AI.
Which makes sense. Future predictions are not a simple task, and getting an AI to evaluate the risk/reward of buying immediate item benefits vs saving for strong future items suited to the enemies is going to be a giant hurdle.
On August 06 2018 12:21 acie wrote: all the games were draft wins, the computer came up with the best strategy/draft for that tiny 18 hero draft pool. It's impossible to counterpick from such a small hero pool that people don't have experience choosing from.
Maybe every win could be attributed to draft? Maybe that's actually a thing, where one team could theoretically have an 80%+ chance to win based off draft, but because they're human (and make mistakes) they never actually realize the 90%+ win probabilities that the bots were often able to achieve.
On August 06 2018 08:38 FreakyDroid wrote: They seem to have a rather simplistic concept of the game at this stage, its mostly pick heroes with long range nukes that enables them to make kills and/or zone out opponent heroes and then push as 5. Not sure how far they can get just by playing bot vs bot matches, perhaps allowing them to analyze top level replays from pro teams might give them more ideas/concepts that they can utilize.
True, but I believe the goal of the team is to make a self learning machine. Being extremely good at dota can come second.
So maybe there are willing to accept that the machine is not optimal in some situations because it didn't have enough time to encounter those situations yet, rather than "help it get good faster" by feeding it data.
But... maybe they'll talk about it on the panel. Team members have not been very technical yet, though, I found. The blog was much more useful.
Well yeah, but there's clearly a pretty big gap in their learning process. It seemed as if they do stuff that directly rewards them, i.e. take a tower, last hit a creep, make a kill, prevent the enemy from hitting my tower, etc. Dota is way more nuanced than that; the reward doesn't always follow a few simple linear steps, and it involves a lot of prediction/foresight, something these bots didn't have.
In the first two games I think our casters were intimidated by the bots' aggressiveness and spell usage and didn't know how to respond, but I'll bet my life on it that if they play another Bo3 they'll crush the bots. They kept saying the bots learn 180 years per day, or rather the equivalent of it, but I think that's a very misleading number: it's simply hours of gameplay of bots doing stupid things, and the actual learning they get from that is very little compared to what a human can learn in the same time. We learned how to deal with them in 3 games, or 90 minutes. What I'm saying is that it's a bit dishonest to equate machine learning time with human learning time; 180 years per day is not a good metric.
On August 06 2018 12:21 acie wrote: all the games were draft wins, the computer came up with the best strategy/draft for that tiny 18 hero draft pool. It's impossible to counterpick from such a small hero pool that people don't have experience choosing from.
Maybe every win could be attributed to draft? Maybe that's actually a thing, where 1 team could theoretically have 80%+ chance to win based off draft but because they're human (and make mistakes) they don't actually have that 90%+ chance to win that the bots were often able to achieve.
The bots are still just as flawed as humans. Maybe more so, because their flaws are completely predictable. The win % is just programmed cockiness, no more or less accurate than Dota Plus.
Concerning draft wins, of course no draft is exactly 50-50. From a human perspective it's often hard to evaluate because humans can never play the same draft over and over again, the game changes all the time, and how good a draft is also depends on who is playing it. Some players are more comfortable on certain heroes and playstyles than others. The bots base their win probabilities on their self-play results. It's not some ultimate truth either, but rather reflects to an extent what sort of lineups the bots have found easier to learn and what sort of situations they know how to play better.
And indeed I think it's quite wrong to think about the bots as some machines that don't make mistakes. The bots have a clear advantage for example in terms of teamfighting, largely because they can observe and process everything instantly while humans have to move their camera around, click on targets, press buttons, communicate to their teammates and so on. Often the easily observable things like missed spells are what people call mistakes, but there are a lot of other things you can do wrong. Did they put their lanes correctly, did they play the laning matchups perfectly (here for example the answer is obviously no), did they choose the right times to go high ground in games 1 and 2, did they have a coherent plan to come back in game 3?

I think the teamfight advantage is likely the big reason why the bots were so confident about the matches. I suspect it's quite hard for the bots to learn how to overcome a disadvantage against a team that has clearly the better teamfight. In the last game it looked to me like the bots managed to get through the early game even surprisingly well and got levels and some farm on all their heroes but then they didn't really know how to play a pickoff and farm heavy game with a later timing in mind that well.
Wait...so does this mean that the OpenAI actually predicted that they would lose with a ton of confidence too? Somehow this makes it feel like the bots still won even when they lost lol
I think VP/Liquid/LGD couldn't stomp the caster crew with those lineups. Actually, most of the TI teams probably could've pulled it off...
And the SEA and SA teams would intentionally draft those.
If they really want to impress they should include OpenAI drafting and all of the heroes. To limit the hero pool and then give OpenAI an early game push lineup is just ridiculous. If OpenAI outdrafted humans through a normal draft, then I would fully support our new robot overlords.
Wait...so does this mean that the OpenAI actually predicted that they would lose with a ton of confidence too? Somehow this makes it feel like the bots still won even when they lost lol
Their prediction is based on the other team also being the same bots. I think partly this is because clearly the humans had the far better 5 man teamfight (without significant items) in the last game, so it's very hard for the bots to win against themselves in that kind of a situation. Especially given that their lineup didn't have a laning advantage or anything like that.
But I was quite impressed by how the bots played the early part of the game, and I think the game was winnable for them around the 15min mark when all their heroes had some sort of farm and levels. They had also managed to take more towers than the humans, which should make it easier for them to farm efficiently and set up pickoffs.
I think what a good human team would do in that situation is to let Sven be largely separate from his team and just farm, set up aggressive wards and look for pickoffs with Riki, Axe, Slark and QoP. Then when you get to Sven's BKB timing, Slark has SB+next item, etc you can start fighting properly. The bots didn't really seem to know how to play that sort of game and instead got into bad fights or got picked off. Largely this seemed to be because they didn't know how to use wards and Riki to scout the enemy out and be more efficient. They seem to be very hard to beat when they have the teamfight advantage and they can push. A lot of the best high ground defenders are not in the pool so it's pretty hard to specifically pick against that sort of play also.
Agreed, the bots didn't know how to play that lineup. They did OK early game, but then they needed to split, dodge and look for teamfights. They sort of did that, but not nearly as well as a top human team would use this draft.
Nice analysis of OpenAI's gameplay, although some of the strategies may be viable because of the 5 couriers:
- Prioritize drafting heroes that can nuke early
- Does not buy regen items on start, buys stats or damage instead
- Does not get stick or wand, buys salves and clarities instead
- Focuses on crushing the enemy safe lane in laning stage
- Solo lanes for heroes with good teamfight abilities (Lich, Tide)
- Rotates other heroes on solo lanes to catch up on levels
- Balances the levels on all heroes
- Focuses on burst damage and control
- Good at punishing positional mistakes
Some observations. It seems like they base their win probability too much on hero composition: in the third game, at one point ten minutes in, they were actually slightly winning and yet their win probability was still only 10% or so. I think they don't take net worth, skill and map control into account that much.
This made them play the third game way too risky in a sense, as they kept sending solo heroes to push and cut the waves, allowing the human team to catch these heroes out and kill them, so the bots kept falling further behind. I think if the bots had learned or been programmed to base their prediction more on actual gameplay, they might even have won the 3rd game by playing a more normal game and teamfighting more, rather than the risky solo pushes and creep cutting.
The biggest advantage for the bots is the 5 invulnerable couriers, though. As good as their teamfight seems to be, they get to that point by winning their lanes quite hard by abusing the five couriers. This allowed the bots to start off with basically zero regen and then just ferry HP and mana regen over and over, letting them stay in lane and constantly fight, while the humans weren't used to 5 couriers and played a game more suited to what we are used to: one vulnerable courier.
I think the AI would have lost all the games if it had to play with just 1 normal courier. Again, the strategy the AI executed was built around the 5 invulnerable couriers: basically using max regen at all times in the lanes to constantly fight, push and take towers early.
So yeah. I think the AI will need to change its strategy completely once they have to play a fully normal game of dota. These strats and gameplay won't work in a game with a normal courier.
Another point is the limited hero pool. Obviously the AI has trained extensively with those 18 heroes and knows the best strategies and how to utilize those heroes, unlike the humans, who play with 120+ heroes and certainly don't have as much experience with just those 18.
So yeah, the AI is definitely advanced and it does seem to be high level, but only within the context of the limitations. I think if this AI version was put to play the complete normal game it would lose every single time to any decent 4k+ team.
Nice analysis of OpenAI's gameplay, although some of the strategies may be viable because of the 5 couriers:
- Prioritize drafting heroes that can nuke early
- Does not buy regen items on start, buys stats or damage instead
- Does not get stick or wand, buys salves and clarities instead
- Focuses on crushing the enemy safe lane in laning stage
- Solo lanes for heroes with good teamfight abilities (Lich, Tide)
- Rotates other heroes on solo lanes to catch up on levels
- Balances the levels on all heroes
- Focuses on burst damage and control
- Good at punishing positional mistakes
The regen part is due to 5 couriers. Normally you are extremely limited in amount of courier time you can take up to ferry your regen.
And there's that typical "losing base but probability says we can't fight so just ignore base" behaviour.
I think it's a different flavor; their training rewards them for taking towers even if they lose, so the reaction to a certain loss is to push another lane and try to take a tower.
Neat! How did you know OpenAI only learns within a 5 minute window?
I think this is one of the reasons why the 3rd game went the way it did. The bots still can't match humans' long-term strategic thinking, and just keep sacrificing themselves for short-term gain (towers) instead of waiting for items. Humans have a chance yet!
It's not that OpenAI only learns within a 5 minute window, but rewards in the future are discounted, so that rewards in 5 minutes are worth 63% as much as the present, rewards in 10 minutes are worth 40% as much, and only 25% for 15 minutes in the future.
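Those figures are consistent with plain exponential discounting: compounding the 63%-per-5-minutes factor quoted above reproduces the other two numbers (the 0.63 is taken from the post, not from OpenAI's published gamma):

```python
# A reward 5 minutes out is worth 63% of an immediate one; under
# exponential discounting the same factor compounds per 5-minute step.
per_5min = 0.63
print(round(per_5min ** 2, 2))  # 10 minutes ahead -> 0.4
print(round(per_5min ** 3, 2))  # 15 minutes ahead -> 0.25
```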
There's very little detail on the OpenAI TI match; no mention of it at all on the official TI site. I speculate that the winner of the All Star match will get to play OpenAI Five.
It was too obvious they have no long term strategy, my first impressions were these:
On August 06 2018 08:38 FreakyDroid wrote: it seemed as if they do stuff that directly rewards them, ie take tower, last hit creep, make a kill, prevent them from hitting my tower etc. Dota is way more nuanced than that, the reward doesn't always have to follow a few simple linear steps, it involves a lot of prediction/foresight something which these bots didnt have.
After reading Evan's blog, I'm glad that my observations, made without any prior knowledge of how the AI worked, are pretty spot on. So my question to Evan, or anyone who knows about it, is: is it hard to code an AI that can, for lack of a better word, remember a more complex network of steps, or plan/plot a more complex strategy that isn't so dependent on the immediate reward(s)? Basically, try to mimic foresight and perhaps sacrifice an immediate reward in order to gain an advantage later on. Or is compute power (or perhaps storage) the problem?
On August 06 2018 08:38 FreakyDroid wrote: it seemed as if they do stuff that directly rewards them, ie take tower, last hit creep, make a kill, prevent them from hitting my tower etc. Dota is way more nuanced than that, the reward doesn't always have to follow a few simple linear steps, it involves a lot of prediction/foresight something which these bots didnt have.
After reading Evan's blog, Im glad that my observations without having any prior knowledge of how the AI worked are pretty spot on. So my question to Evan or anyone who knows about it is, is it hard to to code an AI that can, for the lack of better word, remember a more complex network of steps, or maybe plan/plot a more complex strategy that isnt that dependent on the immediate reward[s]? Basally try to mimic foresight and perhaps sacrifice an immediate reward in order to gain an advantage later on. Or is compute power (or perhaps storage) a problem?
Basically computational power is the limit. Chess/Go AI still have that problem. Theoretically a computer would be unstoppable because it would just compute every possibility and win from there, but that takes so much computing power that you'll probably never have a complete analysis like that until quantum computers.
So for chess, much like how actual Grandmasters play, it's simpler and better to start with a full set of game openers, then the endgame scenarios, and have everything off-script analyze 2-3 moves ahead.
Dota is the same, but a lot worse. Though, to be a bit fair, a complex strategy and plan is already in play with OpenAI. The end goal is to kill the enemy ancient, the opposition is 5 completely random and unpredictable agents, and the AI had to create a plan and strategy to still reach that goal.
Obviously you're talking about higher complexity of micro-strategies, but it's somewhat important to note that because of how learning AI works, it's not actually just risk/reward evaluating the game. How it acts now is the result of thousands of simulations with only supplied knowledge of the game mechanics. The AI is playing what it has determined to be the best strategy; it's just not flexible enough to do anything else.
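The "analyze 2-3 moves ahead" idea is depth-limited minimax. A minimal sketch over a made-up toy game tree (not chess, just the shape of the search; the tree and leaf values are invented):

```python
# Depth-limited minimax over a toy game tree. Structure and leaf
# values are made up purely for illustration.
TREE = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F", "G"]}
LEAF_VALUES = {"D": 3, "E": 5, "F": 2, "G": 9}

def minimax(node, depth, maximizing):
    children = TREE.get(node, [])
    if depth == 0 or not children:
        # Depth limit or leaf: fall back to a static evaluation,
        # which is where an engine's "off-script" heuristic lives.
        return LEAF_VALUES.get(node, 0)
    values = [minimax(c, depth - 1, not maximizing) for c in children]
    return max(values) if maximizing else min(values)

print(minimax("A", 2, True))  # 3: the best worst-case two moves ahead
```

The depth parameter is exactly the "2-3 moves ahead" budget; everything deeper is replaced by the static evaluation, which is why opening books and endgame knowledge matter so much in practice.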
On August 06 2018 08:38 FreakyDroid wrote: it seemed as if they do stuff that directly rewards them, ie take tower, last hit creep, make a kill, prevent them from hitting my tower etc. Dota is way more nuanced than that, the reward doesn't always have to follow a few simple linear steps, it involves a lot of prediction/foresight something which these bots didnt have.
After reading Evan's blog, Im glad that my observations without having any prior knowledge of how the AI worked are pretty spot on. So my question to Evan or anyone who knows about it is, is it hard to to code an AI that can, for the lack of better word, remember a more complex network of steps, or maybe plan/plot a more complex strategy that isnt that dependent on the immediate reward[s]? Basally try to mimic foresight and perhaps sacrifice an immediate reward in order to gain an advantage later on. Or is compute power (or perhaps storage) a problem?
So reinforcement learning has a very bad "sample complexity", which is to say: to learn a close-to-optimal policy (strategy) you need to play a ton of games.
The lower bound for the number of games is proportional to (1 / (1 - gamma))^3,
where gamma is the "discount factor" that determines how far into the future you can see.
So the closer gamma is to 1.0, the further into the future you can see, but the more samples you need. This makes sense: the further you plan into the future, the more data / games you need to play in order to grasp what the best strategy is.
OpenAI is using a gamma of 0.9997, which translates to at least 37037037037 Dota games.
I emphasize AT LEAST because this is a LOWER bound; the real number is likely much, much higher than this, probably 2^100 times bigger if not more.
So in a sense, yes, computation is the bottleneck.
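Plugging the quoted gamma into that bound is a two-line check (a back-of-the-envelope sketch; the gamma and the formula come from the post above, and none of this is OpenAI's actual training budget):

```python
# The (1 / (1 - gamma))^3 lower bound from the post, evaluated at
# the quoted gamma of 0.9997.
gamma = 0.9997
horizon = 1 / (1 - gamma)      # effective planning horizon, ~3333 steps
lower_bound = horizon ** 3     # ~3.7e10, i.e. the ~37 billion games figure
print(f"{lower_bound:.3e}")
```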
On August 17 2018 22:39 FreakyDroid wrote: It was too obvious they have no long term strategy, my first impressions were these:
On August 06 2018 08:38 FreakyDroid wrote: it seemed as if they do stuff that directly rewards them, i.e. take a tower, last hit a creep, make a kill, prevent them from hitting my tower, etc. Dota is way more nuanced than that; the reward doesn't always have to follow a few simple linear steps. It involves a lot of prediction/foresight, something which these bots didn't have.
After reading Evan's blog, I'm glad that my observations, made without any prior knowledge of how the AI worked, are pretty spot on. So my question to Evan, or anyone who knows about it: is it hard to code an AI that can, for lack of a better word, remember a more complex network of steps, or plan/plot a more complex strategy that isn't so dependent on the immediate reward[s]? Basically, try to mimic foresight and perhaps sacrifice an immediate reward in order to gain an advantage later on. Or is compute power (or perhaps storage) a problem?
Basically, computational power is the limit. Chess/Go AI still have that problem. Theoretically a computer would be unstoppable, because it would just compute every possibility and win from there. But this takes so much computing power that you'll probably never have a complete analysis like that until quantum computers.
So for chess, much like how actual Grandmasters play, it's simpler and better to start with a full set of game openers, then the endgame scenarios, and everything off-script gets analyzed 2-3 moves ahead.
Dota is the same, but a lot worse. Though, to be a bit fair, a complex strategy and plan is already in play with OpenAI. The end goal is to kill the enemy Ancient, the opposition is 5 completely random and unpredictable agents, and the AI has to create a plan and strategy to still reach that goal.
Obviously you're talking about higher complexity of micro-strategies, but it's somewhat important to note that because of how learning AI works, it's not actually just risk/reward evaluating the game. How it acts now is the result of thousands of simulations with only supplied knowledge of the game mechanics. The AI is playing what it has determined to be the best strategy, but it's just not flexible enough to do anything else.
In OpenAI's case the bots are not directly trying to win the game but rather they learn to maximize their manually crafted reward. Certainly some things they do require something like foresight, like prioritizing getting a tower kill in 30 seconds over farming individually right now. But there are some things that seem quite difficult to fully learn with this sort of approach.
Take warding for example. Placing a ward alone doesn't give them any reward. So they would have to learn that placing a ward in a specific sort of game situation in a specific spot allows them to play better over the next six minutes or so. It'll be interesting to see how the warding develops over time. Sentries to counter invis heroes seems a bit more simple because it can be more short term (though obviously even there it's often good to use obs+sentry combos to spot movements) and I suspect they could learn how to use wards behind a tower to improve their tower pushes.
Also another thing that came to mind is how they deal with the prospect of losing a game. The bots aren't taught to deal with the ancient blowing up as a complete disaster but rather it's just one aspect that gives a negative reward at some point in the future. When behind, a human team is likely to take a risk and go for a rosh fight for example because they know the enemy will likely be able to push their base if they get the aegis. As far as I gather, this sort of approach isn't necessarily what the bots would do. They may think that taking a fight now would result in them losing it and they just continue farming because it is what maximizes their reward for the next few minutes. However, they may end up straight up losing shortly after that, but from their perspective it isn't worse than probably getting crushed in a 5v5 fight right now.
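One way to see that trade-off concretely is to look at how much weight a discounted agent puts on a reward that arrives minutes later. The snippet below is a sketch: the 0.2-second action interval is my assumption (roughly OpenAI Five's reported action rate), not a figure from this thread.

```python
# Sketch: the weight gamma^t an agent places on a reward t steps away.
# Assumes one action every 0.2 s (my assumption). With gamma = 0.9997,
# a reward 20 minutes out carries only ~17% of its face value, which is
# one reason "lose the ancient later" can be outweighed by "farm now".
gamma = 0.9997
step_seconds = 0.2

def discount_weight(seconds_ahead: float) -> float:
    """Multiplier applied to a reward arriving seconds_ahead in the future."""
    steps = seconds_ahead / step_seconds
    return gamma ** steps

for minutes in (1, 5, 10, 20):
    print(f"reward {minutes:2d} min away weighted {discount_weight(minutes * 60):.3f}")
```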
But not sure what exactly OpenAI's plans are. Will they continue to use Dota as a test bed to test their ability to adapt to all sorts of weird situations? Or will they just stop the project once they beat pros at TI?
On August 18 2018 17:37 WolfintheSheep wrote: Basically, computational power is the limit. Chess/Go AI still have that problem. Theoretically a computer would be unstoppable, because it would just compute every possibility and win from there. But this takes so much computing power that you'll probably never have a complete analysis like that until quantum computers.
So for chess, much like how actual Grandmasters play, it's simpler and better to start with a full set of game openers, then the endgame scenarios, and everything off-script gets analyzed 2-3 moves ahead.
Dota is the same, but a lot worse. Though, to be a bit fair, a complex strategy and plan is already in play with OpenAI. The end goal is to kill the enemy Ancient, the opposition is 5 completely random and unpredictable agents, and the AI has to create a plan and strategy to still reach that goal.
Obviously you're talking about higher complexity of micro-strategies, but it's somewhat important to note that because of how learning AI works, it's not actually just risk/reward evaluating the game. How it acts now is the result of thousands of simulations with only supplied knowledge of the game mechanics. The AI is playing what it has determined to be the best strategy, but it's just not flexible enough to do anything else.
I think GPU computation will soon surpass CPU computation, and it already has in some areas, so my money would be on that rather than on quantum computers, at least in the near future. I use GPU rendering for 3D work, and while it is faster than CPU (arguably), it has some nasty limitations with RAM, like not being able to combine the RAM from multiple GPUs. However, I'm not sure if that's the case with GPUs for machine/deep learning. The new ones with tensor cores from Nvidia seem tailored towards these kinds of tasks, but seeing as they have zero competition at the moment, I don't think they are in any rush to improve that technology or make it available at cheaper prices.
Yeah, that was my read on the AI, which didn't knock my socks off. I know the OpenAI team were happy with the result, but I personally didn't see the bots as anything special, because the only thing that can impress me is planning and strategy rather than godlike execution. However, now that I understand the limitations it has due to compute power, I wonder what it will take to have a breakthrough in this field: better algorithms, better compute power, or maybe both.
On August 17 2018 22:39 FreakyDroid wrote: It was too obvious they have no long term strategy, my first impressions were these:
On August 06 2018 08:38 FreakyDroid wrote: it seemed as if they do stuff that directly rewards them, i.e. take a tower, last hit a creep, make a kill, prevent them from hitting my tower, etc. Dota is way more nuanced than that; the reward doesn't always have to follow a few simple linear steps. It involves a lot of prediction/foresight, something which these bots didn't have.
After reading Evan's blog, I'm glad that my observations, made without any prior knowledge of how the AI worked, are pretty spot on. So my question to Evan, or anyone who knows about it: is it hard to code an AI that can, for lack of a better word, remember a more complex network of steps, or plan/plot a more complex strategy that isn't so dependent on the immediate reward[s]? Basically, try to mimic foresight and perhaps sacrifice an immediate reward in order to gain an advantage later on. Or is compute power (or perhaps storage) a problem?
So reinforcement learning has a very bad "sample complexity", which is to say, to learn a close-to-optimal policy (strategy) you need to play a ton of games.
The lower bound for the number of games is proportional to (1 / (1 - gamma))^3
where gamma is the "discount factor", which lets you see further into the future.
So the closer gamma is to 1.0, the further into the future you can see, but the more samples you need. This makes sense: the further you plan into the future, the more data/games you need to play in order to grasp the best strategy.
OpenAI is using a gamma of 0.9997, which translates to at least 37,037,037,037 Dota games.
I emphasize AT LEAST because this is a LOWER bound; the real number is likely much, much higher than this, probably 2^100 times bigger if not more.
So in a sense, yes, computation is the bottleneck.
I've never been particularly good at math, but if gamma is 1, then wouldn't that equation be 0, which means the AI would have to play 0 games? Perhaps I'm misunderstanding the equation... so if that's a dumb question, just leave it :D
However, even if the AI had all the computation power it needs, I still don't understand how the ability to plan ahead or devise a longer strategy would help it achieve a more human-like read on the game, when it is still limited by the immediate rewards. Maybe I'm asking the wrong question, dunno.
I just watched this video, but sadly it's way too technical for me to understand.
On August 17 2018 22:39 FreakyDroid wrote: It was too obvious they have no long term strategy, my first impressions were these:
On August 06 2018 08:38 FreakyDroid wrote: it seemed as if they do stuff that directly rewards them, i.e. take a tower, last hit a creep, make a kill, prevent them from hitting my tower, etc. Dota is way more nuanced than that; the reward doesn't always have to follow a few simple linear steps. It involves a lot of prediction/foresight, something which these bots didn't have.
After reading Evan's blog, I'm glad that my observations, made without any prior knowledge of how the AI worked, are pretty spot on. So my question to Evan, or anyone who knows about it: is it hard to code an AI that can, for lack of a better word, remember a more complex network of steps, or plan/plot a more complex strategy that isn't so dependent on the immediate reward[s]? Basically, try to mimic foresight and perhaps sacrifice an immediate reward in order to gain an advantage later on. Or is compute power (or perhaps storage) a problem?
So reinforcement learning has a very bad "sample complexity", which is to say, to learn a close-to-optimal policy (strategy) you need to play a ton of games.
The lower bound for the number of games is proportional to (1 / (1 - gamma))^3
where gamma is the "discount factor", which lets you see further into the future.
So the closer gamma is to 1.0, the further into the future you can see, but the more samples you need. This makes sense: the further you plan into the future, the more data/games you need to play in order to grasp the best strategy.
OpenAI is using a gamma of 0.9997, which translates to at least 37,037,037,037 Dota games.
I emphasize AT LEAST because this is a LOWER bound; the real number is likely much, much higher than this, probably 2^100 times bigger if not more.
So in a sense, yes, computation is the bottleneck.
I've never been particularly good at math, but if gamma is 1, then wouldn't that equation be 0, which means the AI would have to play 0 games? Perhaps I'm misunderstanding the equation... so if that's a dumb question, just leave it :D
However, even if the AI had all the computation power it needs, I still don't understand how the ability to plan ahead or devise a longer strategy would help it achieve a more human-like read on the game, when it is still limited by the immediate rewards. Maybe I'm asking the wrong question, dunno.
As I understand these neural networks, which to be honest I may not fully, the AI learns by creating a gigantic set of data and finding the optimum actions amongst that set.
So the AI won't actually be planning ahead or devising a better strategy on the fly. It needs to have those already in its playset, and then the game needs to match the conditions to use that strategy.
On August 17 2018 22:39 FreakyDroid wrote: It was too obvious they have no long term strategy, my first impressions were these:
On August 06 2018 08:38 FreakyDroid wrote: it seemed as if they do stuff that directly rewards them, i.e. take a tower, last hit a creep, make a kill, prevent them from hitting my tower, etc. Dota is way more nuanced than that; the reward doesn't always have to follow a few simple linear steps. It involves a lot of prediction/foresight, something which these bots didn't have.
After reading Evan's blog, I'm glad that my observations, made without any prior knowledge of how the AI worked, are pretty spot on. So my question to Evan, or anyone who knows about it: is it hard to code an AI that can, for lack of a better word, remember a more complex network of steps, or plan/plot a more complex strategy that isn't so dependent on the immediate reward[s]? Basically, try to mimic foresight and perhaps sacrifice an immediate reward in order to gain an advantage later on. Or is compute power (or perhaps storage) a problem?
So reinforcement learning has a very bad "sample complexity", which is to say, to learn a close-to-optimal policy (strategy) you need to play a ton of games.
The lower bound for the number of games is proportional to (1 / (1 - gamma))^3
where gamma is the "discount factor", which lets you see further into the future.
So the closer gamma is to 1.0, the further into the future you can see, but the more samples you need. This makes sense: the further you plan into the future, the more data/games you need to play in order to grasp the best strategy.
OpenAI is using a gamma of 0.9997, which translates to at least 37,037,037,037 Dota games.
I emphasize AT LEAST because this is a LOWER bound; the real number is likely much, much higher than this, probably 2^100 times bigger if not more.
So in a sense, yes, computation is the bottleneck.
Can you count to ten at least?
2^100 is a ridiculously large number. I'll give you some multiplication.
Imagine that 1 game takes 1 microsecond and that you use all the x86 computers in the world.
Intel produces ~400 million CPUs a year. Let's say this translates to 400 million computers a year, and that we combine all computers created in the last 50 years. That's 20 billion computers. Now we use them to play Dota 2 for the last 8 years. 8 years = 252,460,800 seconds ≈ 2.5 × 10^14 microseconds ≤ 2.6 × 10^14 microseconds.
Combined game-playing power is 20 billion (2 × 10^10) computers times a combined time of 2.6 × 10^14 microseconds.
That's 5.2 × 10^24 games, which is under 2^83. In other words, even if someone ran the simulation on a completely unrealistic number of computers for a completely unrealistic time, assuming one game takes an unrealistically short time to compute, we would still be nowhere near 2^100 games, much less 2^100 × 37,037,037,037 games.
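The arithmetic is easy to sanity-check in a few lines, using the same deliberately generous assumptions as above:

```python
# Sanity check of the estimate above: 2e10 computers playing one game per
# microsecond for 8 years. All three assumptions are absurdly generous,
# and the total still falls short of 2^100 by many orders of magnitude.
computers = 20_000_000_000                       # 50 years x 400M CPUs/year
microseconds = 8 * 365 * 24 * 60 * 60 * 10**6    # ~2.5e14 (ignoring leap days)
total_games = computers * microseconds

print(f"total games: {total_games:.2e}")         # ~5.0e24
print(f"2^83:  {2**83:.2e}")
print(f"2^100: {2**100:.2e}")
print(f"shortfall vs 2^100: {2**100 / total_games:.1e}x")
```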
On August 17 2018 22:39 FreakyDroid wrote: It was too obvious they have no long term strategy, my first impressions were these:
On August 06 2018 08:38 FreakyDroid wrote: it seemed as if they do stuff that directly rewards them, i.e. take a tower, last hit a creep, make a kill, prevent them from hitting my tower, etc. Dota is way more nuanced than that; the reward doesn't always have to follow a few simple linear steps. It involves a lot of prediction/foresight, something which these bots didn't have.
After reading Evan's blog, I'm glad that my observations, made without any prior knowledge of how the AI worked, are pretty spot on. So my question to Evan, or anyone who knows about it: is it hard to code an AI that can, for lack of a better word, remember a more complex network of steps, or plan/plot a more complex strategy that isn't so dependent on the immediate reward[s]? Basically, try to mimic foresight and perhaps sacrifice an immediate reward in order to gain an advantage later on. Or is compute power (or perhaps storage) a problem?
Basically, computational power is the limit. Chess/Go AI still have that problem. Theoretically a computer would be unstoppable, because it would just compute every possibility and win from there. But this takes so much computing power that you'll probably never have a complete analysis like that until quantum computers.
So for chess, much like how actual Grandmasters play, it's simpler and better to start with a full set of game openers, then the endgame scenarios, and everything off-script gets analyzed 2-3 moves ahead.
Dota is the same, but a lot worse. Though, to be a bit fair, a complex strategy and plan is already in play with OpenAI. The end goal is to kill the enemy Ancient, the opposition is 5 completely random and unpredictable agents, and the AI has to create a plan and strategy to still reach that goal.
Obviously you're talking about higher complexity of micro-strategies, but it's somewhat important to note that because of how learning AI works, it's not actually just risk/reward evaluating the game. How it acts now is the result of thousands of simulations with only supplied knowledge of the game mechanics. The AI is playing what it has determined to be the best strategy, but it's just not flexible enough to do anything else.
What you just said is so out of date. You can absolutely compute every move in chess; computers have been able to do that for the past 20 years. The thing is, modern AI systems use intuition. It's essentially really specialized brain power doing the same thing over and over, and it learns by rewards. The simpler the game, the easier it is to maximize play quickly; the more complex the game, the more rewards, and the harder it is to master.
Anyway, Google's DeepMind AI has already beaten the chess computer Stockfish or whatever the name was. In fact it didn't even lose a single game against the computer that calculates literally up to 10k moves ahead; it uses intuition and is able to squeeze out victories.
The biggest problem for chess, for example, is computing the exact number of moves. It's worthless to compute 1000 moves ahead; the game has a 1-in-10k chance to end like that. So it's about understanding how your opponent plays and figuring out what move he's playing toward. For humans this is usually 2-5 moves ahead for professionals, 0-2 moves for amateurs.
So a modern AI is essentially a brain that specializes in a certain thing. With repetition it is able to master things like chess, Go, and even Dota it seems now. The biggest problem I see with today's AI is that it needs too much repetition to learn stuff. A normal human being can learn to play decent Dota in like 20-50 games. I'm talking about not doing dumb shit like suiciding into towers, suiciding into Rosh at level 4, or using salves while enemy heroes are hitting you, just basic stuff like that. For a human it takes 20-50 games to reach a decent level and not do dumb shit; for the AI it takes 2-5 million hours to learn that.
OpenAI has played hundreds of millions of hours, and that is with a limited hero pool and certain game modifications.
The AI's advantage is that it can play 180 years' worth of games in a single day, every day. Though you do need super expensive supercomputers and a gazillion hard drives to store all of that data, and obviously that is a lot of electricity expended for a very specific task.
I'm no expert on AI, but everything about chess in the above post is really just wrong. You absolutely cannot 'compute every move' in chess. The conservative estimate for the number of typical (i.e. 40-move-long) chess games is 10^120, a number that far outstrips the number of atoms in the universe. Even the estimate of reasonable chess games (i.e. excluding obvious blunders) is 10^40. We aren't even close to being able to solve chess via brute force (i.e. a minimax algorithm), and if we were, no 'intuitive AI' would ever find any edge over a traditional computer, as the game would simply be solved.
Saying 'a computer that literally calculates up to 10k moves ahead' is just as misleading as claiming GMs 'usually calculate 2-5 moves ahead'. Chess calculation is not a singular string of white and black moves, it's a game tree starting from the present position that includes all legal moves (the brute-force method, an optimised version of which is used by traditional engines) or all candidate moves (the human method), and going as far as computational or brain power allow. Calculating 4 ply (2 moves each) ahead in a complex middlegame position with 10+ candidate moves per ply is far more difficult (or often even a pointless approach for humans) than calculating 30 ply ahead for a mate in 15 where every single move is forced.
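To make the tree-vs-line distinction concrete, here is a rough calculation of how fast the game tree grows with depth. The branching factor of ~30 is a commonly cited average for chess; treat all outputs as orders of magnitude.

```python
# Rough growth of the chess game tree with search depth. The tree at depth
# d has roughly branching^d nodes, which is why "calculating N moves ahead"
# means very different things in open positions vs forced lines.
branching = 30                      # rough average legal moves per ply
for plies in (4, 10, 20):
    print(f"{plies:2d} plies: ~{branching**plies:.1e} positions")

# Shannon's classic estimate: ~10^3 possibilities per pair of moves over
# ~40 moves, giving the ~10^120 typical games quoted above.
shannon = 10 ** 120
print(f"Shannon number: ~{shannon:.0e}")
```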
But anyway, I realise the post was more about AI than chess or traditional chess computers and I do agree that AlphaZero only taking 4 hours to surpass (an unoptimal version of) Stockfish is incredibly impressive. Though I'd be careful with words like intuition when talking about an AI. While it's true that AlphaZero evaluated far fewer positions than Stockfish, there could be many reasons for that. We simply don't know exactly why AlphaZero is as good as it is. It's one of the most fascinating things to me in all this: With traditional chess computers, we know exactly how they reach their conclusions (we even gave them the algorithms by which they judge positions). With an AI, we really don't, at least as far as I understand it (evan?). We give them the tools to learn, we understand how they learn, but the results of that learning are a black box that can surpass us.
I've never been particularly good at math, but if gamma is 1, then wouldn't that equation be 0, which means the AI would have to play 0 games? Perhaps I'm misunderstanding the equation... so if that's a dumb question, just leave it :D
On August 22 2018 06:30 evanthebouncy! wrote: "I've never been particularly good at math, but if gamma is 1, then wouldn't that equation be 0, which means the AI would have to play 0 games? Perhaps I'm misunderstanding the equation... so if that's a dumb question, just leave it :D"
It's (1 / (1 - gamma) )
so if gamma is 1
you get 1 / (1 - 1) = 1 / 0 = infinite
I believe the point is that you can never have gamma = 1, yes
On August 22 2018 06:30 evanthebouncy! wrote: "I've never been particularly good at math, but if gamma is 1, then wouldn't that equation be 0, which means the AI would have to play 0 games? Perhaps I'm misunderstanding the equation... so if that's a dumb question, just leave it :D"
It's (1 / (1 - gamma) )
so if gamma is 1
you get 1 / (1 - 1) = 1 / 0 = infinite
I believe the point is that you can never have gamma = 1, yes
I mean, if you just want gamma = 1, you'd instead use the episodic horizon H = 20000.
It is fairly agreed upon in the RL community that (1 / (1 - gamma)) ~ H.
One is for stationary RL (i.e. you expect the game to go on forever with no definitive ending), the other for episodic RL (i.e. you expect the game to end at a certain point, e.g. Go, or a short-ish game of Dota).
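A quick check of that equivalence; the H = 20000 figure is taken from the post above, everything else follows from the formula:

```python
# The effective horizon 1/(1-gamma) vs an episodic horizon H.
gamma = 0.9997
effective_horizon = 1.0 / (1.0 - gamma)     # ~3333 steps
H = 20000                                    # episodic horizon from the post

print(f"1/(1-gamma) = {effective_horizon:.0f} steps")
print(f"H           = {H} steps")
# The gamma whose effective horizon matches H exactly is 1 - 1/H:
print(f"gamma with horizon H: {1 - 1/H:.5f}")   # ≈ 0.99995
```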
On August 22 2018 06:30 evanthebouncy! wrote: "I've never been particularly good at math, but if gamma is 1, then wouldn't that equation be 0, which means the AI would have to play 0 games? Perhaps I'm misunderstanding the equation... so if that's a dumb question, just leave it :D"
It's (1 / (1 - gamma) )
so if gamma is 1
you get 1 / (1 - 1) = 1 / 0 = infinite
1/0 does not equal infinite
1/0 is undefined and therefore mathematically impossible.
1. The bots are really good at team fights. (Although maybe because their draft was all teamfight?)
2. Bot wards are horrible. Triple sentry in front of the Rosh pit? Vision ward on top of their own standing tower? Wards in front of the fountain?
3. Once late game hit, the bots wasted tons of ults farming. (Maybe they knew they were losing and were trying to farm their way back?) This ended up hurting them seriously during late-game fights. They also wasted ults to kill a single hero.
4. The bots are really good at diving, but this caused them to overextend sometimes.
5. Really inefficient buyback usage; at one point CM did a "rage buyback".
Then again, despite the bots playing horribly in the late game, they still made the game look really close. Hopefully humans still come out ahead over the next few days, but we'll see. Its weaknesses are really glaring, though.
I have yet to watch the game fully, but based on what we saw before, teamfights are indeed their biggest strength. And because of that it's probably quite hard for humans to just roll over them. It's quite scary to engage them without a significant advantage, either through items or just positioning. From the clips I saw, the bots for example dodged w33's Axe calls several times with blinks, Euls and whatnot, because their reaction time is shorter than the call's cast time. If you can't even do a reliable initiation with something like Axe, you sort of have to slowly win by moving around the map better and only taking favourable fights.
Problem with OpenAI in this game was not attempting to breach high ground when they had a slight lead: - Doesn't seem to know how to use aegis for fights in the high ground, even when they got aegis multiple times. - Giving aegis to Lich. - Letting Gyro get picked off alone when it had aegis. - Blowing their ults on creeps at random times.
Maybe OpenAI calculated that it wasn't able to win the fight uphill and went into farm mode.
OK, it can't be helped that they have to play an older version of the game.
The altered rules are all bullshit, though. It means the pros can't play at their best, so there will always be doubts even if the bot wins.
So they make a weird compromise where Blitz draws up a supposedly fair draft. AND THEN the bot goes haywire and loses the big stage match. OUCH.
OpenAI isn't a massive company, so it's understandable that they haven't quite surpassed Google in AI technology. I still can't help but to be a little disappointed at all this.
On August 24 2018 02:08 Tanukki wrote: OK, it can't be helped that they have to play an older version of the game.
The altered rules are all bullshit, though. It means the pros can't play at their best, so there will always be doubts even if the bot wins.
So they make a weird compromise where Blitz draws up a supposedly fair draft. AND THEN the bot goes haywire and loses the big stage match. OUCH.
OpenAI isn't a massive company, so it's understandable that they haven't quite surpassed Google in AI technology. I still can't help but to be a little disappointed at all this.
I thought they are now playing the current patch? The match against Blitz & co was on some old patch, but I thought I read/heard that it's now playing the TI patch.
Concerning the draft, I think something like what they did is the best way to go for a one-off match. If you allow the bots to always pick their lineup, they will likely only have to play a certain kind of game. It is also interesting how well they've learnt the game overall, and not only whether they can beat humans with some specific kind of strategy. Especially as it's just a one-off match where the humans have no possibility to see the bots play first and they don't play multiple games to allow the human team to adapt. As far as I understood Blitz selected the lineups (and I presume OpenAI people also checked that their bots find them pretty balanced) and then they randomly selected which lineup goes to which team.
Of course it would be great if it would be completely without restrictions, but now that they got rid of the invulnerable couriers I think the games are alright. I think the bigger asterisk on any possible bot win is it being just a one-off match where the humans can't play against the bots first. Dota snowballs really fast, so if you lose one teamfight with a certain kind of draft it may be really hard for you to do anything. Clearly the bots are different to play against than humans, so humans need some time to adjust to what works against them and what doesn't.
On August 24 2018 02:08 Tanukki wrote: OK, it can't be helped that they have to play an older version of the game.
The altered rules are all bullshit, though. It means the pros can't play at their best, so there will always be doubts even if the bot wins.
So they make a weird compromise where Blitz draws up a supposedly fair draft. AND THEN the bot goes haywire and loses the big stage match. OUCH.
OpenAI isn't a massive company, so it's understandable that they haven't quite surpassed Google in AI technology. I still can't help but to be a little disappointed at all this.
If I may... the rules are really not THAT restrictive. If a pro can't play with those small constraints... I mean, imagine: pros played before a lot of these things existed in the game. =)
But then again it doesn't matter. It's not about beating humans. It's about seeing what can be learned in such ways. And we've seen that pro teams can pick up things from the bots. Next TI, possibly even cool strats and insight into how teamplay can win the game! The scenarios of application are endless... not to speak of other real-life applications like their cool robotic hand.
All the shortcomings of the bots obviously come from them playing only themselves. The myriad of possible situations is just too much to go through (not to speak of going through them more than once to actually make a difference in the probability distributions and thus their update!), even with playing 180 years' worth of games a day! But look at how far they have come with that! WOW. I am in awe. Imagine them learning with more human guidance. Imagine the cool new stuff we could see.

For now those "poor" bots (:D) play against an alien race, which we are to them! We play so strangely to them, so imperfectly. That makes them lose. It's like having two separate island regions train for years and then letting them compete against each other. That would be weird. We have a metagame that is completely separate from theirs. They haven't seen the stuff they see from us humans now! And when they have, they even estimate their chance of winning/losing very nicely with it. But the stuff they do... impressive, amazing. I love that they are already changing the way we see roles! Support, mid, carry, offlane? They don't give a shit. They trilane with supports only and win with it. That's what I want to see. Cool stuff.
On a last note: the fixed teams are good for the humans. If the bots chose themselves, they would already be much better, I think.
On August 24 2018 04:54 Baradrist wrote: If I may... the rules are really not THAT restrictive. If a pro can't play with those small constraints... I mean, imagine: pros played before a lot of these things existed in the game. =)
...The fixed teams are good for the humans. If the bots chose themselves, they would already be much better, I think.
I agree that the bots' Elo goes up significantly if they get to draft. That's because, with the limited hero pool, it's like a different game mode that humans have not practiced.
But then again it doesn't matter. It's not about beating humans. It's about...
Results are everything. Maybe some programming wizards are excited about the new tech OpenAI has developed, but losing the game isn't going to generate them any hype.
I got pretty damn excited about DeepMind in the last couple years with their achievements, and it felt like soon there'll be nothing AI cannot do. But now, if anything, I'm seeing the limitations of AI.
What exactly are the restrictions the bot is currently playing with? It looks like the "5x immortal couriers" thing is gone, and maybe they have more heroes than before? But they're no longer drafting.
I took a look back at their page and didn't see the updated ruleset; is it explained somewhere?
Team compositions are limited to the following two variations:
Team A: Lich, Crystal Maiden, Death Prophet, Tidehunter, Gyrocopter
Team B: Lion, Witch Doctor, Necrophos, Axe, Sniper
Team sides and hero composition are decided by coin toss. No Divine Rapier or Bottle. No summons/illusions. No Scan.
Seems like the bots have no clue how to play past 25 minutes or so. They are really good at the early game and really good at teamfighting, but if they are not winning by a large margin by the midgame, they don't know how to play.
Past 25 minutes they start making really basic mistakes.
To me the biggest issue for the bots seems to be that they can't communicate with one another. I think giving the bots just a simple ping option, so they can ping to each other like humans do, would improve their game a lot.
I don't know how difficult it would be to add this; it might be that there is no way for the bots to learn to actually communicate with each other and ping when they are ganking or whatever. And even if they did learn to ping, how hard would it be for them to understand what the pings mean?
We as humans have concepts way beyond the game, so pinging makes sense; in fact most low to mid tier games are played with little to no actual communication other than pinging.
So yeah, the bots would benefit immensely if they could learn to communicate with one another through pings.
On August 24 2018 12:09 PlayerofDota wrote: Seems like the bots have no clue how to play past 25 minutes or so. They are really good at the early game, they are really good at team fighting, but if they are not winning by a large margin by the mid game they don't know how to play.
Past 25 minutes they start doing really stupid and noob mistakes.
There could be a number of reasons for it.
- The average game time in their self-play could be short, so that a longer game goes into uncharted territory.
- The existing tweaks in the reward function may no longer be sufficient when the game runs late. Looking at the reward function description:
- there is an (arbitrary?) scaling that lowers the value of last hitting/denying/killing/surviving as time goes by, in favor of objective taking, which could make the maximum-reward action an incorrect decision;
- it's not clear how it learns the negative impact of buying back, since costs have no negative value and buying back always provides a better opportunity for short-term reward;
- there could be something in late-game hero prioritization that is also hard to learn, given that a death/kill provides the same reward regardless of the target.
- The deciding factors in the endgame may suffer from a horizon effect (waiting for buyback to be available, or for the next Rosh, often takes more than 5 minutes).
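The time-based scaling described above can be sketched roughly like this. This is a toy illustration, not OpenAI's actual reward function: the event names, base weights, and the 40-minute decay window are all invented for the example.

```python
# Toy sketch of time-scaled reward shaping: dense farming rewards (last hits,
# denies, kills) decay as the game goes on, while objective rewards keep full
# value. All weights and the decay window are invented for illustration.

def shaped_reward(event, minutes):
    base = {"last_hit": 0.16, "deny": 0.15, "kill": 0.6, "tower": 1.0}[event]
    if event == "tower":
        return base                         # objectives are not scaled down
    decay = max(0.0, 1.0 - minutes / 40.0)  # farming value fades to zero at 40 min
    return base * decay
```

Under a scheme like this, a kill at minute 35 is worth a fraction of a kill at minute 5, which is exactly the kind of scaling that could push the policy toward odd decisions once the game runs long.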
On August 24 2018 12:09 PlayerofDota wrote: Seems like the bots have no clue how to play past 25 minutes or so. [...] So yeah, the bots would benefit immensely if they could learn to communicate with one another through pings.
The problem probably isn't communication. It looks as if the bot team is winning the early game simply because it brings every hero to every fight; in reality the pros were just running a basic 4-protect-1 (as dictated by the hero choices, really), and their core ends up way ahead.
On August 24 2018 12:09 PlayerofDota wrote: Seems like the bots have no clue how to play past 25 minutes or so. [...]
There could be a number of reasons for it. [...] Could be that the deciding factors in the endgame suffer from a horizon effect.
Good human teams also coordinate item timings and item choices as they go into the late game, rather than optimizing individually. That's the part that will take a long time for the bots to truly learn.
On August 17 2018 22:39 FreakyDroid wrote: It was too obvious they have no long-term strategy; my first impressions were these:
On August 06 2018 08:38 FreakyDroid wrote: It seemed as if they do stuff that directly rewards them, i.e. take a tower, last hit a creep, make a kill, prevent them from hitting my tower, etc. Dota is way more nuanced than that; the reward doesn't always have to follow a few simple linear steps. It involves a lot of prediction/foresight, something which these bots didn't have.
After reading Evan's blog, I'm glad that my observations, made without any prior knowledge of how the AI works, are pretty spot on. So my question to Evan, or anyone who knows about it: is it hard to code an AI that can, for lack of a better word, remember a more complex network of steps, or plan a more complex strategy that isn't so dependent on the immediate reward[s]? Basically, try to mimic foresight and perhaps sacrifice an immediate reward in order to gain an advantage later on. Or is compute power (or perhaps storage) the problem?
So reinforcement learning has very bad "sample complexity", which is to say, to learn a close-to-optimal policy (strategy) you need to play a ton of games.
The lower bound for the number of games is proportional to (1 / (1 - gamma))^3,
where gamma is the "discount factor", which controls how far into the future you can see.
The closer gamma is to 1.0, the further into the future you can see, but the more samples you need. This makes sense: the further you plan into the future, the more data/games you need to play in order to grasp the best strategy.
OpenAI is using a gamma of 0.9997, which translates to at least 37,037,037,037 Dota games.
I emphasize AT LEAST because this is a LOWER bound; the real number is likely much, much higher than this, probably 2^100 times bigger if not more.
So in a sense, yes, computation is the bottleneck.
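The arithmetic in that back-of-envelope number is easy to verify, taking the poster's (1/(1-gamma))^3 bound at face value:

```python
# Check the quoted lower bound: samples ~ (1 / (1 - gamma))^3.

def sample_lower_bound(gamma):
    return (1.0 / (1.0 - gamma)) ** 3

games = sample_lower_bound(0.9997)
# (1 / 0.0003)^3 = 3333.33...^3, roughly 3.7e10 -- the ~37 billion games quoted
```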
On August 17 2018 22:39 FreakyDroid wrote: It was too obvious they have no long term strategy [...] is it hard to code an AI that can [...] plan a more complex strategy that isn't that dependent on the immediate reward[s]? [...] Or is compute power (or perhaps storage) a problem?
Basically, computational power is the limit. Chess/Go AIs still have that problem. In theory a computer would be unstoppable because it would just compute every possibility and win from there, but that takes so much computing power that you'll probably never have a complete analysis like that, at least until quantum computers.
So for chess, much like how actual grandmasters play, it's simpler and better to start with a full set of opening lines, then the endgame scenarios, and analyze everything off-script 2-3 moves ahead.
Dota is the same, but a lot worse. Though, to be fair, a complex strategy and plan is already in play with OpenAI: the end goal is to kill the enemy Ancient, the opposition is five completely random and unpredictable agents, and the AI had to create a plan and strategy to still reach that goal.
Obviously you're talking about higher-complexity micro-strategies, but it's important to note that, because of how learning AI works, it's not just risk/reward-evaluating the game as it goes. How it acts now is the result of thousands of simulations with only the game mechanics supplied as knowledge. The AI is playing what it has determined to be the best strategy; it's just not flexible enough to do anything else.
I think you two are looking at the problem too narrowmindedly here. Or perhaps just responding to the question within the framework it was posed under.
You're right in the sense that, with infinite compute, OpenAI Five would almost certainly be far beyond human-level. But saying "computational power is the limit" isn't entirely correct either. According to OpenAI, these bots have played the equivalent of a total game time longer than the existence of human civilization. Yet humans are still beating them.
The corollary is that either humans have some mystical powers that can't be modeled mathematically (unlikely), or there exist more efficient algorithms for learning Dota than the ones OpenAI Five currently applies (or than humanity has yet discovered, for that matter).
Most of the progress in deep learning in the past 10 years has come from improving architectures. Computer vision is possible with a single hidden layer and "sufficient compute" (and data), but computer vision became a field once convolutional neural networks started getting applied to the problem (and you can even get pretty far training them on a single GPU). Natural language processing's recent advances in the past ~10 years have stemmed more from the introduction of embeddings and attention than a revolution in computing.
I suspect reinforcement learning will follow a similar development path. Quantum computing is the big x-factor to look out for though.
As an example, take Blink-Call on Axe. OpenAI Five never attempted it, iirc. There are two explanations I can think of: 1) at AI reaction speeds (where training occurred), Blink-Call often isn't a good tactic; or 2) Blink-Call as an offensive tactic is such a specific sequence of events that the AI never attempted it in training.
For the sake of discussion, let's consider the latter. Clearly, a human has some method of intuition-based theorizing that leads to experimentation and learning (and this method is incredibly efficient relative to current RL methods). Even with intermediate rewards and randomness, it seems plausible that it would take longer than is realistically possible to "accidentally" figure out that Blink-Call is a good tactic. If we could describe, mathematically, the "algorithm" that humans apply to perform such learning, it seems reasonable that the AI could learn at a speed (in games played) comparable to humans -- without the limitation of humans taking 45 minutes to play a single game of Dota. Such an AI would likely defeat humans (especially given its inherent informational and coordination advantage).
In short, I'm just not comfortable with the statement "the problem boils down to compute."
On August 25 2018 16:31 mozoku wrote: I think you two are looking at the problem too narrowmindedly here. [...] In short, I'm just not comfortable with the statement "the problem boils down to compute."
But you're not really addressing the computational problem we brought up. Neural networks and learning AI didn't eliminate the computation problem; they just reduced the load.
The Blink-Call example is a perfect example of what the bot is capable of (and yesterday we did indeed see the bots Blink->Call). If the learning model is shaped correctly, the bots will simulate thousands of games and find that having a Blink Dagger, jumping into range of a target, and casting Call stuns more consistently, thus leading to more consistent rewards in XP and gold.
That's all short-term analysis and reward modelling, and not the limit of OpenAI.
What is a limitation is long-term reward analysis. For example, the 4p1 strat: there, four players have to continuously make conscious decisions to take less immediately rewarding actions (creep stacking, not farming, being in risky areas, etc.) for 20-30 minutes, with the goal of getting another hero to a point where it wins the game.
Whether it's live or in post-data analysis, more time and more actions lead to exponentially more factors to include. If the AI can only efficiently account for about 5 minutes of analysis or prediction, it will never take the actions that require more time to be rewarded. So stacking a camp 4 times to be farmed 10 minutes later, to build an item 20 minutes later, would be beyond it.
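This horizon limit falls straight out of the discount factor quoted earlier in the thread. A quick sketch (the 4 decisions-per-second rate is an assumed round number for illustration, not OpenAI's actual step rate):

```python
# Weight that a reward arriving `delay_s` seconds in the future receives under
# discount factor gamma, assuming a fixed decision rate (steps per second).

def discounted_weight(gamma, delay_s, steps_per_s=4.0):
    return gamma ** (delay_s * steps_per_s)

# With gamma = 0.9997 (the value quoted earlier in the thread):
w_1min = discounted_weight(0.9997, 60)     # ~0.93: near-term rewards count fully
w_10min = discounted_weight(0.9997, 600)   # ~0.49: half the value is gone
w_30min = discounted_weight(0.9997, 1800)  # ~0.12: heavily attenuated
```

So a plan whose payoff lands 20-30 minutes out is worth only a small fraction of an immediate reward to the learner, which is consistent with the 4-protect-1 blind spot described above.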
His point was that humans figure out the usefulness of Blink into Call in far fewer games than the AI does, which shows there is room to improve the AI beyond just throwing more computational power at it. If the AI's learning is improved in such a way, it will likely get better at other aspects of the game as well.
On August 25 2018 22:52 Sr18 wrote: His point was that the fact that humans figure out the usefulness of blink into call in much fewer games than the ai does, shows that there is room to improve the ai beyond just throwing more computational power towards it. [...]
It's important to note that, while not exactly 1:1 comparable, the human brain is still vastly more powerful than any supercomputer.
What is a limitation is long-term reward analysis. For example, the 4p1 strat. [...] So stacking a camp 4 times to be farmed 10 minutes later to build an item 20 minutes later would be beyond it.
This is only true when you're trying to indiscriminately process the entire action space though. It's like trying to do computer vision with multilayer perceptrons. Computer vision underwent its revolution when researchers started applying architectures (convolutional neural networks) that, more or less, force the network to learn and identify objects (i.e. abstract concepts) in images and then identify relationships between the objects present in the image and the objects' locations in the image. Object identification is aided in CNNs by taking advantage of the fact that the pixels that make up objects in images tend to be clustered together. (Note: I'm using a very loose definition of objects here).
It's easy to imagine that similar abstractions exist that would enormously reduce the computational load necessary to make a strategic Dota bot. For example, the relative strengths of the drafts at various game stages could be used to influence whether the AI team should spend more time fighting or farming in the early-midgame. How to nudge the network to learn useful abstractions in an elegant and flexible way is surely a difficult problem, but it's not the same as saying "OpenAI's limiting factor is compute." And seeing as recent "AI" breakthroughs have largely come from similar cases of facilitating abstraction, as well as other architectural improvements, I think it's reasonable to expect that any potential OpenAI Five breakthroughs are more likely to come from that direction.
^ Yeah, I think that for the bots to simulate human-like Dota gameplay, the breakthrough has to come from that direction. I called it foresight, but I guess abstraction is a more appropriate term. Otherwise they'll be stuck with what they have now, which is basically the equivalent of 500 MMR players with godlike mechanical skills; that's how I see these bots. That personally doesn't impress me at all.
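The "abstraction" idea in the last couple of posts can be made concrete with a toy sketch. Everything below -- the power-curve model, the numbers, and the decision rule -- is invented for the example, not anything OpenAI Five actually does:

```python
# Toy abstraction: summarize each draft as a "power curve" over game time,
# then bias a fight-vs-farm decision on the gap between the two teams' curves.

def draft_power(minute, spikes):
    # spikes maps minute-of-peak -> peak strength; influence fades with distance
    return sum(s / (1.0 + abs(minute - t) / 10.0) for t, s in spikes.items())

def should_fight(minute, ours, theirs, margin=0.1):
    # fight only when our abstracted draft strength clearly exceeds theirs
    return draft_power(minute, ours) > draft_power(minute, theirs) + margin

early_draft = {10.0: 1.0}   # hypothetical early-spiking lineup
late_draft = {35.0: 1.5}    # hypothetical late-spiking lineup
# An early-game draft should look to fight before the enemy's late peak arrives.
```

A handful of features like this collapses a huge chunk of the raw action space into one strategic signal, which is the kind of load reduction the post above is describing.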
That's not really a fair comparison, since every human has to learn from scratch. The million Legend players don't contribute anything to "humanity" in terms of understanding Dota.
It's pretty interesting how similar computers are across games: they're absolutely crushing when it comes to tactical stuff (in this case teamfight execution and mechanics) and worse at long-term stuff.
All that's needed is to make them less hopeless at the long-term stuff and they just roll over everything. In chess, for example, it took a while for computers to become less bad long-term, and then it was over for the humans, since tactically it was never even close.
I think the reason is that once they win, the game ends. So brute force often doesn't even reach the late-game scenario and is, as a result, less optimized for those situations.
I'm actually really excited for this event. The showing at TI was... less than great, but now that the AI has had even more time to train, I can't wait to see what it can do.
Every event has been a revelation to me, the benchmark test with the AI's overwhelming victory, then in TI with the humans outsmarting the AI. I don't know how OpenAI will solve the long time horizon problem but if they can do it, that would be impressive. If they can't, I hope they will continue to give us future showmatches.
On April 14 2019 03:46 Warfie wrote: I'm confused as to whether to root for AI since it's cool n all or root for humans since we are flesh and blood n all
Just enjoy this parody of Dota.
well i was disappointed to see rules are not more lenient this time around
It must be pretty frustrating to play against, 9 times out of 10, OG thinks they have a jump, but they are surrounded and overwhelmed in only a few seconds... The decision making is really on point. I missed the beginning, was there any restriction on lineups/heroes ?
On April 14 2019 04:52 Nouar wrote: It must be pretty frustrating to play against, 9 times out of 10, OG thinks they have a jump, but they are surrounded and overwhelmed in only a few seconds... The decision making is really on point. I missed the beginning, was there any restriction on lineups/heroes ?
On April 14 2019 05:59 Wineandbread wrote: Very curious how the bots will play alongside humans. Wish they used better players than Sheevs/OD but I'll take it
I think the thing they're showing off is how it responds to people playing suboptimally (or what it considers suboptimal, rather)
Notail does give a lot of good insights into what the AI is doing right and wrong. Definitely room for improvement for humans and AI in this game. Really hope there is a future showmatch again.
"The only other result that OG has put up with topson and friends was a 2:0 loss to frickin OpenAI - a bot team that isn’t even close to pro teams. For a team with a history of underperforming, expectations for OG are at an all time low!"
No illusions or summons = nerfing the shit out of N0tail & OG; plus OG couldn't cope last year when DP was in the meta... Topson & Ana don't really play SF/Slark/Sniper [but why would they go all out & show any real strats anyway ]
Anyway, I liked the 4-core plus 7.21b CM draft [it even had more stuns/disables than OG's] & I've always thought a more even net worth distribution has been the way to evolve.
Like I've always thought all 5 reaching level 6 with some items is quite strong against an opponent with more uneven levels & net worth.
Would humans be better than the AI at microing illusions and summons? OG was also informed about the rules in advance; it's not like a surprise was sprung during the event of "oh sorry, you can only pick from these 17 heroes."
The OpenAI Final version beat the OpenAI TI8 version 99.9% of the time. And it got that good just by scaling: letting it train by itself more. Their graph also shows that OpenAI Five is not plateauing and is still getting better.
Would anyone be able to beat it online this April 18-21?
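For a sense of scale, the quoted 99.9% head-to-head winrate can be translated into a rating gap under the standard Elo logistic model (just a rule-of-thumb conversion; bot-vs-bot games obviously aren't actually rated):

```python
import math

# Elo model: expected score p = 1 / (1 + 10^(-d/400)), solved for the gap d.

def elo_gap(p):
    return -400.0 * math.log10(1.0 / p - 1.0)

gap = elo_gap(0.999)  # roughly 1200 rating points between Final and TI8 versions
```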
Worth noting that it has that winrate on the current game version, which the TI bot never trained on. The TI version trained on outdated numbers for a bunch of spells, and even the map and the gold/XP mechanics have changed since. Evaluating how good the different versions of the bot are is pretty tricky because of the significant changes to the game. Their graph shows that a significant portion of the gains in performance on the current version came simply from training on 7.20 and 7.21, which seems pretty logical.
Hopefully some pro teams try to play a number of games against the bot so we see some interesting games. The issue with the benchmark once again was that 2 games isn't enough to see the bot in various different situations. Obviously their teamfighting was again great and from what I saw they moved around the map pretty nicely most of the time. Given that they played against other pro teams it would have been nice to see recordings of those games.
The humans don't have any data on the current version, so the bots drafting their own lineup from a 17 hero pool is a big advantage for them. Not sure if it's just a coincidence that the only time the bots lost was when they weren't allowed to pick their best lineup and instead had to deal with something else. Of course seeing them at their best is interesting as well, but for evaluating how well they've learned the game as a whole it would also be important to see them on other types of lineups.
Yup, it will be interesting how many pro teams can beat it once it goes online. We'll have enough of a sample size then to really gauge the strength of the AI.
I wonder if a further step in evolution could be to have the AI play on many different versions (random version each game). Two things could happen: 1) The AI learns to actually check numbers on things like hero stats/abilities and items and adapts towards that (for example: on this patch sniper has worse stats and I seem to lose more with him, better pick something else). 2) The AI learns to focus more on general strategies that work well in general and are more resistant to patch changes.
On April 16 2019 03:30 polgas wrote: Would humans be better than the AI in microing illusions and summons? OG was also informed in advance about the rules, not like they sprang a surprise during the event that oh sorry you can only pick between these 17 heroes.
An AI doing its utmost microing illusions & summons would be just as unrealistic/unfun as the WC3 DotA AI map, where the computer always knows which illusion is real, and the computer Puck always phase-shift dodges my Luna's Lucent Beam (since the computer sees my keyboard/mouse input before it acts).
It may be hard to program an AI to use illusions & summons like a human without seeming too "superhuman". I'd like them to put more heroes in, but at the same time still simulate what human players do, as it's super interesting to think the AI could come up with some interesting combos/strats.
OG plays a lot of those micro-type heroes compared to other teams, & it's not about whether they know the hero pool or not - there are probably at least 8 other teams better suited to the AI hero pool 'meta'. If the Paris Disney EU quals were restricted to those 17 heroes, I'm not sure OG would have qualified.
On April 16 2019 03:30 polgas wrote: Would humans be better than the AI in microing illusions and summons? [...]
AI doing their utmost in microing illusions & summons would be just as unrealistic/unfun as WC3 DotA AI map [...]
That stuff won't happen simply because OpenAI is not actually a WC3 Dota AI.
If anything that illusion/summons is just another restriction that makes sure that what OpenAI won at was a custom map OG never played on, that's about it.
That stuff won't happen simply because OpenAI is not actually a WC3 Dota AI.
The emphasis from my post should really be on "as unrealistic/unfun as the WC3 DotA AI map"
- meaning it's not fun when the computer cheats by doing stuff humans can't: like a computer player that can almost immediately tell, without error, which of you is real vs. an illusion, or that can give simultaneous commands to the main hero plus multiple illusions at the same time. Pros are really good at detecting which is the real one, & I wouldn't want a computer that is much better at that.
I was responding to polgas's statement "Would humans be better than the AI in microing illusions and summons?" with "I don't want the AI to be much better at microing illusions and summons, to the extreme described above, which feels like cheating."
- "AI map" is better stated as "map where the computer player had built-in cheats", not "DotA AI".
Experimental custom map with 17 heroes & no illusions/summons - yes.
Am I the only one noticing many more buybacks in midgame? I've had it happen quite a few games, and just seen it on some competitive match as well.
I think this is what it boils down to, for me: what will our AI overlord teach us? Because I'm pretty sure it will outperform us (to infinity), so even if it is interesting to watch that happen, I think the real benefit is seeing what we can learn from it.
Just like the first iteration changed the way the mid matchup is played (much more regen ferrying), I'm curious to see if this iteration will impact the pro teams, and the plebs after that. For instance more buybacks, or even... a more horizontal farm spread (i.e. the AI farms all of its heroes, no pos 6 shenanigans), into, why not, death-ball push.
The immediate buybacks from the AI were noticeable. When Notail was asked about it after the match, he thought they were not good buybacks. I get what he is saying, that the risks do not seem to be worth it. This could be something that the humans can punish.
That stuff won't happen simply because OpenAI is not actually a WC3 Dota AI.
The emphasis from my post should really be on "as unrealistic/unfun as WC3 DotA AI map" [...] Experimental Custom map with 17 heroes & no illus/summons - yes
But that's the point: OpenAI is not going to actually do stuff like that (instantly identifying illusions, unless they attack something or get attacked, is impossible, period). At best they can probably learn to use summons to tank aggro perfectly while tower diving, or to box in heroes, but is that something unrealistic for humans to do?
Seems like OpenAI is losing its first game right now. Earlier, one stack was pretty close too but lost despite being up something like 12k at 33 minutes. Afaik no pro teams have tried yet, but the stacks having competitive games against it consist of pretty high-level pub players, from what I can tell.
I didn't see either game from the beginning, so I'm not sure exactly what the humans did, but at least the end result was that they pretty heavily outfarmed the bots despite not necessarily being up in kills.
Another pretty good game right now on stream. This is another SEA stack. Seems to me that taking Sven and CM away from the bots makes their regular style quite a bit weaker.
On April 20 2019 00:28 asker71 wrote: Hello, everybody! Can someone explain to me how Split Dota tournament is played. Who plays who after the group stage? Thank you...
Details for the Split Minor Playoffs have not been announced yet iirc. But this is not really the thread to ask in
I finally got around to watching the games, and I believe the decision making of OpenAI is extremely sensible. There were three stark differences I noticed that they capitalized on, which humans realistically cannot.
1. Perfect coordination: almost no ganks were uncoordinated; you could always trust there was a follow-up.
2. Even gold distribution: it makes sense that if every hero gets farm, your team as a whole is stronger than the opponents', even more so if the entire team has perfect synergy.
3. Role cycling: at different points, the AI would reallocate roles according to where the tempo of the game was going. Depending on who had more or less gold, heroes would interchange between core and support roles. This is entirely logical, but because humans assign one-dimensional roles to most heroes, something like this goes over their heads.
I'm extremely impressed with where the AI is currently at. The item choices even resemble human thought processes, and it's quite scary that we're almost to the point of getting replaced.
I wonder how long it would take for AI to expand their hero pool and find winning patterns in more chaotic/unfavorable situations & drafts.
On April 23 2019 12:44 saocyn wrote: I finally got around to watching the games and I believe the decision making of OpenAI is extremely sensible. There were three stark differences I noticed they capitalized on which ideally humans cannot. [...]
There is a shift in farming priorities in human teams as well; I am not sure why you think there wouldn't be. IIRC they said at some point that most item choices, including regen, are still scripted.
Regarding the gold distribution, I really like the idea myself, for a human reason. I know I like to get gold and items myself, I believe that's a big part of the fun of the game. So I want my teammates to have gold too if they want it, be happy, and then play in a happy environment.
I believe that, in pubs at least, your chances of winning go way up when everyone is happy to play. People are more resilient to lane losses, bad plays, are more willing to communicate, coordinate, etc.
Same goes with buying wards, I know I do buy them no matter my position, if I feel we need some at the moment and no one is doing it. I notice a lot of people who want to win do it as well, it's just simpler and faster.
All in all, I've never been a fan of the sacrificial position 5 or 4. I believe that is a trend developed in competitive Dota, where people are willing to forgo their fun in pursuit of something bigger, like actual money, fame, a career, whatever. I never felt this was okay in pubs; by and large people seem way more inclined to pick cores than supports.
Oddly enough, I think the AI gold distribution is much more human than what we humans are currently doing...
On April 23 2019 14:43 RolleMcKnolle wrote: IIRC they said at some point that most item-choices including regen are still scripted.
Yeah, maybe there's a comprehensive video/text somewhere that lists the current limitations, but I wish it were made clearer.
Same with their coordination, or planning: at least back in the last iteration (TI), there was none of that. If I got this right, they simply did not talk to each other at all.
Yet people (casters, even Bill Gates in his tweet) keep bringing up those points. At the last event, Notail was openly wondering how far the AI could see, how many moves in advance they had planned. It is my belief that as a group there is no planning at all (because they don't talk to each other), and as individuals there is very little planning ahead; rather, a superhuman assessment of the current situation, all the time.
I wish the OpenAI engineers were a bit more informative/open about that. Maybe dared to correct people, at the price of scratching the magic a little bit.
Yeah, the item choices are selected by the devs / follow some guide; they are not part of the learning process. Also, I think all the behaviors the bots show should be seen in the context of their training process. They don't learn by directly trying to win the game and seeing what works; rather, they have a human-crafted reward function, which includes things like gold, XP, buildings being alive, and so on. Originally each bot just prioritizes its own reward, which hopefully leads it to learn to gather gold and so on, and later on they start prioritizing the average reward of the team. Some of the "gold for everyone" behavior may just be a remnant of the bots learning to gather gold for themselves, rather than it being strictly optimal for winning.
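Roughly, that shift from selfish to team-averaged reward can be sketched as a simple interpolation (the function name and weights here are made up for illustration; OpenAI's actual reward terms and schedule are more involved):

```python
def blended_rewards(individual_rewards, team_spirit):
    """Blend each agent's own reward with the team average.

    team_spirit = 0.0 -> purely selfish agents (early training)
    team_spirit = 1.0 -> every agent optimizes the team average (late training)
    """
    team_avg = sum(individual_rewards) / len(individual_rewards)
    return [
        (1.0 - team_spirit) * r + team_spirit * team_avg
        for r in individual_rewards
    ]

# One hero gets a big reward (e.g. a kill); the rest get nothing.
print(blended_rewards([10.0, 0.0, 0.0, 0.0, 0.0], team_spirit=0.0))
# -> [10.0, 0.0, 0.0, 0.0, 0.0]  (selfish: only the killer benefits)
print(blended_rewards([10.0, 0.0, 0.0, 0.0, 0.0], team_spirit=1.0))
# -> [2.0, 2.0, 2.0, 2.0, 2.0]  (shared: everyone benefits equally)
```

With a high blend weight, a bot that gives up gold so a teammate can farm still sees most of that gold in its own reward, which is one plausible source of the "gold for everyone" behavior.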
But of course everything they end up doing differently is worth considering, and I think especially in their teamfight centric lineups there is merit in distributing resources more evenly. In some other lineup which relies on a last pick AM or some pickoff heavy lineup, I think it would be a bit different. Sometimes some heroes really do benefit more from the resources. Though certainly in pubs people should be a lot more flexible about it.
And as someone said above in pro dota I don't think the farm distributions are all that static at all. Sometimes it's all about a hyper carry, sometimes 2 or even 3 heroes are pretty equal in how much room they get. In the past at some point the "support" KotL was farming all the time until aghs, now sometimes it's the Enigma that allows the team to have 4 really farmed heroes. When Furion is played as 5 he often isn't all that poor because he is the one dealing with pushing out other lanes
On April 23 2019 15:12 Murlox wrote: Same with their coordination, or planning, at least back in last iteration (TI), there was none of that. If I got this right, they simply did not talk to each other, at all. [...]
I could be wrong, it's just... what I gathered.
I think this depends a bit on what people mean by "planning". From what I understand, the bots don't explicitly plan ahead. However, in their learning process they end up optimizing longer-term rewards, averaged across the entire team. So after a long training time they may exhibit behavior that looks like "they are planning to do something in 30 seconds", because they are forgoing some immediate rewards, like killing creeps, to move around for a push or to limit the enemy team's farm.
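One way to see why long-horizon optimization can look like planning: the weight an agent puts on a reward t steps in the future falls off as gamma**t, so raising the discount gamma stretches how far ahead behavior is effectively optimized. A toy illustration (the gamma values and reward numbers are invented, not OpenAI's actual schedule):

```python
def discounted_return(rewards, gamma):
    """Sum of rewards weighted by gamma**t; higher gamma weighs the future more."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))

# A choice between a small reward now and a bigger reward later:
greedy = [1.0, 0.0, 0.0, 0.0]   # take the creep kill immediately
patient = [0.0, 0.0, 0.0, 5.0]  # skip it and rotate for a push payoff

# With a short horizon, the immediate reward wins;
# with a long horizon, the delayed payoff wins, which looks like "planning".
print(discounted_return(greedy, gamma=0.5), discounted_return(patient, gamma=0.5))
print(discounted_return(greedy, gamma=0.95), discounted_return(patient, gamma=0.95))
```

No lookahead or search happens anywhere in this picture; the "plan" is just a policy that was trained to value delayed rewards highly.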
I think one aspect where the lack of a "real" planning process sort of shows is when the bots are losing. Humans know that in the end the ancient is the big thing that matters, but for the bots that's not the case at all. They have learnt to care about gold, xp, towers and so on. Their ancient dying is just something extra that gives a negative reward and not something fundamental. From watching quite a bit of the Arena this weekend, it seems to me that when the bots are close to losing they still just try to "minimize their losses", which means not taking fights they are likely to lose and so on. Humans often have these moments where they know just being passive leads to a loss and they go for a last effort smoke play to contest rosh or to try to get a good angle of initiation, but the bots don't really understand such things for now.