OpenAI's Dota 2 bots vs. 5 top professionals in TI - Page 2
|
Murlox
France1699 Posts
It remains a human-made program, and it's... quite the endeavour. You should go read the blog; I think it gives a better understanding of why restrictions are still needed at this stage.
|
FreakyDroid
Macedonia2616 Posts
On June 29 2018 18:37 Latham wrote: I'd personally enjoy this a lot more if there were no restrictions. Yes, no one would expect the bots to win in a straight-up 5v5, but it would show how far the AI has come from the previous year. Even ask the pros to go easy on the bots, or invent some silly challenges like win mid as CM for funsies. This just feels like monkeys and typewriters.

Sure, it would be much more realistic if there were no restrictions, but I don't think that will affect my enjoyment much, YMMV. I'm still curious to see how the AI will perform in this heavily restricted scenario.
|
spudde123
4814 Posts
On June 29 2018 18:15 Murlox wrote: Wow. Now THAT, at least in my mind, would make an actually compelling community goal for a future TI prizepool. Like, if we get to 30M, then we can give X amount to OpenAI and help them run their training for that much longer, and hopefully get the program ready for the real deal, aka no restrictions. Well... ok, the current reward/cost for Valve is handing out 10 virtual levels, so I guess... they might not love the idea. Would be compelling to me, anyway.

I'm not a real expert on the methods, but I don't think the current way they are approaching it is scalable to the rest of the game, and just throwing more computation at it isn't the solution. Sure, they may beat humans with this set of heroes, but it would be even more interesting if they could learn general concepts, move from hero to hero easily, show strategic understanding of various kinds of lineups, etc. As far as I understood, their main achievement so far is showing that their already available methods can work with pretty heavily delayed rewards when enough computational power is put behind them. It's worth noting, in terms of real-world applicability, that they need a simulated environment to practice in, which is hard to get in areas other than games. They also have a mechanism that allows teamwork to emerge even though the bots act completely on their own: each bot is rewarded when its team is doing better than the enemy team on average.

One interesting thing is the reward function they crafted. The bots don't only learn from what wins and what doesn't; rather, they get rewarded for last hits, denies, damage given/taken, buildings being alive/dead, and so on. But the researchers have to carefully and manually define how big a reward the bots should get from each of these things for the learned behavior to actually be good for winning the game. And can the rewards actually be the same for all possible lineups?

As an example, a push lineup with a worse lategame should put a lot of emphasis on getting buildings down quickly. If they just trade farm and don't try to end, they will eventually lose. Understanding these sorts of longer-term effects is not relevant in a mirror match. Not to mention how all sorts of patch changes affect the way the various rewards should be prioritized. Whenever a patch changes how much gold different types of creeps give, or how much towers give, the bots can't just go and relearn the game; the researchers have to manually fine-tune the reward function first.

This doesn't mean I'm not interested in it, but I don't think they should frame it as "we are beating humans in Dota". It's not yet the same game humans play. It would be great to see several full games of the bots playing against humans, and even bot-vs-bot matches, to get a good idea of all the things the bots have learned.
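The hand-tuned reward shaping described above can be sketched roughly like this. All event names and weight values below are invented for illustration; the thread does not give OpenAI's actual weights:

```python
# Toy sketch of a hand-crafted "shaped" reward. The weights are the part
# the researchers must manually tune so that chasing these proxies
# actually correlates with winning; every number here is made up.
WEIGHTS = {
    "last_hit": 0.16,
    "deny": 0.15,
    "damage_dealt": 0.002,   # per point of damage
    "damage_taken": -0.001,  # per point of damage
    "tower_destroyed": 2.0,
    "win": 5.0,
}

def shaped_reward(events: dict) -> float:
    """Sum of weighted game events observed by one bot since the last tick."""
    return sum(WEIGHTS[name] * count for name, count in events.items())

# One tick where a bot got two last hits and took 50 damage:
r = shaped_reward({"last_hit": 2, "damage_taken": 50})  # 0.32 - 0.05 = 0.27
```

A patch that changes creep or tower gold would shift how much each of these events is actually worth for winning, which is exactly why the weights have to be retuned by hand.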
|
YourGoodFriend
United States2197 Posts
On June 29 2018 08:33 WolfintheSheep wrote: I'm honestly not interested in this until these restrictions are gone: I don't want a limited environment that the AI are equipped to handle. Far more interested in seeing the AI getting crushed until it's reached a point where they can excel at the real game.

For all of those complaining about restrictions, I don't think you understand how difficult it is for bots to play Dota. It was huge when a bot beat the best human player at Go; now let's put that in perspective for Dota (from their blog):

High-dimensional, continuous action space. In Dota, each hero can take dozens of actions, and many actions target either another unit or a position on the ground. We discretize the space into 170,000 possible actions per hero (not all valid each tick, such as using a spell on cooldown); not counting the continuous parts, there are an average of ~1,000 valid actions each tick. The average number of actions in chess is 35; in Go, 250.

So the total valid actions for an entire game of Go is about 250, while there are four times that for a single hero in a single second in Dota. This is not an easy problem to solve, and they are already doing some amazing things with AI. Yet we live in a microwave age where people complain about a bot needing restrictions after only two years of work to play a stupidly complex game and have a chance against humans.
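The "discretize the space" idea in the blog quote can be illustrated with a toy example. The grid size and map dimensions below are made up; the real ~170,000-action space also covers unit targets, abilities, and items:

```python
# Toy illustration of discretizing a continuous action space: map a
# continuous (x, y) ground target onto a fixed grid of action ids.
# Grid and map sizes are hypothetical, not OpenAI's actual values.
GRID = 40               # hypothetical 40x40 grid of ground-target cells
MAP_SIZE = 15000.0      # hypothetical map side length in game units

def discretize_move(x: float, y: float) -> int:
    """Return the grid-cell action id for a continuous ground target."""
    col = min(int(x / MAP_SIZE * GRID), GRID - 1)
    row = min(int(y / MAP_SIZE * GRID), GRID - 1)
    return row * GRID + col

# Move-to-position alone already yields 40 * 40 = 1600 distinct actions;
# layering on unit-targeted spells and items multiplies this further.
action_id = discretize_move(7500.0, 3000.0)  # cell (row 8, col 20) -> 340
```

This is how a continuous targeting problem becomes a (very large) discrete choice the policy can pick from each tick.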
|
lolfail9001
Russian Federation40190 Posts
On June 29 2018 21:00 YourGoodFriend wrote: For all of those complaining about restrictions, I don't think you understand how difficult it is for bots to play Dota. It was huge when a bot beat the best human player at Go; now let's put that in perspective for Dota (from their blog): So the total valid actions for an entire game of Go is about 250, while there are four times that for a single hero in a single second in Dota. This is not an easy problem to solve, and they are already doing some amazing things with AI. Yet we live in a microwave age where people complain about a bot needing restrictions after only two years of work to play a stupidly complex game and have a chance against humans.

There is nothing wrong with a bot needing certain restrictions. The issue is that those restrictions turn Dota into a harder Go, and if we wanted to play Go, we wouldn't be here now.
|
spudde123
4814 Posts
At some point it would be cool if they'd arrange something similar to what has been done in poker: bring a number of the best players to a location and play a lot of human-vs-AI games live on stream. I'm not sure we'll be able to tell, even after the matches in a month or so, how good the bots actually are, because most game states won't come up in a single game of Dota. I assume they'll play at least a few games, but even that isn't really enough to see what the bots know. It would be interesting not just from the perspective of whether they can beat humans, but also out of general interest in what weird things they can learn through self-play.
|
YourGoodFriend
United States2197 Posts
On June 29 2018 21:15 lolfail9001 wrote: There is nothing wrong in bot needing certain restrictions. The issue is that those restrictions turn Dota into harder Go and if we wanted to play Go, we wouldn't be here now.

Not even close; it's like you didn't read my post at all. It's not even close to comparable with Go. The math: 1,000 (valid actions per hero per second) × 5 (players in a game) × 2,700 (seconds in a 45-minute game) is the equivalent of playing 13,500,000 games of Go in one Dota match; calling that a harder Go is ignorance. Also, look at how humans learn: in restricted scenarios. Take any sport: tee-ball before baseball; basketball goals get higher and the 3-point line moves out the older you get; football (soccer) fields and goals go from smaller to larger. It's the process of how we learn being replicated in machines. The point of this match is to show how far it has grown, so restrictions make sense.
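The raw multiplication in the post above checks out as arithmetic; note, though, that converting it into "games of Go" rests on reading the blog's 250 as the action count for a whole game, whereas 250 is more commonly quoted as Go's average branching factor per move:

```python
# Sanity-checking the arithmetic in the post above, using the numbers
# the poster took from the blog quote.
valid_actions_per_hero_per_second = 1000
players = 5
game_seconds = 45 * 60  # a 45-minute game = 2700 seconds

total_action_choices = valid_actions_per_hero_per_second * players * game_seconds
print(total_action_choices)  # 13500000
```

So the 13,500,000 figure is the total number of per-hero action choices across one team in one game, whatever one decides that is "equivalent" to in Go.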
|
Nevuk
United States16280 Posts
The only restrictions they are planning on for the actual game are limited heroes
|
lolfail9001
Russian Federation40190 Posts
On June 29 2018 21:43 YourGoodFriend wrote: Not even close; it's like you didn't read my post at all. It's not even close to comparable with Go. The math: 1,000 × 5 (players in a game) × 2,700 (seconds in a 45-minute game) is the equivalent of playing 13,500,000 games of Go in one Dota match; calling that a harder Go is ignorance. Also, look at how humans learn: in restricted scenarios. Take any sport: tee-ball before baseball; basketball goals get higher and the 3-point line moves out the older you get; football (soccer) fields and goals go from smaller to larger. It's the process of how we learn being replicated in machines. The point of this match is to show how far it has grown, so restrictions make sense.

Harder is a statement about ordering; the fact that it is harder than Go by orders upon orders of magnitude is irrelevant. Also, your real-sports comparisons are the embodiment of the "bad sports comparison" trope: the adjustments you talk about do not subvert the rules of the game itself, while locking Dota into a fixed, mirrored 5v5 without wards does. Now, if they are going to throw away most of these restrictions for the showmatches themselves, it may just be watchable.
|
TomatoBisque
United States6290 Posts
On June 29 2018 21:40 spudde123 wrote: No doubt it's an impressive achievement, and even with the restrictions it's a very difficult problem (as evidenced by the computing power necessary to get results). I don't think people doubt that, and it's interesting from a research perspective. But for it to be really interesting from a player's perspective (at least for me), I think they would have to play more or less the same game as humans play. Or maybe if we had access to some tens of games the bots play, even with restrictions, we could see if there's something in what they do that we can learn from.

They mentioned in the article that OpenAI prioritizes early gold/exp on supports more than humans do, to hit faster timings to take fights. It'd be cool to see exactly how they go about that.
|
spudde123
4814 Posts
On June 30 2018 00:08 TomatoBisque wrote: They mentioned in the article that OpenAI prioritizes early gold/exp on supports more than humans do, to hit faster timings to take fights. It'd be cool to see exactly how they go about that.

Yeah, it'll be interesting to see how it actually looks. I suspect it has something to do with the way their reward function is defined: the bots don't learn what is optimal for winning, but rather what the researchers have defined as the proxy for winning (the reward function). Not that there is anything really wrong with this; humans also pursue various intermediate goals rather than constantly thinking about what is truly optimal for winning the entire game. Many of us never get past that stage, and instead just hide in the jungle farming creeps while our team is losing.

But anyway, every bot is trying to maximize its own reward. The mean reward of both teams is taken into account there, but each bot is also just trying to increase its own rewards, for example by farming creeps. I don't know whether the support/carry resource distribution arises because the bots determine it brings better long-term rewards for the entire team, or because every bot is basically competing for farm against its teammates to an extent.
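The individual-versus-team tension described above can be sketched as a blend between each bot's own reward and the team's average. The `team_spirit` name and the 0.5 weight are hypothetical choices for illustration, not values from the thread:

```python
# Sketch of blending individual rewards with the team mean, so that
# cooperation can pay off even though each bot acts on its own.
# `team_spirit` = 0 means purely selfish; 1 means purely team-average.
def blended_rewards(own_rewards: list, team_spirit: float = 0.5) -> list:
    """Mix each bot's individual reward with the team's mean reward."""
    team_mean = sum(own_rewards) / len(own_rewards)
    return [(1 - team_spirit) * r + team_spirit * team_mean
            for r in own_rewards]

# A farming carry (big individual reward) and four roaming supports:
rewards = blended_rewards([4.0, 1.0, 1.0, 1.0, 1.0], team_spirit=0.5)
# carry: 0.5*4.0 + 0.5*1.6 = 2.8; each support: 0.5*1.0 + 0.5*1.6 = 1.3
```

Under such a blend, a bot that helps the team's average grow (e.g. by conceding farm to a teammate with better timings) can still come out ahead, which is one way the support/carry resource split could emerge.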
|
Sn0_Man
Tebellong44238 Posts
That's the biggest reason why people are bothered by the restrictions. We don't care if the bots can barely beat 2k MMR players; we want them to be playing Dota, so that the things they learn, develop, and do are relevant and interesting to us.
|
spacecoke
Sweden112 Posts
On June 30 2018 00:08 TomatoBisque wrote: They mentioned in the article that OpenAI prioritizes early gold/exp on supports more than humans do, to hit faster timings to take fights. It'd be cool to see exactly how they go about that.

Just like Alliance did back in the day.
|
Fleetfeet
Canada2683 Posts
The ultimate goal is to get the AI 5v5 functional, then ship it to Valve so they can literally run a simulation of a proposed patch to see how the meta shakes out. THEN we'll be the future.
|
WolfintheSheep
Canada14127 Posts
On June 29 2018 21:00 YourGoodFriend wrote: For all of those complaining about restrictions, I don't think you understand how difficult it is for bots to play Dota. It was huge when a bot beat the best human player at Go; now let's put that in perspective for Dota (from their blog): So the total valid actions for an entire game of Go is about 250, while there are four times that for a single hero in a single second in Dota. This is not an easy problem to solve, and they are already doing some amazing things with AI. Yet we live in a microwave age where people complain about a bot needing restrictions after only two years of work to play a stupidly complex game and have a chance against humans.

I know perfectly well that the AI can't handle the entire game yet. And I'd much rather see a limited AI fail to keep up with humans exploiting the full game mechanics than watch a gimmick match that handicaps the humans so the AI can look good. And as development continues, we'd actually get to see the progress of learning in a real environment.
|
Furikawari
France2522 Posts
On June 29 2018 21:43 YourGoodFriend wrote: Not even close; it's like you didn't read my post at all. It's not even close to comparable with Go. The math: 1,000 × 5 (players in a game) × 2,700 (seconds in a 45-minute game) is the equivalent of playing 13,500,000 games of Go in one Dota match; calling that a harder Go is ignorance. Also, look at how humans learn: in restricted scenarios. Take any sport: tee-ball before baseball; basketball goals get higher and the 3-point line moves out the older you get; football (soccer) fields and goals go from smaller to larger. It's the process of how we learn being replicated in machines. The point of this match is to show how far it has grown, so restrictions make sense.

You can't compare Go to Dota like this: one wrong move in Go (like not selecting the right move at a given time) can lose you the game, and that's far from being the case in Dota.
|
intotheheart
Canada33091 Posts
edit: Last year the system was able to play 1 hero. I'd be curious whether it's 5 independent systems, one master controlling 5 slaves, or some sort of 5-computer system where each bot communicates with the others. In any of these cases, I'm sure that scaling from 1v1 SF to 5v5 (fixed heroes) was quite a challenge in that aspect alone. I'm just really curious how they'll continue to grow the system from here, be it a slowly expanding hero pool or more features of the game.
|
Murlox
France1699 Posts
On June 29 2018 23:03 Nevuk wrote: The only restrictions they are planning on for the actual game are limited heroes

What? Where did you see that?
|
FreakyDroid
Macedonia2616 Posts
On June 30 2018 02:08 Fleetfeet wrote: The ultimate goal is to get the AI 5v5 functional, then ship it to valve so they can literally run a simulation of a proposed patch to see how the meta shakes out. THEN we'll be the future.

Cool idea, but there's no guarantee human pros would adopt the same meta as the AI.
|
Nevuk
United States16280 Posts
From both the OP and their blog, since people are complaining about the current restrictions (no warding, no Rosh, etc.), to repeat: they will only restrict the hero pool in the August challenge.

"While today we play with restrictions, we aim to beat a team of top professionals at The International in August subject only to a limited set of heroes"

And later: "Our team is focused on making our August goal. We don't know if it will be achievable, but we believe that with hard work (and some luck) we have a real shot."