OpenAI's Dota 2 bots vs. 5 top professionals in TI - Page 2
|
Murlox
France1699 Posts
It remains a human-made program, and it's... quite the endeavour. You should go read the blog; I think it gives a better understanding of why restrictions are still needed at this stage.
|
FreakyDroid
Macedonia2616 Posts
On June 29 2018 18:37 Latham wrote: I'd personally enjoy this a lot more if there were no restrictions. Yes, no one would expect the bots to win in a straight-up 5v5, but it would show how far the AI has come from the previous year. Even ask the pros to go easy on the bots, or invent some silly challenges like win mid as CM for funsies. This just feels like monkeys and typewriters.

Sure, it would be much more realistic if there were no restrictions, but I don't think that will affect my enjoyment much, YMMV. I'm still curious to see how the AI will perform in this heavily restricted scenario.
|
spudde123
4814 Posts
On June 29 2018 18:15 Murlox wrote: Wow. Now THAT, at least in my mind, would make an actually compelling community goal for a future TI prizepool. Like, if we get to 30M, then we can give X amount to OpenAI and help them run their training for that much longer, and hopefully get the program ready for the real deal, aka no restrictions. Well... ok, the current reward/cost for Valve is handing out 10 virtual levels, so I guess... they might not love the idea. Would be compelling to me, anyway.

I'm not a real expert on the methods, but I don't think the current way they are approaching it is scalable to the rest of the game, and just throwing more computation at it isn't the solution. Sure, they may beat humans with this set of heroes, but it would be even more interesting if they could learn general concepts, move from hero to hero easily, show strategic understanding of various kinds of lineups, etc. As far as I understood, their main achievement so far is showing that their already available methods can work with pretty heavily delayed rewards when enough computational power is put behind them. It's worth noting, in terms of real-world applicability, that they need a simulated environment to practice in, which is hard to get in areas other than games. They also have a mechanism that allows teamwork to emerge even though the bots act completely on their own: each bot is rewarded when its team is doing better than the enemy team on average.

One interesting thing is the reward function they crafted. The bots don't only learn from what wins and what doesn't; rather, they get rewarded for last hits, denies, damage given/taken, buildings being alive/dead, and so on. But the researchers have to carefully and manually define how big a reward the bots should get from each of these things for the learned behavior to actually be good for winning the game. And can the rewards actually be the same for all possible lineups?

As an example, a push lineup with a worse lategame should put a lot of emphasis on getting buildings down quickly. If they just trade farm and don't try to end, they will eventually lose. Understanding these sorts of longer-term effects is not relevant in a mirror match. Not to mention how all sorts of patch changes affect the way the various rewards should be prioritized. Whenever a patch changes how much gold different types of creeps give, or how much towers give, the bots can't just go and relearn the game; the researchers have to manually fine-tune the reward function first.

This doesn't mean I'm not interested in it, but I don't think they should frame it as "we are beating humans in Dota". It's not yet the same game humans play. It would be great to see several full games of the bots playing against humans, and even bot-vs-bot matches, to get a good idea of all the things the bots have learned.
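The hand-tuned reward shaping described above can be sketched roughly like this. All event names and weight values below are invented for illustration; the thread does not give OpenAI's actual weights:

```python
# Toy sketch of a hand-crafted "shaped" reward. The weights are the part
# the researchers must manually tune so that chasing these proxies
# actually correlates with winning; every number here is made up.
WEIGHTS = {
    "last_hit": 0.16,
    "deny": 0.15,
    "damage_dealt": 0.002,   # per point of damage
    "damage_taken": -0.001,  # per point of damage
    "tower_destroyed": 2.0,
    "win": 5.0,
}

def shaped_reward(events: dict) -> float:
    """Sum of weighted game events observed by one bot since the last tick."""
    return sum(WEIGHTS[name] * count for name, count in events.items())

# One tick where a bot got two last hits and took 50 damage:
r = shaped_reward({"last_hit": 2, "damage_taken": 50})  # 0.32 - 0.05 = 0.27
```

A patch that changes creep or tower gold would shift how much each of these events is actually worth for winning, which is exactly why the weights have to be retuned by hand.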
|
YourGoodFriend
United States2197 Posts
On June 29 2018 08:33 WolfintheSheep wrote: I'm honestly not interested in this until these restrictions are gone: I don't want a limited environment that the AI are equipped to handle. Far more interested in seeing the AI getting crushed until it's reached a point where they can excel at the real game.

For all of those complaining about restrictions, I don't think you understand how difficult it is for bots to play Dota. It was huge when a bot beat the best human player at Go; now let's put that in perspective for Dota (from their blog):

High-dimensional, continuous action space. In Dota, each hero can take dozens of actions, and many actions target either another unit or a position on the ground. We discretize the space into 170,000 possible actions per hero (not all valid each tick, such as using a spell on cooldown); not counting the continuous parts, there are an average of ~1,000 valid actions each tick. The average number of actions in chess is 35; in Go, 250.

So the total valid actions for an entire game of Go is about 250, while there are four times that for a single hero in a single second in Dota. This is not an easy problem to solve, and they are already doing some amazing things with AI. Yet we live in a microwave age where people complain about a bot needing restrictions after only two years of work to play a stupidly complex game and have a chance against humans.
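The "discretize the space" idea in the blog quote can be illustrated with a toy example. The grid size and map dimensions below are made up; the real ~170,000-action space also covers unit targets, abilities, and items:

```python
# Toy illustration of discretizing a continuous action space: map a
# continuous (x, y) ground target onto a fixed grid of action ids.
# Grid and map sizes are hypothetical, not OpenAI's actual values.
GRID = 40               # hypothetical 40x40 grid of ground-target cells
MAP_SIZE = 15000.0      # hypothetical map side length in game units

def discretize_move(x: float, y: float) -> int:
    """Return the grid-cell action id for a continuous ground target."""
    col = min(int(x / MAP_SIZE * GRID), GRID - 1)
    row = min(int(y / MAP_SIZE * GRID), GRID - 1)
    return row * GRID + col

# Move-to-position alone already yields 40 * 40 = 1600 distinct actions;
# layering on unit-targeted spells and items multiplies this further.
action_id = discretize_move(7500.0, 3000.0)  # cell (row 8, col 20) -> 340
```

This is how a continuous targeting problem becomes a (very large) discrete choice the policy can pick from each tick.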
|
lolfail9001
Russian Federation40190 Posts
On June 29 2018 21:00 YourGoodFriend wrote: For all of those complaining about restrictions, I don't think you understand how difficult it is for bots to play Dota. It was huge when a bot beat the best human player at Go; now let's put that in perspective for Dota (from their blog): So the total valid actions for an entire game of Go is about 250, while there are four times that for a single hero in a single second in Dota. This is not an easy problem to solve, and they are already doing some amazing things with AI. Yet we live in a microwave age where people complain about a bot needing restrictions after only two years of work to play a stupidly complex game and have a chance against humans.

There is nothing wrong with a bot needing certain restrictions. The issue is that those restrictions turn Dota into a harder Go, and if we wanted to play Go, we wouldn't be here now.
|
spudde123
4814 Posts
At some point it would be cool if they'd arrange something similar to what has been done in poker: bring a number of the best players to a location and play a lot of human-vs-AI games live on stream. I'm not sure we'll be able to tell, even after the matches in a month or so, how good the bots actually are, because most game states won't come up in a single game of Dota. I assume they'll play at least a few games, but even that isn't really enough to see what the bots know. It would be interesting not just from the perspective of whether they can beat humans, but also out of general interest in what weird things they can learn through self-play.
|
YourGoodFriend
United States2197 Posts
On June 29 2018 21:15 lolfail9001 wrote: There is nothing wrong in bot needing certain restrictions. The issue is that those restrictions turn Dota into harder Go and if we wanted to play Go, we wouldn't be here now.

Not even close; it's like you didn't read my post at all. It's not even close to comparable with Go. The math: 1,000 (valid actions per hero per second) × 5 (players in a game) × 2,700 (seconds in a 45-minute game) is the equivalent of playing 13,500,000 games of Go in one Dota match; calling that a harder Go is ignorance. Also, look at how humans learn: in restricted scenarios. Take any sport: tee-ball before baseball; basketball goals get higher and the 3-point line moves out the older you get; football (soccer) fields and goals go from smaller to larger. It's the process of how we learn being replicated in machines. The point of this match is to show how far it has grown, so restrictions make sense.
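The raw multiplication in the post above checks out as arithmetic; note, though, that converting it into "games of Go" rests on reading the blog's 250 as the action count for a whole game, whereas 250 is more commonly quoted as Go's average branching factor per move:

```python
# Sanity-checking the arithmetic in the post above, using the numbers
# the poster took from the blog quote.
valid_actions_per_hero_per_second = 1000
players = 5
game_seconds = 45 * 60  # a 45-minute game = 2700 seconds

total_action_choices = valid_actions_per_hero_per_second * players * game_seconds
print(total_action_choices)  # 13500000
```

So the 13,500,000 figure is the total number of per-hero action choices across one team in one game, whatever one decides that is "equivalent" to in Go.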
|
Nevuk
United States16280 Posts
The only restrictions they are planning on for the actual game are limited heroes
|
lolfail9001
Russian Federation40190 Posts
On June 29 2018 21:43 YourGoodFriend wrote: Not even close; it's like you didn't read my post at all. It's not even close to comparable with Go. The math: 1,000 × 5 (players in a game) × 2,700 (seconds in a 45-minute game) is the equivalent of playing 13,500,000 games of Go in one Dota match; calling that a harder Go is ignorance. Also, look at how humans learn: in restricted scenarios. Take any sport: tee-ball before baseball; basketball goals get higher and the 3-point line moves out the older you get; football (soccer) fields and goals go from smaller to larger. It's the process of how we learn being replicated in machines. The point of this match is to show how far it has grown, so restrictions make sense.

Harder is a statement about ordering; the fact that it is harder than Go by orders upon orders of magnitude is irrelevant. Also, your real-sports comparisons are the embodiment of the "bad sports comparison" trope: the adjustments you talk about do not subvert the rules of the game itself, while locking Dota into a fixed, mirrored 5v5 without wards does. Now, if they are going to throw away most of these restrictions for the showmatches themselves, it may just be watchable.
|
TomatoBisque
United States6290 Posts
On June 29 2018 21:40 spudde123 wrote: No doubt it's an impressive achievement, and even with the restrictions it's a very difficult problem (as evidenced by the computing power necessary to get results). I don't think people doubt that, and it's interesting from a research perspective. But for it to be really interesting from a player's perspective (at least for me), I think they would have to play more or less the same game as humans play. Or maybe if we had access to some tens of games the bots play, even with restrictions, we could see if there's something in what they do that we can learn from.

They mentioned in the article that OpenAI prioritizes early gold/exp on supports more than humans do, to hit faster timings to take fights. It'd be cool to see exactly how they go about that.
|
spudde123
4814 Posts
On June 30 2018 00:08 TomatoBisque wrote: They mentioned in the article that OpenAI prioritizes early gold/exp on supports more than humans do, to hit faster timings to take fights. It'd be cool to see exactly how they go about that.

Yeah, it'll be interesting to see how it actually looks. I suspect it has something to do with the way their reward function is defined: the bots don't learn what is optimal for winning, but rather what the researchers have defined as the proxy for winning (the reward function). Not that there is anything really wrong with this; humans also pursue various intermediate goals rather than constantly thinking about what is truly optimal for winning the entire game. Many of us never get past that stage, and instead just hide in the jungle farming creeps while our team is losing.

But anyway, every bot is trying to maximize its own reward. The mean reward of both teams is taken into account there, but each bot is also just trying to increase its own rewards, for example by farming creeps. I don't know whether the support/carry resource distribution arises because the bots determine it brings better long-term rewards for the entire team, or because every bot is basically competing for farm against its teammates to an extent.
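The individual-versus-team tension described above can be sketched as a blend between each bot's own reward and the team's average. The `team_spirit` name and the 0.5 weight are hypothetical choices for illustration, not values from the thread:

```python
# Sketch of blending individual rewards with the team mean, so that
# cooperation can pay off even though each bot acts on its own.
# `team_spirit` = 0 means purely selfish; 1 means purely team-average.
def blended_rewards(own_rewards: list, team_spirit: float = 0.5) -> list:
    """Mix each bot's individual reward with the team's mean reward."""
    team_mean = sum(own_rewards) / len(own_rewards)
    return [(1 - team_spirit) * r + team_spirit * team_mean
            for r in own_rewards]

# A farming carry (big individual reward) and four roaming supports:
rewards = blended_rewards([4.0, 1.0, 1.0, 1.0, 1.0], team_spirit=0.5)
# carry: 0.5*4.0 + 0.5*1.6 = 2.8; each support: 0.5*1.0 + 0.5*1.6 = 1.3
```

Under such a blend, a bot that helps the team's average grow (e.g. by conceding farm to a teammate with better timings) can still come out ahead, which is one way the support/carry resource split could emerge.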
|
Sn0_Man
Tebellong44238 Posts
That's the biggest reason why people are bothered by the restrictions. We don't care if the bots can barely beat 2k MMR players; we want them to be playing Dota, so that the things they learn, develop, and do are relevant and interesting to us.
|
spacecoke
Sweden112 Posts
On June 30 2018 00:08 TomatoBisque wrote: They mentioned in the article that OpenAI prioritizes early gold/exp on supports more than humans do, to hit faster timings to take fights. It'd be cool to see exactly how they go about that.

Just like Alliance did back in the day.
|
Fleetfeet
Canada2683 Posts
The ultimate goal is to get the AI 5v5 functional, then ship it to Valve so they can literally run a simulation of a proposed patch to see how the meta shakes out. THEN we'll be the future.
|
WolfintheSheep
Canada14127 Posts
On June 29 2018 21:00 YourGoodFriend wrote: For all of those complaining about restrictions, I don't think you understand how difficult it is for bots to play Dota. It was huge when a bot beat the best human player at Go; now let's put that in perspective for Dota (from their blog): So the total valid actions for an entire game of Go is about 250, while there are four times that for a single hero in a single second in Dota. This is not an easy problem to solve, and they are already doing some amazing things with AI. Yet we live in a microwave age where people complain about a bot needing restrictions after only two years of work to play a stupidly complex game and have a chance against humans.

I know perfectly well that the AI can't handle the entire game yet. And I'd much rather see a limited AI fail to keep up with humans exploiting the full game mechanics than watch a gimmick match that handicaps the humans so the AI can look good. And as development continues, we'd actually get to see the progress of learning in a real environment.
|
Furikawari
France2522 Posts
On June 29 2018 21:43 YourGoodFriend wrote: Not even close; it's like you didn't read my post at all. It's not even close to comparable with Go. The math: 1,000 × 5 (players in a game) × 2,700 (seconds in a 45-minute game) is the equivalent of playing 13,500,000 games of Go in one Dota match; calling that a harder Go is ignorance. Also, look at how humans learn: in restricted scenarios. Take any sport: tee-ball before baseball; basketball goals get higher and the 3-point line moves out the older you get; football (soccer) fields and goals go from smaller to larger. It's the process of how we learn being replicated in machines. The point of this match is to show how far it has grown, so restrictions make sense.

You can't compare Go to Dota like this: one wrong move in Go (like not selecting the right move at a given time) can lose you the game, and that's far from being the case in Dota.
|
intotheheart
Canada33091 Posts
edit: Last year the system was able to play 1 hero. I'd be curious whether it's 5 independent systems, one master controlling 5 slaves, or some sort of 5-computer system where each bot communicates with the others. In any of these cases, I'm sure that scaling from 1v1 SF to 5v5 (fixed heroes) was quite a challenge in that aspect alone. I'm just really curious how they'll continue to grow the system from here, be it a slowly expanding hero pool or more features of the game.
|
Murlox
France1699 Posts
On June 29 2018 23:03 Nevuk wrote: The only restrictions they are planning on for the actual game are limited heroes

What? Where did you see that?
|
FreakyDroid
Macedonia2616 Posts
On June 30 2018 02:08 Fleetfeet wrote: The ultimate goal is to get the AI 5v5 functional, then ship it to valve so they can literally run a simulation of a proposed patch to see how the meta shakes out. THEN we'll be the future.

Cool idea, but there's no guarantee human pros would adopt the same meta as the AI.
|
Nevuk
United States16280 Posts
From both the OP and their blog, since people are complaining about the current restrictions (no warding, no Rosh, etc.), to repeat: they will only restrict the hero pool in the August challenge.

"While today we play with restrictions, we aim to beat a team of top professionals at The International in August subject only to a limited set of heroes"

And later: "Our team is focused on making our August goal. We don't know if it will be achievable, but we believe that with hard work (and some luck) we have a real shot."