DeepMind sets AlphaGo's sights on SCII - Page 6

HellHound

Bulgaria 5962 Posts

March 28 2016 17:19 GMT

#101

On March 28 2016 13:12 Ilikestarcraft wrote:

I don't think an APM cap would be completely arbitrary. Somewhere around the average APM of top pros, or maybe a little higher, is reasonable I think.

EPM not APM.

ZigguratOfUr

Iraq 16955 Posts

March 28 2016 17:39 GMT

#102

On March 29 2016 02:19 HellHound wrote:

EPM not APM.

Even EPM is "inflated" since many of the actions everyone does are tiny corrections on earlier clicks (due to misclicks and whatnot) which a computer wouldn't need.

diabcockiful

22 Posts

March 28 2016 17:39 GMT

#103

On March 28 2016 12:53 Circumstance wrote:
The real-time aspect will be critical. Go, being turn-based, is purely a matchup of mind against "mind". If they allow the machine unlimited APM, then this won't be much of a match.

Also, you'd need representatives from all 3 races, wouldn't you?

Yeah, but SC2 requires a little more creativity than Go or chess... I think eventually an AI could be better of course, but I don't think we're anywhere near that point yet. There aren't predetermined tiles or spots for pieces to move... SC2 units don't have simple rules, and there are almost infinite possibilities. I suppose you could teach an AI to macro like an animal and then A-move over someone, or maybe even attempt to set up concaves. But man, there are a lot more factors involved in winning an SC2 engagement than taking someone's rook in chess. And while I don't know the rules of Euro.Go, it looks like you have one type of unit and a grid to move on... No setting up army compositions, or gaining vision of your opponent, etc.

Maybe I'll be surprised as to how far AIs have come, but methinks they will be mechanically perfect but fairly clueless in many ways.

Pseudorandom

United States 120 Posts

March 28 2016 17:58 GMT

#104

Perfect macro and perfect micro will solve a lot of the 'strategic' issues by simply winning fights that, as humans, we believe you should lose. The easiest example is TvZ: off creep, perfect micro means you shouldn't lose any marines to banelings as long as you have stim.

Incognoto

France 10239 Posts

March 28 2016 18:22 GMT

#105

On March 29 2016 02:58 Pseudorandom wrote:
Perfect macro and perfect micro will solve a lot of the 'strategic' issues by simply winning fights that, as humans, we believe you should lose. The easiest example is TvZ: off creep, perfect micro means you shouldn't lose any marines to banelings as long as you have stim.

Yeah, I think it's possible that DeepMind wins solely off of micro.

Unless they implement an APM limiter or something.
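
A minimal sketch of what such an APM limiter could look like, assuming a simple sliding one-minute window. The class name, the `try_act` method and the cap of 300 are made up for illustration; nothing here reflects how DeepMind or Blizzard would actually restrict an agent.

```python
from collections import deque


class ActionRateLimiter:
    """Sliding-window cap on actions per minute (hypothetical helper)."""

    def __init__(self, max_actions_per_minute, window_seconds=60.0):
        self.max_actions = max_actions_per_minute
        self.window = window_seconds
        self._timestamps = deque()  # times of the actions still inside the window

    def try_act(self, now):
        """Return True and record the action if the cap allows it at time `now`."""
        # Forget actions that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window:
            self._timestamps.popleft()
        if len(self._timestamps) < self.max_actions:
            self._timestamps.append(now)
            return True
        return False


# Assumed cap of 300 APM; the agent tries 1000 actions within the first second.
limiter = ActionRateLimiter(max_actions_per_minute=300)
allowed = sum(limiter.try_act(now=i / 1000) for i in range(1000))
print(allowed)  # only the first 300 attempts go through
```

A cap like this limits raw action count only. It says nothing about EPM versus APM, so a bot that never misclicks would still get more useful work out of the same budget than a human.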

Hexe

United States 332 Posts

March 28 2016 18:36 GMT

#106

On March 29 2016 03:22 Incognoto wrote:

Yeah, I think it's possible that DeepMind wins solely off of micro.

Unless they implement an APM limiter or something.

And all the Zerg human hero needs to do is force a bunch of wasted stims and threaten to counter.

Clonester

Germany 2808 Posts

March 28 2016 18:40 GMT

#107

On March 29 2016 03:36 Hexe wrote:

And all the Zerg human hero needs to do is force a bunch of wasted stims and threaten to counter.

DeepMind will just send over its starting SCVs and win a 12 SCV vs 15 Drones fight.

The_Masked_Shrimp

425 Posts

March 28 2016 18:45 GMT

#108

For people wondering whether Blizzard will "tolerate" Google developing this, since it is like a third-party program or mod: you don't realize that no company would spit on the amount of exposure a company like Google can bring. A good AlphaSC2 would save Blizzard millions of dollars in advertising and would probably bring more players to the scene.

Oh, and also, an AI cannot waste stims. You can program it so that it only stims when it knows for sure it can reach your units even if they run away, and there is no turning back on individually controlled stimmed units lol.

The AI won't need a fancy late-game strategy to win; simple rush builds with perfect micro should be enough to defeat humans. Doing the same in a long macro game would be a lot more difficult.

The Bottle

242 Posts

March 28 2016 18:49 GMT

#109

On March 29 2016 02:14 Mendelfist wrote:

Why do you assume that the best way to solve this problem is to throw every single game state variable or pixel at a neural net and then hope that it somehow works out?

Then imagine one. One that learns for example by self play. Do you really think that it would have a hard time finding "the right moves" just because they are infinite in number? Edit: And the question is not if it can be better than a script, which can be near perfect. The question is if you think it's a hard problem.

Yes, THIS is the problem, and once you have done this the "number of possible moves" in the original problem is irrelevant. That only tells you that you have THIS problem on your hands and that you can't solve it by ordinary search algorithms. You have to find a way to reduce the original problem by levels of abstraction. I don't know if there is a way for an AI to find these abstractions by itself. Maybe that's how AlphaGo works. In any case, I'm not in the least convinced that it's as hard as you are trying to make it sound. In the simplest form you can have an ordinary scripted bot that asks an AI for advice. "Attack now", "build sentries", "expand there" etc. Or you could throw every pixel at it, like you want. I don't think that would work. Or something in between. How about that?

What you explained at the beginning is actually similar to how they train the initial state of the Go algorithm, before they get into the reinforcement learning. On its own it's not sufficient to make the algorithm as intelligent as it is, but that is what they do in the initial stages.

To clarify, they don't feed it all possible permutations of the board, because that's obviously intractable. But they do feed it a large set of board positions from many games, with a target variable of which side won the game, and train a neural network on that. The input data for that is not hard to encode at all. It's simply a data set of 361 ternary points (black, white, or blank) and a binary target variable (which side won). Then in practice, given a board state, you calculate the probability of victory of all possible subsequent board states (all single-turn moves you can make from the given state) using the NN trained by the above process. Such a method was used as the initial stage of the Go algorithm, before they started the reinforcement learning, as explained in this paper:
https://vk.com/doc-44016343_437229031?dl=56ce06e325d42fbc72
This would be impossible to do for Starcraft, though. (I mean this particular method, not any supervised learning method.)
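
A minimal sketch of the encoding and the one-move lookahead described above, assuming a plain 19x19 list-of-lists board. Every name here is invented for illustration, the move generator ignores captures, ko and suicide, and `toy_win_probability` is a hand-written stand-in for the trained network, not anything from the paper.

```python
BOARD_SIZE = 19
EMPTY, BLACK, WHITE = 0, 1, 2


def encode_board(board):
    """Flatten a 19x19 board into 361 ternary values (blank, black, white)."""
    return [cell for row in board for cell in row]


def successors(board, player):
    """Yield (move, next_board) for every empty point.

    Simplification: captures, ko and suicide rules are ignored.
    """
    for r in range(BOARD_SIZE):
        for c in range(BOARD_SIZE):
            if board[r][c] == EMPTY:
                nxt = [row[:] for row in board]
                nxt[r][c] = player
                yield (r, c), nxt


def toy_win_probability(features, player):
    """Stand-in for the trained NN: arbitrarily favours stones near the centre."""
    score = 0.0
    for i, cell in enumerate(features):
        if cell == player:
            r, c = divmod(i, BOARD_SIZE)
            score += 1.0 / (1 + abs(r - 9) + abs(c - 9))
    return min(score, 1.0)


def pick_move(board, player, win_probability):
    """Score every single-move successor state and return the best move."""
    best_move, _ = max(
        successors(board, player),
        key=lambda pair: win_probability(encode_board(pair[1]), player),
    )
    return best_move


empty = [[EMPTY] * BOARD_SIZE for _ in range(BOARD_SIZE)]
# One training row: 361 inputs plus a binary target (label invented here).
training_example = (encode_board(empty), 1)
chosen = pick_move(empty, BLACK, toy_win_probability)
print(len(training_example[0]), chosen)
```

The whole input fits in a flat list of 361 small integers, and the candidate moves can be enumerated in a double loop. Neither holds for a Starcraft game state.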

But listen, because I think you're still misunderstanding me. I know that they're taking shortcuts to greatly reduce the search space of possible moves in Go, the paper states this pretty clearly. But the problem is still that, because of the sheer number of possible moves you can make, and the stark difference in outcome between those moves, it is incredibly difficult to reduce the space in an intelligent enough way to minimally reduce the information of best moves possible. The more complex the game is (and I mean this in terms of permutations of moves, not in terms of heuristic strategy), the harder this task is. Thus it will be monumentally difficult for Starcraft. Yes, they will have to take new shortcuts, of the sort that they didn't take in Go. But every time they do such a thing, they have to be incredibly careful not to remove certain specific crucial moves, or coarse-grain them in with other moves that have drastically different results. (An example of this is in ZvZ ling bane wars, where a couple of pixels' difference in movement can be the difference between 2 dead lings and 20 dead lings, or aiming a disruptor shot, or things like that).
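
A toy illustration of the coarse-graining risk, assuming an invented splash radius, invented ling positions and an invented grid size. It has nothing to do with real SC2 unit stats; it only shows two detonation points 2 pixels apart landing in the same coarse cell while producing very different results.

```python
import math

SPLASH_RADIUS = 6  # assumed value, not the real baneling splash radius
CELL_SIZE = 8      # assumed coarse grid resolution in pixels

# 2 stray lings on the left, a clump of 18 on the right (made-up positions).
lings = [(10, 20)] * 2 + [(20, 20)] * 18


def lings_killed(detonation_point):
    """Count lings within splash radius of the detonation point."""
    return sum(math.dist(detonation_point, ling) <= SPLASH_RADIUS for ling in lings)


def coarse_cell(point):
    """Map a pixel position to its coarse grid cell."""
    x, y = point
    return (x // CELL_SIZE, y // CELL_SIZE)


a, b = (13, 20), (15, 20)  # two candidate detonation points, 2 pixels apart
print(lings_killed(a), lings_killed(b))
print(coarse_cell(a) == coarse_cell(b))
```

With these numbers the first point catches only the 2 stray lings and the second catches all 20, yet a grid with 8-pixel cells treats them as the same move.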

I should clarify, I don't think this task is impossible. For sure, in principle it's very feasible to imagine a self-trained Starcraft algorithm that can beat any human. But I'm trying to explain why it's monumentally more difficult than training a Go algorithm, and why the actual difference in depth of strategy between the two games from a heuristic standpoint is not nearly as important as the complexity of move permutations. You say it's a different problem. Well, it's a different problem in the same sense that doing long division and proving Fermat's Last Theorem are different problems.

As for the billiards example: I did explain how the training set of a deep learning NN algorithm for billiards can be encoded, and why it's incredibly easy to do this in comparison to the other problems. I can clarify, but from your response I feel like you didn't read that bit. In your defense, it was sneakily put in beside my other point, so maybe I'll let it sit a little longer.

Karis Vas Ryaar

United States 4396 Posts

March 28 2016 18:50 GMT

#110

If they do end up doing this, I want to know whether they'd go with a famous name to play against it or with whoever's the best at the time.

WinterViewbot420

345 Posts

March 28 2016 19:15 GMT

#111

Terran drops could be heavily abused by this robot if we assume it's not restricted.

Mendelfist

Sweden 356 Posts

March 28 2016 19:32 GMT

#112

On March 29 2016 03:49 The Bottle wrote:
The more complex the game is (and I mean this in terms of permutations of moves, not in terms of heuristic strategy), the harder this task is. Thus it will be monumentally difficult for Starcraft.

And I'm saying that you're just making things up. An intractably large number of possible moves (or number of input variables) doesn't necessarily mean that the problem is hard (although it is a requirement), and reducing Starcraft to a problem on a higher level of abstraction than pixels or coordinates isn't necessarily very hard either. At least you haven't shown any arguments for it. Once you have moved to a high abstraction level Starcraft IS simple compared to Go, which cannot be reduced to builds or strategies in any similar way. This is the reason why I think it's at least possible that Starcraft is even easier to master than Go for an AI.
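
A minimal sketch of the "scripted bot that asks an AI for advice" setup mentioned earlier in the thread, assuming a handful of hand-picked summary features. All names and thresholds are hypothetical, and `rule_based_advisor` is only a placeholder for where a learned policy would plug in.

```python
from dataclasses import dataclass
from enum import Enum


class Advice(Enum):
    ATTACK_NOW = "attack now"
    BUILD_SENTRIES = "build sentries"
    EXPAND = "expand there"


@dataclass
class GameSummary:
    """High-level features instead of pixels (all fields are assumptions)."""
    own_supply: int
    enemy_supply_seen: int
    bases: int
    minerals: int


def rule_based_advisor(summary):
    """Placeholder advisor; a trained model would replace this function."""
    if summary.own_supply >= 1.5 * summary.enemy_supply_seen:
        return Advice.ATTACK_NOW
    if summary.minerals >= 400 and summary.bases < 3:
        return Advice.EXPAND
    return Advice.BUILD_SENTRIES


class ScriptedBot:
    """Scripted executor: asks the advisor what to do, then carries it out."""

    def __init__(self, advisor):
        self.advisor = advisor
        self.log = []

    def step(self, summary):
        advice = self.advisor(summary)
        # A real bot would translate the advice into unit commands here;
        # this sketch only records the decision.
        self.log.append(advice)
        return advice


bot = ScriptedBot(rule_based_advisor)
first = bot.step(GameSummary(own_supply=100, enemy_supply_seen=50, bases=2, minerals=0))
second = bot.step(GameSummary(own_supply=50, enemy_supply_seen=50, bases=2, minerals=500))
third = bot.step(GameSummary(own_supply=50, enemy_supply_seen=50, bases=3, minerals=500))
print([a.value for a in bot.log])
```

At this level the advisor chooses among three options per step rather than among every possible click, which is the reduction being argued for. Whether the scripted layer underneath can execute the advice well enough is a separate question.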

it is incredibly difficult to reduce the space in an intelligent enough way to minimally reduce the information of best moves possible

We are not trying to find the best moves possible. We are trying to beat the world champion, or someone similar. You are again making this harder than it is.

endy

Switzerland 8970 Posts

March 28 2016 19:41 GMT

#113

Does SC2 even have a public API for coding an AI?

andrewlt

United States 7702 Posts

March 28 2016 19:42 GMT

#114

On March 29 2016 02:39 diabcockiful wrote:

It seems like there is a huge divide in this thread between the people who know Go and those who don't...

Musicus

Germany 23576 Posts

March 28 2016 19:44 GMT

#115

On March 29 2016 04:41 endy wrote:
Does SC2 even have a public API for coding an AI?

Since they are already talking, I'm sure Blizzard will provide Google with whatever they need. The publicity for SC2 will be insane.

kingjames01

Canada 1603 Posts

March 28 2016 19:48 GMT

#116

On March 29 2016 03:49 The Bottle wrote:
But listen, because I think you're still misunderstanding me. I know that they're taking shortcuts to greatly reduce the search space of possible moves in Go, the paper states this pretty clearly. But the problem is still that, because of the sheer number of possible moves you can make, and the stark difference in outcome between those moves, it is incredibly difficult to reduce the space in an intelligent enough way to minimally reduce the information of best moves possible. The more complex the game is (and I mean this in terms of permutations of moves, not in terms of heuristic strategy), the harder this task is. Thus it will be monumentally difficult for Starcraft. Yes, they will have to take new shortcuts, of the sort that they didn't take in Go. But every time they do such a thing, they have to be incredibly careful not to remove certain specific crucial moves, or coarse-grain them in with other moves that have drastically different results. (An example of this is in ZvZ ling bane wars, where a couple of pixels' difference in movement can be the difference between 2 dead lings and 20 dead lings, or aiming a disruptor shot, or things like that).

Though I think you have an understanding of the workings of AlphaGo beyond that of the average layperson, I just want to point out that some of the language you use is incorrect. You seem to imply that DeepMind will influence AlphaGo's decisions, which is incorrect.

Beyond a ladder (a large board-scale trap) calculator, the DeepMind team did not provide AlphaGo any hints, tips or tricks. Through many iterations, AlphaGo learned to place stronger weights on specific paths to search.

Mendelfist

Sweden 356 Posts

March 28 2016 20:00 GMT

#117

On March 29 2016 04:48 kingjames01 wrote:
Beyond a ladder (a large board-scale trap) calculator, the DeepMind team did not provide AlphaGo any hints, tips or tricks. Through many iterations, AlphaGo learned to place stronger weights on specific paths to search.

I'm going off on a slight tangent here, but someone on the DeepMind team spoke of a possible future development, and that would be doing AlphaGo again but without the first training step of the neural nets with lots of human games. While they didn't teach it any specific tricks, these games may have taught it bad habits. Go is actually a very little researched game, which Go Seigen proved in the middle of the last century by turning everything upside down. I would very VERY much want them to do this instead of trying Starcraft. It could have vast implications for our understanding of Go. Maybe the best starting move is right in the middle?

purakushi

United States 3301 Posts

March 28 2016 20:08 GMT

#118

On March 29 2016 04:44 Musicus wrote:

Since they are already talking, I'm sure Blizzard will provide Google with whatever they need. The publicity for SC2 will be insane.

Yep, we will probably see an influx of "ded gaem" comments.

Unless AlphaStar is using computer vision, Blizzard should just release to everyone the SC2 API that this will require.

Incognoto

France 10239 Posts

March 28 2016 20:11 GMT

#119

Go has nothing to do with chess. Chess can be brute-forced, Go cannot.

MyLovelyLurker

France 756 Posts

March 28 2016 20:13 GMT

#120

On March 29 2016 05:00 Mendelfist wrote:

The issue with removing the first 'supervised' learning process for the policy network (namely, training AlphaGo without copying human moves at the very start) is that it might then take months and months before converging. Arguably the policies learned then might be stronger, but it might be that the networks take too long, or simply fail to converge. So that approach is dependent on future progress in unsupervised machine learning methods.
