|
On March 28 2016 13:12 Ilikestarcraft wrote:
On March 28 2016 13:04 Fran_ wrote:
On March 28 2016 12:53 Circumstance wrote: The real-time aspect will be critical. Go, being turn-based, is purely a matchup of mind against "mind". If they allow the machine unlimited APM, then this won't be much of a match.
Also, you'd need representatives from all 3 races, wouldn't you?
And if you don't allow the machine to have arbitrary APM, what APM will you allow? The choice is completely arbitrary.
I don't think an APM cap would be completely arbitrary. Somewhere around the average APM of top pros or maybe a little higher I think is reasonable.
EPM not APM.
|
On March 29 2016 02:19 HellHound wrote:
On March 28 2016 13:12 Ilikestarcraft wrote:
On March 28 2016 13:04 Fran_ wrote:
On March 28 2016 12:53 Circumstance wrote: The real-time aspect will be critical. Go, being turn-based, is purely a matchup of mind against "mind". If they allow the machine unlimited APM, then this won't be much of a match.
Also, you'd need representatives from all 3 races, wouldn't you?
And if you don't allow the machine to have arbitrary APM, what APM will you allow? The choice is completely arbitrary.
I don't think an APM cap would be completely arbitrary. Somewhere around the average APM of top pros or maybe a little higher I think is reasonable.
EPM not APM.
Even EPM is "inflated" since many of the actions everyone does are tiny corrections on earlier clicks (due to misclicks and whatnot) which a computer wouldn't need.
|
On March 28 2016 12:53 Circumstance wrote: The real-time aspect will be critical. Go, being turn-based, is purely a matchup of mind against "mind". If they allow the machine unlimited APM, then this won't be much of a match.
Also, you'd need representatives from all 3 races, wouldn't you?
Yeah, but SC2 requires a little more creativity than Go or chess... I think eventually an AI could be better of course, but I don't think we're anywhere near that point yet. There aren't predetermined tiles or spots for pieces to move... SC2 units don't have simple rules, and there are almost infinite possibilities. I suppose you could teach an AI to macro like an animal and then A-move over someone, or maybe even attempt to set up concaves. But man, there are a lot more factors involved in winning an SC2 engagement than taking someone's rook in chess. And while I don't know the rules of Euro.Go, it looks like you have one type of unit and a grid to move on... No setting up army compositions, or gaining vision of your opponent, etc.
Maybe I'll be surprised as to how far AIs have come, but methinks they will be mechanically perfect but fairly retarded in many ways.
|
Perfect macro and perfect micro will solve a lot of the 'strategical' issues by simply winning fights that, as humans, we believe you should lose. The easiest example is TvZ: off creep, perfect micro means you shouldn't lose any marines to banelings as long as you have stim.
|
On March 29 2016 02:58 Pseudorandom wrote: Perfect macro and perfect micro will solve a lot of the 'strategical' issues by simply winning fights that, as humans, we believe you should lose. The easiest example is TvZ: off creep, perfect micro means you shouldn't lose any marines to banelings as long as you have stim.
Yeah I think it's possible that DeepMind wins solely off of micro.
Unless they implement an APM limiter or something.
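One way to picture the APM limiter mentioned above is a sliding-window cap: the bot may issue an action only if fewer than a fixed number were issued within the last window. This is a minimal sketch under stated assumptions, not anything DeepMind or Blizzard has described; the class name `ApmLimiter`, its parameters, and the 300 APM figure are all hypothetical.

```python
from collections import deque


class ApmLimiter:
    """Sliding-window cap on actions per minute (hypothetical helper)."""

    def __init__(self, max_apm: int, window_seconds: float = 60.0):
        # Scale the per-minute budget down to the chosen window length.
        self.max_actions = int(max_apm * window_seconds / 60)
        self.window = window_seconds
        self._stamps = deque()  # timestamps of recently allowed actions

    def try_act(self, now: float) -> bool:
        """Record and allow an action at time `now` (seconds) if under the cap."""
        # Forget actions that have slid out of the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()
        if len(self._stamps) < self.max_actions:
            self._stamps.append(now)
            return True
        return False


# Assumed figure: 300 APM enforced over one-second windows (5 actions/second).
limiter = ApmLimiter(max_apm=300, window_seconds=1.0)
burst = [limiter.try_act(0.0) for _ in range(8)]  # 8 actions in the same instant
```

A short window like this also blocks bursts; a full 60-second window would let a bot spend its whole budget in a single fight.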
|
On March 29 2016 03:22 Incognoto wrote:
On March 29 2016 02:58 Pseudorandom wrote: Perfect macro and perfect micro will solve a lot of the 'strategical' issues by simply winning fights that, as humans, we believe you should lose. The easiest example is TvZ: off creep, perfect micro means you shouldn't lose any marines to banelings as long as you have stim.
Yeah I think it's possible that DeepMind wins solely off of micro. Unless they implement an APM limiter or something.
And all the Zerg human hero needs to do is force a bunch of wasted stims and threaten to counter.
|
On March 29 2016 03:36 Hexe wrote:
On March 29 2016 03:22 Incognoto wrote:
On March 29 2016 02:58 Pseudorandom wrote: Perfect macro and perfect micro will solve a lot of the 'strategical' issues by simply winning fights that, as humans, we believe you should lose. The easiest example is TvZ: off creep, perfect micro means you shouldn't lose any marines to banelings as long as you have stim.
Yeah I think it's possible that DeepMind wins solely off of micro. Unless they implement an APM limiter or something.
And all the Zerg human hero needs to do is force a bunch of wasted stims and threaten to counter.
DeepMind will just send over its starting SCVs and win a 12 SCV vs 15 Drones fight.
|
For people wondering if Blizzard will "tolerate" Google developing this since it is like a 3rd party program / mod: you don't realize that there isn't a single company that would turn down the amount of exposure a company like Google can bring. A good AlphaSC2 would save Blizzard millions of dollars in advertising and would probably bring more players to the scene.
Oh and also, an AI cannot waste stims. You can program it so that it only stims when it knows it can reach your units for sure even if they run away, and there is no turning back on individually controlled stimmed units lol.
The AI won't need a fancy late game strategy to win; simple rush builds with perfect micro should be enough to defeat humans. Doing the same in a long macro game would be a lot more difficult.
|
On March 29 2016 02:14 Mendelfist wrote:
On March 29 2016 01:33 The Bottle wrote: I'm not talking about the number of possible game states. That's not important for a machine learning algorithm. Number of possible moves you can make is important. That is, the way you encode a particular action. This is essential for making a training data set for your algorithm to learn.
Why do you assume that the best way to solve this problem is to throw every single game state variable or pixel at a neural net and then hope that it somehow works out?
"Your billiard example doesn't work here, because there's no self-learning AI algorithm for billiard, at least not that I know of."
Then imagine one. One that learns for example by self play. Do you really think that it would have a hard time finding "the right moves" just because they are infinite in number? Edit: And the question is not if it can be better than a script, which can be near perfect. The question is if you think it's a hard problem.
"They will have to find clever ways to transform the game input data in order to remove redundancies, and coarsen the scale of discrete moves. I'm sure they did something like that with Go already, but it will be substantially harder for SC2. You say it's just a different problem, sure. But a much, much harder problem, one I'm not quite sure they'll solve, even knowing their success with Go."
Yes, THIS is the problem, and once you have done this the "number of possible moves" in the original problem is irrelevant. That only tells you that you have THIS problem on your hands and that you can't solve it by ordinary search algorithms. You have to find a way to reduce the original problem by levels of abstraction. I don't know if there is a way for an AI to find these abstractions by itself. Maybe that's how AlphaGo works. In any case, I'm not in the least convinced that it's as hard as you are trying to make it sound.
In the simplest form you can have an ordinary scripted bot that asks an AI for advice. "Attack now", "build sentries", "expand there" etc. Or you could throw every pixel at it, like you want. I don't think that would work. Or something in between. How about that?
What you explained at the beginning is actually similar to how they train the initial state of the Go algorithm, before they get into the reinforcement learning. It's not sufficient on its own to make the algorithm as intelligent as it is, but that is what they do in their initial stages.
To clarify, they don't feed all possible permutations of the board, because that's obviously intractable. But they do feed it a large set of board positions from many games, with a target variable of which side won the game, and train a neural network on that. The input data for that is not hard to encode at all. It's simply a data set of 361 ternary points (black, white, or blank) and a binary target variable (which side won). Then in practice, given a board state, you calculate the probability of victory of all possible subsequent board states (all single turn moves you can make from the given state) using the NN trained by the above process. Such a method was used as the initial stage of the Go algorithm, as explained in this paper https://vk.com/doc-44016343_437229031?dl=56ce06e325d42fbc72 before they started the reinforcement learning, but this would be impossible to do for Starcraft. (I mean this particular method, not any supervised learning method.)
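A rough sketch of the encoding and move selection described above, assuming a trained network is available as a plain function. Everything here is illustrative: `toy_win_probability` is a made-up stand-in rather than a trained network, and `successors` ignores captures, ko and suicide, so this should not be read as AlphaGo's actual pipeline.

```python
from typing import Callable, Iterator, List, Tuple

BOARD_SIZE = 19
EMPTY, BLACK, WHITE = 0, 1, 2  # the three values each point can take

Board = List[List[int]]
Move = Tuple[int, int]


def encode(board: Board) -> List[int]:
    """Flatten a 19x19 board into the 361 ternary features."""
    return [point for row in board for point in row]


def successors(board: Board, player: int) -> Iterator[Tuple[Move, Board]]:
    """Yield (move, next_board) for every empty point.

    Simplification: captures, ko and suicide rules are ignored.
    """
    for r in range(BOARD_SIZE):
        for c in range(BOARD_SIZE):
            if board[r][c] == EMPTY:
                nxt = [row[:] for row in board]
                nxt[r][c] = player
                yield (r, c), nxt


def best_move(board: Board, player: int,
              win_probability: Callable[[List[int], int], float]) -> Move:
    """Score every single-move successor and return the highest-scoring move."""
    scored = ((win_probability(encode(nxt), player), move)
              for move, nxt in successors(board, player))
    return max(scored)[1]


def toy_win_probability(features: List[int], player: int) -> float:
    """Stand-in for the trained network (NOT a real evaluator).

    It simply prefers the player's stones to sit near the centre.
    """
    centre = (BOARD_SIZE - 1) / 2
    dists = [abs(i // BOARD_SIZE - centre) + abs(i % BOARD_SIZE - centre)
             for i, value in enumerate(features) if value == player]
    if not dists:
        return 0.5
    return 1.0 / (1.0 + sum(dists) / len(dists))


empty_board = [[EMPTY] * BOARD_SIZE for _ in range(BOARD_SIZE)]
first_move = best_move(empty_board, BLACK, toy_win_probability)
```

Even on an empty board this scores only 361 candidate positions per turn, which is why the same enumerate-and-score loop is feasible for Go but not for a game where the set of single "moves" is far larger.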
But listen, because I think you're still misunderstanding me. I know that they're taking shortcuts to greatly reduce the search space of possible moves in Go; the paper states this pretty clearly. But the problem is still that, because of the sheer number of possible moves you can make, and the stark difference in outcome between those moves, it is incredibly difficult to reduce the space in an intelligent enough way to minimally reduce the information of best moves possible. The more complex the game is (and I mean this in terms of permutations of moves, not in terms of heuristic strategy), the harder this task is. Thus it will be monumentally difficult for Starcraft. Yes, they will have to take new shortcuts, of the sort that they didn't take in Go. But every time they do such a thing, they have to be incredibly careful not to remove certain specific crucial moves, or coarse-grain them together with other moves that have drastically different results. (An example of this is in ZvZ ling bane wars, where a couple of pixels' difference in movement can be the difference between 2 dead lings and 20 dead lings, or aiming a disruptor shot, or things like that.)
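To make the coarse-graining risk concrete, here is a small sketch that snaps pixel-level move targets to a grid. The 1920x1080 target area and the 32-pixel cell are arbitrary assumptions, and `coarsen` and `action_count` are hypothetical helpers, not part of any real Starcraft interface.

```python
from typing import Tuple


def coarsen(target: Tuple[int, int], cell: int) -> Tuple[int, int]:
    """Map a pixel-level move target to the coarse grid cell that contains it."""
    x, y = target
    return (x // cell, y // cell)


def action_count(width: int, height: int, cell: int) -> int:
    """Number of distinct move targets once the area is cut into cells."""
    cols = -(-width // cell)   # ceiling division
    rows = -(-height // cell)
    return cols * rows


WIDTH, HEIGHT, CELL = 1920, 1080, 32  # assumed sizes, for illustration only

fine = action_count(WIDTH, HEIGHT, 1)       # one target per pixel
coarse = action_count(WIDTH, HEIGHT, CELL)  # one target per 32x32 cell

# Two targets a few pixels apart collapse into the same coarse action...
same_cell = coarsen((100, 200), CELL) == coarsen((103, 202), CELL)
# ...while two targets one pixel apart can land in different cells.
split_cell = coarsen((95, 200), CELL) != coarsen((96, 200), CELL)
```

The coarse grid shrinks the target space by roughly three orders of magnitude, but it can no longer tell apart the two nearby targets in the first comparison, which is exactly the kind of lost distinction the ling bane example is about.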
I should clarify, I don't think this task is impossible. For sure, in principle it's very feasible to imagine a self-trained Starcraft algorithm that can beat any human. But I'm trying to explain why it's monumentally more difficult than training a Go algorithm, and why the actual difference in depth of strategy between the two games from a heuristic standpoint is not nearly as important as the complexity of move permutations. You say it's a different problem. Well, it's a different problem in the same sense that doing long division and proving Fermat's Last Theorem are different problems.
As for the billiard example. I did explain how the training set of a deep learning NN algorithm of billiard can be encoded, and why it's incredibly easy to do this in comparison to the other problems. I can clarify, but from your response I feel like you didn't read that bit. In your defense, it was sneakily put in beside my other point, so maybe I'll let it sit a little longer.
|
If they do end up doing this, I want to know whether they go with a famous name or with whoever's the best at the time.
|
Terran drops could be heavily abused by this robot, assuming it's not restricted.
|
On March 29 2016 03:49 The Bottle wrote: The more complex the game is (and I mean this in terms of permutations of moves, not in terms of heuristic strategy), the harder this task is. Thus it will be monumentally difficult for Starcraft.
And I'm saying that you're just making things up. An intractably large number of possible moves (or number of input variables) doesn't necessarily mean that the problem is hard (although it is a requirement), and reducing Starcraft to a problem on a higher level of abstraction than pixels or coordinates isn't necessarily very hard either. At least you haven't shown any arguments for it. Once you have moved to a high abstraction level Starcraft IS simple compared to Go, which cannot be reduced to builds or strategies in any similar way. This is the reason why I think it's at least possible that Starcraft is even easier to master than Go for an AI.
"it is incredibly difficult to reduce the space in an intelligent enough way to minimally reduce the information of best moves possible"
We are not trying to find the best moves possible. We are trying to beat the world champion, or someone similar. You are again making this harder than it is.
|
Does SC2 even have a public API for coding an AI?
|
On March 29 2016 02:39 diabcockiful wrote:
On March 28 2016 12:53 Circumstance wrote: The real-time aspect will be critical. Go, being turn-based, is purely a matchup of mind against "mind". If they allow the machine unlimited APM, then this won't be much of a match.
Also, you'd need representatives from all 3 races, wouldn't you?
Yeah, but SC2 requires a little more creativity than Go or chess... I think eventually an AI could be better of course, but I don't think we're anywhere near that point yet. There aren't predetermined tiles or spots for pieces to move... SC2 units don't have simple rules, and there are almost infinite possibilities. I suppose you could teach an AI to macro like an animal and then A-move over someone, or maybe even attempt to set up concaves. But man, there are a lot more factors involved in winning an SC2 engagement than taking someone's rook in chess. And while I don't know the rules of Euro.Go, it looks like you have one type of unit and a grid to move on... No setting up army compositions, or gaining vision of your opponent, etc. Maybe I'll be surprised as to how far AIs have come, but methinks they will be mechanically perfect but fairly retarded in many ways.
It seems like there is a huge divide in this thread between the people who know Go and those who don't...
|
On March 29 2016 04:41 endy wrote: Does SC2 even have a public API for coding an AI?
Since they are already talking, I'm sure Blizzard will provide Google with whatever they need. The publicity for SC2 will be insane.
|
On March 29 2016 03:49 The Bottle wrote: But listen, because I think you're still misunderstanding me. I know that they're taking shortcuts to greatly reduce the search space of possible moves in Go; the paper states this pretty clearly. But the problem is still that, because of the sheer number of possible moves you can make, and the stark difference in outcome between those moves, it is incredibly difficult to reduce the space in an intelligent enough way to minimally reduce the information of best moves possible. The more complex the game is (and I mean this in terms of permutations of moves, not in terms of heuristic strategy), the harder this task is. Thus it will be monumentally difficult for Starcraft. Yes, they will have to take new shortcuts, of the sort that they didn't take in Go. But every time they do such a thing, they have to be incredibly careful not to remove certain specific crucial moves, or coarse-grain them together with other moves that have drastically different results. (An example of this is in ZvZ ling bane wars, where a couple of pixels' difference in movement can be the difference between 2 dead lings and 20 dead lings, or aiming a disruptor shot, or things like that.)
Though I think you have an understanding of the workings of AlphaGo beyond that of the average layperson, I just want to point out that some of the language you use is incorrect. You seem to imply that DeepMind will influence AlphaGo's decisions, which is incorrect.
Beyond a ladder (a large board-scale trap) calculator, the DeepMind team did not provide AlphaGo any hints, tips or tricks. Through many iterations, AlphaGo learned to place stronger weights on specific paths to search.
|
On March 29 2016 04:48 kingjames01 wrote: Beyond a ladder (a large board-scale trap) calculator, the DeepMind team did not provide AlphaGo any hints, tips or tricks. Through many iterations, AlphaGo learned to place stronger weights on specific paths to search.
I'm going off on a slight tangent here, but someone on the DeepMind team spoke of a possible future development: doing AlphaGo again but without the first step of training the neural nets on lots of human games. While they didn't teach it any specific tricks, these games may have taught it bad habits. Go is actually a very little-researched game, which Go Seigen proved in the middle of the last century by turning everything upside down. I would very VERY much want them to do this instead of trying Starcraft. It could have vast implications for our understanding of Go. Maybe the best starting move is right in the middle?
|
On March 29 2016 04:44 Musicus wrote:
On March 29 2016 04:41 endy wrote: Does SC2 even have a public API for coding an AI?
Since they are already talking, I'm sure Blizzard will provide Google with whatever they need. The publicity for SC2 will be insane.
Yep, we will probably see an influx of "ded gaem" comments. 
Unless AlphaStar is using computer vision, Blizzard should just release to everyone the SC2 API that this will require.
|
Go has nothing to do with chess. Chess can be brute-forced, Go cannot.
|
On March 29 2016 05:00 Mendelfist wrote:
On March 29 2016 04:48 kingjames01 wrote: Beyond a ladder (a large board-scale trap) calculator, the DeepMind team did not provide AlphaGo any hints, tips or tricks. Through many iterations, AlphaGo learned to place stronger weights on specific paths to search.
I'm going off on a slight tangent here, but someone on the DeepMind team spoke of a possible future development: doing AlphaGo again but without the first step of training the neural nets on lots of human games. While they didn't teach it any specific tricks, these games may have taught it bad habits. Go is actually a very little-researched game, which Go Seigen proved in the middle of the last century by turning everything upside down. I would very VERY much want them to do this instead of trying Starcraft. It could have vast implications for our understanding of Go. Maybe the best starting move is right in the middle?
The issue with removing the first 'supervised' learning process for the policy network (namely, training AlphaGo without first copying human moves) is that it might then take months and months to converge. Arguably the policies learned that way might be stronger, but it might be that the networks take too long, or simply fail to converge. So that approach is dependent on future progress in unsupervised machine learning methods.
|