On March 29 2016 03:49 The Bottle wrote: The more complex the game is (and I mean this in terms of permutations of moves, not in terms of heuristic strategy), the harder this task is. Thus it will be monumentally difficult for Starcraft.
And I'm saying that you're just making things up. An intractably large number of possible moves (or input variables) doesn't necessarily make the problem hard (although it is a requirement), and reducing Starcraft to a problem at a higher level of abstraction than pixels or coordinates isn't necessarily very hard either. At least you haven't shown any arguments for it. Once you have moved to a high abstraction level, Starcraft IS simple compared to Go, which cannot be reduced to builds or strategies in any similar way. This is why I think it's at least possible that Starcraft is easier for an AI to master than Go.
it is incredibly difficult to reduce the space in an intelligent enough way that you lose minimal information about the best moves possible
We are not trying to find the best moves possible. We are trying to beat the world champion, or someone similar. You are again making this harder than it is.
You said "reducing Starcraft to a problem on a higher level of abstraction than pixels or coordinates isn't necessarily very hard either". For scripting that's true. For supervised and reinforcement learning... well... I really think you have to work a lot harder to justify this statement. Unless you think scripting is the way they're going to beat a pro player, in which case... well, I agree with all of Boxer's objections to that.
I can try to explain why moving Starcraft to a high level of abstraction to make a training set for a learning algorithm is incredibly difficult, but it's hard to do so without explicitly explaining how a neural network or Monte Carlo tree search works. Without explaining it, I can say that with these algorithms you need to associate some sort of metric with the set of legal moves for a given board state. For AlphaGo this is not hard to do, and in fact they explain exactly how they do it in the paper I linked in my last post. And if the space of actions is too large they can use sampling methods (hence the Monte Carlo tree search). However, intelligent actions cannot be so sparse within the space of all actions that the vast majority of Monte Carlo iterations never sample even one. Not unless you significantly reduce that space and coarse-grain it.
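As a toy illustration of the sparsity point (my numbers, not anything from the AlphaGo paper): if "intelligent" actions make up a fraction p of the action space, then k uniform rollout samples find at least one with probability 1 - (1 - p)^k, and that collapses quickly as p shrinks.

```python
# Toy illustration: probability that k uniform Monte Carlo samples hit
# at least one "intelligent" action, when such actions make up a
# fraction p of the full action space. All numbers below are invented.
def p_hit(p: float, k: int) -> float:
    return 1 - (1 - p) ** k

for p in (1e-2, 1e-4, 1e-6):  # Go-like vs. raw-Starcraft-like sparsity
    print(f"p={p:g}: chance in 1000 samples = {p_hit(p, 1000):.4f}")
```

At p = 0.01 a thousand samples almost surely find a good action; at p = 0.000001 they almost surely don't, which is why the space has to be reduced first.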
The space reduction and coarse-graining is extremely difficult, though. You can, for example, decide that sending a specific army composition to attack a general area is as fine-grained as you want to get. That would work fine for some army compositions (like a WoL-style Protoss deathball) but it certainly wouldn't work for others (like ling-bane, tank-medivacs, or disruptors). You have to finely tune the degrees of freedom if you don't want an army to simply throw the game by not even using 10% of its utility. If you coarse-grain the learning process to a very high level of macro strategy, it would be easy to train, but you would never get good micro (or any micro, really) without adding scripting to it. (Which may not be off the table.) However, if you think that with pure learning and no scripting you can coarse-grain the training sets to very high-level macro strategy and still train an algorithm that can beat pros, then we probably just completely disagree on the importance of small-scale micro.
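To make the degrees-of-freedom problem concrete, here is a deliberately toy coarse-grained move vocabulary. Every name in it is hypothetical, nothing from DeepMind, and the `composition` field shows exactly where the lossiness bites: "attack region 3 with ling-bane" discards the split-and-flank control that decides whether ling-bane does anything at all.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional

# Hypothetical coarse-grained moves, invented for illustration.
class MacroMove(Enum):
    EXPAND = auto()
    TECH_UP = auto()
    BUILD_ARMY = auto()
    ATTACK_REGION = auto()
    HARASS_REGION = auto()

@dataclass
class CoarseAction:
    move: MacroMove
    region: Optional[int] = None        # target map region, if any
    composition: Optional[str] = None   # e.g. "ling-bane": one label for
                                        # engagements that need very
                                        # different control to not throw

# A deathball tolerates this granularity; ling-bane does not.
a = CoarseAction(MacroMove.ATTACK_REGION, region=3, composition="ling-bane")
print(a)
```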
We are not trying to find the best moves possible. We are trying to beat the world champion, or someone similar. You are again making this harder than it is.
But you do that by approximating the best possible set of moves you can make. That's what a machine learning algorithm does. You start with an ideal (in this case, that there exists a maximally good set of moves) and you try to approximate that ideal as well as you can.
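One standard way to write that ideal down (a generic reinforcement-learning formulation, not a claim about DeepMind's method):

```latex
% The ideal: a policy maximizing expected reward R (e.g. win probability).
\pi^{*} \;=\; \arg\max_{\pi}\; \mathbb{E}\left[\, R \mid \pi \,\right]
% Training produces a parameterized approximation \pi_{\theta} \approx \pi^{*},
% e.g. by gradient ascent on \mathbb{E}[R \mid \pi_{\theta}].
```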
On March 29 2016 04:15 WinterViewbot420 wrote: Terran drops could be abused heavily by this robot if we assume it's not restricted.
Well, then the computer would be cheating. It needs to have the same restrictions a human has: a mouse, a keyboard, and a monitor. Alternatively, I believe we'd be better off letting humans control the game with their minds. That would be a fairer match, with only the monitor being the limiting factor.
I think it will be more interesting to see how the AI does first before we start worrying about handicapping for it to be "fair" (is the point of showing an AI that can play Starcraft supposed to be fair?).
Considering how many seconds even the simplest move took to compute in Go, I'm not convinced the AI's APM will be high enough that we need to worry about the engineering problem more than the computer science problem.
On March 29 2016 05:24 The Bottle wrote: You said "reducing Starcraft to a problem on a higher level of abstraction than pixels or coordinates isn't necessarily very hard either". For scripting that's true. For supervised and reinforcement learning... well... I really think you have to work a lot harder to justify this statement. Unless you think scripting is the way they're going to beat a pro player, in which case... well, I agree with all of Boxer's objections to that.
No, I'm not talking about a pure scripted bot, nor a pure AI. I'm imagining a hybrid, as I hinted earlier. And yes, I have basic layman's knowledge of both neural nets and Monte Carlo search. I have followed the development of Go programs since the very beginning and watched the Monte Carlo revolution myself, but I'm not even sure Monte Carlo or any variant of current tree search methods is applicable to Starcraft. I can't see how. I know I'm not being very specific, but today's Starcraft scripts can play a decent game without ANY AI techniques at all, just hard-coded rules. I think that using just a policy network would help a lot. At what "granularity" you want to apply it depends on how much money you want to throw at the problem.
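For what a bare policy network over a coarse action set could look like, here is a minimal sketch. All shapes and sizes are my assumptions, and it assumes the hard part, reducing the game state to a feature vector, is already done:

```python
import numpy as np

rng = np.random.default_rng(0)
STATE_DIM, HIDDEN, N_ACTIONS = 128, 64, 20  # invented sizes

# Randomly initialized weights; training (supervised on replays, then
# reinforcement) would adjust these.
W1 = rng.normal(0.0, 0.1, (STATE_DIM, HIDDEN))
W2 = rng.normal(0.0, 0.1, (HIDDEN, N_ACTIONS))

def policy(state_features: np.ndarray) -> np.ndarray:
    """Map a coarse state encoding to a distribution over coarse moves."""
    h = np.tanh(state_features @ W1)
    logits = h @ W2
    e = np.exp(logits - logits.max())   # numerically stable softmax
    return e / e.sum()

probs = policy(rng.normal(size=STATE_DIM))
print(probs.sum(), probs.argmax())      # ~1.0, index of preferred move
```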
We are not trying to find the best moves possible. We are trying to beat the world champion, or someone similar. You are again making this harder than it is.
But you do that by approximating the best possible set of moves you can make. That's what a machine learning algorithm does. You start with an ideal (in this case, that there exists a maximally good set of moves) and you try to approximate that ideal as well as you can.
With unlimited resources perhaps, but a compromise may be good enough to beat a world champion, as Lee Sedol discovered. The games used to train the policy network were games from public Go servers. They weren't even professional games. It was good enough.
Yes, input needs to be limited. It's similar to chess: if we let the machine check an unlimited number of moves, it will find the winning move by brute force. That is a weak form of intelligence and has more in common with a machine. The interesting question is whether it can find a solution in fewer than N moves, and in the case of SC2, in fewer than N inputs.
The second point is unit recognition. A human generally focuses on the center of the screen. The AI will focus on the whole screen and have a clear advantage.
The third point is on-the-fly adaptation in an incomplete-information problem. The advantage clearly goes to the human brain, because it is literally made for that task. Think of the micro you do: even if it's muscle memory, you adapt on the fly, e.g. skill recognition: How well does he micro? Can I out-micro him? How can I minimize my costs?
The question I am interested in is whether it finds a better strategy than what we call standard today.
It's like everyone is just willfully ignoring lichter and the fact that this AI will have to learn how to select its command center, how to create hotkeys, and how to navigate the interface based only on visual input. They will make it use at least an emulation of a keyboard and mouse. It will not have access to the game code. Any deviation from this would defeat the entire purpose of DeepMind's AI. If you watch the video where they show off the Atari AI, in the first two hundred games of Breakout it played, it could hardly hit the ball with the paddle.
If google can manage an AI that teaches itself how to become the automaton 2000 with scouting and decision making from scratch I will still be completely amazed.
Anyone who still has doubts, watch this video from the 9:14 mark.
On March 29 2016 05:43 chocorush wrote: I think it will be more interesting to see how the AI does first before we start worrying about handicapping for it to be "fair" (is the point of showing an AI that can play Starcraft supposed to be fair?).
Considering how many seconds even the simplest move took to compute in Go, I'm not convinced the AI's APM will be high enough that we need to worry about the engineering problem more than the computer science problem.
The problem is the following. If you allow unlimited APM, it will move all units away from splash and keep units at a distance. That's a few lines of code; they won't even need a real AI, and the human will be helpless. But that has more in common with a machine than with intelligence. It's not the question we seek to answer. We want to know if it can solve the problem in fewer than N inputs. That requires strategy!
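A minimal sketch of the "few lines of code" in question. The Unit class and the shot list are hypothetical stand-ins, not a real Starcraft interface:

```python
import math

SPLASH_RADIUS = 2.5  # invented number

class Unit:
    """Hypothetical stand-in for a controllable unit."""
    def __init__(self, x: float, y: float):
        self.x, self.y = x, y
    def move_away_from(self, x: float, y: float, step: float) -> None:
        d = math.hypot(self.x - x, self.y - y) or 1.0
        self.x += (self.x - x) / d * step
        self.y += (self.y - y) / d * step

def dodge_splash(my_units, predicted_splash_points):
    # Run every frame: any unit standing inside a predicted splash zone
    # steps directly away from its center. With no APM cap this is
    # near-perfect splash avoidance, and it involves no learning at all.
    for u in my_units:
        for sx, sy in predicted_splash_points:
            if math.hypot(u.x - sx, u.y - sy) < SPLASH_RADIUS:
                u.move_away_from(sx, sy, step=1.0)

units = [Unit(0.0, 0.0)]
dodge_splash(units, predicted_splash_points=[(1.0, 0.0)])
print(units[0].x, units[0].y)  # the unit stepped away from the shot
```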
Is that really the end result of unlimited APM? I expect the computer to learn that focusing only on micro won't win games. And how will it attack if it's scripted to stay out of range? It will still need to learn army composition and how to engage strategically.
Just requiring it to follow the same rules as humans, like only being able to micro units while its focus is on that part of the screen, will force it to learn to ration its focus as a resource.
I don't expect it to be able to micro while thinking efficiently, unless it learns to do a lot of meaningless spamming while it decides the next optimal move, which is pretty humanlike already.
As Artosis put it in two articles on ESPN, not limiting APM would be like putting a world-class runner in a race with a car: completely unfair. (Good articles, by the way.)
Putting humans against computers in chess is also completely unfair. That doesn't make the AI problem illegitimate, and it's not like the technology is even there to make the right decision fast enough. If the AI takes one second to decide an optimal move, how much APM does it really have?
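For scale: at one decision per second, the bot gets 1 action/s × 60 s/min = 60 APM, a fraction of the 250+ APM professionals routinely sustain (the exact pro figure varies, but it is several times higher). On that arithmetic, decision speed, not an APM cap, may be the binding constraint.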
A hybrid of scripting at the small scale and deep learning at the large scale may be the best way to go, but there's still a huge problem, one that doesn't exist in simply defined discrete board games like Go. As you can probably predict I'm going to say, that problem is the coarse-graining of the data itself. I think you're really underestimating the scale of this problem.
The problem divides into two parts: actually defining the set of "rules" (i.e. possible actions) that the computer can take, and then getting an algorithm to recognize, from the metadata of historic games, when those actions have actually been taken (i.e. transforming the discrete actions in the game into the moves you created in your state space).
The first is problematic because it's difficult to make the choices. You have to strike a balance between making the set of actions robust enough that the algorithm can actually learn interesting strategies on its own, yet constrained enough that the computation is tractable. Any time you design a new "move" you have to ask yourself, "how badly did I just restrict the freedom of the algorithm? I just set move A and move B to be equivalent, but in how many scenarios are the outcomes of those moves drastically different?" With a game like Starcraft, which is extremely sensitive to small missteps, you're pretty much never going to do this the "right" way. But will the cumulative flaws of your coarse-graining scheme be small enough not to completely botch the execution? That's hard to answer, and I honestly don't know. But this is a huge problem, and it doesn't even exist for Go.
The second part is actually recognizing when the "moves" you defined have been executed, based on the metadata of a game. The only really objective way to store the data of a game is to store the exact actions, since machines don't recognize heuristics such as "he transitioned from ultras to mutas and then diverted his forces to his main to harass his fourth". So you have to be able to recognize which sets of actions correspond to which "moves" in your coarse-grained space, and that's well outside the realm of the methods they used for the Go algorithm.
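As a sketch of how crude that recognition step starts out (the event names and threshold rules are invented, and a real classifier would need far more context than one event window):

```python
from collections import Counter

def label_window(events):
    """Map a window of raw replay events to one hand-defined coarse move.

    Toy rules over invented event kinds; transitions like "switched from
    ultras to mutas, then harassed the fourth" are invisible to this.
    """
    kinds = Counter(e["kind"] for e in events)
    if kinds["build_expansion"] > 0:
        return "EXPAND"
    if kinds["attack_command"] >= 3:
        return "ATTACK"
    if kinds["train_unit"] >= 5:
        return "BUILD_ARMY"
    return "UNKNOWN"

window = [{"kind": "train_unit"}] * 6 + [{"kind": "attack_command"}]
print(label_window(window))  # -> BUILD_ARMY
```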
Now maybe what I described above is doable (assuming they mix deep learning for high-level strategy with scripting for small-scale actions). In fact I hope it is, and I hope they make an amazing algorithm that can beat anyone. But it's way harder. What I have been trying to argue here is that it's harder because of the sheer complexity of the space of possible actions in Starcraft compared to Go. The problems I listed above exist entirely because of that complexity. And the more sensitive the outcome of the game is to small deviations in those discrete actions, the harder they are to coarse-grain. And this is all just to create a workable training set for the model, whereas doing the same in Go is pretty much trivial, and explained in a couple of sentences in the paper.
We see videos of micro where a-moved banelings follow marines off creep and lose. Maybe a human player won't a-move his banelings off creep? Or where zerglings avoid splash from tanks; can they also do that against marines? Or medivac+tank micro against projectile attacks; maybe the human will target the medivac or build air units?
The amazing micro tricks an AI can do are limited in application, but we see the videos and focus so much on how amazing one side's micro is that we neglect to consider that the other player wouldn't take that engagement.
On March 28 2016 13:46 a4bisu wrote: APM is not the point. A large portion of human APM is meaningless, just warming up toward spikes in big fights.
SC is a totally different game from Go. People call it a game of imperfect information, meaning the AI does not know exactly what its opponent is doing in an SC game, not as completely as the black and white stones on a Go board.
Suppose two medivacs are approaching the zerg AI's 2nd base. The AI does not know what's inside: marines, marines and mines, or nothing? The AI needs to predict the possible drop location: the main base, the 2nd base, or one drop at each? All the possible scenarios require defense strategies with consideration of limited resources and efficiency. And these scenarios develop in real time and can shift from one to another within milliseconds as the medivacs make a boosted turn.
People keep bringing this up, but how do people do it? They use their experience to predict what's going to happen. Sometimes they are wrong, sometimes they are right. The AI will be the same, probably a bit better, because its recall of information revealed in the game is perfect (unlike a person, who can forget some small thing).
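That "experience-based prediction" can be sketched as a belief update over drop targets (all numbers below are invented for illustration):

```python
# Prior belief about where the medivacs are headed, from "experience".
prior = {"main": 0.5, "natural": 0.3, "one_drop_each": 0.2}

# Invented likelihood of the observed flight path under each hypothesis.
likelihood = {"main": 0.7, "natural": 0.2, "one_drop_each": 0.4}

# Bayes update: posterior is proportional to prior * likelihood.
post = {h: prior[h] * likelihood[h] for h in prior}
z = sum(post.values())
post = {h: round(p / z, 3) for h, p in post.items()}
print(post)  # belief shifts toward the main-base drop
```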
Putting humans against computers in chess is also completely unfair. That doesn't make the AI problem illegitimate, and it's not like the technology is even there to make the right decision fast enough. If the AI takes one second to decide an optimal move, how much APM does it really have?
Invalid comparison; computers in chess are not AI. They're exactly what we're trying to avoid.
That kind of unlimited-APM micro is so stupidly effective that the bot would hardly need to play intelligently to secure the win. It defeats the purpose of what we're trying to achieve here.
Please explain how computer chess is not AI if you want to invalidate the comparison.
Apologies for answering something not directed at me, but one of the main goals of DeepMind's AI is to create "General AI" rather than "Focused AI". So while a chess bot is certainly 'AI', it's not the TYPE of AI that DeepMind is looking to create. The video I linked earlier explains this in more depth, before the Atari part.