|
I think you may be making a mistake here. If you cap AI mechanical performance to something reasonably high (350, say), then humans and AI are both approaching if not basically at the asymptotes for win% gain on the mechanical front. In other words, improving your AI's mechanics by a lot over these 1000 games per day isn't going to give you much of a gain in your AI's ability to win games. Most games among pros are not won on the basis of mechanics alone. Most of it is based on information, the inferences made from that information, and proper response. Mechanics is easy. How you approach any given situation given the information you have is hard.
The point that a lot of people keep bringing up in terms of the AI's shortcomings is the strategic and situational variability. Again, 1000 games is nice, but you need to be able to form good generalizations over those games in order for them to apply in a given circumstance. If you're playing 1000 games a day for 2 years of development, I can't see how you're not overfitting. Top pros aren't approaching the game from the standpoint of a massive chunk of data. They have already extracted the meaningful generalizations about most situations. 1000 games a day isn't going to do much but give the AI improvements in the marginal areas of win% gain. I say this because "strategy" and mechanics aren't so much where the game is won.
The bulk of the game is scouting and reacting. It's about knowing the right inferences to make for a relatively small amount of information. The right way to approach teaching an AI how to do that may or may not take the form of a massive chunk of data, that's an empirical question, but given the methods that will probably be used to train these AIs, tuning them to make the right inferences for an enormous space of possibilities is a huge challenge. But that's where games are won. Some are won with mechanics, sure, and some are won with strokes of brilliant strategy, but in reality, most games are won by making accurate inferences from little information and then knowing the right response and executing it.
That's basically the opposite of what AI is good at. AI is good at making accurate inferences from an enormous quantity of information, especially when there's no information asymmetry. It's a much tougher task than you're making it out to be.
I agree with most of what you said about "strategy" and mechanics and how scouting/reacting is most crucial to winning games. However, I think you may be looking at this from the wrong perspective as a human. Scouting and reacting are not human-exclusive abilities. They are still within the boundaries of learnable information during training. For example, as Zerg, the AI can generalize the strategy as: "if I didn't see a natural at X minutes, I need to sacrifice an overlord to scout; if I see Y of a certain unit, I need to adopt plan B", etc. If the game samples for training are carefully chosen to cover a wide range of excellent scouting/reactive actions, then in theory the AI has no problem learning from them. It's no different from, say, learning active actions like build-order "strategy" and mechanics.
To elaborate more: for the double medivac drop in TvZ, the Zerg AI can precisely keep track of the exact number of marines and any other units/SCVs and build an optimized defense based on map length, and is thus able to maximize drone count before making defensive lings at the last moment. And it has a lot of wiggle room to decide on the best number of lings depending on the map and other factors that even top human players cannot keep track of.
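As a rough illustration, the kind of reactive rule described above could be written down (or learned) as something like the following sketch; the unit ratios, medivac speed, and larva timing are placeholders, not balance-accurate numbers:

```python
# Map scouted information to a defensive response, and defer the response as
# long as the rush distance allows so droning can continue until the last moment.

def defensive_lings_needed(marines_seen, medivacs_seen):
    """Rough rule of thumb: enough lings to trade with the drop (placeholder ratio)."""
    return marines_seen * 3 + medivacs_seen * 2

def seconds_until_drop(rush_distance, medivac_speed=3.5):
    """Time the defender has before the drop arrives (hypothetical units)."""
    return rush_distance / medivac_speed

def plan_defense(marines_seen, medivacs_seen, rush_distance, larva_cycle=11.0):
    lings = defensive_lings_needed(marines_seen, medivacs_seen)
    # Keep droning until the last moment that still lets the lings finish in time.
    drone_window = max(0.0, seconds_until_drop(rush_distance) - larva_cycle)
    return {"lings": lings, "keep_droning_for_seconds": drone_window}

print(plan_defense(marines_seen=8, medivacs_seen=2, rush_distance=140.0))
```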
|
I came here for jokes about Innovation and found none. What has happened to all the quality shitposting in this place?!
|
On May 27 2017 00:42 Heartland wrote: I came here for jokes about Innovation and found none. What has happened to all the quality shitposting in this place?!
We're in mourning 
World's best Go player flummoxed by Google’s ‘godlike’ AlphaGo AI https://www.theguardian.com/technology/2017/may/23/alphago-google-ai-beats-ke-jie-china-go
After his defeat, a visibly flummoxed Ke – who last year declared he would never lose to an AI opponent – said AlphaGo had become too strong for humans, despite the razor-thin half-point winning margin.
“I feel like his game is more and more like the ‘Go god’. Really, it is brilliant,” he said.
Ke vowed never again to subject himself to the “horrible experience”.
|
On May 27 2017 00:40 cutha wrote: […]
I don't think we really disagree here at a fundamental level. I agree that the AI can learn a lot of the things that are needed. At a general level, I was disagreeing with two ideas that I've seen presented. First, that an AI learning Starcraft is a "lots of data" question, which is the answer to a lot of learning problems but for various reasons I contest that in this case. Second, that it's in the margins of mechanics or strategic insight that the AI will win games. It's going to have to win games just like everybody else: making inferences from limited information. I think we probably agree on both of these points.
I think where we probably disagree is that I don't think the training method is best done with a careful sample. I just really really don't think that Starcraft is the kind of problem that can be solved in the way that games like Go or Chess are. Those you can train with thousands if not millions of games and get great results. But in both Chess and Go, the whole board is known completely to both players. The AI doesn't have to make inferences about what the actual state of affairs is, because the actual state of affairs is known. When it has to start making those judgments, even if they are high-reliability judgments like "if I didn't see a natural at X minutes, then do Y", you're opening up a brand new world of complexity.
|
Neural networks are already known to be strong classifiers of X or not X (e.g. spam or not spam). Thus, they already make inferences from limited information.
|
While it's true that AIs have a harder time in partially observable environments, I don't think it'll take more than a decade for AIs to beat humans at SC2, and that's a conservative timeline in my opinion. Just two years ago, Go AIs weren't predicted to beat humans for another 30 years.
But if I were to build a NN to determine whether a mail is spam, I would feed it the whole email instead of a few binary values on whether a word is present or not. This sounds more like a naive Bayes approach.
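For reference, the "binary word features" approach being contrasted with a full-text NN here is essentially Bernoulli naive Bayes; a minimal from-scratch sketch, with a made-up vocabulary and toy training mails:

```python
# Bernoulli naive Bayes over a tiny toy vocabulary: each mail becomes a vector
# of 0/1 flags ("is this word present?"), and classes are scored with
# log-probabilities plus Laplace smoothing.
import math

VOCAB = ["free", "winner", "meeting", "project"]

def featurize(text):
    words = set(text.lower().split())
    return [1 if w in words else 0 for w in VOCAB]

def train(samples, labels):
    model = {}
    for c in (0, 1):  # 0 = ham, 1 = spam
        rows = [f for f, y in zip(samples, labels) if y == c]
        model[c] = {
            "prior": math.log(len(rows) / len(samples)),
            "p_word": [(sum(r[i] for r in rows) + 1) / (len(rows) + 2)
                       for i in range(len(VOCAB))],
        }
    return model

def predict(model, features):
    scores = {}
    for c, m in model.items():
        score = m["prior"]
        for x, p in zip(features, m["p_word"]):
            score += math.log(p if x else 1 - p)
        scores[c] = score
    return max(scores, key=scores.get)

mails = ["free winner prize", "meeting about the project", "free project meeting", "winner winner"]
labels = [1, 0, 0, 1]
model = train([featurize(m) for m in mails], labels)
print(predict(model, featurize("claim your free prize winner")))  # -> 1 (spam)
```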
|
I heard Google's new AI "AlphaSC2" is ready and will be tested tomorrow in the GSL.
|
I don't know why it's put up as some mystical bonjwa inference mastery on predicting possibilities of your opponent's build order and strategy. I don't think it's all that complicated of a decision tree.
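To illustrate, a toy version of such a decision tree might look like this; the timings, thresholds, and responses are entirely made up:

```python
# Toy build-order inference for a Zerg player scouting Terran: map a few scouted
# facts to a guess about the opponent's plan and a suggested response.

def infer_terran_plan(scout):
    if scout.get("gas_count", 0) == 0 and scout["time"] > 100:
        return "likely proxy or CC-first; check map corners and expansions"
    if scout.get("barracks", 0) >= 3:
        return "likely early bio pressure; add lings and a second queen"
    if scout.get("factory_with_reactor"):
        return "possible hellion opening; wall the natural and hold lings back"
    return "standard opening; keep droning, re-scout later"

print(infer_terran_plan({"time": 150, "gas_count": 1, "barracks": 3}))
```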
|
If AI is allowed unlimited or 1000+ APM at all times then no human will beat it within a year. If they were given a cap of 400 then I don't see an AI beating a human for a long time.
|
Nah, the AI will adapt. It might even use its extra computational power to, in 1 ms, assess which of 10-100 potential actions is likely to have the most effect on its chances of winning. Sort of a real-time Most Effective Actions calculator.
This would be interesting, as it could be tuned to always keep its APM lower than its opponent's.
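A minimal sketch of what such a loop could look like, with a placeholder `estimate_win_delta` model and a made-up threshold:

```python
# Score a handful of candidate actions with some learned value estimate and only
# spend an action when the best candidate beats "do nothing" by enough, while an
# APM throttle keeps the action rate under a budget.
import time

def choose_action(candidates, estimate_win_delta, threshold=0.001):
    scored = [(estimate_win_delta(a), a) for a in candidates]
    best_delta, best_action = max(scored, key=lambda pair: pair[0])
    return best_action if best_delta > threshold else None  # None = save the APM

class ApmThrottle:
    def __init__(self, apm_cap):
        self.min_interval = 60.0 / apm_cap  # seconds between actions at the cap
        self.last_action = float("-inf")

    def may_act(self):
        now = time.monotonic()
        if now - self.last_action >= self.min_interval:
            self.last_action = now
            return True
        return False
```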
|
I think where we probably disagree is that I don't think the training method is best done with a careful sample. I just really really don't think that Starcraft is the kind of problem that can be solved in the way that games like Go or Chess are. Those you can train with thousands if not millions of games and get great results. But in both Chess and Go, the whole board is known completely to both players. The AI doesn't have to make inferences about what the actual state of affairs is, because the actual state of affairs is known. When it has to start making those judgments, even if they are high-reliability judgments like "if I didn't see a natural at X minutes, then do Y", you're opening up a brand new world of complexity.
I did misinterpret you in the previous post. But I think what I said still stands - all winning strategies, regardless of form, be it reactive defense, aggressive all-in, or pure superior mechanics, are reasonably trainable knowledge. What you are basically saying here is that it is impossible to make a "perfect" judgement due to the fog of war, so there always has to be some educated guessing and gambling involved in the game, and this is different from chess/Go, since all the pieces are always visible on the board. However, even while knowing exactly the current "state" of the game, AlphaGo plays by its trained neural network, which is based on human experience plus its own reinforcement learning. There is no way to play perfectly from the current state of the game because there is an unimaginably large number of variations for future moves. In this regard, the unknown factor due to the sheer number of variations is similar to the unknown factor in Starcraft 2 due to the fog of war.
If you compare the strategic complexity one player can employ in Go given a certain state of the board with the number of popular choices any top SC2 player would consider given an in-game situation, it seems to me SC2 is complete child's play. Think of it from another perspective: a top SC2 player needs to decide his reactive actions based on scouting information within seconds, but a top Go player may often need minutes for a single move. The hard part of SC2 for an AI is achieving balanced performance across a multitude of different aspects like mechanics, micro under restricted APM, reactive actions, etc. But for the strategic part, if AlphaGo can conquer Go, SC2 is a no-brainer in my opinion.
|
This is laughable.
It would probably be pretty easy to make an AI that dominates humans.
-> If there is no APM limit, then I guess we all agree. For example, just pick Zerg and go muta.
-> With an APM limit, still go for attention-intensive strategies. Let's not forget that even tho the computer can only use a limited amount of APM, it can still 'think' a LOT about every single click. From the point of view of mechanics, it could be better than Flash playing the game on the slowest speed setting.
|
Depends - would the AI be able to have unlimited APM? Or would there be a cap on APM? If there is an APM cap, then strategy would be more important, and it would have a tougher time.
One of the key ideas that made AlphaGo work is that they looked at the probability either side would win from a given board position if the rest of the game were played out using random moves. They then ran Monte Carlo simulations to play those games out, and used the results to evaluate how good a position was. That assumption won't work in a game like StarCraft.
https://www.tastehit.com/blog/google-deepmind-alphago-how-it-works/
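For anyone curious, a minimal sketch of that rollout idea for an abstract turn-based game; the `game` interface (`legal_moves`/`apply`/`winner`) here is hypothetical:

```python
# Estimate how good a position is by finishing the game many times with random
# moves and averaging the outcomes; this is the evaluation described above.
import random

def rollout_value(game, position, player, n_playouts=1000):
    wins = 0
    for _ in range(n_playouts):
        state = game.copy(position)
        while not game.is_over(state):
            state = game.apply(state, random.choice(game.legal_moves(state)))
        if game.winner(state) == player:
            wins += 1
    return wins / n_playouts  # estimated win probability from this position
```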
|
So, in the article
"AlphaGo relies on two different components: A tree search procedure, and convolutional networks that guide the tree search procedure. The convolutional networks are conceptually somewhat similar to the evaluation function in Deep Blue, except that they are learned and not designed. The tree search procedure can be regarded as a brute-force approach, whereas the convolutional networks provide a level on intuition to the game-play."
The monte carlo method that you mention is the tree searching, but, as above, there seems to be more to AlphaGo.
Of course, they will have to build new models for starcraft, otherwise the notion of a 'move' isn't well defined even.
|
On May 27 2017 03:52 niteReloaded wrote: This is laughable.
It would probably be pretty easy to make an AI that dominates humans.
-> If there is no APM limit, then I guess we all agree. For example, just pick Zerg and go muta.
-> With an APM limit, still go for attention-intensive strategies. Let's not forget that even tho the computer can only use a limited amount of APM, it can still 'think' a LOT about every single click. From the point of view of mechanics, it could be better than Flash playing the game on the slowest speed setting.
That defeats the entire exercise of making the AI. It's supposed to try to outsmart its opponent, so the APM will be limited.
|
Can't wait to see what race the AI favors. This might even change depending on what APM setting it's on. Well, and the map, come to think of it.
Apparently in Go, it gives a slight edge to the white stones (playing 2nd).
Unlike in the first round, AlphaGo played the black stones, which means it played first, something it views as a small handicap. "It thinks there is a just a slight advantage to the player taking the white stones,” AlphaGo’s lead researcher, David Silver, said just before the game. And as match commentator Andrew Jackson pointed out, Ke Jie is known for playing well with white.
Oh, it also defeated a team of 5 Champions today.
|
Let's not forget that even tho the computer can only use a limited amount of APM, it can still 'think' a LOT about every single click.
Ya, considering 400 APM gives it an average of 150 milliseconds per click and modern processors run at around 4 GHz, that's roughly 600 million raw CPU cycles per click, and "AlphaGo ran on 48 CPUs and 8 GPUs and the distributed version of AlphaGo ran on 1202 CPUs and 176 GPUs."
|
4 pool vs the computer. All that counts is micro. No macro can save it
|
On May 27 2017 04:28 mishimaBeef wrote: So, in the article
"AlphaGo relies on two different components: A tree search procedure, and convolutional networks that guide the tree search procedure. The convolutional networks are conceptually somewhat similar to the evaluation function in Deep Blue, except that they are learned and not designed. The tree search procedure can be regarded as a brute-force approach, whereas the convolutional networks provide a level of intuition to the game-play."
The Monte Carlo method that you mention is the tree search, but, as above, there seems to be more to AlphaGo.
Of course, they will have to build new models for StarCraft, otherwise the notion of a 'move' isn't even well defined.
Also in the article: value of a state = value network output + simulation result
I'd be interested to see how much they weighted the Monte Carlo part vs the value network (the convolutional neural net). It sounds like trying either one solo did worse than the combination, so both are needed. But I don't think the Monte Carlo part would work in StarCraft, because you can't just play random moves in an RTS. Furthermore, in a turn-based game you can only make one move per turn, so you can easily simulate the resulting positions from a current position. In an RTS you can move multiple units, each with different abilities, and the combinatorial explosion would be disastrous.
Still, if I understand the article correctly, the neural net was used to evaluate positions and classify them as "good" or "bad". It was trained by playing games against itself. The input to the neural net would presumably be the positions of the pieces. Currently neural networks take a long time to train, and with every hidden layer you add, training gets slower. In a game like StarCraft, far more inputs would be needed to represent a given position than in Go, and getting the NN to converge would take much longer.
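If I remember the AlphaGo paper correctly, the two evaluations are blended with a mixing weight (around 0.5), so a leaf is valued partly by the network and partly by a fast rollout. A tiny sketch, with both evaluators left as placeholder functions:

```python
# value of a state = (1 - lam) * value network output + lam * rollout result
def leaf_value(value_net, rollout, state, lam=0.5):
    v = value_net(state)   # value network's win-probability estimate in [0, 1]
    z = rollout(state)     # outcome of a fast rollout from this state (0 or 1)
    return (1 - lam) * v + lam * z
```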
|
Yeah, if you consider move = click, then it explodes. But usually you think in terms of high-level "moves" (tech to vessel, pump marine medic, deflect muta) and use clicks to implement the higher-level strategic "moves".
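Something like this toy sketch (all move names and click sequences made up) is roughly what that abstraction looks like; the search layer only branches over a handful of strategic moves, and a lower layer expands each into clicks:

```python
HIGH_LEVEL_MOVES = {
    "tech_to_vessel": ["build_starport", "add_control_tower", "research_irradiate"],
    "pump_marine_medic": ["queue_marines", "queue_medics", "rally_to_front"],
    "deflect_muta": ["group_turrets", "pull_marines_home", "spread_against_muta"],
}

def expand(move):
    """Translate one strategic 'move' into the low-level actions (clicks) that implement it."""
    return HIGH_LEVEL_MOVES[move]

# Branching factor for the planner is len(HIGH_LEVEL_MOVES) per decision,
# instead of every possible click, which keeps the explosion manageable.
print(expand("deflect_muta"))
```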
|