In Starcraft II, the meta-game (often shortened to simply “meta”) refers to strategic decisions made on limited information. While the meta has many components, one of the most important decisions is which units and buildings to build first. This is known as the “opening”, one of many Starcraft terms borrowed from chess terminology. Interestingly, there is no universal “best” opening for any of the three races (Protoss, Terran, or Zerg). In other words, if a player, A, knows ahead of time what opening their opponent, B, will choose, then A always has the opportunity to choose an opening which will give A a significant advantage. This is called “countering” B's strategy. In extreme cases, where it is unreasonable to hope that B can overcome the disadvantage of being countered, B's eventual defeat is known as a “build order loss”. Conversely, A's victory is a “build order win”. While there is a large number of opening strategies, we can begin to understand how they interact by simplification: we divide them into three broad categories.
* Aggressive openings – aggressive openings emphasize fielding military units early and attacking the opponent quickly. In general, aggressive openings sacrifice long-term economic development in exchange for short-term military capability.
* Standard openings – standard openings are balanced strategies which both field enough troops to keep a player safe from enemy attacks and develop an economy that can support expensive armies later in the game.
* Greedy openings – greedy openings neglect safety from enemy attacks in order to expand the economy as fast as possible. If the opponent under-reacts, these openings are the ideal setup for a deadly late-game army.
In the simplest conceptual model, these three categories of strategy interact just like a game of rock-paper-scissors:
However, the magnitude of a player's build order advantage is not always the same across differing strategic interactions. To put it in terms of our simple example, a player with a standard strategy is only at a slight disadvantage against a greedy one, compared to the outright build order loss of cheese running into standard play that holds it off, or of a greedy style broken down by cheese.
Keeping this asymmetry in mind, we can build a basic theoretical model to get a rough understanding of how the SC2 meta-game works. Here, (x, y) represents (player 1 payoff, player 2 payoff). If you want to make this more concrete, you can pretend each payoff point represents a 25% swing in the chance of winning: a “−2” is a 0% chance of winning, a “−1” a 25% chance, and so on. However, the magnitude of this correspondence is arbitrary; a “payoff point” could represent, say, a 5% advantage without affecting the mathematical results. The numbers here are just an example; the relative strengths of the openings are less important than understanding how they interact.
There are several things of note here. First, the two numbers in parentheses always add up to 0. This is no accident: Starcraft II is a zero-sum game. If one player has an advantage, the other is at a disadvantage; if one player wins, the other must lose. Second, the payoffs are mirror images across the diagonal of the above table (a symmetric game). Unlike the zero-sum condition, this is an artifact of our simple model. In reality, differences in races, skill, knowledge, map balance, spawning positions, etc. will almost always disrupt this symmetry to some extent.
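To make the model concrete, here is a sketch of the payoff table as a matrix. The exact numbers are an illustrative assumption consistent with the asymmetry described earlier, not necessarily the values in the table above:

```python
import numpy as np

# Rows and columns are indexed 0 = aggressive, 1 = standard, 2 = greedy.
# P[i, j] is player 1's payoff when playing opening i against opening j;
# player 2's payoff is -P[i, j] (the zero-sum condition).
# Values are assumed for illustration: standard crushes aggressive (+/-2),
# aggressive crushes greedy (+/-2), greedy only slightly beats standard (+/-1).
P = np.array([
    [ 0, -2,  2],  # aggressive vs. aggressive, standard, greedy
    [ 2,  0, -1],  # standard
    [-2,  1,  0],  # greedy
])

# The mirror-image symmetry noted above makes the matrix antisymmetric:
# P[i, j] == -P[j, i] for every pair of openings.
print(np.array_equal(P, -P.T))  # True
```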
There is an important distinction to make between a build order and an overall strategy. A very simple strategy may well be strict adherence to a single build order, but strategies can be a whole lot more: you can take account of the strengths and weaknesses of the individual opponent, map, race, etc., or employ a mix of different builds. In game theory, a simple one-build strategy is a “pure strategy”, whereas a “mixed strategy” picks from two or more builds with some given set of probabilities.
Let's take a look at the heart of meta-game strategy: countering your opponent. Pure strategies are simple to counter (e.g. if you know your opponent always cheeses, your optimal response is to always play defensively). Mixed strategies are slightly more complicated to counter. However, because our model is straightforward, even the general case is not too complex:
For our strategy (x1, x2, x3), where x1, x2, x3 ≥ 0 and x1 + x2 + x3 = 1, a payoff matrix Pij, and a fixed opponent strategy (y1, y2, y3), the average payoff is the sum over all i and j of xi·Pij·yj.
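As a sketch (again using assumed illustrative payoff values), the average payoff is just the matrix product x·P·y:

```python
import numpy as np

# Illustrative payoff matrix; rows/columns = aggressive, standard, greedy.
P = np.array([[0, -2, 2], [2, 0, -1], [-2, 1, 0]])

def average_payoff(x, y):
    """Expected payoff for the row player: sum over i, j of x_i * P_ij * y_j."""
    return np.asarray(x) @ P @ np.asarray(y)

# Pure standard vs. pure greedy: standard is at a slight disadvantage.
print(average_payoff([0, 1, 0], [0, 0, 1]))  # -1
```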
For our strategy to be optimal, the payoff must be at a maximum. This can come in several forms: a mix of one, two, or three strategies. Given that the one- and two-strategy cases are trivial, we can focus on the mix-of-three case, which we can solve explicitly by differentiating with respect to x1, x2, and x3. Treating the payoff as a multivariable function, all critical points must satisfy:
So y2 = y3, 2y1 = y3, and 2y1 = y2. Combined with the constraint y1 + y2 + y3 = 1, this gives y1 = 1/5 and y2 = y3 = 2/5. Thus, our extremum is when our opponent plays (20% aggressive, 40% standard, 40% greedy).
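We can sanity-check this point numerically with assumed illustrative payoffs: against the mix (0.2, 0.4, 0.4), every pure strategy earns exactly the same payoff, which is the indifference condition the derivatives encode.

```python
import numpy as np

P = np.array([[0, -2, 2], [2, 0, -1], [-2, 1, 0]])  # assumed illustrative payoffs
y = np.array([0.2, 0.4, 0.4])                       # candidate equilibrium mix

# Payoff of each pure strategy (aggressive, standard, greedy) against the mix.
print(P @ y)  # [0. 0. 0.] -- no pure response does better than any other
```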
What is the nature of this point? Without going deeper into the mathematics, the answer is a Nash equilibrium. In a Nash equilibrium, neither player can improve their average payoff by switching strategies. Specifically, because of the zero-sum condition, our model fits the conditions of von Neumann's minimax theorem, and this point is the minimax point. For a general minimax point, all non-dominated pure strategies fare equally well against the (mixed) minimax strategy. In a symmetric game like our model (where Pij = −Pji, since payoffs are written from player 1's perspective), this implies that players can adopt a “safe” strategy which averages out to even games, even if the opponent knows the exact percentages of that safe strategy.
Note that the equilibrium strategy is not necessarily an even mix of viable builds. Determining the Nash equilibrium depends not only on having a range of builds to draw from, but also on understanding the strengths and weaknesses of each of them.
So why would players ever deviate from the Nash equilibrium? Does that only happen when they miscalculate the equilibrium solution? Not at all. The equilibrium in our model is not stable. That is, even though neither player gains anything from switching strategies in the long run, neither player loses anything from switching strategies either. An individual equilibrium-vs.-unbalanced game is just as dead even as an equilibrium-vs.-equilibrium one. Therefore, the other viable option is an “exploitative” strategy, which can pull a consistent long-term advantage by taking advantage of imbalance in the opponent's (or opponents') meta-game strategy, and does no worse than breaking even if there are no meta-game imbalances. However, unlike a safe strategy, an exploitative strategy can also result in a long-term disadvantage if the opponent is able to consistently anticipate how you play.
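Using the same assumed payoff numbers, a quick sketch of this trade-off: against a hypothetical over-greedy opponent, an all-aggressive exploit earns far more than the safe equilibrium mix, but it is crushed if the opponent anticipates it.

```python
import numpy as np

P = np.array([[0, -2, 2], [2, 0, -1], [-2, 1, 0]])  # assumed illustrative payoffs

safe = np.array([0.2, 0.4, 0.4])     # the equilibrium ("safe") mix
exploit = np.array([1.0, 0.0, 0.0])  # all-in on aggressive

too_greedy = np.array([0.0, 0.2, 0.8])  # hypothetical imbalanced opponent
counter = np.array([0.0, 1.0, 0.0])     # opponent who anticipates the exploit

print(safe @ P @ too_greedy)     # 0.0 -- safe breaks even no matter what
print(exploit @ P @ too_greedy)  # ~1.2 -- the exploit cashes in on the imbalance
print(exploit @ P @ counter)     # -2.0 -- but loses badly when anticipated
```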
The safeness of a meta-game strategy is not binary; it is possible to hedge bets between a completely “safe” strategy and a purely “exploitative” one. A slightly imbalanced strategy will yield much less advantage (against a non-equilibrium opponent) than a full imbalance, though it comes with much less risk. (This risk/reward dynamic is in the nature of a zero-sum game.)
Despite the fact that exploitative player behavior can remain outside the Nash equilibrium indefinitely (i.e. you can choose to play greedy every game), interestingly, there is a tendency for the aggregate strategy to remain close to the Nash equilibrium as long as the players are part of what you might call an “open system”, in analogy to thermodynamics. This is true even in the extreme case where there are no equilibrium-strategy players. The mechanism that “enforces” the meta on a broad scale is, of course, a player's hidden MMR. If the aggregate meta-game strays from the Nash equilibrium at a certain skill level, weaker players playing a counter-strategy will be brought into the system because of their comparative meta-game advantage (e.g. if all the masters players are playing greedy 100% of the time, aggressive diamond players will gain an advantage against the masters players, helping them toward promotion, whereas safe/greedy diamond players' chances will be unaffected or even worse). At the upper end of the league, counter-strategy players will remain near the top of the system, creating even stronger demand for counter players at the lower end. Note that this is not true at the bottom of the skill range, where there is no way either to demote bad players out of play or to bring in additional worse players to balance the meta.
Now for some prediction: we should expect the frequencies of a particular set of openers to be predictable from the win percentages of each build order match-up. If we know how good each strategy is, we can predict how often it is played.
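For a general three-build payoff matrix, the predicted frequencies are just the equilibrium mix, recoverable by solving the indifference conditions as a linear system. A sketch using assumed illustrative payoffs:

```python
import numpy as np

P = np.array([[0, -2, 2], [2, 0, -1], [-2, 1, 0]])  # assumed illustrative payoffs

# Indifference: each pure strategy does equally well against the mix y,
# i.e. (P @ y)[0] == (P @ y)[1] == (P @ y)[2], and y must sum to 1.
A = np.vstack([P[0] - P[1], P[1] - P[2], np.ones(3)])
b = np.array([0.0, 0.0, 1.0])
y = np.linalg.solve(A, b)
print(y)  # ~[0.2, 0.4, 0.4] -- the predicted opening frequencies
```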
As a test case, take the meta-game of ZvZ at the masters level. In order to test the three-build model, divide builds into three (nearly arbitrary) categories: pool before 75 seconds (aggressive), pool between 75 and 130 seconds (standard), and pool at 130 seconds or after (greedy). The main concern is that none of the strategies are dominated by the others, since dominance would make the results trivial (each of our pure strategies should be the optimal response to another of our pure strategies). Replays were worldwide ZvZ replays with an average league rating of "masters" during a roughly two-week period. After 101 replays, the results were tallied by the winner's strategy and loser's strategy:
Big thanks to both the sc2reader project for providing the tools for automated replay analysis, and ggtracker.com's collection of replays for providing the large number of replays needed for a reasonably representative sample.
As you can see, the resulting frequencies of each build are relatively well determined by our theoretical model. This would not generally be true if the values of Pij were arbitrary, because adding additional “standard” replays at the same win percentage would cause the predicted frequencies to be at odds with our sample. So the model both works and makes non-trivial predictions.
In summary, players can average out the effects of the meta-game in the long term by playing at the Nash equilibrium, a probabilistic mix of build openers. This is the “safe” meta-game strategy. Alternatively, if you believe the opponent is playing outside the Nash equilibrium, you can exploit this by taking the risk of purposeful imbalance in your own strategy. The higher the imbalance, the greater the potential advantage or disadvantage in payoff.
Overall, this is not intended to be a complete guide to how to open. The meta-game shifts all the time, and many meta-game skills are very difficult or impossible to capture mathematically, like extrapolating from limited data or creating better “boxes” in which to group strategies. This is just a basic framework for understanding what meta-game choices are available to a player and how these affect their chance of winning. It is also just the tip of the iceberg. There are obviously more than three openers, more meta-game decisions than the opening, asymmetry between the players, additional considerations for formats other than a best-of-1, and much, much more. The meta-game is a big place. I'd love to hear your thoughts.