Model for imbalance, with myths

Warble

May 20 2011 05:08 GMT

Warning: Maths.

Aim

The SC2 community is passionate about balance, with many feeling their race is imbalanced and others feeling that the game is balanced. What the 2 groups have in common is that their positions are based on feelings, although we have begun to see forays into more objective evidence using statistical data about the game.

Unfortunately, statistics can be counterintuitive (like the rest of maths, in my opinion), so it’s easy for misconceptions to arise amongst those who don’t use it regularly. I think it’s cool that people have started to use data to back up their beliefs, and hope to see more, so my contribution here is to improve the quality of analysis by laying some framework for the community, and also to dispel a few common misconceptions.

For a start, I believe that beginning with the idea of imbalance is the wrong approach. What a lot of people do is begin with the idea of imbalance, and then seek data to back up their opinion. It is better to develop an objective system that we can apply independent of our beliefs. That’s what I aim to start here, and I hope that we can develop and improve it together.

I believe most people will find the outcomes surprising. I have also tried to be thorough, but errors may have snuck through, so I will try to correct any errors as they are pointed out. Feel free to build on this or even create something better.

Here I will try to present some unbiased ways to think about balance in SC2. Some ways I will try to be unbiased are:

I will avoid reference to the current state of the game (jokes exempted) and just focus on general concepts.
I will present the methodology for identifying balance without having looked at the data. The idea here is that once we have developed an objective way to identify balance and applied it to the data, we must accept whatever conclusions it leads to. I invite everyone who wants to contribute to this to avoid looking at the data as much as possible too.
I will not propose any balance changes, either directly or by insinuation.

In my examples I will be using terran as the bad guy for illustrative purposes only, not because they’re necessarily overpowered. The Myth Busting section also has me profess my love for Jinro. Everybody loves the bad guy.

Background

First, a bit about me: I played zerg in season 1 (the reaper era) and switched to terran in season 2. All right, enough about me.

As for this analysis, since I started playing SC2, I have found the idea of balancing a game fascinating and have been playing around with the mathematics involved. I tried to develop some simple models to use when thinking about the game, but the level of complexity of the game is quite high. My aim was to be able to derive simple ideas about balance without having to resort to simulations.

So I’ll present here a simple model that doesn’t use any of that, but will provide the groundwork to help think about game balance, how it will affect players, and how it might be detected from game statistics. I’m sure Blizzard already has something much better in place, so this is aimed mostly at helping those in the SC2 community who seek to better understand the statistics related to this game.

Who should care?

If you’re interested in an objective way of thinking about game balance, this could be an interesting read for you, especially if you have a background in maths/statistics, although in that case this will all be very basic for you.

If you’re thinking about going pro, you could find this helpful in making your decision – or helping you to choose your race. I personally believe that SC2 is impossible to balance as it is, and might make another post explaining why if there are requests for it (I’m hoping HOTS will introduce different mechanics that will change things). If you’re going pro, you’ll be staking your future on the ability of Blizzard to balance the game – or perhaps to make your race OP.

If you run a website that gathers and presents SC2 statistics, you might consider a few of these ideas when developing tools for your site if your aim is to contribute to SC2 discourse.

If you are tired of balance discussions, you might gain a few ideas here.

If you would like to know how we can identify imbalance in the game using available statistics, go ahead.

Most of you should jump to the Outcomes tab and I think the Myth Busting tab can be interesting to a wider audience.

I think the vast majority of players won’t gain much from reading the technicalities. They’re there more so that those who understand it can review my analysis to make sure it’s tight and maybe even build on it. But we all know that even if you don’t have a technical background you’re going to skim through it anyway, so I’ve tried to make it easy to read and understand.

Preliminaries

Don’t worry, there won’t actually be that much maths here. It’s used mostly for convenience, because I still want to make this analysis accessible to many players. It’s just easier to present using concrete examples people can picture in their heads than with generalised equations.

Let’s start by defining our league distribution matrices for each race. We’ll consider bronze, silver, gold, platinum and diamond for a neat 20% split between our elements, with shorthand B, S, G, PL and D.

To make things easier, assume that each race has 100 players and has an even distribution between skill levels, i.e.:

T = P = Z = [20 20 20 20 20]

We assume that the probability of player i beating player j is a function f(x) in [0,1] where x = (player i's effective skill – player j’s effective skill) and that f’(x) > 0 for x > 0. We also stipulate that for x < 0, f(x) = 1 – f(|x|).

For those quivering in their boots right now, we’re basically saying that the higher skilled player is more likely to win the larger the skill gap, and then we just made sure our function was consistent in both directions.
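To make this concrete, here is a minimal Python sketch of one function that satisfies these conditions. The logistic curve and the `steepness` parameter are assumptions chosen purely for illustration; nothing in the model requires f to have this particular shape.

```python
import math


def win_probability(skill_gap, steepness=1.0):
    """Chance that player i beats player j, where
    skill_gap = (i's effective skill) - (j's effective skill).

    Assumption: a logistic curve. Any increasing curve that also
    satisfies f(-x) = 1 - f(x) would fit the conditions equally well.
    """
    return 1.0 / (1.0 + math.exp(-steepness * skill_gap))


if __name__ == "__main__":
    for gap in (-2, -1, 0, 1, 2):
        print(gap, round(win_probability(gap), 3))
```

With this choice, equally skilled players sit at exactly 50%, and the win chance for a gap of -x is always one minus the win chance for a gap of +x.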

We’ll also assume a uniform distribution of skill gaps, i.e. moving from B to S is just as difficult as moving from S to G.

We’ll assume that players are matched randomly against those around their rank, i.e. on average someone won’t get a lucky streak and have 50 games in a row against a race where they have a favourable racial imbalance (since it will be counterbalanced by those without lucky streaks).

Players will be put in a league where their effective skill means they have an average 50% chance of winning and losing against their equally ranked peers.

We also have some definitions:

Actual skill: the skill level of a player when there are no imbalances of any kind.

Effective skill: the skill level of a player after benefiting from imbalances. This will be matchup specific.

Racial imbalance: the imbalance in a race. Will be matchup specific. It contributes to a player’s effective skill. Keep in mind that we’ll only be applying this factor to one race for each imbalance to avoid double counting it.

Absolutely overpowered: a race is equally overpowered against both other races. So terran is absolutely overpowered if it is OP against Z and OP against P, and its level of OPness is the same against both races.

Don’t worry, we’ll be relaxing some of these assumptions later and you’ll get to enjoy more complex analysis. Won’t that be fun?

Analysis

We will consider different scenarios for racial imbalances and consider both the short-term effects on statistics and the long-term equilibrium.

For simplicity, let’s consider that the racial imbalance raises terran’s effective skill one league higher than his actual skill. So a G terran will be playing at the PL level. All the terrans will basically get promoted and take the place of their unfortunate peers in other races, so we’ll get:

Case 1

Terran is made OP against both Z and P. Z and P are balanced against each other.

Long-term equilibrium

T = [6 20 20 20 34]

P = Z = [27 20 20 20 13]

Those of you with a background in maths can probably already see that matrices are not the best way to present this. For the rest of you, don’t worry; the conclusions we draw from this will still hold for a more rigorous model.

Since this is the first case, I’ll walk through the process and trust that if you can read and understand this, you can apply it to the other cases yourself.

So why does this shift happen? Because there are 300 players and Blizzard wants 60 players in each league. Each PL terran now plays at the diamond level while the D zergs/protosses still play at the D level (remember, we want to avoid double counting. Although this technicality doesn’t actually matter – it just means we’ll change the magnitude of the racial imbalance a little bit and our conclusions are still the same). The problem is that with our matrices, we don’t have a continuum but rather 5 pegs to put players in. So the D zergs will lose a lot to the actual D terrans and be evenly matched against the actual PL terrans. So the PL terrans will be promoted. But they won’t all be promoted because there’s not enough room. Why? Because not enough zergs and protosses will be demoted.

The PL terrans will also be losing a lot against the actual D terrans (the same amount as the D zergs will be losing to actual D terrans, in fact). So we have 20 actual D terrans, 20 effective D terrans, 20 effective D protosses and 20 effective D zergs vying for 60 places. The 20 actual D terrans stay in D while the 60 remaining players have equal claim to the 40 places that remain, so each of those three groups keeps 2/3 of its 20 players in D. 2/3 of 20 is about 13. We’ll round a bit by giving the remainder to terran because precise numbers don’t really matter – this is for illustration.
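The bookkeeping in this walkthrough can be sketched in a few lines of Python. The helper `shift_leagues` is a hypothetical name introduced here, and the sketch assumes the same fraction of every league moves by one league, with the top and bottom leagues absorbing whoever cannot move further. It gives unrounded values, so it shows 6 2/3 and 33 1/3 where the matrices above show the rounded 6 and 34.

```python
from fractions import Fraction


def shift_leagues(base, fraction):
    """Move `fraction` of every league one league up (negative: down).

    Players who would leave the top or bottom league stay where they are.
    """
    last = len(base) - 1
    result = [Fraction(0)] * len(base)
    for i, count in enumerate(base):
        target = min(i + 1, last) if fraction >= 0 else max(i - 1, 0)
        moved = count * abs(fraction)
        result[i] += count - moved
        result[target] += moved
    return result


base = [20, 20, 20, 20, 20]                 # B, S, G, PL, D
terran = shift_leagues(base, Fraction(2, 3))    # 2/3 of terrans promoted
others = shift_leagues(base, Fraction(-1, 3))   # 1/3 of P and of Z demoted

if __name__ == "__main__":
    print("T     =", [float(v) for v in terran])
    print("P = Z =", [float(v) for v in others])
```

Every league still holds 60 players in total, which is the constraint that drives the whole shift.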

So what’s the outcome? Terran should dominate top D – the other races simply can’t compete. The top 20 players are all terrans. Then we have an even distribution of races in the lower 2/3rds of diamond, randomly allocated.

In bronze, the very bottom is all zerg and protoss: the bottom 40 bronze places are randomly allocated between the two races. Top bronze contains terrans too, evenly mixed with the other races.

In between, from S to PL there’s no difference.

How does it feel?

Once at equilibrium, the game will feel balanced to the players even though it objectively isn’t. The only way to tell will be from examining the data for the top and bottom levels. More on this in a later section.

Short-term effects

Let’s say this was the result of a Blizzard patch and that the game used to be balanced. Then we will see:

Spike in TvX win rate, which then evens out at 50%.
PvZ win rates constant at 50%.

Note that these are win rates against equally ranked opponents (not equally skilled). This is important because our purpose is to see how we can find these patterns in data, and we can observe a player’s rank and win ratio but cannot observe their skill – actual or effective.

If terran was made to be UP against both other races, we just see the reverse of everything.

Case 1A

Say we’re currently in Case 1, then Blizzard recognises that terran is OP and nerfs it appropriately.

Long-term equilibrium

T = P = Z = [20 20 20 20 20]

We just return to our balanced state.

Short-term effects

Spike in XvT win rate (crossing the 50% mark), which then evens out at 50%.
No change in PvZ win rates.

So it’s just the reverse.

Case 2

TvZ now favours T. TvP and ZvP are balanced.

Long-term equilibrium

T = [14 20 20 20 26]

P = [20 20 20 20 20]

Z = [26 20 20 20 14]

Since this one is a bit different to Case 1, I’ll explain this one too.

To understand why it looks like this, consider what happens if T gets promoted halfway up and Z gets demoted halfway down. TvZ would be even based on effective skill (they’ll win 50% against each other), but T is facing P that’s half a league higher so will be losing more than half of their games. This means that overall T will be winning less than 50% of their games. And similarly Z will still be winning more than 50% of their games.

So T will be promoted less than halfway up and Z will be demoted less than halfway down. Because of the way we’ve defined f(x), we know this move must be 1/3rd of the way because then T will be playing Zs 1/3rd lower in effective skill and Ps 1/3rd higher and it will balance out.

How about the protosses? They will be playing against terrans with a lower effective skill (since there’s no imbalance in TvP) and so will be winning more, but at the same time will be playing against zergs with a higher effective skill. Due to the symmetry, this will even out so the protosses will not move.

So we have 20/3 terrans moving up from PL to D. We can just round this to 6 terrans move up and 6 zergs move down.
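The halfway and one-third arguments can be checked with a small sketch. It assumes equal race populations and treats a race’s net advantage as simply the sum of its skill edges against equally ranked opponents, measured in leagues; the function name `net_edges` is made up for this illustration.

```python
from fractions import Fraction


def net_edges(t_up, z_down, imbalance=Fraction(1)):
    """Net skill edge per race against equally ranked opponents.

    t_up / z_down: how far T has been promoted and Z demoted (in leagues).
    imbalance: size of the TvZ racial imbalance (one league here).
    Assumes equal populations, so each matchup carries equal weight.
    """
    t_over_z = imbalance - t_up - z_down  # T's remaining edge over same-rank Z
    p_over_t = t_up                       # same-rank P has more actual skill than T
    z_over_p = z_down                     # same-rank Z has more actual skill than P
    return {
        "T": t_over_z - p_over_t,
        "P": p_over_t - z_over_p,
        "Z": z_over_p - t_over_z,
    }


if __name__ == "__main__":
    print("halfway:", net_edges(Fraction(1, 2), Fraction(1, 2)))
    print("one third:", net_edges(Fraction(1, 3), Fraction(1, 3)))
```

At the halfway point T comes out net negative and Z net positive, so both drift back; at one third every race nets zero, which is the equilibrium described above.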

So what’s the outcome? T and P both dominate top D. T will have an edge in wins at top D, followed by P, followed by Z. Since zerg was shifted 1/3rd of the way down, we have to go 1/3rd of the way down diamond until we see zergs, whereupon there will be an even distribution between all 3 races. Note that T doesn’t necessarily dominate the top more than P because of the way we’ve defined how players are matched up – terrans will be winning only 50% against their protoss peers at the top. We’ll look at this in more detail later.

The opposite story occurs at bottom B.

How does it feel?

T would think that P is OP, P would think that Z is OP, and Z would think that T is OP. We get rock-paper-scissors with the races.

Short-term effects

Let’s say this was the result of a Blizzard patch and that the game used to be balanced. Then we will see:

Spike in TvZ win rate, which then evens out higher than 50%.
Gradual rise in ZvP and PvT win rates to the same level that the TvZ win rate eventually settles.

Case 2A

Say Blizzard recognises that TvZ is OP and nerfs it appropriately.

Long-term equilibrium

T = P = Z = [20 20 20 20 20]

We just return to our balanced state.

Short-term effects

Spike in ZvT win rate (crossing the 50% barrier to the other side), which then evens out to 50%.
ZvP and PvT win rates gradually settle back to 50%.

Case 2B

Say Blizzard recognises that TvZ is OP and buffs ZvT, but unintentionally buffs ZvP.

This case effectively balances TvZ and TvP but makes ZvP OP. It’s the same as Case 2, but with Z being the villain. So it still feels like rock-paper-scissors.

Short-term effects

Spike in ZvP win rate, which then evens out higher than 50%.
Drop in TvZ win rate (crossing the 50% barrier to the other side), which then climbs back above 50% and evens out at the same level as ZvP.
PvT win rate (already above 50%) evens out to the same level as ZvP (above 50%).

Case 2C

Say Blizzard thinks PvT is OP and nerfs PvT.

This effectively makes TvP and TvZ OP, so we get Case 1.

Short-term effects

Spike in TvP, which evens out at 50%.
TvZ and ZvP gradually return to 50%.

Case 2D

Say Blizzard thinks PvT is OP and nerfs PvT, which unintentionally nerfs PvZ.

This effectively makes TvP and TvZ OP, and ZvP OP too. So we get Case 1, followed by Case 2 on top of that, with Z as the villain in Case 2 so P is demoted a bit while Z is promoted a bit. Terran dominates the top, P dominates the bottom, and all 3 races feel like they’re in rock-paper-scissors.

Short-term effects

Spike in TvP, which evens out above 50%.
Spike in ZvP, which evens out at the same level as TvP.
TvZ (which was above 50%) gradually evens out to the level of TvP.

Case 2E

Say Blizzard thinks PvT is OP and buffs TvP, which unintentionally buffs TvZ.

This effectively makes TvP and TvZ OP, with TvZ being doubly OP. So we get Case 1 followed by Case 2, with terran being the villain in both cases. This will be similar to Case 2D in how it feels like rock-paper-scissors, with T dominating the top and Z dominating the bottom.

Short-term effects

Spike in TvP, which evens out above 50%.
ZvP (which was above 50%) evens out at the same level as TvP.
TvZ (which was above 50%) spikes and then evens out to the level of TvP.

Case 2F

Say Blizzard thinks ZvP is OP and nerfs ZvP.

This effectively makes TvZ and PvZ OP and TvP balanced. So we get Case 1, except with Z being UP rather than T being OP.

Short-term effects

Spike in PvZ, which evens out to 50%.
TvZ and ZvP (which were above 50%) even out to 50%.

Case 3

TvZ favours T. ZvP favours Z. PvT is balanced.

Long-term equilibrium

T = [14 20 20 20 26]

P = [26 20 20 20 14]

Z = [20 20 20 20 20]

This is similar to Case 2. T will move up relative to Z, but be held back by P. Z will move up relative to P, but be held back by T. P will move down but be bolstered up by T. Initially Z will still be winning 50% of its games and so will not move. So T will move up and P will move down. T will move up until its wins against Z match its losses against P. Since there are even numbers of all 3 races, T will move up by the same magnitude that P moves down. So we can see that T will move up 1/3rd of a slot and P will move down 1/3rd of a slot while Z doesn’t move.

Overall we can see that it looks similar to Case 2.

How does it feel?

T would think that P is OP, P would think that Z is OP, and Z would think that T is OP. We get rock-paper-scissors with the races.

Short-term effects

Let’s say this was the result of a Blizzard patch and that the game used to be balanced. Then we will see:

Spike in TvZ win rate, which then evens out higher than 50%.
Spike in ZvP win rate, which then evens out higher than 50%.
Gradual rise in PvT win rate, which evens out somewhere between the above 2 levels (with our assumptions, all 3 levels will be identical).

This case is very similar to Case 2, distinguishable only because there are spikes in two matchups instead of one.

Case 4

TvZ favours T. ZvP favours Z. PvT favours P.

Long-term equilibrium

T = [20 20 20 20 20]

P = [20 20 20 20 20]

Z = [20 20 20 20 20]

This shouldn’t be surprising to anyone by now. Everyone is winning and losing the same amount.

How does it feel?

T would think that P is OP, P would think that Z is OP, and Z would think that T is OP. We get rock-paper-scissors with the races.

Short-term effects

Let’s say this was the result of a Blizzard patch and that the game used to be balanced. Then we will see:

Spike in TvZ win rate, which then stays spiked.
Spike in ZvP win rate, which then stays spiked.
Spike in PvT win rate, which then stays spiked.

A closer look at rankings and pairings.

Sometimes it’s not the race that’s imbalanced but the player, and this particularly applies to high level games like GSL. Zerg isn’t overpowered, Idra is. It extends below the GSL, however, down through Grandmasters and perhaps even into Masters. With high level games we cannot use summary statistics to interpret imbalance – we must use more rigorous statistical analyses. Nothing short of that is acceptable for an objective analysis of game balance. So I would recommend that any use of high-level statistics involve some form of statistical analysis and anything that presents solely summary statistics should be disregarded.

But what about summary statistics for the lower leagues? We’ve identified that we can look at the distributions of the races between leagues and glean some information from that – but how do we differentiate between when the distribution is due to imbalance and when it might be due to player skill?

Let’s consider the simplest case. Everybody plays terran and is ranked on the ladder, and players are only paired against those immediately above and below them on the ladder. And everybody plays the same number of games. This should be fine since we’re only using this to look at how the stats for the top ladder players will look, and it should be safe to say they will all play a lot.

Consider the case where everybody has the typical distribution of skills we have been using, and then we add Idra to the ladder. Idra will place 1st and have an 80% win rate against the 2nd player. The 2nd player will have a 20% win rate against Idra and a ~50% win rate against the 3rd player, giving him an overall 35% win rate. The 3rd player down will have a roughly 50% win rate.

We can immediately see 2 phenomena here: the 2nd player has a poor win/loss ratio, and the lowest bronze player has a 50% win/loss ratio. Note that the bronze player is just a quirk with no real ramifications – to make this more like real life, we would add in the lowest bronze players in the same way we’re adding in Idra.

The more interesting aspect comes from the 2nd best player’s win ratio. He won’t be demoted because then the 3rd player will have an even worse ratio against Idra, and so on.

Let’s see if we can smooth this out. Say players can be paired against those 2 ranks up and down. For the sake of simplicity, let’s assume f(1) = 55% and f(2) = 60% unless you’re Idra, whereupon f(1) = 80% and f(2) = 85%.

Thus Idra will have an 82.5% win rate. Player 2 will have a 45% win rate. Player 3 will have a 43.75% win rate. And players 4 and down will all have a 50% win rate. Let’s be a bit more rigorous with this. We’ll define the Idra factor I that increases the skill gap in determining win probabilities. Then R(x) is the win/loss ratio of the player ranked x.


R(1) = 1/2 * [f(1+I) + f(2+I)]
R(2) = 1/3 * [f(1) + f(2) + 1 – f(1+I)]
R(3) = 1/4 * [f(1) + f(2) + 2 – f(1) – f(2+I)] = 1/4 * [2 + f(2) – f(2+I)]
R(4) = 1/4 * [f(1) + f(2) + 2 – f(1) – f(2)] = 1/4 * 2 = 1/2
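As a sanity check on these expressions, the following Python sketch builds a small ladder with the example numbers from above (f(1) = 55%, f(2) = 60%, and 80% / 85% when Idra is the higher-ranked player) and averages each player’s win chance over everyone within 2 ranks. The ladder size of 10 and the equal weighting of opponents are assumptions of the sketch.

```python
def higher_rank_win_chance(gap, idra_involved):
    """Win chance of the higher-ranked player for a rank gap of 1 or 2."""
    table = {1: 0.80, 2: 0.85} if idra_involved else {1: 0.55, 2: 0.60}
    return table[gap]


def ladder_win_rates(n_players=10, max_gap=2):
    """Expected win rate per rank (rank 1 = Idra), opponents weighted equally."""
    rates = []
    for rank in range(1, n_players + 1):
        chances = []
        low = max(1, rank - max_gap)
        high = min(n_players, rank + max_gap)
        for other in range(low, high + 1):
            if other == rank:
                continue
            p = higher_rank_win_chance(abs(rank - other), 1 in (rank, other))
            # `rank` is the stronger side when its number is smaller.
            chances.append(p if rank < other else 1 - p)
        rates.append(sum(chances) / len(chances))
    return rates


if __name__ == "__main__":
    for rank, rate in enumerate(ladder_win_rates(), start=1):
        print(f"rank {rank}: {rate:.4f}")
```

The first four ranks reproduce the 82.5%, 45%, 43.75% and 50% figures. The last two ranks dip below 50% as well, which is the bottom-of-ladder quirk of a finite ladder rather than anything meaningful.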

Idra will have a high win ratio no matter what. Player 2 will have a low win ratio when:


1/3 * [f(1) + f(2) + 1 – f(1+I)] < 1/2
=> f(1) + f(2) + 1 – f(1+I) < 3/2
=> f(1) + f(2) – f(1+I) < 1/2
=> f(1+I) > f(1) + f(2) – 1/2

i.e. when Idra’s probability of winning against Player 2 is very high. The more even Idra’s win probability is, the closer this gets to 50%.

Player 3 will have a low ratio when:


1/4 * [2 + f(2) – f(2+I)] < 1/2
=> 2 + f(2) – f(2+I) < 2
=> f(2) < f(2+I)

This is unavoidable unless Idra is just a player of typical skill.

Fancy that, the 3rd best player on ladder inevitably has a bad win/loss ratio. And people think their ratios are actually indicative of their skill?

Is it possible to manipulate the pairings to give the players at the top of the ladder positive win ratios? Whatever we do, it would have to skew things downwards. The problem here comes from symmetry: if you’re paired against someone, that means they’re also paired against you, so if someone is more likely to be paired against someone lower down, the people lower down are more likely to be paired against someone higher up. For example, if we allow Players 2 and 3 to be paired more against Players 4, 5, 6 and 7 and less against Idra, then Players 4, 5, 6 and 7 now have a higher probability of being paired against Players 2 and 3 and a lower probability of being paired against Players 8+. So we have effectively just transferred our low ratio down to other players and spread it out a bit.

One quirk in our current analysis is that this also assumes that Idra plays less than Players 2 and 3. If he plays more, their weightings against him will be higher and their ratios even lower.

If we introduce a second pro, say a clone of Idra who is not quite as good so that the top of the ladder goes Idra, Idrax, Player 3, Player 4…, then we will get a similar situation.

This gives us a general conclusion. For our analysis to apply, particularly when looking at race distributions on the ladder, we want to exclude players who are so good they could be considered an imbalance on their own. To do that, we need to look down the ladder until players begin to have a 50% win ratio.

An interesting side point in our analysis is that even at the very top levels of play, players will have low win/loss ratios that are not statistical flukes.

Due to the Bnet match making system, at equilibrium, players outside of the very top and very bottom have an expected 50% win/loss ratio and any deviations are statistical flukes (even if half the jellybeans in the bag are red, if we pull out 10, it’s not uncommon to see 6 or even 7 red jellybeans). We expect top players to have a positive win/loss ratio, but as we have seen here, we also expect a buffer zone between professional level players and amateur players where the top amateur players will have a win/loss ratio below 50%.
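The jellybean remark can be put into numbers with a short sketch. It assumes each draw is an independent 50/50 event (a binomial model), which is a simplification of pulling beans from a finite bag without putting them back.

```python
from math import comb


def prob_at_least(successes, trials, p=0.5):
    """Probability of seeing at least `successes` hits in `trials` independent tries."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


if __name__ == "__main__":
    print("6 or more red out of 10:", round(prob_at_least(6, 10), 3))
    print("7 or more red out of 10:", round(prob_at_least(7, 10), 3))
```

Under that assumption, 6 or more red out of 10 turns up roughly 38% of the time and 7 or more roughly 17% of the time, so a 60% or 70% win rate over a handful of games says very little on its own.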

Yet these players with sub-50% win/loss ratios are both by rank and by nature superior players to their peers with 50% win/loss ratios. Their low ratios come from the fact that they’re so good they’re the only amateurs who get matched against professional level players. This low ratio comes from the assumption that there is a significant skill gap between amateur players and professional players. We know that at the top level, skill gaps can be huge because Idra has a huge win/loss ratio.

Note that if this skill gap didn’t exist, we have already seen that even the top ladder players should have a fairly consistent 50% win/loss ratio.

Also note that if players in a lower league (say, diamond) have a good win/loss ratio, there’s also the possibility that they’re not in the correct league. This could happen if they have been practising in customs for 500 games, for example. Their ratio will likely remain positive all season. This introduces an extra element of care when interpreting win/loss ratios if the player had risen from lower ranks that season.

Effects of uneven race distributions.

So far we have assumed that an equal number of players play each race, i.e. their ratios are 1:1:1. But what if the ratios were 2:1:1 in favour of some race? We can intuitively see that in cases where one race is absolutely overpowered (Case 1), there won’t be much change in our analysis. It becomes a bit more interesting in Case 2:

Say the races are distributed 2:1:1 and TvZ is OP by 1 level. Our initial balanced distribution:

T = [40 40 40 40 40]

P = Z = [20 20 20 20 20]

And then after the imbalance is introduced:

Long-term equilibrium

T = [30 40 40 40 50]

P = [20 20 20 20 20]

Z = [30 20 20 20 10]

If we take a cross section of terrans from, say, gold league, we see that 60/80 of their matchups don’t change (against T and P) and they have an edge in 20/80 of their matchups. For Z, 40/80 of their matchups don’t change and they have a disadvantage in 40/80 of their matchups.

As terrans are promoted, 40/80 of their matchups are balanced, they have an edge in 20/80 of their matchups and a disadvantage in 20/80 of their matchups. Say T rises x% of the way up one slot (from G to PL) and Z drops y% of the way down. The normalised magnitude of their advantage is represented by the skill gap (100-x-y). Similarly, their disadvantage is represented by the skill gap x.

The problem is that we don’t know how the probability of winning f(k) actually looks. For now we can consider the limiting condition where f’(k)=0 for all k. (I changed it to k just for this so nobody confuses it with the x that represents the terran shift here.)

They reach an equilibrium when:


20/80 * (100-x-y ) = 20/80 * x
=> 2x = 100 – y

As zergs are demoted, 20/80 of their matchups are balanced, they have an edge in 20/80 of their matchups and a disadvantage in 40/80 of their matchups. Using similar logic to terrans, we reach an equilibrium when:


40/80 * (100-x-y) = 20/80 * y
=> 200 – 2x – 2y = y
=> 3y = 200 – 2x
=> 3y = 200 – 100 + y
=> y = 50

Therefore x = 25

So terrans move 25% of the way up while zergs move 50% of the way down. Terrans made up 4/8th of diamond; now they make up 5/8th.

This hasn’t given us a very interesting result since it provides us with nothing fundamentally different to Case 2. The magnitudes of the moves changed as we would expect.

Now let’s see what happens if the ratio of P players changes. Consider our initial condition:

T = [40 40 40 40 40]

P = [30 30 30 30 30]

Z = [20 20 20 20 20]

And after the introduction of imbalance:


20/90 * (100-x-y) = 30/90 * x
=> 5x = 200 – 2y
=> x = 40 – 2/5 * y

40/90 * (100-x-y) = 30/90 * y
=> 400 – 4x – 4y = 3y
=> 400 – 160 + 8/5 * y = 7y
=> 240 = 27/5 * y
=> y = 400/9 ~ 44

Therefore x = 22

Long-term equilibrium

T = [32 40 40 40 48]

P = [30 30 30 30 30]

Z = [28 20 20 20 12]

The increased presence of a balanced third race stabilises the movements. Again, nothing spectacular.
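Both derivations in this section solve the same pair of balance equations with different player counts, so they can be wrapped in one small solver. This sketch takes the equations exactly as written above, with each matchup’s weight equal to the opponent race’s head count and the edge proportional to the skill gap. The function name and the use of exact fractions are choices made for the illustration.

```python
from fractions import Fraction


def equilibrium_shifts(n_t, n_p, n_z, imbalance=100):
    """Solve for x (T's rise) and y (Z's drop), in % of one league, when TvZ favours T.

    Balance conditions, as in the text:
        T:  n_z * (imbalance - x - y) = n_p * x
        Z:  n_t * (imbalance - x - y) = n_p * y
    Rearranged into a 2x2 linear system and solved with Cramer's rule.
    """
    a11, a12, b1 = n_z + n_p, n_z, imbalance * n_z
    a21, a22, b2 = n_t, n_t + n_p, imbalance * n_t
    det = a11 * a22 - a12 * a21
    x = Fraction(b1 * a22 - a12 * b2, det)
    y = Fraction(a11 * b2 - a21 * b1, det)
    return x, y


if __name__ == "__main__":
    for counts in [(20, 20, 20), (40, 20, 20), (40, 30, 20)]:
        x, y = equilibrium_shifts(*counts)
        print(counts, "x =", float(x), "y =", float(y))
```

Equal populations give the one-third shifts from Case 2, the 2:1:1 split gives x = 25 and y = 50, and the 40/30/20 split gives x = 200/9 and y = 400/9, in line with the worked numbers.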

Now, the problem here is that we had initially defined f’(k) > 0 for k > 0. It isn’t so much of a problem because all that means is that what we have found are the maximum shifts. The actual shifts will be lower, i.e. people are more apt to remain in their leagues.

There is one important change to our conclusions if the racial distributions are not even, and that is in Case 4. Instead of having an even distribution of races across leagues, there will be some shifts due to different races being promoted and demoted more if they’re played less than the other races.

What about if the magnitudes of imbalances aren’t even?

This mostly affects Case 4. If TvZ is more OP than ZvP, for example, then we will get uneven shifts, similarly to if the race distributions aren’t even.

What does this mean for SC2 statistics?

Since our analysis was done on ladder rankings, our conclusions should only apply to the ladder. Statistical analysis is the best way to analyse high level statistics. To find the top level on the ladder where we can begin seeking conclusions from summary statistics, we should continue down the ladder from the top until we see players begin to consistently have a 50% win/loss ratio. Ideally we should see a little dip below a 50% win/loss ratio to know that we’ve hit the right spot, although this may be difficult to spot. We shall call this our upper statistical threshold.
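One way to make the search for the upper statistical threshold mechanical is sketched below. The `window` and `tolerance` parameters are hypothetical knobs introduced here to pin down what "consistently" and "50%" mean; the right values would depend on how many games each player has played.

```python
def upper_statistical_threshold(win_rates, window=3, tolerance=0.02):
    """Return the 1-based rank where win rates first settle around 50%.

    win_rates: win rates ordered from the top of the ladder downwards.
    window: how many consecutive players must be near 50% (assumed value).
    tolerance: how far from 0.5 still counts as 50% (assumed value).
    Returns None if no such run exists.
    """
    for start in range(len(win_rates) - window + 1):
        run = win_rates[start:start + window]
        if all(abs(rate - 0.5) <= tolerance for rate in run):
            return start + 1
    return None


if __name__ == "__main__":
    # Illustrative input: the Idra ladder numbers worked out earlier.
    ladder = [0.825, 0.45, 0.4375, 0.5, 0.5, 0.5, 0.5]
    print(upper_statistical_threshold(ladder))
```

Fed the Idra ladder figures from earlier, it skips the top player and the two players dragged below 50% and lands on rank 4.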

I think it can’t be stated enough: GM and top masters is a bad place to be looking for racial imbalance.

We have identified patterns for two types of data:

Win/loss ratios for matchups (e.g. TvZ).
Race distribution near our statistical thresholds.

Note that the short-term effects don’t require a balanced initial position. They are the result of a shock (such as a Blizzard balance change) and aren’t affected by the initial state (whether it was balanced or imbalanced). In the case that the win rates weren’t at 50%, then we should measure the spike as a relative change.

We should also note that there are no observable impacts on distributions in the middle leagues. The only places where we can observe these patterns are in bottom bronze and top diamond (i.e. masters).

Let’s first consider what each long-term equilibrium means. Remember that we begin by looking down from our upper statistical threshold.

All races fairly represented

Note that this doesn’t require a 1/3rd split. If terran makes up 50% of the population, then we expect terran to be represented 50% in bronze and diamond. So here we’ll introduce the idea of a race being fairly represented if its presence in a league is equal to the ratio of players who choose that race.
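Fair representation is easy to compute once the per-league counts are known. The sketch below divides a race’s share of one league by its share of the whole ladder, so 1.0 means fairly represented. The function name is made up, and the input is the 2:1:1 long-term equilibrium worked out earlier, not real ladder data.

```python
def representation_ratio(league_counts, race, league_index):
    """Race's share of one league divided by its share of all players.

    league_counts: dict mapping race -> list of player counts per league.
    A result of 1.0 means fairly represented; above 1.0 means over-represented.
    """
    total_players = sum(sum(counts) for counts in league_counts.values())
    overall_share = sum(league_counts[race]) / total_players
    league_total = sum(counts[league_index] for counts in league_counts.values())
    league_share = league_counts[race][league_index] / league_total
    return league_share / overall_share


# Model output from the 2:1:1 example (B, S, G, PL, D), used as illustrative input.
counts = {
    "T": [30, 40, 40, 40, 50],
    "P": [20, 20, 20, 20, 20],
    "Z": [30, 20, 20, 20, 10],
}

if __name__ == "__main__":
    for race in counts:
        print(race, "diamond:", representation_ratio(counts, race, 4))
```

In that example terran is over-represented in diamond (1.25), zerg is under-represented (0.5), protoss sits at exactly 1.0, and every race sits at 1.0 in the middle leagues.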

From what we’ve seen, this can mean that all the races are perfectly balanced, or we have a complete circle of imbalances like Case 4. The way to tell is by whether or not we have a rock-paper-scissors situation in the win ratios. If it feels normal, then it’s balanced. If each race has a favoured matchup, then we have a circle of imbalances. It should be easy to determine the direction of the circle and balance this.

One race dominates the top

This means that race is OP against both other races. Now, there can be nested imbalances. So if terran dominates the top, that means T is OP but PvZ can also be OP. We can tell by looking at the bottom because we expect Z to dominate the bottom if that’s the case (refer to Case 2D). We can also tell by looking at the racial win ratios. If it feels like rock-paper-scissors, then there’s probably also a nested imbalance.

One race dominates the bottom

This means that race is UP against both other races. Once again, there can be nested imbalances, so if terran dominates the bottom, or it feels like rock-paper-scissors, there can still be imbalances in PvZ. It’s just the reverse of the above case. In general, for any imbalance that is detectable from the top, its opposite is detectable from the bottom.

Two races dominate the top

This means that there is an imbalance in at least one matchup. The imbalance is against the race that is missing from the very top. If there’s no rock-paper-scissors situation, then the missing race is absolutely UP and it should dominate the bottom.

Now, if we have a rock-paper-scissors situation, we know there is at least one imbalanced matchup, and there may be two, and if there aren’t even numbers of players playing each of the 3 races, we may even have a circle of 3 imbalances. We can identify the race that is OP against the missing race by looking at the win ratios to see which race wins more than 50% of its games against the missing race. The race we identify should also be absent from the bottom.

If we have access to the short-term data, then we can count the number of spikes and work out from there whether we have one, two or a circle of racial imbalances.

Two races dominate the bottom

This is just the reverse of the above case.

If the 3 races dominate each other in rock-paper-scissors

We know that at least one of the matchups is broken, and it’s still possible for one race to be completely UP or completely OP. We need to look at the top and bottom of the ladder to gain more information and arrive at one of the above cases.

The short-term data also provides us with valuable clues. Remember that we only use data below our upper statistical threshold.

Any patch that corrects an imbalance results in a spike in wins that makes the originally UP race look OP.

If Z was UP against both T and P, its win ratio of, say, 40% against both races will spike to, say, 60% soon after a patch and then gradually drop to 50%. If Z was only UP against T, then we had the rock-paper-scissors situation and a proper patch will cause this spike only in ZvT, after which all 3 win ratios will gradually move to 50%.

This is a good way to keep track of whether or not a patch rebalances the game. If there is no spike past the 50% mark to make it temporarily look like the previously UP race is now OP, then balance has not been fully restored.

The only time this doesn’t happen is in Case 4, in which we just see a spike straight back down to 50% if we balance perfectly.

At a balanced state, the win ratios should all be 50%.
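The overshoot check described above can be sketched in a few lines of Python. This is only an illustration: the function name and the sample win rates are assumptions, and it does not handle the Case 4 exception, where a perfect patch brings the rate straight back to 50%.

```python
def overshoots_after_patch(pre_patch_rate, post_patch_rates, balanced=0.5):
    """Check whether a matchup win rate crosses the balanced mark after a patch.

    pre_patch_rate: the race's win rate in the matchup before the patch.
    post_patch_rates: win rates observed in successive periods after the patch.
    Returns True if a previously UP race rises above the balanced mark (or a
    previously OP race drops below it) at any point after the patch.
    """
    if pre_patch_rate < balanced:
        return max(post_patch_rates) > balanced
    if pre_patch_rate > balanced:
        return min(post_patch_rates) < balanced
    return False


# Hypothetical series: 40% before the patch, a spike to 60%, then settling.
full_correction = overshoots_after_patch(0.40, [0.60, 0.56, 0.52, 0.50])
# Hypothetical series that creeps up but never passes 50%.
partial_correction = overshoots_after_patch(0.40, [0.45, 0.47, 0.48])
```

Under the model, the second series would suggest that the patch has not fully restored balance.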

The smaller the imbalance, the harder it is to detect. At very small levels it will be indistinguishable from noise. I suspect that most players will underestimate the size of the imbalance that’s required for this – it is likely to be higher than most players think. On the other hand, most players don’t discount noise anyway, and so tend to attribute to imbalance what is really just noise.
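For a rough feel of how large the noise is, here is a small Python sketch using the standard normal approximation for a proportion. Treating games as independent coin flips with a true 50% win rate is my simplifying assumption, and the function name is made up.

```python
from math import sqrt


def noise_margin(games, z=1.96):
    """Approximate 95% margin around a 50% win rate from pure chance.

    Uses the normal approximation: standard error = sqrt(p * (1 - p) / n)
    with p = 0.5, multiplied by z (1.96 for roughly 95% coverage).
    """
    return z * sqrt(0.25 / games)


margin_100 = noise_margin(100)      # about 0.098, i.e. roughly +/- 10 points
margin_10000 = noise_margin(10000)  # about 0.0098, i.e. roughly +/- 1 point
```

So with only 100 games in a matchup, a 55% win rate sits well inside what chance alone can produce, while with 10,000 games it does not.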

So it’s easy, right?

Firstly, we should point out that our conclusions only apply when analysing statistics for active players. Analysis on balance that includes players who were not active during the period under consideration is invalid. The statistics, such as win rates, should all be from that period.

We’ve identified the problems with using data from the top and have identified a way to find the upper statistical threshold, but there’s a problem with the bottom too. The existence of portrait farmers and smurfs makes it difficult to use data from the bottom. Even if we assume their races are randomly distributed, they add a significant level of noise. It may be possible to cut out portrait farmers by applying the skill gap argument to the bottom and finding a lower statistical threshold from which to begin, but the existence of smurfs still makes this data unreliable. Smurfs are more annoying because they can exist in all leagues, although their presence in the top leagues is moot. This means that any reliable summary statistics we use for analysis should only come from the highest league below our upper statistical threshold.

This could be one reason Blizzard is cracking down on portrait farmers and smurfs – to make their balancing job easier.

So I would be cautious in accepting analysis on statistics from the bottom of the ladder. We probably cannot use any statistics from the bottom of the ladder when drawing our conclusions.

Now we should highlight one interesting outcome from our model, which is:

When matchups seem like rock-paper-scissors – this always indicates that there is imbalance in at least one matchup. And if the league distributions between races are uneven, then we can also make the following conclusion: if there is at least one racial imbalance beyond complete OPness, then we will have rock-paper-scissors at equilibrium. This allows us to rule out some claims of imbalance.

What about matchup win percentages? Well, there are problems there too. Since strategies take some time to evolve, we are unlikely to see many spikes immediately, so all the shifts will look gradual. Which means we need to be looking at the steady-state win percentages.

We know that if one race is completely UP or OP, then steady-state win percentages will be 50% - it’s impossible to tell it apart from a perfectly balanced game.

So what does it mean when our data shows that one race has a high win rate against both other races? Well, that’s interesting because:

There is no steady-state win rate where one race has a high win ratio against both other races.

In other words, the state of the game is still in flux if this is the case. A high win ratio against both other races thus signifies that race being buffed against both other races, or the development of new strategies for that race that have made it more powerful against the others. It should eventually drop to 50%.

This raises a major difficulty in analysing summary statistics, which is that the state of the game is often likely to be in flux. New strategies will be developed, the metagame may change, some players will improve and others will be out of practice. There’s not much that can be done about this except to hope that there is enough data available that this won’t be as big a deal, and I’m sure Blizzard already has more sophisticated mechanisms in place.

One thing we haven’t considered much is the possibility of players switching races, but that is relevant here. This shouldn’t affect our analysis much but it provides an alternative explanation for one race suddenly performing well against both other races – or performing poorly.

If we assume that a player switching races will go on a losing streak until they even out, then if we get a lot of terran players switching to protoss, then protoss will show a dip in its win ratio against both other races for some time before it evens out. If we further add in the concept where the player acclimatises to a race before climbing back to their initial spot on the ladder (i.e. they attain an equal level of mastery over the new race as they had over the old one), then protoss will first show a dip below 50%, and then a rise above 50% against both other races before evening out.

We are focusing more attention here because this situation (where one race shows win rate dominance against both other races) seems to occur a lot on the ladder. So I will summarise the possibilities for this:

A new Blizzard patch made the race OP against both other races. This should only be temporary.
A new strategy or tactical exploit was discovered for the race that is effective against both other races. This could be a new way to win, or a new way to negate the other races’ strategies. This should also only be temporary.
Many players simultaneously decide to switch to that race (causing a dip) and then learn how to play that race well (a rise). This should only be temporary. Note that the reverse may also occur (players switching away from a race), with the opposite effect.

No race should retain a high win ratio against both other races for long.

Similarly, we note:

No race has a high win ratio against any other race at equilibrium unless all the races also have high win ratios against another race.

This should be pretty self-evident. We should note that this relates right back to the rock-paper-scissors. This allows us to draw the conclusion:

Win rates will either be 50% or in a state of rock-paper-scissors at equilibrium.
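This conclusion can be turned into a simple check, sketched here in Python. The function name, the label strings and the 2-point tolerance are assumptions for illustration; "rock-paper-scissors" here means a full cycle, where every race has a high win ratio against exactly one other race.

```python
def classify_win_rates(t_vs_z, z_vs_p, p_vs_t, tolerance=0.02):
    """Classify three matchup win rates according to the model's stable states.

    Each argument is the first race's win rate in that matchup, so the three
    values go around the circle T -> Z -> P -> T.
    """
    rates = (t_vs_z, z_vs_p, p_vs_t)
    if all(abs(r - 0.5) <= tolerance for r in rates):
        # Could be true balance, or one race absolutely OP/UP (win rates
        # alone cannot tell these apart).
        return "balanced-looking"
    if all(r > 0.5 + tolerance for r in rates) or all(r < 0.5 - tolerance for r in rates):
        # Every race beats one other race: a full circle in either direction.
        return "rock-paper-scissors"
    # Anything else is not a stable state in the model.
    return "in flux"


state_a = classify_win_rates(0.50, 0.51, 0.49)
state_b = classify_win_rates(0.56, 0.55, 0.57)
state_c = classify_win_rates(0.58, 0.42, 0.50)  # T up on Z, P up on Z, PvT even
```

The third example corresponds to zerg losing to both other races, which the model says cannot persist at equilibrium.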

But wait! I saw a graph that violates one of our conclusions!

These conclusions apply to ladder because that’s what we modeled. If those graphs were of the ladder, then there must be more to the story the author didn’t look into, which is a sign of bias on their part. When you must choose between a subjective analysis made with an agenda or an objective analysis that can be applied to any general situation, it’s usually better to stay with objectivity.

So whenever somebody points to a set of ladder statistics that shows one race clearly dominating the others with a high win rate, then we must look towards recent patches, advances in that race’s strategies/exploits or players switching away from the overperforming race to explain the discrepancy. If the high performance was preceded by an underperformance with no patches in between, we may also seek an explanation from players who switched to that race some time ago and are now rising back to their proper rank. If they show us a graph of one race dominating just one other race, we should keep our eye on the other matchups. A spike in only one matchup would alert us to seek an explanation in a balance patch or a new strategy/exploit, which could well be an imbalance. A gradual change could mean that it’s caused by a different matchup being imbalanced.

If they have used data from the top level, then their analysis must be disregarded since it is invalid for reasons we have already discussed. This obsession with the top might seem intuitive, but our analysis shows that any effects of imbalance on win ratios should be consistent through all of the leagues, and that if they go too high then they cannot draw any valid conclusions without conducting a proper statistical analysis. Thus it might be worthwhile just analysing racial win/loss data for diamond league (as separated from master league), using them as a threshold for when strategic and tactical elements become a major influence on performance. We are probably right to be hesitant to use data from lower down, where mechanical factors might dominate everything else.

One place where these conclusions need adjusting is when interpreting results at the top level, say GSL. In that case you really need to do a detailed statistical analysis. Summary statistics are useless for drawing any conclusions there because player skill becomes an important factor that must be considered in the analysis, and you just can’t do that using summary statistics. Any graphs of the GSL or GM or top masters (down until players begin to have a 50% win/loss ratio) without a proper statistical analysis are useless. I talk a bit more about factors that come from and can be applied to the GSL in the Myth Busting section. 100% of people who read it will end up loving Jinro (sample size: 1).

Outcomes

Remember that our conclusions apply in cases where we attempt to interpret summary statistics. A proper statistical analysis has fewer limitations.

Top level data cannot be analysed using summary statistics. It requires a proper statistical analysis to draw any valid conclusions. Disregard any analysis that attempts to use data from high masters above the level where players begin to consistently have a 50% win/loss ratio since any conclusions they draw will likely be invalid.
There should be a gulf near GM/top masters where players experience a dip in their win/loss ratio below 50%. This should dispel the myth that win/loss ratios mean anything, even in master league.
The best place to take data is around this dip, where players begin to consistently have an overall 50% win/loss ratio. This is our upper statistical threshold. Above this point, statistical analysis is required.
Any effects of balance on racial win/loss ratios should be fairly consistent across leagues.
Imbalance doesn’t affect distributions in the silver, gold and platinum leagues.
Portrait farmers and smurfs make interpreting imbalance using data below the top leagues difficult. It could be one motivation behind Blizzard’s crackdown – to make their job of balancing the game a bit easier. Thus it is best if we only analyse racial win ratio data from the top leagues (while remaining below our upper statistical threshold).
If one race is absolutely overpowered or absolutely underpowered, the game can still feel balanced and have 50% win ratios between races. The only way to tell is by looking towards our upper statistical threshold to see how the races are represented.
If the game feels like rock-paper-scissors (which should be confirmed by the racial win ratios), then there is definitely an imbalance. We can gain clues regarding the imbalance by looking at the upper statistical threshold.
Aside from 50% win ratios or rock-paper-scissors win ratios, there are no other stable racial win ratios on ladder. Any variations from these stable ratios are the result of balance changes, changes in strategies/discoveries of tactical exploits, or players switching races.

Extensions

We haven’t accounted for the fact that players in certain leagues might prefer some races over another. Maybe once players reach diamond they become comfortable enough to pick up zerg, for example. This sort of situation can arise if there is a steep initial learning curve for a race. This does not necessarily make the race harder to play at the higher leagues, however. For example, the learning curve can take the form of required knowledge. If Blizzard made zerg completely balanced except they automatically lose a game unless they begin a spawning pool by the 5-minute mark (say, their buildings spontaneously explode if a spawning pool had never been started, but nothing happens if it had been started then cancelled), then we would see zerg UP until the skill level where players have this required knowledge, after which it is perfectly balanced. More realistic examples would be knowledge of timing pushes by the other races.

There are probably many more considerations like this one. Neglecting this was intentional and is not a flaw in the analysis. Our analysis shows us the best case scenario for our ability to interpret any data gleaned from SC2 summary statistics. Factors such as this one confound the situation and further restrict our ability to interpret data. Hence:

This analysis shows the limit of what is possible to be interpreted using summary statistics. Many factors make application even more difficult, with more restrictions.

In other words, we’ve identified the conclusions that it’s possible to draw from summary statistics in an ideal world. When things diverge from the ideal, players are even more restricted in the conclusions they can draw. This should discourage the use of summary statistics in balance discussions.

It also makes Blizzard’s job harder. This goes to show how difficult it would actually be to balance SC2. Something to think about if you’re a pro or considering going pro.

We also haven’t considered the possibility of players switching races. We discussed this briefly, since the effects should generally be intuitive, and it’s unlikely to have much effect on our conclusions.

It would be interesting if we can get this model represented more rigorously. However, it is likely that in the face of the limitations in summary statistics that we have presented here, players’ interest will shift towards statistical analysis of high level play.

Myth Busting

I’m confident I’m not the only one who feels that most of the balance discussion in the SC2 community is clearly biased and thinly veiled QQ. Players often point out insignificant points or resort to ad hominem attacks and I hope this section can help to encourage community members to just outright ignore those sorts of posts, or perhaps stricter moderation in community forums.

Only masters/pros should talk about balance.

I am of the opinion that no player can truly talk about balance. Psychological studies have found that everybody has an inherent bias that favours themselves regardless of how unbiased they try to be. In other words, any balance suggestion by a pro will necessarily try to make his own race OP rather than to achieve objective balance, no matter how well-intentioned the pro is. This is especially true for pros, whose livelihoods depend on winning tournaments. No, they’re not bad people. This works at an unconscious level and happens to everyone.

Also, just because someone is in masters doesn’t mean their arguments will be sound.

”Fixing” this mechanic will balance the game.

Since Blizzard uses objective data to balance their games, if they did nerf a race as many players wish, it would result in that race becoming UP and receiving other buffs to compensate.

Personal experiences matter.

We’ve all seen it.

A: TvZ is so hard!
B: Actually, TvZ is my best matchup.

They’re both right.

You need a bigger sample size.

This is often used by those who have learned a bit about statistics in high school. It’s really a cheap point, and the people who use this demand sample sizes in the thousands, and once given that will still not be satisfied and demand samples in the millions. The required sample size depends on what you plan to use it for, and it also depends on the nature of the sample itself. This point is also quite meaningless but I won’t go into the details. The bottom line is, I would be interested in a statistical analysis with a sample size as low as 50 if the results were significant. You don’t need that big a sample if your point has substance. A larger sample allows you to detect smaller imbalances, though.
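As an illustration of why a sample of 50 can be enough, here is a minimal Python sketch of an exact two-sided binomial test against an even 50% win rate. The function name is made up, and the test assumes games are independent, which is a simplification for ladder data.

```python
from math import comb


def p_value_vs_even(wins, games):
    """Exact two-sided p-value for `wins` out of `games` under a 50% win rate.

    Sums the probability of every outcome that is no more likely than the
    observed one. With p = 0.5 each outcome k has probability
    comb(games, k) / 2**games, so the comparison can be done on integers.
    """
    observed = comb(games, wins)
    extreme = sum(
        comb(games, k) for k in range(games + 1) if comb(games, k) <= observed
    )
    return extreme / 2 ** games


p_lopsided = p_value_vs_even(35, 50)  # 70% win rate over 50 games
p_even = p_value_vs_even(25, 50)      # exactly 50% over 50 games
```

A 35-15 record over 50 games gives a p-value below 0.01, so a large effect shows up even in a small sample. A 55% win rate would need far more games to stand out, which is the sense in which larger samples detect smaller imbalances.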

The caveat is that this requires a proper detailed statistical analysis, not just some pretty graphs. We need more of the following sort of analysis, although hopefully done…well, better:

http://www.teamliquid.net/forum/viewmessage.php?topic_id=219399&currentpage=All

I won’t go into the technical details because there were some very basic things that bothered me. In this case he'd analysed data over a number of patches, i.e. it includes data from patches where terran was generally agreed to be OP. Removing data from those patches in the analysis to analyse data from only current patches would provide a more accurate picture of the current state of the game. The analysis need not be restricted to just one patch period, however. It’s very difficult to do this properly with Blizzard patching the game so often.

I am also bothered by the fact that he didn’t include a table summarising his estimates for the parameters and their confidence intervals. I don’t like having data interpreted for me since we all know that statisticians lie and SC2 players are biased. I don’t think he was very clear on which parameters he used either.

I’ve got to hand it to the guy for trying, though. There were lots of naysayers in the thread, and we haven’t really seen any more efforts like this. I wish TL mods were harsher in this regard – what we need are more people putting in the effort to do stuff, and the current environment in the community discourages anybody from doing this. Who wants to go to all that work if their efforts are going to be dismissed without any valid reasons?

We can only find imbalance by looking at the top level of play.

The problem here is that people tend to post simple summary statistics and graphs and call it a day. Consider, for example, if this game only had terran and no other races. Our statistics from GSL would show MKP and MVP winning most of their games against other terrans. Conclusion? It’s not terran that’s OP, it’s the players.

Now add the other races and players back in. When MKP plays against a protoss and wins, how much is a result of his race and how much of it is a result of his hard work and raw talent? The summary statistics do not show this.
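To show why the summary statistics can't separate the two, here is a toy Python sketch. The Elo-style formula and the 400-point scale are my own assumptions for illustration, not something the analysis above relies on; the names and numbers are hypothetical.

```python
def win_probability(skill_a, skill_b, race_edge_a=0.0):
    """Elo-style chance that player A beats player B.

    skill_a, skill_b: hypothetical rating points for each player.
    race_edge_a: hypothetical rating bonus A gets purely from race choice.
    """
    diff = (skill_a + race_edge_a) - skill_b
    return 1.0 / (1.0 + 10 ** (-diff / 400))


# Case 1: A is simply the better player, and race gives no edge.
better_player = win_probability(1200, 1000, race_edge_a=0.0)
# Case 2: the players are equally skilled, but A's race gives a 200-point edge.
better_race = win_probability(1000, 1000, race_edge_a=200.0)
```

Both cases produce the same win rate (about 76% on this scale), so a graph of win rates alone cannot say which one we are looking at. A proper analysis has to estimate player skill and race effect together.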

An analysis has to be perfect to prove a point.

We often see people nitpicking on the tiniest points, and it seems like a lot of discussion degenerates because of it. Often the minor points have no significant impact on the analysis and it’s a waste of time to discuss them. This is evident in the thread I linked to above where a TLer did his own statistical analysis. Yes, there were flaws, and some important flaws were pointed out, but some of the other “flaws” were also unimportant. I think a lot of it came from people trying to comment on something they didn’t understand, and trying to dismiss something because they didn’t like the conclusions.

If everybody, especially pros, chooses a race, then that race must be imbalanced.

Although variations in race choices can indicate imbalance (and I would certainly use it as a clue), it’s not sufficient evidence to draw the conclusion. Sure, people switch to races they believe are OP. But people can be wrong. You must have seen people make many clearly incorrect assertions. Perhaps they just enjoy playing a race. The differences in race preferences between regions can represent different cultural preferences.

Maybe people choose a race because they have an affinity for it.

This is a difficult idea, which we can flesh out by looking at the extreme cases. The claim is that discrepancies in the ratio of players who choose a race can be explained just from personal preferences rather than from selection due to imbalance. This is the opposite idea to the one posed above.

Say 100% of all players have an affinity for terran, so they perform better as terran than as protoss or zerg. Everybody would draw the conclusion that terran is OP and therefore should be nerfed, right? And for all purposes, this is a reasonable outcome.

But what if only 99% of players have an affinity for terran, and 0.9% have an affinity for protoss and 0.1% for zerg? Should we nerf terran? By how much? Let’s assume the players are equally skilled, except the affinity gives them an edge, say their equivalent skill race (which is found by combining their actual skill with their racial affinity) means they perform one league higher with that race. Assume that each race has an even skill distribution (20% with actual skill in each league). What will happen?

Well, if everybody picks the race they have an affinity for, then it will be perfectly balanced, but we’ll have 99% terran, 0.9% protoss and 0.1% zerg. To make this more interesting, consider that some of the players with a terran affinity choose to play other races (those with an affinity for protoss and zerg stick to their own races) until we have an equal 1/3 distribution between races. And assume that we still have an even skill distribution between those who chose to play each race.

What will happen?

The terrans will dominate the top of the ladder. A player with a terran affinity playing terran will be approximately one league higher than a player with a terran affinity playing protoss. So there will be few terrans in bronze, silver, gold and plat will see no real difference, but diamond will have a lot of terrans (remember, each race has a 33% share of the player pool). Since those with an affinity for the other races are so few, we can just ignore their effects.

So what should Blizzard do?

If Blizzard does nothing, people will point to terran dominating diamond league and cry imba. So let’s say Blizzard chooses to do something. They decide to nerf terran so that the 3-factor effective skill (which combines actual skill, racial affinity and racial imbalance) for a player with a terran affinity is equal to their actual skill – basically, they introduce a racial imbalance that negates terran’s racial affinity. They do this by nerfing terran compared to protoss and zerg.

What will be the outcome?

The leagues will stabilise with even ratios in each. There will be slightly more protoss and zerg in the higher leagues, but 0.9% and 0.1% will fly below our radar – it will be indistinguishable from noise. So all’s good, right?

Let’s go back and think about those protoss and zerg lovers we have been ignoring until now.

In our model, we have one top zerg, one top terran and one top protoss with equal skill level (I haven’t said it directly but you should have assumed so). They are equally good at the game and work equally hard, so if there were no imbalances, they should each have a 50% chance of winning against another. And since they’re at the very top, they’ll be our pro players in GSL. What will happen to these players?

Well, Blizzard has introduced a racial imbalance to balance the rest of the ladder. Unfortunately, that has massive repercussions for these players’ careers. Let’s break this down step by step.

First we have the players’ skills without racial imbalance or affinity. They’re equally skilled at #1 diamond.

Now we introduce the racial affinity and let these guys play their best races. Their equivalent skill is now one league higher than #1 diamond. In other words, they’re one league ahead of anybody with #1 diamond skill playing their off race. Their equivalent skills are still even – they still have a 50% chance of winning against each other.

Now Blizzard introduces a terran nerf to help balance the rest of the ladder. The zerg and protoss still have a 50% chance of winning against each other, but now terran’s 3-factor effective skill is one league lower – that terran is playing at #1 diamond while the zerg and protoss are playing one league higher in skill. So the terran will lose more than he wins against his equally skilled peers.
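The three steps above can be written out as a small Python sketch. Leagues are represented as integers and the one-league affinity bonus and one-league terran nerf are the values used in this example; the function and variable names are made up.

```python
def effective_skill(actual_skill, affinity, played_race, race_nerf):
    """3-factor effective skill: actual skill + affinity bonus - racial nerf.

    actual_skill: league level as an integer (hypothetical scale, 4 = diamond).
    affinity: the race the player is naturally better with.
    played_race: the race the player actually plays.
    race_nerf: dict of race -> leagues of skill removed by a balance nerf.
    """
    bonus = 1 if played_race == affinity else 0
    return actual_skill + bonus - race_nerf.get(played_race, 0)


nerf = {"T": 1}  # Blizzard's nerf that cancels the terran affinity on ladder

# The three top pros: equal actual skill, each playing the race they favour.
top_t = effective_skill(4, "T", "T", nerf)
top_p = effective_skill(4, "P", "P", nerf)
top_z = effective_skill(4, "Z", "Z", nerf)

# A typical ladder player with a terran affinity, on terran vs off-race.
ladder_on_t = effective_skill(2, "T", "T", nerf)
ladder_on_p = effective_skill(2, "T", "P", nerf)
```

The ladder players come out even, which is what the nerf was for, but the top terran ends up one league below the equally skilled top protoss and zerg.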

Why is this important? It’s not – unless you’re pro or plan on going pro.

This goes to show how hard it is to balance for ladder and for pros at the same time. If, as players often point out, most casual gamers are more comfortable with terran because the race is most similar to those of other RTS games, then it’s likely that terran pros will have a disadvantage against their peers if statistics show terran is balanced on the ladder. The magnitude of the disadvantage for the terran pro is equal to the average gain in performance due to players’ affinity for terran.

(On a personal note, this is why I love Jinro. Jinro fighting!)

So this is a tough one. If everybody has an affinity for a race, then that race is clearly overpowered. If even one person has an affinity for another race, then nerfing the race that most players have an affinity for gives that other person an edge in professional competition.

In other words, if you want to go pro, you have an edge if you’re better at a less conventional race and if Blizzard chooses to balance for the ladder rather than for the pro level. We’re also assuming that players choose to play the less intuitive races, but that shouldn’t be much of a stretch in my opinion.

If everyone stuck to their favoured races, then the ladder will be balanced with some intervention by Blizzard, but we will have a wonky race distribution. If players choose to play races they’re not good at (because “it’s more fun”, say) then it’s possible we’ll see Blizzard’s intervention if the overwhelming majority have an affinity for one race, resulting in them creating an imbalance that manifests at the pro level. This can also happen if players stick with their favoured races but Blizzard wants an even distribution of races (I haven’t shown the analysis but it’s similar to what we’ve already covered). But, yes, it is in fact possible for the races to be balanced and for one race to outnumber every other race.

(What, did you think I’d bust the myth saying that race numbers imply imbalance and then say that race selection can’t be due to personal preference?)

It’s only a small imbalance.

Statistics is not very intuitive and unfortunately it leads to many incorrect interpretations. Let me pose to you the following hypothetical situation:

A summarised report on college admission rates shows that 40% of college entrants are female and 60% of college entrants are male. Is this “good enough”?

To the untrained eye, the numbers look pretty close to 50%. But in reality this is a massive difference. The true numbers crop up when we compare them with each other. So out of 100 entrants, 40 are women and 60 are men. That means 150% as many men enter college as women (60/40 = 1.5). When presented that way, the magnitude of the difference becomes much more evident and it becomes clear there is a sexism problem with the college admission system.

Now back to SC2. Consider that Blizzard’s margin of error is ±5%. So if T wins 55% of games against Z, Blizzard considers that balanced. But (55-45)/45 ≈ 22%, so terran wins 122% as many games as zerg in TvZ. It’s still quite a difference.

So imbalances that we glean from win percentages are actually much larger than they first appear.
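The arithmetic is easy to reproduce for any win percentage. A minimal Python sketch (the function name is made up):

```python
def wins_per_opponent_win(win_rate):
    """How many games a race wins for every game its opponent wins.

    win_rate is the race's share of games won in the matchup (0 to 1,
    exclusive of 1), so the opponent's share is 1 - win_rate.
    """
    return win_rate / (1.0 - win_rate)


tvz_ratio = wins_per_opponent_win(0.55)      # 55/45, about 1.22
college_ratio = wins_per_opponent_win(0.60)  # 60/40, about 1.5
```

The ratio grows quickly as the win rate moves away from 50%: a 60% win rate already means 1.5 wins for every loss.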

Trends from only one region are a valid reflection on racial imbalance.

This one should be fairly self-explanatory. If two different regions show different win ratios for the races, then we cannot conclude that racial imbalance is the primary factor involved in either region from summary statistics alone.

One race takes more skill to play than another.

If we define this as meaning it’s easier to get to a certain league as one race or another, then our model certainly indicates that this is possible if one race is completely OP. However, whenever players make such a claim, it is usually a subjective opinion (biased towards their own race and prejudices) with no objective support.

From our analysis, we can see that certain conditions must be present for one race to take more skill to play than another:

Win rates are 50% between all races, i.e. the game looks balanced based on win rates. This is an unintuitive requirement that many QQers miss.
The UP race is overrepresented at the bottom of the ladder. We have already discussed how this may be difficult to identify properly.
The UP race is underrepresented at the top of the ladder. Look down from our upper statistical threshold. Is the race underrepresented? Look a bit lower – is the race now suddenly fairly represented?

If the above 3 conditions hold, then there is objective support for a race being UP and requiring more skill to play. Remember that underrepresentation is by comparison with how many players choose that race overall. So if 10% of all active players main terran, then 10% of terrans at the top of the ladder is fair representation.
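The three conditions can be checked mechanically, as in this Python sketch. The function name, the tolerances and the example shares are all assumptions for illustration; what counts as "close to 50%" or "underrepresented" in practice depends on how much data is available.

```python
def looks_underpowered(win_rates, share_overall, share_bottom, share_top,
                       rate_tol=0.02, share_tol=0.10):
    """Check the three conditions for a race being UP and harder to play.

    win_rates: the race's win rates against each of the other two races.
    share_overall: fraction of all active players who main the race.
    share_bottom: the race's fraction at the bottom of the ladder.
    share_top: the race's fraction just below the upper statistical threshold.
    rate_tol, share_tol: hypothetical tolerances for "about 50%" and
    "about fairly represented" (share_tol is relative to share_overall).
    """
    balanced_rates = all(abs(r - 0.5) <= rate_tol for r in win_rates)
    over_at_bottom = share_bottom > share_overall * (1 + share_tol)
    under_at_top = share_top < share_overall * (1 - share_tol)
    return balanced_rates and over_at_bottom and under_at_top


# Hypothetical race: 10% of players overall, 15% at the bottom, 4% at the top.
suspect = looks_underpowered([0.50, 0.49], 0.10, 0.15, 0.04)
# Hypothetical race that is 10% of players everywhere.
fair = looks_underpowered([0.50, 0.50], 0.10, 0.10, 0.10)
```

Keep in mind the earlier caveat about portrait farmers and smurfs: the bottom-of-ladder share is the least trustworthy of the three inputs.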

The players know better than Blizzard.

I have faith in Blizzard. Everything I’ve presented here is pretty basic stuff to anybody who’s done any maths and I’m sure Blizzard has statisticians helping the balance team who have worked all this out and have created much more rigorous models. As we’ve seen, sometimes an imbalance can create a different illusion at the player level that leaves other clues on the more objective statistical level. I’m sure Blizzard has already conducted its own in-house statistical analysis of the GSL that I bet many players would love to see. But that sort of thing is a lot of work for the community to replicate.

And let’s face it, the upshot of everything we’ve talked about is basically that Blizzard has one hell of a job ahead of them. :-)

Final Words

I have more analysis on specific mechanics within SC2 but am not sure if anybody would be interested in seeing them. The idea is to figure out whether or not it is possible to balance SC2 at all, and how the game would look once it is balanced. Once again, it’s a general analysis and avoids looking into current specifics on units, etc. I think the outcomes will surprise many players. Let me know if it will be worth writing that up.

And this might seem like a long-winded way of telling TL that I love Jinro, but, well…

usa11220

United States38 Posts

May 20 2011 05:10 GMT

best first post ever?

s[O]rry

Canada398 Posts

May 20 2011 05:10 GMT

Mind. Blown. Well done, sir. Good read.

CoSyN

United States122 Posts

May 20 2011 05:13 GMT

Nice first post. Quite amazing...

sKo

United States45 Posts

May 20 2011 05:15 GMT

You should post a PDF or some other format so I can save this without all the spoilers. It'll be easier to read closely.

edit: Because it's spot on. Great job, sir.

sOAvoid

Canada206 Posts

May 20 2011 05:16 GMT

Amazing.

Antedelerium

United States224 Posts

May 20 2011 05:21 GMT

Phenomenal first post. I loved reading the part analyzing critiques based on a player's affinity for a certain race. Very well written and very intriguing. I hope this motivates like-minded people to spend time doing their due diligence and putting in time researching the ladder before making any more bold claims as well as preventing others from shooting down research like this.

Nik0

Uruguay460 Posts

May 20 2011 05:21 GMT

On May 20 2011 14:10 usa11220 wrote:
best first post ever?

Probably

DarkPlasmaBall

United States45164 Posts

May 20 2011 05:23 GMT

If this became the new standard for opening threads...

There would be exactly 1 thread on TeamLiquid.

Awesome job, I like your explanations

mister.bubbles

Canada171 Posts

May 20 2011 05:26 GMT

#10

Great post! I think the various imbalances on maps are noteworthy as well. I spent a lot of time wondering to myself how Blizzard balanced Brood War so well when they stopped making balance patches so early in the development of the game, and eventually concluded that the balance we see in BW today is more due to the talent of map makers. I think the racial imbalances in SC2 and BW vary more from map to map than they do overall, which means that if the game is already exciting to play and watch (which I think in SC2 will be improved), the focus should be more on turning out balanced maps for the competitive scene than on balancing the races.

Jombozeus

China1014 Posts

May 20 2011 05:29 GMT

#11

First of all, you completely ruined your legitimacy with the myth busting. It was completely unnecessary, biased, and based on nothing but personal experience. You do not bust myths with your opinion, that's ridiculously counter-intuitive.

Secondly, this is unfortunately the wrong way to look at imbalance. There is much more to imbalance than those meaningless statistics. The "dip" or the upper statistical threshold you speak of assumes that the level of skill follows a linear trend for all three races. As in, say the maximum of a race is 100%, three players playing at the same level are all playing at 80%. This is not true. It is possible that an 80% zerg will lose overwhelmingly to an 80% terran, but a 95% zerg will beat a 95% terran most of the time.

This can be due to a subtle timing, an increase in mechanics breaking a threshold of the amount of units you need at a certain time for said timing, or whatever other reason(s).

We cannot assume this "dip" of players follows the same trend as the actual maximum potential for the races. We also do not know what 100% of a race's potential is like, and on top of that, we do not know when another paradigm shift will come along and ruin your long-term hypothesis.

rift

1819 Posts

May 20 2011 05:31 GMT

#12

In line with your own basic assumption, a lot of your post contains personal bias towards what you believe.

Imbu

United States903 Posts

May 20 2011 05:32 GMT

#13

Holy.... This is an amazing first post. Half way through it and I can't believe you went through everything that cleanly.

I can only wish I could write and analyze at your level.

Severian

Australia2052 Posts

May 20 2011 05:32 GMT

#14

On May 20 2011 14:10 usa11220 wrote:
best first post ever?

How could you possibly have read and digested the OP in less than 2 minutes?

sinani206

United States1959 Posts

May 20 2011 05:32 GMT

#15

mind=blown

This is why postcount doesn't matter.

sam!zdat

United States5559 Posts

May 20 2011 05:35 GMT

#16

If I'm understanding you correctly, you are assuming that the effects of racial imbalance scale linearly with that player's skill?

That is, you are assuming that XvZ is as imbalanced in bronze as it is in diamond? Particularly in the case of things like spellcasters a certain level of skill might be required for the imbalance to be felt. Also strategies that are metagame dependent will only be successful in a league where that metagame obtains... There are a number of factors which make me think at first glance that racial imbalance would not be particularly likely to be a constant factor across leagues.

for example, cannon rushes might mean that protoss is absurdly overpowered in bronze league, despite being balanced in grandmasters.

As something of a sidenote, I think the most interesting and probably most rigorous way to look at the ladder is as an ecological system.

edit: great post btw

lazyo

Germany90 Posts

May 20 2011 05:37 GMT

#17

I appreciate your effort but ladder play is not a good basis for balance analysis, especially not for the master/GM league since pros use it mostly as a practice tool and do not 100% play to win.
Instead, data from tournaments should be gathered.

Jombozeus

China1014 Posts

May 20 2011 05:38 GMT

#18

On May 20 2011 14:35 sam!zdat wrote:
If I'm understanding you correctly, you are assuming that the effects of racial imbalance scale linearly with that player's skill?

That is, you are assuming that XvZ is as imbalanced in bronze as it is in diamond? Particularly in the case of things like spellcasters a certain level of skill might be required for the imbalance to be felt. Also strategies that are metagame dependent will only be successful in a league where that metagame obtains... There are a number of factors which make me think at first glance that racial imbalance would not be particularly likely to be a constant factor across leagues.

for example, cannon rushes might mean that protoss is absurdly overpowered in bronze league, despite being balanced in grandmasters.

As something of a sidenote, I think the most interesting and probably most rigorous way to look at the ladder is as an ecological system.

Yes, this is what I was saying. I couldn't have thought of a better way to put it as a linear scale. When I said normal distribution, this is the concept I was thinking of haha.

Normal distribution would have assumed differently given different standard deviations.

Off to edit my post.

gogogadgetflow

United States2583 Posts

May 20 2011 05:42 GMT

#19

So are you going to be analyzing the tlpd or what?

Snuggles

United States1865 Posts

May 20 2011 05:42 GMT

#20

Sharks jumping in so fast to bring discredit to this poor guy. I honestly don't care if he's right or wrong. I just agree with the way he thinks. Nice job.
