Model for imbalance, with myths

Warble

May 20 2011 05:08 GMT

Warning: Maths.

Aim

The SC2 community is passionate about balance: many players feel their race is disadvantaged, while others feel that the game is balanced. What the 2 groups have in common is that their positions are based on feelings. More recently, we have begun to see forays into more objective evidence using statistical data about the game.

Unfortunately, statistics can be counterintuitive (like the rest of maths, in my opinion), so it’s easy for misconceptions to arise amongst those who don’t use it regularly. I think it’s cool that people have started to use data to back up their beliefs, and hope to see more, so my contribution here is to improve the quality of analysis by laying some framework for the community, and also to dispel a few common misconceptions.

For a start, I believe that beginning with the idea of imbalance is the wrong approach. What a lot of people do is begin with the idea of imbalance, and then seek data to back up their opinion. It is better to develop an objective system that we can apply independent of our beliefs. That’s what I aim to start here, and I hope that we can develop and improve it.

I believe most people will find the outcomes surprising. I have also tried to be thorough, but errors may have snuck through, so I will try to correct any errors as they are pointed out. Feel free to build on this or even create something better.

Here I will try to present some unbiased ways to think about balance in SC2. Some ways I will try to be unbiased are:

I will avoid reference to the current state of the game (jokes exempted) and just focus on general concepts.
I will present the methodology for identifying balance without having looked at the data. The idea here is that once we have developed an objective way to identify balance and applied it to the data, we must accept whatever conclusions it leads to. I invite everyone who wants to contribute to this to avoid looking at the data as much as possible too.
I will not propose any balance changes, either directly or by insinuation.

In my examples I will be using terran as the bad guy for illustrative purposes only, not because they’re necessarily overpowered. The Myth Busting section also has me profess my love for Jinro. Everybody loves the bad guy.

Background

First, a bit about me: I played zerg in season 1 (the reaper era) and switched to terran in season 2. All right, enough about me.

As for this analysis, since I started playing SC2, I have found the idea of balancing a game fascinating and have been playing around with the mathematics involved. I tried to develop some simple models to use when thinking about the game, but the level of complexity of the game is quite high. My aim was to be able to derive simple ideas about balance without having to resort to simulations.

So I’ll present here a simple model that doesn’t use any of that, but will provide the groundwork to help think about game balance, how it will affect players, and how it might be detected from game statistics. I’m sure Blizzard already has something much better in place, so this is aimed mostly at helping those in the SC2 community who seek to better understand the statistics related to this game.

Who should care?

If you’re interested in an objective way of thinking about game balance, this could be an interesting read for you, especially if you have a background in maths/statistics (although in that case this will all be very basic for you).

If you’re thinking about going pro, you could find this helpful in making your decision – or helping you to choose your race. I personally believe that SC2 is impossible to balance as it is, and I might make another post explaining why if there are requests for it (I’m hoping HOTS will introduce different mechanics that will change things). If you’re going pro, you’ll be staking your future on the ability of Blizzard to balance the game – or perhaps to make your race OP.

If you run a website that gathers and presents SC2 statistics, you might consider a few of these ideas when developing tools for your site if your aim is to contribute to SC2 discourse.

If you are tired of balance discussions, you might gain a few ideas here.

If you would like to know how we can identify imbalance in the game using available statistics, go ahead.

Most of you should jump to the Outcomes tab and I think the Myth Busting tab can be interesting to a wider audience.

I think the vast majority of players won’t gain much from reading the technicalities. They’re there more so that those who understand them can review my analysis to make sure it’s tight and maybe even build on it. But we all know that even if you don’t have a technical background you’re going to skim through it anyway, so I’ve tried to make it easy to read and understand.

Preliminaries

Don’t worry, there won’t actually be that much maths here. It’s used mostly for convenience, because I still want to make this analysis accessible to many players. It’s just easier to present using concrete examples people can picture in their heads than with generalised equations.

Let’s start by defining our league distribution matrices for each race. We’ll consider bronze, silver, gold, platinum and diamond for a neat 20% split between our elements, with shorthand B, S, G, PL and D.

To make things easier, assume that each race has 100 players and has an even distribution between skill levels, i.e.:

T = P = Z = [20 20 20 20 20]

We assume that the probability of player i beating player j is a function f(x) in [0,1] where x = (player i's effective skill – player j’s effective skill) and that f’(x) > 0 for x > 0. We also stipulate that for x < 0, f(x) = 1 – f(|x|).

For those quivering in their boots right now, we’re basically saying that the higher skilled player is more likely to win the larger the skill gap, and then we just made sure our function was consistent in both directions.
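To make this concrete, here is one hypothetical function that satisfies those conditions. The post never commits to a particular f, so the logistic shape and the `scale` parameter below are purely my assumptions for illustration; any increasing curve with the same symmetry would do.

```python
import math

def win_probability(skill_gap: float, scale: float = 1.0) -> float:
    """Probability that a player with the given skill advantage wins.

    The logistic form and `scale` are assumptions. The model itself only
    requires f to be increasing and to satisfy f(-x) = 1 - f(x).
    """
    return 1.0 / (1.0 + math.exp(-skill_gap / scale))

# Print f(x) and f(-x) side by side; each pair should sum to 1.
for gap in (0, 1, 2):
    print(gap, round(win_probability(gap), 3), round(win_probability(-gap), 3))
```

A gap of 0 gives exactly 50%, and the two columns for each gap add up to 1, which is the "consistent in both directions" condition.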

We’ll also assume a uniform distribution of skill gaps, i.e. moving from B to S is just as difficult as moving from S to G.

We’ll assume that players are matched randomly against those around their rank, i.e. on average someone won’t get a lucky streak and have 50 games in a row against a race where they have a favourable racial imbalance (since it will be counterbalanced by those without lucky streaks).

Players will be put in a league where their effective skill means they have an average 50% chance of winning and losing against their equally ranked peers.

We also have some definitions:

Actual skill: the skill level of a player when there are no imbalances of any kind.

Effective skill: the skill level of a player after benefiting from imbalances. This will be matchup specific.

Racial imbalance: the imbalance in a race. Will be matchup specific. It contributes to a player’s effective skill. Keep in mind that we’ll only be applying this factor to one race for each imbalance to avoid double counting it.

Absolutely overpowered: a race is equally overpowered against both other races. So terran is absolutely overpowered if it is OP against Z and OP against P, and its level of OPness is the same against both races.

Don’t worry, we’ll be relaxing some of these assumptions later and you’ll get to enjoy more complex analysis. Won’t that be fun?

Analysis

We will consider different scenarios for racial imbalances and consider both the short-term effects on statistics and the long-term equilibrium.

For simplicity, let’s consider that the racial imbalance raises terran’s effective skill one league higher than his actual skill. So a G terran will be playing at the PL level. All the terrans will basically get promoted and take the place of their unfortunate peers in other races, so we’ll get:

Case 1

Terran is made OP against both Z and P. Z and P are balanced against each other.

Long-term equilibrium

T = [6 20 20 20 34]

P = Z = [27 20 20 20 13]

Those of you with a background in maths can probably already see that matrices are not the best way to present this. For the rest of you, don’t worry; the conclusions we draw from this will still hold for a more rigorous model.

Since this is the first case, I’ll walk through the process and trust that if you can read and understand this, you can apply it to the other cases yourself.

So why does this shift happen? Because there are 300 players and Blizzard wants 60 players in each league. Each PL terran now plays at the diamond level while the D zergs/protosses still play at the D level. (Remember, we want to avoid double counting, although this technicality doesn’t actually matter: it just changes the magnitude of the racial imbalance a little bit and our conclusions stay the same.) The problem is that with our matrices, we don’t have a continuum but rather 5 pegs to put players in. So the D zergs will lose a lot to the actual D terrans and be evenly matched against the actual PL terrans. So the PL terrans will be promoted. But they won’t all be promoted because there’s not enough room. Why? Because not enough zergs and protosses will be demoted.

The PL terrans will also be losing a lot against the actual D terrans (the same amount as the D zergs will be losing to actual D terrans, in fact). So we have 20 actual D terrans, 20 effective D terrans, 20 effective D protosses and 20 effective D zergs vying for 60 places. The 20 actual D terrans stay in D while the 60 remaining players have equal claim to the 40 places that remain. So each group of 20 gets 2/3 of its players in. 2/3 of 20 is about 13. We’ll round a bit by giving the remainder to terran because precise numbers don’t really matter – this is for illustration.

So what’s the outcome? Terran should dominate top D – the other races simply can’t compete. The top 20 players are all terrans. Then we have an even distribution of races in the lower 2/3rds of diamond, randomly allocated.

In bronze, we have the bottommost B being zerg and protoss – bottom 40 bronze is randomly allocated between the two races. Top bronze has terran in it with an even distribution as the other races.

In between, from S to PL there’s no difference.
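If you’d rather check these numbers than take my word for it, here is a rough sketch that ranks everyone by effective skill and cuts the ladder into equal leagues. The function name, the evenly spaced skills and the `resolution` parameter are my own illustrative choices, not anything Blizzard does.

```python
def league_counts(shifts, players_per_race=100, leagues=5, resolution=300):
    """Count players of each race per league, lowest league first.

    `shifts` maps race -> how many leagues its effective skill sits above
    its actual skill. Every race starts evenly spread over the leagues.
    `resolution` (sample points per race) is only a numerical detail.
    """
    ladder = []
    for race, shift in shifts.items():
        for i in range(resolution):
            actual = leagues * (i + 0.5) / resolution   # evenly spread skill
            ladder.append((actual + shift, race))
    ladder.sort()                                       # worst to best
    per_league = len(ladder) // leagues                 # equal-sized leagues
    weight = players_per_race / resolution              # players per sample
    counts = {race: [0.0] * leagues for race in shifts}
    for rank, (_, race) in enumerate(ladder):
        league = min(rank // per_league, leagues - 1)
        counts[race][league] += weight
    return counts

# Case 1: terran plays one full league above its actual skill.
case1 = league_counts({"T": 1.0, "P": 0.0, "Z": 0.0})
for race, row in case1.items():
    print(race, [round(n, 1) for n in row])
```

Terran ends up with about 33.3 of its 100 players in diamond and about 6.7 in bronze, with the middle leagues untouched. The matrices above are the same thing with the rounding nudged in terran’s favour.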

How does it feel?

Once at equilibrium, the game will feel balanced to the players even though it objectively isn’t. The only way to tell will be from examining the data for the top and bottom levels. More on this in a later section.

Short-term effects

Let’s say this was the result of a Blizzard patch and that the game used to be balanced. Then we will see:

Spike in TvX win rate, which then evens out at 50%.
PvZ win rates constant at 50%.

Note that these are win rates against equally ranked opponents (not equally skilled). This is important because remember our purpose is to see how we can find these patterns in data, and we can observe a player’s rank and win ratio but cannot observe their skill – actual or effective.
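The "spike, then back to 50%" pattern can be sketched with a toy promotion loop. Everything here is an assumption for illustration: the logistic f, the `speed` constant and the update rule are stand-ins of mine, not how Battle.net actually moves players.

```python
import math

def f(gap):
    """Assumed logistic win probability for a given effective-skill gap."""
    return 1.0 / (1.0 + math.exp(-gap))

def tvx_win_rates(bonus=1.0, speed=2.0, steps=40):
    """TvX win rate against equally ranked opponents after a patch.

    `shift` is how far terrans have already been promoted relative to the
    other races. It grows while terrans win more than half their games.
    `speed` is a made-up promotion rate, not a Battle.net parameter.
    """
    shift, history = 0.0, []
    for _ in range(steps):
        tvx = f(bonus - shift)           # edge shrinks as terrans are promoted
        overall = (2 * tvx + 0.5) / 3    # TvZ, TvP and the TvT mirror
        history.append(tvx)
        shift += speed * (overall - 0.5)
    return history

rates = tvx_win_rates()
print([round(r, 3) for r in rates[:5]], "...", round(rates[-1], 3))
```

The first value is the spike straight after the patch. The later values drift down towards 50% as terrans get promoted into opponents who cancel out the bonus.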

If terran was made to be UP against both other races, we just see the reverse of everything.

Case 1A

Say we’re currently in Case 1, then Blizzard recognises that terran is OP and nerfs it appropriately.

Long-term equilibrium

T = P = Z = [20 20 20 20 20]

We just return to our balanced state.

Short-term effects

Spike in XvT win rate (crossing the 50% mark), which then evens out at 50%.
No change in PvZ win rates.

So it’s just the reverse.

Case 2

TvZ now favours T. TvP and ZvP are balanced.

Long-term equilibrium

T = [14 20 20 20 26]

P = [20 20 20 20 20]

Z = [26 20 20 20 14]

Since this one is a bit different to Case 1, I’ll explain this one too.

To understand why it looks like this, consider what happens if T gets promoted halfway up and Z gets demoted halfway down. TvZ would be even based on effective skill (they’ll win 50% against each other), but T is facing P that’s half a league higher so will be losing more than half of their games. This means that overall T will be winning less than 50% of their games. And similarly Z will still be winning more than 50% of their games.

So T will be promoted less than halfway up and Z will be demoted less than halfway down. Because of the way we’ve defined f(x), we know this move must be 1/3rd of the way because then T will be playing Zs 1/3rd lower in effective skill and Ps 1/3rd higher and it will balance out.

How about the protosses? They will be playing against terrans with a lower effective skill (since there’s no imbalance in TvP) and so will be winning more, but at the same time will be playing against zergs with a higher effective skill. Due to the symmetry, this will even out so the protosses will not move.

So we have 20/3 terrans moving up from PL to D. We can just round this to 6 terrans move up and 6 zergs move down.
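For anyone who wants to see the 1/3rd fall out numerically, here is a small sketch that searches for the shift where terrans win exactly half their games. The logistic f is an assumption of mine, and I also assume that Z is demoted by the same amount T is promoted, as in the reasoning above.

```python
import math

def f(gap):
    """Assumed logistic win probability. Any increasing f with
    f(-x) = 1 - f(x) leads to the same equilibrium."""
    return 1.0 / (1.0 + math.exp(-gap))

def terran_overall_win_rate(shift, bonus=1.0):
    """Average terran win rate against equally ranked opponents when T has
    been promoted by `shift` leagues and Z demoted by the same amount."""
    tvz = f(bonus - 2 * shift)   # bonus minus the actual-skill gap to Z peers
    tvp = f(-shift)              # P peers now have more actual skill
    tvt = 0.5                    # mirror matchup
    return (tvz + tvp + tvt) / 3

def equilibrium_shift(bonus=1.0, tolerance=1e-12):
    """Bisect for the shift at which terrans win exactly half their games."""
    low, high = 0.0, bonus
    while high - low > tolerance:
        mid = (low + high) / 2
        if terran_overall_win_rate(mid, bonus) > 0.5:
            low = mid            # still winning too much: promote further
        else:
            high = mid
    return (low + high) / 2

shift = equilibrium_shift()
print(round(shift, 4), round(20 * shift, 2))
```

The search settles on a shift of one third of a league, which for 20 players per league is the 20/3 terrans mentioned above.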

So what’s the outcome? T and P both dominate top D. T will have an edge in wins at top D, followed by P, followed by Z. Since zerg was shifted 1/3rd of the way down, we have to go 1/3rd of the way down diamond until we see zergs, whereupon there will be an even distribution between all 3 races. Note that T doesn’t necessarily dominate the top more than P because of the way we’ve defined how players are matched up – terrans will be winning only 50% against their protoss peers at the top. We’ll look at this in more detail later.

The opposite story occurs at bottom B.

How does it feel?

T would think that P is OP, P would think that Z is OP, and Z would think that T is OP. We get rock-paper-scissors with the races.

Short-term effects

Let’s say this was the result of a Blizzard patch and that the game used to be balanced. Then we will see:

Spike in TvZ win rate, which then evens out higher than 50%
Gradual rise in ZvP and PvT win rates to the same level that the TvZ win rate eventually settles.

Case 2A

Say Blizzard recognises that TvZ is OP and nerfs it appropriately.

Long-term equilibrium

T = P = Z = [20 20 20 20 20]

We just return to our balanced state.

Short-term effects

Spike in ZvT win rate (crossing the 50% barrier to the other side), which then evens out to 50%.
ZvP and PvT win rates gradually settle back to 50%.

Case 2B

Say Blizzard recognises that TvZ is OP and buffs ZvT, but unintentionally buffs ZvP.

This case effectively balances TvZ and TvP but makes ZvP OP. It’s the same as Case 2, but with Z being the villain. So it still feels like rock-paper-scissors.

Short-term effects

Spike in ZvP win rate, which then evens out higher than 50%.
Drop in TvZ win rate (crossing the 50% barrier to the other side), which then climbs back above 50% and evens out at the same level as ZvP.
PvT win rates (already above 50%) even out to the same level as ZvP (above 50%).

Case 2C

Say Blizzard thinks PvT is OP and nerfs PvT.

This effectively makes TvP and TvZ OP, so we get Case 1.

Short-term effects

Spike in TvP, which evens out at 50%.
TvZ and ZvP gradually return to 50%.

Case 2D

Say Blizzard thinks PvT is OP and nerfs PvT, which unintentionally nerfs PvZ.

This effectively makes TvP and TvZ OP, and ZvP OP too. So we get Case 1, followed by Case 2 on top of that, with Z as the villain in Case 2 so P is demoted a bit while Z is promoted a bit. Terran dominates the top, P dominates the bottom, and all 3 races feel like they’re in rock-paper-scissors.

Short-term effects

Spike in TvP, which evens out above 50%.
Spike in ZvP, which evens out at the same level as TvP.
TvZ (which was above 50%) gradually evens out to the level of TvP.

Case 2E

Say Blizzard thinks PvT is OP and buffs TvP, which unintentionally buffs TvZ.

This effectively makes TvP and TvZ OP, with TvZ being doubly OP. So we get Case 1 followed by Case 2, with terran being the villain in both cases. This will be similar to Case 2D in how it feels like rock-paper-scissors, with T dominating the top and Z dominating the bottom.

Short-term effects

Spike in TvP, which evens out above 50%.
ZvP (which was above 50%) evens out at the same level as TvP.
TvZ (which was above 50%) spikes and then evens out to the level of TvP.

Case 2F

Say Blizzard thinks ZvP is OP and nerfs ZvP.

This effectively makes TvZ and PvZ OP and TvP balanced. So we get Case 1, except with Z being UP rather than OP.

Short-term effects

Spike in PvZ, which evens out to 50%.
TvZ and ZvP (which were above 50%) even out to 50%.

Case 3

TvZ favours T. ZvP favours Z. PvT is balanced.

Long-term equilibrium

T = [14 20 20 20 26]

P = [26 20 20 20 14]

Z = [20 20 20 20 20]

This is similar to Case 2. T will move up relative to Z, but be held back by P. Z will move up relative to P, but be held back by T. P will move down but be bolstered up by T. Initially Z will still be winning 50% of its games and so will not move. So T will move up and P will move down. T will move up until its wins against Z match its losses against P. Since there are even numbers of all 3 races, T will move up the same magnitude P moves down. So we can see that T will move up 1/3rd of a slot and P will move down 1/3rd of a slot while Z doesn’t move.

Overall we can see that it looks similar to Case 2.

How does it feel?

T would think that P is OP, P would think that Z is OP, and Z would think that T is OP. We get rock-paper-scissors with the races.

Short-term effects

Let’s say this was the result of a Blizzard patch and that the game used to be balanced. Then we will see:

Spike in TvZ win rate, which then evens out higher than 50%
Spike in ZvP win rate, which then evens out higher than 50%.
Gradual rise in PvT win rate, which evens out somewhere between the above 2 levels (with our assumptions, all 3 levels will be identical).

This case is very similar to Case 2, distinguishable only because there are spikes in two matchups instead of one.

Case 4

TvZ favours T. ZvP favours Z. PvT favours P.

Long-term equilibrium

T = [20 20 20 20 20]

P = [20 20 20 20 20]

Z = [20 20 20 20 20]

This shouldn’t be surprising to anyone by now. Everyone is winning and losing the same amount.

How does it feel?

T would think that P is OP, P would think that Z is OP, and Z would think that T is OP. We get rock-paper-scissors with the races.

Short-term effects

Let’s say this was the result of a Blizzard patch and that the game used to be balanced. Then we will see:

Spike in TvZ win rate, which then stays spiked.
Spike in ZvP win rate, which then stays spiked.
Spike in PvT win rate, which then stays spiked.

A closer look at rankings and pairings.

Sometimes it’s not the race that’s imbalanced but the player, and this particularly applies to high level games like GSL. Zerg isn’t overpowered, Idra is. It extends below the GSL, however, down through Grandmasters and perhaps even into Masters. With high level games we cannot use summary statistics to interpret imbalance – we must use more rigorous statistical analyses. Nothing short of that is acceptable for an objective analysis of game balance. So I would recommend that any use of high-level statistics involve some form of statistical analysis and anything that presents solely summary statistics should be disregarded.

But what about summary statistics for the lower leagues? We’ve identified that we can look at the distributions of the races between leagues and glean some information from that – but how do we differentiate between when the distribution is due to imbalance and when it might be due to player skill?

Let’s consider the simplest case. Everybody plays terran and is ranked on the ladder, and players are only paired against those immediately above and below them on the ladder. And everybody plays the same number of games. This should be fine since we’re only using this to look at how the stats for the top ladder players will look, and it should be safe to say they will all play a lot.

Consider the case where everybody has the typical distribution of skills we have been using, and then we add Idra to the ladder. Idra will place 1st and have an 80% win rate against the 2nd player. The 2nd player will have a 20% win rate against Idra and a ~50% win rate against the 3rd player, giving him an overall 35% win rate. The 3rd player down will have a roughly 50% win rate.

We can immediately see 2 phenomena here. Firstly, the 2nd player has a poor win/loss ratio. Secondly, the lowest bronze player has a 50% win/loss ratio. Note that the bronze player is just a quirk with no real ramifications – to make this more like real life, we would add in the lowest bronze players in the same way we’re adding in Idra.

The more interesting aspect comes from the 2nd best player’s win ratio. He won’t be demoted because then the 3rd player will have an even worse ratio against Idra, and so on.

Let’s see if we can smooth this out. Say players can be paired against those 2 ranks up and down. For the sake of simplicity, let’s assume f(1) = 55% and f(2) = 60% unless you’re Idra, whereupon f(1) = 80% and f(2) = 85%.

Thus Idra will have an 82.5% win rate. Player 2 will have a 45% win rate. Player 3 will have a 43.75% win rate. And players 4 and down will all have a 50% win rate. Let’s be a bit more rigorous with this. We’ll define the Idra factor I that increases the skill gap in determining win probabilities. Then R(x) is the win/loss ratio of player ranked x.


R(1) = 1/2 * [f(1+I) + f(2+I)]
R(2) = 1/3 * [f(1) + f(2) + 1 – f(1+I)]
R(3) = 1/4 * [f(1) + f(2) + 2 – f(1) – f(2+I)] = 1/4 * [2 + f(2) – f(2+I)]
R(4) = 1/4 * [f(1) + f(2) + 2 – f(1) – f(2)] = 1/4 * 2 = 1/2
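The same numbers can be reproduced with a short sketch that averages each player’s win probability over everyone within 2 ranks. The function and parameter names are mine, and the 8-player ladder is an arbitrary size chosen just so the top 4 ranks have a full set of opponents.

```python
def ladder_win_rates(f1=0.55, f2=0.60, idra_f1=0.80, idra_f2=0.85, players=8):
    """Expected win rate for each rank when players are paired evenly with
    everyone within two ranks. Rank 1 (index 0) is Idra."""
    def beats(higher, lower):
        """Probability that the higher-ranked player wins (indices from 0)."""
        gap = lower - higher                      # 1 or 2 ranks apart
        if higher == 0:                           # Idra gets his own odds
            return idra_f1 if gap == 1 else idra_f2
        return f1 if gap == 1 else f2

    rates = []
    for me in range(players):
        results = []
        for other in range(max(0, me - 2), min(players, me + 3)):
            if other == me:
                continue
            if me < other:
                results.append(beats(me, other))
            else:
                results.append(1 - beats(other, me))
        rates.append(sum(results) / len(results))
    return rates

for rank, rate in enumerate(ladder_win_rates()[:4], start=1):
    print(f"R({rank}) = {rate:.4f}")
```

With the default values this gives 82.5%, 45%, 43.75% and 50% for the top 4 ranks, matching the figures quoted above. Changing `idra_f1` and `idra_f2` shows how the dip for Players 2 and 3 depends on how dominant Idra is.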

Idra will have a high win ratio no matter what. Player 2 will have a low win ratio when:


1/3 * [f(1) + f(2) + 1 – f(1+I)] < 1/2
=> f(1) + f(2) + 1 – f(1+I) < 3/2
=> f(1) + f(2) – f(1+I) < 1/2
=> f(1+I) > f(1) + f(2) – 1/2

i.e. when Idra’s probability of winning against Player 2 is very high. The more even Idra’s win probability is, the closer this gets to 50%.

Player 3 will have a low ratio when:


1/4 * [2 + f(2) – f(2+I)] < 1/2
=> 2 + f(2) – f(2+I) < 2
=> f(2) < f(2+I)

This is unavoidable unless Idra is just a player of typical skill.

Fancy that, the 3rd best player on ladder inevitably has a bad win/loss ratio. And people think their ratios are actually indicative of their skill?

Is it possible to manipulate the pairings to give the players at the top of the ladder positive win ratios? Whatever we do, it would have to skew things downwards. The problem here comes from symmetry. If someone is more likely to be paired against someone lower down, that means the people lower down are more likely to be paired against someone higher up, due to the symmetric nature of the pairing system (if you’re paired against someone, that means they’re also paired against you). For example, if we allow Players 2 and 3 to be paired more against Players 4, 5, 6 and 7 and less against Idra, then that means Players 4, 5, 6 and 7 now have a higher probability of being paired against Players 2 and 3 and lower probability of being paired against Players 8+. So we have effectively just transferred our low ratio down to other players and spread it out a bit.

One quirk in our current analysis is that this also assumes that Idra plays less than Players 2 and 3. If he plays more, their weightings against him will be higher and their ratios even lower.

If we introduce a second pro, say a clone of Idra who is not quite as good so that the top of the ladder goes Idra, Idrax, Player 3, Player 4…, then we will get a similar situation.

This allows us to come up with a general conclusion. For our analysis to apply, particularly when looking at race distributions on the ladder, we want to exclude players who are so good they could be considered an imbalance on their own. To do that, we need to look down the ladder until players begin to have a 50% win ratio.

An interesting side point in our analysis is that even at the very top levels of play, players will have low win/loss ratios that are not statistical flukes.

Due to the Bnet match making system, at equilibrium, players outside of the very top and very bottom have an expected 50% win/loss ratio and any deviations are statistical flukes (even if half the jellybeans in the bag are red, if we pull out 10, it’s not uncommon to see 6 or even 7 red jellybeans). We expect top players to have a positive win/loss ratio, but as we have seen here, we also expect a buffer zone between professional level players and amateur players where the top amateur players will have a win/loss ratio below 50%.
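The jellybean claim is easy to check with the binomial formula. This sketch assumes each draw is an independent 50/50 (a very large bag, or drawing with replacement), which is my simplification of the example.

```python
from math import comb

def prob_exactly(reds: int, draws: int = 10, p_red: float = 0.5) -> float:
    """Binomial probability of drawing exactly `reds` red jellybeans."""
    return comb(draws, reds) * p_red**reds * (1 - p_red)**(draws - reds)

p6 = prob_exactly(6)
p7 = prob_exactly(7)
p_six_or_more = sum(prob_exactly(k) for k in range(6, 11))
print(f"P(6 red) = {p6:.3f}, P(7 red) = {p7:.3f}, P(6 or more) = {p_six_or_more:.3f}")
```

Under those assumptions, 6 or more reds out of 10 turns up more than a third of the time, so a 60% win ratio over 10 games tells you very little on its own.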

Yet these players with sub-50% win/loss ratios are both by rank and by nature superior players to their peers with 50% win/loss ratios. Their low ratios come from the fact that they’re so good they’re the only amateurs who get matched against professional level players. This low ratio comes from the assumption that there is a significant skill gap between amateur players and professional players. We know that at the top level, skill gaps can be huge because Idra has a huge win/loss ratio.

Note that if this skill gap didn’t exist, we have already seen that even the top ladder players should have a fairly consistent 50% win/loss ratio.

Also note that if players in a lower league (say, diamond) have a good win/loss ratio, there’s also the possibility that they’re not in the correct league. This could happen if they have been practising in customs for 500 games, for example. Their ratio will likely remain positive all season. This introduces an extra element of care when interpreting win/loss ratios if the player had risen from lower ranks that season.

Effects of uneven race distributions.

So far we have assumed that an equal number of players play each race, i.e. their ratios are 1:1:1. But what if the ratios were 2:1:1 in favour of some race? We can intuitively see that in cases where one race is absolutely overpowered (Case 1), there won’t be much change in our analysis. It becomes a bit more interesting in Case 2:

Say the races are distributed 2:1:1 and TvZ is OP by 1 level. Our initial balanced distribution:

T = [40 40 40 40 40]

P = Z = [20 20 20 20 20]

And then after the imbalance is introduced:

Long-term equilibrium

T = [30 40 40 40 50]

P = [20 20 20 20 20]

Z = [30 20 20 20 10]

If we take a cross section of terrans from, say, gold league, we see that 60/80 of their matchups don’t change (against T and P) and they have an edge in 20/80 of their matchups. For Z, 40/80 of their matchups don’t change and they have a disadvantage in 40/80 of their matchups.

As terrans are promoted, 40/80 of their matchups are balanced, they have an edge in 20/80 of their matchups and a disadvantage in 20/80 of their matchups. Say T rises x% of the way up one slot (from G to PL) and Z drops y% of the way down. The normalised magnitude of their advantage is represented by the skill gap (100-x-y). Similarly, their disadvantage is represented by the skill gap x.

The problem is that we don’t know how the probability of winning f(k) actually looks. For now we can consider the limiting condition where f’’(k) = 0 for all k, i.e. f is a straight line, so that a player’s edge is directly proportional to the skill gap. (I changed it to k just for this so nobody confuses it with the x that represents the terran shift here.)

They reach an equilibrium when:


20/80 * (100-x-y ) = 20/80 * x
=> 2x = 100 – y

As zergs are demoted, 20/80 of their matchups are balanced, they have an edge in 20/80 of their matchups and a disadvantage in 40/80 of their matchups. Using similar logic to terrans, we reach an equilibrium when:


40/80 * (100-x-y) = 20/80 * y
=> 200 – 2x – 2y = y
=> 3y = 200 – 2x
=> 3y = 200 – 100 + y
=> y = 50

Therefore x = 25

So terrans move 25% of the way up while zergs move 50% of the way down. Terrans made up 4/8th of diamond; now they make up 5/8th.

This hasn’t given us a very interesting result since it provides us with nothing fundamentally different to Case 2. The magnitudes of the moves changed as we would expect.

Now let’s see what happens if the ratio of P players changes. Consider our initial condition:

T = [40 40 40 40 40]

P = [30 30 30 30 30]

Z = [20 20 20 20 20]

And after the introduction of imbalance:


20/90 * (100-x-y) = 30/90 * x
=> 5x = 200 – 2y
=> x = 40 – 2/5 * y

40/90 * (100-x-y) = 30/90 * y
=> 400 – 4x – 4y = 3y
=> 400 – 160 + 8/5 * y = 7y
=> 240 = 27/5 * y
=> y = 400/9 ~ 44

Therefore x = 22

Long-term equilibrium

T = [32 40 40 40 48]

P = [30 30 30 30 30]

Z = [28 20 20 20 12]

The increased presence of a balanced third race stabilises the movements. Again, nothing spectacular.
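Both sets of equations above have the same shape, so they can be solved in one go. Here is a sketch using exact fractions; the function name is mine, and it takes the proportional-edge simplification used in this section as given.

```python
from fractions import Fraction

def case2_shifts(terrans, protosses, zergs):
    """Return (x, y): percent of a league T moves up and Z moves down.

    Solves the two balance conditions from the post,
        zergs   * (100 - x - y) = protosses * x
        terrans * (100 - x - y) = protosses * y
    by substituting x = zergs*g/protosses and y = terrans*g/protosses,
    where g = 100 - x - y is the remaining TvZ skill gap.
    """
    total = terrans + protosses + zergs
    gap = Fraction(100 * protosses, total)
    x = Fraction(zergs, protosses) * gap
    y = Fraction(terrans, protosses) * gap
    return x, y

for counts in [(40, 20, 20), (40, 30, 20), (20, 20, 20)]:
    x, y = case2_shifts(*counts)
    print(counts, "x =", round(float(x), 1), "y =", round(float(y), 1))
```

The first two rows reproduce x = 25, y = 50 and x ≈ 22, y ≈ 44. The third row is the even 1:1:1 split, where both shifts come out at a third of a league, which agrees with Case 2.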

Now, the problem here is that we had initially only required f’(k) > 0 for k > 0, which is a much looser condition than the limiting one used above. It isn’t so much of a problem because all that means is that what we have found are the maximum shifts. The actual shifts will be lower, i.e. people are more apt to remain in their leagues.

There is one important change to our conclusions if the racial distributions are not even, and that is in Case 4. Instead of having an even distribution of races across leagues, there will be some shifts due to different races being promoted and demoted more if they’re played less than the other races.

What about if the magnitudes of imbalances aren’t even?

This mostly affects Case 4. If TvZ is more OP than ZvP, for example, then we will get uneven shifts, similarly to if the race distributions aren’t even.

What does this mean for SC2 statistics?

Since our analysis was done on ladder rankings, our conclusions should only apply to the ladder. Statistical analysis is the best way to analyse high level statistics. To find the top level on the ladder where we can begin seeking conclusions from summary statistics, we should continue down the ladder from the top until we see players begin to consistently have a 50% win/loss ratio. Ideally we should see a little dip below a 50% win/loss ratio to know that we’ve hit the right spot, although this may be difficult to spot. We shall call this our upper statistical threshold.
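As a rough sketch of how that scan down the ladder might look in practice: the `tolerance` and `run_length` values below are arbitrary choices of mine for what counts as "consistently" 50%, and the ladder itself is a made-up example, not real data.

```python
def upper_statistical_threshold(win_ratios, tolerance=0.02, run_length=3):
    """Return the 1-based rank where players start to sit at roughly 50%.

    `win_ratios` lists win ratios from the top of the ladder down.
    `tolerance` and `run_length` are arbitrary illustrative choices for
    what counts as "consistently" 50%.
    """
    for start in range(len(win_ratios) - run_length + 1):
        window = win_ratios[start:start + run_length]
        if all(abs(ratio - 0.5) <= tolerance for ratio in window):
            return start + 1
    return None  # never settles: not enough data to pick a threshold

# Hypothetical ladder: one dominant player, a dip below 50%, then ~50%.
ladder = [0.825, 0.45, 0.4375, 0.50, 0.51, 0.49, 0.50]
print(upper_statistical_threshold(ladder))
```

On this made-up ladder the threshold lands at rank 4, just below the dip. On real data the ratios will be much noisier, so the tolerance and run length would need some thought.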

I think it can’t be stated enough: GM and top masters is a bad place to be looking for racial imbalance.

We have identified patterns for two types of data:

Win/loss ratios for matchups (e.g. TvZ).
Race distribution near our statistical thresholds.

Note that the short-term effects don’t require a balanced initial position. They are the result of a shock (such as a Blizzard balance change) and aren’t affected by the initial state (whether it was balanced or imbalanced). In the case that the win rates weren’t at 50%, then we should measure the spike as a relative change.

We should also note that there are no observable impacts on distributions in the middle leagues. The only places where we can observe these patterns are in bottom bronze and top diamond (i.e. masters).

Let’s first consider what each long-term equilibrium means. Remember that we begin by looking down from our upper statistical threshold.

All races fairly represented

Note that this doesn’t require a 1/3rd split. If terran makes up 50% of the population, then we expect terran to be represented 50% in bronze and diamond. So here we’ll introduce the idea of a race being fairly represented if its presence in a league is equal to the ratio of players who choose that race.
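One way to put a number on "fairly represented" is to divide a race’s share of a league by its share of all players. The function below is my own sketch of that idea, run on the 2:1:1 equilibrium from the uneven race distributions section.

```python
def representation(league_counts, league):
    """Ratio of each race's share of one league to its share of all players.

    1.0 means fairly represented, above 1.0 means over-represented.
    `league_counts` maps race -> counts per league, lowest league first.
    """
    total_players = sum(sum(counts) for counts in league_counts.values())
    league_players = sum(counts[league] for counts in league_counts.values())
    result = {}
    for race, counts in league_counts.items():
        overall_share = sum(counts) / total_players
        league_share = counts[league] / league_players
        result[race] = league_share / overall_share
    return result

# The post's 2:1:1 example with TvZ favouring T.
example = {
    "T": [30, 40, 40, 40, 50],
    "P": [20, 20, 20, 20, 20],
    "Z": [30, 20, 20, 20, 10],
}
print(representation(example, league=4))   # diamond
print(representation(example, league=0))   # bronze
```

In that example terran comes out at 1.25 in diamond even though it is half the player base, protoss sits at exactly 1.0, and zerg is at 0.5 in diamond and 1.5 in bronze.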

From what we’ve seen, this can mean that all the races are perfectly balanced, or we have a complete circle of imbalances like Case 4. The way to tell is by whether or not we have a rock-paper-scissors situation in the win ratios. If it feels normal, then it’s balanced. If each race has a favoured matchup, then we have a circle of imbalances. It should be easy to determine the direction of the circle and balance this.

One race dominates the top

This means that race is OP against both other races. Now, there can be nested imbalances. So if terran dominates the top, that means T is OP but PvZ can also be OP. We can tell by looking at the bottom because we expect Z to dominate the bottom if that’s the case (refer to Case 2D). We can also tell by looking at the racial win ratios. If it feels like rock-paper-scissors, then there’s probably also a nested imbalance.

One race dominates the bottom

This means that race is UP against both other races. Once again, there can be nested imbalances, so if terran dominates the bottom, or it feels like rock-paper-scissors, there can still be imbalances in PvZ. It’s just the reverse of the above case. In general, for any imbalance that is detectable from the top, its opposite is detectable from the bottom.

Two races dominate the top

This means that there is an imbalance in at least one matchup. The imbalance is against the race that is missing from the very top. If there’s no rock-paper-scissors situation, then the missing race is absolutely UP and it should dominate the bottom.

Now, if we have a rock-paper-scissors situation, we know there is at least one imbalanced matchup, and there may be two, and if there aren’t even numbers of players playing each of the 3 races, we may even have a circle of 3 imbalances. We can identify the race that is OP against the missing race by looking at the win ratios to see which race wins more than 50% of its games against the missing race. The race we identify should also be absent from the bottom.

If we have access to the short-term data, then we can count the number of spikes and work out from there whether we have one, two or a circle of racial imbalances.

Two races dominate the bottom

This is just the reverse of the above case.

If the 3 races dominate each other in rock-paper-scissors

We know that at least one of the matchups is broken, and it’s still possible for one race to be completely UP or completely OP. We need to look at the top and bottom of the ladder to gain more information and arrive at one of the above cases.

The short-term data also provides us with valuable clues. Remember that we only use data below our upper statistical threshold.

Any patch that corrects an imbalance results in a spike in wins that makes the originally UP race look OP.

If Z was UP against both T and P, its win ratio of, say, 40% against both races will spike to, say, 60% soon after a patch and then gradually drop to 50%. If Z was only UP against T, then we had the rock-paper-scissors situation and a proper patch will cause this spike only in ZvT, after which all 3 win ratios will gradually move to 50%.

This is a good way to keep track of whether or not a patch rebalances the game. If there is no spike past the 50% mark to make it temporarily look like the previously UP race is now OP, then balance has not been fully restored.

The only time this doesn’t happen is in Case 4, in which we just see a spike straight back down to 50% if we balance perfectly.
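
The overshoot check described above can be sketched in a few lines. This is only an illustration: the function name and the sample win rates are assumptions, and it does not handle the Case 4 exception, where no overshoot is expected.

```python
def overshoots_after_patch(pre_patch_rate, post_patch_rates):
    """True if a previously losing matchup climbs past 50% after the patch.

    pre_patch_rate: win rate (fraction) of the formerly UP race before the patch
    post_patch_rates: its win rates in successive periods after the patch
    """
    if pre_patch_rate >= 0.5:
        raise ValueError("expected a race that was below 50% before the patch")
    return any(rate > 0.5 for rate in post_patch_rates)


# Hypothetical series: 40% before the patch, then a spike that decays to 50%.
full_fix = overshoots_after_patch(0.40, [0.60, 0.55, 0.51, 0.50])
# Hypothetical series that creeps up but never passes 50%.
partial_fix = overshoots_after_patch(0.40, [0.45, 0.48, 0.50])
```

Under the model, the first series suggests the patch fully restored balance, while the second suggests it did not.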

At a balanced state, the win ratios should all be 50%.

The smaller the imbalance, the harder it is to detect. At very small levels it will be indistinguishable from noise. I suspect that most players will underestimate the size of the imbalance that’s required for detection – it is likely to be higher than most players think. On the other hand, most players don’t ignore noise and so tend to see imbalance where there is only noise.

So it’s easy, right?

Firstly, we should point out that our conclusions only apply when analysing statistics for active players. Analysis on balance that includes players who were not active during the period under consideration is invalid. The statistics, such as win rates, should all be from that period.

We’ve identified the problems with using data from the top and have identified a way to find the upper statistical threshold, but there’s a problem with the bottom too. The existence of portrait farmers and smurfs makes it difficult to use data from the bottom. Even if we assume their races are randomly distributed, they add a significant level of noise. It may be possible to cut out portrait farmers by applying the skill gap argument to the bottom and finding a lower statistical threshold from which to begin, but the existence of smurfs still makes this data unreliable. Smurfs are more annoying because they can exist in all leagues, although their presence in the top leagues is moot. This means that any reliable summary statistics we use for analysis should only come from the highest league below our upper statistical threshold.

This could be one reason Blizzard is cracking down on portrait farmers and smurfs – to make their balancing job easier.

So I would be cautious in accepting analysis on statistics from the bottom of the ladder. We probably cannot use any statistics from the bottom of the ladder when drawing our conclusions.

Now we should highlight one interesting outcome from our model, which is:

When matchups seem like rock-paper-scissors, this always indicates that there is imbalance in at least one matchup. And if the league distributions between races are uneven, then we can also make the following conclusion: if there is at least one racial imbalance beyond complete OPness, then we will have rock-paper-scissors at equilibrium. This allows us to rule out some claims of imbalance.

What about matchup win percentages? Well, there are problems there too. Since strategies take some time to evolve, we are unlikely to see many spikes immediately, so all the shifts will look gradual. Which means we need to be looking at the steady-state win percentages.

We know that if one race is completely UP or OP, then steady-state win percentages will be 50% - it’s impossible to tell it apart from a perfectly balanced game.

So what does it mean when our data shows that one race has a high win rate against both other races? Well, that’s interesting because:

There is no steady-state win rate where one race has a high win ratio against both other races.

In other words, the state of the game is still in flux if this is the case. A high win ratio against both other races thus signifies that race being buffed against both other races, or the development of new strategies for that race that have made it more powerful against the others. It should eventually drop to 50%.

This raises a major difficulty in analysing summary statistics, which is that the state of the game is often likely to be in flux. New strategies will be developed, the metagame may change, some players will improve and others will be out of practice. There’s not much that can be done about this except to hope that there is enough data available that this won’t be as big a deal, and I’m sure Blizzard already has more sophisticated mechanisms in place.

One thing we haven’t considered much is the possibility of players switching races, but that is relevant here. This shouldn’t affect our analysis much but it provides an alternative explanation for one race suddenly performing well against both other races – or performing poorly.

If we assume that a player switching races will go on a losing streak until they even out, then if we get a lot of terran players switching to protoss, then protoss will show a dip in its win ratio against both other races for some time before it evens out. If we further add in the concept where the player acclimatises to a race before climbing back to their initial spot on the ladder (i.e. they attain an equal level of mastery over the new race as they had over the old one), then protoss will first show a dip below 50%, and then a rise above 50% against both other races before evening out.

We are focusing more attention here because this situation (where one race shows win rate dominance against both other races) seems to occur a lot on the ladder. So I will summarise the possibilities for this:

A new Blizzard patch made the race OP against both other races. This should only be temporary.
A new strategy or tactical exploit was discovered for the race that is effective against both other races. This could be a new way to win, or a new way to negate the other races’ strategies. This should also only be temporary.
Many players simultaneously decide to switch to that race (causing a dip) and then learn how to play that race well (a rise). This should only be temporary. Note that the reverse may also occur (players switching away from a race), with the opposite effect.

No race should retain a high win ratio against both other races for long.

Similarly, we note:

No race has a high win ratio against any other race at equilibrium unless all the races also have high win ratios against another race.

This should be pretty self-evident. We should note that this relates right back to the rock-paper-scissors. This allows us to draw the conclusion:

Win rates will either be 50% or in a state of rock-paper-scissors at equilibrium.
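
To make this conclusion concrete, here is a minimal sketch that sorts a set of three matchup win rates into the two stable states or "in flux". The function name, the labels and the 3% noise band are my own assumptions for illustration, not values taken from any data.

```python
def classify_win_rates(tvz, zvp, pvt, noise=0.03):
    """Classify three matchup win rates (fractions between 0 and 1).

    tvz: terran's win rate against zerg
    zvp: zerg's win rate against protoss
    pvt: protoss's win rate against terran
    noise: hypothetical band around 0.5 treated as indistinguishable from 50%
    """
    rates = (tvz, zvp, pvt)
    if all(abs(r - 0.5) <= noise for r in rates):
        return "balanced-looking"
    # A cycle: every race wins one of its matchups and loses the other.
    if all(r > 0.5 + noise for r in rates) or all(r < 0.5 - noise for r in rates):
        return "rock-paper-scissors"
    return "in flux"
```

For example, `classify_win_rates(0.60, 0.50, 0.40)` describes terran beating both other races, which the model says cannot be a steady state, so it is labelled "in flux". Note that "balanced-looking" is deliberately not "balanced": as discussed above, 50% win rates can also hide a race that is completely OP or UP.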

But wait! I saw a graph that violates one of our conclusions!

These conclusions apply to ladder because that’s what we modeled. If those graphs were of the ladder, then there must be more to the story the author didn’t look into, which is a sign of bias on their part. When you must choose between a subjective analysis made with an agenda or an objective analysis that can be applied to any general situation, it’s usually better to stay with objectivity.

So whenever somebody points to a set of ladder statistics that shows one race clearly dominating the others with a high win rate, then we must look towards recent patches, advances in that race’s strategies/exploits or players switching away from the overperforming race to explain the discrepancy. If the high performance was preceded by an underperformance with no patches in between, we may also seek an explanation in players who switched to that race some time ago and are now rising back to their proper rank. If they show us a graph of one race dominating just one other race, we should keep our eye on the other matchups. A spike in only one matchup would alert us to seek an explanation in a balance patch or a new strategy/exploit, which could well be an imbalance. A gradual change could mean that it’s caused by a different matchup being imbalanced.

If they have used data from the top level, then their analysis must be disregarded since it is invalid for reasons we have already discussed. This obsession with the top might seem intuitive, but our analysis shows that any effects of imbalance on win ratios should be consistent through all of the leagues, and that if they go too high then they cannot draw any valid conclusions without conducting a proper statistical analysis. Thus it might be worthwhile just analysing racial win/loss data for diamond league (as separated from master league), using them as a threshold for when strategic and tactical elements become a major influence on performance. We are probably right to be hesitant to use data from lower down, where mechanical factors might dominate everything else.

One place where these conclusions need adjusting is when interpreting results at the top level, say GSL. In that case you really need to do a detailed statistical analysis. Summary statistics are useless for drawing any conclusions there because player skill becomes an important factor that must be considered in the analysis, and you just can’t do that using summary statistics. Any graphs of the GSL or GM or top masters (down until players begin to have a 50% win/loss ratio) without a proper statistical analysis are useless. I talk a bit more about factors that come from and can be applied to the GSL in the Myth Busting section. 100% of people who read it will end up loving Jinro (sample size: 1).

Outcomes

Remember that our conclusions apply in cases where we attempt to interpret summary statistics. A proper statistical analysis has fewer limitations.

Top level data cannot be analysed using summary statistics. It requires a proper statistical analysis to draw any valid conclusions. Disregard any analysis that attempts to use data from high masters above the level where players begin to consistently have a 50% win/loss ratio since any conclusions they draw will likely be invalid.
There should be a gulf near GM/top masters where players experience a dip in their win/loss ratio below 50%. This should dispel the myth that win/loss ratios mean anything, even in master league.
The best place to take data is around this dip, where players begin to consistently have an overall 50% win/loss ratio. This is our upper statistical threshold. Above this point, statistical analysis is required.
Any effects of balance on racial win/loss ratios should be fairly consistent across leagues.
Imbalance doesn’t affect distributions in the silver, gold and platinum leagues.
Portrait farmers and smurfs make interpreting imbalance using data below the top leagues difficult. It could be one motivation behind Blizzard’s crackdown – to make their job of balancing the game a bit easier. Thus it is best if we only analyse racial win ratio data from the top leagues (while remaining below our upper statistical threshold).
If one race is absolutely overpowered or absolutely underpowered, the game can still feel balanced and have 50% win ratios between races. The only way to tell is by looking towards our upper statistical threshold to see how the races are represented.
If the game feels like rock-paper-scissors (which should be confirmed by the racial win ratios), then there is definitely an imbalance. We can gain clues regarding the imbalance by looking at the upper statistical threshold.
Aside from 50% win ratios or rock-paper-scissors win ratios, there are no other stable racial win ratios on ladder. Any variations from these stable ratios are the result of balance changes, changes in strategies/discoveries of tactical exploits, or players switching races.

Extensions

We haven’t accounted for the fact that players in certain leagues might prefer some races over another. Maybe once players reach diamond they become comfortable enough to pick up zerg, for example. This sort of situation can arise if there is a steep initial learning curve for a race. This does not necessarily make the race harder to play at the higher leagues, however. For example, the learning curve can take the form of required knowledge. If Blizzard made zerg completely balanced except they automatically lose a game unless they begin a spawning pool by the 5-minute mark (say, their buildings spontaneously explode if a spawning pool had never been started, but nothing happens if it had been started then cancelled), then we would see zerg UP until the skill level where players have this required knowledge, after which it is perfectly balanced. More realistic examples would be knowledge of timing pushes by the other races.

There are probably many more considerations like this one. Neglecting this was intentional and is not a flaw in the analysis. Our analysis shows us the best case scenario for our ability to interpret any data gleaned from SC2 summary statistics. Factors such as this one confound the situation and further restrict our ability to interpret data. Hence:

This analysis shows the limit of what is possible to be interpreted using summary statistics. Many factors make application even more difficult, with more restrictions.

In other words, we’ve identified the conclusions that it’s possible to draw from summary statistics in an ideal world. When things diverge from the ideal, players are even more restricted in the conclusions they can draw. This should discourage the use of summary statistics in balance discussions.

And it makes Blizzard’s job harder too. This goes to show how difficult it would actually be to balance SC2. Something to think about if you’re a pro or considering going pro.

We also haven’t considered the possibility of players switching races. We discussed this briefly, since the effects should generally be intuitive, and it’s unlikely to have much effect on our conclusions.

It would be interesting if we can get this model represented more rigorously. However, it is likely that in the face of the limitations in summary statistics that we have presented here, players’ interest will shift towards statistical analysis of high level play.

Myth Busting

I’m confident I’m not the only one who feels that most of the balance discussion in the SC2 community is clearly biased and thinly veiled QQ. Players often point out insignificant points or resort to ad hominem attacks. I hope this section can help to encourage community members to just outright ignore those sorts of posts, or perhaps encourage stricter moderation in community forums.

Only masters/pros should talk about balance.

I am of the opinion that no player can truly talk about balance. Psychological studies have found that everybody has an inherent bias that favours themselves regardless of how unbiased they try to be. In other words, any balance suggestion by a pro will necessarily try to make his own race OP rather than to achieve objective balance, no matter how well-intentioned the pro is. This is especially true for pros, whose livelihoods depend on winning tournaments. No, they’re not bad people. This works at an unconscious level and happens to everyone.

Also, just because someone is in masters doesn’t mean their arguments will be sound.

”Fixing” this mechanic will balance the game.

Since Blizzard uses objective data to balance their games, if they did nerf a race as many players wish, it would result in that race becoming UP and receiving other buffs to compensate.

Personal experiences matter.

We’ve all seen it.

A: TvZ is so hard!
B: Actually, TvZ is my best matchup.

They’re both right.

You need a bigger sample size.

This is often used by those who have learned a bit about statistics in high school. It’s really a cheap point, and the people who use this demand sample sizes in the thousands, and once given that will still not be satisfied and demand samples in the millions. The required sample size depends on what you plan to use it for, and it also depends on the nature of the sample itself. This point is also quite meaningless but I won’t go into the details. The bottom line is, I would be interested in a statistical analysis with a sample size as low as 50 if the results were significant. You don’t need that big a sample if your point has substance. A larger sample allows you to detect smaller imbalances, though.
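
To illustrate why 50 games can be enough when the effect is large, here is a minimal sketch of an exact two-sided binomial test against a 50% win chance. It is only about sample size: it assumes every game is independent and ignores player skill, so it is not the kind of detailed analysis that top-level data needs. The function name and the example counts are made up.

```python
from math import comb


def two_sided_binomial_p(wins, games, p=0.5):
    """Exact two-sided p-value for `wins` out of `games` under win chance p.

    Sums the probability of every outcome that is no more likely than the
    observed one (the usual exact binomial test).
    """
    def pmf(k):
        return comb(games, k) * p**k * (1 - p)**(games - k)

    observed = pmf(wins)
    # The small factor guards against floating-point ties being dropped.
    return sum(pmf(k) for k in range(games + 1) if pmf(k) <= observed * (1 + 1e-9))


# Hypothetical results from 50 games of one matchup.
p_large_gap = two_sided_binomial_p(35, 50)   # 70% observed win rate
p_small_gap = two_sided_binomial_p(28, 50)   # 56% observed win rate
```

In this sketch 35 wins out of 50 falls well below the usual 5% cut-off, while 28 out of 50 does not. The same 50 games are enough to flag a big gap but not a small one, which is the sense in which a larger sample lets you detect smaller imbalances.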

The caveat is that this requires a proper detailed statistical analysis, not just some pretty graphs. We need more of the following sort of analysis, although hopefully done…well, better:

http://www.teamliquid.net/forum/viewmessage.php?topic_id=219399&currentpage=All

I won’t go into the technical details because there were some very basic things that bothered me. In this case he'd analysed data over a number of patches, i.e. it includes data from patches where terran was generally agreed to be OP. Removing data from those patches in the analysis to analyse data from only current patches would provide a more accurate picture of the current state of the game. The analysis need not be restricted to just one patch period, however. It’s very difficult to do this properly with Blizzard patching the game so often.

I am also bothered by the fact that he didn’t include a table summarising his estimates for the parameters and their confidence intervals. I don’t like having data interpreted for me since we all know that statisticians lie and SC2 players are biased. I don’t think he was very clear on which parameters he used either.

I’ve got to hand it to the guy for trying, though. There were lots of naysayers in the thread, and we haven’t really seen any more efforts like this. I wish TL mods were harsher in this regard – what we need are more people putting in the effort to do stuff, and the current environment in the community discourages anybody from doing this. Who wants to go to all that work if their efforts are going to be dismissed without any valid reasons?

We can only find imbalance by looking at the top level of play.

The problem here is that people tend to post simple summary statistics and graphs and call it a day. Consider, for example, if this game only had terran and no other races. Our statistics from GSL would show MKP and MVP winning most of their games against other terrans. Conclusion? It’s not terran that’s OP, it’s the players.

Now add the other races and players back in. When MKP plays against a protoss and wins, how much is a result of his race and how much of it is a result of his hard work and raw talent? The summary statistics do not show this.

An analysis has to be perfect to prove a point.

We often see people nitpicking on the tiniest points, and it seems like a lot of discussion degenerates because of it. Often the minor points have no significant impact on the analysis and it’s a waste of time to discuss them. This is evident in the thread I linked to above where a TLer did his own statistical analysis. Yes, there were flaws, and some important flaws were pointed out, but some of the other “flaws” were also unimportant. I think a lot of it came from people trying to comment on something they didn’t understand, and trying to dismiss something because they didn’t like the conclusions.

If everybody, especially pros, chooses a race, then that race must be imbalanced.

Although variations in race choices can indicate imbalance (and I would certainly use it as a clue), it’s not sufficient evidence to draw the conclusion. Sure, people switch to races they believe are OP. But people can be wrong. You must have seen people make many clearly incorrect assertions. Perhaps they just enjoy playing a race. The differences in race preferences between regions can represent different cultural preferences.

Maybe people choose a race because they have an affinity for it.

This is a difficult idea, which we can flesh out by looking at the extreme cases. The claim is that discrepancies in the ratio of players who choose a race can be explained just from personal preferences rather than from selection due to imbalance. This is the opposite idea to the one posed above.

Say 100% of all players have an affinity for terran, so they perform better as terran than as protoss or zerg. Everybody would draw the conclusion that terran is OP and therefore should be nerfed, right? And for all purposes, this is a reasonable outcome.

But what if only 99% of players have an affinity for terran, and 0.9% have an affinity for protoss and 0.1% for zerg? Should we nerf terran? By how much? Let’s assume the players are equally skilled, except the affinity gives them an edge, say their equivalent skill race (which is found by combining their actual skill with their racial affinity) means they perform one league higher with that race. Assume that each race has an even skill distribution (20% with actual skill in each league). What will happen?

Well, if everybody picks the race they have an affinity for, then it will be perfectly balanced, but we’ll have 99% terran, 0.9% protoss and 0.1% zerg. To make this more interesting, consider that some of the players with a terran affinity choose to play other races (those with an affinity for protoss and zerg stick to their own races) until we have an equal 1/3 distribution between races. And assume that we still have an even skill distribution between those who chose to play each race.

What will happen?

The terrans will dominate the top of the ladder. A player with a terran affinity playing terran will be approximately one league higher than a player with a terran affinity playing protoss. So there will be few terrans in bronze; silver, gold and plat will see no real difference; but diamond will have a lot of terrans (remember, each race has a 33% share of the player pool). Since those with an affinity for the other races are so few, we can just ignore their effects.
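
The league shares in this toy model can be worked out directly. The sketch below is a simplification under stated assumptions: everyone has a terran affinity (the protoss and zerg lovers are ignored), each race is a third of the pool, actual skill is spread evenly over five leagues, and a player's league is simply their actual skill plus one if they play terran, capped at diamond. The names are mine and purely illustrative.

```python
LEAGUES = ["bronze", "silver", "gold", "platinum", "diamond"]


def terran_share_by_league(affinity_bonus=1):
    """Share of terran players in each league under the toy affinity model."""
    top = len(LEAGUES) - 1
    terran = [0.0] * len(LEAGUES)
    others = [0.0] * len(LEAGUES)
    for skill in range(len(LEAGUES)):
        weight = 1 / len(LEAGUES)      # 20% of each race sits at each skill level
        # Terran players land one league above their actual skill (capped).
        terran[min(skill + affinity_bonus, top)] += weight
        # Protoss and zerg players together, with no affinity bonus.
        others[skill] += 2 * weight
    return {
        league: terran[i] / (terran[i] + others[i])
        for i, league in enumerate(LEAGUES)
    }


shares = terran_share_by_league()
```

Under these assumptions terran makes up none of bronze, one third of silver, gold and platinum, and half of diamond, which matches the pattern described above.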

So what should Blizzard do?

If Blizzard does nothing, people will point to terran dominating diamond league and cry imba. So let’s say Blizzard chooses to do something. They decide to nerf terran so that the 3-factor effective skill (which combines actual skill, racial affinity and racial imbalance) for a player with a terran affinity is equal to their actual skill – basically, they introduce a racial imbalance that negates terran’s racial affinity. They do this by nerfing terran compared to protoss and zerg.

What will be the outcome?

The leagues will stabilise with even ratios in each. There will be slightly more protoss and zerg in the higher leagues, but 0.9% and 0.1% will fly below our radar – it will be indistinguishable from noise. So all’s good, right?

Let’s go back and think about those protoss and zerg lovers we have been ignoring until now.

In our model, we have one top zerg, one top terran and one top protoss with equal skill level (I haven’t said it directly but you should have assumed so). They are equally good at the game and work equally hard, so if there were no imbalances, they should each have a 50% chance of winning against another. And since they’re at the very top, they’ll be our pro players in GSL. What will happen to these players?

Well, Blizzard has introduced a racial imbalance to balance the rest of the ladder. Unfortunately, that has massive repercussions for these players’ careers. Let’s break this down step by step.

First we have the players’ skills without racial imbalance or affinity. They’re equally skilled at #1 diamond.

Now we introduce the racial affinity and let these guys play their best races. Their equivalent skill is now one league higher than #1 diamond. In other words, they’re one league ahead of anybody with #1 diamond skill playing their off race. Their equivalent skills are still even – they still have a 50% chance of winning against each other.

Now Blizzard introduces a terran nerf to help balance the rest of the ladder. The zerg and protoss still have a 50% chance of winning against each other, but now terran’s 3-factor effective skill is one league lower – that terran is playing at #1 diamond while the zerg and protoss are playing one league higher in skill. So the terran will lose more than he wins against his equally skilled peers.

Why is this important? It’s not – unless you’re pro or plan on going pro.

This goes to show how hard it is to balance for ladder and for pros at the same time. Players often point out that most casual gamers will be more comfortable with terran because the race is most similar to those of other RTS games. If that’s true, then it’s likely that terran pros will have a disadvantage against their peers if statistics show terran is balanced on the ladder. The magnitude of the disadvantage for the terran pro is equal to the average gain in performance due to players’ affinity for terran.

(On a personal note, this is why I love Jinro. Jinro fighting!)

So this is a tough one. If everybody has an affinity for a race, then that race is clearly overpowered. If even one person has an affinity for another race, then nerfing the race that most players have an affinity for gives that other person an edge in professional competition.

In other words, if you want to go pro, you have an edge if you’re better at a less conventional race and if Blizzard chooses to balance for the ladder rather than for the pro level. We’re also assuming that players choose to play the less intuitive races, but that shouldn’t be much of a stretch in my opinion.

If everyone stuck to their favoured races, then the ladder will be balanced with some intervention by Blizzard, but we will have a wonky race distribution. If players choose to play races they’re not good at (because “it’s more fun”, say) then it’s possible we’ll see Blizzard’s intervention if the overwhelming majority have an affinity for one race, resulting in them creating an imbalance that manifests at the pro level. This can also happen if players stick with their favoured races but Blizzard wants an even distribution of races (I haven’t shown the analysis but it’s similar to what we’ve already covered). But, yes, it is in fact possible for the races to be balanced and for one race to outnumber every other race.

(What, did you think I’d bust the myth saying that race numbers imply imbalance and then say that race selection can’t be due to personal preference?)

It’s only a small imbalance.

Statistics is not very intuitive and unfortunately it leads to many incorrect interpretations. Let me pose to you the following hypothetical situation:

A summarised report on college admission rates shows that 40% of college entrants are female and 60% of college entrants are male. Is this “good enough”?

To the untrained eye, the numbers look pretty close to 50%. But in reality this is a massive difference. The true numbers crop up when we compare them with each other. So out of 100 entrants, 40 are women and 60 are men. That means 150% as many men enter college as women. When presented that way, the magnitude of the difference becomes much more evident and it becomes clear there is a sexism problem with the college admission system.

Now back to SC2. Consider that Blizzard’s margin of error is ±5%. So if T wins 55% of games against Z, Blizzard considers that balanced. (55-45)/45 = 22%. So terran wins 122% as many games as zerg in TvZ. It’s still quite a difference.
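
The arithmetic above generalises to any win rate. A small sketch (the function name is my own):

```python
def relative_advantage(win_rate):
    """Extra games the favoured side wins, as a fraction of the other side's wins."""
    loss_rate = 1 - win_rate
    return (win_rate - loss_rate) / loss_rate


tvz_gap = relative_advantage(0.55)        # (55 - 45) / 45, about 0.22
admissions_gap = relative_advantage(0.60)  # (60 - 40) / 40, which is 0.5
```

So a 55% win rate means about 22% more wins than the other side, and the 60/40 admissions split means 50% more men than women, i.e. 150% as many.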

So imbalances that we glean from win percentages are actually much larger than they first appear.

Trends from only one region are a valid reflection on racial imbalance.

This one should be fairly self-explanatory. If two different regions show different win ratios for the races, then we cannot conclude that racial imbalance is the primary factor involved in either region from summary statistics alone.

One race takes more skill to play than another.

If we define this as meaning it’s easier to get to a certain league as one race or another, then our model certainly indicates that this is possible if one race is completely OP. However, whenever players make such a claim, it is usually a subjective opinion (biased towards their own race and prejudices) with no objective support.

From our analysis, we can see that certain conditions must be present for one race to take more skill to play than another:

Win rates are 50% between all races, i.e. the game looks balanced based on win rates. This is an unintuitive requirement that many QQers miss.
The UP race is overrepresented at the bottom of the ladder. We have already discussed how this may be difficult to identify properly.
The UP race is underrepresented at the top of the ladder. Look down from our upper statistical threshold. Is the race underrepresented? Look a bit lower – is the race now suddenly fairly represented?

If the above 3 conditions hold, then there is objective support for a race being UP and requiring more skill to play. Remember that underrepresentation is by comparison with how many players choose that race overall. So if 10% of all active players main terran, then 10% of terrans at the top of the ladder is fair representation.
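
The three conditions can be written down as a simple check. This is a sketch only: the function name, the inputs and the 3% tolerance are assumptions chosen for illustration, and deciding what counts as "equal" in real data would need a proper statistical test.

```python
def supports_harder_race(win_rates, bottom_share, top_share, overall_share, tol=0.03):
    """Check the three conditions listed above for one race.

    win_rates: the race's win rates against each other race (fractions)
    bottom_share: the race's share of players at the bottom of the ladder
    top_share: the race's share just below the upper statistical threshold
    overall_share: the race's share of all active players
    tol: hypothetical tolerance for "equal", for illustration only
    """
    balanced_looking = all(abs(r - 0.5) <= tol for r in win_rates)
    over_at_bottom = bottom_share > overall_share + tol
    under_at_top = top_share < overall_share - tol
    return balanced_looking and over_at_bottom and under_at_top


# Hypothetical race played by 10% of active players.
harder = supports_harder_race([0.50, 0.51], bottom_share=0.20,
                              top_share=0.04, overall_share=0.10)
fair = supports_harder_race([0.50, 0.51], bottom_share=0.20,
                            top_share=0.10, overall_share=0.10)
```

In the second call the race makes up 10% of the top, the same as its 10% share of all players, so it is fairly represented there and the check fails.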

The players know better than Blizzard.

I have faith in Blizzard. Everything I’ve presented here is pretty basic stuff to anybody who’s done any maths and I’m sure Blizzard has statisticians helping the balance team who have worked all this out and have created much more rigorous models. As we’ve seen, sometimes an imbalance can create a different illusion at the player level that leaves other clues on the more objective statistical level. I’m sure Blizzard has already conducted its own in-house statistical analysis of the GSL that I bet many players would love to see. But that sort of thing is a lot of work for the community to replicate.

And let’s face it, the upshot of everything we’ve talked about is basically that Blizzard has one hell of a job ahead of them. :-)

Final Words

I have more analysis on specific mechanics within SC2 but am not sure if anybody would be interested in seeing them. The idea is to figure out whether or not it is possible to balance SC2 at all, and how the game would look once it is balanced. Once again, it’s a general analysis and avoids looking into current specifics on units, etc. I think the outcomes will surprise many players. Let me know if it will be worth writing that up.

And this might seem like a long-winded way of telling TL that I love Jinro, but, well…

usa11220

United States38 Posts

May 20 2011 05:10 GMT

best first post ever?

s[O]rry

Canada398 Posts

May 20 2011 05:10 GMT

Mind. Blown. Well done, sir. Good read.

CoSyN

United States122 Posts

May 20 2011 05:13 GMT

Nice first post. Quite amazing...

sKo

United States45 Posts

May 20 2011 05:15 GMT

You should post a PDF or some other format so I can save this without all the spoilers. It'll be easier to read closely.

edit: Because it's spot on. Great job, sir.

sOAvoid

Canada206 Posts

May 20 2011 05:16 GMT

Amazing.

Antedelerium

United States224 Posts

May 20 2011 05:21 GMT

Phenomenal first post. I loved reading the part analyzing critiques based on a player's affinity for a certain race. Very well written and very intriguing. I hope this motivates like-minded people to spend time doing their due diligence and putting in time researching the ladder before making any more bold claims as well as preventing others from shooting down research like this.

Nik0

Uruguay460 Posts

May 20 2011 05:21 GMT

On May 20 2011 14:10 usa11220 wrote:
best first post ever?

Probably

DarkPlasmaBall

United States45164 Posts

May 20 2011 05:23 GMT

If this became the new standard for opening threads...

There would be exactly 1 thread on TeamLiquid.

Awesome job, I like your explanations

mister.bubbles

Canada171 Posts

May 20 2011 05:26 GMT

#10

Great post! I think the various imbalances on maps are noteworthy as well. I spent a lot of time wondering to myself how Blizzard balanced Brood War so well when they stopped making balance patches so early in the development of the game, and eventually concluded that the balance we see in BW today is due more to the talent of map makers. I think the racial imbalances in SC2 and BW vary more from map to map than they do overall, which means that if the game is already exciting to play and watch (which I think in SC2 will be improved), the focus should be more on turning out balanced maps for the competitive scene than on balancing the races.

Jombozeus

China1014 Posts

May 20 2011 05:29 GMT

#11

First of all, you completely ruined your legitimacy with the myth busting. It was completely unnecessary, biased, and based on nothing but personal experience. You do not bust myths with your opinion, that's ridiculously counter-intuitive.

Secondly, this is unfortunately the wrong way to look at imbalance. There is much more to imbalance than those meaningless statistics. The "dip" or the upper statistical threshold you speak of assumes that the level of skill follows a linear trend for all three races. As in, say the maximum of a race is 100%, and three players playing at the same level are all playing at 80%. This is not true. It is possible that an 80% zerg will lose overwhelmingly to an 80% terran, but a 95% zerg will beat a 95% terran most of the time.

This can be due to a subtle timing, an increase in mechanics breaking a threshold of the amount of units you need at a certain time for said timing, or whatever other reason(s).

We cannot assume this "dip" of players follows the same trend as the actual maximum potential for the races. We also do not know what 100% of a race's potential is like, and on top of that, we do not know when another paradigm shift will arrive and ruin your long-term hypothesis.

rift

1819 Posts

May 20 2011 05:31 GMT

#12

In line with your own basic assumption, a lot of your post contains personal bias towards what you believe.

Imbu

United States903 Posts

May 20 2011 05:32 GMT

#13

Holy.... This is an amazing first post. Half way through it and I can't believe you went through everything that cleanly.

I can only wish I could write and analyze at your level.

Severian

Australia2052 Posts

May 20 2011 05:32 GMT

#14

On May 20 2011 14:10 usa11220 wrote:
best first post ever?

How could you possibly have read and digested the OP in less than 2 minutes?

sinani206

United States1959 Posts

May 20 2011 05:32 GMT

#15

mind=blown

This is why postcount doesn't matter.

sam!zdat

United States5559 Posts

May 20 2011 05:35 GMT

#16

If I'm understanding you correctly, you are assuming that the effects of racial imbalance scale linearly with that player's skill?

That is, you are assuming that XvZ is as imbalanced in bronze as it is in diamond? Particularly in the case of things like spellcasters a certain level of skill might be required for the imbalance to be felt. Also strategies that are metagame dependent will only be successful in a league where that metagame obtains... There are a number of factors which make me think at first glance that racial imbalance would not be particularly likely to be a constant factor across leagues.

for example, cannon rushes might mean that protoss is absurdly overpowered in bronze league, despite being balanced in grandmasters.

As something of a sidenote, I think the most interesting and probably most rigorous way to look at the ladder is as an ecological system.

edit: great post btw

lazyo

Germany90 Posts

May 20 2011 05:37 GMT

#17

I appreciate your effort, but ladder play is not a good basis for balance analysis, especially not for the master/GM league, since pros use it mostly as a practice tool and do not 100% play to win.
Instead, data from tournaments should be gathered.

Jombozeus

China1014 Posts

May 20 2011 05:38 GMT

#18

On May 20 2011 14:35 sam!zdat wrote:
If I'm understanding you correctly, you are assuming that the effects of racial imbalance scale linearly with that player's skill?

That is, you are assuming that XvZ is as imbalanced in bronze as it is in diamond? Particularly in the case of things like spellcasters a certain level of skill might be required for the imbalance to be felt. Also strategies that are metagame dependent will only be successful in a league where that metagame obtains... There are a number of factors which make me think at first glance that racial imbalance would not be particularly likely to be a constant factor across leagues.

for example, cannon rushes might mean that protoss is absurdly overpowered in bronze league, despite being balanced in grandmasters.

As something of a sidenote, I think the most interesting and probably most rigorous way to look at the ladder is as an ecological system.

Yes, this is what I was saying. I couldn't have thought of a better way to put it than as a linear scale. When I said normal distribution, this is the concept I was thinking of haha.

Normal distribution would have assumed differently given different standard deviations.

Off to edit my post.

gogogadgetflow

United States2583 Posts

May 20 2011 05:42 GMT

#19

So are you going to be analyzing the tlpd or what?

Snuggles

United States1865 Posts

May 20 2011 05:42 GMT

#20

Sharks jumping in so fast to bring discredit to this poor guy. I honestly don't care if he's right or wrong. I just agree with the way he thinks. Nice job.

FroZeNN

United States165 Posts

May 20 2011 05:42 GMT

#21

Wow, what an amazing thread. Would love a PDF just for an easier read; I got lost a couple of times in the spoilers.

After reading this I have some thinking to do hehe =)

RoninShogun

United States315 Posts

May 20 2011 05:45 GMT

#22

Edit: Sorry was under wrong Section

Yoshi Kirishima

United States10366 Posts

May 20 2011 05:46 GMT

#23

This is one of the best posts ever.

Dude, just wow! Good job, hopefully people will feel less confident now that they're better suited at balancing than Blizzard is.

Oh I think I found an error:

Under "It's only a small imbalance"

Now back to SC2. Consider that Blizzard’s margin of error is +- 5%. So if T wins 55% of games against Z, Blizzard considers that balanced. (55-45)/45 = 22%. So terran wins 122% more games than zerg in TvZ. It’s still quite a difference.

They win 122% of the number of games zerg wins, which is 22% more (100% would be the same number of games as zerg wins). Correct?
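The distinction is quick to check with the numbers from the quoted passage. This is just the arithmetic spelled out; the variable names are mine.

```python
t_wins, z_wins = 55, 45  # TvZ results out of 100 games, from the quoted example

ratio = t_wins / z_wins                           # terran wins as a multiple of zerg wins
percent_of = ratio * 100                          # "terran wins X% OF zerg's total"
percent_more = (t_wins - z_wins) / z_wins * 100   # "terran wins X% MORE than zerg"

print(f"{percent_of:.0f}% of zerg's wins, i.e. {percent_more:.0f}% more")
# prints "122% of zerg's wins, i.e. 22% more"
```

So "122% of" and "22% more" describe the same 55-45 split; "122% more" would mean terran winning more than twice as often as zerg.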

saus

United States59 Posts

May 20 2011 05:50 GMT

#24

He had to simplify the idea of balance to get some results. If balance WAS like this, we would get these results. It can still help us understand how game balance is manifested in league distributions etc...

Azzur

Australia6260 Posts

May 20 2011 05:52 GMT

#25

Great post! I read the entire thing and I understood most of it.

Now, what I'd be super interested in is whether you can come up with some sort of statistical method to interpret data at the highest (i.e. pro) level.

Primadog

United States4411 Posts

May 20 2011 05:53 GMT

#26

I love you. Are you taken?

Waking

United States46 Posts

May 20 2011 05:56 GMT

#27

Great post, but like a few people have stated, imbalances always appear/disappear at different levels of play.

If I may make a suggestion - instead of looking at the top players with 50% win ratio versus the bottom, you should look at the top players with 50% versus plat players. This would yield the same information. Assume that plat players represent the equilibrium ratios for the mid-leagues (silver-plat) and find that in comparison with masters. This helps to eliminate the bias I mentioned earlier.

Essentia

1150 Posts

May 20 2011 05:58 GMT

#28

Balance can only be measured at the highest of high level players.

Ladder statistics based on W/L are also a dumb way to measure stats.

avilo

United States4100 Posts

May 20 2011 06:00 GMT

#29

Just because someone made an incredibly long post does not make it mega awesome or even remotely accurate.

The most obvious thing completely wrong with the "post" is to not look at pros for balance. In any RTS game or game you always look at the top level for balance because these are the people playing the game at the highest level and are actively trying to "break the game."

Really all the OP is saying (but ironically not doing in many of his examples in the OP) is: "don't be biased with your balance judgements."

Nothing new...and there's no need to go into "intricate mathematics" or math at all for any of this...the OP is overcomplicating things, and likely has not enough experience to legitimately comment on balance or imbalance in the first place.

The most qualified people to talk about balance are the pro players and people high up on ladder that are playing the game everyday versus other good players.

But 99% of these players are trying to practice and improve themselves and not even worry about balance in the first place, though everyone QQ sometimes.

imo OP is just trying to re-invent the wheel on balance discussions aka having a discussion about how to discuss things lol...there's about one of these posts per month or so that pop up with some guy that thinks he's mega smart and mystical with "the maths" -_- there's just so many things wrong in the OP and ironically "biased."

Do we really need another thread discussing how we should be discussing things and hordes of low post count people going, "wow you're so smart and amazing."

Nice effort sure...but i think a bit misplaced. Also, the entire premise of the thread doesn't work because there is no definitive model for imbalance. The model everyone uses for imbalance is...guess what?

Their personal bias and opinion. Notice my use of italics for emphasis.

fire_brand

Canada1123 Posts

May 20 2011 06:00 GMT

#30

All that for Jinro. ^_^

Stellar read, hope to see more posts in the future.

On May 20 2011 15:00 avilo wrote:
Just because someone made an incredibly long post does not make it mega awesome or even remotely accurate.

The most obvious thing completely wrong with the "post" is to not look at pros for balance. In any RTS game or game you always look at the top level for balance because these are the people playing the game at the highest level and are actively trying to "break the game."

Really all the OP is saying (but ironically not doing in many of his examples in the OP) is: "don't be biased with your balance judgements."

Nothing new...and there's no need to go into "intricate mathematics" or math at all for any of this...the OP is overcomplicating things, and likely has not enough experience to legitimately comment on balance or imbalance in the first place.

The most qualified people to talk about balance are the pro players and people high up on ladder that are playing the game everyday versus other good players.

But 99% of these players are trying to practice and improve themselves and not even worry about balance in the first place, though everyone QQ sometimes.

imo OP is just trying to re-invent the wheel on balance discussions aka having a discussion about how to discuss things lol...there's about one of these posts per month or so that pop up with some guy that thinks he's mega smart and mystical with "the maths" -_- there's just so many things wrong in the OP and ironically "biased."

Do we really need another thread discussing how we should be discussing things and hordes of low post count people going, "wow you're so smart and amazing."

Math is fact ^_______^ Love to see people in the community supporting work our members do instead of harping on it.

ElusoryX

Singapore2047 Posts

May 20 2011 06:02 GMT

#31

you have the good makings of a writer...

AcrossFiveJulys

United States3612 Posts

May 20 2011 06:06 GMT

#32

First, thanks for taking the time to write this up. This is definitely several steps up from most OPs and will hopefully generate some interesting discussion.

I didn't read the whole thing because while it looks nicely formatted at first glance, it's actually cumbersome to read because of the nested spoilers. Please generate a PDF version of this. From an organizational standpoint, it would be good if it was easier to find the conclusions you are making, since you seem to have the attitude that your statements are backed by data and your model.

After reading your assumptions, I second the concerns brought up regarding the validity of your assumption that imbalance is invariant to player skill. For example, gold leaguers don't have the micro to execute decent marine splits, and thus a-moving banelings into marines at that level is much more cost efficient. There is a reason that this forum gets angry when it seems Blizzard is making balance changes that balance the game at lower levels but potentially imbalance the game at high levels (e.g., barracks requiring depot). This assumption is so off that I don't know if the rest of it is worth reading.

Yoshi Kirishima

United States10366 Posts

May 20 2011 06:12 GMT

#33

@avilo

I think he was trying to explain the details of analyzing balance only statistically. Though, analyzing something statistically is not necessarily the best way. So I guess that's what you mean by "misplaced"?

Essentia

1150 Posts

May 20 2011 06:12 GMT

#34

What rank/league is the OP?

Jombozeus

China1014 Posts

May 20 2011 06:15 GMT

#35

On May 20 2011 15:12 Yoshi Kirishima wrote:
@avilo

I think he was trying to explain the details of analyzing balance only statistically. Though, analyzing something statistically is not necessarily the best way. So I guess that's what you mean by "misplaced"?

Since the format of the post follows that of a scientific paper, it's only natural that it has to go through some sort of pseudo-peer-review. One cannot write a paper on evolution, use an erroneous method, and expect to receive praise.

In fact, some of the pseudo-science papers out there are amongst the most well laid out pieces of work I've seen. Doesn't make it any more credible.

windsupernova

Mexico5280 Posts

May 20 2011 06:15 GMT

#36

Great post OP, while there are some points I don't agree with I agree with several more.

Overall I feel like talking about imbalance and player skill is a waste of time because there are no feasible ways to measure those things, and people tend to get too emotional about it. Add to that the armchair designers, and a lot of the discussion turns into a crap fest.

Thanks for the post though it was pretty interesting.

GenoPewPew

United States347 Posts

May 20 2011 06:18 GMT

#37

Imbalance isn't even that noticeable. The people who should comment are people who play the game a lot more professionally rather than a bunch of theorycrafters who have bad mechanics

fire_brand

Canada1123 Posts

May 20 2011 06:22 GMT

#38

I think a lot of people were not reading the post entirely. He went through how he analyses the different levels of play and how imbalances in certain levels of play do not NECESSARILY point to imbalances in the game, then goes on to justify these points. I think what happened in a lot of cases is that people read until they found something wrong and, instead of reading the entire post and weighing it in its complete form, got lazy and posted.

When someone writes something so detailed and well crafted the least you can do is read the entire thing before bashing the shit out of it. Post count and ladder rank don't always equate to intelligence.

Penke

Sweden346 Posts

May 20 2011 06:26 GMT

#39

Probably the best first post i've ever seen

Samhax

1054 Posts

May 20 2011 06:27 GMT

#40

Wow, props to OP for your effort. It will take some time for me to read and understand your whole article. I just read the important parts and I have to agree with avilo: you have to listen to the pros even if they can be a little bit biased about their race. They have to be part of your balance data.

But I agree with your thesis: you can't balance a game with win ratios and ladder rankings because of the disparity in skill, and I think Blizzard is wrong when they display their statistics about the races. It's not a good approach. But they said it was just part of their information and that they have plenty of other data to balance the game. We just have to trust them and hope for the best :p

worldsnap

Canada222 Posts

May 20 2011 06:27 GMT

#41

Why would you make such a terrible post. Did you even read what he wrote?

You sound like a politician who argues with a scientist because he just doesn't like the facts.

Jombozeus

China1014 Posts

May 20 2011 06:33 GMT

#42

Top level data cannot be analysed using summary statistics. It requires a proper statistical analysis to draw any valid conclusions. Disregard any analysis that attempts to use data from high masters above the level where players begin to consistently have a 50% win/loss ratio since any conclusions they draw will likely be invalid.

There should be a gulf near GM/top masters where players experience a dip in their win/loss ratio below 50%. This should dispel the myth that win/loss ratios mean anything, even in master league.

The best place to take data is around this dip, where players begin to consistently have an overall 50% win/loss ratio. This is our upper statistical threshold. Above this point, statistical analysis is required.

Any effects of balance on racial win/loss ratios should be fairly consistent across leagues.

Imbalance doesn’t affect distributions in the silver, gold and platinum leagues.

How is this not saying that balance should be measured at a level below the top? The OP explicitly stated it in his conclusion; no one went out of their way to disprove him.

Any effects on balance on racial win/loss ratios should be fairly consistent across leagues? Nope.

Imbalance doesn't affect distribution in the silver, gold and platinum leagues? Why does it suddenly matter at diamond and midmasters where the "dip" is? I'm high masters and I am in no way claiming that I play this game at even 50% of its maximum potential.

That upper statistical threshold is the most randomly made conclusion I've seen in my life. The OP went "ABC and therefore, Z."

Mojar

Australia185 Posts

May 20 2011 06:37 GMT

#43

I like the concept you put forth in the OP; however, balance should always be based on the elite level of play. This is an e-sport and it has to be competitive. Basing balance decisions on anything but pro-level play will do nothing but harm the competitive nature and scene of the game.

forSeohyun

504 Posts

May 20 2011 06:41 GMT

#44

On May 20 2011 14:08 Warble wrote:

Only masters/pros should talk about balance.

I am of the opinion that no player can truly talk about balance. Psychological studies have found that everybody has an inherent bias that favours themselves regardless of how unbiased they try to be. In other words, any balance suggestion by a pro will necessarily try to make his own race OP rather than to achieve objective balance, no matter how well-intentioned the pro is.

What about those who have played random at some point, such as TLO or Nerchio? In what way would they be biased? If Tyler talked about ZvT balance, would he be trying to favour himself?

On May 20 2011 14:08 Warble wrote:
This is often used by those who have learned a bit about statistics in high school.
You need a bigger sample size.

As this post uses no data at all (at best it proposes how data should be interpreted), this has no bearing on the matter.

On May 20 2011 14:08 Warble wrote:
We can only find imbalance by looking at the top level of play.

The problem here is that people tend to post simple summary statistics and graphs and call it a day. Consider, for example, if this game only had terran and no other races. Our statistics from GSL would show MKP and MVP winning most of their games against other terrans. Conclusion? It’s not terran that’s OP, it’s the players.

Now add the other races and players back in. When MKP plays against a protoss and wins, how much is a result of his race and how much of it is a result of his hard work and raw talent? The summary statistics do not show this.

I feel we should be talking about whether imbalance can be found by looking at the statistics of top level play, not the play in itself.
In general we could be looking at the statistics of the bottom, the top or the whole. The only thing to note is that the bottom or top should be large enough to reduce the variance of the data.

On May 20 2011 14:08 Warble wrote:
If everybody, especially pros, chooses a race, then that race must be imbalanced.
----
Maybe people choose a race because they have an affinity for it.

Here you posted a long analysis. However I didn't see a proposed remedy for sorting out the problems when differentiating affinities for races and imbalance.
The solution is this.

Get a huge number of players from the whole Blizzard ladder at random.
Count how many are Z, T and P and put them in three different lists (stratified sampling).
Pick at random (uniformly distributed) n items from each of the Z, T and P lists; the number n must be smaller than the number of players of the least played race.

Now you can calculate imbalances without biasing your data.
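A minimal sketch of those three steps, assuming the ladder is available as a list of (player, race) pairs. The function name, the data layout and the made-up ladder are all assumptions for illustration; nobody outside Blizzard has such a dump.

```python
import random
from collections import defaultdict


def equal_stratified_sample(players, n, seed=None):
    """Draw n players uniformly at random from each race's list.

    players: iterable of (player_id, race) pairs (a hypothetical ladder dump).
    Returns a dict mapping race -> list of n sampled (player_id, race) pairs.
    """
    rng = random.Random(seed)
    by_race = defaultdict(list)
    for player_id, race in players:
        by_race[race].append((player_id, race))

    smallest = min(len(group) for group in by_race.values())
    if n >= smallest:
        # Follows the condition above: n smaller than the least played race.
        raise ValueError(f"n must be smaller than {smallest}")

    # random.Random.sample picks without replacement, uniformly.
    return {race: rng.sample(group, n) for race, group in by_race.items()}


# Made-up ladder: 250 zerg, 450 terran, 300 protoss.
ladder = [(f"player{i}", race) for i, race in enumerate("Z" * 250 + "T" * 450 + "P" * 300)]
sample = equal_stratified_sample(ladder, n=100, seed=1)
print({race: len(group) for race, group in sample.items()})
```

Each race ends up with the same number of sampled players regardless of how popular it is, which is the point of the procedure.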

On May 20 2011 14:08 Warble wrote:

The true numbers crop up when we compare them with each other. So out of 100 entrants, 40 are women and 60 are men. That means 150% more men enter college than women.

This is plain false, it is 50% more men.

Please, allay my following concerns or questions:

1. What education or experience do you have with statistics?
2. "I think it can’t be stated enough: GM and top masters is a bad place to be looking for racial imbalance." - No, it's perfectly viable as long as the sample group is large enough; what you can't do is take the players from 75-85% and use statistics on them.
You have to take an upper or lower x%, like the top 5% or GM-league.
3. What exactly does "Statistical analysis is the best way to analyse high level statistics." mean?
4. Do you believe it is impossible to analyse imbalance without having Blizzard's data? Couldn't you have tried it with ~100 random samples from the top of the league?
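For a rough sense of what "large enough" means, here is a sketch of the usual margin of error for a win rate. It assumes the samples are independent game results and uses the normal approximation to the binomial; the function name is made up.

```python
import math


def win_rate_margin(p, n, z=1.96):
    """Approximate 95% margin of error for an observed win rate p over n games.

    Uses the normal approximation to the binomial (an assumption, reasonable
    when n is not tiny and p is not close to 0 or 1).
    """
    return z * math.sqrt(p * (1 - p) / n)


for n in (100, 1000, 10000):
    margin_pct = win_rate_margin(0.5, n) * 100
    print(f"{n} games: +/- {margin_pct:.1f} percentage points")
```

Under these assumptions, 100 games at a true 50% win rate gives a margin of roughly 10 percentage points either way, so a 55-45 split would not stand out from noise at that sample size.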

Torpedo.Vegas

United States1890 Posts

May 20 2011 06:44 GMT

#45

On May 20 2011 15:37 Mojar wrote:
I like the concept you put forth in the OP; however, balance should always be based on the elite level of play. This is an e-sport and it has to be competitive. Basing balance decisions on anything but pro-level play will do nothing but harm the competitive nature and scene of the game.

Perhaps I interpreted this wrong, but I believe he is not arguing about which demographic the game ought to be balanced for when choosing sample statistics, but rather that, regardless of what you pick, using the elite's or the noobs' own opinions would make it invalid due to biases. Also, even if you cater only to the elite, there are still many factors that need to be considered outside the game, including individual player skill and other variables outside the objective race itself. So using lower skilled players could help isolate the fundamental game errors from the skill-based QQ.

denzelz

United States604 Posts

May 20 2011 06:48 GMT

#46

Like others have said before, a long post doesn't mean it is a good post. There is definitely merit to approaching things in a scientific and mathematical way, but we must also be aware of all the assumptions that were made along the way and make sure that the conclusion and analysis align with the actual methods or initial hypothesis.

The rationale for the study is admirable, and I agree with the type of basic mathematical techniques used, but the assumptions that skill levels are linear and that race choice by Starcraft 2 players is proof for balance (because of the human tendency to optimize, thus reaching race equilibrium?) are quite debatable. With these assumptions alone, this study cannot be considered the "end-all" research on balance. And no, this is not nit-picking on small flaws in a study, but you are making very major assumptions that should not be ignored.

Further, conclusions made at the end are not supported by any of the data and analysis in the previous sections. What part of the data proves that Master/Pro levels should not be used for balance analysis? Your point that trends in one region alone cannot be interpreted as imbalance makes intuitive sense, but what part of your methods actually proves that? And the part about your faith in Blizzard?

Your study would be a lot better if you just presented your results and your methods. Instead, much of your "myth-busting" is just your own opinions and experiences.

I_Destroy

Canada22 Posts

May 20 2011 06:49 GMT

#47

This is how day(9) looks at balance lol
P.S. Great Post!

johanngrunt

Hong Kong1555 Posts

May 20 2011 06:49 GMT

#48

On May 20 2011 14:10 usa11220 wrote:
best first post ever?

Definitely.

Dear OP

Most of the math you explained it in a way that's easy to understand. Thanks for this =)

TehForce

1072 Posts

May 20 2011 06:49 GMT

#49

On May 20 2011 15:44 Torpedo.Vegas wrote:

Perhaps I interpreted this wrong, but I believe he is not arguing about which demographic the game ought to be balanced for when choosing sample statistics, but rather that, regardless of what you pick, using the elite's or the noobs' own opinions would make it invalid due to biases. Also, even if you cater only to the elite, there are still many factors that need to be considered outside the game, including individual player skill and other variables outside the objective race itself. So using lower skilled players could help isolate the fundamental game errors from the skill-based QQ.

Yeah, but I don't care if Protoss or Terran has a 70% win rate against the other races up until Masters (which they don't). As long as it is balanced at the top level, imbalances in the lower leagues don't matter.

MangoTango

United States3670 Posts

May 20 2011 06:49 GMT

#50

So much truth, so little time. *golf clap* Well done.

uzas

Croatia52 Posts

May 20 2011 06:57 GMT

#51

This is so true.

DiDigital

75 Posts

May 20 2011 07:00 GMT

#52

One thing I think you may have neglected to mention when discussing problems with looking at the top of the ladder is the fact that blizzard has probability statistics on the expected results of each specific game played on bnet. This is the basis of the points we earn from our games. So while looking at win ratios for the top of the ladder is useless, point totals and the related data should be more accurate.

Also I'd love to see a graph of the matchup win percentages based on a moving average of the entire MMR spectrum. I'm sure blizzard has something like this, and I imagine you could make a lot of balance conclusions by seeing this data.
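A rough sketch of what such a moving average could look like, assuming each game of one matchup is recorded as an (MMR, won) pair from one race's point of view. MMR is not public, so the data layout, the function name and the six sample games are all made up.

```python
def moving_win_rate(games, window):
    """Sliding-window win rate across the MMR spectrum.

    games: list of (mmr, won) pairs for one matchup, from one race's side.
    Returns a list of (mean_mmr, win_rate), one per window of `window`
    consecutive games after sorting by MMR.
    """
    ordered = sorted(games)
    points = []
    for start in range(len(ordered) - window + 1):
        chunk = ordered[start:start + window]
        mean_mmr = sum(mmr for mmr, _ in chunk) / window
        win_rate = sum(1 for _, won in chunk if won) / window
        points.append((mean_mmr, win_rate))
    return points


# Made-up results for a single matchup.
games = [(1000, True), (1100, False), (1200, True),
         (1300, True), (1400, False), (1500, False)]
points = moving_win_rate(games, window=4)
for mean_mmr, win_rate in points:
    print(mean_mmr, win_rate)
```

Plotting win rate against mean MMR for each matchup would show directly whether an imbalance is constant across skill or only appears in part of the range.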

Lastly, your analysis dealt specifically with evaluating game results. Even if we can prove imbalance through your methods there is still the challenge of identifying the exact in game imbalance.

jalstar

United States8198 Posts

May 20 2011 07:02 GMT

#53

Yet these players with sub-50% win/loss ratios are both by rank and by nature superior players to their peers with 50% win/loss ratios. Their low ratios come from the fact that they’re so good they’re the only amateurs who get matched against professional level players. This low ratio comes from the assumption that there is a significant skill gap between amateur players and professional players. We know that at the top level, skill gaps can be huge because Idra has a huge win/loss ratio.

There are very few players (relatively) with 70-80% W/L ratios, so the best players outside those ratios will face ~50% ratio players most of the time, and if they're truly better they'll have a higher than 50% win rate against those. Over hundreds of games the better players should have a higher winrate unless 70-80% win ratio players are literally laddering all the time.
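The disagreement comes down to how often a top amateur actually meets the much stronger players. A small sketch with made-up numbers (the 60% and 20% win rates and the function name are assumptions, not data):

```python
def overall_win_rate(share_vs_pros, win_rate_vs_pros, win_rate_vs_peers):
    """Overall win rate as a weighted mix of games against pros and against peers."""
    return share_vs_pros * win_rate_vs_pros + (1 - share_vs_pros) * win_rate_vs_peers


# Hypothetical top amateur: wins 60% against ~50% peers, 20% against pros.
for share in (0.10, 0.25, 0.50):
    rate = overall_win_rate(share, win_rate_vs_pros=0.20, win_rate_vs_peers=0.60)
    print(f"{share:.0%} of games vs pros -> overall win rate {rate:.0%}")
```

With these numbers the player only falls below 50% overall if more than a quarter of their games are against pros, which is the condition the argument above doubts.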

How much math have you taken, OP? By your post I'm guessing you're a second-year math undergrad, or maybe an engineering/science major.

Beakyboo

United States485 Posts

May 20 2011 07:02 GMT

#54

Your analysis is seriously convoluted and you draw no apparently meaningful conclusions that I can infer. I don't think too many people at the moment put much faith in arguments about balance based on statistics. It's bothersome that something like this can actually get so much praise from people who've neither read nor understood it simply because they think it looks intelligent.

Coming from a math major, most of your points are only made more confusing by the math you attempt to justify them with and can be inferred without it far more concisely. It really feels that you've only included a lot of math for the sake of attempting to appear credible.

Holy_AT

Austria978 Posts

May 20 2011 07:06 GMT

#55

I am sorry, I don't quite agree with your post.
In my opinion it lacks data to back it up. In many cases you present some facts and then your opinion, while the same facts would lead others to a different opinion.
Some of your conclusions are based on the concept of player skill, but "player skill" is subjective and cannot be defined very well.
In my opinion there are too many assumptions in this post. I think it would be better to break it down and focus on smaller practical details than to try to cover the whole balance discussion with assumptions and opinions.

forSeohyun

504 Posts

May 20 2011 07:08 GMT

#56

On May 20 2011 15:44 Torpedo.Vegas wrote:

No.

The question is whether you want the game to be balanced for the elite as a whole, everybody as a whole, or both (you probably don't want to balance it for the bottom 5% in bronze league if it means sacrificing balance elsewhere).
Here I mean balance as in the average win ratio for a race.
So there are five viable options:

1. The imbalance is linear for every player with a win ratio between 0-100%. Great, you can balance the top 1%'s match-ups and bronze league will be balanced as well.
2. The balance is non-linear, so that people with different skill levels will be affected (in changed win rate) differently. What do you do then?
2a. You try to make the average win ratio 50% when considering all players.
2b. You try to cater for a specified percentage of players so that they achieve 50% win ratio amongst themselves
2c. You try to make balance changes which are the "inverse" to the non-linear balance, which could be practically impossible.
2d. A compromise between a,b,c

What is Blizzard's strategy? I don't know.

If the case is 1. or 2b. you only need to apply statistics to an elite percentage of players.
If the case is 2a, 2c, 2d you will have to sample the whole interval of players.

omisa

United States494 Posts

May 20 2011 07:11 GMT

#57

On May 20 2011 15:00 avilo wrote:

The most obvious thing completely wrong with the "post" is to not look at pros for balance. In any RTS game or game you always look at the top level for balance because these are the people playing the game at the highest level and are actively trying to "break the game."

The most obvious thing wrong with your post is you assume you're right.

Amazing OP. It seems the whole point of it is to look at the balance situation with a little less bias and to not automatically assume rumors of imbalance are actually true. If anything is to be assumed, it's that you are wrong (this goes for everybody).

Deltablazy

Canada580 Posts

May 20 2011 07:12 GMT

#58

mind=blown

Good job dude. Enjoyed every second of reading your post

Jombozeus

China1014 Posts

May 20 2011 07:17 GMT

#59

On May 20 2011 16:11 omisa wrote:

Cool, so we have reached a conclusion that the OP is wrong despite also being amazing. Doublethink it is!

Essentia

1150 Posts

May 20 2011 07:17 GMT

#60

On May 20 2011 16:11 omisa wrote:

Ok, then explain why the game shouldn't be balanced around pro play?

Anomandaris

Afghanistan440 Posts

May 20 2011 07:21 GMT

#61

Meh, this kind of post is (nearly) useless imo, and mistaken on a couple of points.

Discussing how to discuss balance, wtf.

All those lower leagues play the game wrong and are insignificant for balance. The only ones you should watch are GM and maybe tournament results.

Although they sometimes qq, high-level players try out new things non-stop and reinvent the matchup. None of them seriously believes in imbalance.

I guess some people are impressed when they see some math...

qazadex

Australia473 Posts

May 20 2011 07:24 GMT

#62

On May 20 2011 16:17 Essentia wrote:

Ok, then explain why the game shouldn't be balanced around pro play?

It's not that the game shouldn't be balanced around pro play, it's that anyone with a ~80% win ratio is a statistical outlier. Their skill is above the other competitors by a significant enough margin that their results cannot be used.

Holy_AT

Austria978 Posts

May 20 2011 07:26 GMT

#63

All those "mind = blown" posts and other one-sentence statements about the opening post feel so horrible.
If you are actually amazed then explain why, because with that sort of statement I can only conclude that you are not able to understand what the thread opener is writing about, and that is why your mind is blown.
It would be nice to read more educated opinions than just one sentence. They do not need to be highly detailed, but "mind blown" and "golfclap" make me feel bad.

VIB

Brazil3567 Posts

May 20 2011 07:28 GMT

#64

- First of all, to even start arguing, we need to decide what "balance" is. Seems silly, but people actually have very contradictory opinions about this.
- Even if the game has 1/3 of each race at every level, it doesn't mean it's "balanced". Balance is much more than that.
- ZvZ has a 50% win ratio. But oftentimes the winner is the one who got a lucky coin flip in the build order battle, not the most skilled player. Does that fit your definition of the word "balanced"? Opinions will vary.
- SC2, like every other game in history, does NOT use objective math to "calculate" balance. There is no formula where you put in variables and find out how much damage a stalker should do.
- SC2, like any other game, is balanced through brute force. You put in a random value, test it, and if it seems imba, you change it out of pure intuition. There's zero science in this.
- It is mathematically impossible to achieve perfect mathematical balance with this brute force approach.
- The only solution would be to remake the game from scratch, with balance in mind from the start: calculate balance (once we define what that is) before designing the game, then build the game around this balanced model, i.e. completely remake SC2.
- Of course Blizzard will never do this, so we will never have perfect balance. We can only hope for "balanced enough" (like many would argue BW is).
- Realistically, considering the brute force approach, our best bet is to have as many balance-test iterations as possible. The easiest way to do this is to balance through map design (which is about ~50% of what balances the game) instead of patching, just like BW is being rebalanced by KeSPA mappers after Blizzard stopped patching.

Waking

United States46 Posts

May 20 2011 07:29 GMT

#65

On May 20 2011 16:02 jalstar wrote:

No need to insult the guy. If player skill between high masters and grandmasters scales in a nonlinear way with points, then you will get a situation where high masters have a win ratio below 50%. Not that hard to understand.

NastyMarine

United States1252 Posts

May 20 2011 07:34 GMT

#66

Incredible post!

omisa

United States494 Posts

May 20 2011 07:35 GMT

#67

On May 20 2011 16:17 Jombozeus wrote:

Cool, so we have reached a conclusion that the OP is wrong despite also being amazing. Doublethink it is!

It seems you have reached a conclusion of your own. But this sort of bickering is quite off topic, so let's keep the discussion on topic.

forSeohyun

504 Posts

May 20 2011 07:38 GMT

#68

On May 20 2011 16:24 qazadex wrote:

If you want to balance the game statistically you have to use either all players or a number of randomly selected players from a league or from the game as a whole.

This depends on a) whether the balance is linear or not, and if it's not, b) who you want to balance it for.

You could randomly select an equal (high enough) number of Z, P and T players in GM league and find the difference in mean rank or win ratio.

If the imbalances are not linear you have to sacrifice balance for some players; otherwise it's just fine using statistics of "pro play".
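
The sampling idea above can be sketched roughly as follows. This is only an illustration: the record layout (`race`, `win_ratio`), the function name and the player list are all made-up placeholders, not real ladder data, and a real comparison would still need a significance test on the differences.

```python
import random
from statistics import mean

def mean_win_ratio_by_race(players, sample_size, seed=0):
    """Draw the same number of players per race at random and average their win ratios.

    `players` is a list of dicts with hypothetical keys "race" and "win_ratio".
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    result = {}
    for race in ("Z", "P", "T"):
        pool = [p["win_ratio"] for p in players if p["race"] == race]
        sample = rng.sample(pool, sample_size)  # sampling without replacement
        result[race] = mean(sample)
    return result

# Placeholder records purely for illustration: 10 players per race.
players = [
    {"race": race, "win_ratio": 0.50 + 0.01 * i}
    for race in ("Z", "P", "T")
    for i in range(10)
]

means = mean_win_ratio_by_race(players, sample_size=5)
for race, value in means.items():
    print(race, round(value, 3))
```

Taking the same sample size from each race keeps a more popular race from dominating the comparison simply because it has more players.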

BeastofManju

United States79 Posts

May 20 2011 07:48 GMT

#69

To the OP:

What about map imbalances? How do map win/loss ratios play into all of this?

And 2nd question... Let's just assume that it is found that 1 race is OP or UP. How should one go about pinpointing the unit/mechanic that is the source of this imbalance?

omisa

United States494 Posts

May 20 2011 07:58 GMT

#70

On May 20 2011 16:17 Essentia wrote:

Ok, then explain why the game shouldn't be balanced around pro play?

There is no real way of objectively balancing the game for just the pro scene, assuming there is an imbalance. Just because it is considered top-level play does not mean it is acceptable to "balance" the game around it. I do agree that it is crucial to look at top-level play for balance issues, but to make assumptions of imbalance based solely on pro play would be quite nonsensical.

DestroManiak

257 Posts

May 20 2011 08:06 GMT

#71

For a start, I believe that beginning with the idea of imbalance is the wrong way to start. What a lot of people do is begin with the idea of imbalance, and then seek data to back up their opinion.

One should always refrain from confirmation bias.

paperwing

49 Posts

May 20 2011 08:07 GMT

#72

Good post, agree on harsher punishments by mods for obviously unfounded, or obviously low-quality criticisms

Jombozeus

China1014 Posts

May 20 2011 08:16 GMT

#73

On May 20 2011 17:07 paperwing wrote:
Good post, agree on harsher punishments by mods for obviously unfounded, or obviously low-quality criticisms

So you are a person with 4 posts, with absolutely no content in this post, asking for mods to do your bidding by banning people who have contributed much more than you, and have given a structured argument against the OP?

Grohg

United States243 Posts

May 20 2011 08:41 GMT

#74

I'm holding my breath for a thread like this expressed solely using symbolic logic with parameters explained using only quantifiable data. On my first pass through the OP I saw the logic and followed the arguments. However, I feel that some of the omissions and assumptions were more than minor factors in augmenting the eventual conclusions. Even excluding the top tier of players to eliminate out-lier bias, you end up theoretically balancing for only the top tier of the remaining sample. The middle of the pack players should have higher degrees of skill variance which skews the shape of the trend to no longer be a nice normal curve. In an ideal sample with the ever-desired certainty of the normal curve, most of the results from the OP could be reached due to the elimination of any assumptions (regardless of their actual impact on the game's balance).

Theorycrafting, even in the context of balance, will inevitably break down once too many variables are left without being operationally defined. This is especially true when dealing with skill. How do you accurately operationalize skill? Skill is loosely defined as it is and it would be hard to decide arbitrary values to use as means of measurement (think of how the social sciences define Happiness or Anger). In a closed study, the efficacy or internal validity might be extremely high. However, once parameters are no longer clearly defined outside of the study, the external validity and ability to generalize to any other study is shot to hell.

I think the idea itself is actually a refreshing way to approach balance but there would need to be an overhaul to the method's core to eliminate confounding variables or bias. I have too many other concerns to list out but that's probably just my compulsive personality kicking in wanting a way to eventually break everything down in to binary.

AbInitio

United States4 Posts

May 20 2011 09:01 GMT

#75

While I agree that having the effects of imbalance scale linearly is unrealistic, a linear relationship is the simplest case, and the OP does point out that this is merely a "simple" model.

Obviously this analysis cannot simply be directly applied and sweeping conclusions made, but it is arguably the first step that one would take to try and rigorously analyze the game.

Comogury

United States412 Posts

May 20 2011 09:18 GMT

#76

On May 20 2011 16:21 Anomandaris wrote:
Meh, this kind of post is (nearly) useless imo, and mistaken on a couple of points.

Discussing how to discuss balance, wtf.

All those lower leagues play the game wrong and are insignificant for balance. The only ones you should watch are GM and maybe tournament results.

Although they sometimes qq, high-level players try out new things non-stop and reinvent the matchup. None of them seriously believes in imbalance.

I guess some people are impressed when they see some math...

I don't know what you mean by "play the game wrong." If there really was an imbalance, I am pretty sure that it would show in all levels of play, not just in grandmasters and tournaments. So why not lump them in, too?

ru.meta

Russian Federation88 Posts

May 20 2011 09:32 GMT

#77

Could you explain what exactly matrix like [27 20 20 20 13] means?

Regretful

Sweden91 Posts

May 20 2011 09:46 GMT

#78

Very well done!

I for one would like to see more analysis from you.

Blasts

Netherlands99 Posts

May 20 2011 09:48 GMT

#79

On May 20 2011 18:18 Comogury wrote:

At the lower levels players make too many mistakes to blame losses or wins on "balance".
But I think there are far too many factors to account for when talking about balance, not to mention that nobody knows for sure what balance is, or whether balance should be for the pros or the whole ladder.

Regretful

Sweden91 Posts

May 20 2011 09:50 GMT

#80

On May 20 2011 18:32 ru.meta wrote:
Could you explain what exactly matrix like [27 20 20 20 13] means?

I think he did.

It is the spread of people in the leagues.
27 bronze, 20 silver, 20 gold, 20 platinum and 13 diamond

freetgy

1720 Posts

May 20 2011 10:05 GMT

#81

On May 20 2011 14:08 Warble wrote:
For a start, I believe that beginning with the idea of imbalance is the wrong way to start. What a lot of people do is begin with the idea of imbalance, and then seek data to back up their opinion.

Exactly this.
That's the reason why
P says P UP
T says T UP
Z says Z UP

and the other way around with OP.

Especially when the argument comes up that a matchup is "hard". What does this mean?...
A matchup has to be "hard" to be considered balanced for both sides -.-

ODKStevez

Ireland1225 Posts

May 20 2011 10:14 GMT

#82

Amazing, I really enjoyed that ^^

wassbix

Canada499 Posts

May 20 2011 11:03 GMT

#83

Firstly, post-hoc analysis for hypothesis generation is perfectly fine and done quite commonly. Internal biases in the investigator are also fine as long as their methodology is solid. It's pointless to say an expert in the field should be barred from their hypothesis because they have some bias X. Anyone immersed in a topic will inherently have biases. So what if they do? Are their proofs solid? If not, you discredit them based on their data; you don't hand-wave their efforts away because their beliefs don't align with some mythical perfect neutrality.

Asking Idra for his opinion on ZvP balance is more meaningful than asking some disinterested passer-by what he thinks about ZvP. Sure, Idra is known for his rigidity and strong opinions, but his reasoning for why he hypothesizes ZvP is imbalanced is A) more informed, because he is a pro gamer, and B) falsifiable, because he'll provide the premise for why he believes it's broken, and this leaves room to prove him wrong.

The first paragraph is faulty in its logic and quite meaningless. If the game had one race you wouldn't even argue imbalance, nor would you look at statistics for "imbalance".

Then your second example is to add two races and say MKP's stats don't matter. Yes, but not because MKP is a top player; it's because you're taking only one individual's information, so of course it's meaningless. It has nothing to do with looking at a population of high-level players to look for imbalances.

Your basis for not looking at high-level play for balance doesn't work, because you can apply that same faulty logic across any skill level. It is even more damning at the middle of the distribution, where there is even more variance in individual skill level because it's easier to improve when you're starting from rock bottom.

So you basically make no convincing argument for why we shouldn't look at summary statistics for high-level players to look for imbalances.

I appreciate the effort post, and trying to back up your claim is especially refreshing, but as many people pointed out it has many flaws in its theoretical assumptions.

EdSlyB

Portugal1621 Posts

May 20 2011 11:05 GMT

#84

On May 20 2011 14:21 Nik0 wrote:

Probably

You forget Baller's first post...Epic.

legatus legionis

Netherlands559 Posts

May 20 2011 11:31 GMT

#85

I just want to say that even though I cannot read this atm, just woke up and it's soo much.
It actually looks very promising. I almost got a heart attack when I kept opening spoilers and there were more spoilers inside! And not little things, multiple paragraphs standard. Really like it, I'll save and read it later when I'm more fresh. Without a doubt there will be at least a couple perspectives that I've never thought about.
Nice job must've been a lot of work!

Qikz

United Kingdom12024 Posts

May 20 2011 11:44 GMT

#86

Just read it all, fantastic post man. I agree with almost everything you're saying, then again that's kind of my own bias (but if you think about it, all thoughts and feelings are bias!) but I hope a lot of people give it a read!

Deckkie

Netherlands1595 Posts

May 20 2011 11:48 GMT

#87

I would love to read your mechanics analysis

Tegin

United States840 Posts

May 20 2011 12:06 GMT

#88

Great read. Would love to see the maths behind everything.

HaXXspetten

Sweden15718 Posts

May 20 2011 12:11 GMT

#89

Not quite sure what to say, other than that you've done an amazing job at this. Thanks a lot.

karpo

Sweden1998 Posts

May 20 2011 12:12 GMT

#90

On May 20 2011 20:03 wassbix wrote:

He's saying that looking at high performing players for balance might skew the balance IF there's a couple more great players using a specific race. Looking at top players is not worthwhile if one race is just more popular (as terran seem to be in korea) or if there's just a couple more really great players using that race.

ondik

Czech Republic2908 Posts

May 20 2011 12:12 GMT

#91

Mods, spotlight this..NOW! Excellent (first) post, that's all I've gotta say.

Elean

689 Posts

May 20 2011 13:22 GMT

#92

On May 20 2011 15:00 avilo wrote:
Just because someone made an incredibly long post does not make it mega awesome or even remotely accurate.

Exactly, it also makes it easier to hide absurdities.

Such a long post makes it impossible to point out all the flaws/absurdities (and I think there are a lot of them in the OP). This is in no way helpful to any debate.

I also can't stand people replying "it's amazing" without even reading what they agree with. 2 minutes to read such a long post? Really?

Perscienter

957 Posts

May 20 2011 13:26 GMT

#93

We can only apply a subtle model for balance, but can't achieve mathematically proven balance.

Mine would try to roughly even out the win ratios per match-up, per map and per time stage, at least for diamond and above, and I'd especially emphasize looking into statistics of the professionals.

At a first glance, the balance situation still looks pretty grim and I don't like Blizzard's half-balance and non-transparency philosophy either.

FutureArchon

United States25 Posts

May 20 2011 13:58 GMT

#94

Excellent post. Blizzard hire this guy!

zanmat0

188 Posts

May 20 2011 14:03 GMT

#95

On May 20 2011 22:58 FutureArchon wrote:
Excellent post. Blizzard hire this guy!

That would cost a lot more than their current solution of Monkey + Dartboard.

holynorth

United States590 Posts

May 20 2011 14:10 GMT

#96

Thank you for actually understanding statistics and sample size. I remember reading a post a while back where someone posted an interesting statistic (I forget exactly what) that was around 94%. He was attacked for several pages by people demanding a sample size of 10,000 or greater compared to his sample of 700. Made me pretty angry.
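
For a rough sense of why 700 is not a small sample here, this is a minimal sketch of a normal-approximation (Wald) 95% confidence interval for a proportion. It assumes a simple random sample of independent observations, which the original statistic may or may not have satisfied; the function name is just illustrative.

```python
import math

def wald_interval(p_hat, n, z=1.96):
    """Approximate 95% confidence interval for a proportion (normal approximation)."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)  # standard error of the sample proportion
    return p_hat - z * se, p_hat + z * se

low_700, high_700 = wald_interval(0.94, 700)
low_10k, high_10k = wald_interval(0.94, 10_000)

print(f"n=700:    {low_700:.3f} to {high_700:.3f}")
print(f"n=10000:  {low_10k:.3f} to {high_10k:.3f}")
```

Under those assumptions the margin at n = 700 is already under two percentage points either side of 94%, so going to 10,000 narrows the interval but would not change the conclusion much.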

Great read though.

Warble

137 Posts

May 20 2011 14:25 GMT

#97

This was written a bit tongue-in-cheek starting from the very first line since I see SC2 as something to be enjoyed rather than, well...work.

The idea wasn't to provide a Theory of Everything on using statistics to analyse balance, but to provide some groundwork for ideas to develop since I saw that the community kept going back to the same old ideas without moving forward. I thought that with a bit of groundwork everyone can pitch in and develop something better. While it would be nice to have a solid and thorough Theory of Everything, they take considerable work to develop. Those who have attempted a thesis (or even succeeded) know how much work goes into contributing to theory, and how small each individual's contribution is. My favourite example is how it took centuries for the best minds in maths to prove that we cannot solve general polynomial equations of degree 5 or higher using radicals (and one of those minds died young over a dispute regarding the affections of a young lady). Blizzard probably has its own in-house statisticians who spend 8 hours a day on this problem while I do it on the bus. The difference is that Blizzard isn't sharing their findings with the community.

I guess a bit more background would be useful here. I am interested in the idea of how we might go about balancing a game, and look at SC2 because it's the game I enjoy most. The aim was to think about it in the way I would if I was working on Blizzard's balance team. I mentioned in the Background that I was working on a more generalised model, but the levels of complexity make it difficult. Many of you have already pointed out many of the factors that such a model would need to consider and I'd be interested in seeing what methods you use to integrate those factors into your models to provide useful conclusions. After all, one of my aims in posting here was to get the ball rolling and see what others can come up with. It's always good to learn from others.

One impression I hoped to convey with this model was how hard it is to use statistics when analysing balance. I talked about this a bit in the Extensions. This model was very simple and set in an idealised world, yet already it imposes so many restrictions on the conclusions we can draw when looking at the data. We then relaxed a few assumptions to make it more rigorous and the restrictions grew, as did the uncertainties. So if you take some win/loss statistics from masters and graph them, what does it really show?

I took a bit of a lazy way out when saying that high-level games should only be analysed using proper statistical analysis rather than summary statistics. After all, considerable work is involved in formulating a proper statistical analysis. I posted a link to an attempt by another on TL and I think he should be commended for the effort. However, I think it was also written for a school assignment, so maybe he was pressed for deadlines.

Here's an example just to illustrate one of the tougher barriers to a meaningful statistical analysis:

Consider a GSL with just 2 pros, MKP and MKQ. MKP plays terran while MKQ plays zerg. TvZ is perfectly balanced and both players are of equal talent (t). We define their skill as S(t,p) where p is the hours of practice they put in, and dS/dp > 0 for all p, i.e. the more they practise, the better they play.

We observe 30 games between these 2 players on XNC and conduct an analysis using parameters representing their skills, say BMKP to represent how much higher MKP's skill is compared to MKQ, and BTvZ for how overpowered terran is against zerg.

Take a moment to see why this doesn't work.

So if MKP wins 20 games and MKQ wins 10 games, what does that tell us about our parameters?

Nothing!

Why? Because we don't know which one is relevant. Did MKP win more because he practised more, or did he win because TvZ is imba? Sure, I told you that the game is balanced, so you know MKP practised more - but when conducting the analysis, we don't have this information - the point of the analysis is to figure out using only their results how skilful they are and how imbalanced the matchup is.

This applies even if you add more players. Say we clone both players a few times, and each clone is slightly weaker than the last. So from the best terrans we have MKP, MKPa, MKPb, and from the best zergs we have MKQ, MKQa, MKQb. This is a bit trickier to explain, so let's say we break them up and analyse just the terrans first and then just the zergs. Our analysis would find their relative rankings within their own races and work out their skill levels relative to an arbitrary baseline. Say we get 2, 1, 0 for the terrans and 2, 1, 0 for the zergs.

Now we let them play each other and conduct an analysis on that. Let's assume that the terrans practised more and are one skill level ahead of their zerg opponents. So we would rate their skills MKP = 3, MKPa = MKQ = 2, MKPb = MKQa = 1, MKQb = 0.

But the problem is that our model has the parameter BTvZ, which actually makes it impossible to solve for this. We would get the above if BTvZ = 0, but it could also be 1 and we would subtract 1 from all the skills we calculated above. Or BTvZ = 0.5. There are an infinite number of possibilities. I can't remember exactly, but I think you wouldn't actually be able to find a solution to this at all. But you can see that even if you do find a solution, it's actually meaningless.
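
Here is a minimal sketch of that non-uniqueness. It assumes a logistic win-probability model in which the terran's chance of winning depends on (terran skill - zerg skill + BTvZ); that functional form is my assumption for illustration and is not necessarily what a real analysis would use. The skill numbers are the ones from the example above.

```python
import math

def win_prob(terran_skill, zerg_skill, b_tvz):
    """Probability that the terran wins, under an assumed logistic model."""
    return 1.0 / (1.0 + math.exp(-(terran_skill - zerg_skill + b_tvz)))

def all_matchup_probs(terran_skills, zerg_skills, b_tvz):
    """Win probability for every terran vs zerg pairing."""
    return [
        [win_prob(t, z, b_tvz) for z in zerg_skills]
        for t in terran_skills
    ]

# Reading 1: terrans are one skill level ahead and the matchup is balanced.
probs_skill = all_matchup_probs([3, 2, 1], [2, 1, 0], b_tvz=0)

# Reading 2: equal skills, and terran is overpowered by one level.
probs_imba = all_matchup_probs([2, 1, 0], [2, 1, 0], b_tvz=1)

# Both readings predict exactly the same results for every pairing.
same = all(
    math.isclose(a, b)
    for row_a, row_b in zip(probs_skill, probs_imba)
    for a, b in zip(row_a, row_b)
)
print(same)
```

Since the two parameter sets predict identical outcomes for every terran vs zerg game, no amount of TvZ results can tell them apart, whichever fitting method is used.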

To further compound the problem, the players in reality won't keep their relative performances. Maybe some are better at TvZ than TvT and others are worse at TvZ than TvT. Then your model becomes a mess and you're left wondering if the results of your analysis are actually meaningful because you haven't really addressed the underlying problem of attribution.

What is the source of the problem? It is because the player and the race they play are not independent variables. MKP, MKPa, MKPb always play terran and hence we cannot distinguish between when an effect came from the player or from the race.

So it's quite possible that we can't use available statistics to make any inferences about imbalance at all.

But that's where the fun is. Maybe there are creative ways around it. Maybe there are things we can measure that will allow us to draw conclusions. I guess you can also wonder if Blizzard has figured it out and what work-arounds they have found for it.

Personally, I suspect this is one of the benefits of forcing players to log in to play the game. It allows them to collect more data on player behaviour, like how often they play custom games and against whom. I would love to know what Blizzard is doing but I doubt they will share it with us. If there is a solution, I think it lies somewhere in those sorts of data. After all, when you play a custom game, it shows up in your match history with your build order and everything. This means all that data is collected... And I have always felt that they save every replay too (how else would they catch cheaters?), which means they have data on APM and all that and can control for games where players aren't actually trying. I'm not sure how yet, but it seems like the most promising direction.

This means that as a pro, it's definitely a good idea to make sure you do all your practice games legitimately and on Bnet rather than on private hacked lans because when Blizzard goes to analyse the data and sees your stellar performance with only a handful of hours played, your race will look easier than it is.

I have fixed up a few issues with wording in my OP after reading some suggestions here. There's actually an interesting story behind why I made some of the mistakes.

When I began writing it (it's been sitting there for a while now), I had recently taken an aptitude test when applying for a new job. One of the first questions was a basic statistical one. I can't remember it exactly, but it basically came down to a ratio of 2:1 when comparing two statistics. And it asked how many times more did the first thing occur than the second. However, the options given were ridiculous, like 50%, 200%, 250%, 300%, 400%. Needless to say, I was nonplussed.

As for questions about things like map balance, it would be quite simple for Blizzard to look at balance on individual maps. Unless we can get similar data as players, it will be harder for us.

As for the Myth Busting section, it was more subjective than the analysis, which was why I put it after Outcomes, which was where the analysis ended. I also used the words "my opinion" in the Myth Busting section when I believed there was a significant subjective element to what I was saying, although I tried to stick with things that have objective support (such as how humans will still be biased even if they earnestly try to be unbiased to explain my belief on why we shouldn't give it too much credence when a pro claims the game is imbalanced).

EDIT: I made a mistake in this post explaining why a statistical analysis wouldn't be able to differentiate between the effect of a player's skill and racial imbalance. This has been fixed.

RoachyRoach

81 Posts

May 20 2011 14:35 GMT

#98

Imagine if SC didn't allow a player to pick their race. The quality of balance discussions would skyrocket.

I liked this post though.

Jombozeus

China1014 Posts

May 20 2011 14:45 GMT

#99

The issue is that your definition of balanced is based on the outcomes of the games, not the game itself.

Players who play the game would like to believe that even though they are able to win now, they see a fundamental issue that an unit of the opposing race has, that can be abused given higher APM, multitasking and experience.

What statisticians lack is foresight. Statisticians can only make their predictions through models and trend analysis. It has been seen again and again that the RTS genre does not develop in a way that can be plotted along any given axis, and instead follows paradigm shifts.

There are certain examples where one can predict, roughly, the exponential growth of technology (for example, http://en.wikipedia.org/wiki/Moore's_law). But there is no way we can say "because today we have been able to map the DNA of a human being, in three years we will have a X% likelihood of obtaining time travel."

The game is dynamic, and a simple change in playstyle can open up a completely new paradigm; see TvP tank play and ZvP ling/bling into ultras play. One can argue that a race appearing underpowered in the short run motivates its players to improve faster than those of the overpowered race, who have no pressure to improve, hence balancing things out in the long run. This however is an assumption based on a flimsy model.

The conclusion that one should reach is that the game is evolving nonlinearly. It is a concept we cannot look at by plotting it on a line. Hence, mathematical models are extremely hard to implement for any long-term conclusions, and short-term conclusions are useless in most senses of the word.

PS:
In the words of Blizzard themselves, the statisticians they hire are only one way of measuring balance. They take into account many other factors, including replays sent to them by pros. On one occasion I remember, it was MKP? who sent Blizzard replays involving an absolute imbalance in void rays, and thus the VR nerf early on in the game.

Atlare

Australia893 Posts

May 20 2011 14:46 GMT

#100

On May 20 2011 23:35 RoachyRoach wrote:
Imagine if SC didn't allow a player to pick their race. The quality of balance discussions would skyrocket.

I liked this post though.

No that doesn't work since people would probably not play the game if you were forced random -.-

Harstem

Netherlands263 Posts

May 20 2011 14:51 GMT

#101

Great post!

RoachyRoach

81 Posts

May 20 2011 14:51 GMT

#102

On May 20 2011 23:46 Atlare wrote:

No that doesn't work since people would probably not play the game if you were forced random -.-

I disagree. People would play more customs to tune their individual race matchups. Then ladder would be all RvR. I would love that.

I imagine SC tournies where player vs player is a bo9

pvp
pvz
pvt

tvp
tvz
tvt

zvp
zvz
zvt

It would be the only way to actually tell who the better StarCraft player is.

edit: "you're wrong because I think" statements are pointless.

Kenderson

Canada280 Posts

May 20 2011 14:52 GMT

#103

So I didn't read the entire OP, but how would you calculate whether forcefields are imbalanced? I don't claim to be an expert on the subject, but forcefields are the only thing in the game that really seem imba atm. From my perspective it's seemingly obvious. I'm sure there are many protoss players that would secretly agree and they love exploiting it. Imho a few sentries give the protoss way too much control over the battle. Then you see the games where the protoss gets A LOT of sentries and it gets to the point of completely unfair lol. I might be wrong Idk.

Suggestions for balancing forcefields (if they are in fact imba):
-Higher forcefield energy cost, for fewer forcefields total.
-Lower sentry energy cap, for fewer forcefields total.
-More expensive sentries, for fewer sentries total and thus fewer forcefields.
-Slower sentry energy regeneration, so it takes longer to recharge and to bank extra forcefields.
-Maybe a forcefield cooldown or something, so they can't use so many in a short time period.
-Smaller forcefield radius to decrease their effectiveness? Idk

Myia

173 Posts

May 20 2011 15:05 GMT

#104

I don't think this post was looking at specific in-game things to do with balance, but rather at how people view balance and how to view balance overall.

Black Gun

Germany 4482 Posts

May 20 2011 15:10 GMT

#105

wow, awesome thread and thanks a lot for the tons of effort you have put into this. will be an interesting read. even at a first, brief glimpse, i could see several good points that i can agree with.

Volka

Argentina 411 Posts

May 20 2011 15:10 GMT

#106

I really wanted to see the math on how to prove imbalance, and some DATA analysis. That would have been interesting, even though I don't agree with many of your assumptions.

I found the Myth section particularly disturbing.

Scriptix

United States 145 Posts

May 20 2011 15:16 GMT

#107

Well put, I enjoyed reading it. I really like your train of thought.

whatthefat

United States 918 Posts

May 20 2011 15:30 GMT

#108

One of the best posts I've seen on TL. You covered an incredible amount of ground, and your outcomes provide a usable framework for interpreting the data. I wonder if appropriate statistics can be derived from sc2ranks.

TheFrankOne

United States 667 Posts

May 20 2011 15:51 GMT

#109

I would like to add there is a massive problem with balancing for the top level of the game from Blizzard's perspective: they will not be balancing for the majority of their customers. If in the lower leagues something is broken, i.e. mass void rays, something needs to be done so Blizzard can assure the vast majority of their customers still enjoy the game and recommend it to others.

While in a theoretical sense balancing for the top works, it is a bad business decision and so will/should not be the exclusive way the game is balanced.

CryMore

United States 497 Posts

May 20 2011 17:40 GMT

#110

Great post.

I think the ultimate conclusion is that understanding racial imbalance is a ridiculously difficult task that can't be simplified to current statistics or specific game mechanics/units. No one is qualified to talk about balance, even the top players. What balance talk degrades to is just biased statements backed up by statistically flawed data.

Please don't get discouraged by anyone who disagrees with you without any real backup. They don't understand you are not doing any ACTUAL data analysis, but providing a model of how racial imbalance can be viewed from a purely statistical viewpoint. Please write your next article, I would be really interested in seeing what kind of conclusion you can draw.

infinity2k9

United Kingdom 2397 Posts

May 20 2011 17:59 GMT

#111

Yeah seriously, not going to quote this all, but I don't see why people think this is so particularly great just because it's a shitload of words. By this logic BW isn't balanced well enough, because PvT is easier for P for 99% of people. I can't be bothered to go into this any deeper, because this whole thing is basically useless and could have been said in about 100 times fewer words.

sylverfyre

United States 8298 Posts

May 20 2011 18:17 GMT

#112

On May 21 2011 02:59 infinity2k9 wrote:

But the problem with the pros is that they're inherently outliers. We don't exactly have a magic number attached to every pro showing their skill (or even their skill in each matchup) even when we attempt to model it through Elo ratings and such. Statistically the pros are going to always have very odd looking win ratios, and it's extremely hard to draw conclusions from them.

Jombozeus

China 1014 Posts

May 20 2011 18:28 GMT

#113

On May 21 2011 03:17 sylverfyre wrote:

Hence the conclusion drawn should be that statistics is not a good way to measure imbalance, NOT that pros are not a good way to draw conclusions FOR statistics.

Since we are discussing imbalance at the maximum potential, the statistical outlier is the prime consideration, not something to be ignored. To assume that imbalance is equal at every level is absurd; as previously stated, the skill:winrate ratio does not scale linearly.

Sleight

2471 Posts

May 20 2011 19:11 GMT

#114

On May 21 2011 03:28 Jombozeus wrote:

Hence the conclusion drawn should be that statistics is not a good way to measure imbalance, NOT that pros are not a good way to draw conclusions FOR statistics.

Since we are discussing imbalance at the maximum potential, the statistical outlier is the prime consideration, not something to be ignored. To assume that imbalance is equal at every level is absurd; as previously stated, the skill:winrate ratio does not scale linearly.

If we cannot use statistics, what can we use as a metric to examine data? There is ONLY statistics. Within the model presented, the OP does a great job of supporting and defining his argument. Everyone saying imbalances vary across skill levels has a VALID point, but that doesn't make it true.

If every Bronze Z loses to 3 Rax with an imbalance metric of 2, and every Master Z loses to 2 Rax with an imbalance metric of 2, then the imbalance value would be the same overall, assuming the MU was otherwise in harmony, for the sake of argument.

The methodology presented argues this: If a race is overpowered against another race, it should exist at a similar level regardless of direct causation or mechanism across all levels for purposes of general game balance. (ie different means at different levels but same net result of imba)

What it does NOT argue is this: Racial imbalance is uniform in mechanism across the spectrum (ie 2 rax is always the cause of OP).

This approach to balance allows for exactly one thing: Identification and stratification of the most gross (meaning large) imbalances in a game for presence alone. Why such an imbalance is present is up for debate. This means that the game CAN BE balanced as the author proposes at the largest scale, and that tweaks in units and fine mechanics must be trade-offs in overall power to solve issues at the highest level.

Well done, OP. Very neat read.

Jombozeus

China 1014 Posts

May 20 2011 19:54 GMT

#115

On May 21 2011 04:11 Sleight wrote:

Contrary to popular belief, anecdotal evidence from pros usually does. Statistics is the only metric to measure data? Since when have we concluded that a metric is necessary?

The assumption you make with the net result is absolutely preposterous. It's grossly abusing inductive reasoning. The stats themselves show that at different levels, the win% of different races is different in each matchup. I don't understand how you can convince yourself that is a valid argument.

As there are still those who have not realized, identification of short term "imbalance" is easy with statistics, we say "hey, we see 55% winrate over terran as zerg at X point master level, hence zerg is more imba than terran at X point master level." That does NOT mean:

1. Zerg is imbalanced compared to terran at all levels
2. Zerg will exhibit the same winrate vs. terran tomorrow, next week, or next month due to a new paradigm shift
3. Zerg players and terran players will exhibit the same level of increase in general skill at the same rate
4. We shouldn't listen to IdrA because of his cognitive bias towards the zerg race

1, 2, 3 are assumptions that are lapses in logic, while 4 is a conclusion the OP made with the utmost lack of respect for pro gamers.

Would you go up to a scientist and tell him: "Hey, I know you're a scientist, but because you have cognitive bias I don't believe you should be able to make conclusions about science."

PS: Short term can be as short as an infinitely small amount of time

forSeohyun

504 Posts

May 20 2011 20:16 GMT

#116

On May 21 2011 03:17 sylverfyre wrote:

This is wrong:

That they are statistical outliers makes no difference - if you randomly select 50 Grandmaster Zerg, 50 Grandmaster Protoss, 50 Grandmaster Terran, they should have equal win ratios on average (within a standard deviation or so, considering the standard error of the mean).

Averaging over a large number of samples reduces the variance of the mean. "Odd looking win ratios" are therefore a non-problem as long as the number of randomly selected samples is large.
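
A minimal sketch of that claim, under stated assumptions: the population of win ratios below is made up (uniform between 30% and 90%), and the function name `sample_mean_spread` is hypothetical. It only illustrates that the mean of a random sample of 50 varies much less from draw to draw than the mean of a random sample of 5, even when individual players look "odd".

```python
import random
import statistics

def sample_mean_spread(population, sample_size, trials=2000, seed=0):
    """Standard deviation of the sample mean over many random draws."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(trials)
    ]
    return statistics.pstdev(means)

# Hypothetical population: 1000 players whose individual win ratios vary widely.
population_rng = random.Random(1)
population = [population_rng.uniform(0.3, 0.9) for _ in range(1000)]

spread_5 = sample_mean_spread(population, 5)
spread_50 = sample_mean_spread(population, 50)
print(f"spread of the mean, n=5:  {spread_5:.3f}")
print(f"spread of the mean, n=50: {spread_50:.3f}")
```

The spread shrinks roughly with the square root of the sample size, which is why the draw has to be both random and reasonably large for the averaging argument to hold.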

Clog

United States 950 Posts

May 20 2011 20:24 GMT

#117

This post was rather useless.

His entire point was, as you said, the model people use for imbalance is their own opinions. The OP is trying to make an effort to move away from that and provide somewhat of a structure for balance discussions.

If you're going to try and discredit his post, you should put some effort into not only reading it, but understanding it as well.

iSTime

1579 Posts

May 20 2011 20:28 GMT

#118

On May 20 2011 15:00 avilo wrote:
The model everyone uses for imbalance is...guess what?

Their personal bias and opinion. Notice my use of italics for emphasis.

So basically your argument is, "Everyone uses a broken model for imbalance, therefore fuck you for trying to present a more logical model."

That is such mind-bogglingly poor logic.

OP's model is flawed in many ways, but your counterargument is even worse.

Sleight

2471 Posts

May 20 2011 20:46 GMT

#119

On May 21 2011 04:54 Jombozeus wrote:

On May 21 2011 04:11 Sleight wrote:

On May 21 2011 03:28 Jombozeus wrote:

On May 21 2011 03:17 sylverfyre wrote:

On May 21 2011 02:59 infinity2k9 wrote:

Contrary to popular belief, anecdotal evidence from pros usually does. Statistics is the only metric to measure data? Since when have we concluded that a metric is necessary?

The assumption you make with the net result is absolutely preposterous. It's grossly abusing inductive reasoning. The stats themselves show that at different levels, the win% of different races is different in each matchup. I don't understand how you can convince yourself that is a valid argument.

As there are still those who have not realized, identification of short term "imbalance" is easy with statistics, we say "hey, we see 55% winrate over terran as zerg at X point master level, hence zerg is more imba than terran at X point master level." That does NOT mean:

1. Zerg is imbalanced compared to terran at all levels
2. Zerg will exhibit the same winrate vs. terran tomorrow, next week, or next month due to a new paradigm shift
3. Zerg players and terran players will exhibit the same level of increase in general skill at the same rate
4. We shouldn't listen to IdrA because of his cognitive bias towards the zerg race

1, 2, 3 are assumptions that are lapses in logic, while 4 is a conclusion the OP made with the utmost lack of respect for pro gamers.

Would you go up to a scientist and tell him: "Hey, I know you're a scientist, but because you have cognitive bias I don't believe you should be able to make conclusions about science."

PS: Short term can be as short as an infinitely small amount of time

YES! That's exactly the point. You DO say that in the current scientific community. There is a reason paper publication works like it does. Peer review is established so that one person's findings have to hold up to experts in the same area who have NO PERSONAL GAIN.

Your quote is EXACTLY why we don't use a single lab's results or a single paper. No one cares if you feel that way if it doesn't hold up to other un-invested parties.

So let's put the statement of IdrA, a ZvT expert, up against those of more ZvT experts AND TvZ experts, and see if they all hold up. Soon enough we are sampling a monster pool and back at statistical analysis.

GeorgeForeman

United States 1746 Posts

May 20 2011 20:53 GMT

#120

An excellent discussion of the way imbalance manifests itself on the ladder.

Of course, it's important to realize that what the OP did was use an incredibly simple model to elucidate a much more complicated game that is SC2. For example, skill is not one-dimensional. There are a lot of things that go into how well a player performs, and nerfs don't interact with these skills uniformly.

The point about not using ~50 or whatever games from the latest GSL as a basis for cries of "imba" is also well taken. Not only is the sample pathetically small, it's also highly biased by virtue of the fact that the games are not random draws but in fact are heavily dependent upon previous matches. Moreover, the direction of bias is in no way clear.

My point is not that the OP was bad or wrong. Rather, I think the important thing to take away is that if there really is imbalance, it's incredibly difficult to suss it out based on win rates alone, even when we use a metric ton of simplifying assumptions. When discussing balance in the future, a modest approach is therefore recommended.
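
To put a rough number on the small-sample point, here is a sketch under stated assumptions: the 30-20 record is hypothetical, `win_rate_interval` is an illustrative helper, and the normal approximation treats the games as independent draws, which (as noted above) tournament games are not, so the real uncertainty is if anything larger.

```python
import math

def win_rate_interval(wins, games, z=1.96):
    """Approximate 95% interval for a win rate (normal approximation).

    Assumes independent games, which is optimistic for tournament data.
    """
    p = wins / games
    half_width = z * math.sqrt(p * (1 - p) / games)
    return max(0.0, p - half_width), min(1.0, p + half_width)

# Hypothetical record: one race wins 30 of 50 games against another.
low, high = win_rate_interval(30, 50)
print(f"observed 60%, plausible range roughly {low:.0%} to {high:.0%}")
```

Even a 60% observed win rate over 50 games leaves a range that still contains 50%, so a record like this cannot separate "imba" from noise on its own.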

VIB

Brazil 3567 Posts

May 20 2011 20:56 GMT

#121

On May 20 2011 23:25 Warble wrote:
Blizzard probably has its own in-house statisticians who spend 8 hours a day on this problem while I do it on the bus. The difference is that Blizzard isn't sharing their findings with the community.

No they don't. Did you ever read any book on game design? Balancing is a very simple process. It's a brute force repetition of trial and error. Nothing more than that. It's pure intuition. Pure guess work. There's zero science. There's zero math. Blizzard has no statisticians at all.

avilo

United States 4100 Posts

May 20 2011 21:02 GMT

#122

On May 21 2011 05:28 PJA wrote:

I was pointing out that there is no model and you cannot model it. Everyone has their own perception of what the balance of the game is based on their own bias and experiences. It's not something you can quantify with numbers or statistics.

You can't try and model something like this with numbers or statistics. Someone might think a unit is massively OP, and someone else might think it's just fine. That's their opinions. They didn't need a complicated formula to come to their conclusion, and people's opinions and conclusions about balance are ever changing as they learn more about the game.

So yah...you missed the point too...

Cheeznuklz

60 Posts

May 20 2011 21:02 GMT

#123

Thank you for being such a baller. I'd love to see the analysis that you didn't post, please do it!

oDieN[Siege]

United States 2905 Posts

May 20 2011 21:03 GMT

#124

I agree with what most people said above.. nice post, especially it being your first.

eXwOn

Canada 351 Posts

May 20 2011 21:16 GMT

#125

I normally don't like these types of posts, but as I read on I quickly agreed with every point. This is probably one of my favourite reads on TL so far. Good job man!

L3gendary

Canada 1470 Posts

May 20 2011 21:16 GMT

#126

Not sure what significance lower leagues have with regard to imbalance. And stats are utterly meaningless when it comes to balance. Blizzard doesn't balance their game around percentages but instead tries to balance things that break the game.

Look at the thor change for example. Were there a lot of people using thors in TvP before? No. Yet they nerfed the thor because of an abusive strategy that was used very rarely.
edit: same with 4 gate and so on.

iSTime

1579 Posts

May 20 2011 21:30 GMT

#127

On May 21 2011 06:02 avilo wrote:

Just because people have a massively flawed understanding of game theory doesn't mean they are correct.

EDIT:
To give an example: Take the game of Go. Black goes first, but white gets some number of points at the beginning of the game for free, this is called komi. The game is clearly imbalanced if komi is 0, since with perfect play black will always win. With 50 komi it is clearly imbalanced, since with perfect play white will always win.

Pros used to believe 5 komi made the game balanced. This was their opinion, but as pros became stronger they now believe that 7 or 8 komi is actually balanced. Again, komi is more or less set by current professional opinion, though I'm sure statistics from recent games are taken into account.

Anyway, despite any professional player's opinion, there is actually some value for komi at which the game of Go is perfectly balanced. Even if it were set to that value, though, I'm sure plenty of pros would feel that the game isn't balanced.

You might argue that Go is different because it is a finite deterministic game, but that is actually not relevant. If you'd like to read up on game theory you can do that yourself, though.

Sleight

2471 Posts

May 20 2011 21:45 GMT

#128

On May 21 2011 06:30 PJA wrote:

Simple and precise. And this illustrates how modeling works. Chess, despite more possible games than atoms in the UNIVERSE, is easily modeled and cheated to simplicity by Computers. Programmers first tried brute force and failed epically. Then they "taught" computers candidate move theory. Then working with GMs they have finally taught it sufficient positional understanding that the top computers dominate most top GMs handily.

Go on the other hand cannot be simplified usefully at our understanding of the game. So computers barely beat mediocre amateurs.

Starcraft sits at the Go level, look at competitive AIs in BW and SC2, if they don't cheat, they can't possibly handle the data in a relevant manner. Humans however can ignore pieces selectively on the fly and assess "what matters," allowing us to create models computation fails to resolve.

Statistics are perfectly useful in appropriate settings, aka trying to determine the gestalt of the MU imbalances. Peer review is the method for fine tuning. This is what the OP supports, this is what is done.

Cheers.

thebole1

Serbia 126 Posts

May 20 2011 21:48 GMT

#129

This is a topic about imbalance, yeah? If it is, I have a few things to say...

First, there is imbalance around units whose counters don't work... and some abilities help make them uncounterable...

Simple examples: colossi, you can't counter them with immortals... (maybe void rays work, but not in a real game...)

Mass marines with stim pack: there are no units to counter them in battle except colossi, storm (which is an ability), banelings, fungal... The problem is that only splash DPS can stop them...

The simple fact is that you can't counter mass-DPS colossi, banelings, or stimmed MMM without AoE units or abilities... I have 1000 examples of units (compositions) that don't counter them, but I don't have time to write them out...

The problem is in the gameplay too... About 5 minutes ago I watched Liquid Tyler, a great player with great skill and everything, but he played against Terran and didn't build colossi or HTs... because of the pressure he built gateway units...

What happened: he managed great macro and micro and picked off (killed) a few medivacs, but A-moved MMM, without any skill, was simply able to kill a great player's army like it was nothing... You don't need skill to A-move and win, whoever your opponent is...

The gameplay mistake is that the units and abilities that kill skill and micro are STIM PACK, banelings, colossi and FF... but if they nerf the DPS of these abilities and units, maybe the gameplay will not be as broken as it is now... Sorry for my English... thanks for reading.

VIB

Brazil 3567 Posts

May 20 2011 22:05 GMT

#130

On May 21 2011 06:45 Sleight wrote:
Chess, despite more possible games than atoms in the UNIVERSE, is easily modeled and cheated to simplicity by Computers. Programmers first tried brute force and failed epically. Then they "taught" computers candidate move theory. Then working with GMs they have finally taught it sufficient positional understanding that the top computers dominate most top GMs handily.

What are you talking about, computers still use brute force to beat GMs with varying levels of candidate moves to try to reduce time. Candidate moves for computers is a tradeoff that makes the computer play worse but is faster to calculate. So they try to balance how much brute forcing to use for slower computers.

The only reason Go is harder for computers is because there are more possible moves to analyze through brute force.

On May 21 2011 06:45 Sleight wrote:
Starcraft sits at the Go level, look at competitive AIs in BW and SC2, if they don't cheat, they can't possibly handle the data in a relevant manner. Humans however can ignore pieces selectively on the fly and assess "what matters," allowing us to create models computation fails to resolve.

One thing has nothing to do with the other; existing SC AIs only lose to humans because they suck and no one with the right skills ever tried to seriously tackle that problem. Making a SC AI that beats humans is technologically trivial. The only hard problem would be reverse engineering the game engine to realize what effect each move will have before executing it (in chess you already know that if you place your pawn in range of another pawn, you lose your pawn; in SC that depends on the inside of the game engine). Once that's done, there are very very few possible moves to analyze. It would be a very easy game to brute force.

scorch-

United States 816 Posts

May 20 2011 22:32 GMT

#131

On May 21 2011 07:05 VIB wrote:
One thing has nothing to do with the other; existing SC AIs only lose to humans because they suck and no one with the right skills ever tried to seriously tackle that problem. Making a SC AI that beats humans is technologically trivial. The only hard problem would be reverse engineering the game engine to realize what effect each move will have before executing it (in chess you already know that if you place your pawn in range of another pawn, you lose your pawn; in SC that depends on the inside of the game engine). Once that's done, there are very very few possible moves to analyze. It would be a very easy game to brute force.

You don't need access to the game engine to do this. You only need to know the game rules, which are easy enough to determine.

I like that OP tries to find a balance discussion for the ladder at steady-state, rather than just talking about the game. There's an inherent problem in this though, because it assumes everyone on the ladder has reached their level of incompetency and is no longer improving. In actuality, players are constantly improving and doing so at possibly different rates.

If Player A and Player B play 100 games with a 50% win ratio, only in a theoretical world could you say with any certainty that they are evenly skilled. The order that they won games is important. If Player A wins games 1-90 with one strategy and player B discovers a new strategy and wins games 91-100... who is more skilled? The player with 90% win rate or the player who will now win every game until the other improves?

Why does this affect your statistical summary? Skill gain does not occur evenly for each player, and cannot be predicted by any conceivable ladder-based metric. There is no state that would allow an analysis of the type you're proposing. It's an interesting idea, but I don't think it's ultimately useful for balance discussions.
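
The ordering point can be sketched with the 90/10 example above. The match history is hypothetical and `win_rate` is just an illustrative helper; the point is only that an aggregate win rate and a recent-window win rate can tell opposite stories about the same two players.

```python
def win_rate(results, player):
    """Fraction of games in `results` won by `player`."""
    return sum(1 for winner in results if winner == player) / len(results)

# Hypothetical match history: A wins games 1-90, then B wins games 91-100.
results = ["A"] * 90 + ["B"] * 10

overall_a = win_rate(results, "A")       # whole history
recent_a = win_rate(results[-10:], "A")  # last 10 games only

print(f"A overall: {overall_a:.0%}, A over the last 10 games: {recent_a:.0%}")
# prints "A overall: 90%, A over the last 10 games: 0%"
```

Any summary that pools the whole history implicitly assumes skill is not changing over the window being pooled.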

VIB

Brazil 3567 Posts

May 20 2011 22:36 GMT

#132

On May 21 2011 07:32 scorch- wrote:
You don't need access to the game engine to do this. You only need to know the game rules, which are easy enough to determine.

Yes you do, no it's not easy to determine. Bugs and other peculiarities in the engine are not technically possible to determine from an outsider point of view. Reverse engineering is required.

iSTime

1579 Posts

May 20 2011 22:43 GMT

#133

I don't understand how there are very very few possible moves to analyze... Even in a simple engagement of, say, 1 zealot 2 stalkers vs 1 zealot 2 stalkers there are so many different variations. If you include possible locations to have your scouting probe for each player and all the possible paths you could be taking it gets even worse.

EDIT: In case I was not clear, the location of the scouting probes is important because they sometimes come into play in these situations.

scorch-

United States 816 Posts

May 20 2011 22:54 GMT

#134

On May 21 2011 07:36 VIB wrote:

Yes you do, no it's not easy to determine. Bugs and other peculiarities in the engine are not technically possible to determine from an outsider point of view. Reverse engineering is required.

Human beings play the game. An AI plays the game. The AI performs the exact same actions a human being does when a human plays the game, only instead of actually moving a mouse, it just tells the computer it moved a mouse. Did humans "reverse engineer" starcraft to be able to play it? No they played it, learned the rules, and got better at it. Stop making shit up.

Wrongspeedy

United States 1655 Posts

May 20 2011 23:00 GMT

#135

On May 20 2011 14:10 usa11220 wrote:
best first post ever?

Best first reply to best first post. Wait has this been done?

This is def my favorite OP since starting to visit this website I think. At least when it comes to any kind of strategy or balance discussion.

Duravi

United States 1205 Posts

May 20 2011 23:31 GMT

#136

The biggest problem I think this model has is that it takes too broad of an approach to what "balance" is. You are modeling the statistics around trying to show if matchups are imbalanced. According to your article as far as I understand it we could have a statistical situation of balance even if each race has very abusive strategies for certain maps and match ups.

I do not think Blizzard "balances" the game at this large a scale, nor is it necessarily useful to. What Blizzard considers balance is outcomes being highly dependent on player skill and dynamic play. They do not want a situation where one strategy is always the best against x race, no matter what that race does, even if statistically we could consider that balanced. They have been known to make changes based on replay packs sent in by pros showing certain abusive timings, for instance. This is what any kind of statistical analysis should be focused on, not a broad race vs. race discussion. A race is a set of all possible units and strategies that race is able to use. In any particular game, specific parts of that set are used in quantities that vary with time. That is why even a mirror match up can be "imbalanced", if both races arrive at a place where they must play out the game the exact same way. The player with more skill will undoubtedly win, but there is no variation in game play. Of course some strategies will be more useful than others, but the goal of Blizzard is to have the widest array of viable strategies while at the same time keeping player skill the predominant factor in determining outcomes, and you are completely ignoring the first part.

In this sense it is much more useful for a pro player to say to Blizzard, "this strategy on this map is abusive, it limits my options too severely and/or it does not allow the more skilled player to win as often as they should" and then provide examples, than it is to give a statistical analysis of a ton of different strategies across a large number of maps. By ignoring the fact that Blizzard wants to have varied game play in StarCraft you are not correctly defining what balance is.

TLDR; Race vs. Race balance is not important, strategy vs. strategy balance is; and Blizzard has two goals with balance: to favor the player with more skill AND to have varied and dynamic gameplay, which this analysis does not even consider.

thebole1

Serbia 126 Posts

May 20 2011 23:40 GMT

#137

TLDR; Race vs. Race balance is not important, strategy vs. strategy balance is; and Blizzard has two goals with balance: to favor the player with more skill AND to have varied and dynamic gameplay, which this analysis does not even consider.

Man, the problem is that the gameplay design is broken... IdrA told the truth when he said that low-skill people can win against high-skill people...

I watched Liquid Tyler in a game where he faced (stimmed) MMM; he only brought out gateway units and he lost EASILY...

So no matter how much skill you have, it's not important. THIS GAME HAS BECOME A GAME OF UPGRADES, NOT SKILL... END...

No matter how much micro you have, stim pack, colossi, banelings and force fields kill your skill... The game LIMITS the skill of great PRO players like IdrA and others... that is a BIG PROBLEM...

My conclusion is that units that deal ultra-massive DPS are killing gameplay and skill...

Slago

Canada 726 Posts

May 20 2011 23:43 GMT

#138

I appreciate the work you put into this, but really don't agree with a lot of it.

ploy

United States 416 Posts

May 21 2011 00:17 GMT

#139

Funny coming from the biggest balance whiner around... only about their own race, of course.

Sentient

United States 437 Posts

May 21 2011 00:23 GMT

#140

On May 21 2011 05:56 VIB wrote:
No they don't. Did you ever read any book on game design? Balancing is a very simple process. It's a brute force repetition of trial and error. Nothing more than that. It's pure intuition. Pure guess work. There's zero science. There's zero math. Blizzard has no statisticians at all.

To be fair, Blizzard did hire some statisticians to develop the new ladder system, though I don't know if they are still around.

I agree with your sentiment though. I don't think posts like these are that helpful. Balance is about perception and gut instinct, not about statistics. It's about game design -- losses should feel "fair". From an entertainment perspective, the breakdown of win percentages is less important than whether or not losses feel like the game has conspired against you. As someone else mentioned, mirror matches can feel incredibly unfair at times, which hints at balance problems even though the win percentage itself is exactly balanced.

tldr: Balance is a subjective opinion and is reached by consensus, not by statistics.

VIB

Brazil 3567 Posts

May 21 2011 00:34 GMT

#141

On May 21 2011 09:23 Sentient wrote:

I agree with your sentiment though. I don't think posts like these are that helpful. Balance is about perception and gut instinct, not about statistics. It's about game design -- losses should feel "fair". From an entertainment perspective, the breakdown of win percentages is less important than whether or not losses feel like the game has conspired against you. As someone else mentioned, mirror matches can feel incredibly unfair at times, which hints at balance problems even though the win percentage itself is exactly balanced.

tldr: Balance is a subjective opinion and is reached by consensus, not by statistics.

Balance can be objective; it doesn't need to be done with only intuition. I was only saying that the way games are made today, it is done with intuition; no one is trying to make a perfectly mathematically balanced game. And SC2 certainly isn't any different. Instead of the brute-force test-patch iterations Blizzard does, we could have built a game designed around a balanced model. But SC2 would have to be redone from scratch for that to happen.

Other than that I do agree with what you're saying. Statistical inference is worthless for balancing. And above anything else, before we start any objective discussion about balance, we must first define what "balance" means. There's a lot of disagreement about what a balanced game would be. Most people arguing about balance are comparing apples to oranges and thus won't get anywhere.

Beakyboo

United States 485 Posts

May 21 2011 00:43 GMT

#142

On May 21 2011 04:11 Sleight wrote:

And why exactly would you think the "methodology presented" is in fact correct? I think most people would argue that that is totally false in fact. For an extreme counterexample, imagine automaton 2000 playing TvZ. I don't think Zerg could ever beat marines with flawless micro. Unless you want to imply that a human skill cap actually creates the situation described, but it seems a rather arbitrary perspective.

Balancing SC2 around statistics would be a rather silly approach, especially considering that we want more from SC2 than simply a 50% win rate among all races at a pro level. Coin flipping is a fair game but it's not a very interesting one. Using anecdotal evidence from players while at the same time considering their potential bias is really the most logical way to go about balance in my opinion. SC2 could be a fair game in the sense that every race is competitive at a pro level but there can still exist fundamental problems that would be hard to reveal through pure statistics.

PR4Y

United States260 Posts

May 21 2011 00:47 GMT

#143

any1 else think this guy is secretly a blizzard employee feeding the TL community information?

nobody does this much statistical analysis and writing just to make a 1st post on teamliquid, unless they have a hidden agenda.

conspiracy.

xlava

United States676 Posts

May 21 2011 00:55 GMT

#144

I understood pretty much all of it. The only thing I disagree with is that you can reference balance to sub diamond players. In those leagues statistics are irrelevant. You'd have to define a secondary function relating player skill to league, else it doesn't work.

Overall though, tremendous post, it's saturated with awesome stuff, and I love your applications of math to the game. This is truly a lot to handle; it must have taken you so much time to make.

Thanks for posting.

figq

12519 Posts

May 21 2011 00:59 GMT

#145

Our statistics from GSL would show MKP and MVP winning most of their games against other terrans. Conclusion? It’s not terran that’s OP, it’s the players.

That assumes direct correlation between the skill of a player in different matchups, which isn't even the case.

I think for the sake of analysis the game should be looked at as 9 different games, and players to have different ELO for each matchup.
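As a rough illustration of the "9 different games" idea, here is a minimal sketch that keeps a separate Elo rating per player per matchup. The class name, the K-factor of 32, the starting rating of 1500, and the player names are all assumptions made for the example; nothing here comes from Blizzard's ladder or any real ranking site.

```python
from collections import defaultdict

K_FACTOR = 32.0          # assumed update step, a common textbook choice
START_RATING = 1500.0    # assumed starting rating for every (player, matchup)


def expected_score(rating_a, rating_b):
    """Standard Elo expected score of A against B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


class MatchupElo:
    """Hypothetical tracker: one rating per (player, matchup) pair."""

    def __init__(self):
        self.ratings = defaultdict(lambda: START_RATING)

    def record(self, winner, winner_race, loser, loser_race):
        # A Terran beating a Zerg updates the winner's "TvZ" rating
        # and the loser's "ZvT" rating, and nothing else.
        w_key = (winner, winner_race + "v" + loser_race)
        l_key = (loser, loser_race + "v" + winner_race)
        exp_w = expected_score(self.ratings[w_key], self.ratings[l_key])
        delta = K_FACTOR * (1.0 - exp_w)
        self.ratings[w_key] += delta
        self.ratings[l_key] -= delta


elo = MatchupElo()
elo.record("PlayerA", "T", "PlayerB", "Z")
print(elo.ratings[("PlayerA", "TvZ")], elo.ratings[("PlayerA", "TvP")])
```

With three races there are nine ordered matchups (TvT, TvZ, TvP, and so on), so a player who is strong in one matchup and weak in another ends up with visibly different numbers instead of one blended rating.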

Also, perhaps a revolutionary thought:
What if the endless debate about (im)balance is in the flaw of the whole concept of competition.
Or more precisely, the attempts to quantify the efforts involved in heterogeneous competition (which coincidentally is all human competition, because we are different). It's like measuring a bird flying and an elephant running on some unified scale of effort. It doesn't work. The abilities and the efforts are simply unique to each individual.

I don't know why it has never been taken seriously to have pro-leagues of only random players, or even better - that you have to play a game of each matchup once. In chess, they play turns, one with white, one with black, and so forth. In football they take turns playing home and away. Only in Starcraft you get this weird social experiment where a swimmer and a runner compete, with the pool liquid density, and the runner's shoe weight, being adjusted to find some elusive concept of balance.

Obviously, this process is doomed to be always corrupt and disputed, because it can't be objectively measured. But I guess that's the hook - the hook is this paradox of perpetual injustice. A game which is perfectly just is also usually solved and uninteresting. But hook them up with endless dynamic injustice, and they will play it and play it, till the end of the world. (:<

Side note regarding the argument about gender equality:
A summarised report on college admission rates show that 40% of college entrants are female and 60% of college entrants are male. Is this “good enough”?

To the untrained eye, the numbers look pretty close to 50%. But in reality this is a massive difference. The true numbers crop up when we compare them with each other. So out of 100 entrants, 40 are women and 60 are men. That means 150% as many men enter college as women. When presented that way, the magnitude of the difference becomes much more evident and it becomes clear there is a sexism problem with the college admission system.

I'm a very strong believer in equality for both sexes, however this case does not necessarily mean injustice of the admission tests, but, for example, injustice of the whole cultural environment and educational process. It's hard to twist it that tests in mathematics are sex-biased inherently by their design - but rather our culture as a whole does not manage to operate in such way that both sexes develop equal cognitive mathematical ability.
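The 40/60 arithmetic in the quoted passage can be checked in a couple of lines. This is only a sketch of the share-to-ratio conversion; the function name is made up for the example.

```python
def share_ratio(share_a, share_b):
    """How many times larger group A is than group B, given their shares."""
    return share_a / share_b


# 60 men and 40 women out of 100 entrants, as in the quoted example.
ratio = share_ratio(60, 40)
print(f"{ratio:.0%} as many men as women")
```

The same conversion applies to win rates: a 60% win rate means 1.5 wins for every loss, which is why a split that looks close to 50/50 can still be a large effect.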

MrCon

France29748 Posts

May 21 2011 01:08 GMT

#146

The OP was right when he said in the introduction that these kinds of threads are destroyed by nitpickers

(or people that made their idea based on what they want, then argue only on things that favor their point)

avilo

United States4100 Posts

May 21 2011 01:35 GMT

#147

On May 21 2011 09:17 ploy wrote:

Funny coming from the biggest balance whiner around... only about their own race, of course.

If you have nothing to say then don't talk at all. Don't be one of "those" people that reference random things from months ago that aren't relevant. Did you just come into this thread to do that? Where in this thread did I utter a word about what I (personally) think of the balance of the game?

usethis2

2164 Posts

May 21 2011 01:47 GMT

#148

Excellent work but you're completely ignoring the fact that Activision Blizzard is a corporation, which exists to generate profit and to raise the value of their stocks. You seemed to have spent a lot of time on "affinity" section (as well as advice for would-be pros), but I think that's clearly misleading, at least until SC2 sales (including upcoming 2 expansions) plateau.

It's clear that Blizzard sees the future (i.e. money) in e-sports, but that is, well, the future. Their revenue comes by and large from the sale of games. So I wasn't surprised at all that the introductory race, the Terrans, had the most finished look and was the most dominant race at release (compare it with Zerg's artwork and non-existent sound/voice acting). So the race that new players are most likely to play has also been the strongest race, unlike your "affinity" claim where a less popular race would be more likely to be treated favorably by patches. Why? I strongly suspect that Blizzard (or any sane corporation that is obliged to make a profit for shareholders) did not want to frustrate the general public who had just finished tasting SC2 by completing the single-player campaign and then jumped into the sea of Battle.net for some live action.

When each expansion hits, the number one priority for Blizzard is profit. Everything else can wait. Once all the expansions are out and sales level off, balance will take the front seat. And even then, you really can't trust a corporation to behave in a completely neutral manner at the cost of making money. For example, Blizzard released the first patch in years for WarCraft 3 (no one thought any more patches were coming for that game), and interestingly the patch buffed the "Human" race after all those years. At the same time we learn that WC3 is rocking in China and the Human race is by far the most popular race there. Does it get you thinking?

Jombozeus

China1014 Posts

May 21 2011 02:06 GMT

#149

On May 21 2011 05:46 Sleight wrote:

On May 21 2011 04:54 Jombozeus wrote:

On May 21 2011 04:11 Sleight wrote:

On May 21 2011 03:28 Jombozeus wrote:

On May 21 2011 03:17 sylverfyre wrote:

On May 21 2011 02:59 infinity2k9 wrote:

Contrary to popular belief, anecdotal evidence from pros usually do. Statistics is the only metric to measure data? Since when have we concluded that a metrics is necessary?

The assumption you make with the net result is absolutely preposterous. Its grossly abusing inductive reasoning. The stats themselves show that at different levels, the win% of different races is different in each matchup. I don't understand how you can convince yourself that is a valid argument.

As there are still those who have not realized, identification of short term "imbalance" is easy with statistics, we say "hey, we see 55% winrate over terran as zerg at X point master level, hence zerg is more imba than terran at X point master level." That does NOT mean:

1. Zerg is imbalanced compared to terran at all levels
2. Zerg will exhibit the same winrate vs. terran tomorrow, next week, or next month due to a new paradigm shift
3. Zerg players and terran players will exhibit the same level of increase in general skill at the same rate
4. We shouldn't listen to IdrA because of his cognitive bias towards the zerg race

1, 2 and 3 are assumptions that are lapses in logic, while 4 is a conclusion the OP made with the utmost lack of respect for pro gamers.

Would you go up to a scientist and tell him: "Hey, I know you're a scientist, but because you have cognitive bias I don't believe you should be able to make conclusions about science."

PS: Short term can be as short as an infinitely small amount of time

You're carrying this analogy out of proportions lol.

Scientists always come up with conclusions. You are required to come up with a conclusion in a scientific paper. Of course if IdrA says something, there will be other pros to argue with or against him, but no one will say his opinion is invalid because of bias.

To all these new posts:
Go is different from Starcraft. Go has an astronomically high amount of possible moves, but there are only two variables. There are thousands of variables in Starcraft with equally astronomically high amount of possible moves.

The variable for go is the black and white piece, while Starcraft has many different pieces, hence the analogy is a bit flimsy. You can theoretically get Go to be played at a 100% level given all the responses possible with a supersupersupersupercomputer that doesn't exist, because it has a finite amount of possible moves.

You can never get a computer to beat a player 100% of the time. If you are to do that, freedom of information must be given to the computer, or in other words, fog of war must be removed. Go gives perfect information.

Geniuszerg

Canada454 Posts

May 21 2011 02:08 GMT

#150

mind blown.
this was an excellent read, totally agreed!

Andwhy

United States91 Posts

May 21 2011 02:10 GMT

#151

I have more analysis on specific mechanics within SC2 but am not sure if anybody would be interested in seeing them.

The more the merrier. I love mathcraft, even if it's theory mathcraft. Thanks for all the effort you put into it.

Sleight

2471 Posts

May 21 2011 02:43 GMT

#152

On May 21 2011 11:06 Jombozeus wrote:

On May 21 2011 05:46 Sleight wrote:

On May 21 2011 04:54 Jombozeus wrote:

On May 21 2011 04:11 Sleight wrote:

On May 21 2011 03:28 Jombozeus wrote:

On May 21 2011 03:17 sylverfyre wrote:

On May 21 2011 02:59 infinity2k9 wrote:

Contrary to popular belief, anecdotal evidence from pros usually do. Statistics is the only metric to measure data? Since when have we concluded that a metrics is necessary?

The assumption you make with the net result is absolutely preposterous. Its grossly abusing inductive reasoning. The stats themselves show that at different levels, the win% of different races is different in each matchup. I don't understand how you can convince yourself that is a valid argument.

As there are still those who have not realized, identification of short term "imbalance" is easy with statistics, we say "hey, we see 55% winrate over terran as zerg at X point master level, hence zerg is more imba than terran at X point master level." That does NOT mean:

1. Zerg is imbalanced compared to terran at all levels
2. Zerg will exhibit the same winrate vs. terran tomorrow, next week, or next month due to a new paradigm shift
3. Zerg players and terran players will exhibit the same level of increase in general skill at the same rate
4. We shouldn't listen to IdrA because of his cognitive bias towards the zerg race

1, 2 and 3 are assumptions that are lapses in logic, while 4 is a conclusion the OP made with the utmost lack of respect for pro gamers.

Would you go up to a scientist and tell him: "Hey, I know you're a scientist, but because you have cognitive bias I don't believe you should be able to make conclusions about science."

PS: Short term can be as short as an infinitely small amount of time

You're carrying this analogy out of proportions lol.

Scientists always come up with conclusions. You are required to come up with a conclusion in a scientific paper. Of course if IdrA says something, there will be other pros to argue with or against him, [1] but no one will say his opinion is invalid because of bias.

To all these new posts:
Go is different from Starcraft. Go has an astronomically high amount of possible moves, but there are only two variables. There are thousands of variables in Starcraft with equally astronomically high amount of possible moves.

[2] The variable for go is the black and white piece, while Starcraft has many different pieces, hence the analogy is a bit flimsy. You can theoretically get Go to be played at a 100% level given all the responses possible with a supersupersupersupercomputer that doesn't exist, because it has a finite amount of possible moves.

[3]You can never get a computer to beat a player 100% of the time. If you are to do that, freedom of information must be given to the computer, or in other words, fog of war must be removed. Go gives perfect information.

[1] Incorrect. That is EXACTLY what you will say. Listen to the first balance conversation Day[9] had on SOTG after SC2 officially released for the exact methodology of explaining this. Effectively, it comes to this; You, meaning every single person, cannot have an unbiased opinion in any meaningful way. Results are based on data and anyone who believes data does not lie has never worked on major publications in the sciences. The gold standard is a Peer-Review paper because ONLY IF other people find it not just plausible, but recreatable and demonstrable, is it acceptable. If you say ZvT is imbalanced, you have to show the exact situation and go thru and show each moment of contention in context. Data is meaningless unless it is unassailable.

[2] Incorrect. You literally have never played Go to say this. Playing White vs Black is one of the infinite variables. Every move in Go after the opening series has between 50-250 possible followups. Like Chess, Go has unbelievable depth of strategy. There are positional imbalances that change and grow as the game progresses, there are space advantages, there is the advantage of playing first with black, etc. Variables mean values subject to change. Every single piece in Go is subject to change within a few short moves.

[3] Incorrect. You could easily build a computer to beat a human 100% of the time in a game of incomplete information given the right definition of "winning." Ever played against a good Poker AI? They obviously have perfect calculation of hand value, don't tilt, and a perfect memory. Furthermore, there is variability inserted into its play that makes it very difficult to read while you yourself are an open book after the first 30 hands. Heads up? Yeah people can sometimes get super lucky and beat the computer before it compiles a useful hand history, but at a table? A computer always wins given enough time.

Starcraft is the same idea. There are strictly superior strategies with perfect execution; we just haven't identified them. Once you script an AI to execute the most powerful tactics blended with just the right timing sense and variability, I bet it could beat MC, Nestea, or MVP forever. In a single event/game/hand? Sure, a human might win. But play a BoX and as X grows, computers will dominate, just as poker AIs wipe the floor with humans as the stack size expands relative to the blinds.
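The "as X grows" claim can be made concrete with a small sketch. It assumes every game in the series is independent and that the stronger side wins each game with the same fixed probability p; both are simplifying assumptions for illustration, not measured properties of SC2 or poker. The function name is hypothetical.

```python
from math import comb


def series_win_probability(p, best_of):
    """Chance of winning a best-of-N series given per-game win chance p.

    Assumes independent games with a constant p. Winning the series is
    equivalent to winning a majority of N games if all N were played.
    """
    if best_of < 1 or best_of % 2 == 0:
        raise ValueError("best_of must be a positive odd number")
    needed = best_of // 2 + 1
    return sum(
        comb(best_of, wins) * p ** wins * (1 - p) ** (best_of - wins)
        for wins in range(needed, best_of + 1)
    )


for n in (1, 3, 7, 21, 101):
    print(n, round(series_win_probability(0.6, n), 4))
```

Under these assumptions a 60% per-game edge turns into an increasingly lopsided series result as the series gets longer, though for any p below 1 the series win chance stays below 100%.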

joshboy42

Australia116 Posts

May 21 2011 03:07 GMT

#153

Very interesting read to the OP, I'd be interested to see what conclusions you come to

Jombozeus

China1014 Posts

May 21 2011 03:07 GMT

#154

1. lol. Don't go all smartypants and explain what a peer review is, everyone knows. You can disprove a scientist with peer review, but you cannot say his opinion is invalid. I don't see the concept that is so hard to grasp. You put IdrA and Cruncher in a room to have a debate, they will give you different answers on balance but their opinions are equally valid. You put a Keynesian economist and a neo-classical economist in a room and the same thing happens.

Peer review is used to prove (to a certain extent) and disprove a theory raised by a scientist, to see if the scientist is in essence right or wrong. The scientist can be wrong, but he's entitled to make his own conclusion. No one will say his opinion is invalid because of bias. I stand by that. You are taking this analogy and twisting it in your own words. His opinion may be wrong because he has made a mistake or was outright lying about the data, but no one will say "I reject your hypothesis because you have bias." At least not until there is sufficient proof that it is the case. Simple: A man is innocent until proven guilty.

2/3. I used to play Go at a semi-competitive level when I was young, don't make wild assumptions. White vs. Black are two variables. You can put in a white or a black piece, and the variable you can plug in is the place where you place the piece. You can map out an entire game of Go with two variables just like you can do with chess.

You cannot do this with Starcraft, and hence the impossibility of bruteforcing a build. The amount of variables, including the lack of information, will always result in some kind of coin flip. For example, after a prepatch 4gate encounter, the computer techs to colossus, while you tech to stargate. He does not know there is a stargate because he is denied this scouting information by vigilant stalker/detection micro. A perfectly microed colossus will never beat a phoenix.

Starcraft is similar to poker, but poker also has a finite number of moves. I say 100% and you say a poker program can beat the player 99% of the time. 99% is very different from 100%. The whole point is the approach towards infinity, not how large that number is.

ploy

United States416 Posts

May 21 2011 05:15 GMT

#155

roooofl, don't want to derail or anything, but there is no good AI poker bot except for perhaps heads up games. Other than that... not even close.

Also, you are oversimplifying chess/go. If you could map out an entire game of go/chess so easily, then the games would have already been "solved" by computers.... which computers are not close to doing at all as of right now.

Edit: On second thought, there are definitely no good AI poker bots. The only bots that can play heads up well are based purely on game theory, which means whatever the other player does means nothing to the bot. No AI involved, just game theory (simplified forms of heads up poker have been solved by game theory).

Jombozeus

China1014 Posts

May 21 2011 05:55 GMT

#156

On May 21 2011 14:15 ploy wrote:
roooofl, don't want to derail or anything, but there is no good AI poker bot except for perhaps heads up games. Other than that... not even close.

Also, you are oversimplifying chess/go. If you could map out an entire game of go/chess so easily, then the games would have already been "solved" by computers.... which computers are not close to doing at all as of right now.

Edit: On second thought, there are definitely no good AI poker bots. The only bots that can play heads up well are based purely on game theory, which means whatever the other player does means nothing to the bot. No AI involved, just game theory (simplified forms of heads up poker have been solved by game theory).

The amount of processing power in our supercomputers is a fraction of what is needed to solve every move of go.

Samhax

1054 Posts

May 21 2011 06:06 GMT

#157

I think computers in chess are pretty damn good; they can beat or draw against the top players. But in Go it's impossible for a computer to match a top player. Go is a lot more complicated than chess.

babyToSS

233 Posts

May 21 2011 07:01 GMT

#158

Wonderful post from the OP. I would like to add some more food for thought and would love it if other users could respond to my comments -

Have you considered the GSTL race matching preferences? Is there a trend in what race certain teams prefer to match up against other races/teams? Is this pattern entirely player dependent or race dependent? Even on a subconscious level, is there a pattern of a certain race being more favored against other races?

Even if the OP has not considered GSTL I do think, racial imbalances aside, other very interesting patterns regarding player styles, reputations etc can result in very interesting patterns of what a team plays against specific players/races.

Raid

United States398 Posts

May 21 2011 07:11 GMT

#159

I totally agree, I think balance should not be left to the community because we have to admit we are all biased towards a specific race. If you ask any pro player they will usually feel like their race is the weakest and no one is going to say their race is the strongest except maybe Genius because he is cocky.

Blizzard needs to stop patching and listening to pro player/community qq, and look at the results/games themselves. They should devote some sort of branch in their company to handling balance for SC2 and have it study/analyze games itself. People who work for the company in order to better develop a game are generally not biased and should be given credit.

krews

United States1308 Posts

May 21 2011 08:01 GMT

#160

wow what a smart and amazing first post

elkram

United States221 Posts

May 21 2011 08:36 GMT

#161

well after a day I was finally able to read your OP, most of it at least (i'm not into statistics all that much, at least the nitty-gritty of it). I thought it had some pretty awesome conclusions, just b/c I love learning new things, but that's just me. Anyways, I'm really hoping that you don't get dissuaded by all the random nit-pickers who have some issues with your OP. I'm really looking forward to another fantastic statistical analysis, as you do two things:

1) you keep the statistical parts and layman parts in spoilers, so reading them is optional depending on how bold you are

2) you keep both the statistical parts and layman parts in the OP, where most other OPs would have kept either only the statistical part or only the layman part, but not both.

Must have been a lot of effort to code everything and go back and catch most edits and everything, all-in-all a fantastic thread, would love to rate it a 5/5 if it were blog. Keep up the good work

EzEnd

United States17 Posts

May 21 2011 08:36 GMT

#162

Good post, but the word balance is just too hard to start a discussion with, because it is mostly about each player's opinion. Now if we think about balance in a game where there are 3 races (zerg, terran, and protoss), each with specific units, and different types of maps, this just takes the word balance to a new level. But it was an interesting post and fun to read

Candles

United Kingdom103 Posts

May 21 2011 09:05 GMT

#163

Incredible post. Good work indeed.

WinterNightz

United States111 Posts

May 21 2011 09:35 GMT

#164

So if a matchup is imbalanced, a disproportionate amount of race X players will be promoted to higher leagues, rather than maintaining equal league-population/total-population ratios to the other races.

^ OP in one sentence. I have to agree with the person who said "an incredibly long post does not a great OP make". (or whatever it was). It's a cute idea, but it's pretty short sighted. You assume so many things, the first of which being that a single point of imbalance affects all leagues equally, which is just plain not true.
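The one-sentence summary above compares a race's share of a league with its share of the whole player base. A minimal sketch of that comparison follows; the function name and all the population numbers are invented for illustration and are not real ladder data.

```python
def representation_ratio(league_counts, total_counts):
    """Race share inside one league divided by race share overall.

    A value of 1.0 means the race is represented in the league exactly
    in proportion to its overall popularity.
    """
    league_total = sum(league_counts.values())
    overall_total = sum(total_counts.values())
    return {
        race: (league_counts[race] / league_total)
        / (total_counts[race] / overall_total)
        for race in total_counts
    }


# Hypothetical numbers, not real data.
top_league = {"T": 50, "Z": 25, "P": 25}
everyone = {"T": 400, "Z": 300, "P": 300}
print(representation_ratio(top_league, everyone))
```

A ratio away from 1.0 only says the distribution is skewed. As the rest of this post argues, it does not by itself say why.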

And really, your entire mythbusting section just makes you sound like you haven't thought any of this through.

And even still, what's the point? Your entire analysis is "If this matchup is imbalanced, the distributions will move this way". Truth is, imbalance is not the only thing that can move those population distributions. The easiest example is when a new popular strategy comes out and nobody has any clue how to deal with it. No patch was released. It's just people playing the game in different ways, and yet your analysis would conclude there's imbalance now when there wasn't before. And then in a few months, people figure out how to scout and respond, and the "imbalance" is gone.

The only thing this analysis can actually provide is a test to see if there -isn't- imbalance, and it can barely do that. If it sees imbalance, it could be purely an effect of the matchup not being understood on one side. If it doesn't see imbalance, that would be just because we haven't found the imbalance yet.

There's so many words, but a complete lack of a point. Even if this sort of analysis showed conclusively that an imbalance existed, then what? You would still need to look at individual games to see what it is about strategy X that is imbalanced. You can't just say, "Hey, PvZ is imbalanced. Let's give Archons +5 more damage vs. bio and call it a day".

It just sounds like a nice idea, but it has absolutely nowhere to go.

Authweight

United States304 Posts

May 21 2011 09:46 GMT

#165

It seems like some people are missing that this post is not about the best way to balance the game, but rather is about the best way to statistically analyze the balance of the game. He is saying that pros are statistically unhelpful. If you're balancing based in reasons besides statistics, obviously pro's opinions become much more important.

forSeohyun

504 Posts

May 21 2011 21:14 GMT

#166

On May 21 2011 18:46 Authweight wrote:
It seems like some people are missing that this post is not about the best way to balance the game, but rather is about the best way to statistically analyze the balance of the game. He is saying that pros are statistically unhelpful. If you're balancing based in reasons besides statistics, obviously pro's opinions become much more important.

1. He did not, in fact, say anything about how one should, practically, analyze the balance statistically. He is not, specifically, suggesting what one should be doing with the following methods: analyzing win ratios over a career, win ratios over some fixed time, ladder points, number of race x in league y. He only presents 4 cases with arbitrary proportions of players.

2. He says that "pros are statistically unhelpful" but that is wrong: you can statistically balance the game using only pros, if and only if (usually abbreviated iff) a) you want to balance the game primarily for the pros and/or b) the balance changes give equal results over all leagues.

If the win ratios from 99 GM players (where 33 of each race are randomly selected from 3 lists of GM players according to race) differ by much, then it might very well be a result of imbalance.

That we need to include all players down to 50% win ratio to perform a statistical analysis is a pretty big fallacy.
Is every opinion poll in the US conducted by interviewing half the population? No, but a larger sample gives a smaller confidence interval.
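The point about sample size and confidence intervals can be sketched with the usual normal approximation for a proportion. The function name and the game counts are assumptions made for the example, and the approximation is only reasonable when the sample is not tiny and the win rate is not near 0 or 1.

```python
from math import sqrt


def win_rate_interval(wins, games, z=1.96):
    """Approximate 95% confidence interval for a win rate.

    Uses the normal (Wald) approximation; z=1.96 corresponds to 95%.
    """
    p = wins / games
    half_width = z * sqrt(p * (1 - p) / games)
    return max(0.0, p - half_width), min(1.0, p + half_width)


# The same observed 55% win rate at two hypothetical sample sizes.
print(win_rate_interval(55, 100))
print(win_rate_interval(5500, 10000))
```

With 100 games the interval around 55% still contains 50%, so the sample cannot distinguish the matchup from an even one. With 10,000 games it no longer does. Neither case needs half the population, only a sample that is random and large enough.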

I am sure that there are better and more experienced statisticians than myself, I would welcome them to take the debate if they feel that it's worth it.

This article does, however, contain errors and shortcomings, and so many people seem to get it objectively wrong that in the meantime I will try to point out the faults, lest people fall into the wrong mindset about statistics and game balance.

flowSthead

1065 Posts

May 22 2011 23:52 GMT

#167

On May 21 2011 15:06 Samhax wrote:

I read a pretty good article about how it has nothing to do with how complicated Go is. The issue between Go and Chess is that Go pieces don't move. The article was guessing and not sure, but it essentially argued that because Chess pieces move, it is easier for the computer to change the positional structure of the board in such a way as to benefit itself. With Go, the position changes only in so far as you add other pieces to change the position.

Part of it though is of course the large size of the board. A 5x5 Go board has in fact been solved. A 7x7 and 9x9 have not though, so of course a 19x19 board hasn't been solved either.

The analogy doesn't work with Starcraft very much though since units are not equal between races. Although they may serve the same function, zerglings, marines and zealots are different enough between each other that they are not strictly comparable the way a black piece and a white piece can be compared in Go or Chess (they are the same piece).

Anyways, I liked the OP's post. The OP may have made certain mistakes in logic, I cannot say since my math background is very poor compared to some other people in the thread. I do have to say that I agree with the notion that listening to pros discuss balance is usually pointless. Someone compared this with peer reviewed science, but I think it should be easily pointed out that peer reviewed science articles are reviewed by multiple scientists, and then ideally other scientists will try to recreate the experiments to see if there are problems in the initial experiment, or unaccounted for variables.

Starcraft pros don't do this. A few pros sitting around a table or in an interview discussing balance hardly makes for a good discussion. Looking at Idra and Day9 on SotG made this super obvious. Idra gave a number of problems he sees with Zerg, and Day9 asked to look at specific games to analyze. For all I know, Idra may have been 100% correct, 100% wrong, or anywhere in between. But seeing as they did not sit down to discuss it for any long period of time looking at a bunch of games, it really doesn't matter. And even if they did, they would need to have more than just Idra and Day9 looking at it since two people is hardly enough to smooth out any biases.

I also want to point the obvious point against Random players (if it hasn't been already, I tried reading all of the posts but I might have missed it). Random players can be biased against races too. Just as an example, TLO's least favorite race to play was Protoss, and I believe his least favorite matchup was PvP (I don't remember). TLO might be slightly less biased than single race players, but there is no reason to assume that Random players are automatically unbiased. And anyways TLO switched back to Terran, but the analogy works with any Random player.

Anyways, even if OP was wrong in many places, I don't understand why people are coming in here and shitting on him. Pointing out his mistakes is fine sure, but why is this post any more useless than any other post? Even if every point OP made was wrong, at least he is trying to think critically about it. Point out the mistakes, but leave personal insults out of it.
