Blizzard Blog: Balance Snapshot - Page 20

polysciguy

United States 488 Posts

September 22 2011 22:16 GMT

#381

On September 23 2011 07:11 Vindicare605 wrote:

You realize that tournament results come from a ridiculously small pool of players, right?

Blizzard's sample size in those stats is made up of literally MILLIONS of games played. You can't argue against that with results pooled from fewer than a couple hundred games played by fewer than 100 total players. There are just way too many variables to take into consideration and not nearly a large enough sample size to draw any real conclusions from.

Perhaps, but they are also the pool of players closest together in skill.

eleaf

526 Posts

September 22 2011 22:18 GMT

#382

On September 23 2011 07:10 polysciguy wrote:

On September 23 2011 07:08 eleaf wrote:

On September 23 2011 07:03 polysciguy wrote:

On September 23 2011 07:01 eleaf wrote:

On September 23 2011 06:45 polysciguy wrote:

On September 23 2011 06:30 Dragar wrote:

On September 23 2011 06:21 SeaSwift wrote:

On September 23 2011 06:15 Mindcrime wrote:

On September 23 2011 06:03 SeaSwift wrote:

On September 23 2011 05:52 ChriseC wrote:
There are some things that I don't understand.

The ladder system tries to keep you at a 50% win/loss ratio, so is it representative at all?

AHHHHHHHHHHHHHH

There really should be a link, required viewing for every new TL member, to the Blizzcon panel in which they explain how they factor that out. Oh, and to the squillions of other posts that ask exactly the same question.

Basically, Blizzard has a fucking huge equation to remove the effect of the matchmaking system from the results (allegedly), among other things. So NO, the 50% win/loss ratio SHOULDN'T have any impact.

Without the formula to examine, there's no fucking reason to believe that.

You can find the formula: look for Blizzcon 2010, then find part 3(?) of the panel section, AFAIK, then pause when they show the formula and copy it down. Nope, it meant literally nothing to me either.

At 2.50.

It looks (to me) like a Bayesian-esque sort of analysis.

The basic process appears iterative; they start with a prior distribution (probably flat) and produce a posterior distribution by feeding it the result of a game, by using the win-percentage probability distributions for each player, for each race. Note that this doesn't need to know any true sense of skill, it just needs to know the outcome of the match and the predicted win percentages for those players - the latter of which are obviously accurate as it uses them for match-making.

The process is repeated over and over, adding the results of all the thousands (millions?) of matches taking place on Battle.net, with the previous posterior distribution becoming the new prior distribution each time the process is run. Eventually one hopes the distribution converges on a distribution from which the win percentages can then be simply extracted, as a simple function of league level.

I will have to think about this some more, but this makes sense and I think should work.

I don't think they are trying to 'scare' anyone away, as some other posters seem to imply. I think they don't want to have to wheel their math-guy out to give a series of long, dry seminars to explain Bayesian analysis. Personally, given how the match-maker seems to work pretty well, I'm inclined to take their word for this process working to the level of accuracy (+/-5%) that they claim.
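For anyone who wants to see the shape of the prior-to-posterior loop described above, here is a minimal Python sketch. It is not Blizzard's formula, which nobody in this thread has reproduced. It assumes a single hypothetical "racial edge" parameter on a log-odds scale, a flat prior over a grid of candidate values, and made-up game records of the form (matchmaker's predicted win chance, actual result).

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    return math.log(p / (1.0 - p))

# Hypothetical grid of candidate "racial edge" values, in log-odds, from -1.0 to 1.0.
EDGE_GRID = [i / 50 - 1 for i in range(101)]

def update(prior, predicted, won):
    """One Bayesian step: belief over EDGE_GRID before a game -> belief after it."""
    posterior = []
    for weight, edge in zip(prior, EDGE_GRID):
        # Assumed likelihood: the matchmaker's prediction shifted by the racial edge.
        p_win = sigmoid(logit(predicted) + edge)
        posterior.append(weight * (p_win if won else 1.0 - p_win))
    total = sum(posterior)
    return [w / total for w in posterior]

def estimate_edge(games):
    belief = [1.0 / len(EDGE_GRID)] * len(EDGE_GRID)  # flat prior
    for predicted, won in games:
        belief = update(belief, predicted, won)  # posterior becomes the next prior
    mean = sum(w * e for w, e in zip(belief, EDGE_GRID))
    return mean, belief

# Made-up records for illustration only: (predicted win chance, did the player win?).
games = [(0.50, True), (0.50, True), (0.45, True),
         (0.55, False), (0.50, True), (0.50, True)]
mean_edge, belief = estimate_edge(games)
print(f"posterior mean edge: {mean_edge:+.3f}")
```

With only six invented games the posterior stays wide; the point is just that each result nudges the belief, and that the update needs only the outcome and the predicted win chance, never a "true" skill value.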

It makes some sense... not the equation, but the simple example that they gave... but it doesn't seem like it would work in practice.

1. They said they don't take into account how the player wins. Perhaps a million Zerg players have cheesed their way to victory; that doesn't make the matchup balanced or say much about the skill of the players.
2. It doesn't take into account that some players are better against certain races or racial playstyles than others, meaning that there isn't a static "skill" that they can look at. Example: IdrA is great at the long game, so if you play a macro game against him and lose, then, assuming the races are balanced, that means he's better. However, if you rush him early and he loses, that doesn't mean that you are the better player, just better against that long-game style.
3. It doesn't take into account pure build-order losses; I don't really see how it could.

You can't take into account all the variables that affect a player and put them into a formula; it's not possible.
It also doesn't take into account that most of the pros don't actually ladder for practice; they ladder to refine a build or get a new build down.

You are partially correct. Cheese won't be taken into consideration here. This win-percentage estimation system is purely result-based.

But I do believe they have another system to address strategy imbalance. After all, they have all the data. They can do whatever they want. And data don't lie. Yet the majority of members here just draw conclusions based on instinct.

I'd argue that they are drawing conclusions based on actual results.

Well, you've got to trust the 12k+ paid PhDs at Blizzard, because they are supposed to be much better at mathematics than the community members here.

People always think they are smarter, but 99.9% of them can't figure out why they can't get an A+ in their statistics class.

12k+ PhDs and 6 years of development should have yielded a game that was almost completely balanced already.

They are supposed to... but skill is the factor that keeps changing and makes the math complicated. If they were allowed to limit APM to 20, they might actually design a perfectly balanced game.

Vindicare605

United States 16117 Posts

September 22 2011 22:19 GMT

#383

On September 23 2011 07:16 polysciguy wrote:

Perhaps, but they are also the pool of players closest together in skill.

I have to disagree strongly with that.

The difference in skill level between MVP and say.... Ensnare is pretty substantial. The difference in skill level between Nestea and say... Inca in ZvP was astronomical.

The numbers that come from the GSL are heavily skewed by individual player skill, and because the sample size is so small, that skew affects the totals much more than it would in stats drawn from a much larger sample.

You'd obviously know this if you'd taken even a brief intro to statistics.

Umpteen

United Kingdom 1570 Posts

September 22 2011 22:20 GMT

#384

There was a very long and well-presented thread some time ago detailing the effects imbalance would have on the ladder. I can't find it, unfortunately, but here's the short version:

Take imaginary races A, B and C, and imagine we also have some magical way of knowing in advance exactly how 'skilled' all the players are so that when we place them in the ladder they are exactly where they should be (ie, where they should end up if the game were perfectly balanced). Then we let them play.

Now suppose A is inherently favoured versus B, all other matchups equal. From the start, As will win > 50% of their games, Bs will win < 50%, and Cs will win 50%. This pushes As up the ladder, and Bs down the ladder. This stabilises when the extra losses As have versus better Cs counteract the diminishing racial benefit As get facing MUCH better Bs, and the same in reverse for Bs.

At the point of stability, how does everyone feel?

If you're an average A, you win AvB more than 50% of the time. You might get the niggling feeling that you're not having to work as hard for those wins as your opponents are. You lose AvC more than 50% of the time, and you might get the feeling that you're being outplayed. If you blame imbalance, you'll think C is overpowered and B underpowered.

If you're an average B, you struggle against A, and might feel like they have an easier ride. You win BvC more than 50% of the time, and might feel like you're outplaying them. To the extent you blame imbalance, you'll think A is overpowered and C underpowered.

If you're an average C, you win more often against A's and lose more often against B's. How you feel about this is hard to say. You might feel like the extra wins and losses are justified, or you might think B is overpowered and A is weak.

In other words, everyone sees rock/paper/scissors, even though only one matchup is imbalanced.

Now suppose A is favoured against both B and C. From our initial 'perfect' situation, As will tend to rise and Bs and Cs fall, stabilising when A's inherent advantage is countered by the higher skill of the Bs and Cs he's facing.

At the point of stability, how does it feel?

Everyone wins 50% of the time. Bs and Cs might feel like they have to work harder than As, so Bs and Cs will whine a lot, while As point to their 50% win/loss ratios and say 'QQ more noobs'.

It's a similar situation if A is underpowered against B and C: everyone wins 50% of the time but As might feel like they have to work a bit harder. They'll whine a lot, and Bs and Cs will tell them to cut the QQ because the win/loss ratios are 50%.

You can also directly superimpose combinations. Say A is overpowered against B and C, but particularly so against B. What will everyone see? Paper/scissors/stone again.
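The scenarios above can be checked with a toy model. This Python sketch is only an illustration under loud assumptions: win chance is a logistic function of (built-in racial edge + skill gap), every non-mirror matchup is equally common, each race moves up or down the ladder as one block, and the 0.4 log-odds edge is an arbitrary, hypothetical number. `settle` lets each race drift until its overall win rate is back at 50%, then reports the per-matchup win rates.

```python
import math

RACES = ("A", "B", "C")

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def settle(bias, steps=5000, rate=0.05):
    """bias[(X, Y)] is X's built-in edge over Y in log-odds (hypothetical units).
    Returns per-matchup win rates once every race's overall win rate is 50% again."""
    # drift[X]: how far race X has climbed above where pure skill would place it.
    drift = {r: 0.0 for r in RACES}

    def p_win(x, y):
        edge = bias.get((x, y), 0.0) - bias.get((y, x), 0.0)
        # Having climbed means facing more skilled opponents, which costs win chance.
        return sigmoid(edge + drift[y] - drift[x])

    for _ in range(steps):
        # Overall non-mirror win rate of each race at the current ladder positions.
        overall = {x: sum(p_win(x, y) for y in RACES if y != x) / 2 for x in RACES}
        for r in RACES:
            drift[r] += rate * (overall[r] - 0.5)  # winning more than 50% pushes a race up

    return {f"{x}v{y}": round(p_win(x, y), 3)
            for x in RACES for y in RACES if x < y}

one_matchup = settle({("A", "B"): 0.4})                    # A favoured vs B only
whole_race = settle({("A", "B"): 0.4, ("A", "C"): 0.4})    # A favoured vs both
print(one_matchup)
print(whole_race)
```

In this toy model the first case settles with A ahead in AvB, A behind in AvC and B ahead in BvC (the rock/paper/scissors pattern), while the second case settles at 50% in every matchup, so the win rates alone cannot show that A is overpowered.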

Dealing with the extremes of the ladder

You might expect, if A were underpowered, that bronze leagues would be overstuffed with As. But there are good reasons why this might not manifest. Firstly, any imbalance sufficiently pronounced as to be detectable in Bronze could induce a relatively higher drop-out rate of A players, reducing the numbers in those leagues. Secondly, not all imbalances (or balances) manifest at every level of skill (eg marine splitting), softening the effect towards the bottom of the ladder.

The same reasoning applies if A is overpowered: we should not expect to see disproportionately fewer As in Bronze (more Bs and Cs might quit, and not all imbalances can manifest).

However, we should see the effects at the higher end of play. Yes, the reduction in sample size does make the 'Flash Effect' a problem for analysis at the very top level, but there ought to be a sweet spot around the GM/Master level where the numbers involved are still high enough to be significant, but where any anomalous 'buoyancy' can still be detected.

The Upshot

The existence of single-matchup imbalances can be detected statistically (via paper/scissors/stone win/loss ratios) throughout the leagues, although pinpointing which matchup is imbalanced can be tricky (the order of paper/scissors/stone narrows it down to one of three). The existence of an OP or UP race, however - I cannot see how that can be detected at low to mid levels of play, no matter what maths you apply, because it looks exactly the same as if the races were balanced: close to 50% win ratios all round.

It might sound daft, but very likely the only useful statistic for gauging balance below pro/gm/high masters is the amount of QQ coming from each race, because that exposes the sensations engendered by imbalance that are hidden by the matchmaking system.

Looking at the figures provided, it seems safest to assume that the 'truth' lies somewhere between the NA and Korean numbers - in other words, Zerg is a little too predictable and easy to blindside, Terran a little too safe, resilient and flexible (relatively speaking). PvZ domination could well be an artifact of transient PvT weakness pushing good Protosses way down, so that they do better against Zergs, but it's hard to know when Blizzard have already eliminated some factors.

In other words, no huge surprises.

Koshi

Belgium 38799 Posts

September 22 2011 22:21 GMT

#385

On September 23 2011 05:31 kubiks wrote:
I know I shouldn't post this, but it seemed quite fun to calculate, so I tried.
Suppose "the game is balanced" means that each Code S spot is equally likely to go to any of the 3 races (so basically we roll a three-sided die for each of the 32 Code S spots and look at the result).
Now the probability that Terran gets k (20 here) spots in Code S, which has n players (32 here), is:
(n! 2^(n-k)) / (k! (n-k)! 3^n).
It's quite ugly, but I'm not sure it's possible to make a better formula (well, at least it works for n=1 and n=2).
This formula gives a probability of 0.05% that there are 20 Terrans out of 32 players.
In comparison, the probability of 12 Terrans is about 13% and of 11 Terrans about 15%.

To be fair, 0.05% is for exactly 20 Terrans. Probability for 20 or more Terrans should be (guessing) around 0.4%?

Doesn't make it better. It feels too low. But it has been way too long since I worked with factorial equations (I don't know if I translated that correctly into English).
I need sleep. Long day, and tomorrow won't be better; I'll recheck it when I'm at work.
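The "20 or more" figure does not need to be guessed. A short Python sketch, assuming the same model as the quoted formula (each of the 32 spots is an independent, fair 1-in-3 draw); the function names are just for illustration:

```python
from math import comb

def p_exactly(k, n=32):
    """Chance that exactly k of n spots go to one race if each spot is a fair 1-in-3 draw."""
    return comb(n, k) * 2 ** (n - k) / 3 ** n

def p_at_least(k, n=32):
    """Chance that k or more of the n spots go to that race."""
    return sum(p_exactly(j, n) for j in range(k, n + 1))

print(f"exactly 20 Terrans: {p_exactly(20):.4%}")
print(f"20 or more Terrans: {p_at_least(20):.4%}")
```

Under these assumptions, exactly 20 comes out at roughly 0.05% and 20 or more at roughly 0.07%, so the tail adds very little.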

Vindicare605

United States 16117 Posts

September 22 2011 22:23 GMT

#386

On September 23 2011 07:20 Umpteen wrote:
There was a very long and well-presented thread some time ago detailing the effects imbalance would have on the ladder. I can't find it, unfortunately, but here's the short version:

Take imaginary races A, B and C, and imagine we also have some magical way of knowing in advance exactly how 'skilled' all the players are so that when we place them in the ladder they are exactly where they should be (ie, where they should end up if the game were perfectly balanced). Then we let them play.

Now suppose A is inherently favoured versus B, all other matchups equal. From the start, As will win > 50% of their games, Bs will win < 50%, and Cs will win 50%. This pushes As up the ladder, and Bs down the ladder. This stabilises when the extra losses As have versus better Cs counteract the diminishing racial benefit As get facing MUCH better Bs, and the same in reverse for Bs.

At the point of stability, how does everyone feel?

If you're an average A, you win AvB more than 50% of the time. You might get the niggling feeling that you're not having to work as hard for those wins as your opponents are. You lose AvC more than 50% of the time, and you might get the feeling that you're being outplayed. If you blame imbalance, you'll think C is overpowered and B underpowered.

If you're an average B, you struggle against A, and might feel like they have an easier ride. You win BvC more than 50% of the time, and might feel like you're outplaying them. To the extent you blame imbalance, you'll think A is overpowered and C underpowered.

If you're an average C, you win more often against A's and lose more often against B's. How you feel about this is hard to say. You might feel like the extra wins and losses are justified, or you might think B is overpowered and A is weak.

In other words, everyone sees rock/paper/scissors, even though only one matchup is imbalanced.

Now suppose A is favoured against both B and C. From our initial 'perfect' situation, As will tend to rise and Bs and Cs fall, stabilising when A's inherent advantage is countered by the higher skill of the Bs and Cs he's facing.

At the point of stability, how does it feel?

Everyone wins 50% of the time. Bs and Cs might feel like they have to work harder than As, so Bs and Cs will whine a lot, while As point to their 50% win/loss ratios and say 'QQ more noobs'.

It's a similar situation if A is underpowered against B and C: everyone wins 50% of the time but As might feel like they have to work a bit harder. They'll whine a lot, and Bs and Cs will tell them to cut the QQ because the win/loss ratios are 50%.

You can also directly superimpose combinations. Say A is overpowered against B and C, but particularly so against B. What will everyone see? Paper/scissors/stone again.

Dealing with the extremes of the ladder

You might expect, if A were underpowered, that bronze leagues would be overstuffed with As. But there are good reasons why this might not manifest. Firstly, any imbalance sufficiently pronounced as to be detectable in Bronze could induce a relatively higher drop-out rate of A players, reducing the numbers in those leagues. Secondly, not all imbalances (or balances) manifest at every level of skill (eg marine splitting), softening the effect towards the bottom of the ladder.

The same reasoning applies if A is overpowered: we should not expect to see disproportionately fewer As in Bronze (more Bs and Cs might quit, and not all imbalances can manifest).

However, we should see the effects at the higher end of play. Yes, the reduction in sample size does make the 'Flash Effect' a problem for analysis at the very top level, but there ought to be a sweet spot around the GM/Master level where the numbers involved are still high enough to be significant, but where any anomalous 'buoyancy' can still be detected.

The Upshot

The existence of single-matchup imbalances can be detected statistically (via paper/scissors/stone win/loss ratios) throughout the leagues, although pinpointing which matchup is imbalanced can be tricky (the order of paper/scissors/stone narrows it down to one of three). The existence of an OP or UP race, however - I cannot see how that can be detected at low to mid levels of play, no matter what maths you apply, because it looks exactly the same as if the races were balanced: close to 50% win ratios all round.

It might sound daft, but very likely the only useful statistic for gauging balance below pro/gm/high masters is the amount of QQ coming from each race, because that exposes the sensations engendered by imbalance that are hidden by the matchmaking system.

Looking at the figures provided, it seems safest to assume that the 'truth' lies somewhere between the NA and Korean numbers - in other words, Zerg is a little too predictable and easy to blindside, Terran a little too safe, resilient and flexible (relatively speaking). PvZ domination could well be an artifact of transient PvT weakness pushing good Protosses way down, so that they do better against Zergs, but it's hard to know when Blizzard have already eliminated some factors.

In other words, no huge surprises.

Even if the truth were somewhere between the Korean and NA results, as you're suggesting, that still leaves it MOSTLY within the 5% margin that Blizzard defines as acceptably balanced. No matter how you slice it, according to these stats the game is more balanced than the forum QQ would have you think.

kubiks

France 1328 Posts

September 22 2011 22:37 GMT

#387

On September 23 2011 07:21 Koshi wrote:

To be fair, 0.05% is for exactly 20 Terrans. Probability for 20 or more Terrans should be (guessing) around 0.4%?

Well, I don't need any more calculation to tell you that the probability of 20 or more Terrans is not much more than 0.05%.
In fact, for each Terran you add you divide the chances by 2 (from the 2^(n-k)), and they decrease a little more still (because the binomial coefficient decreases once you are past the middle). The max would be 0.1% (and it's below that).
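The halving argument can be written out as a bound. A short worked version, using the same formula with P(k) for the probability of exactly k Terrans out of n = 32 (the notation P(k) is introduced here only for this sketch):

```latex
P(k) = \binom{n}{k}\,\frac{2^{\,n-k}}{3^{\,n}},
\qquad
\frac{P(k+1)}{P(k)} = \frac{n-k}{2\,(k+1)} .

% For n = 32 and k = 20 the ratio is 12/42 = 2/7, and it only shrinks as k grows, so
\sum_{j \ge 20} P(j) \;\le\; P(20) \sum_{i \ge 0} \left(\tfrac{2}{7}\right)^{i}
  \;=\; \tfrac{7}{5}\,P(20) \;\approx\; 1.4 \times 0.05\% \;=\; 0.07\% .
```

So each extra Terran cuts the probability by more than half (by a factor of 2/7 or smaller), and the whole tail stays below 0.1%.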

Paladia

802 Posts

September 22 2011 22:45 GMT

#388

On September 23 2011 07:21 Koshi wrote:

To be fair, 0.05% is for exactly 20 Terrans. Probability for 20 or more Terrans should be (guessing) around 0.4%?

Counting the probability of 20 or more Terrans in Code S instead doesn't really change anything. It just makes it 0.069% instead of 0.05%. As such, your guess is completely off.

Or in reverse, there is a 99.93% chance that something other than player skill is causing code S to have so many Terrans.

arbitrageur

Australia 1202 Posts

September 22 2011 22:46 GMT

#389

On September 23 2011 01:56 Keula wrote:
So according to this, Zerg struggles the most in EU and NA. Tournament results kinda showed that, but I didn't think there was such a difference between Korea and the rest.

I suspect it's because Zerg is a race that requires a huge amount of skill.

radiatoren

Denmark 1907 Posts

September 22 2011 22:48 GMT

#390

Only one matchup seems to stand out: T has an advantage vs Z in all regions. If it were acceptable to average the values (it most likely is not), T would win 56% of the time.

The rest of the numbers are rather inconclusive.

T might also have an advantage against P, but that is speculative.

The sample is not as diverse as most would think. It was only data from one day (September 13). Having a week's data would be much more convincing, as more of the good players would be in the dataset.

Also remember that a huge patch just hit the servers. Therefore don't get too hung up on the numbers. They are historical and can't be used for the game as it stands now.

Kaxon

United States 117 Posts

September 22 2011 22:52 GMT

#391

Interesting that Zerg is really struggling in NA/EU at the master+ level. Also highlights the point that balance at the GSL level isn't necessarily the same as balance at other levels.

JustPassingBy

10776 Posts

September 22 2011 22:52 GMT

#392

I agree with some of the previous posters. The ladder system itself will make those fine statistics look "balanced" over time. Say one race is favoured over another in a certain matchup; then players of the disfavoured race would drop down the ladder on average and thus would only face less skilled players. What is more interesting, in my opinion, is the distribution of the races throughout the ladder.

Amui

Canada 10567 Posts

September 22 2011 23:00 GMT

#393

On September 23 2011 07:52 Kaxon wrote:
Interesting that Zerg is really struggling in NA/EU at the master+ level. Also highlights the point that balance at the GSL level isn't necessarily the same as balance at other levels.

A year after release, top Zergs, without pressure, hit every single larva inject, and every single larva from those injects gets turned into a drone unless they suspect pressure. They hit as many injects as possible when faced with an attack, and in general have more stuff. I don't think Protoss or Terran has that same combination of difficulty and potential. With minimal pressure I can have just as much stuff as a pro just by never forgetting pylons and constantly building stuff. Looking at a lot of the Zergs around my level, by 10 minutes they've usually missed 1-2 full injects. While it is incredibly difficult to hit every single inject, I think that is the reason why the top Zergs are fully capable of making the matchup look imbalanced against Protoss.

Wuster

1974 Posts

September 22 2011 23:05 GMT

#394

On September 23 2011 07:45 Paladia wrote:

Show nested quote +

The danger of using straight probability is that Code S was not seeded all at once, or even over a short period of time, so existing trends, going as far back as the Open Season when maps were heavily Terran-favored and tanks did a flat 50 damage, take time to filter out.

I went into more detail in an earlier post, but I think the warning sign was when Code A was basically a TvT fest; now we're seeing the results. And even if Blizzard patched out all imbalance today, it would take at least a few seasons for Code S to have a more balanced distribution.

Sukari

Australia 183 Posts

September 22 2011 23:06 GMT

#395

If anything at least this shows that Blizzard is always monitoring the state of the game.

redemption289

United States 9 Posts

September 22 2011 23:12 GMT

#396

Shouldn't MMR guarantee a near 50% win rate except at the very top and bottom levels of play (unless there is some sort of mass exploit, but even then, multiple games should smooth this)? If your race, X, is inherently OP, then this will be compensated for by you playing more skilled Y and Z players. The true measure of balance should be the expected MMR of each race and the concentration of its distribution. This isn't an advocacy for Blizzard not balancing, merely the observation that I don't think win % tells all that much.

Holykitty

Netherlands 246 Posts

September 22 2011 23:15 GMT

#397

Nothing can be taken from these numbers. Because of Blizzard's forced 50% system, only the people at the very, very top, i.e. GM, can break out of 50% win rates and give any kind of indication of balance, and even then everyone knows Koreans are the best.

Korean GM vs GM win rates are all that matter for people concerned about balance, and even then I'd argue that ladder stats are meaningless.

Snorkle

United States 1648 Posts

September 22 2011 23:17 GMT

#398

The fact that these stats are from the ladder means that they are on ladder maps. Ladder maps are not used in tournaments because they are terrible. Sure protoss might have a high winrate vs zerg on maps with no real third for zerg to take but seriously... blizz we don't give a fuck about the ladder balance. Including masters in the second group of stats removes its small bit of validity. Rather, explain to us with your pretty numbers and algorithms how this has come to pass http://wiki.teamliquid.net/starcraft2/2011_Global_StarCraft_II_League_October/Code_S

Deleted User 183001

2939 Posts

September 22 2011 23:22 GMT

#399

On September 23 2011 08:17 Snorkle wrote:
The fact that these stats are from the ladder means that they are on ladder maps. Ladder maps are not used in tournaments because they are terrible. Sure protoss might have a high winrate vs zerg on maps with no real third for zerg to take but seriously... blizz we don't give a fuck about the ladder balance. Including masters in the second group of stats removes its small bit of validity. Rather, explain to us with your pretty numbers and algorithms how this has come to pass http://wiki.teamliquid.net/starcraft2/2011_Global_StarCraft_II_League_October/Code_S

This reminds me of some spoof of those diabetes treatment commercials with Wilford Brimley where he says he eats people with "diabeetus", realizes what he said, stutters and looks uncomfortable, not knowing what to say, and then says "Have a nice day!"
I think Blizzard would be just the same way in answering that question lol. Why stop at October? May as well show them July and August as well lol.

Umpteen

United Kingdom 1570 Posts

September 22 2011 23:28 GMT

#400

On September 23 2011 07:23 Vindicare605 wrote:

Show nested quote +

I don't think it works like that. QQ is a statistic in itself, something to include (very carefully) alongside the NA and Korean ladder results. You have to take outbursts like the one sparked by 1-1-1 in TvP with a pinch of salt, because they come and go with evolving strategies, and the state of Code S will inevitably stiffen the spine of those with an anti-Terran agenda. But summed and averaged, QQ can I think give valuable insight above and beyond win/loss figures (as I outlined above).

Let's face it: how often and for how long have Terrans been genuinely stuck for an answer, in any matchup? The race is so well put-together that the obvious answer has pretty much always been the right one, and racing to almost any combination of tech has yielded powerful builds. It's not been like Zerg, scratching their heads because the obvious Roach counter to Hellions fucks them over almost as badly as letting the hellions in, or Protoss trying to find an answer to 1-1-1 that isn't an auto-loss to a quick barracks switch if scouted. This is a separate issue from how difficult the races are to play, execution-wise; I just feel that P and Z have always been playing catch-up, which is not imbalance per se.
