Blizzard's "skill-adjusted-win-percentages" - Page 6

Lysenko
Profile Blog Joined April 2010
Iceland2128 Posts
Last Edited: 2011-08-05 12:16:48
August 05 2011 12:09 GMT
#101
On August 05 2011 20:29 bmn wrote:
Now you're just muddying the waters by saying that we can't possibly know what they do, so we have no reason to criticize anything they say, because maybe they're doing something brilliant and we'll never know.
That's a cop-out :-)


I'm not saying that it's invalid to criticize anything they say or any change they make.

I am saying that it's not possible, on the outside looking in, to criticize their statistical analysis on technical grounds, because we can't possibly know exactly what they're doing in terms of data collection or analysis. Importantly, though, that's a very small part of how they balance the game (as that video points out.)

In fact, they use statistics that appear out of whack as hints to suggest where to follow up in other areas, looking at player feedback, playing the game themselves as David Kim does, using their own testing tools, etc. Then, they make choices about how to change game rules, if they should think it's warranted, based on the totality of all of that combined with their own personal judgements as game designers and what they think feels right when played.

This isn't chess, with two identical sides and a turn-based system where you can instantly tell who has the sole advantage the game offers. It's a game with three asymmetric race choices and complex mechanics where any change, large or small, may have unintended, unanticipated consequences down the line. Every change to the game is completely subjective -- it has to be. You can't simulate what a change's impact is going to be, because the impact may rely on tactics or strategies that have not yet been thought up. You have to try it and hope, which is one reason that they put balance changes on the PTR.

This is why they look at a 55% win ratio in a matchup and say they're reasonably happy with that. It's not that that isn't statistically significant, it's that you can know nearly absolutely that the balance isn't perfect in a matchup and still have no way to tell in advance that a change you think might help won't make things worse.
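
To make the "statistically significant" part concrete, here is a rough sketch of how one could check a 55% win rate against a 50% baseline. The game counts are made up for illustration, and the normal approximation assumes games are independent, which ladder games are not exactly. This is not Blizzard's method, just a textbook test.

```python
import math

def win_rate_z_score(wins: int, games: int, null_rate: float = 0.5) -> float:
    """z-score of an observed win rate against a null rate (normal approximation)."""
    observed = wins / games
    standard_error = math.sqrt(null_rate * (1.0 - null_rate) / games)
    return (observed - null_rate) / standard_error

def two_sided_p_value(z: float) -> float:
    """Two-sided p-value for a standard normal z-score."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical samples: the same 55% win rate at two very different sample sizes.
for wins, games in ((55, 100), (5500, 10000)):
    z = win_rate_z_score(wins, games)
    print(f"{wins}/{games}: z = {z:.1f}, p = {two_sided_p_value(z):.3g}")
```

With only 100 games a 55% rate is well within noise; with 10,000 games it is not. Knowing the skew is real still says nothing about which change would fix it.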

And that, by the way, goes double when they see a matchup leaning one way in one region and another way in another region.

I don't see anyone suggesting that the match-making system is designed to balance the races.


There are several comments that allude to this, but here's the one that particularly set me off about it, because it's the best-expressed version of an idea that's simply not true, that striving for a 50% win/loss for players somehow obscures racial differences in the aggregate:

A system that ensures a 50% win rating not only in general, but race to race will hide imbalance by virtue of actively seeking that 50% regardless of skill level. This means two players of identical skill with two different races will both be at 50%, but will have very different MMRs if their respective races are imbalanced against one another.

http://en.wikipedia.org/wiki/Lysenkoism
bmn
Profile Joined August 2010
886 Posts
August 05 2011 12:23 GMT
#102
On August 05 2011 21:09 Lysenko wrote:
On August 05 2011 20:29 bmn wrote:
Now you're just muddying the waters by saying that we can't possibly know what they do, so we have no reason to criticize anything they say, because maybe they're doing something brilliant and we'll never know.
That's a cop-out :-)


I'm not saying that it's invalid to criticize anything they say or any change they make.

I am saying that it's not possible to criticize their statistical analysis on technical grounds, because we can't possibly know exactly what they're doing in terms of data collection or analysis. Importantly, though, that's a very small part of how they balance the game (as that video points out.)


I think your statement is far too strong. The amount of available data and man-power Blizzard has here is very limited, and the solution space is nowhere near as large in practice as it is in theory, simply because there are only so many proven systems that are sufficiently robust.

But fair enough: If you contend that we can't even criticize the statistical analysis, there is no point in discussing any analysis based on their statistics. You cannot validly argue any further saying "they use statistics that appear out of whack as hints" if you yourself state that we cannot assume anything about the validity of their statistics.

I don't really see the point of the rest of your reply if we start from the assumption that we cannot know what their statistics do and, based on that, cannot argue technically about what they might be measuring or not.


This is why they look at a 55% win ratio in a matchup and say they're reasonably happy with that. It's not that that isn't statistically significant, it's that you can know nearly absolutely that the balance isn't perfect in a matchup and still have no way to tell in advance that a change you think might help won't make things worse.


Since we already started out by saying that we cannot know what their statistics actually mean, trying to defend their decision-making is entirely baseless here too. Yeah, they probably have an idea what they're doing -- but that's an appeal to their authority, there's nothing you or I could actually discuss meaningfully about this.

I know this sounds harsh, but starting with the assumption that we're talking about an entirely unknown black-box system (without fully knowing either input or output of it, let alone the inner workings) leaves the only conclusion that there is no merit in technically discussing how good or bad that system is.
I would not have said that we can't know anything about the match-making system, but that's a personal choice based on how confident you are about knowing what general approaches they might use to create such a ladder system.


I don't see anyone suggesting that the match-making system is designed to balance the races.


There are several comments that allude to this, but here's the one that particularly set me off about it, because it's the best-expressed version of an idea that's simply not true, that striving for a 50% win/loss for players somehow obscures racial differences in the aggregate:

A system that ensures a 50% win rating not only in general, but race to race will hide imbalance by virtue of actively seeking that 50% regardless of skill level. This means two players of identical skill with two different races will both be at 50%, but will have very different MMRs if their respective races are imbalanced against one another.



I didn't interpret that comment the way you did. I took it as saying that the match-making system will hide imbalance at the player level by matching you up with "weaker" players who still have an even chance of winning, and that observation is entirely correct (if trivial).

If you take it as saying that it will hide imbalance in that Blizzard cannot detect this skew, then it is wrong.

But I don't think this implies that exposing the balance of races was the goal of the match-making system. As a player, the match-making system is all we see, so that's the only way we can judge the racial balance from ladder play.
IronDoc
Profile Joined August 2011
United Kingdom27 Posts
August 05 2011 12:24 GMT
#103
On August 05 2011 11:12 carwashguy wrote:
It may be instructive to consider how, in chess, white pieces tend to have an advantage over black pieces. Among weaker players, there's not much advantage. At higher levels, it starts to become a factor. Among top rated computers, white scores about 55%. However, in chess, players take turns playing white and black. It seems to me that the best way to do it for Starcraft would be to look at top-rated random race players. If there is a tendency for them to win with certain races, then I believe this would reveal something meaningful. To put it briefly, the best random race players should be immune to MMR's effect on their race's winnability. An obvious problem is that the random matchups are not totally analogous to the standard matchups since the non-random player doesn't know his opponent's race beforehand.

This seems like a good point that's been glossed over a bit. Random players' win percentages should be free of any effect of race on a player's MMR. I'd be surprised if this wasn't used as a pretty significant indicator of balance for Blizzard. Actually, working through it in my head, it may be the case that you would need to only take RvR matchups, since only then will the opponent's MMR be free of influence from race.

I can see 2 main problems with this.
Firstly, it still doesn't address the issue of any systematic bias for or against playing random. The sc2ranks data shows that it's much less common in Master league than any of the lower 5.
Secondly, Terran and Protoss arguably share more similarities than either race does with Zerg. This might mean that skill is more transferable between the 2 and thus a random player's win rates for each race are not independent.
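
As a rough sketch of what the random-player check could look like, one could tally RvR results by the race each player rolled and put a confidence interval around each win rate. The tallies below are invented for illustration (no real ladder data), and the Wilson interval is just one standard choice; neither of the two problems above is addressed by it.

```python
import math

def wilson_interval(wins: int, games: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a win rate (z = 1.96 gives roughly 95% coverage)."""
    if games <= 0:
        raise ValueError("games must be positive")
    rate = wins / games
    denominator = 1.0 + z * z / games
    centre = (rate + z * z / (2.0 * games)) / denominator
    half_width = z * math.sqrt(rate * (1.0 - rate) / games + z * z / (4.0 * games * games)) / denominator
    return centre - half_width, centre + half_width

# Hypothetical RvR tallies: (wins, games) for random players by the race they rolled.
rolled_race_results = {"Terran": (520, 1000), "Protoss": (505, 1000), "Zerg": (475, 1000)}
for race, (wins, games) in rolled_race_results.items():
    low, high = wilson_interval(wins, games)
    print(f"{race}: {wins / games:.3f} (95% interval {low:.3f} to {high:.3f})")
```

If an interval comfortably includes 50%, the sample can't distinguish that race from balanced; the small random population in Master league makes this a real limitation.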
Sbrubbles
Profile Joined October 2010
Brazil5776 Posts
August 05 2011 12:27 GMT
#104
On August 03 2011 14:09 whacks wrote:
Thanks for the responses all.

A lot of people have responded with some variant of "If Blizzard sees all the Zerg players have significantly lower MMR, they'll know there's something wrong."
If Blizzard is doing this, then what they're basically doing is comparing the average zerg player's MMR with the average Terran player's MMR. This approach can break for so many reasons, which I'm not going to get into now.


It can break for so many reasons, but comparing average MMR and league placements is the only way the adjusted ELO system allows to account for balance.

For example: if Protoss is 20% of the player base and the masters and grandmasters leagues have less than 20% of Protosses while diamond and lower have more than 20%, that's a sign (just a sign) of race (or map) imbalance.

There may be other explanations to this that don't involve race/map imbalance, but still, comparing average MMR is the best we got.
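
A minimal sketch of that comparison, using invented league counts (not sc2ranks data): divide a race's share within a league by its share of the whole player base. A ratio under 1 means the race is under-represented in that league.

```python
def representation_ratio(race_players_in_league: int, league_players: int, overall_share: float) -> float:
    """Race share inside one league divided by its share of the whole player base."""
    if league_players <= 0 or not 0.0 < overall_share <= 1.0:
        raise ValueError("league_players must be positive and overall_share in (0, 1]")
    return (race_players_in_league / league_players) / overall_share

# Hypothetical counts: (Protoss players, total players) per league, with Protoss at 20% overall.
leagues = {"Master": (150, 1000), "Diamond": (220, 1000)}
for name, (protoss, total) in leagues.items():
    ratio = representation_ratio(protoss, total, overall_share=0.20)
    print(f"{name}: Protoss representation ratio {ratio:.2f}")
```

As said above, a skewed ratio is only a sign; it cannot by itself separate imbalance from better players preferring other races.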
Bora Pain minha porra!
ChickaChuckWally
Profile Joined July 2011
Australia85 Posts
August 05 2011 12:33 GMT
#105
lol at the guy talking about the muta at the end.
:^) Puppy is love, Puppy is life
Fungal Growth
Profile Joined November 2010
United States434 Posts
August 05 2011 13:18 GMT
#106
Nice posts by bmn.

ChickaChuckWally...The kid obviously was trying to ask: if the Thor was supposed to be an answer to the mutalisk, and it wasn't, then doesn't that make the mutalisk a balance problem? (A good implied question, as it does force Terran to be one-dimensional in going mass marine, and Blizzard had no idea the magic box would be so effective.) What's even funnier is it appears David Kim wasn't even paying attention because then he talked about marauders in his answer.

In fact in that interview Blizzard had a number of interesting things to say... They strongly defended the marauder as a needed answer to zerg. They also felt the marauder wasn't even that great of a unit and that the benefits marauders got from stimming frequently didn't counter the damage done. Oh boy...
Lysenko
Profile Blog Joined April 2010
Iceland2128 Posts
Last Edited: 2011-08-05 18:09:46
August 05 2011 17:58 GMT
#107
On August 05 2011 21:23 bmn wrote:
I don't really see the point of the rest of your reply if we start from the assumption that we cannot know what their statistics do and, based on that, cannot argue technically about what they might be measuring or not.


You're going way too far with this. We can speculate all we want about what they might be measuring based on what they've said, and that might be an interesting discussion, but it would be a mistake to turn around and say they're idiots because they're doing some unjustifiable thing or another that we've imagined they might be doing. Also, the question of how a statistical analysis of any kind fits into their larger decision-making is perfectly reasonable to discuss even if we don't know the details.

The only thing we can't do is break down the exact mathematics of their statistical analysis and say it's valid or invalid for this or that reason, and that's what this thread appears to be trying to do.

Blizzard's operation may be small, but they absolutely have one or a couple experienced statisticians on their Battle.net team who are fully capable of performing some kind of reasonable analysis. How useful those results are in a larger sense may be difficult to say, but that's probably not the statisticians' fault. Furthermore, last I checked they didn't need our approval before balancing their game however they saw fit, so I don't see why our opinion on their statistical approach matters beyond entertaining ourselves with speculation, or alternatively to entertain ourselves by complaining just to complain.

My point is this: We can criticize the specific changes to the game based on the difference between the impact you think the change will have vs. the impact they think it will have. We can say we'd rather they have a greater focus on whatever race's issues / whatever league's issues we happen to be in. We can even slice and dice their offhand comments about this or that unit or whatever and call them ill-considered.

What we can't do is assess the accuracy of their specific statistical analysis or the mathematical underpinnings of their matching system beyond what limited information they choose to share with us. That limited info is not nearly enough to say they're doing it wrongly.
http://en.wikipedia.org/wiki/Lysenkoism
Lysenko
Profile Blog Joined April 2010
Iceland2128 Posts
August 05 2011 18:04 GMT
#108

Regarding using random players as a control group, that's an interesting idea. I can think of a third problem, though:

On August 05 2011 21:24 IronDoc wrote:
I can see 2 main problems with this.
Firstly, it still doesn't address the issue of any systematic bias for or against playing random. The sc2ranks data shows that it's much less common in Master league than any of the lower 5.
Secondly, Terran and Protoss arguably share more similarities than either race does with Zerg. This might mean that skill is more transferable between the 2 and thus a random player's win rates for each race are not independent.


A third issue is that it's simply not possible for a random player to practice any one race with the depth that players who prefer one race can devote to theirs. This means that they're likely to have deficient and maybe early-game-centric play with all three races, and that may eliminate their value as a control.
http://en.wikipedia.org/wiki/Lysenkoism
dreamsmasher
Profile Joined November 2010
816 Posts
Last Edited: 2011-08-05 18:36:42
August 05 2011 18:33 GMT
#109
On August 05 2011 15:01 kckkryptonite wrote:
On August 05 2011 14:15 Alyoshka wrote:
On August 05 2011 13:51 kckkryptonite wrote:
On August 04 2011 11:22 seaofsaturn wrote:
The whole purpose of differential equations is to measure things that are constantly changing...

Here is the differential equation from the video:

[image: the equation shown in the video, not reproduced here]

If you can't make sense of that (I can't!) then I don't know why you're trying to criticize them. The percentages are just simplified representations to present the data to people who aren't math majors, you can't really use them to support random theories.



It's funny they put up some supposedly insane math equation (it's not, you need one year of calculus to understand it), but they don't tell you what anything represents; theta, beta, gamma? Equations are meaningless if that information is omitted. Lol at Blizzard trying to appear transparent, patronization at its best, imo.

"We'll spare you the details, but these are the percentages", sketchy.


you need more than one year of calculus to understand this type of equation. Cal I/II don't even sniff DEs. All the posts in this thread are total crap except those which point out that <1% of the population comprehends the math involved. The details aren't important, because it works. I am amazed at the number of numbskulls who piss and moan about this or that policy from blizzard without doing anything to contribute to the solution side of things.

Math is extremely hard, the kind of programming talent Blizzard can hire, while not Google/MSFT/AAPL level, is incredibly high. Just be glad that the smartest guys in the gaming industry are working on the IP you love.

citation: http://www.animationarena.com/video-game-salary.html (on Blizz leading the way for pay, which in turn allows them to leverage top talent)

actual data of the survey: https://spreadsheets.google.com/ccc?key=0Aou3k7ExaTQjdHZ0S2dKMjhfY0lmN2tmTDRESEhjbHc&hl=en&authkey=CNDxyJwF#gid=0

wiki post on DiffE: http://en.wikipedia.org/wiki/Differential_equation


No, you don't. The level of ignorance in your post is astonishing; after the first year of Calc, you are able to solve basic differential equations (in my curriculum at least). Pouring money into something must mean it's the best, right (US HEALTHCARE/EDUCATION)? To top it off you cite wikipedia.

On August 05 2011 14:15 Alyoshka wrote:
All the posts in this thread are total crap except those which point out that <1% of the population comprehends the math involved.


Really? WTF. There are derivatives and integrals, fractions and exponents. Seriously? 1%? Where did you get this number? How are you coming to your various conclusions?

On August 05 2011 14:34 Disquiet wrote:
I agree he is wrong. even without having the variables defined you can still tell what the thing does if you can understand it. Everyone can recognize the formula for a parabola without knowing how x or y is applied.

Actually, the variables are assumed to be defined as the set of all real numbers.

W/e guys, I'm not gonna get into a math debate with people who know how to use google and wikipedia.


i've taken more than my fair share of math in college and I have no idea what that formula exactly means other than the fact that there are some normal distributions involved in the calculation, and it is used to solve some sort of conditional probability problem.

there is no way you can understand what that formula means with only one year of calculus.

i'm pretty sure differential equations isn't even involved in that formula
Wren
Profile Blog Joined February 2011
United States745 Posts
August 06 2011 03:14 GMT
#110
On August 05 2011 11:55 Lysenko wrote:
On August 05 2011 07:05 Wren wrote:
~Ladder stats are essentially meaningless. Blizzard's correction may undo the effect of the matchmaking system gradually, but cannot fix the fact that the matchmaking system generated the data in the first place. You cannot analyze the flaw out of a flawed data set.


The problem is that everyone who's been arguing that the data set is "flawed" somehow have been saying so without any reasoning or explanation behind it, other than to completely misunderstand or misrepresent the impact of the matchmaking system on the data set.

Nobody in this thread knows what their data set is or exactly how they're analyzing it, so all the criticism of it is fantasy based on imagined details to fill in the blanks.

They've said that their data set is every ladder game, repeatedly. My understanding of the match-making and MMR system is that it's essentially the same as every other computerized ranking system: who you played, if you won, how much the other guy wins. All blizzard tracks is who beat who on which map.

This data is flawed because it cannot tell (just an example) if Terran is the best because the best people play Terran or if Terran is the best because the balance is skewed. All it can tell you is that Terran wins x% of their games.

Apply MMR to GSL open 3 and it will tell you that Rain was a better player than IMMvp, because Rain cheesed to the finals while Mvp was cheesed out in Ro16.

disclaimer: I'm not a math expert, just trying to understand things like everyone else. If I've made a mistake, please correct it.

----------------------------
Ok, Lysenko, I've read this thread fairly carefully, and have a question to pose to you.

On August 04 2011 10:55 Lysenko wrote:
The way you adjust for skill is to look at overall MMR distribution among each race's population. If one race, let's say Zerg, has a population distribution that's weighted toward lower MMRs, chances are it's the race that's doing it unless there's some external indication that better players systematically favor the other races for some reason.

Is this the only worthwhile balance-related statistic we can get from the ladder?

If so, maybe the OP claim is correct, and even adjusted win-rates aren't very useful.
We're here! We're queer! We don't want any more bears!
dreamsmasher
Profile Joined November 2010
816 Posts
Last Edited: 2011-08-06 03:43:17
August 06 2011 03:27 GMT
#111
On August 06 2011 12:14 Wren wrote:
On August 05 2011 11:55 Lysenko wrote:
On August 05 2011 07:05 Wren wrote:
~Ladder stats are essentially meaningless. Blizzard's correction may undo the effect of the matchmaking system gradually, but cannot fix the fact that the matchmaking system generated the data in the first place. You cannot analyze the flaw out of a flawed data set.


The problem is that everyone who's been arguing that the data set is "flawed" somehow have been saying so without any reasoning or explanation behind it, other than to completely misunderstand or misrepresent the impact of the matchmaking system on the data set.

Nobody in this thread knows what their data set is or exactly how they're analyzing it, so all the criticism of it is fantasy based on imagined details to fill in the blanks.

They've said that their data set is every ladder game, repeatedly. My understanding of the match-making and MMR system is that it's essentially the same as every other computerized ranking system: who you played, if you won, how much the other guy wins. All blizzard tracks is who beat who on which map.

This data is flawed because it cannot tell (just an example) if Terran is the best because the best people play Terran or if Terran is the best because the balance is skewed. All it can tell you is that Terran wins x% of their games.

Apply MMR to GSL open 3 and it will tell you that Rain was a better player than IMMvp, because Rain cheesed to the finals while Mvp was cheesed out in Ro16.

disclaimer: I'm not a math expert, just trying to understand things like everyone else. If I've made a mistake, please correct it.

----------------------------
Ok, Lysenko, I've read this thread fairly carefully, and have a question to pose to you.

On August 04 2011 10:55 Lysenko wrote:
The way you adjust for skill is to look at overall MMR distribution among each race's population. If one race, let's say Zerg, has a population distribution that's weighted toward lower MMRs, chances are it's the race that's doing it unless there's some external indication that better players systematically favor the other races for some reason.

Is this the only worthwhile balance-related statistic we can get from the ladder?

If so, maybe the OP claim is correct, and even adjusted win-rates aren't very useful.


statistics are about averages, things occurring in the long run. you can't really do that to GSL open 3 since it is an extremely small sample size.

if you watch the video, statistics are there to see if there is any statistical evidence (significance) for racial imbalance across leagues. they address a lot of factors beyond statistics (such as his comment about TvP, they dont want a game of defend and win even if that led to a 50% winrate matchup average even at top leagues). they also stated that they were careful with balance changes due to the qualitatively different nature of korean ladder.

mathematics only gives positive analysis, for example they cited statistical evidence suggesting that P was too strong against T, however its possible to make a plethora of changes to 'balance' the game. some 'balances' might not be fun, some might, statistics gives you no idea of those types of things. this combined with their insight that 4G was too strong in a myriad of situations and their desire to change PVP is what led to WG nerf. its important to balance around both aspects -- if you have statistically significant data across all leagues saying that one race is dominant against another, then it is an important issue to address because there *shouldn't* be huge skill disparity between players of each race.

for example if they found statistical evidence that P was too strong against Z they could just systematically lower the dps of the most popular P unit (the stalker) until win rates adjusted to ~50% even at the highest end, but that wouldn't exactly be what i call good game design.
Lysenko
Profile Blog Joined April 2010
Iceland2128 Posts
Last Edited: 2011-08-06 06:04:55
August 06 2011 06:03 GMT
#112
On August 06 2011 12:14 Wren wrote:
All blizzard tracks is who beat who on which map.


That's what the matchmaking system uses. I guarantee you that they store more info than that for each game -- for example, you can go back in someone's game history and look at their build orders. They store information on resources collected over time and units produced over time. Do they use any of that additional information in their analysis? Do they filter their matchmaking data in a way that provides greater insight than just looking at the whole population? You don't know the answer to those questions, and neither do I, so it makes no sense to criticize their quantitative analysis.

Ultimately, how that information gets fed back into changes to the game is a fuzzy process anyway.
http://en.wikipedia.org/wiki/Lysenkoism
paralleluniverse
Profile Joined July 2010
4065 Posts
Last Edited: 2011-09-23 13:25:54
August 06 2011 07:56 GMT
#113
On August 03 2011 13:07 whacks wrote:
Disclaimer: I’m not concerned about game balance at all. I’m hoping to have a discussion on the math & statistics behind Blizzard's adjusted-win-percentage that they rely on heavily.

Late last year, Blizzard released a bunch of ladder statistics on “skill-adjusted-win-percentages” for the different matchups. The reason I have it in quotes, is because they never really explained how they did the skill-adjustments. I’ve always been skeptical about whether such a “skill-adjustment” is really possible.

Well recently, I found the following video where Blizzard partially explains how they calculate the “skill-adjusted-win-percentages.” Watch the first 5 minutes of the following video:


Gist of what they said: Raw league matchup numbers aren’t very meaningful because of the matchmaking system’s ability to match up players with equally challenging opponents. The math guy mentions specifically: Not only does the system put players in 50-50 matches, it also tries to keep the race matchups at 50-50 as well. Because of this, we have to adjust for player skill to calculate the true matchup win rates. Example: a ZvP match is about to be played. The Zerg player’s rating (odds of winning) relative to the Protoss player is 55-45. The Zerg race’s rating relative to the Protoss race is 53-47. If the Protoss player ends up winning, the player ratings will then converge to 51-49. The race ratings will also converge to 52-48.

Their explanation just didn’t click with me. Rating systems such as ELO are great when you’re dealing with a single unknown (relative player strength). But can they really work if you’re trying to differentiate between 2 unknowns? Both relative player skill & race balance? I constructed the following scenario which seems to suggest that this is impossible.

It’s important to first establish the following: Any good rating system, including ELO & the point system, relies on the following principle:
• Give each agent (could be a player, or a race) a certain rating as an estimate for how strong the agent is
• If 2 agents play and one wins at a higher percentage, the more successful agent should eventually end up with a higher rating
• If a higher rated agent & a lower rated agent play against each other, and each wins with an equal percentage, the 2 ratings should eventually converge

The ELO system that Blizzard uses for MMR is an optimized algorithm that allows ratings to stabilize much quicker, but other rating systems that utilize the above principle (including the point-system), can achieve the same results in the long run.
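
The principles above can be sketched with the textbook Elo formulas. This is an assumption for illustration only: Blizzard's actual MMR formula is not public, and the K-factor of 32 and the starting ratings are arbitrary. The loop plays out the third bullet: two agents with different ratings who split their games evenly end up with nearly equal ratings.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Textbook Elo expected score of A against B (a probability between 0 and 1)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))

def update(rating_a: float, rating_b: float, a_won: bool, k: float = 32.0) -> tuple[float, float]:
    """Return both ratings after one game; the winner gains what the loser drops."""
    delta = k * ((1.0 if a_won else 0.0) - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta

# Two agents 200 points apart who alternate wins, i.e. each wins 50% of the time.
high, low = 1600.0, 1400.0
for game in range(200):
    high, low = update(high, low, a_won=(game % 2 == 0))
print(f"gap after 200 evenly split games: {high - low:.1f}")
```

The gap shrinks from 200 to a small oscillation around zero, because an upset win moves the ratings more than an expected win does.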

Now going back to the scenario, consider the case where Blizzard releases a new patch which nerfs Zerg and makes it UP (underpowered) relative to both Protoss & Terran (e.g., drones now cost 60 min). Consider what will happen to the average Zerg player. He will start losing more than 50% of his games, and his MMR will start dropping. Because of his lower MMR, he’ll start playing against weaker opponents. Eventually, his MMR will stabilize at a level where he starts winning 50% of his future games.

Now let’s say Blizzard had assigned each race a rating as well, to track how “strong they think it is.” Suppose that before the patch, all the races were balanced & had equal rating. Immediately after the patch, because the Zerg population goes through a losing streak, the Zerg rating will drop.

But eventually, the Zerg players will have stabilized their MMR and start winning 50% of their games. At this point, because of the last bullet point in the rating system’s principles (ratings will converge at 50% win rates), the Zerg rating will start increasing again. Remember also that the stabilized Zerg players are playing against opponents of the same MMR, so there’s no way to “account for player skill.” Eventually, the Zerg rating will once again converge with the other races, even though Zerg is now UP.
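The scenario can be sketched as a small deterministic simulation. This is only the naive model described in this post, not anything Blizzard has confirmed: it assumes Elo-style updates, applies them in expectation instead of rolling dice, treats the nerf as a flat 100-point loss of playing strength, and predicts the race result from the race rating gap alone. The function name, the K-factors and all the numbers are made up for illustration.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expected score of A against B (400-point logistic scale)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def simulate_nerf(games=3000, nerf=100.0, k_player=16.0, k_race=4.0):
    """Track one average Zerg player and a naive Zerg race rating after a nerf."""
    true_strength = 1500.0 - nerf  # effective playing strength after the patch
    mmr = 1500.0                   # player MMR, still at its pre-patch value
    race_gap = 0.0                 # Zerg race rating minus the other races' rating
    history = []
    for _ in range(games):
        # Matchmaking: the opponent has the same MMR and is assumed accurately rated.
        win_prob = expected_score(true_strength, mmr)
        # Player update in expectation: an equal-MMR match is predicted at 0.5.
        mmr += k_player * (win_prob - 0.5)
        # Naive race update: the prediction uses only the race rating gap.
        race_gap += k_race * (win_prob - expected_score(race_gap, 0.0))
        history.append((mmr, win_prob, race_gap))
    return history
```

In this sketch the player's MMR settles 100 points lower, his win rate returns to 50%, and the race rating gap first dips and then drifts back to zero, which is exactly the behaviour the argument predicts for a race rating that ignores player MMR.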

Based on this scenario, it seems impossible to determine whether a race is truly UP, using Blizzard’s rating system. Thoughts? Any ideas on how Blizzard could possibly be “accounting for player skill” in calculating race balance?

I'll do my best to explain this.

Blizzard uses Bayesian Inference. This is usually taught as a 3rd year or honors year statistics course at most universities. I only say this to impress upon you that this is not simple stuff.

The formula that is shown in the Youtube video is this:
[image: formula shown in the video]

Firstly, notice that the fraction in the formula looks the same as the formula here: http://en.wikipedia.org/wiki/Posterior_probability#Calculation
The fraction is a posterior probability.

Now notice that this is multiplied by a function and then integrated. This gives a Bayesian estimator, which looks the same as the formula shown here:
http://en.wikipedia.org/wiki/Bayesian_estimator#Definition
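For reference, the generic textbook forms of those two pieces are written out here. The symbols are the usual ones (theta for the unknown parameter, x for the observed data, f for the function being averaged) and are not taken from the video's formula.

```latex
% Posterior probability of the unknown parameter \theta given the data x
p(\theta \mid x) = \frac{p(x \mid \theta)\, p(\theta)}{\int p(x \mid \theta')\, p(\theta')\, d\theta'}

% Posterior expectation of a function f(\theta): "multiply by a function, then integrate"
\mathbb{E}[f(\theta) \mid x] = \int f(\theta)\, p(\theta \mid x)\, d\theta
```

With f(theta) = theta this is the posterior mean, which is the Bayes estimator under squared-error loss.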

So, the whole formula is for the Bayesian estimator where the posterior probability is the product of 3 normal distributions (the 3 MMR variables), multiplied over all g (probably stands for games, i.e. takes into account all games played).

Now what does this mean?

A Bayesian estimator estimates a parameter (in this case the probability of winning) given the evidence (in this case the skill of the player).

Essentially, they start with a prior belief about the probability of winning (very likely the simple unadjusted win ratio). This prior is updated using the skill of the players over all games, forming a posterior distribution. From that posterior, a Bayesian estimator gives the probability of winning given the skill of the player.
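A toy version of that prior, update, estimate pipeline might look like the sketch below. It is not the formula from the video: it assumes a single unknown race offset measured in rating points, a normal prior centred on zero, an Elo-style logistic likelihood, and a plain grid approximation of the integral. It also assumes the skill difference in each game is known independently of race, which is the very thing this thread is questioning. All names and default values are hypothetical.

```python
import math


def win_prob(diff):
    """Elo-style win probability for a rating advantage of `diff` points."""
    return 1.0 / (1.0 + 10 ** (-diff / 400.0))


def posterior_mean_race_offset(games, prior_sd=100.0, step=1.0, half_width=400.0):
    """Posterior mean of a race offset, given games as (skill_diff, won) pairs.

    skill_diff is the player's rating minus the opponent's rating and won is
    True when the player of the race under study won the game.
    """
    count = int(2 * half_width / step) + 1
    grid = [-half_width + i * step for i in range(count)]
    log_post = []
    for theta in grid:
        lp = -0.5 * (theta / prior_sd) ** 2      # log of the normal prior
        for diff, won in games:
            p = win_prob(diff + theta)           # likelihood of this result
            lp += math.log(p if won else 1.0 - p)
        log_post.append(lp)
    top = max(log_post)                          # subtract the max for stability
    weights = [math.exp(lp - top) for lp in log_post]
    return sum(t * w for t, w in zip(grid, weights)) / sum(weights)
```

With no games the estimate stays at the prior mean of zero, and a race that wins 40 out of 60 evenly matched games gets a positive offset.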

What isn't clear is what each variable stands for, so we don't know whether they take into account the map, game length, or other variables. From the talk, though, the impression is that only skill (i.e. MMR) is used to adjust the probabilities of winning.
paralleluniverse
Profile Joined July 2010
4065 Posts
Last Edited: 2011-08-06 08:07:05
August 06 2011 08:05 GMT
#114
On August 04 2011 10:32 bamman1108 wrote:
I like that part where they're satisfied with 5% differences in W/L when that percent is based off millions of matches. Even a 1% difference with that many matches means that one race very, very significantly favors the other. Wtf are they talking about when a 55% win rate for a specific race matchup is just "borderline?"

Given a sufficiently large sample size, it's possible to make a 0.00001% difference statistically significant, because a 0.00001% difference is a nonzero difference.

But statistical significance doesn't imply that a difference is significant in the everyday sense of being appreciable.

The following example from Wikipedia (http://en.wikipedia.org/wiki/Statistical_significance) explains this concept well:
As used in statistics, significant does not mean important or meaningful, as it does in everyday speech. For example, a study that included tens of thousands of participants might be able to say with great confidence that residents of one city were more intelligent than residents of another city by 1/20 of an IQ point. This result would be statistically significant, but the difference is small enough to be utterly unimportant.
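A quick way to see the effect of sample size is a normal-approximation test of an observed win rate against 50%. This is a generic textbook sketch, not Blizzard's test, and the function name and numbers are only illustrative.

```python
import math


def win_rate_z_test(observed_rate, games, null_rate=0.5):
    """Return (z, two-sided p-value) for an observed win rate vs. a null rate."""
    standard_error = math.sqrt(null_rate * (1.0 - null_rate) / games)
    z = (observed_rate - null_rate) / standard_error
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail area
    return z, p_value
```

A 51% win rate over 100 games gives a z of about 0.2, which is nowhere near significant, while the same 51% over 1,000,000 games gives a z of about 20. The difference is still only one percentage point in both cases.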
whacks
Profile Joined July 2011
25 Posts
August 29 2011 01:20 GMT
#115
I just got back from vacation, so forgive me for resurrecting this thread so late

Paralleluniverse, thanks for taking the time to clarify. It sounds like that formula is conceptually pretty similar to ELO, possibly taking into account multiple factors other than player skill, such as racial "scores." This is exactly what I suspected in my OP.

Lysenko, you mention over & over again that the ladder data can yield useful balance information by letting us compare average-MMR difference across the races. I agree completely on this. However, this is NOT what Blizzard is doing. How do I know this?

1) There are actually VERY significant MMR differences between the races. Terran is skewed very heavily towards Bronze, and Zerg is skewed very heavily towards Plat/Diamond/Master's. Blizzard's numbers paint a very rosy picture, but if you compared average MMRs, you'd see very wide differences.

2) Calculating average MMRs is 7th grade math. You sum up all the MMRs of each player, and divide by number of players. You definitely won't need any complicated math like what they announced.

Clearly, when Blizzard presented to us the ladder data, they weren't basing it off average-MMR. You haven't presented any other methods they could be using that works in our ladder system, so you might actually be in agreement with the point I'm trying to make in my OP, and with what others like lhpares & bmn have been saying.

Again, if you have "blind faith" in Blizzard's abilities... I respect that, but that's not what this thread is about.
darmousseh
Profile Blog Joined May 2010
United States3437 Posts
Last Edited: 2011-09-29 23:18:22
September 29 2011 22:48 GMT
#116
Bumping this because I found a good article on a Bayesian approximation method for online ranking. I have started deciphering the variables in the equation.

http://jmlr.csail.mit.edu/papers/volume12/weng11a/weng11a.pdf


So far it appears that they are analyzing the games as if the race the person has chosen is considered an additional player in the match. I'm assuming that the different sigmas represent the sigmas of the 3 races (meaning each race has a MMR). Normally I would just scoff this off as "impossible to calculate", but since they are using the sigmas themselves to calculate the values, it seems a lot more reasonable as it doesn't matter what type of matchmaking system is being used. Like the above post, this is the posterior probability function.

I will provide an update once I figure out all of the variables. I'm mostly having trouble on that Psi.

Edit; In games where ties are not allowed or very infrequent, gamma is typically used for score variance. It's possible that the score at the end of the game is being used to calculate it.


Edit 2; After consideration, the 3 sigmas being used in the equation are "player skill", "matchup skill", and "overall skill". For example, a player might have an MMR of 2000/100, Protoss (vs Zerg) an MMR of 1500/50, and Protoss (vs all) an MMR of 1600/75. That's the conclusion I am coming to so far. I will attempt to verify this hypothesis.
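One way to read this hypothesis is as a team game, where each side is a "team" made of the player, the matchup rating and the overall race rating. The sketch below assumes that reading: it takes the pairs like 2000/100 to mean mean/sigma, adds up the components of each side as independent normals, adds one hypothetical performance-noise term `beta` per side, and uses a normal CDF for the win probability. None of this is confirmed by the video or by Blizzard, and the function names are made up.

```python
from math import sqrt
from statistics import NormalDist


def side_strength(components):
    """Sum independent (mean, sigma) components into a total mean and variance."""
    total_mean = sum(mean for mean, _ in components)
    total_var = sum(sigma * sigma for _, sigma in components)
    return total_mean, total_var


def win_probability(side_a, side_b, beta=100.0):
    """Probability that side A outperforms side B under the Gaussian assumptions."""
    mean_a, var_a = side_strength(side_a)
    mean_b, var_b = side_strength(side_b)
    spread = sqrt(var_a + var_b + 2.0 * beta * beta)
    return NormalDist().cdf((mean_a - mean_b) / spread)


# Hypothetical PvZ game: equal players, Protoss matchup rating 50 points higher.
protoss_side = [(2000.0, 100.0), (1500.0, 50.0), (1600.0, 75.0)]
zerg_side = [(2000.0, 100.0), (1450.0, 50.0), (1600.0, 75.0)]
pvz_estimate = win_probability(protoss_side, zerg_side)
```

Under these assumptions two identical sides come out at exactly 50%, and the 50-point matchup edge pushes `pvz_estimate` only a little above that because the combined uncertainty is large.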
Developer for http://mtgfiddle.com
CluEleSs_UK
Profile Blog Joined August 2010
United Kingdom583 Posts
September 29 2011 22:55 GMT
#117
But surely this doesn't work out? Each race will have a different win percentage at each league. Zerg for instance has a high win percentage in lower leagues, because bronzies can't deal with ling runbys, but at high levels where this isn't as viable, the Zerg win rate is far lower.
"If it turns out he is leaving the ESL to focus on cooking crystal meth I'll agree that it is somewhat disgraceful, but I'll hold off judgement until then."
Warble
Profile Joined May 2011
137 Posts
September 30 2011 02:51 GMT
#118
Perhaps I am wrong about this, but it seems like Blizzard's approach to balancing is to assume that each race has approximately the same skill distribution.

This certainly simplifies the task.

And, personally, I think it is probably the most practical way to approach this matter. The more readily pros switch to races they consider overpowered, the more likely this approach is to yield an outcome close to objective balance.

We could argue that switching races is quite difficult, which means there will be more imbalance.

It's hard to see any other way to simplify the task, which we have already established is intractable without any simplifications.
whatthefat
Profile Blog Joined August 2010
United States918 Posts
September 30 2011 03:08 GMT
#119
On August 03 2011 13:07 whacks wrote:
Their explanation just didn’t click with me. Rating systems such as ELO are great when you’re dealing with a single unknown (relative player strength). But can they really work if you’re trying to differentiate between 2 unknowns? Both relative player skill & race balance? I constructed the following scenario which seems to suggest that this is impossible.


This has come up a few times, and yes you're right, it is impossible. It's possible that on average players of one race are actually better players than those of another. Based just on game results, there is absolutely no way of distinguishing that from the race being overpowered. Somewhere along the line you have to make an assumption, and I think the assumption they have used is that the player pool for each race is equally "skilled" (another problem is that there's no formal definition of skill), and any further discrepancies in win/loss rates (once matchmaking is accounted for) are due to imbalances in the game. Is it a reasonable assumption? Maybe.
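Here is a small sketch of why results alone can't separate the two explanations. It assumes an Elo-style model where a race bonus is simply added to player skill, which is an illustration and not Blizzard's model. In "world A" the Zerg player is as skilled as his opponent but the race is 100 points underpowered. In "world B" the race is balanced but the Zerg player is 100 points worse. Every game has the same win probability in both worlds.

```python
import math


def win_prob(skill_a, bonus_a, skill_b, bonus_b):
    """Elo-style win probability when a race bonus is added to player skill."""
    diff = (skill_a + bonus_a) - (skill_b + bonus_b)
    return 1.0 / (1.0 + 10 ** (-diff / 400.0))


# World A: equally skilled players, Zerg race is 100 points underpowered.
world_a = win_prob(skill_a=1500.0, bonus_a=-100.0, skill_b=1500.0, bonus_b=0.0)

# World B: balanced races, but the Zerg player is 100 points less skilled.
world_b = win_prob(skill_a=1400.0, bonus_a=0.0, skill_b=1500.0, bonus_b=0.0)

same_prediction = math.isclose(world_a, world_b)
```

Since only the sum of skill and bonus enters the prediction, some extra assumption (such as equal skill pools across races) is needed before a race bonus can be estimated at all.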
SlayerS_BoxeR: "I always feel sorry towards Greg (Grack?) T_T"
FieryBalrog
Profile Blog Joined July 2007
United States1381 Posts
September 30 2011 06:37 GMT
#120
Very interesting thread to read, particularly Lysenko's posts.
I will eat you alive

Original banner artwork: Jim Warren
The contents of this webpage are copyright © 2025 TLnet. All Rights Reserved.