Blizzard's "skill-adjusted-win-percentages"

whacks

25 Posts

August 03 2011 04:07 GMT

Disclaimer: I’m not concerned about game balance at all. I’m hoping to have a discussion on the math & statistics behind Blizzard's adjusted-win-percentage that they rely on heavily.

Late last year, Blizzard released a bunch of ladder statistics on “skill-adjusted-win-percentages” for the different matchups. I have the term in quotes because they never really explained how they did the skill adjustments. I’ve always been skeptical about whether such a “skill-adjustment” is really possible.

Well recently, I found the following video where Blizzard partially explains how they calculate the “skill-adjusted-win-percentages.” Watch the first 5 minutes of the following video:

Gist of what they said: Raw league matchup numbers aren’t very meaningful because of the matchmaking system’s ability to match players with equally challenging opponents. The math guy mentions specifically: Not only does the system put players in 50-50 matches, it also tries to keep the race matchups at 50-50 as well. Because of this, we have to adjust for player skill to calculate the true matchup win rates. Example: a ZvP match is about to be played. The Zerg player’s rating (odds of winning) relative to the Protoss player is 55-45. The Zerg race’s rating relative to the Protoss race is 53-47. If the Protoss player ends up winning, the player ratings will then converge to 51-49. The race ratings will also converge to 52-48.

Their explanation just didn’t click with me. Rating systems such as ELO are great when you’re dealing with a single unknown (relative player strength). But can they really work if you’re trying to differentiate between 2 unknowns? Both relative player skill & race balance? I constructed the following scenario which seems to suggest that this is impossible.

It’s important to first establish the following: Any good rating system, including ELO & the point system, relies on the following principle:
• Give each agent (could be a player, or a race) a certain rating as an estimate for how strong the agent is
• If 2 agents play and one wins at a higher percentage, the more successful agent should eventually end up with a higher rating
• If a higher rated agent & a lower rated agent play against each other, and each wins with an equal percentage, the 2 ratings should eventually converge

The ELO system that Blizzard uses for MMR is an optimized algorithm that allows ratings to stabilize much quicker, but other rating systems that utilize the above principle (including the point-system), can achieve the same results in the long run.
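To make the three principles above concrete, here is a minimal sketch of a textbook Elo update. It assumes the standard logistic curve on a 400-point scale and an illustrative K-factor of 32; Blizzard's actual MMR formula is not public, so none of these constants should be read as theirs. The loop checks the third principle: two agents that start 200 points apart but split their games evenly end up with nearly equal ratings.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expected score of A against B (logistic curve, 400-point scale)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a, rating_b, a_won, k=32.0):
    """Return both ratings after one game. K=32 is an illustrative assumption."""
    exp_a = expected_score(rating_a, rating_b)
    delta = k * ((1.0 if a_won else 0.0) - exp_a)
    return rating_a + delta, rating_b - delta


# A higher-rated and a lower-rated agent split their games exactly 50-50.
high, low = 1600.0, 1400.0
for game in range(2000):
    high, low = update(high, low, a_won=(game % 2 == 0))

gap_after = high - low
print(f"rating gap after 2000 evenly split games: {gap_after:.1f}")
```

The gap does not land on exactly zero because each single game still moves the ratings, but it stays within one update step of zero instead of the original 200 points.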

Now going back to the scenario, consider the case where Blizzard releases a new patch which nerfs Zerg and makes it UP relative to both Protoss & Terran (eg, drones now cost 60 min). Consider what will happen to the average Zerg player. He will start losing more than 50% of his games, and his MMR will start dropping. Because of his lower MMR, he’ll start playing against weaker opponents. Eventually, his MMR will stabilize at a level where he starts winning 50% of his future games.

Now let’s say Blizzard had assigned each race a rating as well, to track how “strong they think it is.” Suppose that before the patch, all the races were balanced & had equal rating. Immediately after the patch, because the Zerg population goes through a losing streak, the Zerg rating will drop.

But eventually, the Zerg players will have stabilized their MMR and start winning 50% of their games. At this point, because of the last bullet point in the rating system’s principles (ratings will converge at 50% win rates), the Zerg rating will start increasing again. Remember also that the stabilized Zerg players are playing against opponents of the same MMR, so there’s no way to “account for player skill.” Eventually, the Zerg rating will once again converge with the other races, even though Zerg is now UP.

Based on this scenario, it seems impossible to determine whether a race is truly UP, using Blizzard’s rating system. Thoughts? Any ideas on how Blizzard could possibly be “accounting for player skill” in calculating race balance?
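The scenario can be tried out in a small simulation. This is only a sketch under stated assumptions: a closed ladder of 300 players, hidden skill drawn from a normal distribution, a hypothetical patch that makes every Zerg play 150 Elo points worse, Elo-style updates with K=16, and matchmaking that pairs neighbours on the ladder. All of these names and numbers are made up for illustration and have nothing to do with Blizzard's real system.

```python
import random


def win_prob(strength_a, strength_b):
    """Logistic (Elo-style) chance that A beats B, on a 400-point scale."""
    return 1.0 / (1.0 + 10 ** ((strength_b - strength_a) / 400.0))


def simulate_nerf(handicap=150.0, n_players=300, rounds=600, k=16.0, seed=1):
    rng = random.Random(seed)
    # Everyone starts at the pre-patch equilibrium: MMR equals hidden skill.
    players = []
    for i in range(n_players):
        skill = rng.gauss(1500.0, 200.0)
        players.append({"race": "ZTP"[i % 3], "skill": skill, "mmr": skill})

    def strength(p):
        # Hypothetical patch: Zerg effectively plays `handicap` points worse.
        return p["skill"] - (handicap if p["race"] == "Z" else 0.0)

    zerg_results = []  # (round, 1 or 0) for every Zerg vs non-Zerg game
    for rnd in range(rounds):
        # Matchmaking: pair players whose (slightly jittered) MMRs are adjacent.
        ladder = sorted(players, key=lambda p: p["mmr"] + rng.gauss(0.0, 30.0))
        for a, b in zip(ladder[0::2], ladder[1::2]):
            a_won = rng.random() < win_prob(strength(a), strength(b))
            delta = k * ((1.0 if a_won else 0.0) - win_prob(a["mmr"], b["mmr"]))
            a["mmr"] += delta
            b["mmr"] -= delta
            if (a["race"] == "Z") != (b["race"] == "Z"):
                zerg_won = a_won if a["race"] == "Z" else not a_won
                zerg_results.append((rnd, 1 if zerg_won else 0))

    def rate(lo, hi):
        games = [w for r, w in zerg_results if lo <= r < hi]
        return sum(games) / len(games)

    def mean_shift(is_zerg):
        group = [p for p in players if (p["race"] == "Z") == is_zerg]
        return sum(p["mmr"] - p["skill"] for p in group) / len(group)

    return {
        "early_zerg_winrate": rate(0, 20),
        "late_zerg_winrate": rate(rounds - 200, rounds),
        "zerg_mmr_shift": mean_shift(True),
        "others_mmr_shift": mean_shift(False),
    }


result = simulate_nerf()
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

In this toy model the Zerg win rate against other races dips right after the patch and then drifts back toward 50%, while the nerf ends up hidden in the gap between MMR and true skill. That gap is visible here only because the simulation knows each player's hidden skill, which a real ladder does not.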

roymarthyup

1442 Posts

August 03 2011 04:26 GMT

Agreed. This formula Blizzard is using is dumb; however, I think (HOPEFULLY) they are using it just to see when one race has a clear advantage (which the results could show). Then, when they look at the advantages, they must look at the race's units and high-level games to decide where that race's strength is coming from.

KiLL_ORdeR

United States1518 Posts

August 03 2011 04:34 GMT

The third and arguably most important factor that they exclude from that, though, is map balance. They will never get a perfect rating unless the system takes the maps into account.

Snaphoo

United States614 Posts

August 03 2011 04:36 GMT

On August 03 2011 13:34 KiLL_ORdeR wrote:
The third and arguably most important factor that they exclude from that, though, is map balance. They will never get a perfect rating unless the system takes the maps into account.

Even in terms of position. ZvT close positions on Shattered versus far positions, for example, has got to be pretty skewed.

Ketara

United States15065 Posts

August 03 2011 04:37 GMT

Measuring balance using any single measurement tool is going to be faulty. That's why Blizzard uses a large number of different tools and gathers data from all of them individually in order to make balance decisions.

Because they use multiple different tools, an argument like "measuring balance based on this tool is faulty" is silly. If you're trying to say that Blizzard does a bad job at balancing the game, then attack the entirety of their balance system altogether (pro feedback, community feedback, balance design team, matchmaker statistics, tournament statistics, etc).

Don't pick out one part of a system and then act all surprised when it doesn't work when separated from the rest of the system.

Omnipresent

United States871 Posts

August 03 2011 04:37 GMT

It's true, the idea of adjusting for skill (however that's determined) seems shaky at best.

I think this thread has a lot of the discussion you're looking for.

Fugue

Australia253 Posts

August 03 2011 04:45 GMT

Your example assumes that Blizzard would simply ignore the downward trend in both players and the race ratings post patch and wait for the statistics to stabilise before analysing them. While you may be correct that they inevitably will stabilise, history has shown Blizzard won't just wait for that to happen and then announce that everything is hunky dory.

There are a lot of variables (such as maps, new strategies, patches, etc.) which will cause shifts in these trends, and it no doubt takes quite a while to do a thorough analysis on the root cause of such shifts, but I think there's a lot of evidence that this sort of thing is being taken into account.

aksfjh

United States4853 Posts

August 03 2011 04:47 GMT

There are a lot of points to go through in this topic, so I'll try to answer them as I see them.

But can they really work if you’re trying to differentiate between 2 unknowns? Both relative player skill & race balance? I constructed the following scenario which seems to suggest that this is impossible.

There's a TON of math out there that deals with 2+ independent or unknown variables. You can basically use probability and statistics to predict with a degree of certainty where each unknown lies.

The ELO system that Blizzard uses for MMR is an optimized algorithm that allows ratings to stabilize much quicker, but other rating systems that utilize the above principle (including the point-system), can achieve the same results in the long run.

"Optimized algorithm" is almost ubiquitous with professionally created ladder systems. The general consensus about MMR, however, isn't that it "stabilizes" much quicker, but that it gets to the general skill area much quicker. Evidence seems to indicate that MMR actually isn't very stable at all. It varies widely after each match and gives more of a window of skill range for each player and less of an absolute position among players. Even with perfectly differentiated skill levels, you're likely to see statistical "noise" in every player's MMR.

What you can obtain from MMR, however, is relative win percentage expectations, just like a normal Elo system. However, the uncertainty of this prediction is probably much greater (statistically) than a traditional Elo. Where a fully matured Elo scale may have an expanded uncertainty of ±0.5%, Blizzard's may be closer to an expanded uncertainty of ±1.2%. A greater number of matches and predictions will lower each of these numbers. ***These numbers are completely made up and used as an example.***
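As a rough illustration of why more matches shrink these numbers, here is a sketch of the sampling uncertainty on an observed win rate. It assumes independent games and a normal approximation with a coverage factor of 2 (roughly 95%); the function name `expanded_uncertainty` is made up for this example, and it only models sampling noise, not whatever extra noise MMR itself adds.

```python
import math


def expanded_uncertainty(win_rate, n_games, coverage=2.0):
    """Approximate half-width of the interval around an observed win rate."""
    return coverage * math.sqrt(win_rate * (1.0 - win_rate) / n_games)


for n in (1_000, 10_000, 40_000):
    print(f"{n:>6} games: +/- {expanded_uncertainty(0.5, n):.2%}")
```

Quadrupling the number of games halves the uncertainty, so tight bounds need a lot of games.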

• Give each agent (could be a player, or a race) a certain rating as an estimate for how strong the agent is
• If 2 agents play and one wins at a higher percentage, the more successful agent should eventually end up with a higher rating
• If a higher rated agent & a lower rated agent play against each other, and each wins with an equal percentage, the 2 ratings should eventually converge

For these points, the stability and gradual increase in "points" actually come from ladder points. Since MMR is littered with too much noise, ladder points act as a sort of anchor to place people where they "belong." Your ladder points will drift much slower to the general area of your MMR, while your MMR can properly reflect who to place you against on a good and bad day. This way, you don't have to lose 300 ladder points to start playing people a division lower if you're just doing awful today. You'll maintain a ~50% winrate, but your point gains and losses will become unbalanced and you'll slowly slide downward.

Now going back to the scenario, consider the case where Blizzard releases a new patch which nerfs Zerg and makes it UP relative to both Protoss & Terran (eg, drones now cost 60 min). Consider what will happen to the average Zerg player. He will start losing more than 50% of his games, and his MMR will start dropping. Because of his lower MMR, he’ll start playing against weaker opponents. Eventually, his MMR will stabilize at a level where he starts winning 50% of his future games.

(...etc. until end of post)

This is more of a philosophical question applied to real world statistics. Since the criteria for skill are based on results and not quantitative characteristics, a bit of guesswork and intuition has to come into play. This is where they break up the statistics between leagues, look at tournament results, ask players about balance, etc. If they began to see all Zerg drop down the ladder, they would know that they swung the nerf bat too hard. Even then, however, balance is relatively subjective. Depending on how you look at data, all 3 races could look OP. If you throw psychology into the mix and say things like "people won't try to get better if they're winning more than X% of the time," it gets even harder to balance. As long as they keep a vigilant eye on results across the spectrum without seeing any glaring imbalances (like Zerg getting knocked out in the first round of every tournament ever), we can be assured that balance isn't broken.

reneg

United States859 Posts

August 03 2011 04:49 GMT

On August 03 2011 13:36 Snaphoo wrote:

Even in terms of position. ZvT close positions on Shattered versus far positions, for example, has got to be pretty skewed.

You'd think something like this, and then you learn that on metal (with all positions enabled), Z has something like a 60% win ratio.

That's factoring in an "auto-loss" 1/3 of the time.

Personally, I don't feel the game is as imbalanced as a lot of people tend to think

Phaded

Australia579 Posts

August 03 2011 04:50 GMT

#10

Think of it like weighted win percentages

Simplistic example
P1 is Zerg, he has 10 wins vs Protoss and 9 losses vs Protoss. Each of these Protoss players had a lower MMR and P1 was favored

Since all 19 games played had P1 as being favoured to win, it does not necessarily translate to a 52.63% win ratio for the Zerg player P1.
If overall, Zerg can only hold an aggregate 52% win ratio against substantially less skilled players, then there is potentially something wrong with the match up.

These are all simplistic numbers, but expand it for the entire population and you can see why you need to adjust win percentages for relative skill level.
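Here is the P1 example as a small sketch. The MMR values (P1 at 1600, every Protoss opponent at 1500) and the adjustment formula are assumptions picked for illustration, since Blizzard has not published how its adjustment works. The idea is just to compare actual wins with the wins the rating gap predicted.

```python
def expected_score(rating, opponent_rating):
    """Elo-style chance of winning, given both ratings (400-point scale)."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


# Hypothetical numbers: P1 sits at 1600 MMR and every Protoss opponent at 1500.
p1_mmr = 1600.0
games = [(1500.0, True)] * 10 + [(1500.0, False)] * 9  # (opponent MMR, P1 won?)

wins = sum(1 for _, won in games if won)
expected_wins = sum(expected_score(p1_mmr, opp) for opp, _ in games)

raw_rate = wins / len(games)
# One simple (assumed) adjustment: shift 50% by the average over/under-performance.
adjusted_rate = 0.5 + (wins - expected_wins) / len(games)

print(f"raw win rate:      {raw_rate:.2%}")
print(f"expected wins:     {expected_wins:.2f} of {len(games)}")
print(f"adjusted win rate: {adjusted_rate:.2%}")
```

With these made-up ratings, P1 wins fewer games than the rating gap predicts, so the adjusted figure falls below 50% even though the raw win rate is above it.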

whacks

25 Posts

August 03 2011 05:09 GMT

#11

Thanks for the responses all.

A lot of people have responded with some variant of "If Blizzard sees all the Zerg players have significantly lower MMR, they'll know there's something wrong."
If Blizzard is doing this, then what they're basically doing is comparing the average zerg player's MMR with the average Terran player's MMR. This approach can break for so many reasons, which I'm not going to get into now. But if this is indeed what they're doing... why even put up the smoke screen of complicated formulas leading to "adjusted-win-rates"?

As a lot of people have mentioned, much of Blizzard's balancing act revolves around subjective judgments, which is certainly a valid approach. However, I'm uncomfortable with Blizzard constantly bringing up their "adjusted-win-rates" to support their claim that everything is balanced. If they're going to present this statistic as proof, it only makes sense that we question whether this statistic makes sense.

My central question is this: Does Blizzard's "adjusted-win-rate" hold any value at all? The example I brought up was to show that no matter how UP a race is, the win-rate statistics will still show perfect balance. If so, why shouldn't we throw these statistics in the trash?

whacks

25 Posts

August 03 2011 05:13 GMT

#12

Omnipresent, thanks for the awesome link. I was gonna make a very similar post actually, and am very amused/disappointed to see that someone has taken the words out of my mouth

aks, thanks for the detailed reply. When I was talking about stabilization, I was assuming consistent skill levels, to give Blizzard's statistics further benefit. Regarding the math tools that you mentioned exist... if you could give an example of how one of them can be used in this situation, that would be great. In my mind, the problem does seem unsolvable... kinda like a single equation with 2 unknowns.

aksfjh

United States4853 Posts

August 03 2011 05:23 GMT

#13

On August 03 2011 14:09 whacks wrote:
Thanks for the responses all.

A lot of people have responded with some variant of "If Blizzard sees all the Zerg players have significantly lower MMR, they'll know there's something wrong."
If Blizzard is doing this, then what they're basically doing is comparing the average zerg player's MMR with the average Terran player's MMR. This approach can break for so many reasons, which I'm not going to get into now. But if this is indeed what they're doing... why even put up the smoke screen of complicated formulas leading to "adjusted-win-rates"?

As a lot of people have mentioned, much of Blizzard's balancing act revolves around subjective judgments, which is certainly a valid approach. However, I'm uncomfortable with Blizzard constantly bringing up their "adjusted-win-rates" to support their claim that everything is balanced. If they're going to present this statistic as proof, it only makes sense that we question whether this statistic makes sense.

My central question is this: Does Blizzard's "adjusted-win-rate" hold any value at all? The example I brought up was to show that no matter how UP a race is, the win-rate statistics will still show perfect balance. If so, why shouldn't we throw these statistics in the trash?

Because they won't show perfect balance... You act as if skill and metagame never change. They use these win-rate statistics to track trends and follow the "migration" of large groups of players. In your example, Zerg won't drop down overnight. Even after a devastating patch like the one in your example, they have a window of probably a month or 2 to watch this happen and make adjustments accordingly. If you always judge the merits of a system by its ends, you're going to have a very depressing and disappointing life in general...

Msr

Korea (South)495 Posts

August 03 2011 05:23 GMT

#14

On August 03 2011 13:49 reneg wrote:

Every time I get Delta Quadrant I leave, and the same goes for other similar maps, so when I do get meta I am playing somebody I am 99% supposed to win against.

RandomAccount139135

40 Posts

August 03 2011 05:35 GMT

#15

--- Nuked ---

aksfjh

United States4853 Posts

August 03 2011 05:37 GMT

#16

On August 03 2011 14:13 whacks wrote:
aks, thanks for the detailed reply. When I was talking about stabilization, I was assuming consistent skill levels, to give Blizzard's statistics further benefit. Regarding the math tools that you mentioned exist... if you could give an example of how one of them can be used in this situation, that would be great. In my mind, the problem does seem unsolvable... kinda like a single equation with 2 unknowns.

In general, you use systems of equations/linear algebra to solve these problems. You've probably seen stuff like this:

3x+5y=5
5x+2y=7

As long as the results are linearly independent (where basically one equation isn't a multiple of another), you can get answers for x and y. If you then throw in another equation where the answer creates a "contradiction":

3x+5y=7

This is where you get uncertainty (more or less). You can quantify this uncertainty to give you an idea of where the values of x and y lie. The way Blizzard's equations work looks, at the very core, like this:

Ax+By+Cz=D

Where x, y, and z represent the race, rating, and map variables (probably winrates), and A, B, C, and D represent some constant values, like the actual race they're facing, the rating of their opponent (or the rating difference), the map selection, and the projected winrate. Different operators are used (+, -, *, /, ^, etc.), but that's essentially how it works.
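For what it's worth, the small systems above can be checked in a few lines. This is only a sketch of the general idea (an exact solve, then a least-squares fit once the contradictory equation is added), not anything Blizzard has described. The helper names `solve_2x2` and `least_squares` are made up for illustration.

```python
from fractions import Fraction


def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve a 2x2 linear system exactly with Cramer's rule."""
    det = Fraction(a11 * a22 - a12 * a21)
    if det == 0:
        raise ValueError("equations are not linearly independent")
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - b1 * a21) / det


def least_squares(rows, rhs):
    """Best-fit (x, y) for possibly contradictory equations via the normal equations."""
    s11 = sum(a * a for a, _ in rows)
    s12 = sum(a * b for a, b in rows)
    s22 = sum(b * b for _, b in rows)
    t1 = sum(a * r for (a, _), r in zip(rows, rhs))
    t2 = sum(b * r for (_, b), r in zip(rows, rhs))
    return solve_2x2(s11, s12, s12, s22, t1, t2)


# 3x + 5y = 5 and 5x + 2y = 7 are independent, so there is one exact answer.
exact = solve_2x2(3, 5, 5, 2, 5, 7)

# Adding 3x + 5y = 7 contradicts the first equation, so we fit all three instead.
rows, rhs = [(3, 5), (5, 2), (3, 5)], [5, 7, 7]
fit = least_squares(rows, rhs)
residuals = [r - (a * fit[0] + b * fit[1]) for (a, b), r in zip(rows, rhs)]

print("exact:", exact)
print("fit:", fit, "residuals:", residuals)
```

The leftover residuals are one way to put a number on the "uncertainty" from the contradictory equation: the fit splits the difference between 3x+5y=5 and 3x+5y=7.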

whacks

25 Posts

August 03 2011 05:37 GMT

#17

On August 03 2011 14:23 aksfjh wrote:

LOL, don't take it so personally. I'm not here seeking life coaching

The example I gave is an extreme one, but you can easily think up more subtle cases. What if the imbalance has existed ever since game launch? In this case, there is no adjustment at all for Blizzard to track. In fact, any fixes that Blizzard puts in to fix the imbalance, will only show an upward shift for the previously-UP race & a downward-shift for the previously-OP race.

The point I'm trying to get across: If Blizzard's statistics don't work even in the ideal case where player skill & metagame is static and one race is very obviously UP, then of what possible value are they?

puppykiller

United States3137 Posts

August 03 2011 05:37 GMT

#18

Did anyone else watch the question and answer section of the video? That was the most tragic thing I have ever seen. Obviously, when the majority of the people who play your games have that mindset, you're going to be in a position where you can't make a well-balanced game.

whacks

25 Posts

August 03 2011 05:47 GMT

#19

On August 03 2011 14:35 Akari Takai wrote:

Yes, Blizzard has equations to match people by their skill levels and then uses another equation to convert that skill level across races (like converting the US dollar to the Euro).

Let's pretend there's a Terran player with a skill level of 2100 ELO and a Zerg player of 1900 ELO. The difference of 200 ELO (in most systems) is enough that the Terran player would be expected to win about three games out of four against the Zerg player if TvZ was perfectly balanced. Now let's assume that TvZ is favorable to the Zerg player, such that a Zerg player with an intrinsic skill level of 1900 is on even footing with a Terran player of 2100, and is expected to win 50% of the time and lose 50% of the time.

So Blizzard has a formula to convert Terran ELO to Zerg ELO to Protoss ELO, etc. I figure the way they probably do this is to look at the statistical curve of each race's ELO, compare the means, and figure out the disparity. Then they do this for each league.

It's not a perfect system, and there is some uncertainty, but if there were severe issues with balance, they would become very obvious, very quickly.

You bring up a good point, which is that things get tricky in the case where ZvP is balanced, but not ZvT. Check out the link by Omni earlier in the thread. The post is regarding this very topic & analyzes it very well.

However, your analogy breaks down if Zerg is equally overpowered against both Terran and Protoss. Suppose the Zerg player deserves to be at 1900, but because of the racial imbalance, has a 50-50 chance of beating 2100 Terran & Protoss players. His MMR will then rise to 2100 because of the way the ELO system works (in fact, it would never have stabilized at 1900 to begin with). Because of this, when Blizzard looks at ladder results, all they see is a bunch of 2100 Zergs, 2100 Terrans, and 2100 Protoss with 50-50 win rates.

This is exactly the point I'm trying to get across. If ZvT is unbalanced but ZvP is balanced (or unbalanced the other way), then you will see the results reflected in the statistics. But if one race is equally UP or OP against both other races, then all the ladder statistics would become absolutely meaningless.
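Here is a quick sketch of that argument, using the 1900 vs 2100 numbers from above. It assumes an Elo-style win curve, that the Zerg player faces Terran and Protoss equally often, and that opponents at his MMR really are as strong as that MMR says. The `adv_vs_t` and `adv_vs_p` parameters (the racial advantage in rating points) are hypothetical knobs, not anything from Blizzard.

```python
def win_prob(diff):
    """Elo-style win chance for a player who is `diff` points stronger."""
    return 1.0 / (1.0 + 10 ** (-diff / 400.0))


def stable_mmr(skill, adv_vs_t, adv_vs_p):
    """MMR at which the Zerg player's overall win rate settles at 50%."""
    lo, hi = skill - 2000.0, skill + 2000.0
    for _ in range(100):  # bisection on the overall win rate
        mid = (lo + hi) / 2.0
        overall = 0.5 * win_prob(skill + adv_vs_t - mid) + 0.5 * win_prob(skill + adv_vs_p - mid)
        if overall > 0.5:
            lo = mid  # still winning too much, so MMR keeps rising
        else:
            hi = mid
    return (lo + hi) / 2.0


def observed(skill, adv_vs_t, adv_vs_p):
    """Return (stable MMR, ZvT win rate, ZvP win rate) as seen on the ladder."""
    mmr = stable_mmr(skill, adv_vs_t, adv_vs_p)
    return mmr, win_prob(skill + adv_vs_t - mmr), win_prob(skill + adv_vs_p - mmr)


uniform = observed(1900.0, 200.0, 200.0)   # Zerg equally OP against both races
lopsided = observed(1900.0, 200.0, 0.0)    # Zerg OP against Terran only

print("uniform:  MMR %.0f, ZvT %.1f%%, ZvP %.1f%%" % (uniform[0], 100 * uniform[1], 100 * uniform[2]))
print("lopsided: MMR %.0f, ZvT %.1f%%, ZvP %.1f%%" % (lopsided[0], 100 * lopsided[1], 100 * lopsided[2]))
```

In the uniform case the MMR soaks up the whole advantage and both matchups read 50-50. In the lopsided case the MMR only soaks up the average, so ZvT reads above 50% and ZvP below it, which is the only kind of imbalance the matchup statistics can reveal in this toy model.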

aksfjh

United States4853 Posts

August 03 2011 05:47 GMT

#20

On August 03 2011 14:37 whacks wrote:

LOL, don't take it so personally. I'm not here seeking life coaching

That's where the philosophy and subjectivity come in. In this case, you take it on faith that the game was, for the most part, balanced upon release. This is the ONLY reason phrases like, "That's because Zerg players are better than ____, but it only shows when they play on a fair map!" and "So many crappy Terran players relied on all-ins, and now since the game takes REAL skill, they're losing all the time!" hold any merit whatsoever. In reality, there wasn't a race at release that so many people flocked to and stayed with that it was overwhelming for the other races.

Essentially, these win-rates are one of the tools in the shed we can use, but we should never rely on them too much. We have to take input from the community, theorycrafting, and our own play experiences to put those win-rates into perspective, and that's exactly what Blizzard does.

I also want to point out that I was just commenting on the very nature of using the "ends" of these systems as a way to rate them. In reality, the reason why Elo systems SUCK for a majority of people isn't because you start at some median and can fall below it, but because it takes so long to start getting matched up against people you can expect to win (or lose) against. The end system is perfect and beautiful, but in a constantly flowing system, it's not so much. Blizzard took a different approach with MMR and allowed people to get 50% win-rates fairly quickly, then let the point system fill in the rest. The result is a developing system that is more pretty, but the end result not so much.
