On August 03 2011 13:07 whacks wrote: Disclaimer: I’m not concerned about game balance at all. I’m hoping to have a discussion on the math & statistics behind Blizzard's adjusted-win-percentage that they rely on heavily.
Late last year, Blizzard released a bunch of ladder statistics on “skill-adjusted-win-percentages” for the different matchups. The reason I have it in quotes, is because they never really explained how they did the skill-adjustments. I’ve always been skeptical about whether such a “skill-adjustment” is really possible.
Well recently, I found the following video where Blizzard partially explains how they calculate the “skill-adjusted-win-percentages.” Watch the first 5 minutes of the following video:
Gist of what they said: Raw league matchup numbers aren’t very meaningful because of the matchmaking system’s ability to match players up with equally challenging opponents. The math guy mentions specifically: Not only does the system put players in 50-50 matches, it also tries to keep the race matchups at 50-50 as well. Because of this, we have to adjust for player skill to calculate the true matchup win rates. Example: a ZvP match is about to be played. The Zerg player’s rating (odds of winning) relative to the Protoss player is 55-45. The Zerg race’s rating relative to the Protoss race is 53-47. If the Protoss player ends up winning, the player ratings will then converge to 51-49. The race ratings will also converge to 52-48.
Their explanation just didn’t click with me. Rating systems such as ELO are great when you’re dealing with a single unknown (relative player strength). But can they really work if you’re trying to differentiate between 2 unknowns? Both relative player skill & race balance? I constructed the following scenario which seems to suggest that this is impossible.
It’s important to first establish the following: Any good rating system, including ELO & the point system, relies on the following principle:
• Give each agent (could be a player, or a race) a certain rating as an estimate for how strong the agent is
• If 2 agents play and one wins at a higher percentage, the more successful agent should eventually end up with a higher rating
• If a higher rated agent & a lower rated agent play against each other, and each wins with an equal percentage, the 2 ratings should eventually converge
The ELO system that Blizzard uses for MMR is an optimized algorithm that allows ratings to stabilize much quicker, but other rating systems that utilize the above principle (including the point-system), can achieve the same results in the long run.
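To make the three principles concrete, here is a minimal sketch of a textbook Elo update in Python. The K-factor of 32, the 400-point scale, and the function names are assumptions for illustration; Blizzard has not published how its MMR is actually computed.

```python
def expected_score(rating_a, rating_b):
    """Predicted score of A against B under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a, rating_b, score_a, k=32.0):
    """Return both new ratings after one game; score_a is 1 (win), 0.5 or 0."""
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Third principle: a 1600 and a 1400 who split their games evenly drift together.
a, b = 1600.0, 1400.0
for game in range(2000):
    a, b = elo_update(a, b, score_a=1.0 if game % 2 == 0 else 0.0)
print(round(a), round(b))
```

The two ratings end up within a few points of each other (they keep wobbling by one game's worth of points), while their sum stays at 3000.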
Now going back to the scenario, consider the case where Blizzard releases a new patch which nerfs Zerg and makes it UP relative to both Protoss & Terran (e.g., drones now cost 60 minerals). Consider what will happen to the average Zerg player. He will start losing more than 50% of his games, and his MMR will start dropping. Because of his lower MMR, he’ll start playing against weaker opponents. Eventually, his MMR will stabilize at a level where he starts winning 50% of his future games.
Now let’s say Blizzard had assigned each race a rating as well, to track how “strong they think it is.” Suppose that before the patch, all the races were balanced & had equal rating. Immediately after the patch, because the Zerg population goes through a losing streak, the Zerg rating will drop.
But eventually, the Zerg players will have stabilized their MMR and start winning 50% of their games. At this point, because of the last bullet point in the rating system’s principles (ratings will converge at 50% win rates), the Zerg rating will start increasing again. Remember also that the stabilized Zerg players are playing against opponents of the same MMR, so there’s no way to “account for player skill.” Eventually, the Zerg rating will once again converge with the other races, even though Zerg is now UP.
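The scenario above can be played out numerically. The sketch below uses expected-value (noise-free) Elo updates and assumes, as the scenario does, that the race rating is updated from raw results without looking at player MMR. The 100-point handicap, the K-factors, and all names are made up for illustration; nothing here is Blizzard's actual formula.

```python
def win_prob(diff):
    """Elo-style win probability for a rating advantage of `diff` points."""
    return 1.0 / (1.0 + 10 ** (-diff / 400.0))


def simulate_nerf(handicap=100.0, games=3000, k_player=16.0, k_race=4.0):
    true_skill = 1500.0   # what the Zerg player would be rated if races were even
    mmr = 1500.0          # player's MMR; the matchmaker finds opponents at this MMR
    race_gap = 0.0        # Zerg race rating minus the opposing race rating
    history = []
    for _ in range(games):
        # Real chance of winning: effective strength vs. an opponent rated `mmr`.
        p_actual = win_prob((true_skill - handicap) - mmr)
        # Player rating: the system expected 50% because the MMRs were equal.
        mmr += k_player * (p_actual - 0.5)
        # Race rating: the system expected win_prob(race_gap) for Zerg.
        race_gap += k_race * (p_actual - win_prob(race_gap))
        history.append((mmr, race_gap))
    return history


history = simulate_nerf()
lowest_gap = min(gap for _, gap in history)
final_mmr, final_gap = history[-1]
print(f"final MMR {final_mmr:.1f}, lowest race gap {lowest_gap:.1f}, "
      f"final race gap {final_gap:.1f}")
```

Under these assumptions the race gap dips and then recovers toward zero while the player's MMR settles about 100 points lower, which is the convergence the scenario describes. A scheme that fed player MMR into the race expectation would behave differently, so this illustrates the argument rather than settles it.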
Based on this scenario, it seems impossible to determine whether a race is truly UP, using Blizzard’s rating system. Thoughts? Any ideas on how Blizzard could possibly be “accounting for player skill” in calculating race balance?
Indeed it is quite hard to judge balance through Blizzard's system. However, when a player starts winning so much that there isn't an opponent fit for him the system can't do anything but match that player with other people that are lower in skill. Now if you were to use this as a way to judge which race is better/easier/overpowered/whatever you want to call it:
The race with the most players excelling in terms of skill should be the easiest race (not necessarily, but a good assumption if seen on most realms)
This shows the top people in the world in terms of ladder points in descending order
Now I don't have time to do a huge analysis, so let's just take the top 25 from that list and break it down to how many of each race we find (server = Americas):
Zerg: 4
Protoss: 6
Random: 1
Terran: 14
It's clear that there are more terran players dominating the scene (i.e. terran players that are winning more often than losing, therefore consistently facing opponents worse than them)
My take on this... The way Blizzard has defined their balance equation leads to circular logic.
Chess is a good analogy to how badly Blizzard messed up. White has an obvious advantage and it is OP'ed...so say you had players who played exclusively either white or black. Then say Blizzard comes along to determine whether white is OP'ed. They say, we're going to compare the win/loss records of white players of equal skill level who have a 50/50 win percentage against black players who have a 50/50 win percentage against white players.
So Blizzard fixes their matchmaking system so white players are winning 50/50 against black players of 'equal skill level'. If a white player just keeps on winning...? Well then he was 'just that good' so we won't count him in the stats because we only like matches where the players are equal. For Blizzard to run a fancy formula, conclude that players of equal skill had a 50/50 win % against the opposite color, and from that determine that the colors are even is preposterously circular. They are using the adjusted win percentage both to determine skill and to determine matchups of even skill, and you can't double dip like that mathematically.
The one plus of the interview was Dustin referring to early/late game balance. THIS is what Blizzard should be focused on instead of the pseudo-mathematical formulas. If in PvZ, Protoss say wins 70% of the time when they rush Zerg under 10 minutes, but wins only 30% of the time after 10 minutes when Zerg macro kicks in, then this is an example of a serious 'fun balance' issue that is far more important than a win/loss balance issue. The game first and foremost has to be fun and interesting, and if it just degenerates into 'construct the perfect timing attack or die' because your race can't macro that well, then this makes all the minutes outside of that timing attack very boring. Blizzard messed up with larvae inject...it has made Zerg too boom or bust, unlike SCBW. Things like micro/positioning/skirmishes just don't happen that often because everything revolves around stupid timing attacks and whether they succeed or not, because Zerg is too exponential and all or nothing. Would love to see larvae inject nerfed and the Zerg units buffed to compensate, to make their growth more linear and thus make the other races easier to balance as well.
It may be instructive to consider how, in chess, the white pieces tend to have an advantage over the black pieces. Among weaker players, there's not much advantage. At higher levels, it starts to become a factor. Among top rated computers, white scores about 55%. However, in chess, players take turns playing white and black. It seems to me that the best way to do it for Starcraft would be to look at top-rated random race players. If there is a tendency for them to win with certain races, then I believe this would reveal something meaningful. To put it briefly, the best random race players should be immune to MMR's effect on their races' winnability. An obvious problem is that the random matchups are not totally analogous to the standard matchups, since the non-random player doesn't know his opponent's race beforehand.
I think the formula is too complicated. Im in gold but i pwn plats and diamonds(the occasional masters) like nothing. One time a diamond literally moved his army into mine w/o attacking then proceeded to BM me. No one is in leagues they belong, and its f*cking impossible to rank up...
On August 05 2011 11:16 BushidoSnipr wrote: I think the formula is too complicated. Im in gold but i pwn plats and diamonds(the occasional masters) like nothing. One time a diamond literally moved his army into mine w/o attacking then proceeded to BM me. No one is in leagues they belong, and its f*cking impossible to rank up...
I don't think this is a thread to whine about "why I am not in the league I deserve". There are too many of those out there. There are a lot of cheesy players in diamond and plat. Winning against them means nothing.
I feel that the current map pool in the ladder does not give us a good view of balance. It may make Zerg look underpowered compared to Terran. Blizzard should learn to solve one problem at a time. They have had huge complaints about the current map pool, and yet they refuse to budge. If they continue to give us such maps, we will never obtain a statistically accurate result.
the problem is blizzard is ok with their map pool being imbalanced. a quote i used in a blog
Our goal for these two formats is for players to be able to enjoy variety in the gameplay, rather than trying to provide an eSports level of game balance.
just shows that at the end of the day balance is a relative term.
blizzard are ok with their map pools being bad, and tourneys using different maps for competitive play. at the same time blizzard say they use 'ladder data' as a point of reference for balance decisions. they cant have it both ways
On August 05 2011 04:09 Ihpares wrote: - A system that ensures a 50% win rating not only in general, but race to race will hide imbalance by virtue of actively seeking that 50% regardless of skill level. This means two players of identical skill with two different races will both be at 50%, but will have very different MMRs if their respective races are imbalanced against one another.
But the distributions will be different over a large number of players in the imbalanced case. If it's harder to win as Zerg, Zerg players will have systematically lower MMRs, and that'll be visible in the distributions, unless you can make a convincing case that worse players choose Zerg.
- The law of large numbers means that in such a system, EVERY matchup should be EXACTLY 50%. ANY margin of error lends itself to either race, map or system imbalance. Remember, as the sample size becomes larger, the acceptable margin of error becomes smaller, and to call 2-5% acceptable is silly.
I said it before: If you're looking for a quantitative measure of player skill, you're right, but that's not what they're trying to do. They're trying to get close enough to 50% that the game feels fun while you're playing it. That's an enormously looser constraint.
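For a sense of scale on the margin-of-error point, the sampling error of an observed win rate can be sketched with the normal approximation to the binomial. The game counts below are made-up examples and the 1.96 multiplier assumes a 95% interval; this says nothing about how Blizzard computes its own margins.

```python
import math


def margin_of_error(games, win_rate=0.5, z=1.96):
    """Approximate 95% margin of error for a win rate observed over `games` games."""
    return z * math.sqrt(win_rate * (1.0 - win_rate) / games)


for n in (100, 10_000, 1_000_000):
    print(n, f"{margin_of_error(n) * 100:.2f}%")
```

Sampling noise shrinks with the square root of the number of games, so a persistent 2-5% gap over a very large sample is unlikely to be noise alone. Whether it comes from race, map, or matchmaking effects is a separate question.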
On August 05 2011 07:05 Wren wrote: ~Ladder stats are essentially meaningless. Blizzard's correction may undo the effect of the matchmaking system gradually, but cannot fix the fact that the matchmaking system generated the data in the first place. You cannot analyze the flaw out of a flawed data set.
The problem is that everyone who's been arguing that the data set is somehow "flawed" has been saying so without any reasoning or explanation behind it, other than completely misunderstanding or misrepresenting the impact of the matchmaking system on the data set.
Nobody in this thread knows what their data set is or exactly how they're analyzing it, so all the criticism of it is fantasy based on imagined details to fill in the blanks.
On August 05 2011 10:05 Fungal Growth wrote: So Blizzard fixes their matchmaking system so white players are winning 50/50 against black players of 'equal skill level'. If a white player just keeps on winning...? Well then he was 'just that good' so we won't count him in the stats because we only like matches where the players are equal. For Blizzard to run a fancy formula, conclude that players of equal skill had a 50/50 win % against the opposite color, and from that determine that the colors are even is preposterously circular. They are using the adjusted win percentage both to determine skill and to determine matchups of even skill, and you can't double dip like that mathematically.
If you assume that the distributions of actual player skill for each race are roughly similar, you can derive a mapping from one race's MMRs to another based on the MMR distributions of each race. Doing this requires the matchmaking system they're using. It depends on it completely.
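One way to read the mapping described above is quantile matching: a Zerg player at the 30th percentile of Zerg MMRs is treated as equal in skill to a Terran player at the 30th percentile of Terran MMRs. The sketch below uses made-up MMR lists and a hypothetical helper `quantile_map`; it rests entirely on the assumption that both races draw from the same skill distribution, and it is not Blizzard's published method.

```python
from bisect import bisect_left


def quantile_map(rating, source_mmrs, target_mmrs):
    """Map a rating to the target-race rating at the same population percentile."""
    source = sorted(source_mmrs)
    target = sorted(target_mmrs)
    rank = bisect_left(source, rating)            # players strictly below `rating`
    index = rank * len(target) // len(source)     # same percentile in the target list
    return target[min(index, len(target) - 1)]


# Made-up populations: every Zerg MMR sits 50 points below its Terran counterpart.
terran = list(range(1000, 2000, 100))
zerg = [mmr - 50 for mmr in terran]

zerg_rating = 1250
equivalent = quantile_map(zerg_rating, zerg, terran)
print(f"Zerg {zerg_rating} ~ Terran {equivalent}, gap {equivalent - zerg_rating}")
```

The recovered 50-point gap would then be read as the race handicap. If weaker players disproportionately pick one race, the same gap appears with no imbalance at all, which is why the assumption matters.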
Whether that's a reasonable assumption is something you'd have to look elsewhere to decide, but they do listen to anecdotal evidence from pros and the community, which is what you'd have to do to decide that. So, what's the problem?
On August 05 2011 11:16 BushidoSnipr wrote: I think the formula is too complicated. Im in gold but i pwn plats and diamonds(the occasional masters) like nothing. One time a diamond literally moved his army into mine w/o attacking then proceeded to BM me. No one is in leagues they belong, and its f*cking impossible to rank up...
Indirectly you bring up another point. Blizzard's bonus points reward you for playing...so weak players who play more outrank strong players who don't play as much. So when the 'matchmaker', as David Kim called it, tries to match players of equal ability, how can it ignore points gained from bonus wins? Yet another reason why Blizzard's fancy balance equations can't be trusted.
On August 05 2011 11:12 carwashguy wrote: At higher levels, it starts to become a factor. Among top rated computers, white scores about 55%.
Not to derail the thread too much, but that seems kind of low for white. In the last World Chess Championship, Anand vs. Topalov, black only won once out of 12 games...just goes to show that the higher the skill level, the better the player (at any game) can magnify his advantage.
On August 05 2011 12:19 KevinIX wrote: Bonus points have no effect on MMR.
Curious...so when the matchmaker determines players of equal skill level, it uses a completely separate point system than the public one (which is largely comprised of bonus points)? If that were the case this shadow rank and the public rank could have quite the disparity. If this shadow rank was so wonderful in determining the skill level of an opponent then why wouldn't Blizzard use it instead of the public ranking system?
This just crossed my mind. After looking at the statistical equation, the nature of the data set and what Ihpares said, I feel like balance and player skill follow the Uncertainty Principle.
It feels like you cannot really measure two quantities (player skill [MMR] and racial balance [win/loss ratio of players]) at the same time, so what you can do is estimate the data as close to the actual data as possible. Just as we can only calculate the probability density of electron(s), we might only be able to calculate a probability for the racial balance data by using the skill-adjusted system (similar to using a basis set).
Or maybe I just review too much Physical Chemistry.
On August 05 2011 12:19 KevinIX wrote: Bonus points have no effect on MMR.
Curious...so when the matchmaker determines players of equal skill level, it uses a completely separate point system than the public one (which is largely comprised of bonus points)? If that were the case this shadow rank and the public rank could have quite the disparity. If this shadow rank was so wonderful in determining the skill level of an opponent then why wouldn't Blizzard use it instead of the public ranking system?
Because the MMR does not really increase over time once a player reaches a certain point. So it does not give positive reinforcement (no visible progress), psychologically speaking, and would lead many players to stop playing the ladder. Ladder points, however, always increase and can change abruptly, so they give a sense of progression.
My question is why do they think statistics from lower leagues matter as much as higher ones? To put it bluntly, everyone is so bad in bronze that race imbalances aren't what's going to decide the winner, plain and simple (contrary to plat and above, where people don't get supply blocked at 11).
On August 05 2011 12:22 Fungal Growth wrote: Not to derail the thread too much, but that seems kind of low for white. In the last World Chess Championship, Anand vs. Topalov, black only won once out of 12 games...just goes to show that the higher the skill level, the better the player (at any game) can magnify his advantage.
You have to consider what is meant by "score." You have to add half of the draw percentage to white's win percentage (and the other half to black's). White won four, black won one, and the other seven were drawn. That means white won 33.33%, 58.33% were drawn, and black won 8.33%. That means white scored 62.5% (33.33 + 58.33/2). In any case, it's best to take the computer engines' score of 55%, since they're way better than grandmasters.
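The same arithmetic, written out as a small check. The game counts (4 wins for white, 1 for black, 7 draws) are the ones given above; the function name is just for illustration.

```python
def white_score(white_wins, black_wins, draws):
    """Score for white: a win counts 1 point, a draw counts half a point."""
    games = white_wins + black_wins + draws
    return (white_wins + 0.5 * draws) / games


print(f"{white_score(4, 1, 7):.1%}")  # 62.5%
```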
Indeed it is quite hard to judge balance through Blizzard's system. However, when a player starts winning so much that there isn't an opponent fit for him the system can't do anything but match that player with other people that are lower in skill. Now if you were to use this as a way to judge which race is better/easier/overpowered/whatever you want to call it:
The race with the most players excelling in terms of skill should be the easiest race (not necessarily, but a good assumption if seen on most realms)
This shows the top people in the world in terms of ladder points in descending order
Now I don't have time to do a huge analysis, so let's just take the top 25 from that list and break it down to how many of each race we find (server = Americas):
Zerg: 4
Protoss: 6
Random: 1
Terran: 14
It's clear that there are more terran players dominating the scene (i.e. terran players that are winning more often than losing, therefore consistently facing opponents worse than them)
Now if you look at this data and also look at winning percentage of these top players:
Highest of protoss: 79%
Highest of zerg: 75%
Highest of terran: 90%
All in all (through this small amount of data, obviously more would be more accurate) it seems:
- Zerg is hardest to reach high level of play with
- Terran is most abundant, and also highest winning ratio
In my opinion, this (with more data) is a very good way to show imbalance within the game.
Bias aside, feedback?
Basing any conclusions on statistical data without any sort of null hypothesis testing is horrible. There will always be patterns in data, even if it is entirely random, so you cannot draw conclusions this way.
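As a rough illustration of what such a test could look like, here is a chi-square goodness-of-fit check on the quoted top-25 counts (Zerg 4, Protoss 6, Terran 14, leaving out the one Random player). The null hypothesis of equal race shares is an assumption made only for the example; a real test would use the actual race distribution on the ladder, which is not given in this thread. With 2 degrees of freedom the chi-square p-value has the closed form exp(-x/2), so no statistics library is needed.

```python
import math


def chi_square_equal_shares(counts):
    """Chi-square statistic against the null that every category is equally likely."""
    expected = sum(counts) / len(counts)
    return sum((observed - expected) ** 2 / expected for observed in counts)


top25 = {"Zerg": 4, "Protoss": 6, "Terran": 14}   # counts quoted in the thread
statistic = chi_square_equal_shares(list(top25.values()))
p_value = math.exp(-statistic / 2)                 # exact only for 2 degrees of freedom
print(f"chi2 = {statistic:.2f}, p = {p_value:.3f}")
```

That comes out to a chi-square of 7.0 and a p-value of roughly 0.03 under the equal-shares assumption. A sample of 24 from a single server is small, and if more strong players simply choose Terran the null is wrong from the start, so this is a starting point rather than evidence of imbalance.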
First, there seem to be a lot of people in this thread who are emotionally invested in saying that Blizzard doesn't know what they're doing despite not having access to what they are doing. Maybe this is because they nerfed your favorite unit. Maybe this is because you don't like their map designs. Maybe it's because Dustin Browder (who isn't deeply involved in game balance) made some comment about the metagame that you think sounded dumb. In any case, having that kind of emotional investment in arguing they're wrong regardless of what they're doing just isn't rational.
Second: While we know from their statements that various in-game statistics and build order information aren't used by the matchmaking system directly, it's quite possible that such information might be a way to adjust for player skill, at least in gross ways. Without a statement from them on whether they're using that data in that way, it's impossible to tell. If they were, of course, it could open up all kinds of arguments about whether their skill-adjusted data were valid, but it's possible that it could be used in some ways that might be insightful.
On August 05 2011 12:13 Fungal Growth wrote: So when the 'matchmaker' as David Kim called it tries to match players of equal ability, how can it ignore points gained from bonus wins?
On August 05 2011 12:19 KevinIX wrote: Bonus points have no effect on MMR.
Curious...so when the matchmaker determines players of equal skill level, it uses a completely separate point system than the public one (which is largely comprised of bonus points)? If that were the case this shadow rank and the public rank could have quite the disparity. If this shadow rank was so wonderful in determining the skill level of an opponent then why wouldn't Blizzard use it instead of the public ranking system?
Forgive me for saying that you have no business posting in this discussion at all if you have not read and understood this post, which, incidentally, is a sticky in this forum:
That post's information is not speculation. Excalibur_Z has direct contact with the developers on these matters. They don't share all the details but they do confirm what he's written there. (That post, btw, is also a sticky on the Battle.net forums, which attests to its accuracy.)
On August 04 2011 11:22 seaofsaturn wrote: The whole purpose of differential equations is to measure things that are constantly changing...
Here is the differential equation from the video:
If you can't make sense of that (I can't!) then I don't know why you're trying to criticize them. The percentages are just simplified representations to present the data to people who aren't math majors, you can't really use them to support random theories.
It's funny they put up some supposedly insane math equation (it's not, you need one year of calculus to understand it), but they don't tell you what anything represents; theta, beta, gamma? Equations are meaningless if that information is omitted. Lol at Blizzard trying to appear transparent, patronization at its best, imo.
"We'll spare you the details, but these are the percentages", sketchy.