Blizzard's "skill-adjusted-win-percentages" - Page 2
whacks
25 Posts
| ||
McFortran
United States79 Posts
Consider a hypothetical situation with 3 races, X, Y, and Z, and 3 players, x, y, and z, of these respective races. x is highly skilled, y is moderately skilled, and z has low skill. X is vastly underpowered, Y is relatively balanced, and Z is vastly overpowered. Suppose this balances in such a way that x, y, and z each have a 50% win ratio against each other. According to Blizzard, all 3 players are of equal skill due to their win rates (equal win rate against players of equal MMR means their MMR will be equal and therefore their skill will be equal), and since players of equal skill have equal win rates, the game must be balanced. This of course contradicts our hypothesis regarding balance. Regardless of the underlying math, these statistics appear to be flawed based on this (they make assumptions that are likely untrue). | ||
h0oTiS
United States101 Posts
| ||
dhe95
United States1213 Posts
| ||
thenexusp
United States3721 Posts
On August 04 2011 09:13 McFortran wrote: I feel like a large issue is that Blizzard is measuring the skill of a player as the success of a player. Consider a hypothetical situation with 3 races, X, Y, and Z, and 3 players, x, y, and z, of these respective races. x is highly skilled, y is moderately skilled, and z has low skill. X is vastly underpowered, Y is relatively balanced, and Z is vastly overpowered. Suppose this balances in such a way that x, y, and z each have a 50% win ratio against each other. According to Blizzard, all 3 players are of equal skill due to their win rates (equal win rate against players of equal MMR means their MMR will be equal and therefore their skill will be equal), and since players of equal skill have equal win rates, the game must be balanced. This of course contradicts our hypothesis regarding balance. Regardless of the underlying math, these statistics appear to be flawed based on this (they make assumptions that are likely untrue).

I think one underlying assumption is that there will be some highly skilled players who play the overpowered race, which is not that big an assumption to make. A highly skilled player on an overpowered race would reach an MMR that nobody could get to otherwise, and that would show up in the tool. | ||
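The two positions above can be put side by side in a small sketch. It assumes, purely for illustration, that observed MMR is skill plus a fixed race offset; the offsets, the skill values, and the names `RACE_OFFSET` and `observed_mmr` are all hypothetical and are not anything Blizzard has described.

```python
# Hypothetical race offsets in MMR points. The additive model is an
# assumption made for this illustration only.
RACE_OFFSET = {"X": -200, "Y": 0, "Z": 200}


def observed_mmr(skill, race):
    """MMR the ladder would see under the assumed additive model."""
    return skill + RACE_OFFSET[race]


# McFortran's trio: high skill on the weak race, low skill on the strong one.
trio = {"x": (2200, "X"), "y": (2000, "Y"), "z": (1800, "Z")}
trio_mmr = {name: observed_mmr(skill, race) for name, (skill, race) in trio.items()}

# thenexusp's point: give every race the same range of skills and compare
# the highest MMR each race can reach.
skills = [1800, 2000, 2200]
top_by_race = {
    race: max(observed_mmr(s, race) for s in skills) for race in RACE_OFFSET
}

print(trio_mmr)     # {'x': 2000, 'y': 2000, 'z': 2000}
print(top_by_race)  # {'X': 2000, 'Y': 2200, 'Z': 2400}
```

Under these assumptions the three hand-picked players are indistinguishable, but once every race has players across the same skill range, the overpowered race tops out higher. Whether real skill ranges are the same across races is exactly what is disputed in this thread.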
Micket
United Kingdom2163 Posts
On August 04 2011 09:32 h0oTiS wrote: I don't think they take into account when (gametime) a certain race wins, although this is only important in masters or diamond league because of all the cheese in the lower league, but something could be found like that zergs win 70% in the late game or something like that, that you just can't feel by yourself, obviously the 70% win rate in the late game is not true but something like that could arise.

Define late game. I have seen TvZ matches where both players are barely mining and it has gone on for ages. Is Nestea vs sc late game? I certainly saw no hive tech. Was Destiny holding off a push with infestors and waiting 20 minutes to break a Terran turtle considered late game? How many bases does Zerg have? How many does Terran have? Did DRG vs MVP reach late game? The Metalopolis game was 5 base vs 4 base; is that late game? It is too abstract a term to be defined by game time. Some would say late game is brood lord/corruptor/infestor vs marine/tank/medivac/viking/ghost. But does rushing to brood lords count as late game then? Nestea had 15-minute brood lords vs Ensnare. | ||
maddogawl
United States63 Posts
On August 03 2011 14:47 whacks wrote: You bring up a good point, which is that things get tricky in the case where ZvP is balanced, but not ZvT. Check out the link by Omni earlier in the thread. The post is regarding this very topic & analyzes it very well. However, your analogy breaks down if Zerg is equally overpowered against both Terran and Protoss. Suppose the Zerg player deserves to be at 1900, but because of the racial imbalance, has a 50-50 chance of beating 2100 Terran & Protoss players. His MMR will then rise to 2100 because of the way the ELO system works (in fact, it would never have stabilized at 1900 to begin with). Because of this, when Blizzard looks at ladder results, all they see is a bunch of 2100 Zergs, 2100 Terrans, and 2100 Protoss with 50-50 win rates. This is exactly the point I'm trying to get across. If ZvT is unbalanced but ZvP is balanced (or unbalanced the other way), then you will see the results reflected in the statistics. But if one race is equally UP or OP against both other races, then all the ladder statistics would become absolutely meaningless.

This is not true. If the Zerg is at 1900 and the Terran at 2100, and Zerg is favored against Terran but disfavored against Protoss, there's no way the Zerg will stabilize at 2100. So in your example:

1900 Zerg vs 2100 Terran will be 50/50
1900 Zerg vs 1700 Protoss will be 50/50
1900 Zerg vs 1900 Zerg will be 50/50

So Zerg remains at roughly 1900, even though the Terran is of higher skill. I've studied how TrueSkill works on the Xbox 360, and a single game out of a ton of games played has a very small impact on the change in skill, to the point that the Zerg might go up to 1905 and the Terran down to 2095 (if Zerg wins). And that is with a small pool of games played; with more games the change will be even smaller. | ||
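Both scenarios in this exchange can be checked with the standard Elo expected-score formula. This is a sketch under stated assumptions: Blizzard's MMR is not public, so plain Elo with a K-factor of 16 stands in for it, the opponent mix is one third of each race, and the function names are made up for this example.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expected score of player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def expected_rating_change(rating, opponents, k=16):
    """Average rating change per game.

    opponents is a list of (opponent_rating, true_win_probability) pairs,
    each faced equally often. k=16 is an assumed K-factor.
    """
    total = 0.0
    for opp_rating, true_win_prob in opponents:
        total += k * (true_win_prob - expected_score(rating, opp_rating))
    return total / len(opponents)


# maddogawl's case: favored vs Terran, disfavored vs Protoss, even in the mirror.
mixed = expected_rating_change(1900, [(2100, 0.5), (1700, 0.5), (1900, 0.5)])

# whacks's case: equally favored vs both other races.
uniform = expected_rating_change(1900, [(2100, 0.5), (2100, 0.5), (1900, 0.5)])

print(mixed)    # essentially zero: the rating has no reason to move
print(uniform)  # positive: the rating keeps drifting upward
```

With these numbers the mixed case has no net drift, so the Zerg stays near 1900, while the uniform case gains a little under 3 points per game on average and keeps climbing. Under the Elo assumption, each poster's description holds for the scenario that poster had in mind; the disagreement is over which scenario applies.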
bamman1108
United States35 Posts
| ||
Lysenko
Iceland2128 Posts
On August 03 2011 13:07 whacks wrote: But eventually, the Zerg players will have stabilized their MMR and start winning 50% of their games. At this point, because of the last bullet point in the rating system’s principles (ratings will converge at 50% win rates), the Zerg rating will start increasing again.

This is the fallacy in your argument. If Zerg players start losing, their ratings fall until they stabilize at 50/50 win/loss ratios against other-race players with lower MMRs, and their ratings then stay at that lower MMR. They won't start increasing again without the players getting better or another change to the game.

The way you adjust for skill is to look at the overall MMR distribution among each race's population. If one race, let's say Zerg, has a population distribution that's weighted toward lower MMRs, chances are it's the race that's doing it, unless there's some external indication that better players systematically favor the other races for some reason. | ||
Lysenko
Iceland2128 Posts
On August 03 2011 14:09 whacks wrote: If Blizzard is doing this, then what they're basically doing is comparing the average zerg player's MMR with the average Terran player's MMR. This approach can break for so many reasons, which I'm not going to get into now.

Why don't you get into it? I argue that it's an absolutely valid approach, particularly if one evaluates differences in the entire distribution and not just, say, an average. | ||
Lysenko
Iceland2128 Posts
On August 03 2011 14:37 whacks wrote: What if the imbalance has existed ever since game launch? In this case, there is no adjustment at all for Blizzard to track. In fact, any fixes that Blizzard puts in to fix the imbalance, will only show an upward shift for the previously-UP race & a downward-shift for the previously-OP race. The point I'm trying to get across: If Blizzard's statistics don't work even in the ideal case where player skill & metagame is static and one race is very obviously UP, then of what possible value are they?

Well, we don't really know what they're doing exactly, so it's difficult to evaluate (and equally difficult to discount). However, I don't see why looking at the MMR distributions of race populations wouldn't show a static imbalance just as easily as one that was the result of a new adjustment to the game. If one race's MMR distribution is shifted lower on the MMR axis, has a thinner tail at the low or high end, or has a shape other than a normal distribution, that all points pretty clearly to some type of imbalance, without regard to how or when it was introduced to the game. | ||
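A rough sketch of the kind of comparison described here: summarize each race's MMR distribution by its center and by the share of players in the upper tail. The data is synthetic and every number is made up; the 50-point Zerg shift, the 2400 cutoff, and the `summarize` helper are assumptions for illustration and say nothing about the real ladder.

```python
import random
import statistics


def summarize(mmrs, top_cutoff):
    """Return mean, median, and the share of players at or above top_cutoff."""
    top_share = sum(1 for m in mmrs if m >= top_cutoff) / len(mmrs)
    return {
        "mean": statistics.mean(mmrs),
        "median": statistics.median(mmrs),
        "top_share": top_share,
    }


# Synthetic populations only: normal distributions with an assumed
# 50-point downward shift for one race.
rng = random.Random(0)
population = {
    "Terran": [rng.gauss(2000, 200) for _ in range(10000)],
    "Protoss": [rng.gauss(2000, 200) for _ in range(10000)],
    "Zerg": [rng.gauss(1950, 200) for _ in range(10000)],
}

summaries = {race: summarize(mmrs, 2400) for race, mmrs in population.items()}
for race, s in summaries.items():
    print(race, round(s["mean"]), round(s["median"]), round(s["top_share"], 4))
```

In this toy data the shifted race shows a lower mean and a thinner high-end tail regardless of when the shift was introduced. The comparison only attributes the shift to the race if the underlying talent distributions are assumed equal, which is the assumption questioned elsewhere in the thread.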
Lysenko
Iceland2128 Posts
On August 04 2011 09:13 McFortran wrote: Regardless of the underlying math, these statistics appear to be flawed based on this (they make assumptions that are likely untrue).

Your argument is true if your sample size is one of each race. However, when you look at tens of thousands of players, why would you argue it's not a good assumption that each race attracts a similar (and normal) distribution of player talent? | ||
Lysenko
Iceland2128 Posts
On August 04 2011 09:36 dhe95 wrote: So blizzard is matching up based on both player win percentage and race win percentage? That should mean that the win percentage of all the races should also eventually equal 50% given an infinite number of games played, not because of any map/balance success but instead because of a bad matchmaking algorithm.

You've misunderstood. The video is discussing not how their matchmaking works, but how they evaluate whether the races are balanced against each other post-matchmaking.

The matchmaking system assigns a score that numerically predicts the likelihood of that player beating another prospective player. Then, it strives to find other players who are likely to have a 50/50 win percentage vs. the player who hit the button, without regard to race. | ||
Lysenko
Iceland2128 Posts
On August 04 2011 10:32 bamman1108 wrote: I like that part where they're satisfied with 5% differences in W/L when that percent is based off millions of matches. Even a 1% difference with that many matches means that one race very, very significantly favors the other. Wtf are they talking about when a 55% win rate for a specific race matchup is just "borderline?"

It may be significant in a statistical sense, but remember that the basic problem they're trying to solve is not achieving statistically-insignificant balance but balance that's sufficient for the game to remain entertaining. So, 55% being tolerable is a judgement about how much imbalance will take away from the entertainment value of the game, not a judgement about whether the game is the optimal measurement of the player.

That said, those numbers in their presentation have shifted around a bunch since then, and I think the fact that all races have had success in major tournaments is a pretty good sign that any imbalances that exist are not having a huge impact on the competitive scene. | ||
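The statistical half of this exchange can be made concrete with a one-sample z-score against a 50% null. This is a minimal sketch that treats games as independent coin flips, which real ladder games are not exactly; the one-million game count is an assumed stand-in for "millions of matches", and `z_score` is a name made up here.

```python
import math


def z_score(win_rate, games, null=0.5):
    """Standard errors between an observed win rate and the null win rate."""
    standard_error = math.sqrt(null * (1.0 - null) / games)
    return (win_rate - null) / standard_error


games = 1_000_000  # assumed sample size for illustration
for rate in (0.51, 0.55):
    print(rate, round(z_score(rate, games), 1))
```

With a million games, a 51% win rate is about 20 standard errors from 50% and a 55% win rate is about 100, so both are far beyond any usual significance threshold. That supports the narrow statistical claim; it says nothing about how large an imbalance has to be before the game stops being fun, which is the separate judgement described in the reply.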
McFortran
United States79 Posts
On August 04 2011 11:11 Lysenko wrote: Your argument is true if your sample size is one of each race. However, when you look at tens of thousands of players, why would you argue it's not a good assumption that each race attracts a similar distribution of player talent?

If you look at Korea, it would appear that there are significantly more Terran players at the highest level than players of other races. So I don't think the assumption is particularly good in theory or in practice. | ||
seaofsaturn
United States489 Posts
Here is the differential equation from the video: [image of the equation, not preserved]

If you can't make sense of that (I can't!) then I don't know why you're trying to criticize them. The percentages are just simplified representations to present the data to people who aren't math majors; you can't really use them to support random theories. | ||
Nerski
United States1095 Posts
Anyway, back on topic: the way they balance the ladder is not with the highest level in mind and probably never will be. Things they do, or balance for, that are counterintuitive to actual balance:

- Constantly add maps that are unique and fun. The reality is they force you to play in X way, assuming your race is capable of that.
- They balance for the lowest level as well as the highest level. Balancing for the fact that, say, Zerg bronze players are terrible at X isn't a good way to balance a game, even if it means bronze Zergs lose a lot and Master / GM Zergs don't.
- They balance based on raw statistics with 'player skill' factored in, but due to potentially bad maps, sheer random luck, poor positions, or possibly unknown balance factors, there is frankly no humanly possible way to use a formulaic approach to determine balance on the ladder.

A perfect example of the above: if, say, Zerg was OP, but only slightly, and Zerg players were not truly better than, say, T or P players, the game would still appear balanced simply because the stats say so. The same goes for any factor such as maps. Maybe the maps make P struggle, but the race is OP, so they win despite the bad maps; then the maps change and all of a sudden P rolls everyone. Essentially, unless the game was perfectly balanced already, there would be no way to account for players' actual skill in a formula. | ||
FXOjEcho
Canada318 Posts
| ||
Lysenko
Iceland2128 Posts
On August 04 2011 11:20 McFortran wrote: If you look at Korea, it would appear that there are significantly more terran players at the highest level than other races. So I don't think the assumption is particularly good in theory or in practice.

The thing is, that simple fact doesn't say much. In the absence of any other information it could mean:

* That Terran is imbalanced for game design reasons for players at the top level of skill.
* That players at the top level of skill favor Terran because it plays in a way that rewards that skill for reasons other than a competitive advantage (for example a playstyle that makes them feel like they're making use of their skills).
* That the skill distributions of all three races are identical but for some reason Korean players favor Terran, thus all levels of the game are more populated with Terrans.

Edit, sorry, I left an important one out:

* That the most skilled players in Korea have decided among themselves that Terran is favored by imbalances, when in fact it's well-balanced.

The video the OP linked explains that when the numbers suggest something, they go looking through other sources of evidence (pro player feedback, community feedback, their own play experiences, tournament replay analysis, results from testing tools) to try to distinguish between possible causes. Sounds to me like a very reasonable approach.

As for how they create their "skill-adjusted win percentages," it's hard to criticize without a specific understanding of what they're doing. However, I do know that the Battle.net 2.0 team employs at least a few people with a more rigorous statistical background than most of the posters on Team Liquid's site.

My point, of course, is to say that it's completely invalid to criticize any statistical analysis without understanding the details of what they're doing. | ||
NATO
United States459 Posts
Of course that something is more likely to be a correlation with player skill or other factors such as race potential, rather than raw racial strength. Furthermore, once you account for the metagame, this will shift over time as races become better understood than they were before. Because of this, there is no way for Blizzard to know actual balance by this means.
| ||