Scientific proof that SC2 is imbalanced (sorta) - Page 10

GagnarTheUnruly
Profile Joined July 2010
United States, 655 Posts
Last Edited: 2010-08-18 23:20:55
August 18 2010 23:12 GMT
#181
I was wondering if you had the p-value for the chi-square test, I didn't see a null hypothesis or whether or not the results were statistically significant..


p-values are nearly zero for all leagues except silver, where it is above 0.2 or 0.3 if memory serves. That's with a n = ~50 million.

Stats minor here. I redid (I hope) your results with the numbers you put in here, with a chi-square GOF test for homogeneity. In my findings, the only league that is statistically imbalanced is bronze, based solely on the W/L you reported (the p-value at the end was about .12, higher than the .05 I set as the requirement for the null hypothesis to be overruled). From Silver to Diamond, it didn't go above .03, which essentially means that after bronze, no race has a W/L record that is significantly different from any other race.


Unless I'm misunderstanding something, that's the test I've done, but with a different sample size. If you use percentages as your counts, then your sample size is 100, unless you've adjusted them for the actual sample size (as I have). For a chi-square test it's important that the correct sample size be used.
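
To make the sample-size point concrete, here is a rough Python sketch of a chi-square test of homogeneity on a races x (win, loss) table. The win rates and game counts are made up for illustration (they are not the sc2ranks numbers), and the closed-form p-value only works because three races give 2 degrees of freedom.

```python
import math

def chi_square_win_loss(wins, losses):
    """Chi-square test of homogeneity on a races x (win, loss) table.

    Returns (statistic, p_value). The closed form exp(-x / 2) for the
    p-value is only valid for 2 degrees of freedom, i.e. three races.
    """
    if len(wins) != 3 or len(losses) != 3:
        raise ValueError("this sketch only handles three races (df = 2)")
    total_w, total_l = sum(wins), sum(losses)
    total = total_w + total_l
    stat = 0.0
    for w, l in zip(wins, losses):
        n = w + l
        exp_w = n * total_w / total  # expected wins if all races share one win rate
        exp_l = n * total_l / total  # expected losses under the same assumption
        stat += (w - exp_w) ** 2 / exp_w + (l - exp_l) ** 2 / exp_l
    return stat, math.exp(-stat / 2.0)

# Hypothetical win rates of 51%, 50% and 49%, entered as raw percentages (n = 100 per race)
stat_pct, p_pct = chi_square_win_loss([51, 50, 49], [49, 50, 51])

# The same win rates, scaled to a hypothetical 10 million games per race
scale = 100_000
stat_big, p_big = chi_square_win_loss(
    [51 * scale, 50 * scale, 49 * scale],
    [49 * scale, 50 * scale, 51 * scale],
)
```

With percentages as counts the statistic is tiny and nothing looks significant; with the scaled counts the same win rates give a huge statistic and a p-value that is effectively zero.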

@ all the people who think my results are meaningless due to matchmaking:

I've said this a lot by now, but I'll state it again for the last time here:

I have reasoned that if the game is imbalanced, that imbalance must manifest as either 1) a difference in win ratios for the different races or 2) a difference in race prevalence as you increase player skill level, except under one of several unlikely scenarios and one likely one.

The degree to which it shifts from condition 1) to condition 2) depends on the strength of the matchmaking system. Since we don't see 1) (as my data show), and we don't see 2) (as I have said and escapeartist has shown), we can conclude that the game is balanced, at least for regular league play.

The unlikely scenarios:

a. Blizzard's matchmaking system is wise to racial imbalances, and chooses lower level opponents for a player of a given ranking if they play as a weak race vs. a strong race. The only reason they would do this would be to 'hide' racial imbalances from the player and/or the community.

b. Blizzard's matchmaking system does nothing, and each league is a random sample of the regional player population.

c. People have no race loyalty, and randomly pick their race before each match.

The likely scenario:

d. The races are balanced overall but matchups are imbalanced, in a rock-paper-scissors fashion. I favor protoss, and I really feel that I struggle against Terran.
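
A quick sketch of scenario d, assuming a made-up 60/40 rock-paper-scissors table (Terran over Protoss, Protoss over Zerg, Zerg over Terran) and equal race prevalence. None of these numbers come from the data; they only show how lopsided matchups can still average out to 50% per race.

```python
RACES = ("T", "P", "Z")

# Hypothetical matchup win probabilities: (winner side, opponent) -> P(win)
WIN_PROB = {("T", "P"): 0.60, ("P", "Z"): 0.60, ("Z", "T"): 0.60}

def win_prob(race, opponent):
    """Probability that `race` beats `opponent` in this toy table."""
    if race == opponent:
        return 0.5  # mirror matchups are even by definition
    if (race, opponent) in WIN_PROB:
        return WIN_PROB[(race, opponent)]
    return 1.0 - WIN_PROB[(opponent, race)]

def overall_win_rate(race, prevalence):
    """Average win rate of `race` against opponents drawn by prevalence."""
    return sum(prevalence[opp] * win_prob(race, opp) for opp in RACES)

equal_mix = {race: 1.0 / 3.0 for race in RACES}
overall = {race: overall_win_rate(race, equal_mix) for race in RACES}
```

Every race ends up at an overall 50% even though no non-mirror matchup is even, which is why overall win rates alone can't rule this scenario out.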

@ all the people who think the test is inappropriate because I haven't modeled enough variables that affect win rate

I don't have access to data that will allow me to do that. I'd like to, but I can't. In science when that's the case, you have to look for other ways to test a question. In my case, I initially reasoned that an imbalance would lead to a difference in win rates among races. People immediately pointed out that that wasn't the case, due to matchmaking. However, I then realized that the matchmaking system would force weak races into the lower leagues.

I checked to see if that happened and amended my analysis with a graph showing that it doesn't. Escapeartist has since analyzed this in more detail and come to the same conclusion, although nobody to my knowledge has done an analysis for lower league play. People have shown, however, that it's not true for the top hundred or so players in each region.

@ the people who think stats are useless

It's been shown before that stats are a much better way of assessing the truth than anecdotal knowledge. Even experts often have misperceptions, and misperceptions often produce feedback loops. Stats are at least partially resistant to this.

That said, I think opinions and impressions of top-level players (IdrA's thoughts on high level ZvT matchups, e.g.) still warrant attention, and consistently held beliefs warrant scientific investigation. In fact, that's what I did with respect to win rates for league play!

Finally, thanks everyone for your interest! I'll keep trying to answer questions but I know I'll miss some and for that I apologize.
TanGeng
Profile Blog Joined January 2009
Sanya, 12364 Posts
Last Edited: 2010-08-19 00:02:02
August 18 2010 23:48 GMT
#182
Ugh... stats are good for evidence of phenomena. They aren't proof of phenomena. Stats are better at disproving BS theories than they are at proving theories. They are also better at capturing the empirical outcome of an unknown phenomenon without explaining why.

All your stats have shown is that the populations of zerg, protoss, and terran favorite players have similar win rates at various levels, and there is an interesting distribution of races across all the leagues. These populations of players aren't the same, so it doesn't show anything about the underlying skill level of players in those leagues, nor does it eliminate selection biases for picking races or the possibility of varying levels of imbalance at different skill levels.

For example:
It might take a certain type of mind to play zerg well and not all players are suited to it, or the average zerg might have to be better, or the learning curve for zerg is easier but current skill ceiling is lower.

Everything in statistics is retrospective and describes only the past state of affairs. It only captures up to the current state of the SC2 metagame. In fact, the statistics could be hopelessly outdated if there is a sharp change in the metagame out there. Moreover, they don't say anything about the imbalances that would show up should players figure out how to play optimally.
Moderator. We are a down-to-earth club run on a sponsorship model
febreze
Profile Joined April 2010
167 Posts
August 18 2010 23:52 GMT
#183
On August 17 2010 07:18 StarcraftGuy4U wrote:
None of these stats are worthwhile because the matchmaking system does not assign people like they would in a blind study, instead it is actively adjusting the matches so that every player reaches 50%. The numbers you are pulling are worthless for this reason.


It's been said before, but due to faulty assumptions made by the author, this study must be redone to be relevant.
Beauty in truth, deception with dogma, meaning through life.
cocosoft
Profile Joined May 2010
Sweden, 1068 Posts
August 19 2010 00:10 GMT
#184
For those who say that the matchmaking system pushing players to 50-50 makes the data the guy is using incorrect, READ what he wrote:
On August 17 2010 07:13 GagnarTheUnruly wrote:
People have pointed out that matchmaking would cause this to happen, because it strives to set each player's win rate at 50%. That in turn would cause the win rate of each race to trend towards 50%. That being the case, poor balance would tend to result in 'weak' races getting pushed into the lower tiers of play. Because we don't see that happening either within or among leagues (data not shown), my data suggest both that the matchmaking system works well and that SC2 is inherently pretty well balanced.
Agreeing with this.
¯\_(ツ)_/¯
GagnarTheUnruly
Profile Joined July 2010
United States, 655 Posts
Last Edited: 2010-08-19 00:57:55
August 19 2010 00:44 GMT
#185
On August 19 2010 08:48 TanGeng wrote:
Ugh... stats are good for evidence of phenomena. They aren't proof of phenomena. Stats are better at disproving BS theories than they are at proving theories. They are also better at capturing the empirical outcome of an unknown phenomenon without explaining why.


Phenomena are proof of themselves. Mechanisms require validation. The only thing you need to do to prove that different races have different win rates is to show that they do (and they do... 56.06% is different from 55.56%). The question the statistics answer is: is the difference due to chance? Specifically, they estimate the chance that a test statistic at least as extreme as the observed one (the chi-square value representing the standardized difference between observed and expected win rates in this case) could occur by chance if the null hypothesis were true. In this case, we take a p-value of something on the order of 1e-12 to be sufficient evidence that the difference is not due to chance, so we can feel comfortable saying it with certainty. The difficulty comes with interpretation and generalization. These require reasoning and logic, and careful consideration of all possible explanations. Such analysis often leads to additional questions, as we've witnessed in this thread. It's the scientific process in a nutshell.
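
For anyone who wants to see the "is it due to chance?" step spelled out, here is a minimal Python sketch using a two-proportion z-test instead of the chi-square test from the analysis (a simpler stand-in, not the test that was actually run). The two win rates are the ones quoted above; the game counts are assumptions picked purely for illustration.

```python
import math

def two_proportion_p_value(rate_a, rate_b, games_a, games_b):
    """Two-sided p-value for H0: both races share one true win rate."""
    wins_a, wins_b = rate_a * games_a, rate_b * games_b
    pooled = (wins_a + wins_b) / (games_a + games_b)
    std_err = math.sqrt(pooled * (1.0 - pooled) * (1.0 / games_a + 1.0 / games_b))
    z = (rate_a - rate_b) / std_err
    # Two-sided tail probability of a standard normal beyond |z|
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical sample sizes: a thousand games per race vs. a million games per race
p_small = two_proportion_p_value(0.5606, 0.5556, 1_000, 1_000)
p_large = two_proportion_p_value(0.5606, 0.5556, 1_000_000, 1_000_000)
```

The same half-point gap is indistinguishable from noise at a thousand games per race and overwhelmingly significant at a million, which is all the p-value is saying.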

It's a common misconception that science and statistics can't prove anything. I like a statement I just read in an intro plant ecology textbook:

"The popular image of the scientific method portrays it as a process of falsifying hypotheses. This approach was codified by... Karl Popper (1959). In this framework we are taught that we can never prove a scientific hypothesis or theory. Rather, we propose a hypothesis and test it; the outcome of this test either falsifies or fails to falsify the hypothesis. While hypothesis testing and falsification is an important part of theory testing, it is not the whole story, for two reasons."

First, the approach "fails to recognize knowledge accumulation." The author goes on to say that although in a strictly philosophical sense scientific knowledge can never be known with absolute certainty, "we also recognize that some knowledge is so firmly established and bolstered by so many facts that the chance that we are wrong is very much less than the chance of winning the lottery several times in a row." It's important to note that we can estimate with some accuracy what our chance is of being correct.

I would also add that even when we use statistics to 'disprove' a hypothesis, that falsification is associated with its own probability level. In actual practice, it's much easier to show that patterns exist and processes happen than it is to show the opposite, because if a pattern is not found it's generally impossible to know if it's because it shouldn't be found or because the research approach was inadequate.

Second, science often isn't concerned with falsification, and instead asks questions about the relative importance of processes, and this doesn't fit the Popperian framework. This is generally better science anyways.

- from Gurevitch et al. 2006. The Ecology of Plants
GameTime
Profile Joined May 2010
United States, 222 Posts
August 19 2010 01:01 GMT
#186
I just don't think 0.5% is enough for me to even consider that the game is imbalanced, if it's only .5% it seems pretty balanced to me.

It would be cool if you showed the winning percentages with all the matchups.
Only the winner deserves to win.
youngminii
Profile Blog Joined May 2010
Australia, 7514 Posts
August 19 2010 01:14 GMT
#187
I can't believe how many people are still using the 'matchmaking explains imbalance' argument when the OP fucking explained it (eventually).
lalala
Miros
Profile Joined August 2010
Australia, 10 Posts
Last Edited: 2010-08-19 01:24:48
August 19 2010 01:21 GMT
#188
What exactly is the point of analyzing the overall win percentage of the races? It would be much more interesting to see win percentages for each matchup (TvZ, TvP, PvZ). Maybe Terrans win 60% of their matches against Zerg, but less than 50% against Protoss.

Or am I missing something?
eiswand
Profile Joined July 2010
Germany, 44 Posts
August 19 2010 01:34 GMT
#189
Considering the game was released only 2 weeks ago, the game is damn well balanced. But some pro gamers think Terran is a little bit too strong, and what happens? Hordes of fanboys and noobs think this is the ultimate truth and start to go on an imbalance crusade.

Just today I met a Protoss player who said "Terran is so fucking imba". I asked him why. His answer: "Dunno, they all say it in the forums"........

Again: No RTS will ever be 100% balanced. NEVER. But considering the game is only 2 weeks in stores the balancing is DAMN GOOD. Now stop whining and enjoy the game.
TanGeng
Profile Blog Joined January 2009
Sanya, 12364 Posts
August 19 2010 01:37 GMT
#190
Ugh...

Science can prove things with observation and experiment. With science you can even observe the processes and phenomena at work. With statistics, you have none of this. It's merely input states and observed states. All phenomena and mechanisms are ignored.

With statistics, you can reach near certainty on certain specific statements, and those statements are very very specific. In general, science has massively abused statistics by jumping to conclusions that do not match the very specific statements being shown to be near certain.

When there is a p score of 1e-12, then it's with near certainty that it's not by pure chance. By assuming the negative then showing the negative to have a low p score, the conclusion is merely that a false positive is unlikely given your assumption. But it doesn't say anything about the probability of the false positives given a positive test result. Nor does it factor in all false positives - some of which have plausible alternate explanations.

Based off of your data, you haven't even shown that there is imbalance. It is merely that the population of zerg, protoss, and terran favorite players have observably different winning percentages across the various leagues, and that if it were a given that the game was perfectly balanced and populations at each level possess the exact same skill sets, it'd be hard to produce the observed winning percentage rates purely by chance. That is all the statistics tells you.

This is an extremely minimalistic conclusion and of no real value at all. There are some conclusions you can deduce from that by looking at it logically, but there isn't much there.
Moderator. We are a down-to-earth club run on a sponsorship model
mierin
Profile Joined August 2010
United States, 4943 Posts
Last Edited: 2010-08-19 02:19:35
August 19 2010 02:17 GMT
#191
How many of you have watched G2 of TLO vs MadFrog in the IEM tournament?
Madfrog played perfectly and still got steamrolled. TLO was even behind by a significant margin economically due to MF's counter, yet by the grace of MULEs managed to completely own MF.

EDIT: I'm even somewhat of a TLO fanboy, and have a problem with this. Both games of the series are really telling IMO.
JD, Stork, Calm, Hyuk Fighting!
GagnarTheUnruly
Profile Joined July 2010
United States, 655 Posts
August 19 2010 02:47 GMT
#192
I'm sorry, but I'm having a lot of trouble understanding what you're trying to say. I'll give it my best shot, though. Also, please stop prefacing all your posts with 'ugh...' If you're frustrated by what I'm saying, it's possible that you are the one who's missing something, and not me.

I think you're creating a little bit of a false dichotomy between science and statistics. Science is a method of obtaining understanding, and statistics is an important mathematical tool that scientists use. You're right that statistical methods ask and answer very specific questions, and that assumptions of certain tests can limit inference, but statistics isn't the only means of scientific inference. We also use logic and theoretical understanding to interpret statistical results. In fact, doing this is necessary in order to achieve scientific progress. I also think it's very unfair to state that science in general has jumped to conclusions and abused statistics. If you're going to make such sweeping statements you should present examples.

When there is a p score of 1e-12, then it's with near certainty that it's not by pure chance. By assuming the negative then showing the negative to have a low p score, the conclusion is merely that a false positive is unlikely given your assumption. But it doesn't say anything about the probability of the false positives given a positive test result. Nor does it factor in all false positives - some of which have plausible alternate explanations.


Here I'm not sure what you're saying. In this case the null hypothesis was no difference in win rates. My data suggest a significant but very small departure from the null hypothesis (a positive result). The chance of a false positive is the type 1 error rate, and the p-value tells us how likely a result at least this extreme would be if the null hypothesis were true. The 'negative' doesn't have a low p-score... a negative result would have a high p-score. I have not calculated the chance of a false negative, and I'm not interested in that question because I haven't seen a negative result. Also, I'm not sure what you mean by there being other false positives. There's only one test statistic and the chance of a false positive is almost zero. Are you talking about other explanations?

Also, the results don't involve the population of zerg, protoss and terran players -- it's the results of zerg, terran, and protoss games that I measured. Players aren't accounted for and aren't important given my reasoning described in other posts. You're right about the results, they show with near certainty that there's a small departure from the null hypothesis (which assumes the first condition you cite but not the second one). And you're right that there's no more that the statistics tells us. Which is why we move from statistical inference to logical inference. Then we learn that the results mean that the game is well balanced.
GagnarTheUnruly
Profile Joined July 2010
United States, 655 Posts
August 19 2010 02:52 GMT
#193
On August 19 2010 11:17 mierin wrote:
How many of you have watched G2 of TLO vs MadFrog in the IEM tournament?
Madfrog played perfectly and still got steamrolled. TLO was even behind by a significant margin economically due to MF's counter, yet by the grace of MULEs managed to completely own MF.

EDIT: I'm even somewhat of a TLO fanboy, and have a problem with this. Both games of the series are really telling IMO.


You totally freaked me out! I had a different TLO v MadFrog game going in a diff. window, and I thought you were sending some freaky evil post letting me know you were hacking me. Then I realized you were talking about a different game and why you brought it up...

It does seem that at high level play a lot of players feel zerg is too weak. But why is it doing so well in Asia?
andyrichdale
Profile Joined April 2010
New Zealand, 90 Posts
August 19 2010 03:03 GMT
#194
On August 19 2010 09:44 GagnarTheUnruly wrote:
On August 19 2010 08:48 TanGeng wrote:
Ugh... stats are good for evidence of phenomena. They aren't proof of phenomena. Stats are better at disproving BS theories than they are at proving theories. They are also better at capturing the empirical outcome of an unknown phenomenon without explaining why.


Phenomena are proof of themselves. Mechanisms require validation ... In actual practice, it's much easier to show that patterns exist and processes happen than it is to show the opposite, because if a pattern is not found it's generally impossible to know if it's because it shouldn't be found or because the research approach was inadequate.


TanGeng got owned

Nice work OP
TanGeng
Profile Blog Joined January 2009
Sanya, 12364 Posts
Last Edited: 2010-08-19 04:06:56
August 19 2010 03:14 GMT
#195
On August 19 2010 11:47 GagnarTheUnruly wrote:
Also, the results don't involve the population of zerg, protoss and terran players -- it's the results of zerg, terran, and protoss games that I measured. Players aren't accounted for and aren't important given my reasoning described in other posts. You're right about the results, they show with near certainty that there's a small departure from the null hypothesis (which assumes the first condition you cite but not the second one). And you're right that there's no more that the statistics tells us. Which is why we move from statistical inference to logical inference. Then we learn that the results mean that the game is well balanced.


It's about the games? I don't see game statistics by race at sc2ranks. It's grouped by player and only their favorite race is shown. Anyone with a favorite race can play a smaller number of games as any of the other races (including random). There is no insight into the exact number of games won or lost by any of the race selections. You would have to assume that they played their favorite race exclusively.

You would also have to assume perfectly even skill distribution among player populations if you want perfectly matching win rates - unless you want to assume that players didn't actually pick their favorite races and they were assigned one randomly by battlenet.

When there is a p score of 1e-12, then it's with near certainty that it's not by pure chance. By assuming the negative then showing the negative to have a low p score, the conclusion is merely that a false positive is unlikely given your assumption. But it doesn't say anything about the probability of the false positives given a positive test result. Nor does it factor in all false positives - some of which have plausible alternate explanations.


This is basic Bayesian logic. Let's call perfect balance with equally skilled players your negative condition, and imbalanced game your positive condition. Your positive test is a significant difference in win rates among the populations of players with favorite races. The scenarios covered by your assumption of perfect balance with equally skilled players overlap a bit with the scenario where you get differences in win rates.

How likely do you think you will have a game with perfect balance with equally skilled players anyways?
If your estimate is close to 1, then the overall chance that you got a false positive increases while your chance of a true positive decreases. Your probability of a false positive given a positive test result is high. (In the case of rare diseases, doctors often ask for a confirmation test since the false positive rate is nearly the same as the true positive rate.)

If it's close to 0, then your false positive chances arising from a perfectly balanced and equally skilled players scenario were nearly nil anyways, so why would you care about it at all to begin with? You want to eliminate other reasons why you might register a false positive for imbalance instead.
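
The two cases above are just Bayes' theorem with different priors. Here is a small Python sketch of it; the 5% false positive rate and 90% true positive rate are assumed numbers for illustration, not anything measured in this thread.

```python
def prob_balanced_given_positive(prior_balanced, false_positive_rate, true_positive_rate):
    """P(game is balanced | test flags a win-rate difference), by Bayes' theorem.

    prior_balanced:      belief, before looking at data, that the game is balanced
    false_positive_rate: P(test flags a difference | balanced)
    true_positive_rate:  P(test flags a difference | imbalanced)
    """
    p_pos_and_balanced = false_positive_rate * prior_balanced
    p_pos_and_imbalanced = true_positive_rate * (1.0 - prior_balanced)
    return p_pos_and_balanced / (p_pos_and_balanced + p_pos_and_imbalanced)

# Prior close to 1: a positive test is most likely a false positive
likely_balanced = prob_balanced_given_positive(0.99, 0.05, 0.90)

# Prior close to 0: false positives from the balanced case barely matter
unlikely_balanced = prob_balanced_given_positive(0.01, 0.05, 0.90)
```

With these assumed rates, a 99% prior leaves roughly an 85% chance that a positive result is a false alarm, while a 1% prior leaves well under 1%.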
Moderator. We are a down-to-earth club run on a sponsorship model
hdkhang
Profile Joined August 2010
Australia, 183 Posts
August 19 2010 05:54 GMT
#196
On August 19 2010 09:10 cocosoft wrote:
For those who say that the matchmaking system pushing players to 50-50 makes the data the guy is using incorrect, READ what he wrote:
On August 17 2010 07:13 GagnarTheUnruly wrote:
People have pointed out that matchmaking would cause this to happen, because it strives to set each player's win rate at 50%. That in turn would cause the win rate of each race to trend towards 50%. That being the case, poor balance would tend to result in 'weak' races getting pushed into the lower tiers of play. Because we don't see that happening either within or among leagues (data not shown), my data suggest both that the matchmaking system works well and that SC2 is inherently pretty well balanced.
Agreeing with this.



Really? The OP is wrong, there are no two ways about it.

I'll repost here what I wrote back on page 8... which, by the way, the OP conveniently ignores. Pay careful attention to the points regarding MMR and how they are used by the AMM to produce the exact result that Blizzard want you to see, which is the exact result that the data shows. It is a SELF-FULFILLING PROPHECY! All the math in the world, no matter how fancy you try to be, won't be of any use since you are using faulty data/metrics to try and prove a point. It would be like me trying to show that the data is clean by using only "cleaned" data and filtering out the "dirty" data. Blizzard have "cleaned" the data, it's plain and simple to see.


The problem is not in your methodology as much as it is in the assumptions you are making about the system.

What do we know about the system?

Anyone who wishes to play on the ladder is accepted

* You can have zero knowledge of the game and yet you will be placed in Bronze (in sports, that makes you 3rd place!); the only prerequisite for acceptance into a league is to just show up 5 times (you can even disconnect 5 times once the game starts and still get in).
* This is an automatic invalidation of any results emerging from the Bronze league as the range of skill present is astronomical!

You only play matches against people in your region

* Comparisons across different battle.net servers are worthless.

The hidden MatchMakingRating number is based only on wins/losses with respect to the current MMR of yourself and your opponent

* What was required to score that win is not considered at all, nor should it be. Unfortunately, this makes it all the more important that the game be as balanced as possible, or else the MMR is worthless.
* This point also explains why the data is "practically worthless".

The AMM will attempt to pair you up with a person with a similar MMR

* Note that I say similar MMR and not similar skill/ability.
* If a race imbalance existed, the MMR would not reveal it since it would simply consider the person using the weaker race as a "poorer" player hence a lower MMR and the person using an OP race as a better player rewarding them with a higher MMR.
* Thus if you were to compare even between players of similar MMR but across the three races, it would reveal nothing of significance since the reason they were given that MMR is due to their win/loss performance against the same people they are being compared against.
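
To illustrate the point about MMR absorbing an imbalance, here is a toy ladder simulation in Python. Everything in it is an assumption made for the sketch: the Elo-style update, the 100-point hidden bonus for race "T", the pair-adjacent-ratings matchmaker and the player counts. It is not how Blizzard's AMM is known to work.

```python
import random

def simulate_ladder(matchmaking, bonus=100.0, per_race=100,
                    rounds=400, burn_in=200, k=32.0, seed=1):
    """Toy ladder where race 'T' gets a hidden strength bonus (Elo points).

    Returns (win_rate_by_race, mean_rating_by_race); win rates are
    counted only in the rounds after the burn-in period.
    """
    rng = random.Random(seed)
    skills = [rng.gauss(0.0, 200.0) for _ in range(per_race)]
    players = []
    for race in "TPZ":
        for skill in skills:  # identical skill pool for every race
            strength = skill + (bonus if race == "T" else 0.0)
            players.append({"race": race, "strength": strength, "rating": 0.0})

    wins = {race: 0 for race in "TPZ"}
    games = {race: 0 for race in "TPZ"}
    for rnd in range(rounds):
        rng.shuffle(players)  # random pairing, and random tie-breaks when sorting
        if matchmaking:
            players.sort(key=lambda p: p["rating"])  # pair neighbours in rating
        for a, b in zip(players[0::2], players[1::2]):
            p_true = 1.0 / (1.0 + 10 ** ((b["strength"] - a["strength"]) / 400.0))
            p_rated = 1.0 / (1.0 + 10 ** ((b["rating"] - a["rating"]) / 400.0))
            a_won = rng.random() < p_true
            delta = k * ((1.0 if a_won else 0.0) - p_rated)  # zero-sum Elo update
            a["rating"] += delta
            b["rating"] -= delta
            if rnd >= burn_in:
                games[a["race"]] += 1
                games[b["race"]] += 1
                wins[(a if a_won else b)["race"]] += 1

    win_rate = {race: wins[race] / games[race] for race in "TPZ"}
    mean_rating = {
        race: sum(p["rating"] for p in players if p["race"] == race) / per_race
        for race in "TPZ"
    }
    return win_rate, mean_rating

matched_rates, matched_ratings = simulate_ladder(matchmaking=True)
random_rates, random_ratings = simulate_ladder(matchmaking=False)
```

In this toy setup the boosted race wins clearly more than half its games under random pairing, but once opponents are picked by rating its win rate sits near 50% and the advantage shows up only as a higher average rating.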

Not everyone will play the same number of games

* You may say "well duh" but it bears repeating.
* I think in the original top 200 list one of the players that made it had only played a handful of games, I think it was 7 all up, yet 7 games was enough for the system to determine their MMR to be one of the top 200 on the server.
* I honestly believe that a more stringent pre-requisite for diamond league is needed, e.g. 100 games played.

There are many more other points to make, but let's just start with the above for now.
texmix
Profile Joined May 2010
United States, 106 Posts
August 19 2010 12:51 GMT
#197
On August 19 2010 06:18 Hidden_MotiveS wrote:
On August 19 2010 04:59 texmix wrote:
As others have stated, the OP is based on a flawed methodology. If trying to use stats to figure out which race is overpowered, 4 items need to be controlled:
1. Homogeneous skill in race choice (maybe old BW semi-pro's just gravitate towards Terran in sc2)
2. The matchmaking system instead of random opponents
3. Player MU difference (one player may, in the long run, win 60% pvt, another lose 60% pvt)
4. Player MU skill changes over time (maybe a day9 video will change pvt win stats by several bps in a single week)

To control all 4 of these I suggest mining for at least 1,000 players that:
1. Have played over 200 games
2. Played at least 30 games in the last 72 hours
3. Are in the diamond league

From this list, throw out all games involving a random player (less consistent MU performance) and everything older than the most recent 30 games, and calculate the group's median win ratio using the most recent 30 games. Keep the 100 players of each race with win ratios closest to the median win ratio and throw out the other 700 players' games. For instance, if the 1,000 players have win ratios ranging from 35% to 90% (in the most recent 30 games), with a median of 55%, then pick the 100 zerg, protoss, and terran players who are closest to 55%. From the remaining 300*30 games, a simple win/loss record for each MU will be about the best possible indication of imbalance I believe data mining can come up with (short of using the same methodology with more games or tweaked ratios).

I wanted to say this, but feared the backlash of "NO We has psience we is wright". The methodology of the observational study is flawed in a few ways. For one, I don't think you are considering any confounding variables such as how the ranking system comes into effect. If one race is overpowered then it's simple to assume that it will be overrepresented in relation to its total population within the top of diamond rank only. But this could also be confounded by how people think Terran is the strongest race, so the more serious players switch over to that race thinking this is true. In addition, the sample sizes here are very small.

I would like to hear what a statistician, or Blizzard statistician has to say about the data.

edit: Oh I see, the OP understands that the matchmaking system kind of voids his analysis. I'm sorry if I sounded harsh. Great effort put into this.


I am a statistician and stand by that methodology as a reasonable indicator of racial balance.

The win rate does not prove imbalance. Would a "perfectly balanced" Starcraft 2 have perfect 50% win rates? No.

It absolutely would, assuming a control for skill and the matchmaking system, which can be approximated in the study.
Anomandaris
Profile Joined July 2010
Afghanistan440 Posts
August 19 2010 13:01 GMT
#198
statistique!=science
TanGeng
Profile Blog Joined January 2009
Sanya12364 Posts
August 19 2010 15:02 GMT
#199
One question that I always wanted to ask: what is your definition of imbalance?

On August 19 2010 08:12 GagnarTheUnruly wrote:
I have reasoned that if the game is imbalanced, that imbalance must manifest as either 1) a difference in win ratios for the different races or 2) a difference in race prevalence as you increase player skill level, except under one of several unlikely scenarios and one likely one.

The degree to which it shifts from condition 1) to condition 2) depends on the strength of the matchmaking system. Since we don't see 1) (as my data show), and we don't see 2) (as I have said and escapeartist has shown), we can conclude that the game is balanced, at least for regular league play.

I'm not sure how you can be confident of 2.

Win ratios should be controlled by point levels in a nice random matchmaking system. In your study, you see evidence of higher win rates for players of higher skill levels. Skill level is continuous (not all players in the same league are of the same level). Average win ratios will not match in a league unless the skill distribution of the races in that league allows for that.
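The idea that matchmaking controls win ratios can be illustrated with a toy Python simulation. Everything in it is an assumption chosen for illustration: Elo-style ratings, neighbour pairing by rating, two hypothetical races "A" and "B" with identical skill spreads, and a hidden bonus for race A in cross-race games. It is not a model of the actual ladder.

```python
import random


def simulate(bonus=100.0, n_per_race=100, warmup=400, measured=200, k=16.0, seed=1):
    """Toy ladder: race A gets `bonus` Elo points of hidden advantage against race B."""
    rng = random.Random(seed)
    # Identical skill spread for both races, so any rating gap comes from the bonus.
    skills = [-300.0 + 600.0 * i / (n_per_race - 1) for i in range(n_per_race)]
    players = [{"race": r, "skill": s, "rating": 0.0, "wins": 0, "games": 0}
               for r in ("A", "B") for s in skills]
    for rnd in range(warmup + measured):
        # Matchmaking: sort by rating (random tie-break) and pair neighbours.
        players.sort(key=lambda p: (p["rating"], rng.random()))
        for i in range(0, len(players) - 1, 2):
            p, q = players[i], players[i + 1]
            # The bonus cancels out in mirror games and only counts across races.
            edge = bonus * ((p["race"] == "A") - (q["race"] == "A"))
            p_win = 1.0 / (1.0 + 10.0 ** (-(p["skill"] - q["skill"] + edge) / 400.0))
            expected = 1.0 / (1.0 + 10.0 ** (-(p["rating"] - q["rating"]) / 400.0))
            won = rng.random() < p_win
            delta = k * ((1.0 if won else 0.0) - expected)
            p["rating"] += delta
            q["rating"] -= delta
            if rnd >= warmup:  # only count games after ratings have settled
                p["games"] += 1
                q["games"] += 1
                (p if won else q)["wins"] += 1
    summary = {}
    for race in ("A", "B"):
        group = [p for p in players if p["race"] == race]
        summary[race] = {
            "win_rate": sum(p["wins"] for p in group) / sum(p["games"] for p in group),
            "mean_rating": sum(p["rating"] for p in group) / len(group),
        }
    return summary


if __name__ == "__main__":
    for race, stats in simulate().items():
        print(race, round(stats["win_rate"], 3), round(stats["mean_rating"], 1))
```

Under these assumptions both races should settle near a 50% win ratio, while race A's average rating ends up above race B's. In this sketch the advantage shows up in ratings rather than in win ratios.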

First element in figuring out the skill distribution is selection biases. You have to figure out who chooses certain races and why. The population that picks zerg and the population that picks terran as their favorite are different unless you can prove otherwise.

The second element in figuring out the skill distribution is the learning curve. A difference in racial prevalence at any particular skill level is more a function of how steep the learning curve is relative to normal at that particular skill level.

A skill ceiling is any point where skill curve is really steep.
Moderator We are a down-to-earth, sponsorship-based club
escapeArtist
Profile Joined August 2010
Norway2 Posts
Last Edited: 2010-08-19 16:05:20
August 19 2010 16:03 GMT
#200
On August 20 2010 00:02 TanGeng wrote:

First element in figuring out the skill distribution is selection biases. You have to figure out who chooses certain races and why. The population that picks zerg and the population that picks terran as their favorite are different unless you can prove otherwise.

A skill ceiling is any point where skill curve is really steep.


It may be true that different people pick different races, but this is not necessarily relevant to the imbalance question. Unless you prove that personality equals skill or that skill equals race, I don't see why this is relevant.

I also don't see how skill curve equals imbalance. I do agree that zerg needs higher apm than, say, terran, but if they perform at the same level then I still don't see any imbalance issues, since unless proven otherwise we must assume that they have hit the relative skill cap when playing in the upper diamond level. As I have shown, in the upper diamond level Zerg is gaining in population. And without any intelligent discussion we can safely assume that random takes more skill than ALL the other races. Even they are gaining in popularity in the lower diamond league. This alone strongly supports my statement that difficulty is not equal to balance.

My point here is that there are soooo many people crying imbalance, but still I have seen no evidence of this. OP has tried to find evidence of this, and I have tried to find evidence of this, but both of us came up with nothing. As a result of this both of us seem to be leaning towards thinking that the game is balanced.

I personally think that you are going about it the wrong way if you are trying to tell us that statistics is not the right way to do it. Let's just say it's your turn to try and prove the imbalance. Or at least give us some new data sources. After all, you seem to be convinced that there is imbalance, but all you seem to base it on is personal opinions.

If I understand you correctly, then the race in question (Zerg) is picked by the best players, since they seem to perform on all levels of play, and poor players (Terran) are performing on an equal level to them because of the imbalance.

I find this very hard to swallow considering we are dealing with over 500,000 players. Also, why would the best players do this? I have seen no reasoning as to why all the "best" players would pick the "worst" race.

In my experience, if it smells like shit, looks like shit and tastes like shit, it's probably shit....

Original banner artwork: Jim Warren
The contents of this webpage are copyright © 2025 TLnet. All Rights Reserved.