Ladder-Balance-Data - Page 23

VediVeci
Profile Joined October 2011
United States, 82 Posts
July 13 2012 09:11 GMT
#441
On July 13 2012 13:40 lolcanoe wrote:
On July 13 2012 13:03 VediVeci wrote:

I'm not arguing that your methods aren't better; they probably are (I didn't read your post very closely). Your attacks have been pretty consistently derisive, rude, and especially condescending though, in my opinion. And I know it's not a smoking gun, but his results seem pretty consistent with yours, so he didn't do too poorly.
Edit: clarity

He had at least a 50% chance of getting it right. I'm going to ignore the rest of the post so as not to encourage further irrelevance from posters who self-admittedly don't read things carefully.




That's the sort of stuff I'm talking about. Whether or not I gave the math in your post a thorough reading was irrelevant to mine, because I have been reading through almost everything else you've posted.

You talk down to everybody, and at least 3 people have called you out on it so far. Constructive criticism is great, but don't be so damn rude about it. This was a pretty respectful discussion, no need to be so vituperative.
Alexj
Profile Blog Joined July 2010
Ukraine, 440 Posts
Last Edited: 2012-07-13 11:35:04
July 13 2012 11:27 GMT
#442
On July 11 2012 04:01 skeldark wrote:
There is one other method that you can use to show trends:
you look at the change in MMR of a race over time!
Do players of race Z lose MMR? Do players of race X gain MMR? This will happen after a patch. But perhaps it's not imbalance; perhaps it corrects an imbalance that was there from the beginning.

I would say this is the only way your data could become useful. Right now you have aggregated some MMR stats over 42 years (just kidding, I understand it is a few months, but still quite some time). There might have been a few metagame shifts and patch changes over that time, but your data doesn't reflect them, since some of the calculated MMR values can be 3 months old and others are from last week. If you only counted the MMR calculated in the last week, it would in fact be current balance data, and not something averaged out over many months. And if you could do it periodically, you would be able to show trends and shifts. You would also generate a lot of discussion (and by that I mean new waves of balance whine).

Edit: also, I am not sure if EU MMR and NA MMR have the same weight. These are 2 groups of accounts who never play with each other. I keep facepalming at sc2ranks, who also assume that points on different ladders have the same value. At least your data doesn't mix in the KR server, which would completely break everything.
More GGs, more skill
Jadoreoov
Profile Joined December 2009
United States, 76 Posts
July 13 2012 12:24 GMT
#443
On July 13 2012 17:31 Thrombozyt wrote:
On July 13 2012 10:23 Jadoreoov wrote:
First off I'd like to point out that the normality of the data doesn't really matter because of the Central Limit Theorem, so please stop discussing that like it matters.

Continuing with lolcanoe's analysis, I found the 99% confidence intervals for the difference in mean for each group.

US and EU:
ZvT
(51.5, 118.8)
PvT
(28.9, 99.6)
ZvP
(-11.1, 53.2)


On July 13 2012 11:20 Jadoreoov wrote:
Done:

95% confidence intervals for the EU and US combined:
ZvT:
(59.5, 110.7)
PvT
(37.3, 91.2)
ZvP
(-3.7, 45.5)

US vs EU
(28.5, 70.5)


Shouldn't the interval in which the mean can fall become larger as you lower your level of confidence?


No, the 95% confidence interval should be smaller.

It is similar to someone asking you to guess a number between 0 and 100.
If you guessed that it was exactly 50, you wouldn't be very confident. (narrow interval, low confidence)
If you guessed that it was between 1 and 99, you would be pretty confident that you'd be correct. (wide interval, high confidence)

In each calculation the data itself gives us the same amount of uncertainty, so to be more confident in our interval we have to include a greater range of values.
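
The relationship between the two sets of intervals can be checked with a short sketch. It assumes the intervals were built with a normal (z) approximation, which seems reasonable for samples this large but is not confirmed anywhere in the thread; the helper name `rescale_interval` is made up for illustration.

```python
from statistics import NormalDist

def rescale_interval(low, high, from_level, to_level):
    """Convert a symmetric normal-based confidence interval to another level."""
    center = (low + high) / 2
    half_width = (high - low) / 2
    # Two-sided critical values, about 1.96 for 95% and 2.58 for 99%.
    z_from = NormalDist().inv_cdf(0.5 + from_level / 2)
    z_to = NormalDist().inv_cdf(0.5 + to_level / 2)
    new_half = half_width * z_to / z_from
    return center - new_half, center + new_half

# ZvT 95% interval quoted above, widened to 99%.
zvt_99 = rescale_interval(59.5, 110.7, 0.95, 0.99)
print(tuple(round(x, 1) for x in zvt_99))
```

The widened interval should land within rounding of the 99% ZvT interval quoted above, (51.5, 118.8), which is what you would expect if both were computed from the same data.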
skeldark
Profile Joined April 2010
Germany, 2223 Posts
July 13 2012 13:31 GMT
#444
On July 13 2012 20:27 Alexj wrote:
On July 11 2012 04:01 skeldark wrote:
There is one other method that you can use to show trends:
you look at the change in MMR of a race over time!
Do players of race Z lose MMR? Do players of race X gain MMR? This will happen after a patch. But perhaps it's not imbalance; perhaps it corrects an imbalance that was there from the beginning.

I would say this is the only way your data could become useful. Right now you have aggregated some MMR stats over 42 years (just kidding, I understand it is a few months, but still quite some time). There might have been a few metagame shifts and patch changes over that time, but your data doesn't reflect them, since some of the calculated MMR values can be 3 months old and others are from last week. If you only counted the MMR calculated in the last week, it would in fact be current balance data, and not something averaged out over many months. And if you could do it periodically, you would be able to show trends and shifts. You would also generate a lot of discussion (and by that I mean new waves of balance whine).

Edit: also, I am not sure if EU MMR and NA MMR have the same weight. These are 2 groups of accounts who never play with each other. I keep facepalming at sc2ranks, who also assume that points on different ladders have the same value. At least your data doesn't mix in the KR server, which would completely break everything.


EU and NA MMR are very close to each other.
I have users who have both EU and US accounts, and they have very similar MMR on both.

The data is from 3 weeks, not more.


Save gaming: kill esport
lolcanoe
Profile Joined July 2010
United States, 57 Posts
Last Edited: 2012-07-13 14:33:23
July 13 2012 13:46 GMT
#445
On July 13 2012 14:34 Cascade wrote:
Ok, let me prove it for you then.
My claim is that if the set of samples is large enough, we can use the normal distribution with S/sqrt(N) width to estimate the errors. For simplicity, let me prove that the 2*S/sqrt(N) interval is close to 95%:

Let the distribution f(x) have an average 0 and standard deviation S. An average X from a sufficiently large (specified in the proof) set of N samples from f(x) will fall within 2*S/sqrt(N) of the average 0 with a probability between 0.93 and 0.97.
proof:
Calculating the average x from N samples (from many different sets, each of N samples) will give a distribution of averages A_N(x) that approaches a normal distribution as N goes to infinity, centred around 0, and with a width of S/sqrt(N). This is the CLT.

Specify "sufficiently large N" such that A_N(x) is similar to a normal distribution g(x) of width S/sqrt(N). Close enough so that the integral from -2*S/sqrt(N) to 2*S/sqrt(N) is between 0.97 and 0.93 (it is close to 0.95 for g). As A_N approaches g as N-->infty, this will happen for some N. The more similar f(x) is to a normal distribution, the lower N is required.

Now take a single average X from f(x), using N samples (this would be the OP). This average is distributed according to A_N(x), and with a sufficiently large N, the probability that X is between -2*S/sqrt(N) and 2*S/sqrt(N) is larger than 0.93, and smaller than 0.97. QED.

No, reread part A. The claim that the sampling distribution approaches normality only applies when the population data itself is normal. This is extraordinarily intuitive as you watch your sample size approach the entire population. In your claim here, you used standard distribution around a known average to describe a population. In our data, we do not know if SDs can be applied to the population, as the SDs we are calculating are really only accurate for Gaussian distributions.

It is a common misapplication of CLT to state that a sample size of 30 guarantees approximate normality. This iteration tends to be true only because populations tend to be normally distributed. To be mathematically precise, the correct statement is that with a sufficient amount of samples of at least size ~ 30, the distribution of the means of these samples will begin approaching normality, with only a slight regard to the original distribution.

The normality test is essential when running the two-side t test if you want to be thorough when dealing with an unknown population distribution. The textbook, wiki, and other websites have confirmed it. I do not understand why this question persists.

Edit: I should further add that the tendency of sampling means and samples themselves in normal populations to approach normality only occurs when the sample is RANDOMLY procured. In this case it is clearly NOT random (we have different population means vs sample means), so the normality test is ABSOLUTELY a reasonable thing to be concerned about.
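
The claim quoted at the top of this post can be probed empirically. What follows is a minimal sketch, not an analysis of the MMR data: it assumes a deliberately skewed population (an exponential distribution with mean 1 and standard deviation 1), random sampling, and a known population standard deviation S, and it counts how often the mean of N samples falls within 2*S/sqrt(N) of the true mean. The function name and the chosen N are illustrative.

```python
import math
import random

def coverage(n_samples, n_trials, seed=0):
    """Fraction of sample means within 2*S/sqrt(N) of the true mean."""
    rng = random.Random(seed)
    true_mean = 1.0  # an exponential with rate 1 has mean 1 ...
    true_sd = 1.0    # ... and standard deviation 1
    bound = 2 * true_sd / math.sqrt(n_samples)
    hits = 0
    for _ in range(n_trials):
        mean = sum(rng.expovariate(1.0) for _ in range(n_samples)) / n_samples
        if abs(mean - true_mean) <= bound:
            hits += 1
    return hits / n_trials

cov = coverage(n_samples=200, n_trials=3000)
print(f"share of sample means inside the 2*S/sqrt(N) band: {cov:.3f}")
```

Changing `n_samples` or swapping in a different population shows how quickly (or slowly) the share approaches the 0.95 figure discussed in the quote. Note that the sketch says nothing about non-random sampling, which is the separate concern raised in the edit above.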
Treehead
Profile Blog Joined November 2010
999 Posts
Last Edited: 2012-07-13 14:37:32
July 13 2012 14:33 GMT
#446
To me, what the result here indicates is the opposite of what a lay person would think from reading the post.

I'd read your results as "this means that Terran players in general have a lower MMR". But based on your data:

Analysis

"Terran Average MMR, STD
1559.214909, 546.131097

Protoss Average MMR, STD
1620.764863, 509.5809733

Zerg Average MMR, STD
1672.129547, 495.3121321"

What the above seems to imply is that, although the average Terran player included in the study has a lower MMR, the gap largely disappears as you go higher, and Terran even passes Protoss. In particular, Mean + 2*STDev (roughly the cutoff for the top 2.5% of a normal distribution) is:

T 2651.47
P 2639.92
Z 2662.75

Giving us much different looking results. As we strive to study based on arbitrarily good players (as player skill increases over time), I would think we'd want to look more heavily at analysis of the implications of Terran's higher STDEV.
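
For anyone who wants to reproduce the cutoffs, here is a small sketch using the means and standard deviations quoted above. It assumes each race's MMR is normally distributed, which is exactly the assumption questioned below, so treat the output as illustrative only.

```python
from statistics import NormalDist

# (mean MMR, standard deviation) per race, as quoted above.
stats = {
    "T": (1559.214909, 546.131097),
    "P": (1620.764863, 509.5809733),
    "Z": (1672.129547, 495.3121321),
}

# Mean + 2 standard deviations for each race.
cutoffs = {race: mean + 2 * sd for race, (mean, sd) in stats.items()}
for race, value in cutoffs.items():
    print(race, round(value, 2))

# Share of a normal distribution lying above mean + 2 SD (one-sided tail).
tail = 1 - NormalDist().cdf(2)
print(f"share above +2 SD: {tail:.4f}")
```

Under a normal distribution the one-sided share above mean + 2 SD is a little over 2%, so these cutoffs describe a fairly small slice of the player base.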

A question

Are you sure you can assume normality here? How well do your distributions fit a normal distribution having the same mean and St dev? The reason I ask, of course, is that a T-test can only be used meaningfully on normal distributions.

If normality doesn't fit so well, I'd recommend M. A. Stephens's article on k-sample Anderson-Darling tests, which uses ranking and therefore needs only continuity as an assumption to move forward.

Edit: Link to the test I'm referring to: http://www.cithep.caltech.edu/~fcp/statistics/hypothesisTest/PoissonConsistency/ScholzStephens1987.pdf.
lolcanoe
Profile Joined July 2010
United States, 57 Posts
Last Edited: 2012-07-13 15:00:37
July 13 2012 14:59 GMT
#447
On July 13 2012 23:33 Treehead wrote:

Analysis

"Terran Average MMR, STD
1559.214909, 546.131097

Protoss Average MMR, STD
1620.764863, 509.5809733

Zerg Average MMR, STD
1672.129547, 495.3121321"


Are you sure you can assume normality here? How well do your distributions fit a normal distribution having the same mean and St dev? The reason I ask, of course, is that a T-test can only be used meaningfully on normal distributions.

If normality doesn't fit so well, I'd recommend M. A. Stephens's article on k-sample Anderson-Darling tests, which uses ranking and therefore needs only continuity as an assumption to move forward.


I'd really suggest reading my post again, as it already includes the Anderson-Darling test! See the probability plot and the associated p-value, which was obtained using the Anderson-Darling test in Minitab. Anyway, let me be a little more articulate about what you are saying and address your points one at a time. Can we assume normality? No. However, in this case the Anderson-Darling test result is inconclusive. Keep in mind, Anderson-Darling tends to be OVERLY powerful with large sample sizes. Your best bet is actually looking at the fitted histogram to judge approximate normality yourself! To me, given the hugely significant p-values far under .01, and no strong evidence of non-normality, I'd say that we can put the majority of these concerns to rest.

Now what is more interesting is that we have massive standard deviations and relatively small actual differences. The two-sample t-test only tests whether or not the underlying means are EXACTLY equal; the magnitude of the difference should not be inferred directly from the p-value, but rather through observation. For instance, with 2 samples of size 1 billion each, even a negligible actual MMR difference would result in very low p-values. It has to be up to the interpreter to decide whether the maximum 7% difference between T and Z is effectively significant (and not just statistically significant).

I hope that addresses your concerns.
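
The point about statistical versus effective significance can be sketched numerically. The example below is hypothetical: it uses a z-test rather than the t-test from the analysis (the two are practically identical at these sample sizes), assumes two groups of equal size with a common standard deviation of 500 MMR, and assumes a true difference of just 1 MMR.

```python
import math

def two_sided_p(diff, sd, n):
    """Two-sided z-test p-value for equal means, two groups of size n, common sd."""
    se = sd * math.sqrt(2 / n)       # standard error of the difference in means
    z = abs(diff) / se
    return math.erfc(z / math.sqrt(2))

# The same 1-MMR difference, observed with 1,000 and with 1 billion players per group.
p_small = two_sided_p(diff=1.0, sd=500.0, n=1_000)
p_huge = two_sided_p(diff=1.0, sd=500.0, n=1_000_000_000)
print(p_small, p_huge)
```

With a thousand players per group the difference is nowhere near significant; with a billion it is overwhelmingly so, even though 1 MMR is far less than the swing from a single ladder game.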


skeldark
Profile Joined April 2010
Germany, 2223 Posts
Last Edited: 2012-07-13 16:43:40
July 13 2012 16:42 GMT
#448
OK, I have a hard question for you guys.

If I want to publish the average MMR of the data as a timeline,
what is the minimum number of profiles for it to still be accurate?
Can someone test a weekly / monthly update?





Save gaming: kill esport
Treehead
Profile Blog Joined November 2010
999 Posts
Last Edited: 2012-07-13 18:06:50
July 13 2012 18:01 GMT
#449
On July 13 2012 23:59 lolcanoe wrote:
On July 13 2012 23:33 Treehead wrote:

Analysis

"Terran Average MMR, STD
1559.214909, 546.131097

Protoss Average MMR, STD
1620.764863, 509.5809733

Zerg Average MMR, STD
1672.129547, 495.3121321"


Are you sure you can assume normality here? How well do your distributions fit a normal distribution having the same mean and St dev? The reason I ask, of course, is that a T-test can only be used meaningfully on normal distributions.

If normality doesn't fit so well, I'd recommend M. A. Stephens's article on k-sample Anderson-Darling tests, which uses ranking and therefore needs only continuity as an assumption to move forward.


I'd really suggest reading my post again, as it already includes the Anderson-Darling test! See the probability plot and the associated p-value, which was obtained using the Anderson-Darling test in Minitab. Anyway, let me be a little more articulate about what you are saying and address your points one at a time. Can we assume normality? No. However, in this case the Anderson-Darling test result is inconclusive. Keep in mind, Anderson-Darling tends to be OVERLY powerful with large sample sizes. Your best bet is actually looking at the fitted histogram to judge approximate normality yourself! To me, given the hugely significant p-values far under .01, and no strong evidence of non-normality, I'd say that we can put the majority of these concerns to rest.

Now what is more interesting is that we have massive standard deviations and relatively small actual differences. The two-sample t-test only tests whether or not the underlying means are EXACTLY equal; the magnitude of the difference should not be inferred directly from the p-value, but rather through observation. For instance, with 2 samples of size 1 billion each, even a negligible actual MMR difference would result in very low p-values. It has to be up to the interpreter to decide whether the maximum 7% difference between T and Z is effectively significant (and not just statistically significant).

I hope that addresses your concerns.




My bad - you already did some of the work I suggested. Honestly, I didn't read most of the thread terribly closely except the OP, which I read over a couple times to make sure he hadn't posted anything definitive on this.

Here's the thing though. Maybe you'll get better p-values to convince ourselves of normality. But maybe you won't. 0.05-0.1 isn't bad, and if the T-test returns as good a result as stated in the OP, I doubt you'll get worse than .05 on the Anderson-Darling test if the thing is anywhere close to normal. My suggestion (which can be ignored without any hard feelings) is that if we want this to be clear of scrutiny, we can remove normality concerns by just using Anderson-Darling to compare the races to begin with, instead of saying something like "well, you can almost reject the null at a significance value of 0.05 - so hopefully the reader is convinced..." when you can just skip that part. My suspicion is that A-D results will be just as low anyway - but in a serious study (which this doesn't have to be), you'd want to post those values, and not the T-test ones, because there's likely no downside to doing so.

I completely agree with your assertion that the differences are rather low compared to the mean and stdev values. I wish this were more clearly reflected in the OP - as it would be easier to interpret for someone with a limited numerical background.

And of course, the predominant concern I always have with using statistics to begin with is that pdfs are created with the unwritten assumption that your data (and hence, winning and losing) is analogous to a random variable, which is much harder to back up than any concerns about normality. I think that this is probably the reason for the large stdev and small differences seen in the data - because as time goes on, playstyles evolve, so we aren't looking at one set of distributions, we're looking at many sets of distributions which change over time as playstyles evolve and devolve.

For example, I'm guessing 1-1-1 is still reasonably effective in master's TvP these days. Maybe next month, though, some protoss badass comes out with a build that doesn't just beat it - it CRUSHES the 1-1-1 and puts you in a good spot against other builds as well. This might show in our data as a downswing in Terran MMR, but really what's happening is a metagame shift. The pdf for MMRs of TvPers doing 1-1-1 and the pdf for MMRs of TvPers doing other builds are almost assuredly different - especially when our new TvP strat is... new. Maybe I'm wrong, but this example was a hypothetical anyway. Point is - builds are still changing quite a bit, and combining pdfs always gives us weird looking data.

Edit: I don't mean to be dismissive here. The work done is really great (and far better than other stats workups I've seen on these boards), deserves credit and it does have some meaning to it. I only include this in the discussion above for the sake of good bookkeeping on assumptions.

Also, maybe if more data continues to be gathered, enough will be obtained to use the data as a time series (which it is), rather than as a sample. Just some thoughts. Keep up the good analysis, though. I liked reading all this. Good to see some other quanty nerds in here.
lolcanoe
Profile Joined July 2010
United States, 57 Posts
Last Edited: 2012-07-13 18:15:53
July 13 2012 18:14 GMT
#450
skeldark - the number of profiles you'd want depends on the size of the confidence interval you want at a certain mean. If you wanted to make these calculations, you'd need to use Excel's Solver plugin to work back from interval size to sample size. Alternatively, you could guess and check to approximate it.
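
Under a normal approximation the back-calculation also has a closed form, n = (z * S / h)^2, where S is the standard deviation and h is the half-width of the interval you are willing to accept. The sketch below is only an illustration: it assumes the Terran standard deviation posted earlier in the thread, and the target of plus or minus 25 MMR is a made-up example, as is the function name.

```python
import math
from statistics import NormalDist

def required_sample_size(sd, half_width, level=0.95):
    """Smallest n for which z * sd / sqrt(n) <= half_width."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return math.ceil((z * sd / half_width) ** 2)

# Terran SD from the thread; the +/- 25 MMR target is hypothetical.
n = required_sample_size(sd=546.13, half_width=25.0)
print(n)
```

Halving the acceptable half-width roughly quadruples the number of profiles needed per time slice, which matters when splitting the data into weekly buckets.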

On July 14 2012 03:01 Treehead wrote:
My suggestion (which can be ignored without any hard feelings) is that if we want this to be clear of scrutiny, we can remove normality concerns by just using Anderson-Darling to compare the races to begin with, instead of saying something like "well, you can almost reject the null at a significance value of 0.05 - so hopefully the reader is convinced..." when you can just skip that part. My suspicion is that A-D results will be just as low anyway - but in a serious study (which this doesn't have to be), you'd want to post those values, and not the T-test ones, because there's likely no downside to doing so.

My experience is that the A-D test is actually not as common as you think, especially given its tremendous sensitivity at large sample sizes. It's much more common to show a fitted histogram, as I've done, to show that approximate normality is fulfilled.

The purpose here is simply to show that the SDs are relevant calculations. If 1 SD covers 68% of the normalized data but in actuality 72% of the real data, it's not a terrible problem when you're making observations over 3 SDs down the line, as the majority of your error is going to be somewhat centralized.

On July 14 2012 03:01 Treehead wrote:
I completely agree with your assertion that the differences are rather low compared to the mean and stdev values. I wish this were more clearly reflected in the OP - as it would be easier to interpret for someone with a limited numerical background.

Yes. But defining effectively significant here is difficult.

On July 14 2012 03:01 Treehead wrote:
And of course, the predominant concern I always have with using statistics to begin with is that pdfs are created with the unwritten assumption that your data (and hence, winning and losing) is analogous to a random variable, which is much harder to back up than any concerns about normality. I think that this is probably the reason for the large stdev and small differences seen in the data - because as time goes on, playstyles evolve, so we aren't looking at one set of distributions, we're looking at many sets of distributions which change over time as playstyles evolve and devolve.

The high SD values for lower means were surprising for me too. Typically you'd expect it to be the other way around. I would be cautious of making any real conclusions about that though...

On July 14 2012 03:01 Treehead wrote:
For example, I'm guessing 1-1-1 is still reasonably effective in master's TvP these days. Maybe next month, though, some protoss badass comes out with a build that doesn't just beat it - it CRUSHES the 1-1-1 and puts you in a good spot against other builds as well. This might show in our data as a downswing in Terran MMR, but really what's happening is a metagame shift. The pdf for MMRs of TvPers doing 1-1-1 and the pdf for MMRs of TvPers doing other builds are almost assuredly different - especially when our new TvP strat is... new. Maybe I'm wrong, but this example was a hypothetical anyway. Point is - builds are still changing quite a bit.

You've left the scope and purpose of this study, so I'm not sure if I should answer that.
skeldark
Profile Joined April 2010
Germany, 2223 Posts
Last Edited: 2012-07-13 18:24:38
July 13 2012 18:21 GMT
#451
skeldark - the number of profiles you'd want depends on the size of the confidence interval you want at a certain mean. If you wanted to make these calculations, you'd need to use Excel's Solver plugin to work back from interval size to sample size. Alternatively, you could guess and check to approximate it.

The day I install Excel is the day I buy a Mac, quit programming and never look in the mirror again...

I will wait and split the data into timelines in the near future; if it works out, I'll just go on from there.
The problem is, I gain new users and lose old ones, so my data income is not as stable as I would wish.


Save gaming: kill esport
Treehead
Profile Blog Joined November 2010
999 Posts
Last Edited: 2012-07-13 19:22:03
July 13 2012 19:21 GMT
#452
On July 14 2012 03:14 lolcanoe wrote:

The high SD values for lower means were surprising for me too. Typically you'd expect it to be the other way around. I would be cautious of making any real conclusions about that though...

...

You've left the scope and purpose of this study, so I'm not sure if I should answer that.


Of course I'll be cautious. When confidence cannot accurately be assessed, people tend to be overconfident when the idea is their own and overcritical when it isn't. I'd be foolish to ignore that and proceed as though I were right about my "multiple distributions" theory.

If I were right though, it wouldn't be statistically provable without knowing more about each data point and qualitatively sorting different types of games into different categories, which a person couldn't really do for thousands of games without a lot more involved. You could try to place the games in some kind of pockets based on what info is known (such as time) and perform some kind of goodness-of-fit analysis, but fitness and disparity never prove a theory; they only show that the data is what a theory would expect, which is less than useful. When something is not statistically provable, then, it must remain as theory. You have to admit, though, that the idea of varying MMR pdfs for varying builds in varying matchups is at least qualitatively plausible, I hope.

The paragraph you mention that has "left the scope of the study" was just a random example illustrating my theory. Don't read more into it than that.
cndaks
Profile Joined June 2012
United States, 95 Posts
July 14 2012 02:23 GMT
#453
Nice Job in taking the time to do so and informing all of us!~
xelnaga_empire
Profile Joined March 2012
627 Posts
July 15 2012 04:31 GMT
#454
This data shows Blizzard needs to buff Terran to bring back balance to the game. I hope somebody at Blizzard looks at this data because they need to realize the game has balance issues at this moment.
themell
Profile Joined February 2011
43 Posts
July 15 2012 07:27 GMT
#455
Is it possible to see what average time it takes for a race to win?

For example, if TvZ win ratio in the early game is 50%, then we can say the early game is fair. But then we can see TvZ in late game is 20% win rate for Terran, then we can say Terrans are having difficulty in the late game.
Crashburn
Profile Blog Joined October 2010
United States, 476 Posts
July 15 2012 07:29 GMT
#456
@ xelnaga_empire

ಠ_ಠ

skeldark
Profile Joined April 2010
Germany, 2223 Posts
July 15 2012 07:33 GMT
#457
On July 15 2012 16:27 themell wrote:
Is it possible to see what average time it takes for a race to win?

For example, if TvZ win ratio in the early game is 50%, then we can say the early game is fair. But then we can see TvZ in late game is 20% win rate for Terran, then we can say Terrans are having difficulty in the late game.

Yes,
even far more accurately.
I don't have time at the moment, but the data is there.
Save gaming: kill esport
skeldark
Profile Joined April 2010
Germany, 2223 Posts
Last Edited: 2012-07-15 11:56:02
July 15 2012 11:53 GMT
#458
Updated the results with a lot of stats:

Result


Source Main Data

- The data is biased towards EU/US and towards higher skill-rate.

Gamescount: 125976
Sc2-Accounts: 45203

- worst to best player: 3200 MMR
- one average win/loss on ladder: +16 / -16 MMR

TIME Filter: only between 1 Jan 1970 00:00:00 GMT - 12 Jul 2012 16:52:47 GMT


Average MMR per Race
Race account count: 15814
Data average MMR: 1539.46

Difference in average MMR per Matchup:
T-P: -62.14
T-Z: -117.03
P-Z: -54.89




Average Win-ratio per Race


TvP 50.43 Games: 6700
TvZ 46.7 Games: 8118
PvZ 51.61 Games: 9189
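
To get a feel for how much these win ratios could move by chance alone, here is a rough sketch of a normal-approximation interval for each matchup. It assumes every game is an independent coin flip with a fixed win probability, which is certainly not quite true (the same players appear many times), so the intervals are best read as optimistic lower bounds on the real uncertainty.

```python
import math
from statistics import NormalDist

def win_rate_interval(win_pct, games, level=0.95):
    """Normal-approximation interval for a win rate given in percent."""
    p = win_pct / 100
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half = z * math.sqrt(p * (1 - p) / games)
    return 100 * (p - half), 100 * (p + half)

# Win ratio of the first-named race and number of games, as listed above.
matchups = {"TvP": (50.43, 6700), "TvZ": (46.7, 8118), "PvZ": (51.61, 9189)}
intervals = {name: win_rate_interval(*row) for name, row in matchups.items()}
for name, (low, high) in intervals.items():
    print(f"{name}: {low:.2f} to {high:.2f}")
```

Under these assumptions the TvP interval straddles 50%, while the TvZ and PvZ intervals do not.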



Win-ratio per Race over Game-Time

TvP

gamelength,%race1 win,%race2win, %of games
0,44.9,55.1,3.66
5,40.71,59.29,13.9
10,58.32,41.68,24.21
15,59.7,40.3,24.78
20,45.72,54.28,18.31
25,37.79,62.21,9.16
30,35.04,64.96,3.49
35,46.71,53.29,2.49

TvZ
gamelength,%race1 win,%race2win, %of games
0,37.13,62.87,3.78
5,33.78,66.22,9.15
10,46.91,53.09,15.96
15,52.51,47.49,22.12
20,47.88,52.12,22.9
25,44.36,55.64,14.3
30,50.0,50.0,6.65
35,48.08,51.92,5.12

PvZ

gamelength,%race1 win,%race2win, %of games
0,47.38,52.62,4.57
5,38.3,61.7,11.39
10,59.72,40.28,25.07
15,50.17,49.83,25.36
20,49.97,50.03,17.34
25,53.21,46.79,9.14
30,51.0,49.0,4.37
35,58.89,41.11,2.75
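
As a sanity check on the tables above, the per-bucket win rates can be weighted by each bucket's share of games to recover an overall win rate. The sketch below copies the TvP and TvZ rows by hand; the function name is illustrative, and the shares are taken at face value as percentages of all games in the matchup.

```python
# (bucket start, % race1 wins, % of games), copied from the TvP and TvZ tables above.
tvp = [(0, 44.9, 3.66), (5, 40.71, 13.9), (10, 58.32, 24.21), (15, 59.7, 24.78),
       (20, 45.72, 18.31), (25, 37.79, 9.16), (30, 35.04, 3.49), (35, 46.71, 2.49)]
tvz = [(0, 37.13, 3.78), (5, 33.78, 9.15), (10, 46.91, 15.96), (15, 52.51, 22.12),
       (20, 47.88, 22.9), (25, 44.36, 14.3), (30, 50.0, 6.65), (35, 48.08, 5.12)]

def overall_win_rate(rows):
    """Weight each bucket's win rate by its share of games."""
    total_share = sum(share for _, _, share in rows)
    return sum(win * share for _, win, share in rows) / total_share

tvp_overall = overall_win_rate(tvp)
tvz_overall = overall_win_rate(tvz)
print(round(tvp_overall, 2), round(tvz_overall, 2))
```

Both values should come out close to the overall 50.43 and 46.7 reported in the win-ratio section, which suggests the time buckets and the totals were computed from the same set of games.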
Save gaming: kill esport
Methy
Profile Joined November 2010
United Kingdom, 74 Posts
July 15 2012 13:13 GMT
#459
This is fantastic work, well done.

I'd just like to make the (obvious) point that the concept of an 'instantaneous balance' is a bad one that should be ignored. As skeldark has said many times, one of the ways to detect imbalance is to track the MMR of the player base over time - I'd argue that this is the only reasonable way to do it. A sufficiently large sample of games determined in a small time period is rather meaningless for the 'balance' of a game, especially due to the competitive nature and the way balance is completely tied to perception.

To give an example, if you had built a sample of games in the month following the NASL season 1 final, you probably would've seen an 'imbalance' in TvP - players of those races that had equal MMRs before Puma unveiled 1/1/1 would not have a 50% winrate once 1/1/1 became common. As such there would be a short term spike in TvP winrates, and the Protoss average MMR would drop until this winrate normalised to some extent. This would produce a corresponding rise in PvZ winrates as Protoss players are getting matched against zergs with a lower MMR than they're used to facing and nothing significant has changed in the matchup.

As such, a development in the TvP matchup influences PvZ winrates, and this happens fairly consistently at all MMR ranges (with the possible exception of the bottom end of the MMR range). The only way you can distinguish the development of 1/1/1 from 'imbalance' in PvZ is by monitoring the MMRs over a sufficiently long time.
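
To put rough numbers on that knock-on effect, here is a minimal sketch. It assumes an Elo-style logistic win curve with a 400-point scale, which is purely my assumption for illustration; the ladder's real MMR formula is not given in this thread. The baseline of 1500 and the function names are hypothetical as well. The idea: a Protoss whose MMR has been pushed down by losses to 1/1/1 gets matched against Zergs at that lower MMR, while their actual PvZ skill is unchanged.

```python
def expected_win_rate(skill_a, skill_b, scale=400.0):
    """Elo-style expected score for player A against player B (assumed model)."""
    return 1.0 / (1.0 + 10 ** ((skill_b - skill_a) / scale))


def pvz_win_rate_after_deflation(deflation, true_skill=1500.0):
    """Expected PvZ win rate for a Protoss whose MMR sits `deflation` points below
    their true PvZ skill, matched against a Zerg of equal displayed MMR."""
    displayed_mmr = true_skill - deflation  # MMR after the TvP losses
    zerg_skill = displayed_mmr  # Zerg MMR is assumed to still reflect actual skill
    return expected_win_rate(true_skill, zerg_skill)


for points in (0, 25, 50, 100):
    print(points, round(100 * pvz_win_rate_after_deflation(points), 1))
```

Under these assumptions any deflation at all pushes the expected PvZ win rate above 50% even though nothing changed in PvZ itself, which is the effect described above.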

Furthermore, does this mean the game is 'imbalanced'? Not even remotely. 1/1/1 was eventually solved without significant patching (immortal range is the only really important change), but before the solution was found no one could claim to know a solution would be found, so how could we comment on balance? Well, we couldn't at the time... we needed to let games be played over a long enough period. Only then, after months and months of 1/1/1 dominance, could we possibly conclude that that particular 'strategy' was overpowered.

But the crucial point is that this works in the other direction as well. Let's assume that all 3 races have a player base with identical MMR distributions and all matchups have a 50-50 winrate. This doesn't mean the game is 'balanced' - someone might think up a strategy that causes one race to gain a significant advantage and is never overcome. Thus to determine 'balance' we need to be analysing a period of years, not months - something we are now easily able to monitor thanks to skeldark's efforts.

But the main point I'm trying to make here is that balance is actually largely tied to perception and nothing more. The root of the problem lies in the fact that we're using one word to describe multiple concepts. If we say 'players of equal skill should get to where they are regardless of race choice,' we are being utterly foolish. What is meant by skill? Sheer mechanical speed? Strategising ability? On-the-fly decision making? There are so many factors in what constitutes 'skill' that you can't possibly settle on one universal definition.

In fact I'd like to explicitly make the point that it is a BAD thing if a player gets to exactly the same MMR with all three races - this is a sign of a one-dimensional game. I am a person possessed of certain abilities - those abilities happen to align with the skillset required by one particular race more than the others - hence I play that race, and accept that if I switch race I will not perform as well.

If we then ignore people using balance whine as a crutch to justify their own poor performance, we can only begin to talk about balance 'at the highest levels of the game.' The beauty of the game lies in the fact that 'balance' is inseparable from the 'distribution' of human abilities. If we genuinely cared about the game being balanced, we would have to care about 'the best possible player of StarCraft 2' - which would undoubtedly be a computer AI possessed of unlimited APM that we don't quite have the ability to code yet. All we truly care about is A) the perception that over a sufficient period of time all three of the races perform 'equally well' at the highest level of human ability (i.e. tournaments) and B) that active innovation is occurring.

I realise I've ranted on for quite some time and I must apologise, but +10 points if you managed to read this entire post.
<3 Nony
Methy
Profile Joined November 2010
United Kingdom74 Posts
Last Edited: 2012-07-15 13:24:05
July 15 2012 13:16 GMT
#460
I'd actually just like to follow up with a far more simple single statement that I believe cuts right to the point:

If you do not believe that the 'overpowered and imbalanced' race is the one you are playing then you've chosen the wrong race - balance is a function of the skill set required by a particular race matched to the corresponding distribution of skills in the human population. Your race should always feel like the 'easiest race' for *any* player at *any* skill level, or your abilities simply do not match up with those required by the race you've chosen. As such, the best way to determine 'balance' is actually just to look at the percentage of players in each race over a long period of time. As long as there are equal numbers of each race in any given bracket, you can flat out conclude the game is 'balanced' in the only meaningful sense of the word.
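
For anyone who wants to try that check on real data, a minimal sketch follows. The player list is made up purely for illustration, and the bracket names, function names and the 5-point tolerance for 'equal numbers' are all my own assumptions. It only looks at one snapshot, so the 'over a long period of time' part would mean repeating it across many snapshots.

```python
from collections import Counter, defaultdict

# Hypothetical sample: one (MMR bracket, race) pair per player.
players = [
    ("low", "T"), ("low", "T"), ("low", "P"), ("low", "P"), ("low", "Z"), ("low", "Z"),
    ("high", "T"), ("high", "T"), ("high", "T"), ("high", "P"), ("high", "Z"), ("high", "Z"),
]


def race_shares(players):
    """Return {bracket: {race: fraction of that bracket's players}}."""
    counts = defaultdict(Counter)
    for bracket, race in players:
        counts[bracket][race] += 1
    return {
        bracket: {race: n / sum(by_race.values()) for race, n in by_race.items()}
        for bracket, by_race in counts.items()
    }


def looks_balanced(shares, races=("T", "P", "Z"), tolerance=0.05):
    """True if every race is within `tolerance` of an equal share in every bracket."""
    target = 1.0 / len(races)
    return all(
        abs(by_race.get(race, 0.0) - target) <= tolerance
        for by_race in shares.values()
        for race in races
    )


shares = race_shares(players)
print(shares)
print(looks_balanced(shares))
```

In this made-up sample the low bracket is an even three-way split while the high bracket is half Terran, so the check fails overall.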
<3 Nony
Original banner artwork: Jim Warren
The contents of this webpage are copyright © 2026 TLnet. All Rights Reserved.