Ladder-Balance-Data - Page 23

VediVeci
Profile Joined October 2011
United States82 Posts
July 13 2012 09:11 GMT
#441
On July 13 2012 13:40 lolcanoe wrote:
On July 13 2012 13:03 VediVeci wrote:

I'm not arguing that your methods aren't better; they probably are (I didn't read your post very closely). Your attacks have been pretty consistently derisive, rude, and especially condescending, though, in my opinion. And I know it's not a smoking gun, but his results seem pretty consistent with yours, so he didn't do too poorly.
Edit: clarity

He had at least a 50% chance of getting it right. I'm going to ignore the rest of the post so as not to encourage further irrelevance from posters who self-admittedly don't read things carefully.




That's the sort of stuff I'm talking about. Whether or not I gave the math in your post a thorough reading was irrelevant to mine, because I have been reading through almost everything else you've posted.

You talk down to everybody, and at least 3 people have called you out on it so far. Constructive criticism is great, but don't be so damn rude about it. This was a pretty respectful discussion, no need to be so vituperative.
Alexj
Profile Blog Joined July 2010
Ukraine440 Posts
Last Edited: 2012-07-13 11:35:04
July 13 2012 11:27 GMT
#442
On July 11 2012 04:01 skeldark wrote:
There is one other method that you can use to show trends:
you look at the change of MMR of a race over time!
Do players of race Z lose MMR? Do players of race X win MMR? This will happen after a patch. But perhaps it's not imbalance; perhaps it corrects an imbalance that was there from the beginning.

I would say this is the only way your data could become useful. Right now you have aggregated some MMR stats over 42 years (just kidding, I understand it is a few months, but still quite some time). There might have been a few metagame shifts and patch changes over that time, but your data doesn't reflect them, since some of the calculated MMR values can be 3 months old while others are from last week. If you only counted the MMR values calculated in the last week, it would in fact be actual balance data, and not something averaged out over many months. And if you could do it periodically, you would be able to show trends and shifts. You would also generate a lot of discussion (and by that I mean new waves of balance whine).

Edit: also, I am not sure if EU MMR and NA MMR have the same weight. These are two groups of accounts that never play with each other. I keep facepalming at sc2ranks, who also assume that points on different ladders have the same value. At least your data doesn't mix in the KR server, which would completely break everything.
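
A minimal sketch of the weekly re-aggregation described above, in Python with pandas; the file name and column names are assumptions for illustration, not the actual format of skeldark's data:

```python
import pandas as pd

# Hypothetical export: one row per calculated MMR value, with the date it was
# calculated, the player's race, and the MMR estimate.
df = pd.read_csv("mmr_estimates.csv", parse_dates=["calculated_at"])

# Option 1: keep only estimates from the last 7 days, so stale values from
# months ago do not dilute the current average.
cutoff = df["calculated_at"].max() - pd.Timedelta(days=7)
print(df[df["calculated_at"] >= cutoff].groupby("race")["mmr"].mean())

# Option 2: a full weekly timeline of average MMR per race, which is what
# would make trends and patch effects visible over time.
weekly = (
    df.groupby([pd.Grouper(key="calculated_at", freq="W"), "race"])["mmr"]
      .agg(["mean", "count"])
)
print(weekly)
```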
More GGs, more skill
Jadoreoov
Profile Joined December 2009
United States76 Posts
July 13 2012 12:24 GMT
#443
On July 13 2012 17:31 Thrombozyt wrote:
On July 13 2012 10:23 Jadoreoov wrote:
First off I'd like to point out that the normality of the data doesn't really matter because of the Central Limit Theorem, so please stop discussing that like it matters.

Continuing with lolcanoe's analysis, I found the 99% confidence intervals for the difference in mean for each group.

US and EU:
ZvT
(51.5, 118.8)
PvT
(28.9, 99.6)
ZvP
(-11.1, 53.2)


On July 13 2012 11:20 Jadoreoov wrote:
Done:

95% confidence intervals for the EU and US combined:
ZvT:
(59.5, 110.7)
PvT
(37.3, 91.2)
ZvP
(-3.7, 45.5)

US vs EU
(28.5, 70.5)


Shouldn't the interval in which the mean can fall become larger as you lower your level of confidence?


No, the 95% confidence interval should be smaller.

It is similar to if someone asked you to guess a number between 0 and 100.
If you guessed that it was exactly 50, you wouldn't be very confident (a narrow interval, low confidence).
If you guessed that it was between 1 and 99, you would be pretty confident that you'd be correct (a wide interval, high confidence).

In each calculation the data itself gives us the same amount of uncertainty, so to be more confident in our interval we have to include a greater range of values.
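
To make the comparison concrete, here is a minimal Python sketch using the Zerg and Terran means and SDs quoted elsewhere in the thread; the per-race sample size n is a placeholder, since the interval width also depends on how many accounts are in each group:

```python
import numpy as np
from scipy import stats

def diff_ci(mean1, sd1, n1, mean2, sd2, n2, confidence):
    """Normal-approximation confidence interval for the difference of two means."""
    diff = mean1 - mean2
    se = np.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)   # standard error of the difference
    z = stats.norm.ppf(0.5 + confidence / 2)      # two-sided critical value
    return diff - z * se, diff + z * se

n = 5000  # hypothetical number of accounts per race
for conf in (0.95, 0.99):
    lo, hi = diff_ci(1672.13, 495.31, n, 1559.21, 546.13, n, conf)
    print(f"Zerg - Terran mean MMR, {conf:.0%} CI: ({lo:.1f}, {hi:.1f})")
# The 99% interval comes out wider than the 95% one, as explained above.
```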
skeldark
Profile Joined April 2010
Germany2223 Posts
July 13 2012 13:31 GMT
#444
On July 13 2012 20:27 Alexj wrote:
On July 11 2012 04:01 skeldark wrote:
There is one other method that you can use to show trends:
you look at the change of MMR of a race over time!
Do players of race Z lose MMR? Do players of race X win MMR? This will happen after a patch. But perhaps it's not imbalance; perhaps it corrects an imbalance that was there from the beginning.

I would say this is the only way your data could become useful. Right now you have aggregated some MMR stats over 42 years (just kidding, I understand it is a few months, but still quite some time). There might have been a few metagame shifts and patch changes over that time, but your data doesn't reflect them, since some of the calculated MMR values can be 3 months old while others are from last week. If you only counted the MMR values calculated in the last week, it would in fact be actual balance data, and not something averaged out over many months. And if you could do it periodically, you would be able to show trends and shifts. You would also generate a lot of discussion (and by that I mean new waves of balance whine).

Edit: also, I am not sure if EU MMR and NA MMR have the same weight. These are two groups of accounts that never play with each other. I keep facepalming at sc2ranks, who also assume that points on different ladders have the same value. At least your data doesn't mix in the KR server, which would completely break everything.


EU and NA MMR are very close to each other.
I have users who have both EU and US accounts, and they have very similar MMR on both.

The data is from 3 weeks, not more.


Save gaming: kill esport
lolcanoe
Profile Joined July 2010
United States57 Posts
Last Edited: 2012-07-13 14:33:23
July 13 2012 13:46 GMT
#445
On July 13 2012 14:34 Cascade wrote:
Ok, let me prove it for you then.
My claim is that if the set of samples is large enough, we can use the normal distribution with S/sqrt(N) width to estimate the errors. For simplicity, let me prove that the 2*S/sqrt(N) interval is close to 95%:

Let the distribution f(x) have an average 0 and standard deviation S. An average X from a sufficiently large (specified in the proof) set of N samples from f(x) will fall within 2*S/sqrt(N) of the average 0 with a probability between 0.93 and 0.97.
proof:
Calculating the average x from N samples (from many different sets, each of N samples) will give a distribution of averages A_N(x) that approaches a normal distribution as N goes to infinity, centred around 0, and with a width of S/sqrt(N). This is the CLT.

Specify "sufficiently large N" such that A_N(x) is similar to a normal distribution g(x) of width S/sqrt(N). Close enough so that the integral from -2*S/sqrt(N) to 2*S/sqrt(N) is between 0.97 and 0.93 (it is close to 0.95 for g). As A_N approaches g as N-->infty, this will happen for some N. The more similar f(x) is to a normal distribution, the lower N is required.

Now take a single average X from f(x), using N samples (this would be the OP). This average is distributed according to A_N(x), and with a sufficiently large N, the probability that X is between -2*S/sqrt(N) and 2*S/sqrt(N) is larger than 0.93, and smaller than 0.97. QED.

No, reread part A. The claim that the sample distribution approaches normality only applies when the population data itself is normal. This is extraordinarily intuitive as you watch your sample size approach the entire population. In your claim here, you used a standard (normal) distribution around a known average to describe a population. In our data, we do not know if SDs can be applied to the population, as the SDs we are calculating are really only accurate for Gaussian distributions.

It is a common misapplication of the CLT to state that a sample size of 30 guarantees approximate normality. That rule of thumb tends to hold only because populations tend to be normally distributed. To be mathematically precise, the correct statement is that with a sufficient number of samples, each of size at least ~30, the distribution of the means of these samples will begin approaching normality, with only slight regard to the original distribution.

The normality test is essential when running the two-sided t-test if you want to be thorough when dealing with an unknown population distribution. The textbook, wiki, and other websites have confirmed it. I do not understand why this question persists.

Edit: I should further add that the tendency of sample means (and of samples themselves in normal populations) to approach normality only occurs when the sample is RANDOMLY procured. In this case it is clearly NOT random (we have different population means vs. sample means), so the normality test is ABSOLUTELY a reasonable thing to be concerned about.
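
A small simulation of the general point being argued here (illustrative only: it draws from a deliberately skewed toy distribution, not from the ladder data):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# A clearly non-normal "population" (exponential, skewness = 2).
population = rng.exponential(scale=1.0, size=1_000_000)

for n in (5, 30, 500):
    # 10,000 independent *random* samples of size n, each reduced to its mean.
    means = rng.choice(population, size=(10_000, n)).mean(axis=1)
    print(f"n = {n:3d}  skewness of the sample means: {stats.skew(means):+.3f}")

# The skewness of the distribution of means shrinks toward 0 (normal-like) as n
# grows, but the caveat in the post stands: this only holds when each sample is
# drawn at random from the population of interest.
```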
Treehead
Profile Blog Joined November 2010
999 Posts
Last Edited: 2012-07-13 14:37:32
July 13 2012 14:33 GMT
#446
To me, what the result here indicates is the opposite of what a lay person would think from reading the post.

I'd read your results as "this means that Terran players in general have a lower MMR". But based on your data:

Analysis

"Terran Average MMR, STD
1559.214909, 546.131097

Protoss Average MMR, STD
1620.764863, 509.5809733

Zerg Average MMR, STD
1672.129547, 495.3121321"

What the above seems to imply is that, although the average Terran player included in the study has a lower MMR, as you go higher up the distribution, MMR seems to be higher for Terran than for the other races. In particular, Mean + 2*STDev (the cutoff for the top 5% of a normal distribution) is:

T 2651.47
P 2639.92
Z 2662.75

This gives us much different-looking results. As we strive to study arbitrarily good players (as player skill increases over time), I would think we'd want to look more heavily at the implications of Terran's higher STDEV.
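
For reference, a quick check of these cutoffs under the normality assumption (note that mean + 2*SD is roughly the 97.7th percentile of a normal distribution, while the exact one-sided top-5% cutoff uses z of about 1.645):

```python
from scipy.stats import norm

# Means and SDs quoted above (Terran, Protoss, Zerg).
races = {"T": (1559.21, 546.13), "P": (1620.76, 509.58), "Z": (1672.13, 495.31)}

for race, (mean, sd) in races.items():
    plus_two_sd = mean + 2 * sd                      # cutoff used in the post (~97.7th percentile)
    top_5_pct = norm.ppf(0.95, loc=mean, scale=sd)   # exact 95th percentile
    print(f"{race}: mean + 2*SD = {plus_two_sd:.2f}, top-5% cutoff = {top_5_pct:.2f}")
```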

A question

Are you sure you can assume normality here? How well do your distributions fit a normal distribution having the same mean and St dev? The reason I ask, of course, is that a T-test can only be used meaningfully on normal distributions.

If normality doesn't fit so well, I'd recommend M. A. Stephens' article on k-sample Anderson-Darling tests, which use ranking and therefore need only continuity as an assumption to move forward.

Edit: Link to the test I'm referring to: http://www.cithep.caltech.edu/~fcp/statistics/hypothesisTest/PoissonConsistency/ScholzStephens1987.pdf.
lolcanoe
Profile Joined July 2010
United States57 Posts
Last Edited: 2012-07-13 15:00:37
July 13 2012 14:59 GMT
#447
On July 13 2012 23:33 Treehead wrote:

Analysis

"Terran Average MMR, STD
1559.214909, 546.131097

Protoss Average MMR, STD
1620.764863, 509.5809733

Zerg Average MMR, STD
1672.129547, 495.3121321"


Are you sure you can assume normality here? How well do your distributions fit a normal distribution having the same mean and St dev? The reason I ask, of course, is that a T-test can only be used meaningfully on normal distributions.

If normality doesn't fit so well, I'd recommend M. A. Stephens' article on k-sample Anderson-Darling tests, which use ranking and therefore need only continuity as an assumption to move forward.


I'd really suggest reading my post again, as it already includes the Anderson-Darling test! See the probability plot curve and the associated p-value, which was computed using the Anderson-Darling test in Minitab. Anyway, let me be a little more precise about what you are saying and address the points one at a time. Can we assume normality? No. However, in this case the Anderson-Darling test result is inconclusive. Keep in mind, Anderson-Darling tends to be OVERLY powerful with large sample sizes. Your best bet is actually looking at the fitted histogram to judge approximate normality yourself! To me, given the hugely significant p-values far under .01 and no strong evidence of non-normality, I'd say that we can put the majority of these concerns to rest.

Now what is more interesting is that we have massive standard deviations and relatively low actual differences. The two-sample t-test only tests whether or not the sample means are EXACTLY equal - the magnitude of the difference should not be inferred directly from the p-value, but rather through observation. For instance, with two samples each of size 1 billion, even a negligible actual MMR difference would result in very low p-values. It has to be up to the interpreter to decide whether the maximum 7% difference between T and Z is effectively significant (and not just statistically significant).
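
A quick simulated illustration of this sample-size effect (toy data with a similar SD, not the actual ladder dataset):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Two simulated groups whose true means differ by only 5 MMR (less than a third
# of a single ladder game at +/-16 MMR), with SDs similar to those reported.
a = rng.normal(loc=1600.0, scale=500.0, size=1_000_000)
b = rng.normal(loc=1605.0, scale=500.0, size=1_000_000)

result = stats.ttest_ind(a, b, equal_var=False)
print(f"observed difference: {b.mean() - a.mean():.2f} MMR, p-value: {result.pvalue:.2e}")
# The p-value is essentially zero even though the difference is practically
# negligible, which is why the magnitude has to be judged separately.
```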

I hope that addresses your concerns.


skeldark
Profile Joined April 2010
Germany2223 Posts
Last Edited: 2012-07-13 16:43:40
July 13 2012 16:42 GMT
#448
OK, I have a hard question for you guys.

If I want to publish the average MMR of the data as a timeline,
what is the minimum number of profiles to still be accurate?
Can someone test a weekly / monthly update?





Save gaming: kill esport
Treehead
Profile Blog Joined November 2010
999 Posts
Last Edited: 2012-07-13 18:06:50
July 13 2012 18:01 GMT
#449
On July 13 2012 23:59 lolcanoe wrote:
On July 13 2012 23:33 Treehead wrote:

Analysis

"Terran Average MMR, STD
1559.214909, 546.131097

Protoss Average MMR, STD
1620.764863, 509.5809733

Zerg Average MMR, STD
1672.129547, 495.3121321"


Are you sure you can assume normality here? How well do your distributions fit a normal distribution having the same mean and St dev? The reason I ask, of course, is that a T-test can only be used meaningfully on normal distributions.

If normality doesn't fit so well, I'd recommend M. A. Stephens' article on k-sample Anderson-Darling tests, which use ranking and therefore need only continuity as an assumption to move forward.


I'd really suggest reading my post again, as it already includes the Anderson-Darling test! See the probability plot curve and the associated p-value, which was computed using the Anderson-Darling test in Minitab. Anyway, let me be a little more precise about what you are saying and address the points one at a time. Can we assume normality? No. However, in this case the Anderson-Darling test result is inconclusive. Keep in mind, Anderson-Darling tends to be OVERLY powerful with large sample sizes. Your best bet is actually looking at the fitted histogram to judge approximate normality yourself! To me, given the hugely significant p-values far under .01 and no strong evidence of non-normality, I'd say that we can put the majority of these concerns to rest.

Now what is more interesting is that we have massive standard deviations and relatively low actual differences. The two-sample t-test only tests whether or not the sample means are EXACTLY equal - the magnitude of the difference should not be inferred directly from the p-value, but rather through observation. For instance, with two samples each of size 1 billion, even a negligible actual MMR difference would result in very low p-values. It has to be up to the interpreter to decide whether the maximum 7% difference between T and Z is effectively significant (and not just statistically significant).

I hope that addresses your concerns.




My bad - you already did some of the work I suggested. Honestly, I didn't read most of the thread terribly closely except the OP, which I read over a couple times to make sure he hadn't posted anything definitive on this.

Here's the thing though. Maybe you'll get better p-values to convince ourselves of normality. But maybe you won't. 0.05-0.1 isn't bad, and if the T-test returns as good a result as stated in the OP, I doubt you'll get worse than .05 on the Anderson-Darling test if the thing is anywhere close to normal. My suggestion (which can be ignored without any hard feelings) is that if we want this to be clear of scrutiny, we can remove normality concerns by just using Anderson-Darling to compare the races to begin with, instead of saying something like "well, you can almost reject the null at a significance value of 0.05 - so hopefully the reader is convinced..." when you can just skip that part. My suspicion is that A-D results will be just as low anyway - but in a serious study (which this doesn't have to be), you'd want to post those values, and not the T-test ones, because there's likely no downside to doing so.
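
A minimal sketch of that direct k-sample comparison, assuming the per-race MMR values were exported as plain arrays (SciPy's anderson_ksamp implements the Scholz-Stephens test; note that the significance level it reports is approximate and, in most SciPy versions, clipped to roughly the 0.001-0.25 range):

```python
import numpy as np
from scipy import stats

# Hypothetical per-race MMR exports from the dataset, one value per account.
terran = np.loadtxt("terran_mmr.txt")
protoss = np.loadtxt("protoss_mmr.txt")
zerg = np.loadtxt("zerg_mmr.txt")

# k-sample Anderson-Darling test: are the three samples drawn from the same
# (unspecified, continuous) distribution? No normality assumption is needed.
result = stats.anderson_ksamp([terran, protoss, zerg])
print("A-D statistic:", result.statistic)
print("critical values:", result.critical_values)
print("approximate significance level:", result.significance_level)
```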

I completely agree with your assertion that the differences are rather low compared to the mean and stdev values. I wish this were more clearly reflected in the OP - as it would be easier to interpret for someone with a limited numerical background.

And of course, the predominant concern I always have with using statistics to begin with is that pdfs are created with the unwritten assumption that your data (and hence, winning and losing) is analogous to a random variable, which is much harder to back up than any concerns about normality. I think that this is probably the reason for the large stdev and small differences seen in the data - because as time goes on, playstyles evolve, so we aren't looking at one set of distributions, we're looking at many sets of distributions which change over time as playstyles evolve and devolve.

For example, I'm guessing 1-1-1 is still reasonably effective in master's TvP these days. Maybe next month, though, some protoss badass comes out with a build that doesn't just beat it - it CRUSHES the 1-1-1 and puts you in a good spot against other builds as well. This might show in our data as a downswing in Terran MMR, but really what's happening is a metagame shift. The pdf for MMRs of TvPers doing 1-1-1 and the pdf for MMRs of TvPers doing other builds are almost assuredly different - especially when our new TvP strat is... new. Maybe I'm wrong, but this example was a hypothetical anyway. Point is - builds are still changing quite a bit, and combining pdfs always gives us weird looking data.

Edit: I don't mean to be dismissive here. The work done is really great (and far better than other stats workups I've seen on these boards); it deserves credit and it does have some meaning to it. I only include this in the discussion above for the sake of good bookkeeping on assumptions.

Also, if more data continues to be gathered, maybe enough will be obtained to use the data as a time series (which it is), rather than as a sample. Just some thoughts. Keep up the good analysis, though. I liked reading all this. Good to see some other quanty nerds in here.
lolcanoe
Profile Joined July 2010
United States57 Posts
Last Edited: 2012-07-13 18:15:53
July 13 2012 18:14 GMT
#450
Skeldark - the number of profiles you'd want depends on the size of the confidence interval you want around a given mean. If you wanted to make these calculations, you'd need to use Excel's Solver plugin to work back from interval size to sample size. Alternatively, you could guess and check to approximate it.
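
A closed-form alternative to the Solver approach, assuming the sampling distribution of the average is roughly normal: the half-width of the interval is z * s / sqrt(n), so n = (z * s / half-width)^2. A sketch:

```python
import math
from scipy import stats

def required_profiles(sd, half_width, confidence=0.95):
    """Profiles needed so the CI on an average MMR has the requested half-width."""
    z = stats.norm.ppf(0.5 + confidence / 2)
    return math.ceil((z * sd / half_width) ** 2)

# With per-race SDs around 500 MMR, as in the figures quoted above:
print(required_profiles(sd=500, half_width=10))   # ~9604 profiles per race and period
print(required_profiles(sd=500, half_width=25))   # ~1537
```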

On July 14 2012 03:01 Treehead wrote:
My suggestion (which can be ignored without any hard feelings) is that if we want this to be clear of scrutiny, we can remove normality concerns by just using Anderson-Darling to compare the races to begin with, instead of saying something like "well, you can almost reject the null at a significance value of 0.05 - so hopefully the reader is convinced..." when you can just skip that part. My suspicion is that A-D results will be just as low anyway - but in a serious study (which this doesn't have to be), you'd want to post those values, and not the T-test ones, because there's likely no downside to doing so.

My experience is that the A-D test is actually not as common as you think, especially given its tremendous sensitivity at high sample sizes. It's much more common to show a fitted histogram, as I've done, to show that approximate normality is fulfilled.

The purpose here is simply to show that the SDs are relevant calculations. If 1 SD covers 68% of the normalized data but in actuality 72% of the real data, it's not a terrible problem when you're making observations 3 SDs down the line, as the majority of your error is going to be somewhat centralized.

On July 14 2012 03:01 Treehead wrote:
I completely agree with your assertion that the differences are rather low compared to the mean and stdev values. I wish this were more clearly reflected in the OP - as it would be easier to interpret for someone with a limited numerical background.

Yes. But defining "effectively significant" here is difficult.

On July 14 2012 03:01 Treehead wrote:
And of course, the predominant concern I always have with using statistics to begin with is that pdfs are created with the unwritten assumption that your data (and hence, winning and losing) is analogous to a random variable, which is much harder to back up than any concerns about normality. I think that this is probably the reason for the large stdev and small differences seen in the data - because as time goes on, playstyles evolve, so we aren't looking at one set of distributions, we're looking at many sets of distributions which change over time as playstyles evolve and devolve.

The high SD values for the lower means were surprising to me too. Typically you'd expect it to be the other way around. I would be cautious about drawing any real conclusions from that, though...

On July 14 2012 03:01 Treehead wrote:
For example, I'm guessing 1-1-1 is still reasonably effective in master's TvP these days. Maybe next month, though, some protoss badass comes out with a build that doesn't just beat it - it CRUSHES the 1-1-1 and puts you in a good spot against other builds as well. This might show in our data as a downswing in Terran MMR, but really what's happening is a metagame shift. The pdf for MMRs of TvPers doing 1-1-1 and the pdf for MMRs of TvPers doing other builds are almost assuredly different - especially when our new TvP strat is... new. Maybe I'm wrong, but this example was a hypothetical anyway. Point is - builds are still changing quite a bit.

You've left the scope and purpose of this study, so I'm not sure if I should answer that.
skeldark
Profile Joined April 2010
Germany2223 Posts
Last Edited: 2012-07-13 18:24:38
July 13 2012 18:21 GMT
#451
Skeldark - the number of profiles you'd want depends on the size of the confidence interval you want around a given mean. If you wanted to make these calculations, you'd need to use Excel's Solver plugin to work back from interval size to sample size. Alternatively, you could guess and check to approximate it.

The day I install Excel, I'll buy a Mac, quit programming, and never look in the mirror again...

I will wait and split the data into timelines in the near future; if it works out, I'll just go on from there.
The problem is, I gain new users and lose old ones, so my incoming data is not as stable as I'd wish.


Save gaming: kill esport
Treehead
Profile Blog Joined November 2010
999 Posts
Last Edited: 2012-07-13 19:22:03
July 13 2012 19:21 GMT
#452
On July 14 2012 03:14 lolcanoe wrote:

The high SD values for the lower means were surprising to me too. Typically you'd expect it to be the other way around. I would be cautious about drawing any real conclusions from that, though...

...

You've left the scope and purpose of this study, so I'm not sure if I should answer that.


Of course I'll be cautious. When confidence cannot accurately be assessed, people tend to be overconfident when the idea is their own and overcritical when it isn't. I'd be foolish to ignore that and proceed as though I were right about my "multiple distributions" theory.

If I were right, though, it wouldn't be statistically provable without knowing more about each game and qualitatively sorting different types of games into different categories - which a person couldn't really do for thousands of games without a lot more work involved. You could try to place the games in some kind of pockets based on what info is known (such as time) and perform some kind of goodness-of-fit analysis, but fit and disparity never prove a theory; they only show that the data is what a theory would expect - which is less than useful. When something is not statistically provable, then, it must remain a theory. You have to admit, though, that the idea of varying MMR pdfs for varying builds in varying matchups is at least qualitatively plausible, I hope.

The paragraph you mention that has "left the scope of the study" was just a random example illustrating my theory. Don't read more into it than that.
cndaks
Profile Joined June 2012
United States95 Posts
July 14 2012 02:23 GMT
#453
Nice job taking the time to do this and informing all of us!
xelnaga_empire
Profile Joined March 2012
627 Posts
July 15 2012 04:31 GMT
#454
This data shows Blizzard needs to buff Terran to bring back balance to the game. I hope somebody at Blizzard looks at this data because they need to realize the game has balance issues at this moment.
themell
Profile Joined February 2011
43 Posts
July 15 2012 07:27 GMT
#455
Is it possible to see what average time it takes for a race to win?

For example, if the TvZ win ratio in the early game is 50%, then we can say the early game is fair. But if we then see that TvZ in the late game is a 20% win rate for Terran, we can say Terrans are having difficulty in the late game.
Crashburn
Profile Blog Joined October 2010
United States476 Posts
July 15 2012 07:29 GMT
#456
@ xelnaga_empire

ಠ_ಠ

skeldark
Profile Joined April 2010
Germany2223 Posts
July 15 2012 07:33 GMT
#457
On July 15 2012 16:27 themell wrote:
Is it possible to see what average time it takes for a race to win?

For example, if the TvZ win ratio in the early game is 50%, then we can say the early game is fair. But if we then see that TvZ in the late game is a 20% win rate for Terran, we can say Terrans are having difficulty in the late game.

Yes, and even way more accurately.
I don't have time at the moment, but the data is there.
Save gaming: kill esport
skeldark
Profile Joined April 2010
Germany2223 Posts
Last Edited: 2012-07-15 11:56:02
July 15 2012 11:53 GMT
#458
Updated the results with a lot of stats:

Result


Source Main Data

- The data is biased towards EU/US and towards higher skill ratings.

Game count: 125976
SC2 accounts: 45203

- Worst to best player: 3200 MMR
- One average win/loss on ladder: +16 / -16 MMR

TIME Filter: only between 1 Jan 1970 00:00:00 GMT - 12 Jul 2012 16:52:47 GMT


Average MMR per Race
Race account count: 15814
Data average MMR: 1539.46

Difference in average MMR per Matchup:
T-P: -62.14
T-Z: -117.03
P-Z: -54.89




Average Win-ratio per Race


TvP 50.43 Games: 6700
TvZ 46.7 Games: 8118
PvZ 51.61 Games: 9189



Win-ratio per Race over Game-Time

TvP

game length, % T win, % P win, % of games
0,44.9,55.1,3.66
5,40.71,59.29,13.9
10,58.32,41.68,24.21
15,59.7,40.3,24.78
20,45.72,54.28,18.31
25,37.79,62.21,9.16
30,35.04,64.96,3.49
35,46.71,53.29,2.49

TvZ
game length, % T win, % Z win, % of games
0,37.13,62.87,3.78
5,33.78,66.22,9.15
10,46.91,53.09,15.96
15,52.51,47.49,22.12
20,47.88,52.12,22.9
25,44.36,55.64,14.3
30,50.0,50.0,6.65
35,48.08,51.92,5.12

PvZ

game length, % P win, % Z win, % of games
0,47.38,52.62,4.57
5,38.3,61.7,11.39
10,59.72,40.28,25.07
15,50.17,49.83,25.36
20,49.97,50.03,17.34
25,53.21,46.79,9.14
30,51.0,49.0,4.37
35,58.89,41.11,2.75
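
One caveat when reading these per-bucket winrates: the shortest and longest buckets contain far fewer games, so their percentages are much noisier. A rough sketch of the binomial uncertainty for the TvZ table, approximating per-bucket game counts from the shares above (and assuming the buckets are game length in minutes):

```python
import math

# TvZ rows from the table above: (game-length bucket, % T win, % of all TvZ games).
TOTAL_TVZ_GAMES = 8118
rows = [
    (0, 37.13, 3.78), (5, 33.78, 9.15), (10, 46.91, 15.96), (15, 52.51, 22.12),
    (20, 47.88, 22.90), (25, 44.36, 14.30), (30, 50.00, 6.65), (35, 48.08, 5.12),
]

for bucket, t_win_pct, share_pct in rows:
    n = TOTAL_TVZ_GAMES * share_pct / 100               # approximate games in this bucket
    p = t_win_pct / 100
    half_width = 2 * math.sqrt(p * (1 - p) / n) * 100   # ~95% margin, in percentage points
    print(f"bucket {bucket:>2}: {t_win_pct:5.2f}% T win +/- {half_width:4.1f} pp (n ~ {n:.0f})")
```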
Save gaming: kill esport
Methy
Profile Joined November 2010
United Kingdom74 Posts
July 15 2012 13:13 GMT
#459
This is fantastic work, well done.

I'd just like to make the (obvious) point that the concept of an 'instantaneous balance' is a bad one that should be ignored. As skeldark has said many times, one of the ways to detect imbalance is to track the MMR of the player base over time - I'd argue that this is the only reasonable way to do it. A sufficiently large sample of games gathered in a small time period is rather meaningless for the 'balance' of a game, especially given the game's competitive nature and the way balance is completely tied to perception.

To give an example, if you had built a sample of games in the month following the NASL season 1 final, you probably would've seen an 'imbalance' in TvP - players of those races that had equal MMRs before Puma unveiled 1/1/1 would not have a 50% winrate once 1/1/1 became common. As such there would be a short term spike in TvP winrates, and the Protoss average MMR would drop until this winrate normalised to some extent. This would produce a corresponding rise in PvZ winrates as Protoss players are getting matched against zergs with a lower MMR than they're used to facing and nothing significant has changed in the matchup.

As such, a development in the TvP matchup influences PvZ winrates, and this happens fairly consistently at all MMR ranges (with the possible exception of the bottom-end MMR range). The only way you can distinguish the development of 1/1/1 from 'imbalance' in PvZ is by monitoring the MMRs over a sufficiently long time.

Furthermore, does this mean the game is 'imbalanced'? Not even remotely. 1/1/1 was eventually solved without significant patching (immortal range is the only really important change), but before the solution was found no one could claim to know a solution would be found, so how could we comment on balance? Well, we couldn't at the time... we needed to let games be played over a long enough period; then, only after months and months of 1/1/1 dominance could we possibly conclude that that particular 'strategy' was overpowered.

But the crucial point is that this works in the other direction as well. Let's assume that all 3 races have a player base with identical MMR distributions and all matchups have a 50-50 winrate. This doesn't mean the game is 'balanced' - someone might think up a strategy that causes one race to gain a significant advantage and is never overcome. Thus, to determine 'balance' we need to be analysing a period of years, not months - a position we are now easily able to monitor thanks to skeldark's efforts.

But the main point I'm trying to make here is that balance is actually largely tied to perception and nothing more. The root of the problem lies in the fact that we're using one word to describe multiple concepts. If we say 'players of equal skill should get to the same place regardless of race choice,' we are being utterly foolish. What is meant by skill? Sheer mechanical speed? Strategising ability? On-the-fly decision making? There are so many factors in what constitutes 'skill' that you can't possibly settle on a general, universal definition.

In fact, I'd like to explicitly make the point that it is a BAD thing if a player gets to exactly the same MMR with all three races - this is a sign of a one-dimensional game. I am a person possessed of certain abilities - those abilities happen to align with the skillset required by one particular race more than the others - hence I play that race, and accept that if I switch race I will not perform as well.

If we then ignore people using balance whine as a crutch to justify their own poor performance, we can only begin to talk about balance 'at the highest levels of the game.' The beauty of the game lies in the fact that 'balance' is inseparable from the 'distribution' of human abilities. If we genuinely cared about the game being balanced, we would have to care about 'the best possible player of StarCraft 2' - which would undoubtedly be a computer AI possessed of unlimited APM that we don't quite have the ability to code yet. All we truly care about is A) the perception that, over a sufficient period of time, all three races perform 'equally well' at the highest level of human ability (i.e. tournaments), and B) that active innovation is occurring.

I realise I've ranted on for quite some time and I must apologise, but +10 points if you managed to read this entire post.
<3 Nony
Methy
Profile Joined November 2010
United Kingdom74 Posts
Last Edited: 2012-07-15 13:24:05
July 15 2012 13:16 GMT
#460
I'd actually just like to follow up with a far simpler single statement that I believe cuts right to the point:

If you do not believe that the 'overpowered and imbalanced' race is the one you are playing, then you've chosen the wrong race - balance is a function of the skill set required by a particular race matched to the corresponding distribution of skills in the human population. Your race should always feel like the 'easiest race' for *any* player at *any* skill level, or your abilities simply do not match up with those required by the race you've chosen. As such, the best way to determine 'balance' is actually just to look at the percentage of players on each race over a long period of time: as long as there are equal numbers of each race in any given bracket, you can flat-out conclude the game is 'balanced' in the only meaningful sense of the word.
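
A minimal sketch of that check for a single bracket, assuming per-race player counts are available (the counts below are hypothetical, used only to show the mechanics):

```python
from scipy.stats import chisquare

# Hypothetical race split within one ladder bracket; under the criterion above,
# "balanced" means the three counts should be roughly equal.
counts = {"Terran": 5100, "Protoss": 5240, "Zerg": 5474}

stat, p = chisquare(list(counts.values()))  # null hypothesis: equal expected counts
print(f"chi-square = {stat:.1f}, p = {p:.4f}")
# A small p-value flags a non-uniform race split in this bracket; tracking this
# per bracket over a long period is the long-run check described above.
```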
<3 Nony