What you don't know about statistics

Cyberonic
Profile Blog Joined September 2011
Germany80 Posts
Last Edited: 2012-05-04 07:33:28
May 04 2012 00:00 GMT
#1
This post was originally written by Drabzalver on Reddit. Since he does not have a TL account, I asked and was allowed to repost it if I included this:

Drabzalver on Reddit:
TL is the [swear word] of praetentious [swear word] where moderators reward supposed 'high quality posts' which are full of statistical and scientific garbage I just outlined just because they are 'praesented nicely' and the mods basically think that any long post with a lot of images and no swear words is intellectually advanced while often a lot of it is total garbage filled with wrong interpretations and grave statistical errors.
Also, I don't have a TL account, you're welcome to repost, but do include this qualifier, should be fun,

Clarification: This is not my opinion of this community and I only posted the quote because I was asked to and respect his author rights. The content is very interesting.


EDIT: A lot of people claim the OP/author just wants to bash the TL community. That is wrong! The text is directed towards the reddit community and was originally called "What reddit doesn't know about statistics"; I deliberately altered the title... I see this might have led to some confusion.
-----------------------------------------------------------------------------------------------------------------------------------

What you don't know about statistics
by Drabzalver

Not a day goes by without some stat or graph being posted here or elsewhere in the community. 99% of the time it deals with winrates and balance, and 99% of the people commenting on it have no clue how to interpret these numbers or how little they actually mean. And no, this has nothing to do with sample size. Statistics and probability theory is something you study for years; it is a specialized field that exists to keep people from making the errors people make here. So stay a while and listen:

- Fallacy: winrates indicate balance

Yes and no. For the most part they do not indicate balance; rather, they indicate balance shifts. It's so trivial to see that it boggles me that people don't figure this out on their own. Assume that race X was actually underpowered last month and is balanced this month. This means that the X players who qualified last month and stayed in tournaments are actually better than the Y and Z players. Now that the game is balanced, those overall better players start smashing Y and Z, which suddenly makes the graph look as if X has been 'overbuffed' or whatever else, while really the few X players who were around in the scene were simply better. This will continue until the mediocre Y and Z players, who were carried by their race more than the X players were, get weeded out.

This extends even further. Most tournaments have qualifiers, so say X is underpowered: the X players who get into the tournament are simply better, because they got in despite the imbalance. As they are better, they will continue to win despite the imbalance stacked against them, thereby skewing the results closer to 50-50 than they really are.

In fact, there are even more things wrong with this idea. Tournaments generally feature some form of elimination, which means that better players stay in the tournament longer and therefore contribute more to the number of games played. Since these are the players who overcome imbalance, this again skews the results closer to 50-50 than they really are.

So yes, however you look at it, winrates will skew closer to 50-50 than the actual balance, and monthly spikes and variation indicate balance shifts more than absolute balance. The only way to find out absolute balance is to get a random pool of pros, force them to play random, and hold a round robin tournament to ensure that everyone plays the same number of games. It is very unfeasible to get enough games that way for a reasonable sample size.
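
The qualifier effect described above can be sketched with a toy simulation. Everything in it is an assumption made for illustration and not part of the original post: normally distributed skill, a fixed strength handicap for race X, a logistic win model, and a single strength cutoff standing in for the qualifier.

```python
import math
import random

def win_prob(strength_a, strength_b):
    """Logistic (Elo-style) chance that A beats B; an assumed model."""
    return 1.0 / (1.0 + math.exp(-(strength_a - strength_b)))

def observed_x_winrate(handicap=0.5, cutoff=1.5, players_per_race=20000,
                       pairings=20000, seed=1):
    """Average X-vs-Y win probability among players who pass the qualifier."""
    rng = random.Random(seed)
    # Effective strength = skill + race modifier; race X carries the handicap.
    x_pool = [rng.gauss(0, 1) - handicap for _ in range(players_per_race)]
    y_pool = [rng.gauss(0, 1) for _ in range(players_per_race)]
    # Qualifier: only players whose effective strength clears the cutoff get in.
    x_in = [s for s in x_pool if s > cutoff]
    y_in = [s for s in y_pool if s > cutoff]
    # Average the win probability over random pairings of qualified players.
    total = sum(win_prob(rng.choice(x_in), rng.choice(y_in))
                for _ in range(pairings))
    return total / pairings

# Winrate of X against an equally skilled Y player (the "real" imbalance).
true_winrate = win_prob(-0.5, 0.0)
# Winrate of X against Y as it would show up in tournament results.
tournament_winrate = observed_x_winrate()
print(round(true_winrate, 3), round(tournament_winrate, 3))
```

With these assumed numbers, the X winrate among qualified players should land much closer to 50% than the winrate between two equally skilled players, which is the skew the post describes.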

- Fallacy: sample size is big enough

The TLPD winrate graphs are praetentious and amateurish, sorry to say it but that's how it is. The error bars there are pure bollocks: they are calculated using the rules of independent probability experiments, that is to say, it is assumed that the result of every series has no effect on the others, as if you were flipping a coin. If the results were independent, the sample size would be large enough by a wide margin to say something, but they are not independent, because you're dealing with players, not just games. Good players simply ruin the idea of independent experiments. The fact that Stephano won 4-0 this time and 4-0 the last time is not independent; both results have a common cause: Stephano is fucking good. More than that, as indicated before, most tournaments feature a form of elimination, which means that because Stephano is fucking good he simply contributes more to the sample than players who are not as good. One has to realize that in some single elimination tournaments, it's possible for the champion to have played 40% of all the games in the tournament...
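
The 40% figure can be checked with a little arithmetic. The sketch below counts matches, not individual games, so it assumes every match contributes the same number of games; the function name is made up for illustration.

```python
def champion_share(players):
    """Fraction of all matches in a single-elimination bracket that involve the champion."""
    if players < 2 or players & (players - 1):
        raise ValueError("bracket size must be a power of two, at least 2")
    rounds = players.bit_length() - 1   # log2(players): matches the champion plays
    total_matches = players - 1         # every match eliminates exactly one player
    return rounds / total_matches

for size in (4, 8, 16, 32):
    print(size, round(champion_share(size), 3))
```

Under that assumption an 8-player bracket gives 3 of 7 matches, roughly 43%, so the claim holds for small brackets and weakens as the bracket grows.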

As an extreme example to show that the idea is fundamentally flawed: say you have 2 and only 2 players, the best 2 in the world, let's say MKP and DRG. They play a thousand games, and it ends 720-280 in MKP's favour. Can you then say 'Clearly, since this is taken from the absolute top, TvZ is highly imbalanced, as a thousand games is a very large sample size'? No, you cannot! While this is an extreme example, it shows the general idea and the fallacy in it: the games are not independent probability experiments.

Especially in Korea, it is very likely that the TLPD graphs we're all so fond of do not indicate balance overall; they just indicate whichever race has a couple of good players who dominated everything this month. The Korean graph shifts around every month, while the number of games should be large enough to stop that from happening if they were independent experiments. But they aren't; they are quite dependent, and how much the KR graph flips around each month demonstrates how unreliable it is.

Simply put, the number of games is large enough to be significant, but the number of players is way too small.
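
One standard way to put a number on "many games, few players" is the design effect for clustered samples (Kish's approximation): deff = 1 + (m - 1) * rho, where m is the number of games per player and rho is the intra-class correlation between games of the same player. The sketch below assumes every player contributes the same number of games, and the rho value is made up for illustration, not estimated from TLPD data.

```python
def effective_sample_size(total_games, games_per_player, rho):
    """Approximate number of independent observations behind clustered game results."""
    design_effect = 1 + (games_per_player - 1) * rho
    return total_games / design_effect

# Hypothetical month: 1000 games spread evenly over players with 50 games each.
n_eff = effective_sample_size(total_games=1000, games_per_player=50, rho=0.1)
print(round(n_eff))
```

With these assumed inputs, a thousand games carry about as much information as roughly 170 independent ones, so error bars computed as if every game were a coin flip would be too narrow.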

- Fallacy: advantages at certain times of matchups expressed in graphs

There are also a lot of graphs posted which supposedly indicate that some races have an advantage at certain times in a matchup. Oh boy, do people misread what these graphs mean. Take this bad boy.

[Image in the original post: graph of ZvP win percentage by game-length interval]

A naïve way to read it would simply be 'Hmm, Z has an early game advantage in ZvP, then it becomes about even, then P has a slight advantage up to the end, then Z again.' Wrong. Look closely at what the graph actually says. It says this:
IF a ZvP game ends in the 0-5 minute range, the chance is 60% that it ends in Z's favour.
Now the 'if' is so bloody important here: the game needn't end. Everyone of course realizes that this part is caused by early pools. Does Zerg really have a large advantage at that point? Can Zerg force a win at that point if they want to? Are early pools overpowered? No, not at all. So what is going on?

Imagine a ZvP: Z decides to 7pool, P doesn't scout soon enough, lings get in, kill every probe, traalalala, P GG's. Game over in the first 5 minutes in Z's favour.
Okay, imagine another ZvP: Z decides to 7pool, P scouts in time and gets his wall up. Damn, Z's like 'fuck man, shouldn't have done that', but he doesn't necessarily GG, unless he's IdrA. The game goes on, but Z plays at such a disadvantage that in the next 5-10 minutes P will surely claim his victory unless P messes up.

See the fallacy? That Z has that 'early game advantage' doesn't mean that Z is more powerful or that 7pools are too powerful, it just means that IF the game immediately ends due to a 7pool it will most likely be in Z's favour. If the 7pool fails, the game doesn't end at that point, Z will most likely stay in the game and play from a significant disadvantage to lose later.

It is a grave statistical error of the same magnitude as interpreting 'If an 8-year-old child dies, the most likely cause is a car accident' as 'It is very likely that 8-year-old children die in car accidents'.
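
The 7pool scenario can be turned into a toy calculation. The three probabilities are invented for illustration: the rush ends the game early in Z's favour, the rush ends the game early in P's favour, or the game continues and Z wins later from behind.

```python
def seven_pool_summary(p_rush_wins_early, p_rush_dies_early, p_z_wins_later):
    """Compare Z's share of early endings with Z's overall winrate for the rush."""
    p_game_continues = 1 - p_rush_wins_early - p_rush_dies_early
    early_endings = p_rush_wins_early + p_rush_dies_early
    # What the graph shows: given that the game ended early, how often Z won.
    z_share_of_early_endings = p_rush_wins_early / early_endings
    # What matters for balance: how often the 7pool wins the game at all.
    z_overall_winrate = p_rush_wins_early + p_game_continues * p_z_wins_later
    return z_share_of_early_endings, z_overall_winrate

early_share, overall = seven_pool_summary(0.3, 0.2, 0.2)
print(round(early_share, 2), round(overall, 2))
```

With these made-up inputs, Z takes 60% of the games that end early while the 7pool itself only wins 40% of the time, so a 60% early bar is compatible with a losing strategy.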

The graph doesn't even say how likely it is that the game ends in each interval. For all you know, it's far more likely for P to win in the late game than in the mid game, even though the graph indicates that if the game ends in the late game, the chance is higher that Z takes it. And even so, that still says nothing about the advantages of races at certain times. One would assume that if a race is likely to win at time X, that race enjoys an advantage slightly before that time, no?

What would be far more interesting, though also not conclusive, would be a graph which outlines 'what percentage of all ZvP games ended as a Z win in each interval', which is fundamentally different from 'at each interval, if the game ends, how often does it end in Z's favour in ZvP'. My bet is that because 7pools are actually quite rare, it would not show the huge spike for Z in the early game at all.
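
The difference between the two graphs is the difference between a conditional and a joint probability. The sketch below uses invented game counts (not TLPD data) to compute both for each interval.

```python
# Hypothetical ZvP results: interval -> (games that ended there, Z wins among them)
games = {
    "0-5": (20, 12),
    "5-10": (180, 80),
    "10-15": (400, 190),
    "15-20": (300, 140),
    "20+": (100, 55),
}
total_games = sum(ended for ended, _ in games.values())

def z_win_given_end(interval):
    """P(Z wins | game ends in this interval): what the posted graph shows."""
    ended, z_wins = games[interval]
    return z_wins / ended

def z_win_share_of_all(interval):
    """P(game ends in this interval AND Z wins): share of all ZvP games."""
    _, z_wins = games[interval]
    return z_wins / total_games

for interval in games:
    print(interval,
          round(z_win_given_end(interval), 3),
          round(z_win_share_of_all(interval), 3))
```

In this made-up data the 0-5 minute bar reads 60% for Z on the conditional graph, yet those wins make up only 1.2% of all games, so the early spike all but disappears on the second kind of graph.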
halfies
Profile Joined November 2011
United Kingdom327 Posts
May 04 2012 00:09 GMT
#2
praetentious?
really?
i hope that's how he spelt it too, because it would be really funny if he wrote this much about people being pretentious and used big words that he couldn't even spell.
Lorizean
Profile Blog Joined March 2011
Germany1330 Posts
May 04 2012 00:12 GMT
#3
None of this is new or even in-depth. He starts off by saying that probability and stats are a field that people study and continues on to bring arguments anybody could make. Not really proving a point there.
Not that I'm saying he's necessarily wrong, but having such an arrogant tone and then bringing something this trivial to the table is a bit... pretentious.
imMUTAble787
Profile Joined November 2011
United States680 Posts
May 04 2012 00:13 GMT
#4
reddit is a cesspool of memes and retardation

that being said, i love the job the mods do on these forums.

as far as stats etc go i really dont pay any mind to them. the only ones i think have any validity would be the GSL player stats because they are tournament level and reliable.
*eternalenvy fanboy*
Ripper41
Profile Joined July 2011
284 Posts
Last Edited: 2012-05-04 00:16:54
May 04 2012 00:16 GMT
#5
On May 04 2012 09:09 halfies wrote:
praetentious?
really?
i hope that's how he spelt it too, because it would be really funny if he wrote this much about people being pretentious and used big words that he couldn't even spell.

Well he explains...
it's worse, he spelled it that way intentionally. still, he makes some good points in the substance of what he says.
Chocolate
Profile Blog Joined December 2010
United States2350 Posts
May 04 2012 00:17 GMT
#6
So basically the person who came up with this thinks that he is smarter than 99% of everyone else and then proceeds to list "fallacies" which tend to skew game results. Then he explains that ZvP ends early more often in favor of the zerg because of early pools.

I don't really see the value in this. It only shows what we already knew, that statistics aren't always true due to a couple factors. He also calls us pretentious but spells it oddly :/. The author himself strikes me as very pretentious.
Integra
Profile Blog Joined January 2008
Sweden5626 Posts
Last Edited: 2012-05-04 00:19:21
May 04 2012 00:18 GMT
#7
Yes, statistics taken out of context and without understanding of what they are supposed to prove are misleading. Good for reddit that they finally understand this

EDIT. @Chocolate: there is no value in the post, the poster just wanted to bash on TL.Net community and used statistics as an excuse.
"Dark Pleasure" | | I survived the Locust war of May 3, 2014
dmasterding
Profile Blog Joined January 2011
United States205 Posts
May 04 2012 00:19 GMT
#8
On May 04 2012 09:13 imMUTAble787 wrote:
reddit is a cesspool of memes and retardation

that being said, i love the job the mods do on these forums.

as far as stats etc go i really dont pay any mind to them. the only ones i think have any validity would be the GSL player stats because they are tournament level and reliable.


Please stop bashing the r/sc forums if you don't actually know what goes on over there. That was a lot more prevalent before, but even if you look at the front page right now you can see that there's a lot of legitimate StarCraft-related content and not just random shitty memes.
No tears now, only dreams.
Trumpstyle
Profile Joined May 2011
Sweden114 Posts
May 04 2012 00:23 GMT
#9
Sorry, his theory is totally garbage: he thinks the skill gap between the races is massive (it's minimal) and screws with the statistics.
dmasterding
Profile Blog Joined January 2011
United States205 Posts
May 04 2012 00:26 GMT
#10
Did any of you guys actually read the thing? He didn't actually give any opinions about the matchups, he was just trying to get rid of some misconceptions people had about interpretation of results. I am pretty sure that if the OP never mentioned this person was from r/SC you guys wouldn't be so biased against the author.
No tears now, only dreams.
i)awn
Profile Joined October 2011
United States189 Posts
Last Edited: 2012-05-04 00:28:04
May 04 2012 00:26 GMT
#11
While tournaments are indeed not reliable to calculate balance statistics because of the small pool and elimination process as Drabzalver said; tournaments and statistics still contain quite useful information. Many people might not interpret or analyze the numbers correctly but that doesn't mean that the numbers are useless.
Cyberonic
Profile Blog Joined September 2011
Germany80 Posts
May 04 2012 00:33 GMT
#12
On May 04 2012 09:26 dmasterding wrote:
Did any of you guys actually read the thing? He didn't actually give any opinions about the matchups, he was just trying to get rid of some misconceptions people had about interpretation of results. I am pretty sure that if the OP never mentioned this person was from r/SC you guys wouldn't be so biased against the author.


I think so, too... There's no reason for bashing around. I wrote why I included his text but I'd rather have a discussion on the topic and not its preface.
Bidj
Profile Joined September 2010
France98 Posts
May 04 2012 00:37 GMT
#13
At least a very interesting read.
Rooooaaaar
Jinsho
Profile Joined March 2011
United Kingdom3101 Posts
May 04 2012 00:37 GMT
#14
IdrA has always been referring to points 1 and 2 when asked about winrates.
windsupernova
Profile Joined October 2010
Mexico5280 Posts
May 04 2012 00:40 GMT
#15
As much as I agree with him on some points, I don't like how he comes off as pretty arrogant and doesn't even present some kind of credentials for why he understands statistics better than 99% of people. I mean, for all we know he could be some arrogant college kid who just passed his 1st statistics class.

Arguing from authority only works if you prove you have some kind of authority. But his arguments are nice and he does seem to have some kind of understanding of statistics. But then he doesn't say how we should go about interpreting those statistics, or provide proof.

That being said I do think most of the people take a really simplistic approach to statistics, but well statistics are a hard subject to tackle
"Its easy, just trust your CPU".-Boxer on being good at games
LaM
Profile Blog Joined September 2011
United States1321 Posts
May 04 2012 00:42 GMT
#16
On May 04 2012 09:13 imMUTAble787 wrote:
reddit is a cesspool of memes and retardation

that being said, i love the job the mods do on these forums.

as far as stats etc go i really dont pay any mind to them. the only ones i think have any validity would be the GSL player stats because they are tournament level and reliable.


Maybe you should read the OP instead of just the qualifier.
Anything is Possible
LaM
Profile Blog Joined September 2011
United States1321 Posts
May 04 2012 00:43 GMT
#17
On May 04 2012 09:23 Trumpstyle wrote:
Sorry, his theory is totally garbage: he thinks the skill gap between the races is massive (it's minimal) and screws with the statistics.


He makes literally 0 claims about the skill gap between races. Reread what he said.
Anything is Possible
JitnikoVi
Profile Joined May 2010
Russian Federation396 Posts
May 04 2012 00:45 GMT
#18
i dont understand the point of this... most of this is common sense yet it states that 99% of people dont know this?

really?
In theory yes, but theoretically, no.
Mephiztopheles1
Profile Blog Joined December 2010
1124 Posts
May 04 2012 00:49 GMT
#19
On May 04 2012 09:26 dmasterding wrote:
Did any of you guys actually read the thing? He didn't actually give any opinions about the matchups, he was just trying to get rid of some misconceptions people had about interpretation of results. I am pretty sure that if the OP never mentioned this person was from r/SC you guys wouldn't be so biased against the author.

I read it. He is extremely aggressive so he begets extremely aggressive answers.

Regardless of his tone and of any tl vs reddit thing brewing here, his points should accompany threads (better phrased, of course) like the TLPD win rates thread in my opinion so that users with no background in statistics (for whatever reason) can have a better grasp of what the data presented means and/or doesn't.
xrapture
Profile Blog Joined December 2011
United States1644 Posts
May 04 2012 00:55 GMT
#20
If that is a top of the line post from reddit then I'm glad I stick to TL lol.

He just said random stuff in an aggressive way (stuff everyone knows anyway) and called TLPD statistics pretentious? hypocrite much?
Everyone is either delusional, a nihlilst, or dead from suicide.