What you don't know about statistics

Cyberonic
Profile Blog Joined September 2011
Germany80 Posts
Last Edited: 2012-05-04 07:33:28
May 04 2012 00:00 GMT
#1
This post was originally written by Drabzalver on Reddit. Since he does not have a TL account, I asked and was allowed to repost it provided I include this:

Drabzalver on Reddit:
TL is the [swear word] of praetentious [swear word] where moderators reward supposed 'high quality posts' which are full of statistical and scientific garbage I just outlined just because they are 'praesented nicely' and the mods basically think that any long post with a lot of images and no swear words is intellectually advanced while often a lot of it is total garbage filled with wrong interpretations and grave statistical errors.
Also, I don't have a TL account, you're welcome to repost, but do include this qualifier, should be fun,

Clarification: This is not my opinion of this community and I only posted the quote because I was asked to and respect his author rights. The content is very interesting.


EDIT: A lot of people claim the OP/author just wants to bash the TL community. That is wrong! The text is directed towards the reddit community and was originally called "What reddit doesn't know about statistics", with me deliberately altering the title... I see this might have led to some confusion.
-----------------------------------------------------------------------------------------------------------------------------------

What you don't know about statistics
by Drabzalver

Not a day goes by without some stat or graph being posted here or elsewhere in the community. 99% of the time it deals with winrates and balance, and 99% of the people commenting on it have no clue how to interpret these numbers or how little they actually mean. And no, this has nothing to do with sample size. Statistics and probability theory is a specialized field that people study for years, precisely to keep them from making the errors people make here. So stay a while and listen:

- Fallacy: winrates indicate balance

Yes and no. For the most part they do not indicate balance; rather, they indicate balance shifts. It's so trivial to see that it boggles me that people don't figure this out on their own. Assume that race X was actually underpowered last month and is balanced this month. This means that the X players who qualified last month and stayed in tournaments are actually better than the Y and Z players. Now that the game is balanced, they start smashing Y and Z because they are better overall, which suddenly makes the graph look as if X has been 'overbuffed' or whatever else, while in fact the few X players that were around in the scene were simply better. This will continue until the mediocre Y and Z players, who were carried by their race more than the X players were, get weeded out.

This extends even further. Most tournaments have qualifiers, so say X is underpowered: the players who play X and get into the tournament are simply better, because they got in despite the imbalance. As they are better, they will continue to win despite the imbalance stacked against them, thereby skewing the results closer to 50-50 than the balance actually is.

In fact, there are even more things wrong with this idea. Because tournaments generally feature some form of elimination, players who are better stay in the tournament for longer and therefore contribute more to the number of games played. Since these are the players that overcome imbalance, again, the results skew closer to 50-50 than the balance actually is.

So yes, no matter how you have it, winrates will skew closer to 50-50 than the balance actually is, and monthly spikes and variation indicate balance shifts more than absolute balance. The only way to find out absolute balance is to get a random pool of pros, force them to play random, and have a round robin tournament to ensure that everyone plays the same number of games. It is very unfeasible to get enough games that way for a reasonable sample size.
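The qualifier effect can be illustrated with a small simulation. This is only a sketch under a toy model that is not from the original post: skill is drawn from a normal distribution, race X carries a fixed strength penalty, the top performers by effective strength qualify, and win probability follows a logistic curve. All names and numbers (`penalty`, `n_slots`, the pool size) are made up for illustration.

```python
import math
import random


def win_prob(strength_a, strength_b):
    """Assumed toy model: logistic win probability of A over B."""
    return 1.0 / (1.0 + math.exp(-(strength_a - strength_b)))


def simulate_qualifier(n_per_race=3000, n_slots=200, penalty=0.5, seed=1):
    rng = random.Random(seed)
    pool = []  # entries are (race, raw skill, effective in-game strength)
    for _ in range(n_per_race):
        skill = rng.gauss(0, 1)
        pool.append(("X", skill, skill - penalty))  # X is underpowered
    for _ in range(n_per_race):
        skill = rng.gauss(0, 1)
        pool.append(("Y", skill, skill))

    # Qualification depends on results, i.e. on effective strength.
    qualified = sorted(pool, key=lambda p: p[2], reverse=True)[:n_slots]
    xs = [p for p in qualified if p[0] == "X"]
    ys = [p for p in qualified if p[0] == "Y"]

    # Expected X winrate over every X-vs-Y pairing among the qualifiers.
    observed = sum(win_prob(x[2], y[2]) for x in xs for y in ys) / (len(xs) * len(ys))
    return {
        "equal_skill_x_winrate": win_prob(-penalty, 0.0),  # the real imbalance
        "observed_x_winrate": observed,                    # what a graph would show
        "mean_skill_x": sum(x[1] for x in xs) / len(xs),
        "mean_skill_y": sum(y[1] for y in ys) / len(ys),
        "n_x": len(xs),
        "n_y": len(ys),
    }


result = simulate_qualifier()
for key, value in result.items():
    print(f"{key}: {value:.3f}")
```

In this toy setup far fewer X players qualify, but those who do have higher raw skill, so the X-vs-Y winrate inside the tournament sits much closer to 50% than the winrate between two equally skilled players would.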

- Fallacy: sample size is big enough

The TLPD winrate graphs are praetentious and amateurish. Sorry to say it, but that's how it is. The error bars there are pure bollocks: they are calculated using the rules of independent probability experiments, that is to say, it is assumed that the result of every series has no effect on the others, as if you were flipping a coin. If the results were independent, the sample size would be large enough by a wide margin to say something, but they are not independent, because you're dealing with players, not just games. Good players simply ruin the idea of independent experiments. The fact that Stephano won 4-0 this time and 4-0 the last time is not independent; the two results have a common cause: Stephano is fucking good. More than that, as indicated before, most tournaments feature a form of elimination, which means that because Stephano is fucking good he simply contributes more to the sample than players who are not as good. One has to realize that in some single elimination tournaments, it's possible for the champion to have played 40% of all the games in the tournament...
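As a quick sanity check on the champion's share, here is a minimal sketch that assumes a plain single elimination bracket with a power-of-two number of players and every series being the same length (so share of series equals share of games). The function name `champion_share` is just for illustration.

```python
def champion_share(n_players):
    """Fraction of all series in a single elimination bracket that involve the champion."""
    rounds = n_players.bit_length() - 1
    if n_players < 2 or 2 ** rounds != n_players:
        raise ValueError("bracket size must be a power of two (2, 4, 8, ...)")
    total_series = n_players - 1  # every series eliminates exactly one player
    return rounds / total_series  # the champion plays one series per round


for n in (4, 8, 16, 32):
    print(f"{n} players: champion is in {champion_share(n):.0%} of all series")
```

Under these assumptions an 8-player bracket gives 3 of 7 series, about 43%, so a figure around 40% is plausible for small brackets. Real tournaments with group stages or variable series lengths will differ.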

Here is an extreme example to show that the idea is fundamentally flawed. Say you have 2 and only 2 players, the best 2 in the world, let's say MKP vs DRG. They play a thousand games, and it ends 720-280 in MKP's favour. Can you then say 'Clearly, since this is taken from the absolute top, TvZ is highly imbalanced, as a thousand games is a very large sample size'? No, you cannot. While this is an extreme example, it shows the general idea and the fallacy in it: the games are not independent probability experiments.

Especially in Korea, it is very likely that the TLPD graphs we're all so fond of do not indicate balance overall; they just indicate whichever race has a couple of good players this month who dominated everything. The Korean graph shifts around every month, while the number of games should be large enough to stop that from happening if they were independent experiments. But they aren't: they are quite dependent, and how much the KR graph flips around each month demonstrates how unreliable it is.

Simply put, the number of games is large enough to be significant, but the number of players is way too small.
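To see why the number of players matters more than the number of games, here is a rough simulation. It is a sketch under assumptions that are not from the original post: the races are perfectly balanced, player skill is normally distributed, win probability is logistic, and each "month" has 1000 games spread over either 5 or 500 players per race.

```python
import math
import random
import statistics


def monthly_winrates(n_players, n_games=1000, n_months=100, seed=7):
    """X-vs-Y winrate per simulated month in a perfectly balanced game."""
    rng = random.Random(seed)
    rates = []
    for _ in range(n_months):
        # A fresh set of active players each month; neither race has any edge.
        xs = [rng.gauss(0, 1) for _ in range(n_players)]
        ys = [rng.gauss(0, 1) for _ in range(n_players)]
        wins = 0
        for _ in range(n_games):
            x, y = rng.choice(xs), rng.choice(ys)
            if rng.random() < 1.0 / (1.0 + math.exp(-(x - y))):
                wins += 1
        rates.append(wins / n_games)
    return rates


# Error bar you would draw if all 1000 games were independent coin flips.
naive_se = math.sqrt(0.5 * 0.5 / 1000)
spread_few = statistics.pstdev(monthly_winrates(5))
spread_many = statistics.pstdev(monthly_winrates(500))

print(f"naive standard error:          {naive_se:.3f}")
print(f"monthly spread, 5 players:     {spread_few:.3f}")
print(f"monthly spread, 500 players:   {spread_many:.3f}")
```

With only a handful of players per race, the month-to-month swing in this toy model is several times larger than the coin-flip error bar, even though the game is balanced by construction and the number of games is the same.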

- Fallacy: advantages at certain times of matchups expressed in graphs

There are also a lot of graphs posted which supposedly indicate that some races may have an advantage at certain matchups. Oh boy do people misread what these graphs mean. Take this bad boy.

A naïve way to read it would simply be 'Hmm, Z has an early game advantage in ZvP, then it becomes about even, then P has a slight advantage up to the end, then Z again.' Wrong. Look closely at what the graph actually says. It says this:
IF a ZvP game ends in the 0-5 minute range, the chance is 60% it ends in Z's favour.
Now the 'if' is so bloody important here, the game needn't end. Now, everyone of course realizes that that part is caused by early pools. Does Zerg really have a large advantage at that point? Can Zerg force a win at that point if they want to, are early pools overpowered? No, not at all, so what is going on?

Imagine a ZvP, Z decides to 7pool, P doesn't scout soon enough, lings get in, kill every probe, traalalala, P GG's. Game over in the first 5 minutes in Z's favour.
Okay, imagine a ZvP, Z decides to 7pool, P scouts in time, gets his wall up, damn, Z's like 'fuck man, shouldn't have done that'. But Z doesn't necessarily GG, unless he's IdrA; the game goes on. Z, however, plays at such a disadvantage that in the next 5-10 minutes P will surely claim his victory unless P messes up.

See the fallacy? That Z has that 'early game advantage' doesn't mean that Z is more powerful or that 7pools are too powerful, it just means that IF the game immediately ends due to a 7pool it will most likely be in Z's favour. If the 7pool fails, the game doesn't end at that point, Z will most likely stay in the game and play from a significant disadvantage to lose later.

It is a grave statistical error, of the same magnitude as interpreting 'If an 8-year-old child dies, the most likely cause is a car crash' as 'It is very likely that 8-year-old children die in car crashes'.

The graph doesn't even say how likely it is that the game ends in each interval. For all you know, it's far more likely for P to win in the late game than in the mid game, even though the graph indicates that if the game ends in the late game, the chance is higher that Z takes it. And even so, that still says nothing about advantages of races at certain times. One would assume that if a race is likely to win at time X, that race enjoys an advantage slightly before that time, no?

What would be far more interesting, though also not conclusive, would be a graph which outlines 'How large was the percentage of Z wins in ZvP at each interval', which is fundamentally different from 'at each interval, if the game ends, how often does it end in Z's favour in ZvP'. My bet is that because 7pools are actually quite rare, it would not at all show the huge spike for Z in the early game.
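The difference between the two readings can be put into numbers. The counts below are entirely hypothetical; only the 60% early-game figure is taken from the text above. The second quantity is one possible reading of the alternative graph suggested here: Z wins in each interval as a share of all ZvP games.

```python
# Hypothetical ZvP results by game-length bucket: (games ended, Z wins).
# Only the 60% figure for 0-5 minutes comes from the post; the rest is made up.
buckets = {
    "0-5 min": (20, 12),
    "5-10 min": (180, 70),
    "10-20 min": (500, 240),
    "20+ min": (300, 165),
}

total_games = sum(ended for ended, _ in buckets.values())


def conditional_z_winrate(bucket):
    """P(Z wins | game ends in this bucket): what the posted graph shows."""
    ended, z_wins = buckets[bucket]
    return z_wins / ended


def share_of_all_games(bucket):
    """P(game ends in this bucket AND Z wins), out of all ZvP games."""
    _, z_wins = buckets[bucket]
    return z_wins / total_games


for name in buckets:
    print(f"{name:>9}: P(Z wins | ends here) = {conditional_z_winrate(name):.0%}, "
          f"P(ends here and Z wins) = {share_of_all_games(name):.1%}")
```

With these made-up counts the 0-5 minute bucket shows 60% on the first reading but accounts for only 1.2% of all ZvP games on the second, so the early "spike" all but disappears once you stop conditioning on the game having ended there.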
halfies
Profile Joined November 2011
United Kingdom327 Posts
May 04 2012 00:09 GMT
#2
praetentious?
really?
i hope that how he spelt it too, because it would be really funny if he wrote this much about people being pretentious and used big words that he couldn't even spell.
Lorizean
Profile Blog Joined March 2011
Germany1330 Posts
May 04 2012 00:12 GMT
#3
None of this is new or even in-depth. He starts off by saying that probability and stats are a field that people study and continues on to bring arguments anybody could make. Not really proving a point there.
Not that I'm saying he's necessarily wrong, but having such an arrogant tone and then bringing something this trivial to the table is a bit... pretentious.
imMUTAble787
Profile Joined November 2011
United States680 Posts
May 04 2012 00:13 GMT
#4
reddit is a cesspool of memes and retardation

that being said, i love the job the mods do on these forums.

as far as stats etc go i really dont pay any mind to them. the only ones i think have any validity would be the GSL player stats because they are tournament level and reliable.
*eternalenvy fanboy*
Ripper41
Profile Joined July 2011
284 Posts
Last Edited: 2012-05-04 00:16:54
May 04 2012 00:16 GMT
#5
On May 04 2012 09:09 halfies wrote:
praetentious?
really?
i hope that how he spelt it too, because it would be really funny if he wrote this much about people being pretentious and used big words that he couldn't even spell.

Well he explains...
it's worse, he spelled it that way intentionally. still, he makes some good points in the substance of what he says.
Chocolate
Profile Blog Joined December 2010
United States2350 Posts
May 04 2012 00:17 GMT
#6
So basically the person who came up with this thinks that he is smarter than 99% of everyone else and then proceeds to list "fallacies" which tend to skew game results. Then he explains that ZvP ends early more often in favor of the zerg because of early pools.

I don't really see the value in this. It only shows what we already knew, that statistics aren't always true due to a couple factors. He also calls us pretentious but spells it oddly :/. The author himself strikes me as very pretentious.
Integra
Profile Blog Joined January 2008
Sweden5626 Posts
Last Edited: 2012-05-04 00:19:21
May 04 2012 00:18 GMT
#7
Yes, statistics taken out of context and without understanding of what they are supposed to prove are misleading. Good for reddit that they finally understand this

EDIT. @Chocolate: there is no value in the post, the poster just wanted to bash on TL.Net community and used statistics as an excuse.
"Dark Pleasure" | | I survived the Locust war of May 3, 2014
dmasterding
Profile Blog Joined January 2011
United States205 Posts
May 04 2012 00:19 GMT
#8
On May 04 2012 09:13 imMUTAble787 wrote:
reddit is a cesspool of memes and retardation

that being said, i love the job the mods do on these forums.

as far as stats etc go i really dont pay any mind to them. the only ones i think have any validity would be the GSL player stats because they are tournament level and reliable.


Please stop bashing the r/sc forums if you don't actually know what goes on over there. That was a lot more prevalent before, but even if you look at the front page right now you can see that there's a lot of legitimate StarCraft-related content and not just random shitty memes.
No tears now, only dreams.
Trumpstyle
Profile Joined May 2011
Sweden114 Posts
May 04 2012 00:23 GMT
#9
Sorry he's theory is totally garbage as he thinks the skill gap between the races are massive(it's minimum) and screws with the statistic.
dmasterding
Profile Blog Joined January 2011
United States205 Posts
May 04 2012 00:26 GMT
#10
Did any of you guys actually read the thing? He didn't actually give any opinions about the matchups, he was just trying to get rid of some misconceptions people had about interpretation of results. I am pretty sure that if the OP never mentioned this person was from r/SC you guys wouldn't be so biased against the author.
No tears now, only dreams.
i)awn
Profile Joined October 2011
United States189 Posts
Last Edited: 2012-05-04 00:28:04
May 04 2012 00:26 GMT
#11
While tournaments are indeed not reliable to calculate balance statistics because of the small pool and elimination process as Drabzalver said; tournaments and statistics still contain quite useful information. Many people might not interpret or analyze the numbers correctly but that doesn't mean that the numbers are useless.
Cyberonic
Profile Blog Joined September 2011
Germany80 Posts
May 04 2012 00:33 GMT
#12
On May 04 2012 09:26 dmasterding wrote:
Did any of you guys actually read the thing? He didn't actually give any opinions about the matchups, he was just trying to get rid of some misconceptions people had about interpretation of results. I am pretty sure that if the OP never mentioned this person was from r/SC you guys wouldn't be so biased against the author.


I think so, too... There's no reason for bashing around. I wrote why I included his text but I'd rather have a discussion on the topic and not its preface.
Bidj
Profile Joined September 2010
France98 Posts
May 04 2012 00:37 GMT
#13
At least a very interesting read.
Rooooaaaar
Jinsho
Profile Joined March 2011
United Kingdom3101 Posts
May 04 2012 00:37 GMT
#14
IdrA has always been referring to points 1 and 2 when asked about winrates.
windsupernova
Profile Joined October 2010
Mexico5280 Posts
May 04 2012 00:40 GMT
#15
As much as I agree with him on some points, I don't like how he comes off as pretty arrogant and doesn't even present some kind of credentials showing why he understands statistics better than 99% of people. I mean, for all we know he could be some arrogant college kid who just passed his 1st statistics class.

Arguing from authority only works if you prove you have some kind of authority. But his arguments are nice and he does seem to have some kind of understanding of statistics. But then he doesn't say how we should go about interpreting those statistics, and he doesn't provide proof.

That being said I do think most of the people take a really simplistic approach to statistics, but well statistics are a hard subject to tackle
"Its easy, just trust your CPU".-Boxer on being good at games
LaM
Profile Blog Joined September 2011
United States1321 Posts
May 04 2012 00:42 GMT
#16
On May 04 2012 09:13 imMUTAble787 wrote:
reddit is a cesspool of memes and retardation

that being said, i love the job the mods do on these forums.

as far as stats etc go i really dont pay any mind to them. the only ones i think have any validity would be the GSL player stats because they are tournament level and reliable.


Maybe you should read the OP instead of just the qualifier.
Anything is Possible
LaM
Profile Blog Joined September 2011
United States1321 Posts
May 04 2012 00:43 GMT
#17
On May 04 2012 09:23 Trumpstyle wrote:
Sorry he's theory is totally garbage as he thinks the skill gap between the races are massive(it's minimum) and screws with the statistic.


He makes literally 0 claims about the skill gap between races. Reread what he said.
Anything is Possible
JitnikoVi
Profile Joined May 2010
Russian Federation396 Posts
May 04 2012 00:45 GMT
#18
i dont understand the point of this... most of this is common sense yet it states that 99% of people dont know this?

really?
In theory yes, but theoretically, no.
Mephiztopheles1
Profile Blog Joined December 2010
1124 Posts
May 04 2012 00:49 GMT
#19
On May 04 2012 09:26 dmasterding wrote:
Did any of you guys actually read the thing? He didn't actually give any opinions about the matchups, he was just trying to get rid of some misconceptions people had about interpretation of results. I am pretty sure that if the OP never mentioned this person was from r/SC you guys wouldn't be so biased against the author.

I read it. He is extremely aggressive so he begets extremely aggressive answers.

Regardless of his tone and of any tl vs reddit thing brewing here, his points should accompany threads (better phrased, of course) like the TLPD win rates thread in my opinion so that users with no background in statistics (for whatever reason) can have a better grasp of what the data presented means and/or doesn't.
xrapture
Profile Blog Joined December 2011
United States1644 Posts
May 04 2012 00:55 GMT
#20
If that is a top of the line post from reddit then I'm glad I stick to TL lol.

He just said random stuff in an aggressive way (stuff everyone knows anyway) and called TLPD statistics pretentious? hypocrite much?
Everyone is either delusional, a nihlilst, or dead from suicide.
The contents of this webpage are copyright © 2026 TLnet. All Rights Reserved.