What you don't know about statistics

Cyberonic
Profile Blog Joined September 2011
Germany80 Posts
Last Edited: 2012-05-04 07:33:28
May 04 2012 00:00 GMT
#1
This post is originally written by Drabzalver on Reddit. Since he does not have a TL account I asked and was allowed to repost if I include this:

Drabzalver on Reddit:
TL is the [swear word] of praetentious [swear word] where moderators reward supposed 'high quality posts' which are full of statistical and scientific garbage I just outlined just because they are 'praesented nicely' and the mods basically think that any long post with a lot of images and no swear words is intellectually advanced while often a lot of it is total garbage filled with wrong interpretations and grave statistical errors.
Also, I don't have a TL account, you're welcome to repost, but do include this qualifier, should be fun,

Clarification: This is not my opinion of this community and I only posted the quote because I was asked to and respect his author rights. The content is very interesting.


EDIT: A lot of people claim the OP/author just wants to bash the TL community. That is wrong! The text is directed towards the reddit community and was originally called "What reddit doesn't know about statistics"; I deliberately altered the title... I see this might have led to some confusion.
-----------------------------------------------------------------------------------------------------------------------------------

What you don't know about statistics
by Drabzalver

Not a day goes by without some stat or graph being posted here or elsewhere in the community. 99% of the time this deals with winrates and balance, and 99% of the people commenting on it have no clue how to interpret these numbers or how little they actually mean. And no, this has nothing to do with sample size. Statistics and probability theory are something you study for years, a specialized field that exists precisely to keep people from making the errors people make here. So stay a while and listen:

- Fallacy: winrates indicate balance

Yes and no. For the most part they do not indicate balance; rather, they indicate balance shifts. It's so trivial to see that it boggles me that people don't figure this out on their own. Assume that race X was actually underpowered last month and is balanced this month. This means that the X players who qualified last month and stayed in tournaments are actually better than the Y and Z players. Now that the game is balanced, they start smashing Y and Z because they are better overall, which suddenly makes the graph look as if X has been 'overbuffed' or whatever, while really the few X players still around in the scene were simply better. This will continue until the mediocre Y and Z players, who were carried by their race more than the X players were, get weeded out.

This extends even further. Most tournaments have qualifiers, so say X is underpowered: the X players who get into the tournament are simply better, because they got in despite the imbalance. As they are better, they will continue to win despite the imbalance stacked against them, thereby skewing the results closer to 50-50 than they really are.

In fact, there are even more things wrong with this idea. Tournaments generally feature some form of elimination, which means that better players stay in the tournament longer and therefore contribute more to the number of games played. Since these are the players who overcome imbalance, again, the results skew closer to 50-50 than they really are.

So yes, no matter how you slice it, winrates will skew closer to 50-50 than the real balance, and monthly spikes and variation indicate balance shifts more than absolute balance. The only way to find out absolute balance is to get a random pool of pros, force them to play random, and hold a round robin tournament to ensure that everyone plays the same number of games. It is very unfeasible to get enough games that way for a reasonable sample size.
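
For anyone who wants to poke at the qualifier argument, here is a rough Python sketch. Everything in it is an assumption made up for illustration: the logistic win model, the size of the handicap, the pool and bracket sizes, and the idea that a qualifier simply admits the strongest players by effective strength. It is not how TLPD or any tournament works, just a toy model of the reasoning above.

```python
import math
from statistics import NormalDist

def win_prob(skill_x, skill_y, handicap):
    """Chance that an X player beats a Y player when X plays with `handicap`."""
    return 1.0 / (1.0 + math.exp(-(skill_x - handicap - skill_y)))

def skill_grid(n):
    """n evenly spaced quantiles of a standard normal: a deterministic player pool."""
    nd = NormalDist()
    return [nd.inv_cdf((i + 0.5) / n) for i in range(n)]

def qualifier_effect(handicap=0.5, pool_per_race=1000, slots=64):
    # Assumption: both races draw players from the same skill distribution.
    x_pool = [("X", s) for s in skill_grid(pool_per_race)]
    y_pool = [("Y", s) for s in skill_grid(pool_per_race)]

    # Toy qualifier: the strongest `slots` players by effective strength get in.
    def strength(player):
        race, skill = player
        return skill - handicap if race == "X" else skill

    qualified = sorted(x_pool + y_pool, key=strength, reverse=True)[:slots]
    xs = [s for race, s in qualified if race == "X"]
    ys = [s for race, s in qualified if race == "Y"]

    def mean_rate(h):
        # Average X win chance over every X-vs-Y pairing in the main event.
        return sum(win_prob(x, y, h) for x in xs for y in ys) / (len(xs) * len(ys))

    return {
        "true": win_prob(0.0, 0.0, handicap),  # equal-skill X vs Y
        "observed": mean_rate(handicap),       # what the tournament stats would show
        "after_patch": mean_rate(0.0),         # same players once X is balanced
        "x_qualified": len(xs),
    }

result = qualifier_effect()
print(result)
```

In this toy model the "observed" rate sits much closer to 50% than the "true" rate for equally skilled players, and once the handicap is removed the same X players win more than half their games, which is the 'overbuffed' illusion described above.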

- Fallacy: sample size is big enough

The TLPD winrate graphs are praetentious and amateurish, sorry to say it but that's how it is. The error bars there are pure bollocks: they are calculated using the rules for independent probability experiments, that is to say, it is assumed that the result of every series has no effect on the others, as when you flip a coin. If the games were independent, the sample size would be enough by a large margin to say something, but they are not independent, because you're dealing with players, not just games. Good players simply ruin the idea of independent experiments. The fact that Stephano won 4-0 this time and 4-0 the last time is not independent; the two results have a common cause: Stephano is fucking good. More than that, as indicated before, most tournaments feature a form of elimination, which means that because Stephano is fucking good he simply contributes more to the sample than players who are not as good. One has to realize that in some single elimination tournaments, it's possible for the champion to have played 40% of all the games in the tournament...

As an extreme example to show that the idea is fundamentally flawed: say you have 2 and only 2 players, the best 2 in the world, let's say MKP vs DRG. They play a thousand games, and it ends 720-280 in MKP's favour. Can you then say 'Clearly, since this is taken from the absolute top, TvZ is highly imbalanced, as a thousand games is a very large sample size'? No, you cannot! While this is an extreme example, it shows the general idea and the fallacy in it: the games are not independent probability experiments.

Especially in Korea, it is very likely that the TLPD graphs we're all so fond of do not indicate balance overall; they just indicate whichever race had a couple of good players that month who dominated everything. The Korean graph shifts around every month, while the number of games should be large enough to stop that from happening if they were independent experiments. But they aren't; they are quite dependent, and how much the KR graph flips around each month demonstrates how unreliable it is.

Simply put, the number of games is large enough to be significant, but the number of players is way too small.
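
A small simulation makes the games-versus-players point concrete. This is only a sketch under invented assumptions: the races are perfectly balanced, each pairing of players has a random skill gap fed through a logistic curve, and a "month" is 1000 games split evenly over some number of pairings. The function names are hypothetical. It compares how much the monthly winrate actually bounces around with the naive error bar you would get by treating every game as a coin flip.

```python
import math
import random
import statistics

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def month_winrate(rng, games, pairings):
    """Winrate of one race over `games` games split evenly across `pairings` player pairs."""
    wins = 0
    per_pair = games // pairings
    for _ in range(pairings):
        # Hypothetical skill gap between the two players; the races themselves are balanced.
        p = logistic(rng.gauss(0.0, 1.0))
        wins += sum(rng.random() < p for _ in range(per_pair))
    return wins / (per_pair * pairings)

def spread(games, pairings, months=200, seed=7):
    """Standard deviation of the monthly winrate across many simulated months."""
    rng = random.Random(seed)
    return statistics.pstdev([month_winrate(rng, games, pairings) for _ in range(months)])

GAMES = 1000
naive_se = math.sqrt(0.25 / GAMES)  # coin-flip error bar for 1000 independent games
for pairings in (1000, 10, 1):
    print(pairings, "pairings:", round(spread(GAMES, pairings), 3), "naive:", round(naive_se, 3))
```

With a fresh pairing for every game the spread matches the coin-flip error bar. With the same 1000 games concentrated in ten pairings, or in one pairing as in the MKP vs DRG example, the spread is several times larger even though the number of games has not changed.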

- Fallacy: advantages at certain times of matchups expressed in graphs

There are also a lot of graphs posted which supposedly indicate that some races may have an advantage at certain matchups. Oh boy do people misread what these graphs mean. Take this bad boy.

A naïve way to read it would simply be 'Hmm, Z has an early game advantage in ZvP, then it becomes about even, then P has a slight advantage up to the end, then Z again.' Wrong. Look closely at what the graph actually says. It says this:
IF a ZvP game ends in the 0-5 minute range, the chance is 60% it ends in Z's favour.
Now the 'if' is so bloody important here, the game needn't end. Now, everyone of course realizes that that part is caused by early pools. Does Zerg really have a large advantage at that point? Can Zerg force a win at that point if they want to, are early pools overpowered? No, not at all, so what is going on?

Imagine a ZvP, Z decides to 7pool, P doesn't scout soon enough, lings get in, kill every probe, traalalala, P GG's. Game over in the first 5 minutes in Z's favour.
Okay, imagine another ZvP: Z decides to 7pool, P scouts in time and gets his wall up. Damn, Z's like 'fuck man, shouldn't have done that', but doesn't necessarily GG (unless he's IdrA). The game goes on, but Z plays at such a disadvantage that in the next 5-10 minutes P will surely claim his victory unless P messes up.

See the fallacy? That Z has that 'early game advantage' doesn't mean that Z is more powerful or that 7pools are too powerful, it just means that IF the game immediately ends due to a 7pool it will most likely be in Z's favour. If the 7pool fails, the game doesn't end at that point, Z will most likely stay in the game and play from a significant disadvantage to lose later.
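
The two 7pool scenarios above can be put into numbers with a tiny Python sketch. The probabilities are invented for illustration, and the model assumes the only games that end early are successful rushes, which is a simplification.

```python
def seven_pool(p_early_kill, p_win_after_hold):
    """Return (Z share of games that end early, Z's overall winrate with this build)."""
    # Toy assumption: the only early endings are successful rushes, all of them Z wins.
    z_share_of_early_endings = 1.0 if p_early_kill > 0 else 0.0
    # Law of total probability: win early, or fail to win early and still win later.
    overall = p_early_kill + (1.0 - p_early_kill) * p_win_after_hold
    return z_share_of_early_endings, overall

# Hypothetical numbers: the rush kills P outright 25% of the time,
# and a Z whose rush was held wins only 20% of the remaining games.
early, overall = seven_pool(p_early_kill=0.25, p_win_after_hold=0.2)
print("Z share of early endings:", early)
print("Z overall winrate with the 7pool:", round(overall, 3))
```

Under these made-up numbers Z wins every game that ends early, yet the build loses more often than it wins overall.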

It is a grave statistical error of the same magnitude as interpreting 'If an 8-year-old child dies, the most likely cause is a car accident' as 'It is very likely that 8-year-old children die in car accidents'.

The graph doesn't even say how likely it is that the game ends in each interval. For all you know, it's far more likely for P to win in the late game than in the mid game, even though the graph indicates that if the game ends in the late game, the chance is higher that Z takes it. And even so, that still says nothing about advantages of races at certain times. One would assume that if a race is likely to win at time X, that race enjoyed an advantage slightly before that time, no?

What would be far more interesting, though also not conclusive, would be a graph which outlines 'How large was the percentage of Z wins in ZvP at each interval', which is fundamentally different from 'at each interval, if the game ends, how often does it end in Z's favour in ZvP'. My bet is that because 7pools are actually quite rare, it would not show the huge spike for Z in the early game at all.
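
To make the difference between the two kinds of graph concrete, here is a minimal Python sketch. The counts are entirely hypothetical, chosen only to mimic the shape described above (60% for Z in the 0-5 minute bucket, then even, then slightly P-favoured, then Z again). Reading "percentage of Z wins at each interval" as "share of all games that are Z wins ending in that interval" is one possible interpretation, not necessarily the only one.

```python
# Hypothetical counts for 1000 ZvP games: (games ending in interval, Z wins among them).
games = {
    "0-5":   (20, 12),
    "5-10":  (180, 90),
    "10-15": (300, 141),
    "15-20": (300, 141),
    "20+":   (200, 108),
}

total = sum(ended for ended, _ in games.values())

def conditional_z_rate(interval):
    """P(Z wins | game ends in interval): what the posted graph shows."""
    ended, z_wins = games[interval]
    return z_wins / ended

def joint_z_share(interval):
    """P(game ends in interval AND Z wins), as a share of all games."""
    _, z_wins = games[interval]
    return z_wins / total

overall_z = sum(z for _, z in games.values()) / total

for interval in games:
    print(interval, round(conditional_z_rate(interval), 3), round(joint_z_share(interval), 3))
print("overall Z winrate:", overall_z)
```

With these invented counts the 0-5 minute bucket shows 60% for Z in the conditional graph, but those wins are barely more than 1% of all games, so the early spike disappears in the second kind of graph.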
halfies
Profile Joined November 2011
United Kingdom327 Posts
May 04 2012 00:09 GMT
#2
praetentious?
really?
i hope that how he spelt it too, because it would be really funny if he wrote this much about people being pretentious and used big words that he couldn't even spell.
Lorizean
Profile Blog Joined March 2011
Germany1330 Posts
May 04 2012 00:12 GMT
#3
None of this is new or even in-depth. He starts off by saying that probability and stats are a field that people study, and then goes on to bring arguments anybody could make. Not really proving a point there.
Not that I'm saying he's necessarily wrong, but having such an arrogant tone and then bringing something this trivial to the table is a bit... pretentious.
imMUTAble787
Profile Joined November 2011
United States680 Posts
May 04 2012 00:13 GMT
#4
reddit is a cesspool of memes and retardation

that being said, i love the job the mods do on these forums.

as far as stats etc go i really dont pay any mind to them. the only ones i think have any validity would be the GSL player stats because they are tournament level and reliable.
*eternalenvy fanboy*
Ripper41
Profile Joined July 2011
284 Posts
Last Edited: 2012-05-04 00:16:54
May 04 2012 00:16 GMT
#5
On May 04 2012 09:09 halfies wrote:
praetentious?
really?
i hope that how he spelt it too, because it would be really funny if he wrote this much about people being pretentious and used big words that he couldn't even spell.

Well he explains...
it's worse, he spelled it that way intentionally. still, he makes some good points in the substance of what he says.
Chocolate
Profile Blog Joined December 2010
United States2350 Posts
May 04 2012 00:17 GMT
#6
So basically the person who came up with this thinks that he is smarter than 99% of everyone else and then proceeds to list "fallacies" which tend to skew game results. Then he explains that ZvP ends early more often in favor of the zerg because of early pools.

I don't really see the value in this. It only shows what we already knew, that statistics aren't always true due to a couple factors. He also calls us pretentious but spells it oddly :/. The author himself strikes me as very pretentious.
Integra
Profile Blog Joined January 2008
Sweden5626 Posts
Last Edited: 2012-05-04 00:19:21
May 04 2012 00:18 GMT
#7
Yes, statistics taken out of context and without understanding of what they are supposed to prove are misleading. Good for reddit that they finally understand this.

EDIT. @Chocolate: there is no value in the post, the poster just wanted to bash on TL.Net community and used statistics as an excuse.
"Dark Pleasure" | | I survived the Locust war of May 3, 2014
dmasterding
Profile Blog Joined January 2011
United States205 Posts
May 04 2012 00:19 GMT
#8
On May 04 2012 09:13 imMUTAble787 wrote:
reddit is a cesspool of memes and retardation

that being said, i love the job the mods do on these forums.

as far as stats etc go i really dont pay any mind to them. the only ones i think have any validity would be the GSL player stats because they are tournament level and reliable.


Please stop bashing the r/sc forums if you don't actually know what goes on over there. That was a lot more prevalent before, but even if you look at the front page right now you can see that there's a lot of legitimate StarCraft-related content and not just random shitty memes.
No tears now, only dreams.
Trumpstyle
Profile Joined May 2011
Sweden114 Posts
May 04 2012 00:23 GMT
#9
Sorry, his theory is total garbage, as he thinks the skill gap between the races is massive (it's minimal) and that it screws with the statistics.
dmasterding
Profile Blog Joined January 2011
United States205 Posts
May 04 2012 00:26 GMT
#10
Did any of you guys actually read the thing? He didn't actually give any opinions about the matchups, he was just trying to get rid of some misconceptions people had about interpretation of results. I am pretty sure that if the OP never mentioned this person was from r/SC you guys wouldn't be so biased against the author.
No tears now, only dreams.
i)awn
Profile Joined October 2011
United States189 Posts
Last Edited: 2012-05-04 00:28:04
May 04 2012 00:26 GMT
#11
While tournaments are indeed not reliable to calculate balance statistics because of the small pool and elimination process as Drabzalver said; tournaments and statistics still contain quite useful information. Many people might not interpret or analyze the numbers correctly but that doesn't mean that the numbers are useless.
Cyberonic
Profile Blog Joined September 2011
Germany80 Posts
May 04 2012 00:33 GMT
#12
On May 04 2012 09:26 dmasterding wrote:
Did any of you guys actually read the thing? He didn't actually give any opinions about the matchups, he was just trying to get rid of some misconceptions people had about interpretation of results. I am pretty sure that if the OP never mentioned this person was from r/SC you guys wouldn't be so biased against the author.


I think so, too... There's no reason for bashing around. I wrote why I included his text but I'd rather have a discussion on the topic and not its preface.
Bidj
Profile Joined September 2010
France98 Posts
May 04 2012 00:37 GMT
#13
At least a very interesting read.
Rooooaaaar
Jinsho
Profile Joined March 2011
United Kingdom3101 Posts
May 04 2012 00:37 GMT
#14
IdrA has always been referring to points 1 and 2 when asked about winrates.
windsupernova
Profile Joined October 2010
Mexico5280 Posts
May 04 2012 00:40 GMT
#15
As much as I agree with him on some points, I don't like how he comes off as pretty arrogant and doesn't even present any credentials showing why he understands statistics better than 99% of people. I mean, for all we know he could be some arrogant college kid who just passed his first statistics class.

Arguing from authority only works if you prove you have some kind of authority. But his arguments are nice and he does seem to have some understanding of statistics. But then he doesn't say how we should go about interpreting those statistics, or provide proof.

That being said I do think most of the people take a really simplistic approach to statistics, but well statistics are a hard subject to tackle
"Its easy, just trust your CPU".-Boxer on being good at games
LaM
Profile Blog Joined September 2011
United States1321 Posts
May 04 2012 00:42 GMT
#16
On May 04 2012 09:13 imMUTAble787 wrote:
reddit is a cesspool of memes and retardation

that being said, i love the job the mods do on these forums.

as far as stats etc go i really dont pay any mind to them. the only ones i think have any validity would be the GSL player stats because they are tournament level and reliable.


Maybe you should read the OP instead of just the qualifier.
Anything is Possible
LaM
Profile Blog Joined September 2011
United States1321 Posts
May 04 2012 00:43 GMT
#17
On May 04 2012 09:23 Trumpstyle wrote:
Sorry, his theory is total garbage, as he thinks the skill gap between the races is massive (it's minimal) and that it screws with the statistics.


He makes literally 0 claims about the skill gap between races. Reread what he said.
Anything is Possible
JitnikoVi
Profile Joined May 2010
Russian Federation396 Posts
May 04 2012 00:45 GMT
#18
i dont understand the point of this... most of this is common sense yet it states that 99% of people dont know this?

really?
In theory yes, but theoretically, no.
Mephiztopheles1
Profile Blog Joined December 2010
1124 Posts
May 04 2012 00:49 GMT
#19
On May 04 2012 09:26 dmasterding wrote:
Did any of you guys actually read the thing? He didn't actually give any opinions about the matchups, he was just trying to get rid of some misconceptions people had about interpretation of results. I am pretty sure that if the OP never mentioned this person was from r/SC you guys wouldn't be so biased against the author.

I read it. He is extremely aggressive so he begets extremely aggressive answers.

Regardless of his tone and of any tl vs reddit thing brewing here, his points should accompany threads (better phrased, of course) like the TLPD win rates thread in my opinion so that users with no background in statistics (for whatever reason) can have a better grasp of what the data presented means and/or doesn't.
xrapture
Profile Blog Joined December 2011
United States1644 Posts
May 04 2012 00:55 GMT
#20
If that is a top of the line post from reddit then I'm glad I stick to TL lol.

He just said random stuff in an aggressive way (stuff everyone knows anyway) and called TLPD statistics pretentious? hypocrite much?
Everyone is either delusional, a nihlilst, or dead from suicide.