What you don't know about statistics

Cyberonic
Profile Blog Joined September 2011
Germany80 Posts
Last Edited: 2012-05-04 07:33:28
May 04 2012 00:00 GMT
#1
This post is originally written by Drabzalver on Reddit. Since he does not have a TL account I asked and was allowed to repost if I include this:

Drabzalver on Reddit:
TL is the [swear word] of praetentious [swear word] where moderators reward supposed 'high quality posts' which are full of statistical and scientific garbage I just outlined just because they are 'praesented nicely' and the mods basically think that any long post with a lot of images and no swear words is intellectually advanced while often a lot of it is total garbage filled with wrong interpretations and grave statistical errors.
Also, I don't have a TL account, you're welcome to repost, but do include this qualifier, should be fun,

Clarification: This is not my opinion of this community and I only posted the quote because I was asked to and respect his author rights. The content is very interesting.


EDIT: A lot of people claim the OP/author just wants to bash the TL community. That is wrong! The text is directed towards the Reddit community and was originally called "What reddit doesn't know about statistics"; I deliberately altered the title... I see this might have led to some confusion.
-----------------------------------------------------------------------------------------------------------------------------------

What you don't know about statistics
by Drabzalver

Not a day goes by without some stat or graph being posted here or wherever in the community. 99% of the time it deals with winrates and balance, and 99% of the people commenting on it have no clue how to interpret these numbers or how little they actually mean. And no, this has nothing to do with sample size. Statistics and probability theory are something you study for years; it is a specialized field that exists precisely to keep people from making the errors people make here. So stay a while and listen:

- Fallacy: winrates indicate balance

Yes and no. For the most part they do not indicate balance; rather, they indicate balance shifts. It's so trivial to see that it boggles me that people don't figure this out on their own. Assume that race X was actually underpowered last month and is balanced this month. This means that the X players who qualified last month and stayed in tournaments are actually better than the Y and Z players. Now that the game is balanced, they start smashing Y and Z simply because they are better overall, which suddenly makes the graph look as if X has been 'overbuffed' or whatever else, while in reality the few X players who were around in the scene were just better. This will continue until the mediocre Y and Z players, who were carried by their race more than the X players were, get weeded out.

This extends even further. Most tournaments have qualifiers, so say X is underpowered: the X players who get into the tournament are simply better, because they got in despite the imbalance. As they are better, they will continue to win despite the imbalance stacked against them, thereby skewing the results closer to 50-50 than they actually are.

In fact, there are even more things wrong with this idea. Tournaments generally feature some form of elimination, which means that better players stay in the tournament longer and therefore contribute more to the number of games played. Since these are the players who overcome imbalance, this again skews the results closer to 50-50 than they actually are.

So yes, no matter how you look at it, winrates will skew closer to 50-50 than the true balance, and monthly spikes and variation indicate balance shifts more than absolute balance. The only way to find out absolute balance is to get a random pool of pros, force them to play random, and hold a round robin tournament to ensure that everyone plays the same number of games. It is very unfeasible to get enough games for a reasonable sample size that way.
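
The qualifier effect can be illustrated with a toy simulation. This is only a sketch under made-up assumptions: player skill is drawn from a normal distribution, race X carries a fixed handicap of 0.5, win probability follows a logistic curve, and the function name `simulate_qualified_winrate` and all parameters are hypothetical. None of it is fitted to real tournament data.

```python
import math
import random


def simulate_qualified_winrate(handicap=0.5, pool=2000, slots=64,
                               games=20000, seed=0):
    """Return (observed X-vs-Y winrate among qualifiers, #X, #Y qualified)."""
    rng = random.Random(seed)
    # Effective strength = personal skill, minus a racial handicap for X.
    x_pool = [rng.gauss(0, 1) - handicap for _ in range(pool)]
    y_pool = [rng.gauss(0, 1) for _ in range(pool)]

    # Qualifier: only the strongest `slots` players overall get in.
    ranked = sorted([(s, "X") for s in x_pool] + [(s, "Y") for s in y_pool],
                    reverse=True)[:slots]
    x_in = [s for s, race in ranked if race == "X"]
    y_in = [s for s, race in ranked if race == "Y"]

    # Tournament games: random X-vs-Y pairings among the qualified players.
    x_wins = 0
    for _ in range(games):
        a, b = rng.choice(x_in), rng.choice(y_in)
        p_x = 1 / (1 + math.exp(-(a - b)))  # logistic win probability
        x_wins += rng.random() < p_x
    return x_wins / games, len(x_in), len(y_in)


# Winrate an X player would have against an equally skilled Y player.
equal_skill = 1 / (1 + math.exp(0.5))
observed, n_x, n_y = simulate_qualified_winrate()
print(f"equal-skill X winrate: {equal_skill:.3f}")
print(f"observed X winrate among qualifiers: {observed:.3f} ({n_x} X, {n_y} Y)")
```

Under these assumptions the tournament winrate should land much closer to 50% than the equal-skill figure, while the imbalance shows up mainly in how few X players make it through the qualifier.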

- Fallacy: sample size is big enough

The TLPD winrate graphs are praetentious and amateurish, sorry to say it but that's how it is. The error bars there are pure bollocks: they are calculated using the rules of independent probability experiments, that is to say, it is assumed that the result of every series has no effect on the others, as if you were flipping a coin. If they were independent, the sample size would be enough by a large margin to say something, but they are not independent, because you're dealing with players, not just games. Good players simply ruin the idea of independent experiments. The fact that Stephano won 4-0 this time and 4-0 the last time is not independent; the two results have a common cause: Stephano is fucking good. More than that, as indicated before, most tournaments feature a form of elimination, which means that because Stephano is fucking good he simply contributes more to the sample size than players who are not as good. One has to realize that in some single elimination tournaments, it's possible for the champion to have played 40% of all the games in the tournament...
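
As a rough check on that last figure, here is a small sketch that counts series rather than individual games, assuming a plain single-elimination bracket whose size is a power of two (the helper name `champion_match_share` is made up for illustration):

```python
def champion_match_share(players: int) -> float:
    """Fraction of all series in a single-elimination bracket that
    involve the eventual champion. Assumes `players` is a power of two."""
    if players < 2 or players & (players - 1):
        raise ValueError("players must be a power of two, at least 2")
    rounds = players.bit_length() - 1   # the champion plays one series per round
    total_series = players - 1          # every series eliminates exactly one player
    return rounds / total_series


for n in (4, 8, 16, 32):
    print(n, round(champion_match_share(n), 3))
```

With 8 players the champion appears in 3 of the 7 series, a bit over 40%; the share drops as the bracket grows, and the exact share of games also depends on series length.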

As an extreme example to show that the idea is fundamentally flawed: say you have 2 and only 2 players, the best 2 in the world, let's say MKP vs DRG. They play a thousand games, and this ends 720-280 in MKP's favour. Can you then say 'Clearly, since this is taken from the absolute top, TvZ is highly imbalanced, as a thousand games is a very large sample size'? No, you cannot. While this is an extreme example, it shows the general idea and the fallacy in it: the games are not independent probability experiments.

Especially in Korea, it is very likely that the TLPD graphs we're all so fond of do not indicate balance overall; they just indicate whichever race has a couple of good players this month who dominated everything. The Korean graph just shifts around every month, while the number of games should be large enough to stop that from happening if they were independent experiments. But they aren't; they are quite dependent, and how much the KR graph flips around each month demonstrates how unreliable it is.

Simply put, the number of games is large enough to be significant, but the number of players is way too small.
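
To see how much a handful of players can move a monthly graph, here is a toy comparison. It is a sketch under invented assumptions: 500 games per month for one race, either played as pure coin flips, or split across 5 players whose personal win chance is drawn each month from 30% to 70%. The function `monthly_winrates` and its parameters are hypothetical.

```python
import random
import statistics


def monthly_winrates(months, games_per_month, players, spread, rng):
    """Monthly winrate of one race when its games are split evenly across
    `players`, each with a personal win chance of 0.5 +/- `spread`."""
    per_player = games_per_month // players
    rates = []
    for _ in range(months):
        wins = 0
        for _ in range(players):
            p = 0.5 + rng.uniform(-spread, spread)  # this player's form this month
            wins += sum(rng.random() < p for _ in range(per_player))
        rates.append(wins / (per_player * players))
    return rates


rng = random.Random(1)
independent = monthly_winrates(200, 500, 5, 0.0, rng)  # every game is a fair coin
clustered = monthly_winrates(200, 500, 5, 0.2, rng)    # results driven by 5 players
sd_independent = statistics.stdev(independent)
sd_clustered = statistics.stdev(clustered)
print(f"month-to-month sd, independent games: {sd_independent:.3f}")
print(f"month-to-month sd, player-driven games: {sd_clustered:.3f}")
```

Both scenarios have the same number of games and the same long-run 50% winrate, yet the player-driven months should swing noticeably more than error bars based on independent games would suggest.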

- Fallacy: advantages at certain times of matchups expressed in graphs

There are also a lot of graphs posted which supposedly indicate that some races have an advantage at certain times in certain matchups. Oh boy, do people misread what these graphs mean. Take this bad boy.

A naïve way to read it would simply be 'Hmm, Z has an early game advantage in ZvP, then it becomes about even, then P has a slight advantage up to the end, then Z again.' Wrong. Look closely at what the graph actually says. It says this:
IF a ZvP game ends in the 0-5 minute range, the chance is 60% it ends in Z's favour.
Now the 'if' is so bloody important here, the game needn't end. Now, everyone of course realizes that that part is caused by early pools. Does Zerg really have a large advantage at that point? Can Zerg force a win at that point if they want to, are early pools overpowered? No, not at all, so what is going on?

Imagine a ZvP, Z decides to 7pool, P doesn't scout soon enough, lings get in, kill every probe, traalalala, P GG's. Game over in the first 5 minutes in Z's favour.
Okay, imagine a ZvP, Z decides to 7pool, P scouts in time, gets his wall up, damn, Z's like 'fuck man, shouldn't have done that'. But Z doesn't necessarily GG, unless he's IdrA. The game goes on, but Z plays at such a disadvantage that in the next 5-10 minutes P will surely claim his victory unless he messes up.

See the fallacy? That Z has that 'early game advantage' doesn't mean that Z is more powerful or that 7pools are too powerful, it just means that IF the game immediately ends due to a 7pool it will most likely be in Z's favour. If the 7pool fails, the game doesn't end at that point, Z will most likely stay in the game and play from a significant disadvantage to lose later.

It is a grave statistical error of the magnitude of interpreting 'If an 8-year-old child dies, the most likely cause is a car accident' as 'It is very likely that 8-year-old children die in car accidents'.

The graph doesn't even say how likely it is that the game ends at certain intervals. For all you know, it's far more likely for P to win in the late game than in the mid game, even though the graph indicates that if the game ends in the late game, the chance is higher that Z takes it. And even so, that still says nothing about advantages of races at certain times. One would assume that if a race is likely to win at time X, that race enjoys an advantage slightly before that time, no?

What would be far more interesting, though also not conclusive, would be a graph which outlines 'How large was the percentage of Z wins in ZvP at each interval', which is fundamentally different from 'at each interval, if the game ends, how often does it end in Z's favour in ZvP'. My bet is that because 7pools are actually quite rare, it would not at all show the huge spike for Z in the early game.
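
The difference between the two readings can be shown with a few lines of arithmetic. The counts below are entirely hypothetical, chosen only so that the 0-5 minute bucket matches the 60% figure discussed above; the bucket boundaries and the helper `bucket_stats` are likewise made up.

```python
# Hypothetical ZvP results per game-length bucket: (Z wins, P wins).
results = {
    "0-5 min": (6, 4),
    "5-15 min": (190, 210),
    "15-25 min": (230, 270),
    "25+ min": (50, 40),
}
total_games = sum(z + p for z, p in results.values())


def bucket_stats(bucket):
    z, p = results[bucket]
    return {
        # What the posted graph shows: P(Z wins | game ends in this bucket).
        "z_winrate_if_ends_here": z / (z + p),
        # What it hides: how often games end in this bucket at all.
        "share_of_games_ending_here": (z + p) / total_games,
        # Share of all ZvP games that are Z wins ending in this bucket.
        "share_of_games_z_wins_here": z / total_games,
    }


for name in results:
    print(name, bucket_stats(name))
```

In this made-up data set Z wins 60% of the games that end in the first five minutes, but those wins are well under 1% of all ZvP games, so the early 'spike' says very little about who is ahead at that point.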
halfies
Profile Joined November 2011
United Kingdom327 Posts
May 04 2012 00:09 GMT
#2
praetentious?
really?
i hope that's how he spelt it too, because it would be really funny if he wrote this much about people being pretentious and used big words that he couldn't even spell.
Lorizean
Profile Blog Joined March 2011
Germany1330 Posts
May 04 2012 00:12 GMT
#3
None of this is new or even in-depth. He starts off by saying that probability and stats are a field that people study and continues on to bring arguments anybody could make. Not really proving a point there.
Not that I'm saying he's necessarily wrong, but having such an arrogant tone and then bringing something this trivial to the table is a bit... pretentious.
imMUTAble787
Profile Joined November 2011
United States680 Posts
May 04 2012 00:13 GMT
#4
reddit is a cesspool of memes and retardation

that being said, i love the job the mods do on these forums.

as far as stats etc go i really dont pay any mind to them. the only ones i think have any validity would be the GSL player stats because they are tournament level and reliable.
*eternalenvy fanboy*
Ripper41
Profile Joined July 2011
284 Posts
Last Edited: 2012-05-04 00:16:54
May 04 2012 00:16 GMT
#5
On May 04 2012 09:09 halfies wrote:
praetentious?
really?
i hope that's how he spelt it too, because it would be really funny if he wrote this much about people being pretentious and used big words that he couldn't even spell.

Well he explains...
it's worse, he spelled it that way intentionally. still, he makes some good points in the substance of what he says.
Chocolate
Profile Blog Joined December 2010
United States2350 Posts
May 04 2012 00:17 GMT
#6
So basically the person who came up with this thinks that he is smarter than 99% of everyone else and then proceeds to list "fallacies" which tend to skew game results. Then he explains that ZvP ends early more often in favor of the zerg because of early pools.

I don't really see the value in this. It only shows what we already knew, that statistics aren't always true due to a couple factors. He also calls us pretentious but spells it oddly :/. The author himself strikes me as very pretentious.
Integra
Profile Blog Joined January 2008
Sweden5626 Posts
Last Edited: 2012-05-04 00:19:21
May 04 2012 00:18 GMT
#7
Yes, statistics taken out of context and without understanding of what they are supposed to prove are misleading. Good for reddit that they finally understand this

EDIT. @Chocolate: there is no value in the post, the poster just wanted to bash on TL.Net community and used statistics as an excuse.
"Dark Pleasure" | | I survived the Locust war of May 3, 2014
dmasterding
Profile Blog Joined January 2011
United States205 Posts
May 04 2012 00:19 GMT
#8
On May 04 2012 09:13 imMUTAble787 wrote:
reddit is a cesspool of memes and retardation

that being said, i love the job the mods do on these forums.

as far as stats etc go i really dont pay any mind to them. the only ones i think have any validity would be the GSL player stats because they are tournament level and reliable.


Please stop bashing the r/sc forums if you don't actually know what goes on over there. That was a lot more prevalent before, but even if you look at the front page right now you can see that there's a lot of legitimate StarCraft-related content and not just random shitty memes.
No tears now, only dreams.
Trumpstyle
Profile Joined May 2011
Sweden114 Posts
May 04 2012 00:23 GMT
#9
Sorry, his theory is total garbage, as he thinks the skill gap between the races is massive (it's minimal) and screws with the statistics.
dmasterding
Profile Blog Joined January 2011
United States205 Posts
May 04 2012 00:26 GMT
#10
Did any of you guys actually read the thing? He didn't actually give any opinions about the matchups, he was just trying to get rid of some misconceptions people had about interpretation of results. I am pretty sure that if the OP never mentioned this person was from r/SC you guys wouldn't be so biased against the author.
No tears now, only dreams.
i)awn
Profile Joined October 2011
United States189 Posts
Last Edited: 2012-05-04 00:28:04
May 04 2012 00:26 GMT
#11
While tournaments are indeed not reliable to calculate balance statistics because of the small pool and elimination process as Drabzalver said; tournaments and statistics still contain quite useful information. Many people might not interpret or analyze the numbers correctly but that doesn't mean that the numbers are useless.
Cyberonic
Profile Blog Joined September 2011
Germany80 Posts
May 04 2012 00:33 GMT
#12
On May 04 2012 09:26 dmasterding wrote:
Did any of you guys actually read the thing? He didn't actually give any opinions about the matchups, he was just trying to get rid of some misconceptions people had about interpretation of results. I am pretty sure that if the OP never mentioned this person was from r/SC you guys wouldn't be so biased against the author.


I think so, too... There's no reason for bashing around. I wrote why I included his text but I'd rather have a discussion on the topic and not its preface.
Bidj
Profile Joined September 2010
France98 Posts
May 04 2012 00:37 GMT
#13
At least a very interesting read.
Rooooaaaar
Jinsho
Profile Joined March 2011
United Kingdom3101 Posts
May 04 2012 00:37 GMT
#14
IdrA has always been referring to points 1 and 2 when asked about winrates.
windsupernova
Profile Joined October 2010
Mexico5280 Posts
May 04 2012 00:40 GMT
#15
As much as I agree with him on some points, I don't like how he comes off as pretty arrogant and doesn't even present some kind of credentials for why he understands statistics better than 99% of people. I mean, for all we know he could be some arrogant college kid who just passed his 1st statistics class.

Arguing from authority only works if you prove you have some kind of authority. But his arguments are nice and he does seem to have some kind of understanding of statistics. But then he doesn't say how we should go about interpreting those statistics, nor does he provide proof.

That being said I do think most of the people take a really simplistic approach to statistics, but well statistics are a hard subject to tackle
"Its easy, just trust your CPU".-Boxer on being good at games
LaM
Profile Blog Joined September 2011
United States1321 Posts
May 04 2012 00:42 GMT
#16
On May 04 2012 09:13 imMUTAble787 wrote:
reddit is a cesspool of memes and retardation

that being said, i love the job the mods do on these forums.

as far as stats etc go i really dont pay any mind to them. the only ones i think have any validity would be the GSL player stats because they are tournament level and reliable.


Maybe you should read the OP instead of just the qualifier.
Anything is Possible
LaM
Profile Blog Joined September 2011
United States1321 Posts
May 04 2012 00:43 GMT
#17
On May 04 2012 09:23 Trumpstyle wrote:
Sorry, his theory is total garbage, as he thinks the skill gap between the races is massive (it's minimal) and screws with the statistics.


He makes literally 0 claims about the skill gap between races. Reread what he said.
Anything is Possible
JitnikoVi
Profile Joined May 2010
Russian Federation396 Posts
May 04 2012 00:45 GMT
#18
i dont understand the point of this... most of this is common sense yet it states that 99% of people dont know this?

really?
In theory yes, but theoretically, no.
Mephiztopheles1
Profile Blog Joined December 2010
1124 Posts
May 04 2012 00:49 GMT
#19
On May 04 2012 09:26 dmasterding wrote:
Did any of you guys actually read the thing? He didn't actually give any opinions about the matchups, he was just trying to get rid of some misconceptions people had about interpretation of results. I am pretty sure that if the OP never mentioned this person was from r/SC you guys wouldn't be so biased against the author.

I read it. He is extremely aggressive so he begets extremely aggressive answers.

Regardless of his tone and of any tl vs reddit thing brewing here, his points should accompany threads (better phrased, of course) like the TLPD win rates thread in my opinion so that users with no background in statistics (for whatever reason) can have a better grasp of what the data presented means and/or doesn't.
xrapture
Profile Blog Joined December 2011
United States1644 Posts
May 04 2012 00:55 GMT
#20
If that is a top of the line post from reddit then I'm glad I stick to TL lol.

He just said random stuff in an aggressive way (stuff everyone knows anyway) and called TLPD statistics pretentious? hypocrite much?
Everyone is either delusional, a nihlilst, or dead from suicide.
The contents of this webpage are copyright © 2025 TLnet. All Rights Reserved.