[G] How To Apply And Understand Statistics

Sleight
Profile Blog Joined May 2009
2471 Posts
Last Edited: 2010-09-17 01:03:19
September 16 2010 21:20 GMT
#1
Hey y'all,

There will always be a lot of talk about trends from a statistical perspective. People love to throw numbers around to support their argument, because numbers cannot lie, right? Well, yes, numbers themselves cannot lie, but with incorrect application or selection, they can provide an incomplete or incorrect picture. So all I want to do is clarify basic terms and statistical concepts for everyone, so that y'all know enough to intelligently understand and apply statistical tests to data.


Terms

People love to banter over what the various terms mean in statistics, because, in many cases, there is some wiggle room. What I will provide here are clear definitions of what they are supposed to mean, and how we can use them appropriately.

1) "Probability"
Probability, in general, comes from very complicated mathematical theories regarding how random events should behave given a large amount of data. What you should know is that "a probability" refers to the chance of an event relative to all events, for example, the number of heads out of the total number of coin flips. So we would say the probability of getting heads when you flip a coin is .5. Probabilities should always be between 0 and 1.

2) "Odds"
Odds is one of the most misused words. Odds are a RATIO, whereas Probability is a PROPORTION. A ratio is the chance of 1 event relative to another event, where both events are mutually exclusive. The odds of flipping a coin and getting heads is 1 (1 chance at heads/1 chance at tails = 1). Odds should always be between 0 and positive infinity.
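
To make the ratio/proportion distinction concrete, here is a minimal sketch of converting between the two. The function names are my own, not from any library, and the 75% player is a made-up example.

```python
def probability_to_odds(p: float) -> float:
    """Convert a probability (a proportion in [0, 1)) to odds (a ratio)."""
    if not 0 <= p < 1:
        raise ValueError("p must be in [0, 1); odds are infinite at p = 1")
    return p / (1 - p)


def odds_to_probability(odds: float) -> float:
    """Convert odds (a ratio in [0, inf)) back to a probability."""
    if odds < 0:
        raise ValueError("odds cannot be negative")
    return odds / (1 + odds)


coin_odds = probability_to_odds(0.5)         # heads vs. tails on a fair coin
favourite_odds = probability_to_odds(0.75)   # hypothetical player winning 3 games in 4
```

A probability of .5 gives odds of 1, and a probability of .75 gives odds of 3 (three wins for every loss), which is why the two numbers should never be quoted interchangeably.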

3) "Mean"
The mean is the average value of all your data points, whether in a sample or population. This value can be dramatically skewed by the high and low ends of your data set.

4) "Median"
The actual 'middle' point. Consider if you wrote down all data points in a line, in order, and took off those at each end until you had 1 (or 2) left. That number left is the median (with 2 left, take their average). It is often more representative of the sample/population than the mean, because extreme values do not pull it around. When the median and the mean are statistically significantly (we will discuss this term later) different, your sample/population typically has a skewed, non-normal distribution.
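
A small sketch of how one extreme value drags the mean but not the median. The win counts are invented purely for illustration.

```python
from statistics import mean, median

# Hypothetical win counts for nine players; the last one is an outlier.
wins = [10, 11, 12, 12, 13, 13, 14, 15, 98]

wins_mean = mean(wins)      # pulled upward by the single 98
wins_median = median(wins)  # the middle value once the list is sorted
```

Here the mean comes out at 22, higher than eight of the nine players, while the median stays at 13.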

5) "Variance"
Imagine you rolled a die 12 times and got only 3s and 4s. The variance of that data set is low, because 3 and 4 are close together. If you roll a die 12 times and get each of 1-6 twice, that data has a higher variance. Low variances suggest that the data is clustered around the mean. High variances suggest that the data is spread out. We can account for differing variances if we apply our test statistics correctly.
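
The die example can be checked directly. This sketch assumes twelve rolls in each case (six 3s and six 4s, versus every face twice) and treats the rolls as the whole data set, so it uses the population variance.

```python
from statistics import pvariance

clustered = [3, 4] * 6                 # twelve rolls, only 3s and 4s
spread_out = [1, 2, 3, 4, 5, 6] * 2    # twelve rolls, each face twice

clustered_var = pvariance(clustered)
spread_var = pvariance(spread_out)
```

Both data sets have the same mean of 3.5, but the variances are 0.25 and about 2.92, which is exactly the difference the mean alone hides.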

6) "Test Statistic"
The mathematical test or equation we will use to analyze the data. There are an incredible number of these and selecting an appropriate one is one of the challenges of data analysis.

7) "Sample"
We can say that a sample, whose size is referred to as 'n', is a representative portion of the true population, where the population is effectively everyone. So if you took 100 players out of all the SC2 players, that would be your sample. If we select our sample correctly, it should accurately reflect our true population, with some exceptions. Importantly, the ONLY difference between a sample and a population for statistical analysis is in the names given to the variables representing variance and mean values. Almost all statistical tests CAN STILL apply even if you know the entire population, because you are not just examining if the data can represent a population but also if the data is possible according to a given distribution, normal, random, or non-random, for example. If our sample is too small, our tests lose Power and test statistics often cannot provide us statistically significant results.

8) "Null hypothesis"
A null hypothesis is what we expect to be true (or what should be true, in some cases). For example, that all three races in SC2 are equally powerful, which would be represented by equal success in terms of Elo, win percentage, etc. A simple example is the null hypothesis that the probability of getting heads when flipping a coin should be .5.

9) "P-value"
A p-value is the cornerstone of statistical analysis. What a p-value, and not alpha, refers to is the probability that, assuming the null hypothesis is true and given an analogous set of data, meaning concerning the same topic and within a similar range and sample size, your test statistic would show a difference from the null hypothesis AS or MORE extreme than the one you observed.
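
For the coin-flip null hypothesis, the p-value can be computed exactly rather than looked up. This is a minimal sketch of an exact two-sided binomial test; the function name and the 60-heads example are my own assumptions.

```python
from math import comb


def binomial_two_sided_p(successes: int, trials: int, p_null: float = 0.5) -> float:
    """Exact two-sided p-value: the total probability, under the null
    hypothesis, of every outcome no more likely than the observed one."""
    pmf = [comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
           for k in range(trials + 1)]
    observed = pmf[successes]
    # Small relative tolerance so ties are not lost to floating-point rounding.
    total = sum(prob for prob in pmf if prob <= observed * (1 + 1e-9))
    return min(1.0, total)


p_60_of_100 = binomial_two_sided_p(60, 100)   # 60 heads in 100 flips
p_50_of_100 = binomial_two_sided_p(50, 100)   # exactly what the null predicts
```

Sixty heads in 100 flips gives a p-value of roughly 0.057, so even a result that looks lopsided is not quite enough to reject a fair coin at the usual .05 level.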

10) "Power"
The ability of a test statistic to detect a difference between the sample(s) and the population at a given null hypothesis, expected difference, and sample size, assuming a difference exists. Ways to increase power include increasing the sample size and expecting a larger difference between your sample(s) and the population.
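
Power can also be computed exactly for the coin-flip case. The sketch below is one way to do it, under assumptions of my own: a two-sided test of "p = 0.5" with a symmetric rejection region, and a hypothetical true rate of 0.6.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n trials with success rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def power_of_coin_test(n: int, p_true: float, alpha: float = 0.05) -> float:
    """Power of a two-sided test of 'p = 0.5' when the real rate is p_true.

    The rejection region is symmetric: reject when the count is at most
    `low` or at least `n - low`, with `low` as large as alpha allows.
    """
    low = -1  # -1 means the region is empty and the test can never reject
    tail = 0.0
    for k in range(n // 2):
        tail += binom_pmf(k, n, 0.5)
        if 2 * tail > alpha:
            break
        low = k
    if low < 0:
        return 0.0
    # Probability of landing in the rejection region when p_true is the truth.
    return sum(binom_pmf(k, n, p_true)
               for k in range(n + 1) if k <= low or k >= n - low)


power_small = power_of_coin_test(20, 0.6)
power_medium = power_of_coin_test(100, 0.6)
power_large = power_of_coin_test(400, 0.6)
power_bigger_gap = power_of_coin_test(100, 0.7)
```

With a true rate of 0.6, twenty games will rarely detect the difference, while four hundred games almost always will. Moving the true rate to 0.7 raises the power at the same sample size.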

11) "Statistical significance"
The bombshell. This refers to whether or not, given an appropriate test statistic, the examiner can make a mathematically supported conclusion to reject the null hypothesis. We can ONLY reject the null hypothesis or say there is not enough evidence to reject the null hypothesis. We can never know the 'truth' so what we have to settle for is whether or not our test statistic gave us a different answer than expected. If the combination of sample, expected result, and data lacks sufficient power, finding statistical significance from your data is less likely or can even be impossible.

12) "alpha"
Alpha is set to .05 by convention, which means that, if we were to run this test on new data again and again, 5 percent of the resulting 95% confidence intervals would NOT contain the value of the true population. Put another way, when the null hypothesis is actually true, we would wrongly reject it 5 percent of the time. A result is statistically significant when its p-value falls below alpha. Our results are then reported as a value and a 95% confidence interval, a description of which I will add later; it is complicated and rarely necessary outside of publications.

13) "confounder"
Any factor that might account for your result other than what you are testing. If you were to have Idra and I play 100 games, and I won 50, you might conclude we are equally skilled. If he were drunk at the time, or if I cheese'd him every game, that might be a confounder. These can be explained and controlled for if you are careful in your analysis, experiment, and data collection.

14) "bias"
Any factor that skews the data of your sample and eliminates its ability to be 'externally valid,' meaning whether or not it can adequately reflect the true population. Bias may occur in selection, testing, data collection, pretty much anywhere. Common bias might be an imbalanced map pool, a tournament's non-equal race distribution or matchup distribution. The big one you will have to deal with is non-random sample selection. If you want to examine for Terran imbalance in the whole population, you cannot look at just the top percentage, because you are actually forcing a bias onto your sample. You can never actually be sure that a non-randomly selected sample reflects your population. http://en.wikipedia.org/wiki/Bias_(statistics)
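
A quick simulation shows how strongly a "top players only" sample can misrepresent the population. Everything here is hypothetical: the ladder size, the rating distribution, and the seed are assumptions chosen only to make the sketch repeatable.

```python
import random
from statistics import mean

random.seed(2010)  # fixed seed so the sketch is repeatable

# Hypothetical ladder: 10,000 players, each with a made-up skill rating.
population = [random.gauss(1500, 200) for _ in range(10_000)]

random_sample = random.sample(population, 200)          # randomly selected
top_sample = sorted(population, reverse=True)[:200]     # "top 200" only

population_mean = mean(population)
random_sample_mean = mean(random_sample)
top_sample_mean = mean(top_sample)
```

The random sample's mean lands close to the population mean, while the top-200 mean sits far above it. Any conclusion drawn from the top 200 describes the top 200, not the ladder.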


Data Sets and Appropriate Statistical Tests

I will not be able to explain all of these things in sufficient detail, but what I will be able to do is explain which ones you can use for what kind of data. These will be the most common tests we can apply in RTS.

1) The student's t-test
For a one-sample data set, like if I were to play Idra in a series, we can examine whether or not the results are consistent with an expected result. If we have a single matchup with a given sample size and expected result under a single paradigm, we can use a one-sample t-test to compare the data we have against an expectation. For example, our null hypothesis could be that the map pool is balanced ZvT. We could look at the win percentages for ZvT over all the maps, compare those to the expected .5 result across all maps, and we would be able to find a t-value. http://en.wikipedia.org/wiki/Student's_t-test#Independent_one-sample_t-test

For a two-sample data set, like if I would want to compare if Terran's success is significantly different than Zerg's, we can see if two total populations are actually different from one another. In most cases, the size and variance of each sample (the Terran results and Zerg results, individually) will be different and we can account for that with proper statistical understanding. We would set our null hypothesis to be that Terran and Zerg should have equal win rates, variances, distribution, and examine the rest of the data based on that. http://en.wikipedia.org/wiki/Student's_t-test#Independent_two-sample_t-test

Our results will be in the form of a t-value, which can be converted to a p-value with a table.
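
Both t-values can be computed with nothing but the standard library. This is a rough sketch, not a full test: the win rates are invented, the function names are mine, and the two-sample version is Welch's variant, which I chose because it does not assume equal sample sizes or variances.

```python
from math import sqrt
from statistics import mean, variance


def one_sample_t(sample: list[float], expected: float) -> tuple[float, int]:
    """t-value and degrees of freedom for 'the true mean equals expected'."""
    n = len(sample)
    standard_error = sqrt(variance(sample) / n)
    return (mean(sample) - expected) / standard_error, n - 1


def welch_two_sample_t(a: list[float], b: list[float]) -> tuple[float, float]:
    """t-value and approximate degrees of freedom for 'both means are equal',
    without assuming the two samples share a size or a variance."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va**2 / (len(a) - 1) + vb**2 / (len(b) - 1))
    return t, df


# Made-up ZvT win rates for Zerg on six maps (illustrative only).
zerg_by_map = [0.46, 0.52, 0.44, 0.49, 0.41, 0.47]
# Made-up Terran win rates across five tournaments (illustrative only).
terran_by_event = [0.55, 0.53, 0.58, 0.51, 0.56]

t_one, df_one = one_sample_t(zerg_by_map, 0.5)
t_two, df_two = welch_two_sample_t(terran_by_event, zerg_by_map)
```

In the one-sample case the t-value is about -2.24 with 5 degrees of freedom. A t-table gives 2.571 as the two-sided cutoff at alpha = .05 for 5 degrees of freedom, so even though Zerg sits under 50% on five of the six maps, this made-up data would not let us reject a balanced map pool.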

2) The chi-squared test or "goodness of fit"
This test can be used to examine whether or not a series of data points conform to an expectation. For example, if we want to examine whether or not all three races have equal win percentages, this is most appropriate. Going back to our last example, if we want to prove Terran is imbalanced we would need to show that it varies statistically significantly from the appropriate 'goodness of fit' model with regards to not just Zerg but Protoss as well, and that, given the whole dynamic, there is a demonstrable difference. Importantly, this CANNOT account for population variance like appropriate two-sample data sets can. This means that it will provide a statistically conclusive result, but that result may not be enough to support a conclusion with any external validity, meaning one that represents the population correctly. http://en.wikipedia.org/wiki/Pearson's_chi-square_test

This will give us a chi-squared value, which can be used to determine a p-value from a table.
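
For three races the table lookup can be skipped, because a chi-squared distribution with exactly 2 degrees of freedom has a simple closed-form tail. The sketch below uses invented win counts and a function name of my own.

```python
from math import exp


def chi_squared_statistic(observed: list[float], expected: list[float]) -> float:
    """Pearson's statistic: sum of (observed - expected)^2 / expected."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Made-up counts of tournament wins by race (illustrative only).
observed_wins = {"Terran": 120, "Zerg": 90, "Protoss": 90}
total = sum(observed_wins.values())
# Null hypothesis: wins are split equally between the three races.
expected_wins = [total / len(observed_wins)] * len(observed_wins)

chi2 = chi_squared_statistic(list(observed_wins.values()), expected_wins)

# Three categories leave 2 degrees of freedom, and for exactly 2 degrees of
# freedom the upper-tail probability has the closed form exp(-chi2 / 2).
p_value = exp(-chi2 / 2)
```

These counts give a chi-squared value of 6.0 and a p-value just under .05. The shortcut only holds for 2 degrees of freedom; with more categories you are back to a table or a statistics package.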

I will add more to this as time goes on and demand increases.

Conclusions

The rule is this... You need to correctly select a test, apply it correctly, and then understand its limitations with regards to your data. Even if you pick the 'right' test and have adequate data, you cannot conclude anything but one specific result from any single test. I could show that Terran outperforms Zerg by two-sample t-test, but Protoss' success is a confounder. I could show that Terran outperforms both Zerg and Protoss in terms of mean win percentage, but this would not take into account sample variance, meaning a few extremely well performing Terrans could skew the mean (a confounder). No test is perfect, so be open to the fact that your data is not conclusive. It rarely will be. The correct response is to start explaining why your data is adequate, why the confounders aren't actual confounders, why the possible bias is not actual bias, etc.

If people ask for specific explanations, I am happy to provide them. I hope this offers some clarity into the nature of statistics and statistical discussion.

Cheers!
One Love
Judicator
Profile Blog Joined August 2004
United States7270 Posts
September 16 2010 22:25 GMT
#2
So you decided to shorten the most relevant part of statistics, aka the tests and the qualifications for using each test? That would be a hell of a lot more relevant if people understood the shortcomings and strengths of each test (parametric or nonparametric) than what you posted here.
Get it by your hands...
The_Pacifist
Profile Blog Joined May 2010
United States540 Posts
September 16 2010 22:31 GMT
#3
This is TL. Balance discussions and topics on the race distribution of the top 200 players will not go from "T is OP" to "Well, I applied a Chi-Square Analysis with an alpha value of .05..." because a thread was made explaining stats terms.

Sorry.
Yurie
Profile Blog Joined August 2010
12096 Posts
September 16 2010 22:38 GMT
#4
On September 17 2010 07:31 The_Pacifist wrote:
This is TL. Balance discussions and topics on the race distribution of the top 200 players will not go from "T is OP" to "Well, I applied a Chi-Square Analysis with an alpha value of .05..." because a thread was made explaining stats terms.

Sorry.


It will not become like that as long as the ones doing statistical analysis and posting their results are people not educated in statistics. This I assume is an attempt to try to educate a few people. Yet it mostly reads as a terms explanation list.

Also, the wonder of forums is that anybody can create a thread. If you feel a thread lacks proper background, create another one with it or just post it in the thread...
Sleight
Profile Blog Joined May 2009
2471 Posts
September 16 2010 22:53 GMT
#5
On September 17 2010 07:25 Judicator wrote:
So you decided to shorten the most relevant part of statistics, aka the tests and the qualifications for using each test? That would be a hell of a lot more relevant if people understood the shortcomings and strengths of each test (parametric or nonparametric) than what you posted here.


Thanks for the feedback! I can tell you put a lot of time into it and I will try to take all that into consideration.

I'll work on expanding the tests and explain parametric vs. nonparametric. Interestingly, most test statistics are not applicable to most RTS paradigms. In fact, besides the Mann-Whitney (which is a stretch), there don't seem to be any non-parametric tests useful except the chi-squared. And regarding parametric, how would you propose anyone use an ANOVA or matched t-test?

I personally, being responsible for doing epidemiological analysis of health studies, might not have as good an understanding of why these data on irrelevant tests are so important.

Furthermore, I actually talk about example of data sets we can analyze using applicable data and why these data are not necessarily absolute. What more should I do?

Thanks in advance, I know you will have a clear answer!
One Love
Judicator
Profile Blog Joined August 2004
United States7270 Posts
September 16 2010 23:08 GMT
#6
It's not about using, it's about understanding when to use what and how to qualify your results for a given test. That's important, not so much using the test, anyone can use a program to find that. Picking the right test and then interpreting the results is what's important.

I read biology papers all the time with some really shitty statistical methods and really wonder if these people ever had a class in statistics in their lifetime. Same goes for the reviewers of the papers.

Most situations are parametric, but I just happened to work in the one lab that required non-parametrics; they just illustrated the shortcomings of parametrics even more so for me.
Get it by your hands...
Nomak
Profile Joined March 2010
United States32 Posts
September 16 2010 23:20 GMT
#7
Nice post, just remember that the word 'data' is plural.
GGTeMpLaR
Profile Blog Joined June 2009
United States7226 Posts
September 16 2010 23:24 GMT
#8
haha finally

a statistics thread has been long overdue
Starfox
Profile Joined April 2010
Austria699 Posts
September 17 2010 00:33 GMT
#9
How can you post such a thread without linking

Greek Mythology 2.0: Imagine Sisyphos as a man who wants to watch all videos on youtube... and Tityos as one who HAS to watch all of them.
gogogadgetflow
Profile Joined March 2010
United States2583 Posts
Last Edited: 2010-09-17 00:50:36
September 17 2010 00:49 GMT
#10
Would have made more sense to post this with starcraft analogies, or examples of use and misuse from other threads. I know its the general forum but... not like there's anything here that's not in wikipedia or something. Statistics terminology: I don't think many people are gonna take this into consideration when posting :/ ... good work tho
Sleight
Profile Blog Joined May 2009
2471 Posts
September 17 2010 01:01 GMT
#11
On September 17 2010 09:49 gogogadgetflow wrote:
Would have made more sense to post this with starcraft analogies, or examples of use and misuse from other threads. I know its the general forum but... not like there's anything here that's not in wikipedia or something. Statistics terminology: I don't think many people are gonna take this into consideration when posting :/ ... good work tho


There are starcraft analogies... You just have to read the explanation of the tests. Though I will add some in to the 'terms' section now.
One Love
illu
Profile Blog Joined December 2008
Canada2531 Posts
Last Edited: 2010-09-17 01:24:26
September 17 2010 01:22 GMT
#12
I didn't read the whole thing, but I will point out one thing.

In terms of mean vs median, they are both useful. For distributions that are highly symmetrical, the mean is usually a better estimator than the median for the true mean. For highly asymmetrical ones, however, median is often better.

Also, by "alpha", you mean the "level" of the test. For single hypothesis 0.05 is often used.
:]
s_86
Profile Blog Joined January 2009
United States191 Posts
September 17 2010 01:34 GMT
#13
You may ask, what has statistics done for us?

An increase in firing efficiency by 120%, the difference of a 63% increased lethal proficiency.
The contents of this webpage are copyright © 2026 TLnet. All Rights Reserved.