GSL Code S Membership statistical analysis

Mip
Joined June 2010
United States, 63 Posts
December 10 2010 03:34 GMT
#1
So I've been working on an SC2 player ranking algorithm (see my other post).

So far I've only used the GSL, and I've only included player rankings, no race bias or map bias, or time-based skill evolution (all in progress and will be implemented as my data quantity increases).

Anyway, so I was looking over the list of Code S players and thought to myself that a lot of those players could easily have lost some of their matches and failed to qualify for Code S. So I wanted to see, based on the data, what was the probability of each player actually being in the Top 32.

Here are the results in a Google Spreadsheet

So as you look at that data, bear in mind that the model treats the wins and losses from the GSL final-64 brackets as all the data in the world on the subject. This makes the algorithm non-ideal for prediction of the top skilled players. But it is ideal for assessing the uncertainty about the point system in actually getting the best players (at least for the top players).

Also bear in mind, this model implicitly assumes that not-qualifying for top 64 and not registering for the tournament are equivalent, which isn't a fair assumption, but there's no data available to fix this. JookToJung gets the raw end of this assumption. He must be very good to qualify all 3 seasons, but the model sees only his losing in the early rounds. This isn't something I like, but I don't have the proper data to correct this problem at this time.

So the table shows a lot of uncertainty about who actually belongs in Code S. There are plenty who could easily have been Code S if things had turned out slightly differently. July is easily Code S caliber, as is Ret, and Loner only needed one more set and he'd be S class.

If I had more data on the qualifying rounds, I'm sure that people like JookToJung would look better. I might look into grouping all the players that have 3 or fewer games into one, because their skills can hardly be estimated with how little data there is on them.

But the higher up on the spreadsheet you go, the more accurate the results get, since they are based on more games played. Some players are clearly Top 32. Many others are really good, but the uncertainty associated with knowing their skills is fairly high (completely an artifact of not having a lot of data on them). The way the bracket system works, it just doesn't give very good estimates for the people who get knocked out in the first rounds.

Anyway, it is what it is. It should give you an underlying sense on what kind of information is in the data. You don't have to agree with the results, it's just what the data seem to be pointing to (under the constraints of the assumptions I had to make).
Treadmill
Joined July 2010
Canada, 2833 Posts
December 10 2010 04:07 GMT
#2
This is pretty cool. Thanks a lot.
dissonantharmony
Joined August 2010
United States, 46 Posts
December 10 2010 04:41 GMT
#3
Without going back through the match history, I'm curious to know why oGsTop ranks so high without being S Class...
Mip
Joined June 2010
United States, 63 Posts
Last Edited: 2010-12-10 05:05:49
December 10 2010 04:52 GMT
#4
That's kind of the thing, the whole system is based on match history. Before the data, I say that all players are equally skilled and then let the data inform the skill parameters.

To answer your question, honestly I can't say for certain. The skill parameters are calculated using a complex mix of all of the data, borrowing strength from how good the opponents you beat are and how good the people they beat are, etc.

My guess is that in the case of oGsTop, he took out Polt who carried the information about beating MC and then he took a game off of FruitDealer. So though he was not in a lot of games, he had a pretty difficult bracket that he performed well in. Then he didn't get S class because he lost to FruitDealer in the Ro16 and didn't qualify in Seasons 2 and 3. The point system doesn't care who you lost to, this model does.
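For anyone who wants to see the "borrowing strength" idea in action, here is a minimal sketch, not the actual code behind the spreadsheet. It assumes a Bradley-Terry likelihood (the model named later in this thread) with a zero-mean Gaussian prior on each log-skill, so everyone starts out equal, and it only finds the posterior mode by gradient ascent instead of sampling the full posterior. The function name, `prior_sd`, the step size, and the toy matches are all made up for illustration.

```python
import math

def fit_bradley_terry(matches, prior_sd=1.0, lr=0.1, steps=2000):
    """MAP estimate of log-skills under a Bradley-Terry likelihood.

    matches: list of (winner, loser) name pairs.
    The zero-mean Gaussian prior encodes "all players equally skilled
    before the data"; prior_sd is an assumed value, not from the thread.
    """
    players = sorted({p for m in matches for p in m})
    skill = {p: 0.0 for p in players}
    for _ in range(steps):
        # Gradient of the log-prior: pulls every skill back toward 0.
        grad = {p: -skill[p] / prior_sd**2 for p in players}
        for winner, loser in matches:
            # P(winner beats loser) under the current skills.
            p_win = 1.0 / (1.0 + math.exp(skill[loser] - skill[winner]))
            # An upset (small p_win) moves both players the most.
            grad[winner] += 1.0 - p_win
            grad[loser] -= 1.0 - p_win
        for p in players:
            skill[p] += lr * grad[p]
    return skill

# Toy data (made up): A beats B twice, B beats C, and C never plays A.
toy = [("A", "B"), ("A", "B"), ("B", "C")]
skills = fit_bradley_terry(toy)
print(sorted(skills, key=skills.get, reverse=True))
```

Even though A and C never meet in the toy data, they still end up on a common scale, because B's results against each of them link the two.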
Drizz
Joined August 2010
25 Posts
Last Edited: 2010-12-10 05:13:40
December 10 2010 05:11 GMT
#5
nice
wherebugsgo
Joined February 2010
Japan, 10647 Posts
December 10 2010 05:47 GMT
#6
Why is MC ranked below Jinro?

Why is Genius so low?

Why is Idra below Ret? Why is Ret so high?

Why is Butterflyeffect in there twice?

It seems like the weaknesses of certain playstyles, or at least certain weird occurrences where one player beats another but loses to someone else, are causing the rankings to become really weird. Players seem to get sandwiched between who they've won against and lost to. It looks really inaccurate.

Some players who have made it further into the GSL, or have qualified numerous times, and have been consistent, are being beaten out on this list by players who just have wins against them.
Treadmill
Joined July 2010
Canada, 2833 Posts
December 10 2010 05:54 GMT
#7
On December 10 2010 14:47 wherebugsgo wrote:
Why is MC ranked below Jinro?

Why is Genius so low?

Why is Idra below Ret? Why is Ret so high?

Why is Butterflyeffect in there twice?

It seems like the weaknesses of certain playstyles, or at least certain weird occurrences where one player beats another but loses to someone else, are causing the rankings to become really weird. Players seem to get sandwiched between who they've won against and lost to. It looks really inaccurate.

Some players who have made it further into the GSL, or have qualified numerous times, and have been consistent, are being beaten out on this list by players who just have wins against them.

I think the reason for this is that winning is what matters.
Mip
Joined June 2010
United States, 63 Posts
December 10 2010 06:06 GMT
#8
Winning is what matters at the end of the day. I didn't choose where the players went, but the model penalizes losing against low skill players, and rewards winning against high skilled players. I feel this is reasonable.

This isn't my model though, I don't want anyone to think that. This is the basic Bayesian Bradley-Terry model and it's used for thousands of pairwise comparison problems, so don't blame the method on me, it's just the commonly accepted Bayesian approach to pairwise comparison models. The data did all the determining of who goes where.
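For reference, the standard Bradley-Terry form can be written as follows. The notation is an assumption for illustration (the thread does not give any): \lambda_i is player i's log-skill. The second identity shows why an upset moves the estimates more than an expected result: the smaller the winner's predicted chance, the larger the push upward for the winner and downward for the loser.

```latex
P(i \text{ beats } j)
  = \frac{e^{\lambda_i}}{e^{\lambda_i} + e^{\lambda_j}}
  = \frac{1}{1 + e^{-(\lambda_i - \lambda_j)}},
\qquad
\frac{\partial}{\partial \lambda_i} \log P(i \text{ beats } j)
  = -\frac{\partial}{\partial \lambda_j} \log P(i \text{ beats } j)
  = 1 - P(i \text{ beats } j).
```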

Butterflyeffect being in there twice is an error in the data and thanks for pointing it out. I copy-pasted the brackets from Liquipedia, and the player names are not consistent across seasons and sometimes not even within seasons.
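A rough sketch of how duplicate entries like this could be caught before fitting, assuming the names are romanized ASCII IDs. The helper name and the `aliases` table are hypothetical; any alias list would have to be curated by hand, and this is not the cleaning that was actually applied to the spreadsheet.

```python
import re
import unicodedata

def canonical_name(raw, aliases=None):
    """Map a raw bracket entry to one canonical player key.

    aliases: optional dict of known alternate spellings (hypothetical;
    it would have to be maintained by hand).
    """
    # Fold Unicode variants and case, then keep only ASCII letters and digits.
    # Assumes romanized IDs: a name written in Hangul would collapse to "".
    key = unicodedata.normalize("NFKC", raw).casefold()
    key = re.sub(r"[^a-z0-9]", "", key)
    return (aliases or {}).get(key, key)

names = ["Butterflyeffect", "ButterflyEffect ", "butterfly_effect"]
print({canonical_name(n) for n in names})
```

Keying the win/loss records on the canonical name means spelling variants are merged into a single player.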
danson
Joined April 2010
United States, 689 Posts
Last Edited: 2010-12-10 06:11:10
December 10 2010 06:10 GMT
#9
yeahh not too sure about your algorithm...

idra made 2x 32s and 1x 16 and he only has a 24% chance of being in the top 32?

like the most basic of assumptions of that data would imply hes at LEAST the top 15-20, and seeing as how few people have actually qualified for all three gsls much less advanced in all 3 gsls hes probably much higher than that., i r confuse
danson
Joined April 2010
United States, 689 Posts
Last Edited: 2010-12-10 06:24:38
December 10 2010 06:13 GMT
#10
On December 10 2010 15:06 Mip wrote:
Winning is what matters at the end of the day. I didn't choose where the players went, but the model penalizes losing against low skill players, and rewards winning against high skilled players. I feel this is reasonable.

This isn't my model though, I don't want anyone to think that. This is the basic Bayesian Bradley-Terry model and it's used for thousands of pairwise comparison problems, so don't blame the method on me, it's just the commonly accepted Bayesian approach to pairwise comparison models. The data did all the determining of who goes where.

Butterflyeffect being in there twice is an error in the data and thanks for pointing it out. I copy-pasted the brackets from Liquipedia, and the player names are not consistent across seasons and sometimes not even within seasons.



I cant read
Mip
Joined June 2010
United States, 63 Posts
December 10 2010 06:16 GMT
#11
To answer your questions specifically, the spreadsheet for this post is based off of simulated plausible skill parameters for each player, and quantifies the percentage of times that each player is in the top 32 in their skill parameters.

Ret is high on this list because of the uncertainty associated with his skill level. Refer to my other post for the ranking where Ret is ranked lower.

Genius is low because his skill level has lower variance, but it is known to be lower than that of FruitDealer, Nestea, etc. His probability of being in the Top 32 is being dragged down by the uncertainty in the skill of others.

IdrA is also suffering from a smaller uncertainty. He's actually amazingly good, and plays super solid, which is reflected better in the ranking spreadsheet from my other post.

Don't misinterpret this spreadsheet, this one is NOT a ranking. It's just a measure of uncertainty about membership in the Top 32. It is related to their rank, but it is not exactly their rank.
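To make the distinction concrete, here is a minimal sketch of how a top-k membership probability can be estimated by simulation. It assumes each player's skill posterior is an independent normal with a given mean and standard deviation, which is a simplification (draws from a real fit would be correlated across players), and every name and number below is made up.

```python
import random

def top_k_probability(posterior, k, n_draws=20000, seed=0):
    """Estimate P(player is among the k highest skills) by simulation.

    posterior: dict name -> (mean, sd) of an assumed independent normal
    approximation to each player's skill posterior.
    """
    rng = random.Random(seed)
    hits = {name: 0 for name in posterior}
    for _ in range(n_draws):
        # One plausible set of skills for everyone.
        draw = {n: rng.gauss(mu, sd) for n, (mu, sd) in posterior.items()}
        for name in sorted(draw, key=draw.get, reverse=True)[:k]:
            hits[name] += 1
    return {name: count / n_draws for name, count in hits.items()}

# Made-up numbers: "steady" has the higher mean but a tight posterior;
# "wildcard" has a lower mean but few games, so a wide posterior.
posterior = {
    "top1": (2.0, 0.3), "top2": (1.8, 0.3),
    "steady": (1.0, 0.1), "wildcard": (0.8, 1.5),
}
probs = top_k_probability(posterior, k=2)
print(probs)
```

In this toy setup "wildcard" gets a much higher top-2 probability than "steady" despite the lower mean, which is the same effect that pushes a high-uncertainty player up this spreadsheet and a well-measured one down.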




wherebugsgo
Joined February 2010
Japan, 10647 Posts
December 10 2010 06:27 GMT
#12
I see, I understand now. It just seems weird that the likelihood of these players being in the top 32 doesn't actually mesh with what we know to be consistent play.
Mip
Joined June 2010
United States, 63 Posts
December 10 2010 06:28 GMT
#13
@danson I agree that the pre-lims should be included, and it does skew the results somewhat by not having them in there, esp. against people like IdrA and JookToJung who qualified all three seasons but didn't advance much.

If you know where to find that data, I would be immensely grateful, but I have not been able to track down anything from the pre-lims.

I totally understand any dissatisfaction with the current results. The model is great, it's backed by a lot of research, but it can't be better than the data that feeds into it. For now that is a problem, as you've pointed out; as time goes on, however, these problems will go away.

If you or anyone else is interested in helping me find data and/or format data, PM me and we can trade skills. Right now, I'd really like someone who can parse the TL database and extract that information.

As of now, my data consists only of player names, but if we could extract the TL database information, we could get information like, which matchups are imbalanced overall? Which maps favor which race matchups and by how much? Plus an overall increase in prediction accuracy.


TyPsi5
Joined May 2010
United States, 204 Posts
December 10 2010 06:30 GMT
#14
cool stuff -thanks for the effort
Plutonium
Joined November 2007
United States, 2217 Posts
Last Edited: 2010-12-10 06:32:43
December 10 2010 06:30 GMT
#15
There is absolutely not enough data to extract any sort of conclusions from so far in SC2.

The game is still evolving, maps and luck play a huge factor, and the sheer lack of volume of games precludes any sort of meaningful analysis.

Additionally, the idea that losing in the prelims and not registering at all are equivalent is absolutely not a fair assumption. It massively biases the results in the favor of players who make a big run once but fail to qualify the other times, like Jinro, whereas a player like IdrA who made the top 32 every single tournament is somehow not in the top 32.
rwright
Joined December 2010
1 Post
December 10 2010 06:33 GMT
#16
This should be interesting when there's more data.
Plutonium
Joined November 2007
United States, 2217 Posts
Last Edited: 2010-12-10 08:16:15
December 10 2010 06:36 GMT
#17
hmm
Mip
Joined June 2010
United States, 63 Posts
Last Edited: 2010-12-10 07:21:21
December 10 2010 07:00 GMT
#18
@Plutonium You're absolutely wrong about not being able to do any meaningful analysis. If you feel you can still make that statement after taking a Bayesian analysis class (you could not honestly do so), we can talk then, but you don't know what you're talking about.

I don't see how you can be peeved by a statistical analysis that is perfectly honest about its assumptions. If I were to just present the results and didn't acknowledge my assumptions, you could say that it was not sound. But I've been up front about my assumptions, and they do not stop me from obtaining meaningful results.

The sheer luck of the matter is captured beautifully in the model. If you look at my original post, the predictions for last night's and tonight's matches are both around 50/50. So yeah, the model works great at capturing our uncertainty.

Yeah, the edges are rough, but there is information to be learned by beginning an analysis already. My top 32 has 26 of the same players as the point system; if there were absolutely no analysis that could be done, I would not be able to pull that off.

Your "add data" approach doesn't make much sense, the probability of winning a match also requires knowing who the opponent is, how do you propose I decide the skill of players I have never measured for the matches that I am arbitrarily adding into the data? And why 6 games? Why not, 4, or 8, or 20? Should I add data until I like the results? How sound of statistical practice is that?

My assumptions are based off of necessity, I had to make the implicit assumption that failing to qualify and not registering were the same thing because there is no data that allows me to separate the two.

You really shouldn't complain, however: the point system does the exact same thing. You get 0 points for not entering, you get 0 points for failing to qualify, and no one QQs about that. The points are also rigged so that someone like Jinro will get more points for his one entry than IdrA, who plays solid and qualifies every single time. Jinro only tried once, so take SSKS: he has failed to qualify twice, but because he made it to Ro8 once, he's ranked higher than IdrA by 200ish points.

Realistically, you should think of the current GSL point system as an approximation to an actual ranking system that assesses wins and losses fairly. The Bradley-Terry model that I'm using is backed by hundreds of research papers showing its effectiveness in ranking competitions. As I get more data, I can relax most assumptions or they will simply wash out through repeated sampling. The biggest advantage I have with the B-T model is that at the end of the day, I can make predictions based on the current state of knowledge provided by the data, whereas with the point system, all you have is a ranking. And as the amount of data increases, the predictions will be based on even more knowledge.
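As a rough illustration of what a prediction from such a model looks like, the sketch below averages the Bradley-Terry win probability over uncertain skills. It assumes independent normal posteriors for the two players' log-skills; the function name and all numbers are made up, and it is not the code that produced the predictions mentioned above.

```python
import math
import random

def win_probability(mean_a, sd_a, mean_b, sd_b, n_draws=50000, seed=0):
    """Posterior-predictive P(A beats B in one game) under Bradley-Terry.

    Each player's log-skill posterior is assumed to be an independent
    normal; the means and sds passed in are illustrative inputs.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        # Draw one plausible skill gap, then apply the B-T win formula.
        diff = rng.gauss(mean_a, sd_a) - rng.gauss(mean_b, sd_b)
        total += 1.0 / (1.0 + math.exp(-diff))
    return total / n_draws

# Same one-point gap in mean skill, different amounts of uncertainty.
p_tight = win_probability(1.0, 0.1, 0.0, 0.1)
p_wide = win_probability(1.0, 1.5, 0.0, 1.5)
print(round(p_tight, 3), round(p_wide, 3))
```

With the same gap in mean skill, the wide-posterior case lands closer to 50/50 than the tight one, which is how limited data shows up as near coin-flip predictions.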
wherebugsgo
Joined February 2010
Japan, 10647 Posts
December 10 2010 07:08 GMT
#19
On December 10 2010 16:00 Mip wrote:


You really shouldn't complain, however: the point system does the exact same thing. You get 0 points for not entering, you get 0 points for failing to qualify, and no one QQs about that. The points are also rigged so that someone like Jinro will get more points for his one entry than IdrA, who plays solid and qualifies every single time. Jinro only tried once, so take SSKS: he has failed to qualify twice, but because he made it to Ro8 once, he's ranked higher than IdrA by 200ish points.


Just wanted to point out that Jinro did not only try once, he tried both times prior. He had some difficulties and that's why he didn't qualify until GSL3. Hayder, also, for example, tried out for GSL2, but didn't make it till GSL3.
skipgamer
Joined April 2010
Australia, 701 Posts
Last Edited: 2010-12-10 07:13:00
December 10 2010 07:09 GMT
#20
On December 10 2010 12:34 Mip wrote: This makes the algorithm non-ideal for prediction of the top skilled players. But it is ideal for assessing the uncertainty about the point system in actually getting the best players (at least for the top players).


I challenge this statement.

If an algorithm is not ideal for prediction of the top skilled players, how can it then be ideal for assessing the uncertainty about the point system; the point of which is determining the top skilled players? :s

I think the data's cool and all, and it would be an awesome way of comparing players if the GSL was a 64 player invitational tournament. But because of the unavailability of data beyond the RO64 it's pretty inaccurate.
Original banner artwork: Jim Warren
The contents of this webpage are copyright © 2025 TLnet. All Rights Reserved.