Starcraft Statistics in R: A New Ranking Algorithm

zoniusalexandr (United States, 39 posts; joined August 2010)
July 28 2011 04:32 GMT
#1
In this ongoing series, I am using my years of statistical experience to analyze the game of Starcraft 2. Previously, I've looked at visualizing the different scenes in pro competition (Part 1), evaluating ELO as a tool for measuring player performance (Part 2), and calculating to what extent skill can be considered transitive (Part 3). Today I'll be outlining a general framework for modeling Starcraft 2 player performance and explaining how I've used it to rank professional players.

Note: This post is rather long and heavy on the statistical theory. There's no shame in skipping down to the results at the bottom.

Ranking Starcraft Players: A Bayesian Approach

Fundamentally, every system for evaluating players has three main elements: a model of individual games, a set of information to be estimated, and a computational mechanism to tie it all together. The model of individual games defines which attributes are thought to matter in determining a winner. In most cases, the set of information corresponds to a series of ratings, one for each player, which are the "optimal" estimates of each player's true skill level. There are many possible estimation strategies, from simple formulas to complex multivariate regressions.

In ELO, individual games are modeled along the logistic curve, using simple formulas to update player ratings after each game is played. Additionally, the only information obtained from ELO is a single number for each player, which will accurately measure true skill in the long run, but perhaps not over finite samples. If this system produces consistent results, then we have no need to modify any of these elements. However, I believe that ELO does not work consistently when applied to Starcraft at the top levels (and if you've read part 2, then I hope you agree with me).
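For reference, here is a minimal sketch of the standard ELO formulas described above, using the conventional chess constants (scale 400, K = 32); TLPD's variant may use different constants:

# Expected score for player A under the logistic curve
elo_expected <- function(r_a, r_b) {
  1 / (1 + 10^((r_b - r_a) / 400))
}

# Simple post-game rating update: score_a is 1 for a win, 0 for a loss
elo_update <- function(r_a, r_b, score_a, k = 32) {
  r_a + k * (score_a - elo_expected(r_a, r_b))
}

elo_update(1600, 1500, 1)  # the higher-rated winner gains about 11.5 points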

With that in mind, I would like to propose a Bayesian framework which will allow us to improve on ELO with respect to all three elements. This approach, which can be applied to practically any model of individual games, is based around a Bayesian view of player performance using the Metropolis-Hastings algorithm as our estimation tool.

Some of you may be wondering what I mean when I use the term Bayesian to describe this framework. In statistics, there are two main paradigms when it comes to inference: frequentist inference and Bayesian inference. The frequentist method views the model that produces our data as having certain fixed parameters, and uses estimation as a means of finding the most likely values for those parameters. In contrast, the Bayesian view treats the parameters as variables and takes the data to be fixed, estimating the probability of each possible value of a parameter being the "true" value, based on the data.

The Bayesian paradigm is named after a theorem discovered by Thomas Bayes, which is called "Bayes' rule". Its form is quite simple:
P(A|B) = P(B|A) · P(A) / P(B)
The rule provides a way of turning one conditional probability into its opposite, using the ratio of the unconditional probabilities. We will use this rule to calculate the probability of a set of possible parameter values given the data, using the probability of the data given those values. The probability we are calculating, P(A|B), is known as the posterior probability, since it represents the probability of a certain set of values after we have taken the data into account. In contrast, the probability P(A) is known as the prior probability, since it represents the probability before we take the data into account. The quantity P(B|A) is referred to as the likelihood, since it reflects the chances of the data being produced by the set of values in question. Lastly, the term P(B) (sometimes called the evidence) is one we will essentially ignore in our analysis, for reasons explained later.
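To make these terms concrete, here is a toy R illustration (not the ratings model, just Bayes' rule on a grid): the posterior over a single player's win probability p after observing 7 wins in 10 games, under a Beta(2, 2) prior.

# Grid approximation of a posterior for a win probability p
p_grid     <- seq(0.01, 0.99, by = 0.01)
prior      <- dbeta(p_grid, 2, 2)                  # P(A): mild preference for p near 0.5
likelihood <- dbinom(7, size = 10, prob = p_grid)  # P(B|A): chance of 7 wins in 10 games
posterior  <- prior * likelihood
posterior  <- posterior / sum(posterior)           # normalizing plays the role of P(B)
p_grid[which.max(posterior)]                       # most probable value of p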

Now that we understand Bayes' rule, let's observe a few things. First, the end result we obtain is not a single value, nor a finite set of values. Instead, the Bayesian framework lets us look at the entire distribution of possible ratings. For each player we can estimate the most likely rating, the variance of possible ratings, and even compute arbitrary quantiles. Not only that, we can also examine the correlations across player ratings, since we are viewing the joint distribution, not just the marginals.

In addition, we have that prior term in the calculation. This allows our estimation to take into account not only the evidence but also our prior beliefs about what the results might look like. Or not, if we so choose. ELO offers no way to incorporate this. Perhaps we believe that skill ratings are approximately normally distributed; in that case, we could use a bell curve as our prior. These priors are inherently subjective, so it is often useful to examine the results under several different priors.

And finally, observe that the only restriction on the model of individual games is that it produce a likelihood function. The model could be of a logistic form, or a normal form, or something else entirely. It can include just ratings for the players, or take into account a wide variety of other effects, such as racial advantages, map effects and more. Admittedly, this is the area of the framework I have explored least. As you will see below, I keep the logistic curve from ELO in my analysis, but I am hoping to extend this in the near future. With my new computational mechanism I can now move beyond the model limitations of an ELO-like approach.

So now we have a framework in which we can use a wide variety of game models and obtain a large amount of information about our model parameters. The final component is a computational tool to bridge the other two elements. Even though we can calculate our likelihood and prior, we cannot calculate the posterior yet because of that pesky P(B) term. This term represents the unconditional probability of the actual data occurring, taking into account all possible values of the parameters. An astute observer might notice that when we compute the posterior, this P(B) value does not change as we vary the parameter values (the A term). This fact is what lets us use the Metropolis-Hastings algorithm.

The Metropolis-Hastings algorithm is a method for sampling from a distribution using only a factor proportional to its density. In our case, likelihood × prior is proportional to the posterior, because the P(B) value is constant. As a quick overview, we start with an arbitrary set of parameters (setting all player ratings to 0). We then propose random candidate sets of values and accept or reject them probabilistically: a candidate that increases the posterior probability is always accepted, while one that decreases it is accepted with probability equal to the ratio of the new probability to the old. Accepted candidates become the current values, and the process repeats.

As we repeat the process thousands of times, it stops mattering what the initial set of values was (this is called the steady state). Once the algorithm reaches this state, continued sampling is effectively sampling from the posterior, allowing us to estimate the distribution. Typically, we sample several thousand values initially and throw them out, to ensure we have reached the steady state before estimating the posterior. This is known as the "burn-in" process. In my analysis, I use a burn-in period of ~200,000 samples, followed by 10,000 samples for estimating the posterior. (This is probably more than is necessary, but better to err on the side of caution.)
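Here is a minimal, generic Metropolis-Hastings sketch on a toy one-dimensional target, just to show the propose/accept/burn-in loop in isolation (the actual ratings sampler appears in the script further down):

# Sample from a density known only up to a constant factor
log_target <- function(x) -abs(x)^3 / 3    # unnormalized log density

mh_sample <- function(n, burnin = 1000, step = 1) {
  x <- 0                        # arbitrary starting value
  out <- numeric(n)
  for (i in 1:(n + burnin)) {
    cand <- rnorm(1, x, step)   # symmetric random-walk proposal
    # Accept with probability min(1, target(cand)/target(x)):
    # a decrease in density can still be accepted sometimes
    if (log(runif(1)) < log_target(cand) - log_target(x)) x <- cand
    if (i > burnin) out[i - burnin] <- x   # record only post-burn-in draws
  }
  out
}

draws <- mh_sample(10000)
quantile(draws, c(0.05, 0.5, 0.95))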

-------------------------
Implementation
-------------------------

Now that we have covered the theory, let's turn to the actual ways I have implemented this framework. Today I will be using the logistic curve from ELO to model individual games, mostly because I spent all of my time working on other elements of the framework. I think this is still a reasonable model, although it no doubt can (and will) be improved upon in the coming weeks.

The other big choice I had to make was deciding which priors to use in the calculations. Today I will be using two different priors to answer two slightly different questions about Starcraft 2 players. The first is a perfectly naive, uniform prior, which presumes nothing about the players. This is essentially the same as not using a prior at all, so the results are more comparable to ELO. I like to think of this prior as asking "Who could be the best Starcraft 2 player?", since it gives every player the benefit of the doubt.

I also wanted to estimate a model that incorporated some skepticism about players who are newer to the scene or haven't really had their abilities tested at the top level of Starcraft gaming. For this model I did two things. First, I used a skew normal distribution whose probability drops off as ratings increase, which says that skepticism should grow with a player's rating. Second, I varied the thinness of the upper tail according to how many games each player had played in Premier tournaments. The end result is that someone who has played a huge number of games in top tournaments (e.g. MC, Sjow, July, and others) would face a prior that looks like this:
[Image: prior density for a player with many Premier tournament games]
Meanwhile, a player who has never played in a Premier tournament would face a prior that looks like this (note the difference in the axes):
[Image: prior density for a player with no Premier tournament games]
Note that I'm not directly assuming that players with fewer games in Premier tournaments are worse. Instead I'm decreasing the probability in the upper tail, which is equivalent to viewing a higher ranking by someone with few games in such tournaments with a greater amount of skepticism. Also note that this is just my prior, which can be overruled by the likelihood function. If a player has proved that he can beat the best players in the world, he will be ranked highly regardless of how many games he has played in Premier tournaments.
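This prior corresponds to the (commented-out) dsn() term in the script below. As a sketch, assuming the sn package's skew normal density and the same squashing of Premier game counts used in the script:

library(sn)

# Skepticism prior: location -3, right skew (alpha = 8); the scale, and
# hence the upper tail, widens as Premier experience grows
prior_density <- function(rating, premier_games) {
  pn <- 1 / (1 + exp(-(premier_games - 30) / 10))  # squash count into (0, 1)
  dsn(rating, xi = -3, omega = 1 + 3 * pn, alpha = 8)
}

r <- seq(-4, 6, by = 0.1)
plot(r, prior_density(r, 100), type = "l")  # Premier veteran: wide upper tail
lines(r, prior_density(r, 0), lty = 2)      # newcomer: thin upper tail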

The full script I created to run the algorithm and produce the results is below:

rm(list = ls())
options("stringsAsFactors" = FALSE)
dir <- "/home/ubuntu/"
setwd(dir)

# Load the TLPD game results
tlpd <- read.csv(file = "tlpd.csv")

library(igraph)

# Build an undirected game graph and keep the k-core of players who have
# faced many top opponents (this igraph version indexes vertices from 0,
# hence the -1)
tlpd$Date <- as.Date(tlpd$Date, format = "%m/%d/%Y")
tlpdcondensed <- cbind(tlpd$WinnerID, tlpd$LoserID)
tlpdgraph <- graph.data.frame(tlpdcondensed, directed = FALSE)
stlpdgraph <- simplify(tlpdgraph)
cores <- graph.coreness(stlpdgraph)
tlpdgraph2 <- subgraph(stlpdgraph, as.vector(which(cores > 10)) - 1)
tlpd <- tlpd[tlpd$Date > "2011-03-22",]  # patch 1.3 era only

# Attach player metadata to each game
tlpdplayer <- read.csv("tlpdplayers.csv")
tlpdplayer <- tlpdplayer[!duplicated(tlpdplayer$PlayerID),]

tlpdm <- merge(tlpd, tlpdplayer, by.x = "WinnerID", by.y = "PlayerID", suffixes = c("", ".w"), sort = FALSE)
tlpdm <- merge(tlpdm, tlpdplayer, by.x = "LoserID", by.y = "PlayerID", suffixes = c("", ".l"), sort = FALSE)

tlpdkcore <- tlpdm[which(tlpdm$WinnerID %in% V(tlpdgraph2)$name),]
tlpdkcore <- tlpdkcore[which(tlpdkcore$LoserID %in% V(tlpdgraph2)$name),]

# Take a second, stricter k-core and plot the resulting game network
tlpdcondensed2 <- cbind(tlpdkcore$WinnerID, tlpdkcore$LoserID)
tlpdgraph3 <- graph.data.frame(tlpdcondensed2, directed = FALSE)
V(tlpdgraph3)$label <- unique(c(tlpdkcore$Winner, tlpdkcore$Loser))
V(tlpdgraph3)$size <- 0
V(tlpdgraph3)$label.cex <- 0.75
stlpdgraph2 <- simplify(tlpdgraph3)
cores2 <- graph.coreness(stlpdgraph2)
tlpdgraph4 <- subgraph(stlpdgraph2, as.vector(which(cores2 > 5)) - 1)
plot(tlpdgraph4, layout = layout.fruchterman.reingold(tlpdgraph4), main = "TLPD Game Network",
     sub = "Only players with games against 12+ top opponents are shown (12-core)", margin = c(-1, -1, -1, -1))

tlpdkcore <- tlpdkcore[which(tlpdkcore$WinnerID %in% V(tlpdgraph4)$name),]
tlpdkcore <- tlpdkcore[which(tlpdkcore$LoserID %in% V(tlpdgraph4)$name),]

# Log-likelihood of the observed results under the logistic model;
# vec is (loser rating - winner rating) for each game
likeli <- function(vec){
  return(sum(plogis(0, vec, 2, log.p = TRUE)))
}

# Number of matches a player has against Koreans
getKoreaNumber <- function(player, data){
  wins <- data[data$Winner == player,]
  losses <- data[data$Loser == player,]
  return(length(which(wins$Country.l == "ks")) + length(which(losses$Country == "ks")))
}

# Number of matches against players in the top 50 by likelihood
getGoodPlayerNumber <- function(player, data, players){
  wins <- data[data$Winner == player,]
  losses <- data[data$Loser == player,]
  return(length(which(wins$Loser %in% players)) + length(which(losses$Winner %in% players)))
}

# Number of matches played in premier tournaments
getPremierNumber <- function(player, data, tournaments){
  wins <- data[data$Winner == player,]
  losses <- data[data$Loser == player,]
  return(length(which(wins$Tournament %in% tournaments)) + length(which(losses$Tournament %in% tournaments)))
}

premierTournaments <- c("2011 NASL Season 1 Finals","2011 Sapphire AMD Dreamhack Summer","2011 Pepsi Global StarCraft 2 League Season 4: Code S",
"2011 Pepsi Global StarCraft 2 League Season 4: Code A","2011 LG Cinema 3D StarCraft 2 3D Special League",
"2011 Major League Gaming Pro Circuit: Columbus","2011 Major League Gaming Pro Circuit: Dallas",
"2010 Major League Gaming Pro Circuit: Dallas","2010 Major League Gaming Pro Circuit: DC",
"2010 Major League Gaming Pro Circuit: Raleigh","2011 LG Cinema 3D GSL Super Tournament","2011 Gigabyte StarsWar Killer 6",
"2011 Copenhagen Games Razer Domination","2011 North American Star League Season 1",
"2011 LG Cinema 3D GSL Sponsorship League Season 3: Code S","2011 LG Cinema 3D GSL Sponsorship League Season 3: Code A",
"2011 Dreamhack Stockholm Invitational","2011 LG Cinema 3D GSL World Championship Seoul",
"2011 PokerStrategy.com TeamLiquid StarLeague","2011 Intel Extreme Masters World Championship",
"2011 Intel Extreme Masters European Championship Finals Seas","2011 Intel GSL Sponsorship League Season 2: Code S",
"2011 Intel GSL Sponsorship League Season 2: Code A","2011 ASSEMBLY Winter: SteelSeries Challenge",
"2011 Gainward PlayXP.com StarCraft II Tournament","2011 Sony Ericsson GSL Sponsorship League Season 1: Code S",
"2011 Sony Ericsson GSL Sponsorship League Season 1: Code A","2010 Winter DreamHack SteelSeries LAN Tournament",
"2010 Sony Ericsson Global StarCraft 2 League Open Season 3","2010 Sony Ericsson Global StarCraft 2 League Open Season 2",
"2010 TG Sambo-Intel Global StarCraft 2 League Open Season 1","2010 GSTAR StarCraft 2 All-Star Tournament",
"2010 Blizzcon Invitational","2010 Intel Extreme Masters American Championship Season 5","2010 IEM Global Challenge Gamescom")

premierPlayers <- c("Hack","DongRaeGu","PuMa","Bomber","Polt","sC","MC","MMA","NesTea","MVP","MarineKing","SuperNoVa","GuMiho",
"TOP","Byun","Alicia","Line","Ryung","Taeja","Sen","Min","Leenock","LosirA","aLive","Younghwa","oGsJ","Curious",
"Zenio","July","GanZi","Keen","TheStC","HuK","NaNiwa","Nerchio","ThorZaIN","Moon","Yoda","NaDa","Ace",
"GuineaPig","Rain","Maka","Strelok","Squirtle","Stephano","MorroW","IdrA","KiWiKaKi","Puzzle")

# Compute the three experience counts for every player
kn <- c()
pn <- c()
pln <- c()
for (i in tlpdplayer$Player){
  kn <- c(kn, getKoreaNumber(i, tlpdkcore))
  pn <- c(pn, getPremierNumber(i, tlpdkcore, premierTournaments))
  pln <- c(pln, getGoodPlayerNumber(i, tlpdkcore, premierPlayers))
}
tlpdplayer$KoreaNumber <- kn
tlpdplayer$PremierNumber <- pn
tlpdplayer$PremierPlayers <- pln
library(sn)

mh <- function(data, n = 1000, v = 0.1, tlpdplayer = tlpdplayer){
  numplayers <- length(unique(c(data$WinnerID, data$LoserID)))
  tranvec <- vector()
  pids <- unique(c(data$WinnerID, data$LoserID))
  tranvec[pids] <- 1:numplayers  # map player IDs to rating indices
  kn <- tlpdplayer$KoreaNumber[match(pids, tlpdplayer$PlayerID)]
  pn <- tlpdplayer$PremierNumber[match(pids, tlpdplayer$PlayerID)]
  pln <- tlpdplayer$PremierPlayers[match(pids, tlpdplayer$PlayerID)]
  # Squash experience counts into (0, 1); pn feeds the skew normal
  # prior's scale (kn and pln are available for alternative priors)
  kn <- 1/(1 + exp(-(kn - 50)/15))
  pn <- 1/(1 + exp(-(pn - 30)/10))
  pln <- 1/(1 + exp(-(pln - 15)/5))
  ratings <- numeric(numplayers)  # start all ratings at 0
  mhdist <- matrix(nrow = 0, ncol = numplayers)
  games <- cbind(tranvec[data$WinnerID], tranvec[data$LoserID])
  # Uncomment the dsn() terms to add the skew normal skepticism prior
  oldll <- likeli(ratings[games[,2]] - ratings[games[,1]]) #+ sum(dsn(ratings, -3, 1+3*pn, 8, log = TRUE))
  burnin <- 1000*numplayers
  n <- n + burnin

  for (i in 1:n){
    canFound <- FALSE
    # Propose until a candidate is accepted; only accepted states are recorded
    while (canFound == FALSE){
      can <- rnorm(numplayers, ratings, v)  # random-walk proposal
      canll <- likeli(can[games[,2]] - can[games[,1]]) #+ sum(dsn(can, -3, 1+3*pn, 8, log = TRUE))
      # Accept if log(U) is below the log acceptance ratio
      if (log(runif(1)) < (canll - oldll)){
        ratings <- can
        oldll <- canll
        canFound <- TRUE
        if (i > burnin){
          mhdist <- rbind(mhdist, ratings)  # record post-burn-in samples
        }
      }
    }
    if (i %% 1000 == 0){
      print(i)  # progress indicator
    }
  }
  return(mhdist)
}

fivepercentci <- function(vec){
  return(quantile(vec, 0.05))
}

ninetyfivepercentci <- function(vec){
  return(quantile(vec, 0.95))
}

# Run the sampler (burn-in of 1000*numplayers iterations, then 10,000
# recorded draws) and summarize the posterior per player
system.time(mhraw <- mh(tlpdkcore, 10000, 0.04, tlpdplayer))
mr <- apply(mhraw, 2, mean)  # posterior mean rating
vr <- apply(mhraw, 2, var)   # posterior variance
fpci <- apply(mhraw, 2, fivepercentci)
nfpci <- apply(mhraw, 2, ninetyfivepercentci)
quantr <- apply(mhraw, 2, quantile)
pids <- unique(c(tlpdkcore$WinnerID, tlpdkcore$LoserID))
mhsummary <- cbind(pids, mr, vr, fpci, nfpci, t(quantr))
mhtlpd <- merge(mhsummary, tlpdplayer, by.x = "pids", by.y = "PlayerID")

write.csv(mhtlpd, "methast_tlpd_logis_patch13_july.csv")
write.csv(mhraw, "methast_raw_logis_patch13_july.csv")


Just a quick aside before I get to the results: I'd like to mention that I used RStudio Server running on Amazon EC2 to run the program, and it worked excellently. The ability to run computationally intensive programs on a powerful server for dirt cheap continues to impress me. If any of you have questions about setting this up, let me know. I used one of Amazon's ordinary-sized machines and it produced >225,000 samples in about 25 minutes. While it's not as fast to compute as ELO, my hope is that the added predictive ability will more than compensate for the extra time.

-------------------------
Results
-------------------------

I'll be showing you the rankings for the top 50 players out of the sample of 230 under each prior (that's all I could fit on-screen in Excel; if anyone knows how to print a spreadsheet to one long jpeg, let me know). For each player I have computed the mean rating, the variance in ratings, and their highest possible rating. This last value is obtained by calculating the upper 95% confidence bound on the rating and seeing where the player would rank if this were their mean rating. The variance has been color-coded to help you notice players whose ratings carry more uncertainty. Also, don't pay too much attention to the magnitudes of the ratings; they are on a standard logistic curve rather than the modified ELO curve.
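Concretely, the "highest possible rating" re-ranking can be computed from the sampler output like this (a sketch using mhraw and the posterior means mr from the script above):

# Upper 95% quantile of each player's posterior rating draws
upper95 <- apply(mhraw, 2, quantile, probs = 0.95)
# Rank each player's upper bound against everyone's mean rating
highest_possible_rank <- sapply(upper95, function(u) sum(mr > u) + 1)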

First, let's examine the rankings based on the completely naive prior:
[Image: top 50 rankings under the naive uniform prior]
As I said before, these ratings incorporate no skepticism, so a player who had played only 25 games in the sample but won all 25 would be rated infinitely high. No player falls into that category, but several players have won most of their small history of games. My sense is that the lack of skepticism in this prior causes the system to overrate many of the high variance players. These players have beaten some of the consensus top players, but with so few games played it's unclear whether that reflects luck or skill.

The somewhat unrealistic results under the naive prior prompted me to introduce more skepticism into the computation. I did that by means of a skew normal prior that varies in the upper tail according to how many Premier tournament games each player has played. See above for a detailed explanation, and below for the results:
[Image: top 50 rankings under the skeptical skew normal prior]
These results strike me as substantially more reasonable: the system isn't overreacting as much to the high variance players, but it also isn't dramatically punishing players who haven't played much (as ELO does).

I'd be curious to see what you think about how these two priors compare. Do you agree that the second rankings are "better"? If you have ideas for other priors that could be tested, let me know.

------------------------
Going Forward
------------------------

From now on I will be sticking with this system for measuring skill, but expanding the model of individual games to account for the effects of race and map. I rushed to get these results done before MLG Anaheim, so that this weekend can be a test of the predictive abilities of the ranking method. Let me know your thoughts on this system and its results below.
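As a sketch of what that extension might look like (the advantage terms here are hypothetical, not the final model), the likelihood from the script would simply gain shift terms:

# Extended logistic likelihood: a positive advantage for the winner
# (race matchup + map, estimated as extra parameters) makes the
# observed result more likely
likeli_extended <- function(r_winner, r_loser, adv_winner) {
  sum(plogis(0, r_loser - r_winner - adv_winner, 2, log.p = TRUE))
}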

*****
TheAmazombie (United States, 3714 posts; joined September 2010)
July 28 2011 04:48 GMT
#2
First of all, I love the effort you put into all of this.

I do think the second list looks more reasonable. I do find a few weird things in there, like HuK being so low even after all of his recent success. Also, is that IM.Happy or the European Happy? I would be surprised to see the European Happy that high, even though I love him as a player. And I love Kas, but it seems strange to see him over players like SaSe, White-Ra, MaNa, Socke, and others.
zoniusalexandr (United States, 39 posts; joined August 2010)
July 28 2011 05:21 GMT
#3
Thanks!

First, that's Empire.Happy. Second, I cannot emphasize enough that this is still a work in progress. I'll be tinkering with the prior based on what happens in Anaheim. One other thing I've thought about doing is maybe weighting the games differently depending on how premier the tournament is. Right now a win in the round of 1024 in a random TL Open counts the same towards skill as Game 7 of the NASL finals (as long as it's the same players playing against each other).

For the record, Happy has won 6 of 8 against Socke, but has lost 6 of 9 against MaNa and 4 of 4 against SaSe. He hasn't played against White-Ra in the patch 1.3 era.
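(A sketch of the weighting idea mentioned above: scale each game's log-likelihood contribution by a tournament weight. The weights here are purely hypothetical.)

# Weighted version of the likelihood from the main script
likeli_weighted <- function(vec, w) {
  sum(w * plogis(0, vec, 2, log.p = TRUE))
}

# e.g. w <- ifelse(data$Tournament %in% premierTournaments, 1.0, 0.5)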
RoninShogun (United States, 315 posts; joined November 2010)
Last Edited: 2011-07-28 05:25:14
July 28 2011 05:24 GMT
#4
I understand what you're doing and all, but it's just weird to look at the second list and see NesTea at 17th, though that is a major improvement over him not appearing on the first one at all.

edit: oops, he's actually 11th on the first one. hmmmmmmmmmmmmm...
TheAmazombie (United States, 3714 posts; joined September 2010)
July 28 2011 06:05 GMT
#5
On July 28 2011 14:21 zoniusalexandr wrote:
[quoting post #3 above in full]


I see. That seems to be where the most work is needed, then: the weighting of tourney results and placements, because there has to be a difference in skill between the NASL finals and the round of 1024. I dig it though.

I said once on one of your other blogs that I would like to see a system that could take situations into account, like the final game of a 3-3 tie in a Bo7 championship. Not to mention even more in-depth intangibles like clutch performance and map performance, let alone in-game situations like coming back to win from huge deficits. I know that goes way deeper than what you are working on, but someday I would like to see it; as a huge baseball fan, I love player and team statistics. That brings me to another point: what about team situations? You could weight a player with their team, so that team ratings, or every person on a team, could get some kind of ranking based on their teammates' rankings. I know it is not a "team" sport the way soccer or football are, but in time that could become a factor.
rabidch (United States, 20289 posts; joined January 2010)
July 28 2011 06:21 GMT
#6
Interesting; some players jumped up insanely in ranking, like NaNiwa, Moon, and Sen. I kind of want to test it on BW, since it's a very simple modification to the script and BW has very long histories, though you can't go from BW -> SC2. Don't have the time now though.
Primadog (United States, 4411 posts; joined April 2010)
July 28 2011 20:12 GMT
#7
holy shit, time to brush off my stats book. This will take some time.


Again, nice work. I am your biggest fan.
Erik.TheRed (United States, 1655 posts; joined May 2010)
July 28 2011 20:20 GMT
#8
Hooray for graphs and spreadsheets!!

No, seriously, this looks really interesting.

I'm going to try my best to get through all of this.
"See you space cowboy"
Primadog (United States, 4411 posts; joined April 2010)
July 28 2011 20:22 GMT
#9
Re: spreadsheet

I personally found that [code] [/code] tags are the best way of displaying tables on TL. Copy-pasting from the spreadsheet into [code] retains the format reasonably well with minimal editing needed. See here for a sample.

Any chance you can put this dataset on google docs/spreadsheet? I can probably make use of your research in the future.
Primadog (United States, 4411 posts; joined April 2010)
July 28 2011 20:30 GMT
#10
TrueSkill uses a conservative skill estimate, μ - k·σ, for its displayed skill level; can you output a similar set for your rankings (as opposed to the 95% ranking)?

Breezing through the data, what strikes me right away is that your 95% ranking is a good indicator of the "skill tiers" that have long been asserted subjectively on the forums but never really researched.
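(Such a conservative estimate is easy to produce from the sampler output; a sketch using mr and vr from the script above, with TrueSkill's conventional k = 3:)

# TrueSkill-style conservative display rating: mean minus 3 standard deviations
conservative <- mr - 3 * sqrt(vr)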

alexhard (Sweden, 317 posts; joined May 2010)
Last Edited: 2011-07-29 00:37:14
July 29 2011 00:35 GMT
#11
Well, depending on the actual forecast results, this writeup may have been in vain... looking forward to seeing how your model performs.

If you're interested in individual game prediction, I've had good results with a neural network ensemble using ELOs, matchup winrates, and winrate over the last 20 games as inputs.
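(A sketch of that ensemble idea using R's nnet package; the games data frame and its feature columns are hypothetical:)

library(nnet)

# Fit several small networks on the suggested features; the ensemble
# prediction is the average of the individual networks' outputs
fit_ensemble <- function(games, n_nets = 10) {
  lapply(1:n_nets, function(i) {
    nnet(win ~ elo_diff + matchup_wr + recent_wr, data = games,
         size = 5, decay = 0.1, maxit = 500, trace = FALSE)
  })
}

predict_ensemble <- function(nets, newdata) {
  rowMeans(sapply(nets, predict, newdata = newdata))
}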
Heyoka (Katowice, 25012 posts; joined March 2008)
July 29 2011 21:50 GMT
#12
This is really awesome. I was just thinking about your transitivity problem earlier and had some thoughts, but they should wait until I can digest this. Unfortunately, I've never really taken the time to properly grasp Bayesian models, but I see them referenced all the time in more advanced prediction simulations.
alexhard (Sweden, 317 posts; joined May 2010)
August 01 2011 14:19 GMT
#13
So how did it go? :D
Cassel_Castle (United States, 820 posts; joined July 2011)
August 02 2011 18:18 GMT
#14
I'm taking a 6-week mathematical statistics class starting today. In 6 weeks or less I will comment on this post and tell you what I think of your ranking system.