|
The latest TLPD version now includes ELO ratings for players. You can check them out here.
The resulting ranking is more or less "beta" right now, and the different variables used in the algorithm might change in the future (see below for more details).
Something everyone should know about ELO: it’s a 100% skill-based system. Success is not taken into account. Winning a game in the OSL finals or in the MBC prelims gives the exact same number of points (provided the opponent is the same). We will probably get another rating system later, which will take success into account as well, but ELO will most likely stay anyway.
So in the player index, there are two new distinct columns:
- the "ELO" column gives the current ELO rating of the corresponding player: it can basically be seen as his "current strength" (again, pure skill-wise) and should evolve accordingly;
- the "ELO Peak" column sorts players by the highest ELO rating they ever reached: this basically provides an easy way to spot which players were the most dominant at their time, but also when (clicking on their rating brings you to their "peak game").
For those knowledgeable about ELO, we currently use K = 40 for players who have played fewer than 20 games, and K = 20 for the rest. However, nothing is set in stone in this regard; it's kind of experimental and is likely to evolve.
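For anyone who wants to see how the K values plug in, here is a rough sketch of a single-game update. It assumes the textbook Elo expected-score formula with a 400-point scale; the function names are made up for illustration and this is not the actual TLPD code.

```python
def expected_score(rating_a, rating_b):
    """Textbook Elo expected score for player A against player B (400-point scale assumed)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def k_factor(games_played):
    """K = 40 for players with fewer than 20 recorded games, K = 20 otherwise (as stated in the post)."""
    return 40 if games_played < 20 else 20


def update(rating_a, rating_b, score_a, games_a, games_b):
    """Return both new ratings after one game; score_a is 1 for an A win, 0 for an A loss."""
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k_factor(games_a) * (score_a - exp_a)
    new_b = rating_b + k_factor(games_b) * ((1 - score_a) - (1 - exp_a))
    return new_a, new_b
```

Note that nothing in the update depends on which league or round the game was played in, which is the "success is not taken into account" point from above.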
I think that's it. A big thanks to gravity for all the help and the ELO lesson.
|
Awesome
|
Very nice
|
I don't think ELO would work out so well for OSL or MSL. In chess, they add rating points only for swiss or round robin tournaments. They never add or subtract points for knockout tournaments. But it will still be interesting to see.
|
Calgary25938 Posts
Nice.
It's interesting to try to draw conclusions from these ratings and see if history backs them up.
|
interesting. iloveoov was the most skilled BW player EVA.............
|
SweeT!
What does ELO stand for?
|
Good stuff! This matches quite well with intuition on who the best current players are.
On the first page (top 40) there are 16 terrans, 15 zerg, and 9 protoss. Surprisingly, almost as many zergs as terrans.
|
8716 Posts
On September 11 2007 04:55 polarwolf wrote: interesting. iloveoov was the most skilled BW player EVA.............
..? PoP already explained it best in the opening post: the highest peak shows the most dominant player. Skill isn't something that can be judged through something like an ELO ranking.
|
On September 11 2007 04:33 PoP wrote:
Something everyone should know about ELO: it’s a 100% skill-based system. Success is not taken into account.
|
Actually it only means that Oov's peak was the most ridiculous, which is a well known fact (although Savior came close to matching it).
You can say that Oov was the most skilled by far at that point, and that relative to his opponents of the time he was the most dominant of them all, but comparing skills between two different "eras" is wrong imho, ELO or not.
|
In other words, the ELO value can be seen as a skill measure, while the peak one can be seen as a dominance measure. Sounds pretty accurate to me this way.
|
United States5262 Posts
ELO is the name of the guy who invented the system. Some Hungarian dude.
|
51136 Posts
I find it funny that Boxer isn't even on the first page while someone like Darkelf is there.
|
GrandInquisitor
New York City13113 Posts
If I had known you guys were doing this I would have suggested Glicko-2, a modification of ELO ( http://en.wikipedia.org/wiki/Glicko_rating_system ) instead; it's what's used on XBox Live and allows for Ratings Uncertainty and volatility and measures like that, making it in general far more accurate than plain vanilla ELO
|
On September 11 2007 06:15 GTR-2-Go wrote: I find it funny that Boxer isn't even on the first page while someone like Darkelf is there.
ditto Boxer isn't even on first page? we can't deny how skilled he is really darkelf... geez
|
Don't bash Elfie, he rocks.
|
On September 11 2007 06:19 GrandInquisitor wrote:If I had known you guys were doing this I would have suggested Glicko-2, a modification of ELO ( http://en.wikipedia.org/wiki/Glicko_rating_system ) instead; it's what's used on XBox Live and allows for Ratings Uncertainty and volatility and measures like that, making it in general far more accurate than plain vanilla ELO
We can totally change it if needed. Again, this is beta/experimental. I've heard about Glicko, but then forgot about it. Any other opinion? If it's officially better then we'll definitely change.
|
On September 11 2007 06:21 Pressure wrote:
On September 11 2007 06:15 GTR-2-Go wrote: I find it funny that Boxer isn't even on the first page while someone like Darkelf is there.
ditto Boxer isn't even on first page? we can't deny how skilled he is really darkelf... geez
The ELO column represents the current strength. Unfortunately, Boxer has lost most of his recent matches, explaining his rather low rating.
|
This is a great feature and I hope that eventually we will have modified K value for winning more prestigious matches like OSL or MSL finals over simple qualification games.
|
On September 11 2007 06:19 GrandInquisitor wrote: If I had known you guys were doing this I would have suggested Glicko-2, a modification of ELO ( http://en.wikipedia.org/wiki/Glicko_rating_system ) instead; it's what's used on XBox Live and allows for Ratings Uncertainty and volatility and measures like that, making it in general far more accurate than plain vanilla ELO
Yeah, a lot of other rated games use Glicko instead of ELO nowadays.
|
|
Very very awesome work. I'm impressed.
~LoA
|
Very nice addition. I'm liking these new features :D
|
For current rating that wouldn't be surprising at all
|
Glicko is not objectively better than ELO, they both have their up- and downsides.
Another great addition PoP, <3!
|
Heh, the top ~20 or so players are all also on the Kespa top 30.
Very nice feature for TLPD. ^.^
|
Yellow's peak was higher than Boxer's, that's quite surprising.
|
Isn't there a problem with inflation when comparing the peak ELO of players from different eras?
edit: I've heard that the Chessmetric system is supposed to be the best at calculating the relative historical rating of players.
Also, there's a formula for factoring in the overall strength of the tournament into the ratings.
|
GrandInquisitor
New York City13113 Posts
Glicko is pretty much universally considered better than ELO; it's basically admitting that ratings are wildly variable in certain cases.
It's a pain to describe, though, and it's kind of more complicated. Check http://math.bu.edu/people/mg/glicko/glicko.doc/glicko.html out (the original glicko webpage), and just think whether or not it's worth it. I think it is; others might not.
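To give a feel for what "more complicated" means, here is a minimal sketch of the original Glicko (Glicko-1) update for one rating period, following the formulas on the page linked above. This is not Glicko-2 (which adds a volatility term), it leaves out the step that widens RD between rating periods, and the function names are just for illustration.

```python
import math

Q = math.log(10) / 400.0  # scaling constant from the Glicko description


def g(rd):
    """Discount factor: the more uncertain the opponent's rating, the less the game counts."""
    return 1.0 / math.sqrt(1.0 + 3.0 * Q ** 2 * rd ** 2 / math.pi ** 2)


def expected(r, r_j, rd_j):
    """Expected score against an opponent with rating r_j and rating deviation rd_j."""
    return 1.0 / (1.0 + 10 ** (-g(rd_j) * (r - r_j) / 400.0))


def glicko_update(r, rd, results):
    """Return (new rating, new RD) after a period.

    results is a list of (opponent_rating, opponent_rd, score) with score 1, 0.5 or 0.
    """
    if not results:
        return r, rd
    inv_d2 = 0.0  # accumulates 1 / d^2
    delta = 0.0   # accumulates g_j * (score - expected)
    for r_j, rd_j, score in results:
        e = expected(r, r_j, rd_j)
        g_j = g(rd_j)
        inv_d2 += Q ** 2 * g_j ** 2 * e * (1.0 - e)
        delta += g_j * (score - e)
    denom = 1.0 / rd ** 2 + inv_d2
    return r + (Q / denom) * delta, math.sqrt(1.0 / denom)
```

The practical difference from plain ELO is that the size of each adjustment comes from RD rather than a fixed K, so a player with few games moves a lot and an established player moves little, without needing a hard "fewer than 20 games" cutoff.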
|
Aaah thats a great thing to add. You guys rule. I really like ELO and stats.
|
Another great feature. I'm really impressed with all the work on this site.
I'll add my voice to those calling for Glicko2, if it's not too much trouble.
|
You guys make progaming so much.
|
|
What does ELO stand for? Interesting though
|
Electronic Light Orchestra
|
On September 11 2007 13:27 A3iL3r0n wrote: Electronic Light Orchestra
I thought it was Extreme Logistic Order
|
On September 11 2007 13:26 il0seonpurpose wrote: What does ELO stand for? Interesting though
It doesn't actually stand for anything. As mentioned earlier in this thread, it's just the last name of the guy who invented this rating system.
|
Stork is so high.. Switch him with July and take silent_control off the front page then I will be able to rest.
Really nice though, good work.
|
the ranking is automated, there is no room for switching anything except the formula calculating it.
|
Wow, Kosiro is on the front page. Not only does he have only 8 recorded matches, he's the same guy who messed up the Sandlot when he got Firefist to play for his own matches against Draco because he was drunk.
|
Osaka26958 Posts
kosiro can go to hell -_-
|
Cayman Islands24199 Posts
i dunno, elo works better for certain situations than others. you should try to see the predictive power of such a system as it is configured now. look at winning % in relation to elo difference between 2 players. might be a big project though
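A check like that could be sketched roughly as follows: bucket past games by rating difference and compare the observed win rate in each bucket with what the Elo formula predicts. The input format (a list of rating-difference / result pairs), the 100-point bucket width and the function names are all assumptions for illustration, not anything TLPD exposes.

```python
from collections import defaultdict


def expected_score(diff):
    """Textbook Elo expected score for a player rated `diff` points above the opponent."""
    return 1.0 / (1.0 + 10 ** (-diff / 400.0))


def calibration_table(games, bucket_width=100):
    """Compare observed and predicted win rates per rating-difference bucket.

    games: iterable of (rating_diff, won), where rating_diff is the player's rating
    minus the opponent's before the game and won is 1 or 0.
    Returns {bucket_start: (games_in_bucket, observed_win_rate, mean_expected_score)}.
    """
    buckets = defaultdict(lambda: [0, 0, 0.0])  # count, wins, summed expectation
    for diff, won in games:
        key = int(diff // bucket_width) * bucket_width
        entry = buckets[key]
        entry[0] += 1
        entry[1] += won
        entry[2] += expected_score(diff)
    return {key: (n, wins / n, exp_sum / n)
            for key, (n, wins, exp_sum) in sorted(buckets.items())}
```

If the observed and expected columns track each other across buckets, the current settings predict results reasonably well; large gaps would suggest the K values or the scale need tuning.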
|
Rankings usually have numbers beside them indicating rank.
|
On September 11 2007 17:18 MYM.Testie wrote: Rankings usually have numbers beside them indicating rank.
Will be available shortly.
|
Nice to see this implemented. Like others have said, there's still room for tweaks/updates to the formulas, but I think the results are pretty reasonable as-is, and it's nice to finally have a less-subjective way of comparing players' strengths.
|
The top ranks moved around a bit! Hwasin first, Savior second, Stork dropping to third. It'll be interesting to see how these ranks end up after the current season of leagues.
|
I was thinking about this. Starcraft is a lot different than other things because there are 3 races. Some players are very good at a certain MU and not as good in others. I think it would be more accurate to add in that factor. It would probably be better to have 3 different elo ratings per player (well, technically 9 in case someone off-races), and when they play another player, only that specific elo rating would be affected. The overall elo rating for the player would be the length of the vector (square root of the sum of the squares).
I'm not sure if you are still working on this, or if there are other rating systems that are specifically meant to deal with what I'm talking about.
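Here is a rough sketch of the idea, assuming three ratings per player keyed by the opponent's race, a fixed K of 20, and the textbook Elo expected-score formula. The names and the dictionary layout are made up for illustration.

```python
import math


def expected_score(rating_a, rating_b):
    """Textbook Elo expected score for A against B (400-point scale assumed)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_matchup(ratings, matchup, opponent_rating, score, k=20):
    """Return a copy of `ratings` with only the given matchup's rating adjusted.

    ratings: dict such as {"vT": 2000, "vZ": 2000, "vP": 2000}
    opponent_rating: the opponent's rating in the mirrored matchup
    score: 1 for a win, 0 for a loss
    """
    updated = dict(ratings)
    exp = expected_score(ratings[matchup], opponent_rating)
    updated[matchup] = ratings[matchup] + k * (score - exp)
    return updated


def overall(ratings):
    """Overall rating as the vector length: square root of the sum of squares."""
    return math.sqrt(sum(r * r for r in ratings.values()))
```

One thing to watch: the raw vector length of three ratings around 2000 comes out near 3464, so it isn't on the same scale as a single ELO rating. Dividing by the square root of 3 (a root mean square) would bring it back to the usual range.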
|
Yes, I agree that having different ratings for each matchup (although the overall rating could stay as it is) would be very interesting. It would lead to higher peaks (and lower lows), since a player is rarely as good in all matchups as they are in their best (or as bad as they are in their worst), plus you could use it for things like seeing who's truly a one-matchup wonder, what a player's best MU actually is, making more accurate predictions of matches, etc.
I think they might have even implemented this in the background or something because I remember there was a post where a TL admin provided a graph of a player's rating over time in one particular matchup, or something like that.
|
Reason for bumping old thread : new idea
It strikes me there is an interest in the community for a measure of pro-players on the rise, and the power rank does not fully address this need. For those who find the power rank to be too dependent on speculation and opinion, it would be nice if the auto-analysis of the TL elo system were to be extended. I'm talking about measuring derivatives and ranges of elo variation over time.
Can the relevant TL techies hear me on this frequency?
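Something like this is what I have in mind, as a very rough sketch. It assumes each player's rating history is available as a chronological list of ratings (one entry per game); the 10-game window and the function names are arbitrary choices for illustration.

```python
def trend(history, window=10):
    """Return (average rating change per game, rating range) over the last `window` entries.

    history: chronological list of a player's ratings, one entry per game.
    """
    recent = history[-window:]
    if len(recent) < 2:
        return 0.0, 0.0
    slope = (recent[-1] - recent[0]) / (len(recent) - 1)
    spread = max(recent) - min(recent)
    return slope, spread


def rising_players(histories, window=10):
    """Sort player names by their recent rating slope, fastest risers first.

    histories: dict mapping player name to rating history.
    """
    return sorted(histories, key=lambda name: trend(histories[name], window)[0], reverse=True)
```

The slope is the crude "derivative" and the spread is the "range of variation"; a player with a high slope and a small spread is climbing steadily, while a big spread with a flat slope just means streaky results.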
|
|
|
|