TLPD Update - ELO Ranking

PoP

France 15446 Posts

September 10 2007 19:33 GMT

The latest TLPD version now includes ELO ratings for players. You can check them out here.

The resulting ranking is more or less "beta" right now, and the different variables used in the algorithm might change in the future (see below for more details).

Something everyone should know about ELO: it’s a 100% skill-based system. Success is not taken into account. Winning a game in the OSL finals or in the MBC prelims gives the exact same amount of points (provided the opponent is the same). We will probably get another rating system later, which will take success into account as well-- but ELO will most likely stay anyway.

So in the player index, there are two new distinct columns:
- the "ELO" column gives the current ELO rating of the corresponding player: it can basically be seen as his "current strength" (again, pure skill-wise) and should evolve accordingly;
- the "ELO Peak" sorts players by the highest ELO rating they ever reached: this basically provides an easy way to spot which players were the most dominant at their time, but also when (clicking on their rating brings you to their "peak game").

For those knowledgeable about ELO, we currently use K = 40 for players who have played a limited number of games (< 20), and K = 20 for the rest. However, nothing is set in stone in this regard-- it's kind of experimental and is likely to evolve.
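For readers who want to see the mechanics, here is a minimal sketch of an ELO update with the K schedule described above. Only the K values (40 below 20 games, 20 otherwise) come from this post; the logistic curve with a 400-point scale is the textbook ELO formula and is assumed here, not confirmed as what TLPD runs, and the function names are made up for illustration.

```python
def expected_score(rating_a, rating_b):
    """Textbook ELO win expectancy for player A (assumed 400-point scale)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def k_factor(games_played, provisional_games=20, k_new=40, k_established=20):
    """K = 40 for players with fewer than 20 games, K = 20 for the rest."""
    return k_new if games_played < provisional_games else k_established


def update(rating_a, rating_b, games_a, games_b, a_won):
    """Return both players' new ratings after one game.

    The league or round the game was played in is not an input:
    only the two ratings and the result matter.
    """
    score_a = 1.0 if a_won else 0.0
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k_factor(games_a) * (score_a - exp_a)
    new_b = rating_b + k_factor(games_b) * (exp_a - score_a)
    return new_a, new_b
```

Because each player uses his own K, the points gained by the winner need not equal the points lost by the loser when one of them is still under 20 games.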

I think that's it. A big thanks to gravity for all the help and the ELO lesson.

statix

United States 1760 Posts

September 10 2007 19:38 GMT

Awesome

vstar

Korea (South) 693 Posts

September 10 2007 19:42 GMT

Very nice

TranceStorm

1616 Posts

September 10 2007 19:43 GMT

I don't think ELO would work out so well for OSL or MSL. In chess, they add rating points only for swiss or round robin tournaments. They never add or subtract points for knockout tournaments. But it will still be interesting to see.

Chill

Calgary 25998 Posts

September 10 2007 19:52 GMT

Nice.

It's interesting to try to draw conclusions from these ratings and see if history backs them up.

polarwolf

924 Posts

September 10 2007 19:55 GMT

interesting. iloveoov was the most skilled BW player EVA.............

minus_human

4784 Posts

September 10 2007 20:03 GMT

SweeT!

What does ELO stand for?

Polemarch

Canada 1564 Posts

September 10 2007 20:09 GMT

Good stuff! This matches quite well with intuition on who the best current players are.

On the first page (top 40) there are 16 terrans, 15 zerg, and 9 protoss. Surprisingly, almost as many zergs as terrans.

NonY

8751 Posts

September 10 2007 20:11 GMT

On September 11 2007 04:55 polarwolf wrote:
interesting. iloveoov was the most skilled BW player EVA.............

..? PoP already explained it best in the opening post: the highest peak shows the most dominant player. Skill isn't something that can be judged through something like an ELO ranking.

polarwolf

924 Posts

September 10 2007 20:14 GMT

#10

On September 11 2007 04:33 PoP wrote:

Something everyone should know about ELO: it’s a 100% skill-based system. Success is not taken into account.

PoP

France 15446 Posts

September 10 2007 20:23 GMT

#11

Actually it only means that Oov's peak was the most ridiculous, which is a well known fact (although Savior came close to matching it).

You can say that Oov was the most skilled by far at that point, and that relative to his opponents of the time he was the most dominant of them all, but comparing skills between two different "eras" is wrong imho, ELO or not.

PoP

France 15446 Posts

September 10 2007 20:25 GMT

#12

In other words, the ELO value can be seen as a skill measure, while the peak one can be seen as a dominance measure. Sounds pretty accurate to me this way.

jkillashark

United States 5262 Posts

September 10 2007 20:28 GMT

#13

ELO is the name of the guy who invented the system. Some Hungarian dude.

IntoTheWow

is awesome 32278 Posts

September 10 2007 20:40 GMT

#14

Nice PoP

GTR

51595 Posts

September 10 2007 21:15 GMT

#15

I find it funny that Boxer isn't even on the first page while someone like Darkelf is there.

GrandInquisitor

New York City 13113 Posts

September 10 2007 21:19 GMT

#16

If I had known you guys were doing this I would have suggested Glicko-2, a modification of ELO ( http://en.wikipedia.org/wiki/Glicko_rating_system ) instead; it's what's used on XBox Live and allows for Ratings Uncertainty and volatility and measures like that, making it in general far more accurate than plain vanilla ELO

Pressure

7326 Posts

September 10 2007 21:21 GMT

#17

On September 11 2007 06:15 GTR-2-Go wrote:
I find it funny that Boxer isn't even on the first page while someone like Darkelf is there.

ditto
Boxer isn't even on first page? we can't deny how skilled he is really

darkelf... geez

SonuvBob

Aiur 21550 Posts

September 10 2007 21:27 GMT

#18

Don't bash Elfie, he rocks.

PoP

France 15446 Posts

September 10 2007 21:28 GMT

#19

On September 11 2007 06:19 GrandInquisitor wrote:
If I had known you guys were doing this I would have suggested Glicko-2, a modification of ELO ( http://en.wikipedia.org/wiki/Glicko_rating_system ) instead; it's what's used on XBox Live and allows for Ratings Uncertainty and volatility and measures like that, making it in general far more accurate than plain vanilla ELO

We can totally change it if needed. Again, this is beta/experimental. I've heard about Glicko, but then forgot about it. Any other opinion? If it's officially better then we'll definitely change.

PoP

France 15446 Posts

September 10 2007 21:29 GMT

#20

On September 11 2007 06:21 Pressure wrote:

ditto
Boxer isn't even on first page? we can't deny how skilled he is really

darkelf... geez

ELO column represents the current strength. Unfortunately, Boxer has lost most of his recent matches, explaining his rather low rating.

NovaTheFeared

United States 7232 Posts

September 10 2007 21:29 GMT

#21

This is a great feature and I hope that eventually we will have modified K value for winning more prestigious matches like OSL or MSL finals over simple qualification games.

TranceStorm

1616 Posts

September 10 2007 21:33 GMT

#22

Yeah, a lot of other rated games use Glicko instead of ELO nowadays.

Wizard

Poland 5055 Posts

September 10 2007 22:07 GMT

#23

W0w sweet...

LordofAscension

United States 589 Posts

September 10 2007 22:22 GMT

#24

Very very awesome work. I'm impressed.

~LoA

OrderlyChaos

United States 1115 Posts

September 10 2007 22:26 GMT

#25

Very nice addition. I'm liking these new features :D

Last Romantic

United States 20663 Posts

September 10 2007 22:29 GMT

#26

BoxeR and YellOw tied?

;P

TheFoReveRwaR

United States 10657 Posts

September 10 2007 22:38 GMT

#27

For current rating that wouldn't be surprising at all

Orome

Switzerland 11984 Posts

September 10 2007 23:19 GMT

#28

Glicko is not objectively better than ELO, they both have their up- and downsides.

Another great addition PoP, <3!

Jyvblamo

Canada 13788 Posts

September 10 2007 23:21 GMT

#29

Heh, the top ~20 or so players are all also on the Kespa top 30.

Very nice feature for TLPD. ^.^

IPS.ZeRo

Germany 1142 Posts

September 10 2007 23:23 GMT

#30

yellow's peak was higher than boxer's, that's quite surprising.

fuglyfrog

United States 521 Posts

September 10 2007 23:36 GMT

#31

Isn't there a problem with inflation when comparing the peak ELO of players from different eras?

edit: I've heard that the Chessmetric system is supposed to be the best at calculating the relative historical rating of players.

Also, there's a formula for factoring in the overall strength of the tournament into the ratings.

GrandInquisitor

New York City 13113 Posts

September 11 2007 00:00 GMT

#32

Glicko is pretty much universally considered better than ELO; it's basically admitting that ratings are wildly variable in certain cases.

It's a pain to describe, though, and it's kind of more complicated. Check http://math.bu.edu/people/mg/glicko/glicko.doc/glicko.html out (the original glicko webpage), and just think whether or not it's worth it. I think it is; others might not.

Eatme

Switzerland 3919 Posts

September 11 2007 00:25 GMT

#33

Aaah that's a great thing to add. You guys rule.
I really like ELO and stats.

Waves

Australia 185 Posts

September 11 2007 01:55 GMT

#34

Another great feature. I'm really impressed with all the work on this site.

I'll add my voice to those calling for Glicko2, if it's not too much trouble.

SuperJongMan

Jamaica 11586 Posts

September 11 2007 02:03 GMT

#35

You guys make progaming so much.

xmShake

United States 1100 Posts

September 11 2007 04:12 GMT

#36

DarkElf is HOT

il0seonpurpose

Korea (South) 5638 Posts

September 11 2007 04:26 GMT

#37

What does ELO stand for? Interesting though

A3iL3r0n

United States 2196 Posts

September 11 2007 04:27 GMT

#38

Electronic Light Orchestra

XCetron

5226 Posts

September 11 2007 04:30 GMT

#39

On September 11 2007 13:27 A3iL3r0n wrote:
Electronic Light Orchestra

I thought it was Extreme Logistic Order

Waves

Australia 185 Posts

September 11 2007 06:24 GMT

#40

On September 11 2007 13:26 il0seonpurpose wrote:
What does ELO stand for? Interesting though

It doesn't actually stand for anything. As mentioned earlier in this thread, it's just the last name of the guy who invented this rating system.

jimminy_kriket

Canada 5532 Posts

September 11 2007 06:49 GMT

#41

Stork is so high..
Switch him with July and take silent_control off the front page then I will be able to rest.

Really nice though, good work.

Aepplet

Sweden 2908 Posts

September 11 2007 07:01 GMT

#42

the ranking is automated, there is no room for switching anything except the formula calculating it.

oneofthem

Cayman Islands 24199 Posts

September 11 2007 07:03 GMT

#43

woot go oov

Silverflame

United States 428 Posts

September 11 2007 07:35 GMT

#44

Wow, Kosiro is on the front page. Not only does he have only 8 recorded matches, he's the same guy who messed up the Sandlot when he got Firefist to play for his own matches against Draco because he was drunk.

Manifesto7

Osaka 27173 Posts

September 11 2007 07:50 GMT

#45

kosiro can go to hell -_-

oneofthem

Cayman Islands 24199 Posts

September 11 2007 08:08 GMT

#46

i dunno, elo works better for certain situations than others. you should try to see the predictive power of such a system as it is configured now. look at winning % in relation to elo difference between 2 players. might be a big project though
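A rough sketch of that check: group past games by the rating gap between the two players, then compare the favourite's actual win rate in each group with what the ELO curve predicts. The input format, the 50-point bucket width, and the function names are assumptions for illustration, and the expectancy formula is the textbook one rather than anything confirmed about TLPD.

```python
from collections import defaultdict


def expected_score(diff):
    """Textbook ELO win expectancy for the player who is `diff` points ahead."""
    return 1.0 / (1.0 + 10 ** (-diff / 400.0))


def calibration_table(games, bucket_width=50):
    """Compare observed and predicted win rates of the higher-rated player.

    games: iterable of (rating_a, rating_b, a_won) using pre-game ratings.
    Returns {bucket_start: (game_count, observed_rate, predicted_rate)}.
    """
    buckets = defaultdict(lambda: [0, 0, 0.0])  # games, favourite wins, sum of predictions
    for rating_a, rating_b, a_won in games:
        diff = rating_a - rating_b
        if diff < 0:
            # Re-orient the game so it is always seen from the favourite's side.
            diff, a_won = -diff, not a_won
        key = int(diff // bucket_width) * bucket_width
        entry = buckets[key]
        entry[0] += 1
        entry[1] += 1 if a_won else 0
        entry[2] += expected_score(diff)
    return {
        key: (n, wins / n, predicted / n)
        for key, (n, wins, predicted) in sorted(buckets.items())
    }
```

If the observed and predicted columns stay close across buckets, the ratings are predicting results about as well as the formula claims; large gaps in the wide-difference buckets would suggest the K values or the scale need tuning.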

SK.Testie

Canada 11084 Posts

September 11 2007 08:18 GMT

#47

Rankings usually have numbers beside them indicating rank.

PoP

France 15446 Posts

September 11 2007 09:27 GMT

#48

On September 11 2007 17:18 MYM.Testie wrote:
Rankings usually have numbers beside them indicating rank.

Will be available shortly.

gravity

Australia 2163 Posts

September 11 2007 12:18 GMT

#49

Nice to see this implemented. Like others have said, there's still room for tweaks/updates to the formulas, but I think the results are pretty reasonable as-is, and it's nice to finally have a less-subjective way of comparing players' strengths.

Jyvblamo

Canada 13788 Posts

September 23 2007 18:22 GMT

#50

The top ranks moved around a bit!
Hwasin first, Savior second, Stork dropping to third.
It'll be interesting to see how these ranks end up after the current season of leagues.

fight_or_flight

United States 3988 Posts

February 15 2008 02:05 GMT

#51

I was thinking about this. Starcraft is a lot different from other things because there are 3 races. Some players are very good at a certain MU and not as good at others. I think it would be more accurate to add in that factor. It would probably be better to have 3 different elo ratings per player (well, technically 9 in case someone off-races), and when they play another player, only that specific elo rating would be affected. The overall elo rating for the player would be the length of the vector (square root of the sum of the squares).
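A minimal sketch of the idea, keeping one rating per opponent race and combining them as a vector length. The starting rating of 2000, the fixed K of 20, the textbook ELO expectancy, and all names here are assumptions for illustration; off-racing (the 9-rating case) is left out.

```python
import math

RACES = ("T", "Z", "P")


def expected_score(rating_a, rating_b):
    """Textbook ELO win expectancy for player A."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


class MatchupRatings:
    """One rating per opponent race (vs T, vs Z, vs P)."""

    def __init__(self, start=2000.0):
        # Hypothetical starting value; the thread does not state one.
        self.vs = {race: start for race in RACES}

    def overall(self):
        """Length of the rating vector, as proposed in the post."""
        return math.sqrt(sum(r * r for r in self.vs.values()))


def play(a, a_race, b, b_race, a_won, k=20):
    """Update only the two matchup ratings involved in this game."""
    rating_a = a.vs[b_race]  # A's rating against B's race
    rating_b = b.vs[a_race]  # B's rating against A's race
    exp_a = expected_score(rating_a, rating_b)
    score_a = 1.0 if a_won else 0.0
    a.vs[b_race] = rating_a + k * (score_a - exp_a)
    b.vs[a_race] = rating_b + k * (exp_a - score_a)
```

One thing to note about the vector length: a player rated r in all three matchups gets an overall of r times the square root of 3, so it is not on the same scale as a single rating. Dividing by the square root of 3 (a root mean square) would keep the numbers comparable.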

I'm not sure if you are still working on this, or if there are other rating systems that are specifically meant to deal with what I'm talking about.

gravity

Australia 2163 Posts

February 15 2008 03:22 GMT

#52

Yes, I agree that having different ratings for each matchup (although the overall rating could stay as it is) would be very interesting. It would lead to higher peaks (and lower lows), since a player is rarely as good in all matchups as they are in their best (or as bad as they are in their worst), plus you could use it for things like seeing who's truly a one-matchup wonder, what a player's best MU actually is, making more accurate predictions of matches, etc.

I think they might have even implemented this in the background or something because I remember there was a post where a TL admin provided a graph of a player's rating over time in one particular matchup, or something like that.

meemoe_uk

United Kingdom 29 Posts

January 26 2011 17:28 GMT

#53

Reason for bumping old thread : new idea

It strikes me there is an interest in the community for a measure of pro-players on the rise, and the power rank does not fully address this need.
For those who find the power rank to be too dependent on speculation and opinion, it would be nice if the auto-analysis of the TL elo system were to be extended.
I'm talking about measuring derivatives and ranges of elo variation over time.

Can the relevant TL techies hear me on this frequency?
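As a rough illustration of what such a measure could look like, the sketch below takes a player's rating history and reports the average change per game and the spread over the most recent games. The input format, the 10-game window, and the names are assumptions; nothing here reflects an existing TLPD feature.

```python
def rating_trend(history, window=10):
    """Summarise recent movement in a player's rating.

    history: chronological list of post-game ratings.
    Returns None if fewer than two ratings fall inside the window.
    """
    recent = history[-window:]
    if len(recent) < 2:
        return None
    # Average rating change per game across the window (a discrete derivative).
    slope = (recent[-1] - recent[0]) / (len(recent) - 1)
    # Spread between the highest and lowest rating in the window.
    spread = max(recent) - min(recent)
    return {"slope_per_game": slope, "range": spread}
```

Sorting players by the slope would surface those rising fastest, while the range separates steady climbers from streaky ones.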
