Aligulac.com is an ongoing statistical project and website, in development since December 2012. It offers a comprehensive database of games from the pro and semipro SC2 scene, as well as a unique rating system aimed at rating players and teams and predicting games.
The FAQ might be able to answer your questions. If not, I'll be keeping an eye on this thread so you can ask away here.
Also, before we start, I want to quote Heartland from one of my previous threads, who put this into words better than I ever could.
On February 03 2013 02:50 Heartland wrote: I think what's cool and great about this work is that it does what statistics are good for. They give you the ability to create data and then to look at it critically. But maybe people in this thread confuse statistics with the Truth with a capital p (sic). That's not the way you should read statistics, whether in the morning paper or on TL. Rather statistics can make us think about deeper connections that we haven't seen before, twist and turn around concepts and play with them through statistical models. They're not meant to say "Scarlett should be in Code S." Obviously there are flaws or issues with these stats, but it's common for statistics everywhere. What you can do with that is to add or change some modifier, let it meet other forms of reasoning or to extrapolate on what we take for granted.
So yeah, tl;dr: "lies, damned lies, and statistics" goes for all stats, but that's not the point of stats.
So, let's get on with the news.
Ratings bug
Aside from a lot of minor improvements, two significant things have happened. On Monday I got an e-mail from a certain Zomia, who had lots of constructive feedback on the site.
You see, I initially envisioned this site to be about ratings, but now it's more of a TLPD-like thing, which is to say it's a database of results, and most of the development lately has been related to that. But Zomia convinced me to go back and look at the rating system once more, and lo and behold, there was a bug.
Strictly speaking, the bug was not in the rating system itself, but in the code that I used to analyze it and pick parameters. Until now I have basically been basing these numbers on flawed information. No longer! As it turns out, this is for everyone's benefit, because it allowed me to tweak the system to be more conservative (which a lot of people have been wanting) while giving it the predictive power that I thought it had, and which it now actually has.
This graph plots all 114k+ games, categorizing them by predicted winrate for the presumed stronger player on the horizontal axis, and the actual winrate for the same player on the vertical axis. That's the thick black line. The blue line shows a weighted linear fit, while the red line is the ideal that we want. The system tends to slightly overestimate the strength of the stronger player up to a game winrate of 80%. From then on the overestimation can be significant. (Note that this is game winrate, not Bo3 or Bo5 winrate, which is higher.)
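For the curious, a calibration check like this can be sketched in a few lines, assuming you have, for each game, the predicted winrate of the presumed stronger player and the actual outcome. The binning and weighting choices here are mine, not necessarily what the site does:

```python
# Minimal calibration-curve sketch (assumed data layout, not Aligulac's code):
# predicted: predicted winrates for the presumed stronger player (>= 0.5)
# outcomes: 1 if that player actually won the game, else 0
import numpy as np

def calibration_curve(predicted, outcomes, n_bins=50):
    predicted = np.asarray(predicted, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    edges = np.linspace(0.5, 1.0, n_bins + 1)
    idx = np.clip(np.digitize(predicted, edges) - 1, 0, n_bins - 1)
    centers, actual, weights = [], [], []
    for b in range(n_bins):
        mask = idx == b
        if not mask.any():
            continue
        centers.append(0.5 * (edges[b] + edges[b + 1]))   # predicted winrate
        actual.append(outcomes[mask].mean())              # observed winrate
        weights.append(mask.sum())                        # games in the bin
    return np.array(centers), np.array(actual), np.array(weights)

# The "blue line" would then be a weighted linear fit:
# slope, intercept = np.polyfit(centers, actual, 1, w=weights)
```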
Hopefully this can convince everyone that it's working just fine now.
The upshot of this is that the rating list looks much more sensible, as does the Hall of Fame (whose workings I've also changed a bit), which is now topped by people like MC, MKP and Mvp.
So to those of you who have been critiquing me, I owe you an apology.
Add results! Now easy even for grandma
Now, to the results database. We have 114 thousand games. That's about twice the size of sc2charts.net and about a third more than TLPD has, which makes us the most complete publicly available pro/semipro SC2 database in the world (Blizzard's internal database is neither). Which is fine by me: I don't much care for sc2charts.net, and while TLPD is great, the international database is not kept up to date very well anymore.
Yet, I don't blame them. Creating this database with little more than just four people pulling the cart, I've learned that this shit's hard.
That is why we have opened result submission to everyone. You can do that here: http://aligulac.com/add/
Publicly submitted results will be subject to review by us before becoming visible. If you are interested, you can still PM me and get an admin account, which will allow you to:
submit results directly without all the review bother.
change, create and delete players, teams, matches...
review other people's submitted results.
sort matches into our events catalogue, which is still missing about 40% of the database.
mark the offline games as offline (some people wanted a separate rating for offline games, but that can't happen until this sorting is done).
bug me for feature requests (priority given to helpers).
just about anything else you can think of.
So, if your favourite player is rated too low, I suggest you go out and find some games that they won and add them for us.
(Kinda joking, but also kinda not.)
Thanks to the unknown guy out there who submitted IPTL this morning, and thus unknowingly became the first to use this system. (No idea who you are...)
Now, as usual, the ratings have been updated to the latest two-week period. This one includes results from the first two days of GSL Code A and the preliminaries, most of GSL Code S Ro32, Proleague, IPL qualifiers, IPTL, the Iron Squid finals, a handful of Go4SC2s and ZOTACs, and a bunch of other tournaments.
Zerg is still ahead in the top of the ladder, but this time it's Terran who is lagging behind, and not (as is usually the case) Protoss.
(I'm a little bit earlier this time so the list can be out in time for the GSL Ro16.)
Current top 10
Life 2256 (+4 after Iron Squid)
PartinG 2239 (+15 after Code S Ro32, MLG quals and FXO inv. playoffs)
Leenock 2235 (no change after FXO Inv. playoffs)
Bomber 2233 (+12 after Code S Ro32)
Rain 2211 (-9 after Proleague and Code A prelims)
DongRaeGu 2203 (+41 after IPL intl. regionals, Iron Squid, and Code S Ro32)
TaeJa 2194 (-1 after Code S Ro32 and a delayed game from IPL intl. regional)
RorO 2179 (+93 after Proleague, Code S Ro32 and MLG quals)
Scarlett 2137 (no games)
viOLet 2132 (no games)
PartinG and Leenock have switched places, as have DongRaeGu and TaeJa. RorO and viOLet make their appearances while HerO and Last drop out. RorO and DRG make the biggest jumps upwards.
Foreigner top 10
Scarlett 2137
Stephano 2015
VortiX 2014
Snute 1973
LucifroN 1947
Sen 1909
Kas 1886
Fraer 1869
TitaN 1868
Nerchio 1826
The foreigner list is still Zerg-heavy, though maybe not as much as before. Nobody in the top 10 gained any points, except Snute, who shot up 136 after playing a ton of games.
Top 10 teams
MVP 91.44%
StarTale 91.11%
SK Telecom T1 91.04%
Incredible Miracle 90.14%
AZUBU 89.08%
Team Liquid 88.75%
Prime 86.25%
FXOpen e-Sports Korea 85.96%
STX SouL 84.76%
Evil Geniuses 84.44%
This is the allkill rank, and it's very close at the top. StarTale has lost their big advantage, and SKT T1 keeps making a name for themselves as the strongest Kespa team.
The proleague rank looks like this:
SK Telecom T1 79.26%
MVP 78.80%
Incredible Miracle 74.91%
AZUBU 74.71%
StarTale 72.00%
STX SouL 67.12%
Team Liquid 66.52%
Prime 62.50%
Evil Geniuses 61.92%
FXOpen e-Sports Korea 60.87%
Still close between SKT T1 and MVP, who might be the strongest teams (on paper) at the moment. StarTale is comparatively weaker in this format (their roster is top-heavy with Life and Bomber), while STX is stronger.
(Note team ranks are based on player ratings and rosters, not actual team matches.)
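Since the team percentages are derived from player ratings rather than team results, one plausible way to compute an all-kill number is by simulation. A minimal Monte Carlo sketch, under assumptions of mine (winner stays on, fixed player order, Elo-like logistic win probabilities), not Aligulac's actual formula:

```python
# All-kill format: the winner of each game stays on until one roster is empty.
import math, random

def p_win(r_a, r_b, scale=400 / math.log(10)):
    # Elo-like logistic win probability (an assumed model, not the site's).
    return 1.0 / (1.0 + math.exp((r_b - r_a) / scale))

def allkill_winrate(roster_a, roster_b, trials=100_000):
    wins = 0
    for _ in range(trials):
        i = j = 0                       # index of the active player on each team
        while i < len(roster_a) and j < len(roster_b):
            if random.random() < p_win(roster_a[i], roster_b[j]):
                j += 1                  # A's player stays, B sends the next one
            else:
                i += 1                  # B's player stays, A sends the next one
        wins += (j == len(roster_b))    # B ran out of players: team A wins
    return wins / trials

# Example with made-up ratings:
# allkill_winrate([2256, 2233, 2100, 2000], [2211, 2137, 2050, 1980])
```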
Thanks
To my team of trusted helpers: Conti, kiekaboe, Grovbolle, PhoenixVoid, Inflicted_ (new) and scisyhp (new). This project would never have been possible without you.
This week particularly to Zomia whose feedback led me to reconsider a few things.
Also to my academic advisor whose timely conference trip abroad allowed me the free time to waste.
Hahaha, the person who submitted the IPTL results was ME! The system reset, and I forgot I had to log in again, which I only realized after seeing the message showing I was not logged in.
Funnily enough, I still didn't do it properly, forgot the "Source".
hmm the 80%+ winrates seem to be really poorly predicted. but i guess that makes sense when new people have 1000 points and play against somewhat equally skilled players who have 2000 points. what really matters is the 50-70% region anyways. thx for the update, now i may FINALLY make some money with this thing :D
On February 07 2013 09:55 Greenei wrote: hmm the 80%+ winrates seem to be really poorly predicted.
Also, this could be because of the underlying model (maybe using the normal distribution wasn't the best idea; logistic might be better after all... more on this in a later edition maybe), or because there really aren't that many games with an 80%+ skill gap.
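To see what swapping the model would change: both distributions map a rating gap to a win probability, and they disagree mostly in the tails, which is exactly the 80%+ region. A quick comparison, with the logistic scale chosen (my choice, for comparability) so both curves have the same slope at zero gap:

```python
import math
from statistics import NormalDist

def p_normal(d):
    return NormalDist().cdf(d)                 # probit link

def p_logistic(d, s=0.6267):
    return 1.0 / (1.0 + math.exp(-d / s))      # logit link, slope-matched at d=0

for d in (0.5, 1.0, 2.0, 3.0):
    print(f"gap {d:.1f}: normal {p_normal(d):.3f}, logistic {p_logistic(d):.3f}")
# The logistic tail is fatter: at gap 3 it still gives the underdog ~0.8%,
# where the normal model gives ~0.1%.
```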
btw: i still think a timeframe-independent model would be nice, because it would make rating updates quicker. i could understand it though if that's not possible. so here is my idea:
how about a 'predicted rating development'? as in 'if these games, which have been reviewed, were the only games played in this timeframe, this rating would be the result.'
Just want to say that you're awesome, TheBB. It's great someone picked up the pieces after TLPD just kinda shattered and made something even better and more complete.
I've been using it quite a bit and it's now my go-to SC2 database site if I need to look up something, thanks a lot to you and all your helpers!
Also, I'm very happy with the foreigner rankings, ROX.KIS fighting!
On February 07 2013 10:02 Greenei wrote: how about a 'predicted rating development'? as in 'if these games, which have been reviewed, were the only games played in this timeframe, this rating would be the result.'
so what do you think about this? i just think it would be really cool stuff.
It makes no sense for me to do a write-up on the Kespa pros since a shit ton of old matches were added, meaning that everyone's ranking changed, so it's hard to compare between period 76 and 77. Will do it for the next ranking. I feel like quoting my own Skype log with TheBB, because I said the whole "Aligulac is like Caligula" thing first :D
Poor Socke, he lost his position as the best foreign Protoss. He isn't even in the top 10 EU toss right now. I guess he needs some more consistency to get up there again with the new ranking.^^
On February 07 2013 21:28 StarGalaxy wrote: One of my favourite sc2 related sites.
On February 07 2013 11:05 LockeTazeline wrote: Awesome stuff, as always.
What are you going to do with HotS? I noticed you've already put in the MLG Showdowns.
There are already various HotS showmatches and tournaments in the database, and we (and anyone who wants to help out!) are in the process of categorizing them as HotS matches/tournaments.
As far as I know, there won't be a distinction made between WoL and HotS matches with regard to the ratings. There might be some chaos during the launch of HotS because of this, but there's really no smart way to prevent it, and the ratings will adjust quickly enough.
On February 07 2013 21:41 JustPassingBy wrote: Only 5 Terrans in the top 40 non-Koreans now... ;; Is there a reason why DeMusliM does not show up in the non-Korean rankings, btw?
No matches played for 4 periods (8 weeks), meaning he is currently too hard to rate. His rating isn't deleted, just hidden until we again have an idea of his skill level.
"If a player has not played any games for four periods (eight weeks, or about two months), he or she is removed from the list. The entry is still there, and will be taken into account if the player plays a new game some time in the future. It's only kept from the published list to keep it from filling up with uncertain and possibly irrelevant data.
The same happens if your rating uncertainty goes over 200.
I'm aware that a strict limit of only two months may seem harsh, but as the game is so volatile, if a player doesn't play fairly frequently, the ratings quickly become very uncertain and useless. Remember, your rating is still there."
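The quoted rule amounts to a simple visibility test; a sketch of it (hypothetical names, not the site's source):

```python
INACTIVITY_PERIODS = 4   # four two-week periods, i.e. about two months
MAX_UNCERTAINTY = 200

def is_listed(periods_since_last_game, rating_uncertainty):
    # The rating itself is kept either way; this only controls the list.
    return (periods_since_last_game < INACTIVITY_PERIODS
            and rating_uncertainty <= MAX_UNCERTAINTY)
```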
On February 07 2013 10:02 Greenei wrote: how about a 'predicted rating development'? as in 'if these games, which have been reviewed, were the only games played in this timeframe, this rating would be the result.'
so what do you think about this? i just think it would be really cool stuff.
The -only- issue that I can think of is that there is a 'flaw' to the way it handles GSL/PL. When the Korean scene stays largely separate from the foreign scene, Aligulac has a hard time distinguishing "the gap". For example, the first few GSL seasons had almost no players who had played against foreigners in recorded matches.
Therefore, players who beat exclusively Koreans who did not play (and stomp) foreigners are not having their bar set high enough. This taints a lot of the 2010/early 2011 data. For example, look at NesTea in the Hall of Fame. He's below -Naniwa- and like 4 other foreign players, despite winning 3 GSLs in a 9 month time period.
Luckily, this probably isn't a major issue going forward, since MLG and other tournaments are bringing major Koreans to stomp foreigners, and then those Koreans go home and get beaten by better Koreans, which keeps things balanced. However, there was a once upon a time where the scenes rarely mixed.
TL;DR: GSL Code S is the highest skill tournament in the world, but to Aligulac, this is something that needs to be -continuously- proven, and therefore runs the risk of -not- being proven due to the said players not traveling and focusing entirely on GSL.
That's precisely the reason why we have an international TLPD and a Korean TLPD (at least I think so; do let me know if I'm wrong here). In Brood War, Koreans practically never played against foreigners, and if the TLPD Elo rankings had been combined back then, foreigners would have ranked evenly with Koreans, which was even more laughable back then in Brood War than it is now in SC2. So having separate databases for TLPD made perfect sense.
Now, however, Koreans play foreigners all the time, so we can compare them. With the exception, as you say, of the early days of GSL. Another more prominent exception is Proleague, and it's even worse there, IMO. You can see the jumps in the rankings of most Kespa players (Classic, Last, Trap, etc.), which denotes the point where Kespa players started to play against non-Kespa Koreans. As you can see, the rankings have adjusted pretty quickly.
Unfortunately, there are still a lot of lesser known Kespa players who have almost solely played in Proleague so far, and their ranking is unrealistically low. Heck, some of the worst ranked players in the entire ranking are Kespa players. I don't think there's any real way to "fix" this, other than waiting it out until the Kespa ratings have adjusted to the overall ratings.
On February 07 2013 23:50 dcemuser wrote: I love Aligulac.
[...]
I'm honestly not sure how you solve this, other than just mentioning it in the FAQ and admitting the early data is going to be kind of weird.
It speaks for itself that it takes a lot of matches from a large group of players to determine their skill, and I agree that the early stages of the lists are somewhat dubious. The same goes for a lot of the Kespa pros: the more matches we get to see from them, and the more interaction there is in the entire scene, the better the lists and the predictions get. They all started out at a 1000 rating, which isn't a lot since we knew that most of them would still be pretty good, but it is also hard to just say "well, they are better than 1000 for sure" when we have no definitive way of being sure.
Valid points/concerns though. The system will get shaken a lot when HotS hits as well.
[...]
Ah, thanks for the info. That rule does make a lot of sense to me.
On February 07 2013 10:07 Day[9] wrote: Aligulac is very nearly Caligula backwards. At least they anagram.
It's not even that complicated. Just move the C to the front or the back.
It's a word I came up with as a kid :-P. I don't remember if Caligula was the inspiration. I guess he could've been.
I thought it definitely was :D When you released Aligulac for the first time, I even looked into Caligula to find some kind of connection to your website. It wasn't very insightful. Except that he had decent macro, I guess.
On February 07 2013 23:50 dcemuser wrote: I love Aligulac.
[...]
I thought about this (that's the paper I based my method on, actually), but I didn't quite like the idea of past lists changing forever. When FIDE (chess) ratings are published they are set in stone: you know, for example, that Kasparov's 2851 record from 1999, or Carlsen's 2872 at the moment, will never be anything other than what they are. Changing lists make it awkward for enthusiasts to track records. Not that I've noticed a lot of people tracking Aligulac records, since the past lists are changing anyway due to the expanding database (for the time being), but still, I wanted to give people the option.
Thoughts?
Edit: Just so it's clear, we're talking about basing ratings on both past and future results, so that the historical ratings look more correct in hindsight. It can fix some of the early problems by (for example) adjusting Koreans upwards because we now know that they have an average higher skill level.
Maybe just run smoothing once. Really, as long as you start from around October 2012 (MvP matches) and smooth backwards from there, most of the problems would probably be fixed.
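To make the "smoothing" idea concrete, here is a toy backward pass in which earlier ratings borrow strength from later ones. The blend weight is arbitrary; a proper smoother, like the one in the paper mentioned above, would weight by the uncertainties involved:

```python
# Toy fixed-interval smoother over a player's per-period rating series.
def smooth(filtered, blend=0.5):
    smoothed = list(filtered)
    for t in range(len(smoothed) - 2, -1, -1):   # walk backwards in time
        smoothed[t] = (1 - blend) * filtered[t] + blend * smoothed[t + 1]
    return smoothed

# smooth([1000, 1400, 1900, 2100]) lifts the early estimates of a player who
# turned out to be strong, e.g. the early-GSL Koreans discussed above.
```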
There is not a lot of difference between First and Last in the recent list.
Good job, always love your list! Glad you found a bug. It still seems a bit weird seeing Scarlett that high on the list, but w/e, math does not lie, and it's only a model, not the truth (whatever truth is).
what does the '+-30' in the matchup points and general rating actually mean? does it mean ~100% of the time the rating is in that area? or is that 1 or 2 or 3 standard deviations? or the maximum amount that the rating will shift?
On February 08 2013 07:22 Greenei wrote: what does the '+-30' in the matchup points and general rating actually mean? does it mean ~100% of the time the rating is in that area? or is that 1 or 2 or 3 standard deviations? or the maximum amount that the rating will shift?
Making a qualified guess, I would say it is +-3 standard deviations (meaning that 95% of the time, the actual rating falls within the confidence interval, i.e. Rating +- 3 St. Deviations.)
On February 08 2013 07:22 Greenei wrote: what does the '+-30' in the matchup points and general rating actually mean? does it mean ~100% of the time the rating is in that area? or is that 1 or 2 or 3 standard deviations? or the maximum amount that the rating will shift?
It's actually just one estimated standard deviation, so it's a pretty weak confidence interval.
k thx. do you plan on making the database open source at any point? because i'd like to make some calculations of my own from time to time and there would be no point at all in starting an own database at this point.
Making a qualified guess, I would say it is +-3 standard deviations (meaning that 95% of the time, the actual rating falls within the confidence interval, i.e. Rating +- 3 St. Deviations.)
3 stds would be ~99%.
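For reference, the coverage figures being debated are easy to check, assuming the rating error is normally distributed:

```python
from statistics import NormalDist

sigma = 30                                   # the "+-30" from the question
for z in (1, 2, 3):
    coverage = 2 * NormalDist().cdf(z) - 1   # P(rating within +- z sigma)
    print(f"+-{z * sigma}: {coverage:.1%}")  # 68.3%, 95.4%, 99.7%
```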
As a stats buff, I gotta say it really is a nice website, like a cleaner and better version of TLPD (or sc2charts, whatever floated your boat). Both infuriated me for the longest time because they had the data and did nothing with it. You, on the other hand, understand that a DB is only as good as what you do with it. I also love how well your data is historized.
Downloading that Db dump from work is so tempting...
So BB if you ever get particularly bored, could you make a prediction system for ProLeague/GSTL based on not only on player rating for both rosters but also maps? Or is it simply not going to be accurate enough to warrant the gargantuan effort involved in creating and implementing the system? xD
On February 11 2013 06:40 MasterOfPuppets wrote: So BB if you ever get particularly bored, could you make a prediction system for ProLeague/GSTL based on not only on player rating for both rosters but also maps? Or is it simply not going to be accurate enough to warrant the gargantuan effort involved in creating and implementing the system? xD
There's currently no map information saved in the database, only matches and results. So before any kind of predictive magic math can be applied, we'd need that information for >100,000 games. And we'd need a whole lot more volunteers for that.
On February 11 2013 06:40 MasterOfPuppets wrote: So BB if you ever get particularly bored, could you make a prediction system for ProLeague/GSTL based on not only on player rating for both rosters but also maps? [...]
Nudge. Nudge.
Plus we (TheBB) had to rework how the entire database is configured because matches =/= games.
Plus it would be hard, since a lot of LP articles contain no map info; even for big tournaments like MLG it is impossible to find map info for stuff like the open bracket, etc. So yeah, way too much work. Whenever a new feature has to be "backtracked", as I like to call it, it literally takes our small team of 4-5 (TheBB, Conti and kiekaboe do a shit ton each, and I + Inflicted do some as well) weeks. Just look at http://aligulac.com/db/ : "only" 64% is catalogued in the event hierarchy.
On February 07 2013 23:50 dcemuser wrote: I love Aligulac.
[...]
Maybe you could do some kind of backwards adjustment (or this "smoothing" you guys speak of) only on new players? Like, compute things normally for them for about 4 periods or something like that (or for a set amount of games played, i guess?), then adjust their ratings retroactively, and then don't mess with their past ever again.
So imagine that I get a magical seed for Code S next season, and lose my first game of the group stages against Life (but only because i'm nervous). This doesn't give a lot of points to Life because I'm totally unknown at that point.
Then I proceed to stomp all competition and win Code S without dropping another map. Then your script readjusts my ratings and suddenly Life has a rating of like 3000 because he took a game off me.
And then pro players catch up to my silver strats and I don't win a game ever again.
[...] So imagine that I get a magical seed for Code S next season, and lose my first game of the group stages against Life (but only because i'm nervous). [...]
Do you have a Code S seed that you haven't told anyone about? :D
While it is definitely a rating system built around extremely high level players and seeing where they are ranked, I find it cool that I could even find some of my own results that I had completely forgotten about, from like early 2011. This is a great system for the high level professional players, and also really useful for a lowly NA semi-pro.
On February 12 2013 02:05 KingDime wrote: While it is definitely a rating system built around extremely high level players and seeing where they are ranked, I find it cool that I could even find some of my own results that I had completely forgotten about, from like early 2011. This is a great system for the high level professional players, and also really useful for a lowly NA semi-pro.
If you ever participated in any of the major tournaments, or, as in what I assume is your case, NASL Season 4 and ZOTAC Cup #20 plus what we scraped from TLPD, there is a good chance you are in the system.
Edit: You are Canadian, not American, so your results are from WCS. Edit edit: I merged the players Dime (us) and Dime (ca). Please respond to my PM so I can get the results you submitted reviewed.
On February 07 2013 09:55 Greenei wrote: hmm the 80%+ winrates seem to be really poorly predicted.
Also, this could be because of the underlying model (maybe using the normal distribution wasn't the best idea; logistic might be better after all... more on this in a later edition maybe), or because there really aren't that many games with an 80%+ skill gap.
Logistic is what gets used in chess Elo rankings, specifically because it was found to be a better fit to the data. The only part the two functions handle significantly differently is the tail (so the 80%+ percentage gap cases).
I mean, specifically looking at the chart you posted:
Ignore the data noise for a moment and look at the fitted curve. The fitted curve starts dead even with the ideal curve, but slowly diverges (and from 50%-83% the actual game data very closely follows the fitted curve). This is what I'd expect to see if the probability distribution being used was collapsing too tightly.
The problem with the normal distribution is that it's not built for transitive operations.
If Life has an 84% chance of beating Sheth, and Sheth has an 84% chance of beating Artosis, does this mean that Life has a 98% chance of beating Artosis?
Because this is what the normal distribution predicts when you combine two win percentages.
Two 31% chances combine into a 16% chance.
Two 23% chances combine into a 7% chance.
Two 16% chances combine into a 2% chance.
Two 7% chances combine into a 0.1% chance.
Which...doesn't feel right to me. If we take the above Life/Sheth/Artosis numbers, but make Sheth 50% less likely to win against Life, and Artosis 50% less likely to win against Sheth, suddenly Artosis is 95% less likely to win against Life? The odds against him really increase by a factor of 20, when the odds of the two intermediate matchups only get worse by a factor of 2?
Compare to the logistic curve. When you combine two win percentages to predict a more distant match...
Two 31% chances combine into a 17% chance.
Two 23% chances combine into a 9% chance.
Two 17% chances combine into a 4% chance.
Two 9% chances combine into a 1% chance.
Which is to say: if you take existing win percentages, and change them so that Sheth is 1/2 as likely to beat Life, and Artosis is 1/2 as likely to beat Sheth, then that makes Artosis 1/4 as likely to beat Life (instead of 1/20 as likely).
Just intuitively, this feels like a more reasonable way to combine percentages. If you told me with absolute certainty "Life beats Sheth X% of the time" and "Sheth beats Artosis Y% of the time", and then asked me "What do you expect Life's winrate against Artosis to be?", my guess would be much closer to the logistic distribution than the normal distribution.
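The arithmetic in that post is easy to verify: under a normal model the rating gaps add on the probit scale, while under a logistic model the odds multiply. A quick check (not site code):

```python
from statistics import NormalDist

def combine_normal(p_ab, p_bc):
    nd = NormalDist()
    return nd.cdf(nd.inv_cdf(p_ab) + nd.inv_cdf(p_bc))   # gaps add

def combine_logistic(p_ab, p_bc):
    odds = (p_ab / (1 - p_ab)) * (p_bc / (1 - p_bc))     # odds multiply
    return odds / (1 + odds)

print(combine_normal(0.84, 0.84))    # ~0.98, the Life/Sheth/Artosis example
print(combine_logistic(0.84, 0.84))  # ~0.97, a noticeably fatter tail
for p in (0.31, 0.23, 0.16, 0.07):
    print(p, round(combine_normal(p, p), 3), round(combine_logistic(p, p), 3))
```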
When I look at TheBB's posts, I see the gathering mass of a star being born.
This is insanely useful information, an excellent use of statistics, and I hope to (insert deity or otherworldly influence here) that you can get some academic use out of this project as well. (A paper, an essay, something.)
On February 08 2013 04:56 ACrow wrote: Good job, always love your list! Glad you found a bug, it still seems a bit weird seeing Scarlett that high on the list, but w/e, math does not lie and it's only a model not the truth (whatever truth is).
As a big Scarlett fan...yeah, I would not put her ahead of Hyun.
In general, the place the ratings feel a little wrong is when players play in an overly easy (or overly hard) group.
Actually, let me quickly note that MaSa is also not on this list (Aligulac has MaSa as Korean; Liquipedia has MaSa as Canadian; I believe Canadian is correct here).
But what I really want to point out here is.... Look at the 5th best foreign Terran. Bunny! Who is Bunny? Someone who participated in a Danish Starcraft tournament, and I guess got more wins than losses. Boom, 5th best foreigner Terran, apparently!
If you want to get highly rated by Aligulac, then play opponents weaker than yourself...which sums up a decent number of Scarlett's tournaments (WCS Canada, WCS North America, IPL qualifier for North America...).
Conversely, if you want a low rating on Aligulac, then play stronger opponents. (Most of the foreigners who participated in the MLG vs Proleague event had their ratings dip, usually by about 300 points right around October 2012. For example, look at the rating graph of qxc: http://aligulac.com/players/261/ ).
I don't really know if there's a good statistical way to fix this issue, however. If all the Danish people collectively decide to never play anyone outside of Denmark, some of them are going to end up with very high ratings, some of them are going to end up with very low ratings. Not a whole lot that can be done about it.
Very true; I believe a few pages back someone posted an explanation of how "islands" within a ranking system affect this. But please remember, this isn't "THE TRUTH". Bunny has very few matches; the problem always arises when someone new enters a scene with a lot of "new" players (non-ranked players starting at 1000). And yes, it can become a problem if a subcommunity only plays each other.
Logistic is what gets used in chess Elo rankings, specifically because it was found to be a better fit to the data. [...]
Yes, I know all this now. But thanks for putting it into words anyway.
I will try and see what happens. I expect some improvement, too.
On February 12 2013 07:23 felisconcolori wrote: This is insanely useful information, an excellent use of statistics, and I hope to (insert deity or otherworldly influence here) that you can get some academic use out of this project as well. (A paper, an essay, something.)
Thanks. I asked my advisor about whether or not the institute has a policy on publishing reports on topics that are outside the main area of research, and he said it was fine as long as I found some statistician to look at it. (None of the people I usually work with know anything about statistics, lol.)
I've now converted everything to using the logistic distribution. You should see somewhat more conservative predictions now.
Updated prediction analysis:
It didn't help as much in the 80%+ regime as I thought it would. I'm thinking the problem is more related to the sudden arrival of new player pools (Koreans in late 2010, Kespa in 2012), and I may have to do something about that, such as one or more of:
- parameter smoothing over certain time periods
- use time-dependent parameters
At the moment I'm a bit tired of the mathematical part and I'll go back to working on the website for a few weeks.
I have a question about the way the results are updated on the website, because I was amazed by how fast the results of Francophone Championship Season 2 were taken into account this afternoon. Did someone submit them?
And a second related question: when someone submits a result, do you add it manually?
On February 17 2013 07:58 Boucot wrote: I have a question about the way the results are updated on the website, because I was amazed by how fast the results of Francophone Championship Season 2 were taken into account this afternoon. Did someone submit them?
And a second related question: when someone submits a result, do you add it manually?
I think that was just kiekaboe being really fast. (No guarantees for the future.)
When someone non-admin submits games, we have to review them before they are properly added, yes.
Would it be possible for you to implement byes into the single elimination bracket prediction? Like if you write # or BYE or something. The ESF IPL6 seeding tourney going on at the moment, for example, has 7 players with one receiving a first round bye. Lots of other tourneys like the IEMs do it too, with 12-man playoffs.
In the case of 12-man, I assume you would format it like this:
Seed 1 BYE Seed 8 Seed 9
Seed 4 BYE Seed 5 Seed 12
Seed 2 BYE Seed 7 Seed 10
Seed 3 BYE Seed 6 Seed 11
Thus giving you:
1 vs. (8 vs. 9)
4 vs. (5 vs. 12)
2 vs. (7 vs. 10)
3 vs. (6 vs. 11)
Perhaps implementing BYE as a fake player with a rating of 0 in all match-ups?
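That suggestion is straightforward to make concrete: a BYE entrant that loses with certainty simply drops out of the math. A hypothetical sketch (the names and the logistic curve are my assumptions, not the site's code):

```python
import math

BYE = None   # placeholder entrant for an empty bracket slot

def p_win(r_a, r_b, scale=400 / math.log(10)):
    if r_b is BYE:
        return 1.0          # a real player always advances past a bye
    if r_a is BYE:
        return 0.0
    return 1.0 / (1.0 + math.exp((r_b - r_a) / scale))

def first_round_advance(slots):
    """slots: ratings (or BYE) in bracket order; pairs adjacent slots and
    returns each entrant's probability of winning their first match."""
    probs = []
    for a, b in zip(slots[0::2], slots[1::2]):
        probs.extend([p_win(a, b), p_win(b, a)])
    return probs

# 12-man example from the post: Seed 1 gets a bye, Seeds 8 and 9 play.
# first_round_advance([seed1, BYE, seed8, seed9, seed4, BYE, seed5, seed12, ...])
```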
On February 17 2013 08:27 MCXD wrote: Would it be possible for you to implement byes into the single elimination bracket prediction? Like if you write # or BYE or something. The ESF IPL6 seeding tourney going on at the moment, for example, has 7 players with one receiving a first round bye. Lots of other tourneys like the IEMs do it too, with 12-man playoffs.
Good point, that's another one of those features that the backend code already has but I haven't worked it into the frontend. It's on the list.
In the meantime you can use nemuke as a placeholder. He is currently the lowest ranked player (even counting the inactive ones), and so he will have the least impact. Poor guy. :D
There's no restriction on using the same player more than once, either.
As far as I can tell, putting in a dummy player and then going down to the 'update the results' section to award a 2-0 win in that first match works too. You can then just crop the dummy player off the little code block you post on the forums, for cleanliness.