Specific feedback and feature requests should go here: Aligulac.com changelog and feedback thread
We have a FAQ here. FAQ means Frequently Asked Questions. Odds are your Question is Frequently Asked.
Visit us at aligulac.com
If you have an idea on how to improve the ranking, please read this first:
One of the most common forms of feedback we get is suggestions for how to improve the rating. These are not bad ideas; many of them are pretty good, but they are often more complicated than you might think. Here are some common ones.
One thing to keep in mind is that a good mathematical model has few parameters. The current rating system has three. Those three parameters are sufficient to give Aligulac the predictive power it has. If you want to suggest complicating the model, the additional parameters must be chosen with care.
Games in regular online tournaments shouldn't count as much as Code S. Well, first we have to realize that games are already weighted, in a sense, by opponent skill. You get more points for beating a higher rated opponent than a lower one, and you lose more points by losing to a lower rated opponent than a higher one. It is also worth considering that simply weighting some games higher will not automatically increase the ratings of those who play them. The winners will gain more points, true, but the losers will also lose more, so the mean rating of the players involved will not change. A toy sketch of this is shown below.
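To make that concrete, here is a minimal Elo-style sketch. To be clear, this is not Aligulac's actual formula (ours comes from a different probability model), but the asymmetry and the zero-sum property work the same way:

```python
# Toy Elo-style update. NOT Aligulac's actual formula, just an
# illustration of how opponent strength already weights each game.

def expected_score(r_a, r_b, scale=400):
    """Probability that the player rated r_a beats the player rated r_b."""
    return 1 / (1 + 10 ** ((r_b - r_a) / scale))

def update(r_winner, r_loser, k=32):
    """Rating changes for winner and loser after a single game."""
    delta = k * (1 - expected_score(r_winner, r_loser))
    return delta, -delta  # the winner gains exactly what the loser loses

# Beating a strong opponent is worth far more than beating a weak one:
print(update(1700, 1650))  # near-peer win:          ~ (+13.7, -13.7)
print(update(1700, 1200))  # win over a weak player: ~ (+1.7, -1.7)
```

Because the exchange is zero-sum, multiplying it by some tournament coefficient only raises the variance; it cannot lift the average rating of the people playing.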
So aside from this, how should this weighting work?
Even stronger weighting by opponent? The "weighting" is the result of a Bayesian inversion formula that depends on the underlying probability model. It's not something that can simply be turned up; that is, there is no parameter encoding it. It's a much deeper mathematical concept, as the toy illustration below tries to show.
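For the mathematically curious, here is a grid-based toy version of that inversion. Again, this is not Aligulac's actual model; the point is only that the dependence on opponent strength falls out of Bayes' rule on its own, with no weight coefficient anywhere to tune:

```python
# Toy Bayesian skill update on a grid. An illustration of the idea,
# not Aligulac's actual model.
import numpy as np
from scipy.stats import norm

skills = np.linspace(1000, 2400, 2001)        # candidate skill values
prior = norm.pdf(skills, loc=1700, scale=60)  # belief before the game

def posterior_mean_after_win(opponent_rating, scale=200):
    """Posterior mean skill after one win, via Bayes' rule on the grid."""
    # Win probability for each candidate skill (logistic link, an assumption)
    p_win = 1 / (1 + 10 ** ((opponent_rating - skills) / scale))
    post = prior * p_win   # Bayes: posterior is prior times likelihood
    post /= post.sum()     # normalise (uniform grid, so sums suffice)
    return (skills * post).sum()

print(posterior_mean_after_win(1700))  # win vs a peer: mean shifts up noticeably
print(posterior_mean_after_win(1100))  # win vs a weak player: barely moves
```

The "weighting" is whatever the likelihood does to the prior; changing it means changing the probability model itself.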
Weighting by the mean rating of opponents in a round? Well, why should this be any better than weighting by the actual opponents faced, which is what we already do?
Weighting by prize pool? The theory goes that strong players are likely to "try harder" when the prize is higher. There is some merit to this idea, but there are also problems. Some tournaments offer prizes in equipment rather than money. Some offer qualification to a higher tier: there is no monetary prize in the GSL Up and Down groups, for example, but nobody would question the incentive to win there. There are also internal team incentives which are not generally public knowledge. And if a player knowingly plays weaker in some games, should that not be reflected in the ratings?
Weighting by tournament? These arguments usually involve some classification of events into tiers of importance, with a coefficient attached to each tier. This approach runs into the complexity problem: with five tiers (say), the model becomes far more complicated for what has not (yet, anyway) been shown to be a reasonable benefit.
Weighting online and offline games differently? Yes, this is a legitimate idea and probably the one closest to being implemented. We already have working experimental code with this feature.
Rating gap cap. The idea here is to prevent players from "farming" much lower rated players. It is possible to artificially inflate a player's rating if he never plays opponents close to his own skill. These proposals usually consist of ignoring matches where the players are further apart in rating than a given threshold. However, this can be seen as unfair to the lower rated players, whose wins against good players would be discarded. Capping the gap at a given value makes the problem worse. Say a 1700-rated player plays a 1000-rated player with a cap of 500, so that for the purposes of updating the stronger player's rating the opponent is treated as rated 1200. The update then expects a lower win rate than the player actually achieves against the real 1000-rated opponent, so it becomes easier for the 1700-rated player to overperform than it was before. The sketch below puts numbers on this.
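Reusing the toy Elo numbers from above (illustrative only, not our formula), the cap scenario looks like this:

```python
# Why capping the rating gap backfires, in toy Elo numbers.
def expected_score(r_a, r_b, scale=400):
    return 1 / (1 + 10 ** ((r_b - r_a) / scale))

k = 32
# Uncapped: the 1000-rated opponent counts as 1000.
gain_real = k * (1 - expected_score(1700, 1000))    # ~ +0.6 per win
# Capped at a 500-point gap: the opponent is treated as rated 1200.
gain_capped = k * (1 - expected_score(1700, 1200))  # ~ +1.7 per win

print(gain_real, gain_capped)
# The capped update expects a lower win rate than the 1700-rated player
# actually achieves against a real 1000-rated opponent, so each win now
# pays roughly three times as much. Farming gets easier, not harder.
```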
Remember: even though your idea sounds great, it isn't necessarily useful. We actually test many of the ideas proposed, and so far few of them have improved anything.
This week I bring you the write-up, courtesy of TheBB being busy with that PhD of his and my three-month holiday, which started yesterday. Now, before we get into this, remember that a difference of around 250 points or less means there is no statistically significant difference between two players. But who am I kidding, you just want to rage and yell at our stats, so here we go.
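(For those who do care about that rule of thumb, here is the back-of-the-envelope version. The 90-point standard deviation below is a made-up round figure, purely for illustration; the site actually tracks per-player uncertainties, and they vary.)

```python
# Rough sanity check of the "250 points" rule of thumb, assuming each
# rating carries a standard deviation of about 90 points (an invented
# figure for illustration; real per-player deviations vary).
from math import sqrt

sd = 90
sd_diff = sqrt(sd**2 + sd**2)   # standard deviation of a rating difference
threshold = 1.96 * sd_diff      # 95% significance cut-off

print(round(sd_diff), round(threshold))  # 127 249, i.e. close to 250

# Example: Life (1883) vs PartinG (1753) differ by 130 points,
# well inside the noise.
print(1883 - 1753 > threshold)  # False
```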
The list:
- Life 1883 (No change)
- LucifroN 1857 (+44: WCS EU, EMS, RSL gNations)
- ForGG 1840 (-23: WCS EU, ATC)
- Polt 1834 (-25: WCS AM, ZOTAC Cup)
- INnoVation 1827 (+84: WCS KR, Proleague)
- Flash 1804 (+11: Proleague)
- Leenock 1790 (-43: WCS KR, GSTL)
- viOLet 1763 (No change)
- Bomber 1760 (-28: WCS KR, GSTL)
- PartinG 1753 (-38: WCS KR, Proleague)
Honorable mentions: HyuN, Mvp, Rain, Soulkey, Symbol, First, RorO, sOs.
The biggest "winner" this period is definitely INnoVation, who is working his way up to that #1 spot. With Life not having played the last two weeks and ForGG and Polt losing points, the #1 spot looks up for grabs, especially since LucifroN didn't qualify for the WCS Season 1 Finals. LucifroN could also take it by "farming" the EU scene, but given the level of his opponents he would have to win a lot to get there. Hopefully INnoVation can claim #1 so we don't have a Lucifrocalypse on our hands.
WHY IS FORGG SO HIGHLY RANKED? YOUR SYSTEM SUCKS!!!!!
Well, glad you asked. ForGG is so highly ranked mainly because he has a pretty good TvT: at 52-5 in games and 29-2 in matches in HotS TvT, he has accumulated a great deal of points. With the WCS Season 1 Finals coming up, let's see how it measures up against the TvT of INnoVation (and maybe Ryung and aLive). INnoVation's ranking is in fact equally dominated by his stellar TvZ; we are still waiting for his TvP and TvT to catch up, in point-stealing, not in skill.
WHERE ARE THE REST OF THE KESPA PLAYERS? YOUR SYSTEM SUCKS!!!!!
As has been discussed many times, the KeSPA players are slowly but surely catching up. With their entry into the GSL/OSL/WCS, more and more points are transferred between scenes, and as such the "KeSPA point pool" grows. It is, however, an issue that most of these points end up with a few players who constantly beat everyone else (Flash, INnoVation, Soulkey, sOs, PartinG, Rain etc.), meaning that the non-aces are still heavily underrated. For now, we are still contemplating what to do about it.
The foreigner list:
- LucifroN 1857 (+44: WCS EU, EMS, RSL gNations)
- Sen 1638 (-46: WCS AM)
- Snute 1632 (+39: WCS AM, Norwegian Starleague, ZOTAC, ATC)
- VortiX 1629 (-24: WCS EU, EMS)
- Kas 1623 (-43: WCS EU, EMS, ZOTAC, Gigabyte Quals)
- Happy 1615 (-11: WCS EU, Ener J, EMS)
- Welmu 1609 (-20: WCS EU, Go4SC2)
- Jim 1596 (+13: WCS AM)
- Stephano 1579 (+4: WCS EU)
- Scarlett 1562 (-22: WCS AM, GSTL)
Honorable mentions: Nerchio, Dayshi, NaNiwa, Bunny, TLO, Bly, HasuObs.
The biggest winner on the foreigner list is LucifroN. Although he didn't qualify for the WCS Season 1 Finals, his performance over the last two weeks has still been solid and has given him a lot of points. Snute also did well in a number of smaller events, gaining points while most of the WCS EU players lost theirs.
Shout out this week to TheAmazombie, who has been slowly but steadily working his way through our old matches and categorizing them into events (which is the most boring job there is). Shout out to the new guys on our team; more programmers means more new features in the pipeline. Shout out to all the WCS EU hosts and casters, whose retweets have given us around 150 new Twitter followers :D. Shout out to Madals for giving great feedback on how to make the site easier and better to use for casters. As always, a special shout out to Kaelaris for shamelessly promoting our site during WCS EU <3. Finally, a shout out to Waxangel and the rest of the "The Pitchfork Preserver" crew for this gem.
If you want to get in contact with us or have suggestions, feel free to use:
- This thread.
- Twitter @Sc2Aligulac.
- PM me or TheBB.
- IRC: #aligulac on quakenet.
- E-mail to evfonn(at)gmail(dot)com.
- Issue list on GitHub.
- Pilgrimage to Zürich. TheBB will find you.
We will as always answer questions regarding the write-up in this thread.
Remember, this is not the greatest large dynamic paired-comparison experiment system, this is just a tribute.