|
monad: http://www.teamliquid.net/forum/viewmessage.php?topic_id=142211&currentpage=2#26
"No mined data. We don't have that for SC2. It is based on what we know about WoW Arena, what we know about SC2 ladder, and how Bayesian inference ranking systems work. This isn't necessarily how it works, though obviously I believe this is close to the truth."
So yeah, it's not particularly rigorous, to say the least. What I took away from the post were basically the hypotheses about how different observed behavior might be implemented in a Glicko- or TrueSkill-style system; i.e. a threshold rating for each division that takes into account ratings deviation, and the idea that the "favored" might be bugged for people who have a different real rating from their displayed rating.
I think the post was intended primarily as an introduction for people who haven't previously been exposed to the concept of a rating being paired with a certainty that improves or deteriorates over time.
I look forward to someone collecting some data regarding wins, losses, rating changes, and promotions, and making a model.
|
United States12224 Posts
On August 10 2010 00:07 Glacierz wrote: Assuming this is what Blizz has, it's probably reasonable to assume that people whose win/loss ratios differ significantly from 50% (let's assume 90%) will have a very high sigma. The system will keep matching these players against opponents with much higher MMR, and they will keep beating them, inflating sigma further until their MMR peaks at the top of the pack. This could partially explain why people with these win records are stuck in a certain league.
Well, yes and no, I think. We'll use CauthonLuck as an example. His MMR is pretty clearly at the very top, matching him against the best players on the ladder. Say he maintains his 85% win ratio against those top players, causing sigma to decrease because those players are below his MMR. However, because he occasionally loses against players who have a lower MMR, sigma increases and offsets each decrease. Sigma never decreases below the threshold required for promotion.
In an upset, sigma will increase for both players, but that's not universally true. If sigma is large enough, it can decrease instead. How large that is, we don't know. I'll edit that into the OP.
|
United States12224 Posts
On August 10 2010 04:24 catamorphist wrote: monad: http://www.teamliquid.net/forum/viewmessage.php?topic_id=142211&currentpage=2#26 "No mined data. We don't have that for SC2. It is based on what we know about WoW Arena, what we know about SC2 ladder, and how Bayesian inference ranking systems work. This isn't necessarily how it works, though obviously I believe this is close to the truth." So yeah, it's not particularly rigorous, to say the least. What I took away from the post were basically the hypotheses about how different observed behavior might be implemented in a Glicko- or TrueSkill-style system; i.e. a threshold rating for each division that takes into account ratings deviation, and the idea that the "favored" might be bugged for people who have a different real rating from their displayed rating. I think the post was intended primarily as an introduction for people who haven't previously been exposed to the concept of a rating being paired with a certainty that improves or deteriorates over time. I look forward to someone collecting some data regarding wins, losses, rating changes, and promotions, and making a model.
I've been tracking my 1v1 matches since I started playing yesterday (from 0 games). Once I get enough information I'll post it here.
|
Excalibur (or anyone, for that matter), can your model explain this interesting situation that I can't figure out? Player A is 39-27 and has 460 points in Platinum, while Player B is 38-28 and has 420 points in Platinum. Player A is 5th in his division, while Player B is 1st in a different division. Player B gets promoted to Diamond before Player A. Both players have had the game for the same amount of time, so their bonus pools should be about the same (if that has any effect on it; I doubt it).
|
On August 10 2010 04:39 Calidus wrote: Excalibur (or anyone, for that matter), can your model explain this interesting situation that I can't figure out? Player A is 39-27 and has 460 points in Platinum, while Player B is 38-28 and has 420 points in Platinum. Player A is 5th in his division, while Player B is 1st in a different division. Player B gets promoted to Diamond before Player A. Both players have had the game for the same amount of time, so their bonus pools should be about the same (if that has any effect on it; I doubt it).
If we're assuming there's a hidden rating, that's trivial. Displayed points don't necessarily have much connection to real rating, and a win/loss record obviously isn't a good indicator of rating either, so for all you know, Player B hit the threshold and Player A didn't.
|
United States12224 Posts
On August 10 2010 04:39 Calidus wrote: Excalibur (or anyone, for that matter), can your model explain this interesting situation that I can't figure out? Player A is 39-27 and has 460 points in Platinum, while Player B is 38-28 and has 420 points in Platinum. Player A is 5th in his division, while Player B is 1st in a different division. Player B gets promoted to Diamond before Player A. Both players have had the game for the same amount of time, so their bonus pools should be about the same (if that has any effect on it; I doubt it).
Completely dependent upon their match history and opponents. Was Player A more volatile in match outcome (did he lose matches he was expected to win and vice versa)? Did he go on a long losing streak followed by a win streak? Was Player B, by contrast, more stable and predictable in performance? Their MMRs, and equally important, their sigmas, may be drastically different, and we expect that difference to be a prime factor in promotion.
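To make the idea concrete, here is a minimal sketch of how a promotion check could combine MMR and sigma. Everything in it is an assumption for illustration: the names `HiddenRating` and `eligible_for_promotion`, the threshold value, the multiplier `k`, and the player numbers are all hypothetical, and nothing here is known to match what the ladder actually does. It borrows the "mean minus a few standard deviations" conservative estimate that TrueSkill-style systems use for leaderboards.

```python
from dataclasses import dataclass


@dataclass
class HiddenRating:
    mmr: float    # hidden matchmaking rating (mean of the skill estimate)
    sigma: float  # uncertainty around that rating


def conservative_rating(r: HiddenRating, k: float = 3.0) -> float:
    """Pessimistic skill estimate: mean minus k standard deviations (k is assumed)."""
    return r.mmr - k * r.sigma


def eligible_for_promotion(r: HiddenRating, league_threshold: float, k: float = 3.0) -> bool:
    """Hypothetical rule: promote only when even the pessimistic estimate clears the bar."""
    return conservative_rating(r, k) >= league_threshold


# Made-up numbers: A has the higher MMR but erratic results, B is steadier.
DIAMOND_THRESHOLD = 1500.0
player_a = HiddenRating(mmr=1800.0, sigma=120.0)
player_b = HiddenRating(mmr=1750.0, sigma=60.0)

print(eligible_for_promotion(player_a, DIAMOND_THRESHOLD))  # 1800 - 360 = 1440, below the bar
print(eligible_for_promotion(player_b, DIAMOND_THRESHOLD))  # 1750 - 180 = 1570, above the bar
```

Under a rule like this, a player with the higher MMR can still be the one left behind, purely because of a larger sigma.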
|
Ok, ty for your quick and elegant response.
Side note: I think you would make an excellent professor lol
|
On August 10 2010 04:06 monad wrote: Not trying to downplay your work here, but I don't see any evidence anywhere that it actually works this way. I know you said it's just speculative, but your speculation should be based on some sort of evidence, right? Something that led you to believe it works this way? Right now all I see is "I'm about to describe one of many possible ranking systems that SC2 may or may not use with equal probability".
This is a theory that attempts to explain the behavior seen on Battle.net. It attempts to provide a model with which we can understand how the ladder works. A major problem is that we don't have the data to verify this conjecture; unless we are given access to more data, much of this theory cannot be definitively proven. Even so, a decent working theory can be quite useful despite its speculative nature. I think I have been clear about what evidence I am basing this on: Blizzard's own statements about WoW, the available materials on a Bayesian system in a video game, the observations about the SC2 ladder, and Blizzard's track record of polishing existing systems instead of creating new ones from whole cloth. As a result, I don't think the system detailed here is merely one of many equally probable candidates, or that it is unlikely to be true. On the contrary, I think it is highly likely that SC2 uses a system very much like this one.
So, this theory may be incorrect in parts, and I would be quite interested in reading a different theory not based on bayesian inference with gaussian skill distributions if one exists.
|
On August 10 2010 05:53 vanick wrote: So, this theory may be incorrect in parts, and I would be quite interested in reading a different theory not based on bayesian inference with gaussian skill distributions if one exists.
What observations have you made about the ladder that would lead you to think that, for example, it can't be just a plain zero-sum Elo system? Or perhaps, as in chess, there's a big "provisional" period where the system amplifies rating changes as it tries to place you, and then your rating changes much less with each game after that; that could help explain people's wildly variable experience with getting promoted or not promoted, depending on how long the provisional period is. I mean, there are lots of different rating systems in place somewhere, and I don't really see any public data about the ladder that seems to disqualify any reasonable system.
I'm not really saying that you're likely to be wrong. I think it's reasonable to expect that they've carried over a system very similar to WoW's system, since they spent a lot of time and effort learning from their experiences with that. I wouldn't be shocked if it were different, though; you know programmers love to reinvent wheels.
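As a rough illustration of the "provisional period" idea, here is a sketch of an Elo-style update whose K factor is larger for a player's first few games. The specific values (K of 40 during the first 20 games, 16 afterwards) are assumptions chosen for the example, not figures from any particular federation or from SC2.

```python
def expected_score(rating: float, opponent: float) -> float:
    """Standard Elo expected score for `rating` against `opponent`."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def k_factor(games_played: int, provisional_games: int = 20) -> float:
    """Assumed schedule: large swings while provisional, small swings afterwards."""
    return 40.0 if games_played < provisional_games else 16.0


def update(rating: float, opponent: float, won: bool, games_played: int) -> float:
    """Return the player's new rating after one game."""
    score = 1.0 if won else 0.0
    return rating + k_factor(games_played) * (score - expected_score(rating, opponent))


# Same win against an equal opponent, very different rating change:
print(update(1500.0, 1500.0, won=True, games_played=0))   # provisional player
print(update(1500.0, 1500.0, won=True, games_played=50))  # established player
```

With a long enough provisional window, two players with similar recent results could move at very different speeds depending on how many games they had already played.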
|
United States12224 Posts
On August 10 2010 07:15 catamorphist wrote: On August 10 2010 05:53 vanick wrote: So, this theory may be incorrect in parts, and I would be quite interested in reading a different theory not based on bayesian inference with gaussian skill distributions if one exists. What observations have you made about the ladder that would lead you to think that, for example, it can't be just a plain zero-sum Elo system? Or perhaps, as in chess, there's a big "provisional" period where the system amplifies rating changes as it tries to place you, and then your rating changes much less with each game after that; that could help explain people's wildly variable experience with getting promoted or not promoted, depending on how long the provisional period is. I mean, there are lots of different rating systems in place somewhere, and I don't really see any public data about the ladder that seems to disqualify any reasonable system. I'm not really saying that you're likely to be wrong. I think it's reasonable to expect that they've carried over a system very similar to WoW's system, since they spent a lot of time and effort learning from their experiences with that. I wouldn't be shocked if it were different, though; you know programmers love to reinvent wheels.
It's definitely not zero sum/Elo because players receive and lose different point amounts.
WoW's system used to use Elo. Teams used to start at 1500 and that was their only rating. They changed to a Bayesian inference system when they realized it didn't have the problems that the Elo system did, and they explained the change over several forum posts and FAQs. Is the SC2 system an exact mirror of WoW's system? No, very clearly not, because there are parts of the SC2 system that are unique, such as promotions. However, it's similar enough fundamentally that it makes the system easier to understand.
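For comparison, this is what "zero sum" means in a plain Elo update: with a single shared K factor, whatever the winner gains the loser loses. The sketch below assumes K = 32 purely as an illustrative value; it is not taken from SC2 or WoW.

```python
def expected_score(rating: float, opponent: float) -> float:
    """Standard Elo expected score for `rating` against `opponent`."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def elo_update(winner: float, loser: float, k: float = 32.0) -> tuple[float, float]:
    """Return (new_winner, new_loser); the same delta is added and subtracted."""
    delta = k * (1.0 - expected_score(winner, loser))
    return winner + delta, loser - delta


# An upset: the 1500 player beats the 1700 player.
new_winner, new_loser = elo_update(1500.0, 1700.0)
print(new_winner - 1500.0, 1700.0 - new_loser)  # the two amounts are identical
print(new_winner + new_loser)                   # total rating is unchanged
```

If the two players in a match see different point amounts, a shared-K Elo update like this one cannot be the whole story.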
|
On August 10 2010 07:15 catamorphist wrote: On August 10 2010 05:53 vanick wrote: So, this theory may be incorrect in parts, and I would be quite interested in reading a different theory not based on bayesian inference with gaussian skill distributions if one exists. I'm not really saying that you're likely to be wrong. I think it's reasonable to expect that they've carried over a system very similar to WoW's system, since they spent a lot of time and effort learning from their experiences with that. I wouldn't be shocked if it were different, though; you know programmers love to reinvent wheels.
There's also the mantra that good programmers write good code, great programmers steal great code. Not saying it's one or the other, because I know a lot about reinventing the wheel.
As a follow-up to Excal's post: one problem Elo suffers from is that it gives players an incentive to change how they play instead of a pure incentive to win. Going back to the chess example, it often creates an incentive to play for a draw rather than a win. In addition, Elo has a problem with streaks, whether winning or losing. The system proposed here, like the one used by Arena, buffers against that.
|
I don't know how the system is set up, but I never get a good flow of games. I either go on 15-game winning streaks or 15-game losing streaks and just alternate between the two. Either I play people who are insanely ahead of me or I play scrubs, and there is no in-between.
|
O.O are you a Wizard? this is truly an amazing post, props to you for all the effort, just wish my brain could comprehend it better.
|
United States12224 Posts
On August 10 2010 11:05 VanGarde wrote: I don't know how the system is set up, but I never get a good flow of games. I either go on 15-game winning streaks or 15-game losing streaks and just alternate between the two. Either I play people who are insanely ahead of me or I play scrubs, and there is no in-between.
Can you verify? Can you post your in-game match history along with the profiles of people you've played, their rating and league, and the number of points lost or gained?
|
So I have a question: would it become harder to be promoted if you've played lots of games? Assume someone is in Silver, for instance, having played a large number of games (say 100, with a 50% win/loss ratio). If he were to start winning 70% (an arbitrary figure) of his games, would it be harder for him to get to Gold than for someone with a similar percentage but only 15 games played?
|
United States12224 Posts
On August 11 2010 08:58 Hekmat wrote: So I have a question: would it become harder to be promoted if you've played lots of games? Assume someone is in Silver, for instance, having played a large number of games (say 100, with a 50% win/loss ratio). If he were to start winning 70% (an arbitrary figure) of his games, would it be harder for him to get to Gold than for someone with a similar percentage but only 15 games played?
It would take longer, yes. If you've played 100 games and gone 50-50, your sigma is probably fairly small because the system feels confident that it's put you where you belong. If someone else has played 16 games and gone 8-8, that person's sigma is going to be larger. The exact scale is something that we don't know, but we do know that your MMR never truly gets "locked" in place (it's always changing to some degree after each win or loss). Depending on where you are in the ladder, you may need to play quite a few more games to increase your sigma before you can decrease it again within the threshold of a higher league, which would make you eligible for promotion.
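A toy model shows why sigma would be smaller after 100 games than after 16. It treats each game as one equally noisy observation of a fixed skill, so the uncertainty shrinks as games accumulate. The starting sigma of 350 and the per-game noise of 400 are made-up values, and the real system's scale is unknown; this only covers the "sigma shrinks with consistent play" part, not how sigma might grow again.

```python
import math


def sigma_after(games: int, sigma0: float = 350.0, game_noise: float = 400.0) -> float:
    """Posterior standard deviation after `games` independent Gaussian observations.

    Precision (1 / variance) adds up: the prior contributes 1/sigma0^2 and each
    game contributes 1/game_noise^2. Both parameters are illustrative assumptions.
    """
    precision = 1.0 / sigma0**2 + games / game_noise**2
    return math.sqrt(1.0 / precision)


for n in (0, 16, 100):
    print(n, round(sigma_after(n), 1))
```

In this toy model the 100-game player's estimate is much tighter than the 16-game player's, which is why the same run of wins would move it less.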
|
One thing to think about which I have not seen explicitly explained.
Player A is very good against Terran and Zerg but almost always loses to Protoss. Player B is consistently good against all three races. Players A and B each play 20 games and both go 15-5. The matchmaking system sets both players against 10 players of even 'skill' (i.e. MMR) and 10 of higher skill. Player B loses to 5 opponents of higher skill and beats the other 5 opponents of higher skill and all 10 of even skill. His sigma has decreased enough that he is promoted. Meanwhile, Player A beats all 10 higher-skill players (who by chance are all Terran and Zerg) and loses to 5 of even skill (all Protoss). His sigma remains high, because beating players of higher MMR whilst going 50/50 at even MMR makes the 'correct' MMR uncertain. Player A is not promoted. As a caveat, because of the bonus pool, Player A could additionally have a higher displayed rating than Player B. [For example, A gets (10+10) x 5 for his even-match wins, minus 10 x 5 for his even-match losses, plus (15+15) x 10 for his wins over higher-skill opponents = 350 pts. B gets (10+10) x 10 for his even-match wins, minus 5 x 5 for his losses to higher-skill opponents, plus (15+15) x 5 for his wins over higher-skill opponents = 325 pts.]
Or both players are equally good against all three races, and the matchmaking system puts them against equal numbers of each race. However, Player A is susceptible to cheese and happens to lose a few games to lower-MMR players cheesing him, which keeps his sigma high. Player B is also susceptible to cheese, but fortunately none of his opponents cheese him, and he only loses to players of higher MMR. They still both go 15-5, and Player B gets promoted whilst A does not.
Because the matchmaking system does not take into account race, build order, or anything else except wins and losses, this may go some way towards explaining how people get 'stuck' in Platinum with records like 30-6.
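The bracketed arithmetic can be checked with a few lines. The per-game point values are the ones used in the example (10 base + 10 bonus for an even win, 15 + 15 for a win over a higher-skill opponent, 10 lost for an even loss, 5 lost for a loss to a higher-skill opponent); they are illustrative figures from this post, not measured ladder values, and the variable names are mine.

```python
# Points per game as assumed in the example (base points + matching bonus pool).
EVEN_WIN = 10 + 10
HIGHER_WIN = 15 + 15
EVEN_LOSS = 10
HIGHER_LOSS = 5

# Player A: 5 even wins, 5 even losses, 10 wins over higher-skill opponents.
points_a = EVEN_WIN * 5 - EVEN_LOSS * 5 + HIGHER_WIN * 10

# Player B: 10 even wins, 5 losses to higher-skill opponents, 5 wins over them.
points_b = EVEN_WIN * 10 - HIGHER_LOSS * 5 + HIGHER_WIN * 5

print(points_a)  # 350
print(points_b)  # 325
```

So A ends with more displayed points than B on the same 15-5 record, even though B is the one promoted in the scenario.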
|
On August 11 2010 23:00 ThunderGod wrote:
Because the matchmaking system does not take into account race, build order, or anything else except wins and losses, this may go some way towards explaining how people get 'stuck' in Platinum with records like 30-6.
The obvious downside of this system is that it ONLY takes wins and losses into account. If you read about the TrueSkill system in the Microsoft paper, it says:
it merely assumes that the outcome is due to some unobserved performance that varies around the skill of a player. If one is playing a point based game and the winner beats all the other players by a factor of ten, that player’s victory will be scored no differently than if they had only won by a single point.
http://research.microsoft.com/en-us/projects/trueskill/details.aspx
Normally a player isn't equally good with all three races and in all matchups. One step in the right direction from Blizz would be to allow something like one character per race, for players who feel that their skill isn't equal across races.
Also, say that my TvP win ratio is 70% and my TvZ ratio for some reason is 30%. The system won't take this into account in my matchmaking and could (and possibly would?!) mark me as the favourite in too many TvZ matchups even though my win/loss ratio says otherwise.
I am by no means an expert in matchmaking systems but reading/seeing how the laddering behaves I think there is a lot of room for improvement.
|
United States12224 Posts
Made an important edit to the Matchmaking section per Vanick's post on the Battle.net forums:
In an upset, sigma does not always increase. That is, if a lower-MMR player wins, what happens depends a lot on the precise equations being used. If a player's sigma is large going into an upset (whether he's the winner or loser), it can decrease. That is because, given the right MMR and sigma values, it's possible in theory for the system to learn about that player's skill and rate him more accurately. If a player's sigma is small, however, it can become larger after an upset if that upset was truly unexpected.
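As one concrete example of how much this depends on the equations, here is a sketch of the published TrueSkill update for a two-player game with no draws. It is not Blizzard's formula, and the default scale (mu 25, sigma 25/3, beta 25/6) is TrueSkill's, not SC2's. In this particular update sigma shrinks after every game, and it shrinks more when the result is an upset; any growth in sigma would have to come from an extra term (such as noise added between games), which this sketch leaves out.

```python
import math


def pdf(x: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def cdf(x: float) -> float:
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def trueskill_1v1(winner, loser, beta: float = 25.0 / 6.0):
    """Update (mu, sigma) pairs for a decisive two-player game (no draw margin)."""
    mu_w, s_w = winner
    mu_l, s_l = loser
    c = math.sqrt(2.0 * beta**2 + s_w**2 + s_l**2)
    t = (mu_w - mu_l) / c
    v = pdf(t) / cdf(t)   # size of the mean shift; large when the result is a surprise
    w = v * (v + t)       # variance reduction factor, always between 0 and 1
    new_winner = (mu_w + s_w**2 / c * v, s_w * math.sqrt(1.0 - s_w**2 / c**2 * w))
    new_loser = (mu_l - s_l**2 / c * v, s_l * math.sqrt(1.0 - s_l**2 / c**2 * w))
    return new_winner, new_loser


SIGMA = 25.0 / 3.0
favorite, underdog = (30.0, SIGMA), (20.0, SIGMA)

expected_result, _ = trueskill_1v1(winner=favorite, loser=underdog)
upset_result, _ = trueskill_1v1(winner=underdog, loser=favorite)

print("winner sigma after expected win:", round(expected_result[1], 2))
print("winner sigma after upset win:   ", round(upset_result[1], 2))
```

A system where small sigmas can grow after an upset would need different or additional equations, which is consistent with the point that the outcome depends on details we can't see.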
|
On August 08 2010 12:41 SnakeChomp wrote: I don't think the displayed rating value has any bearing on match making at all.
Totally agree. From as low as 250 in Silver league I was matched against 450 Diamonds. At 500 in Gold I was matched against "slightly favorite" 550+ Diamonds, and then I was promoted straight from Gold to Diamond after a bunch of games against only Diamonds.
All this time I was gaining at least 20 points for each win (40 with bonus) and losing 1-3 points for any loss.
|