|
Sc2 Ladder-Balance-Data
If you want to help gather more data: MMR-Stats. Besides this small set of balance data, the program also does its main task: showing your real MMR!
This is the difference to the average MMR of my ladder data, per race. No more, no less.
You cannot tell from any game statistics whether the cause is game design or social aspects. Not you, not me, not Blizzard, not a single game designer! So we have the choice of paying for a global sociology study to find out (if you can call it that in sociology ^^) or just ASSUMING it comes from game design, like every game company does.
The data is biased towards EU/US and towards higher skill levels. +16/-16 MMR is the average change for a single win/loss on ladder. Result: Difference to average MMR:
MMR Filter: 2000 MMR+ (above master); TIME Filter: 1 Jul 2012 00:00:00 GMT - 31 Jul 2012 23:59:59 GMT; T: -15.77 Z: -0.77 P: 12.23
MMR Filter: 2000 MMR+ (above master); TIME Filter: 1 Aug 2012 00:00:00 GMT - 11 Aug 2012 11:47:54 GMT; T: -15.24 Z: -7.24 P: 17.76
MMR Filter: none; TIME Filter: 1 Jul 2012 00:00:00 GMT - 31 Jul 2012 23:59:59 GMT; T: -45.24 Z: 28.76 P: 6.76
MMR Filter: none; TIME Filter: 1 Aug 2012 00:00:00 GMT - 11 Aug 2012 11:47:54 GMT; T: -46.82 Z: 23.18 P: 14.18
Old/First Results
Source
Main Data
- The data is biased towards EU/US and towards higher skill levels.
Gamescount: 125976 Sc2-Accounts: 45203
- worst to best player: 3200 MMR
- one average win/loss on ladder: +16 / -16 MMR
TIME Filter: 13 Jun 2012 02:34:54 GMT - 15 Jul 2012 16:05:41 GMT Games Left: 109028 MMR Filter: Above Master Games Left: 19688
Average MMR per Race
TIME Filter: 13 Jun 2012 02:34:54 GMT - 15 Jul 2012 16:05:41 GMT Race account count: 15814 Data average MMR: 1539.46 Difference in average MMR between races: T-P: -62.14 T-Z: -117.03 P-Z: -54.89
TIME Filter: 13 Jun 2012 02:34:54 GMT - 15 Jul 2012 16:05:41 GMT MMR Filter: Above Master Race account count: 2840 Data average MMR: 2265.5 Difference in average MMR between races: T-P: -35.0 T-Z: -21.0 P-Z: 14.0
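A minimal sketch of how figures like the ones above could be derived, assuming a list of (race, last MMR) pairs with one entry per account. The account values are made up for illustration and the function names are hypothetical; this is not the program's actual code. It averages MMR per race, forms the pairwise differences in the same T-P, T-Z, P-Z order, and converts a gap into ladder games using the +16/-16 figure quoted earlier.

```python
from collections import defaultdict
from itertools import combinations

RACES = ("T", "P", "Z")
MMR_PER_GAME = 16  # average MMR change per ladder win/loss, as stated in the post


def race_averages(accounts):
    """Average the last known MMR per race; accounts is a list of (race, mmr)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for race, mmr in accounts:
        sums[race] += mmr
        counts[race] += 1
    return {race: sums[race] / counts[race] for race in sums}


def pairwise_differences(averages):
    """Differences between race averages, keyed 'T-P', 'T-Z', 'P-Z'."""
    return {f"{a}-{b}": averages[a] - averages[b] for a, b in combinations(RACES, 2)}


def in_games(mmr_difference):
    """Express an MMR gap as a number of average ladder wins."""
    return mmr_difference / MMR_PER_GAME


# Hypothetical accounts, not taken from the data file.
accounts = [("T", 1500), ("T", 1460), ("P", 1540), ("Z", 1600), ("Z", 1590)]
averages = race_averages(accounts)
for pair, diff in pairwise_differences(averages).items():
    print(f"{pair}: {diff:+.2f} MMR, about {in_games(diff):+.1f} games")
```

By construction T-Z always equals T-P plus P-Z, which also holds for the reported figures (-62.14 + -54.89 = -117.03). At 16 MMR per game, the unfiltered T-Z gap of -117.03 is roughly seven wins, and the above-master gap of -21.0 a little over one.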
Win-ratio per Race over Game-Time
Work:
Preamble
I thought for a long time about whether I should publish this data, because I know the balance mentality on TL / reddit / the SC2 forums. I have fought against balance whining on this forum since I joined it and even created a script to ignore balance whiners on TL. However, I had the data to calculate objective balance figures, and the balance threads will pop up either way.
Work
Some months ago, I created a program to calculate the hidden matchmaking rating (MMR). To find out more about MMR and the functions behind it, I included an upload function that uploads game data to our server. "Not that" and I used this data to learn more about MMR. We were able to solve most of its secrets and calculate it very accurately. One day I decided to upload the race value too, without thinking much about it. Then I realised that with the race value, the MMR and a lot of data, I can calculate the average MMR for a race. This post is about such a calculation.
- I took all users and opponents and calculated their MMR.
- After that, I created a list of bnet accounts with their last MMR and race.
- Over this list I took the average for each race!
These steps are very easy to understand but anything but easy to calculate. It took us over 3 months and hundreds of work hours to calculate accurate MMR.
Proof of Concept
1) Why average race MMR = balance data
We have seen many ways to calculate balance in the past. Some can be indicators, some are totally useless. Win/loss statistics of pro players were often used because they are easily accessible and can indicate balance problems. However, this data doesn't take into account how strong the players are relative to each other. The skill function behind MMR was invented to do exactly this. Arguments like "Race x players are stronger than race y players" are invalid, because if that is the case we can already call it imbalanced.
2) Why mistakes in the MMR calculation do or don't affect the result
First: the accuracy of my MMR calculation is very good. But I can be wrong on some points or for some users. However, nothing in the calculation takes the race into account. Theoretically, my MMR calculation could be race-biased without even knowing the race. However, at the moment I don't see any indicator of that. But I will watch it closely.
3) Why race populations don't change the result
Because I take the average. This point is obvious, but I'd better point it out.
4) Statistical independence
If you take the average of a race, you must make sure that you don't have any dependent factors in the data. So what can such a factor be?
1) The race/skill distribution of my program's user group does not represent the Battle.net user group.
There is no reason why 1. should be true. Also, 96% of the data comes from the opponents of my users and only 4% from the users themselves. So the data has a random user base, and we can exclude point 1.
2) The skill range of my users is not the skill range of Battle.net.
The users of my program have a much higher average skill than the bnet user group. Also, the opponent is always in the range of the user. So point 2 is true! We have to remember that our result is not valid for the whole ladder; it is only valid for our skill group. Diamond, Master and Grandmaster are overrepresented in my user group.
This means this data shows the balance at higher skill levels!
5) Proof of small deviation and significance
lolcanoe made a nice analysis of the data. Thanks for that!
On July 13 2012 07:41 lolcanoe wrote:
US DATA ONLY
Terran Average MMR, STD: 1559.214909, 546.131097
Protoss Average MMR, STD: 1620.764863, 509.5809733
Zerg Average MMR, STD: 1672.129547, 495.3121321
TWO SAMPLE T-TEST RESULTS
T vs Z: T-Stat = -5.693, P = .0000001386
P vs Z: T-Stat = -2.872, P = 0.00472
T vs P: T-Stat = -3.03, P = .00238
Histogram of T MMR for normality check: ![[image loading]](http://i.imgur.com/FNzvx.png)
Anderson-Darling Test for Normality (T only): ![[image loading]](http://i.imgur.com/7kVqA.png)
With a p slightly greater than .05, we cannot reject normality of the data. However, the weakness of this statistic indicates that normality should be scrutinized in the interpretation.
Assumptions
- MMR is an independent, fair indicator of skill.
- MMR is approximately normal.
- There is no sampling bias between races, however there is a sampling bias towards higher average skill.
- Cause-effect cannot be established by this test.
With over 99% confidence, we can reject the null hypothesis that the averages are equal in all 3 matchups. This is not surprising given the quantity of data, in addition to a maximum 7% difference between T and Z in average MMR. The data for T appears approximately normal, but the study does not conclusively show that MMR is normal.
6) Because some people have a problem understanding this:
- I calculate the imbalance of skill, not the reasons for this imbalance!
- I calculate the average skill of a race, not the general popularity of a race.
7) Data: Datafile
Concluding word
Please keep in mind that the imbalance result is very small and there will never be perfect balance. You only improve, in the long term, by ignoring the balance. Your race can be underpowered today and overpowered tomorrow.
Also, the users make most of the balance, not the game designer. But this is a different topic....
MMR distribution by races.
On July 11 2012 14:39 Not_That wrote:
Click for full version. ![[image loading]](http://s17.postimage.org/n60jmstyz/image.jpg)
Amount of players: 2014 Zerg, 1784 Protoss, 1516 Terran
Same graph normalized, each bar representing the percentage of players of each race in the bin: ![[image loading]](http://s10.postimage.org/aut44y479/image.jpg)
The server does matter, as MMR is not comparable across servers. I've decided to remove KR and SEA and keep EU and NA as they are closest to each other in terms of MMRs, and that's where most of our data comes from.
README before writing a long post about why you think this is not scientific statistical proof.
This is not a university paper about SC2 balance.
I don't get money for this.
I don't personally care which race is OP or not.
I publish the data I collected with my own program, which I wrote to back-calculate MMR. I found some very interesting anomalies in the race data, so I published the result here. If you want to do a more complex test or analysis with the data, feel free to do so! datafile
If you read the text carefully, I think you will agree that this is not perfect, but a much better method than TLPD win ratios or random tournament results.
If you want to discuss the method and the significance of the data, first read my OP and the analyses in it, plus the analyses and discussion by other people in this thread. I did the best I can and am willing to do to prove the significance of the data. If you misunderstand the result, that is not my problem.
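For anyone who wants to re-run a significance check like the two-sample t-test quoted from lolcanoe, here is a rough sketch using only the Python standard library. It assumes Welch's variant (unequal variances) and a normal approximation for the p-value, which is only reasonable for large samples; the quoted post does not say which variant was used. The samples are synthetic: their means and standard deviations follow the quoted US figures for Terran and Zerg, but the sample size of 1500 and the function name are assumptions.

```python
import math
import random
from statistics import NormalDist, mean, variance


def welch_t_test(sample_a, sample_b):
    """Return (t statistic, two-sided p-value) for the difference in means."""
    var_a = variance(sample_a) / len(sample_a)
    var_b = variance(sample_b) / len(sample_b)
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(var_a + var_b)
    # Normal approximation to the t distribution; reasonable only for large samples.
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, p


# Synthetic stand-in data, not the real data file. The sample size is an assumption.
rng = random.Random(0)
terran = [rng.gauss(1559.2, 546.1) for _ in range(1500)]
zerg = [rng.gauss(1672.1, 495.3) for _ in range(1500)]
t_stat, p_value = welch_t_test(terran, zerg)
print(f"T vs Z: t = {t_stat:.3f}, p = {p_value:.3g}")
```

A negative t statistic here means the first sample has the lower average, matching the sign convention of the quoted "T vs Z" result.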
GL & HF Skeletor
|
This is some excellent data from our own little ladder world. Good to see it all in one place!
|
Great work Skeldark. Thank you for contributing your hours to give us a better understanding of the situation.
|
Hmm, pretty cool.
Also, I highly approve of your nick. (The one you signed with at the end.)
|
Whoah, rechecked that, you have 149,000 games of data. And you are claiming 4% of that is you as well?
So you have 5900 games of your own in this?
And why did you run the random deviation tests with only 1,000 games, and not at least as many as the 149,000? (You actually should run random monte carlo's for whatever the estimated current userbase is to get some mock battle.net ladders from a perfectly balanced game.) I could easily pick 1,000 games out of your current data and show significant imbalance towards any of the three races.
Also what are the dates your data is from?
This is a cool idea... the deviation bit is just not nearly enough random games to be accurate. What program did you use to run these tests, and was it length of the test that prevented you from doing a few hundred thousand?
|
Excellent contribution, skeldark, as always.
I just hope the community doesn't take this for more than it is...
|
On July 11 2012 01:50 1st_Panzer_Div. wrote: How the hell did you get 45,000 games of data? Just really curious where you got that, it's a ton of data to work with.
Edit: 4% of that data is you?
So you have 1,800 of your own games in the data? Damn...
http://www.teamliquid.net/forum/viewmessage.php?topic_id=334561
----------------------------- Thanks for your time and effort man, I think that the mmr difference of only 3 victories is there mainly because TvZ/ZvT isn't the only matchup we have there at the moment, the imbalance for ZvT is probably larger and the balance for TvP and ZvP is probably fine, that results in these numbers I'd guess.
|
Ok time for me as a random player to start crying about balance. Buff random already Blizzard!!
|
On July 11 2012 01:34 skeldark wrote: The chance that -race independent- stronger players pick a specific race is near 0.
That's an unsupported statement. I don't know where I've heard it, but I'm pretty sure some Blizzard representative, maybe Josh, has explicitly stated that there is a preference for low level players to choose terran. Does it look the same if you exclude for example everyone below masters?
|
On July 11 2012 01:58 SDream wrote:
On July 11 2012 01:50 1st_Panzer_Div. wrote: How the hell did you get 45,000 games of data? Just really curious where you got that, it's a ton of data to work with.
Edit: 4% of that data is you?
So you have 1,800 of your own games in the data? Damn...
http://www.teamliquid.net/forum/viewmessage.php?topic_id=334561
----------------------------- Thanks for your time and effort man, I think that the mmr difference of only 3 victories is there mainly because TvZ/ZvT isn't the only matchup we have there at the moment, the imbalance for ZvT is probably larger and the balance for TvP and ZvP is probably fine, that results in these numbers I'd guess.
Project is from May 3rd. And I rechecked, 4% would be 5900 games.
And to show what that would do, his current claim is that terran is 2.7% lower mmr than they should be if the game is balanced. 4% of the games are from him personally.
Also see my above post asking why he only ran 1,000 random samples (or x4, so 4,000) and not at the very least 145,000, though the full 1v1 population of b.net would be the most accurate to run for his balanced random samples.
|
On July 11 2012 02:01 Malaz wrote: Ok time for me as a random player to start crying about balance. Buff random already Blizzard!!
I agree. If you go random your opponents should be set to 90% handicap. 
|
It would be interesting if you could divide this by league too.
|
Great work on your program!
Looking at your statistics I can come up with a few criticisms.
Namely you inherently have very strong response bias due to the data collection using your program. Additionally, your "Proof of Concept" section is rather confusing and I'm not sure I agree with your postulates there, namely numbers 1 and 3.
I think if you included a section with your calculations, findings and descriptions in a Methods section, your findings would be much easier to follow!
|
On July 11 2012 02:08 Stiluz wrote: It would be interesting if you could divide this by league too.
Or... Divide by zero ?!
|
This is very interesting. Thx for all the work!
Do you skeldark think it is possible to make a ladder/ total rank of the MMR data? Like sc2ranks just with MMR instead of points? I would be interested who is the best on the servers.
On July 11 2012 01:50 1st_Panzer_Div. wrote: 4% of the games are from him personally.
That's not what he is saying...
|
This shows us what everyone already knew. Terrans are winning less versus Zerg. Zergs are winning more versus Terran. Keep doing the data though. It would be interesting to see what it looks like after Terran has some sort of breakthrough.
Edit: Also, good work!
|
This is excellent work but could you please combine it with different skill ranges?
For instance top 2% of Protoss player compared to top 2% of Terran Players etc.
|
Calgary25967 Posts
On July 11 2012 02:03 Mendelfist wrote:
On July 11 2012 01:34 skeldark wrote: The chance that -race independent- stronger players pick a specific race is near 0.
That's an unsupported statement. I don't know where I've heard it, but I'm pretty sure some Blizzard representative, maybe Josh, has explicitly stated that there is a preference for low level players to choose terran. Does it look the same if you exclude for example everyone below masters? Agreed. The whole basis for this project is defeated by one realistic (in my eyes) claim that is dismissed. Also, why do we care about average balance? If Zerg is easier than Terran from Bronze - Masters, does it really matter to the members of this forum?
|
On July 11 2012 02:17 Chill wrote:
On July 11 2012 02:03 Mendelfist wrote:
On July 11 2012 01:34 skeldark wrote: The chance that -race independent- stronger players pick a specific race is near 0.
That's an unsupported statement. I don't know where I've heard it, but I'm pretty sure some Blizzard representative, maybe Josh, has explicitly stated that there is a preference for low level players to choose terran. Does it look the same if you exclude for example everyone below masters? Agreed. The whole basis for this project is defeated by one realistic (in my eyes) claim that is dismissed. Also, why do we care about average balance? If Zerg is easier than Terran from Bronze - Masters, does it really matter to the members of this forum? The fact that low level players might have a preference to choose Terran is not the same as saying that low level talented players have a tendency to choose Terran. If it's just low level players in general choosing Terran, then the average will be sustained by the fact that more untalented and more talented players will be choosing Terran. So the OP's claim is correct, because he qualified it by saying "stronger" players aren't more likely to choose Terran in the sense that they're no more likely to choose it than weak players.
|
On July 11 2012 02:17 Chill wrote:
On July 11 2012 02:03 Mendelfist wrote:
On July 11 2012 01:34 skeldark wrote: The chance that -race independent- stronger players pick a specific race is near 0.
That's an unsupported statement. I don't know where I've heard it, but I'm pretty sure some Blizzard representative, maybe Josh, has explicitly stated that there is a preference for low level players to choose terran. Does it look the same if you exclude for example everyone below masters? Agreed. The whole basis for this project is defeated by one realistic (in my eyes) claim that is dismissed. Also, why do we care about average balance? If Zerg is easier than Terran from Bronze - Masters, does it really matter to the members of this forum? As far as i know he only/mainly uses data from diamond, master, grandmaster. These are the only players whose MMR is accurate.
|
I read it. I can't say I understood all of it, but I have one question.
If the race you are playing is underpowered and MMR is designed to match you with opponents who on average have the same MMR as you, would you continuously be matched up with opponents who are technically worse than you but are compensated by the more powerful race?
Put it this way (possibly a bit simplified, but I hope you understand what I'm getting at)
MMR = Skill * race power
Winrate = 50% regardless of your MMR.
|
Data:
I currently have 106374 games. When I talk about my users I don't mean myself, I mean: http://www.teamliquid.net/forum/viewmessage.php?topic_id=334561 For this calculation I use different users: I have 60,000 different MMR values of ladder users worldwide. For most of them I don't know the race, because I added the race in a later version. So the race data covers about 9000 SC2 accounts. I get nearly 5000 games daily, with about 1000-2000 valid MMR results for new accounts.
Matchups: I don't look at single game results. I look at the skill rating of the whole account and which main race this account plays. So I cannot say anything about specific matchups!
On July 11 2012 02:03 Mendelfist wrote:
On July 11 2012 01:34 skeldark wrote: The chance that -race independent- stronger players pick a specific race is near 0.
That's an unsupported statement. I don't know where I've heard it, but I'm pretty sure some Blizzard representative, maybe Josh, has explicitly stated that there is a preference for low level players to choose terran. Does it look the same if you exclude for example everyone below masters?
My data includes bronze to GM. But most of the accounts I track are diamond - GM. With more data in the future I could do this for each league and server.
On July 11 2012 01:50 1st_Panzer_Div. wrote: Whoah, rechecked that, you have 149,000 games of data. And you are claiming 4% of that is you as well?
Of my users, not of myself.
And why did you run the random deviation tests with only 1,000 games, and not at least as many as the 149,000? (You actually should run random monte carlo's for whatever the estimated current userbase is to get some mock battle.net ladders from a perfectly balanced game.) I could easily pick 1,000 games out of your current data and show significant imbalance towards any of the three races.
Also what are the dates your data is from?
This is a cool idea... the deviation bit is just not nearly enough random games to be accurate. What program did you use to run these tests, and was it length of the test that prevented you from doing a few hundred thousand?
The program I use is self-written. Before running a test I have to analyse the MMR of all accounts; this alone takes 1 hour. Also, it does not matter how often I run the random values at all. It would be enough to do it 100 times.
I could easily pick 1,000 games out of your current data and show significant imbalance towards any of the three races.
Yes, you could, and if you check the result, you see that the chance of this happening randomly in my case is under 0.0001%.
should run random monte carlo's for whatever the estimated current userbase .....
Ahhhhhhh... Monte Carlo methods can be wrong. Statistics can be wrong... I don't see any other reason why you mention them. Also, you don't say "random Monte Carlo". This family of algorithms is called Monte Carlo because they all work with randomness...
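The "mock ladders from a perfectly balanced game" idea discussed above can be sketched as a permutation test: shuffle accounts between two races many times and count how often chance alone produces a gap at least as large as the observed one. This is only an illustration under assumptions; the thread does not describe how the original random deviation test was implemented, and the function name, sample values and `rounds` parameter are hypothetical.

```python
import random
from statistics import mean


def permutation_p_value(mmr_a, mmr_b, rounds=1000, seed=0):
    """Estimate how often a random relabelling gives a gap >= the observed one."""
    rng = random.Random(seed)
    observed = abs(mean(mmr_a) - mean(mmr_b))
    pooled = list(mmr_a) + list(mmr_b)
    split = len(mmr_a)
    hits = 0
    for _ in range(rounds):
        rng.shuffle(pooled)  # race labels carry no information under the null
        gap = abs(mean(pooled[:split]) - mean(pooled[split:]))
        if gap >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (rounds + 1)


# Hypothetical MMR values for two races, for illustration only.
race_a = [1500 + 10 * i for i in range(30)]
race_b = [1560 + 10 * i for i in range(30)]
print(f"estimated p-value: {permutation_p_value(race_a, race_b):.4f}")
```

With this approach the smallest p-value that can be reported is 1 / (rounds + 1), so the number of rounds limits how small a probability the test itself can demonstrate.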
On July 11 2012 02:17 Chill wrote:
On July 11 2012 02:03 Mendelfist wrote:
On July 11 2012 01:34 skeldark wrote: The chance that -race independent- stronger players pick a specific race is near 0.
That's an unsupported statement. I don't know where I've heard it, but I'm pretty sure some Blizzard representative, maybe Josh, has explicitly stated that there is a preference for low level players to choose terran. Does it look the same if you exclude for example everyone below masters? Agreed. The whole basis for this project is defeated by one realistic (in my eyes) claim that is dismissed. Also, why do we care about average balance? If Zerg is easier than Terran from Bronze - Masters, does it really matter to the members of this forum?
Then ANY balance is dismissed in your eyes. This is not a question about my data, this is a question about the definition of balance! Like I said in my post, the data is very biased towards skill. The average of the ladder is gold. The average of my data is high diamond! Also, last time I checked, everyone on ladder cared about balance at his level ^^
On July 11 2012 02:23 Moonalisa wrote: I read it I cant say I understood all of it but I have one question.
If the race you are playing is under powered and MMR is created to match you with an opponent which have on an average the same MMR as you, would you continiously be matched up with opponents that technically are worse than you but compensated by the more powerful race?
Put it this way (possibly a bit simplified, but I hope you understand where Im getting at)
MMR = Skill * race power
Winrate = 50% regardless of your MMR.
Skill rating is your skill. This means if you are underpowered, you have less skill, so you play against less skilled players. If they are from an overpowered race, then they are, race-independent, worse than you. Yes.
|
On July 11 2012 02:19 OrbitalPlane wrote:
On July 11 2012 02:17 Chill wrote:
On July 11 2012 02:03 Mendelfist wrote:
On July 11 2012 01:34 skeldark wrote: The chance that -race independent- stronger players pick a specific race is near 0.
That's an unsupported statement. I don't know where I've heard it, but I'm pretty sure some Blizzard representative, maybe Josh, has explicitly stated that there is a preference for low level players to choose terran. Does it look the same if you exclude for example everyone below masters? Agreed. The whole basis for this project is defeated by one realistic (in my eyes) claim that is dismissed. Also, why do we care about average balance? If Zerg is easier than Terran from Bronze - Masters, does it really matter to the members of this forum? As far as i know he only/mainly uses data from diamond, master, grandmaster. These are the only players whose MMR is accurate. He said "mainly". But he doesn't say what ratio.
|
On July 11 2012 02:29 Chill wrote:
On July 11 2012 02:19 OrbitalPlane wrote:
On July 11 2012 02:17 Chill wrote:
On July 11 2012 02:03 Mendelfist wrote:
On July 11 2012 01:34 skeldark wrote: The chance that -race independent- stronger players pick a specific race is near 0.
That's an unsupported statement. I don't know where I've heard it, but I'm pretty sure some Blizzard representative, maybe Josh, has explicitly stated that there is a preference for low level players to choose terran. Does it look the same if you exclude for example everyone below masters? Agreed. The whole basis for this project is defeated by one realistic (in my eyes) claim that is dismissed. Also, why do we care about average balance? If Zerg is easier than Terran from Bronze - Masters, does it really matter to the members of this forum? As far as i know he only/mainly uses data from diamond, master, grandmaster. These are the only players whose MMR is accurate. He said "mainly". But he doesn't say what ratio.
"The average of the ladder is gold. The average of my data is high diamond!"
http://www.teamliquid.net/forum/viewmessage.php?topic_id=351786&currentpage=2#24
|
On July 11 2012 02:18 Shiori wrote:
On July 11 2012 02:17 Chill wrote:
On July 11 2012 02:03 Mendelfist wrote:
On July 11 2012 01:34 skeldark wrote: The chance that -race independent- stronger players pick a specific race is near 0.
That's an unsupported statement. I don't know where I've heard it, but I'm pretty sure some Blizzard representative, maybe Josh, has explicitly stated that there is a preference for low level players to choose terran. Does it look the same if you exclude for example everyone below masters?
Agreed. The whole basis for this project is defeated by one realistic (in my eyes) claim that is dismissed. Also, why do we care about average balance? If Zerg is easier than Terran from Bronze - Masters, does it really matter to the members of this forum?
The fact that low level players might have a preference to choose Terran is not the same as saying that low level talented players have a tendency to choose Terran. If it's just low level players in general choosing Terran, then the average will be sustained by the fact that more untalented and more talented players will be choosing Terran. So the OP's claim is correct, because he qualified it by saying "stronger" players aren't more likely to choose Terran in the sense that they're no more likely to choose it than weak players.
We can't just dismiss it. Imagine, for whatever reason, that there is a strong bias for new players to automatically choose Terran. The remaining players try all the races and determine which of the three fit their styles, making them more likely to win. Because you can imagine a situation where Zerg and Protoss average win rates are higher than Terran it must be addressed.
|
Posts like these are what makes TL so exceptional. Very nice OP, most impressive analysis.
|
On July 11 2012 02:11 Huragius wrote:
On July 11 2012 02:08 Stiluz wrote: It would be interesting if you could divide this by league too.
Or... Divide by zero ?!
Let's not collapse the universe until HotS is out, at least.
|
On July 11 2012 02:25 skeldark wrote: Then ANY balance is dismissed in your eyes. This is not a question about my data, this is a question about the definition of balance! Like I said in my post, the data is very biased towards skill. The average of the ladder is gold. The average of my data is high diamond! Also, last time I checked, everyone on ladder cared about balance at his level ^^
I guess so. I'm just wondering what effects from these results you want. Common sense says that balance at the highest level can be different from the average balance. If this is just for information then it's good to know, but I think a more focused approach would be more useful.
|
This thread is an embarrassment. I think it's been adequately explained *why* it would be impossible to balance for all levels, but tbh as a low level player I couldn't really care less. Obviously MMR and matchmaking mean that I don't lose more than I win, but I'm not more mad if the game is imbalanced and I lose to a "less skilled" player, I don't see why it has any meaning. A good analogy would perhaps be martial arts. A guy who's a lot bigger and stronger than me has an obvious advantage in a fight, he is "imba". But I know that if I improve my skills I can still beat him, and thats what I get the satisfaction from, not from ensuring I only fight guys my exact height and weight. Similarly in SC2 just winning more and knowing that I'm getting better is where the satisfaction comes from. Besides, until high diamond macro is by far the most important reason people suck. Improve your macro and stop whining about balance.
|
On July 11 2012 02:31 Chill wrote:
On July 11 2012 02:18 Shiori wrote:
On July 11 2012 02:17 Chill wrote:
On July 11 2012 02:03 Mendelfist wrote:
On July 11 2012 01:34 skeldark wrote: The chance that -race independent- stronger players pick a specific race is near 0.
That's an unsupported statement. I don't know where I've heard it, but I'm pretty sure some Blizzard representative, maybe Josh, has explicitly stated that there is a preference for low level players to choose terran. Does it look the same if you exclude for example everyone below masters? Agreed. The whole basis for this project is defeated by one realistic (in my eyes) claim that is dismissed. Also, why do we care about average balance? If Zerg is easier than Terran from Bronze - Masters, does it really matter to the members of this forum? The fact that low level players might have a preference to choose Terran is not the same as saying that low level talented players have a tendency to choose Terran. If it's just low level players in general choosing Terran, then the average will be sustained by the fact that more untalented and more talented players will be choosing Terran. So the OP's claim is correct, because he qualified it by saying "stronger" players aren't more likely to choose Terran in the sense that they're no more likely to choose it than weak players. We can't just dimiss it. Imagine, for whatever reason, that there is a strong bias for new players to automatically choose Terran. The remaining players try all the races and determine which of the three fit their styles, making them more likely to win. Because you can imagine a situation where Zerg and Protoss average win rates are higher than Terran it must be addressed. That sounds incredibly unlikely in the sense that you're suggesting, though. When Blizzard says that more new players choose Terran, they don't say or even suggest that those people choose their race in a different way than the people who choose other races. It's entirely possible (and likely, I'd say) that most people pick Terran because they're the protagonists and because they're human beings. 
The people who pick Zerg/Protoss at the noob level don't know enough about their "styles" to make a choice that really affects their ability to win, because it's not actually very clear from the start what the styles of the races even are.
The people who pick P/Z are probably motivated by the same thing that motivates players to pick Terran: they think the race is cool. It just so happens that there are fewer of them because aliens are less appealing than humans.
Besides, even if there were a strong bias for new players to automatically choose Terran, talented RTS players are talented RTS players, and weak RTS players are weak. I don't believe that players have an inbuilt magical bias to one race that influences them so much that they'd be incapable of playing the other races at a high level. If Terran is just more appealing to human beings in general, then it's going to attract all sorts with random distribution, meaning that the average Terran player isn't going to be any better than the average P/Z player because it's just a larger sample but is still evenly distributed.
Until you can show that there's a race which has a baffling number of good players but almost no bad players, the point is moot.
|
On July 11 2012 02:18 Shiori wrote:
On July 11 2012 02:17 Chill wrote:
On July 11 2012 02:03 Mendelfist wrote:
On July 11 2012 01:34 skeldark wrote: The chance that -race independent- stronger players pick a specific race is near 0.
That's an unsupported statement. I don't know where I've heard it, but I'm pretty sure some Blizzard representative, maybe Josh, has explicitly stated that there is a preference for low level players to choose terran. Does it look the same if you exclude for example everyone below masters? Agreed. The whole basis for this project is defeated by one realistic (in my eyes) claim that is dismissed. Also, why do we care about average balance? If Zerg is easier than Terran from Bronze - Masters, does it really matter to the members of this forum? The fact that low level players might have a preference to choose Terran is not the same as saying that low level talented players have a tendency to choose Terran. If it's just low level players in general choosing Terran, then the average will be sustained by the fact that more untalented and more talented players will be choosing Terran. So the OP's claim is correct, because he qualified it by saying "stronger" players aren't more likely to choose Terran in the sense that they're no more likely to choose it than weak players. I don't follow you here. The point is that new players choose terran because of the campaign, and some of them later in their career switch race. The switch is important, because that WILL cause an over-representation of terrans at lower MMR. It has been known for a long time, for example by looking at sc2ranks statistics. There is no easy way to remove these types of biases from any data that we have. We also don't know how far up the leagues this bias persists. Is it only in bronze/silver? I have no idea.
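The race-switching argument above can be illustrated with a toy simulation. Every number in it (share of newcomers, campaign bias, MMR means and spread) is invented for illustration, and the model itself is an assumption, not something measured from the data: newcomers have lower MMR on average and disproportionately start as Terran, while everyone else picks a race at random. Race has no effect on MMR in this model, yet the Terran average still comes out lower.

```python
import random


def simulate_race_averages(players=30000, newcomer_share=0.3,
                           campaign_bias=0.6, seed=1):
    """Average MMR per race in a model where race never affects MMR."""
    rng = random.Random(seed)
    by_race = {"T": [], "P": [], "Z": []}
    for _ in range(players):
        newcomer = rng.random() < newcomer_share
        # Hypothetical MMR distributions: newcomers are weaker on average.
        mmr = rng.gauss(1200 if newcomer else 1700, 400)
        if newcomer and rng.random() < campaign_bias:
            race = "T"  # picked Terran because of the campaign
        else:
            race = rng.choice("TPZ")  # no preference
        by_race[race].append(mmr)
    return {race: sum(values) / len(values) for race, values in by_race.items()}


for race, average in simulate_race_averages().items():
    print(f"{race}: {average:.0f}")
```

Whether an effect like this is present in the real data, and how far up the leagues it would reach, is exactly the open question raised in the post; the sketch only shows that such a bias can produce a gap without any imbalance.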
|
On July 11 2012 02:35 Chill wrote: On July 11 2012 02:25 skeldark wrote: Than ANY balance is dismissed in your eyes. This is not a question about my data this is a question about the definition of balance! like i said in my post the data is very biased toward skill. The average of the ladder is gold. The average of my data is high diamond! Also last time i check everyone on ladder cared for balance at his level ^^ I guess so. I'm just wondering what effects from these results you want. Common sense says that balance at the highest level can be different from the average balance. If this is just for information then it's good to know, but I think a more focused approach would be more useful.
Read my first statement in the thread. I don't want anything. The last thing I want to do is talk about how to balance or whine about units. I got the data, I publish the result. That's all. What to do with it, everyone can decide for themselves.
About balance at different skill levels: at the moment I get enough data to double the data I used here in 1 week. So in 2-3 weeks I should have enough to do it for master+ only.
On July 11 2012 02:40 Mendelfist wrote:Show nested quote +On July 11 2012 02:18 Shiori wrote:On July 11 2012 02:17 Chill wrote:On July 11 2012 02:03 Mendelfist wrote:On July 11 2012 01:34 skeldark wrote: The chance that -race independent- stronger players pick a specific race is near 0.
That's an unsupported statement. I don't know where I've heard it, but I'm pretty sure some Blizzard representative, maybe Josh, has explicitly stated that there is a preference for low level players to choose terran. Does it look the same if you exclude for example everyone below masters? Agreed. The whole basis for this project is defeated by one realistic (in my eyes) claim that is dismissed. Also, why do we care about average balance? If Zerg is easier than Terran from Bronze - Masters, does it really matter to the members of this forum? The fact that low level players might have a preference to choose Terran is not the same as saying that low level talented players have a tendency to choose Terran. If it's just low level players in general choosing Terran, then the average will be sustained by the fact that more untalented and more talented players will be choosing Terran. So the OP's claim is correct, because he qualified it by saying "stronger" players aren't more likely to choose Terran in the sense that they're no more likely to choose it than weak players. I don't follow you here. The point is that new players choose terran because of the campaign, and some of them later in their career switch race. The switch is important, because that WILL cause an over-representation of terrans at lower MMR. It has been known for a long time, for example by looking at sc2ranks statistics. There is no easy way to remove these types of biases from any data that we have. We also don't know how far up the leagues this bias persists. Is it only in bronze/silver? I have no idea.
There is one way to remove this type of bias. It's very complicated, but I'll try to explain it. You have groups a and b of different sizes but want to compare them. First you add up all the values of a and all the values of b. Now comes the trick: divide each sum by the size of its group! And suddenly it doesn't matter how many there are in each group any more. The miracle of the average....
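A minimal sketch of the averaging described above, in Python. The MMR values and group sizes are made up for illustration and are not taken from the ladder data, and `average_mmr` is a hypothetical helper name. It only shows that a group average stays the same when the group gets larger with the same mix of players.

```python
# Hypothetical MMR values, not taken from the real ladder data.
group_a = [1400, 1500, 1600]        # small group
group_b = [1400, 1500, 1600] * 10   # ten times as many players, same mix

def average_mmr(values):
    """Sum of all values divided by how many there are."""
    return sum(values) / len(values)

avg_a = average_mmr(group_a)
avg_b = average_mmr(group_b)
print(avg_a, avg_b)  # both averages are 1500.0
```

Averaging takes out the effect of group size. Whether it also takes out an effect of which players end up in which group is the question the race-switching argument raises.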
This has nothing to do with my data directly, but about this whole balance and high skill question: go on ladder and play 3 games. In at least one of them a guy cries about balance. And he means the balance of YOUR games, not the pro game he saw yesterday! And you know who these players are? Most of them have an account on this site! So don't act like TL only cares about pro balance... They say they do when they post, but most of them only care about the balance of their own games. And I understand that (not the whiners on ladder, though...). In the end, SC2 is our game, not the game of the pro players. I know some people are here only to watch pro players. I'm here because I like the game. For myself!
But like I said, that's off topic.
I agree it's interesting to see the balance at different levels, and once I have more data I will do this for each league and server. Also the changes that patches bring!
|
On July 11 2012 02:35 Chill wrote:Show nested quote +On July 11 2012 02:25 skeldark wrote: Than ANY balance is dismissed in your eyes. This is not a question about my data this is a question about the definition of balance! like i said in my post the data is very biased toward skill. The average of the ladder is gold. The average of my data is high diamond! Also last time i check everyone on ladder cared for balance at his level ^^ I guess so. I'm just wondering what effects from these results you want. Common sense says that balance at the highest level can be different from the average balance. If this is just for information then it's good to know, but I think a more focused approach would be more useful.
The reason why balance at the highest level is emphasized in common sense is the (quite valid imo) assumption that these players make the fewest mistakes, so balance discrepancies are more easily distinguished from player errors when looking at a win/loss scenario.
But in a random sample, such variables do get ironed out. Assuming perfect (or close to perfect) balance, a diamond Terran will, on average, make no more mistakes than a diamond Zerg. That makes this result valid.
I'm open to any input if I've made an error here.
|
On July 11 2012 02:41 skeldark wrote: About balance on skilllevels. At the moment i get enough data that i double the data i used here in 1 week. So in 2-3 weeks i should have enough to do it only for master+ That would be very interesting. Please keep up the good work, and don't take our criticism too hard. This is how science is made, after all. :-)
|
On July 11 2012 02:40 Mendelfist wrote:Show nested quote +On July 11 2012 02:18 Shiori wrote:On July 11 2012 02:17 Chill wrote:On July 11 2012 02:03 Mendelfist wrote:On July 11 2012 01:34 skeldark wrote: The chance that -race independent- stronger players pick a specific race is near 0.
That's an unsupported statement. I don't know where I've heard it, but I'm pretty sure some Blizzard representative, maybe Josh, has explicitly stated that there is a preference for low level players to choose terran. Does it look the same if you exclude for example everyone below masters? Agreed. The whole basis for this project is defeated by one realistic (in my eyes) claim that is dismissed. Also, why do we care about average balance? If Zerg is easier than Terran from Bronze - Masters, does it really matter to the members of this forum? The fact that low level players might have a preference to choose Terran is not the same as saying that low level talented players have a tendency to choose Terran. If it's just low level players in general choosing Terran, then the average will be sustained by the fact that more untalented and more talented players will be choosing Terran. So the OP's claim is correct, because he qualified it by saying "stronger" players aren't more likely to choose Terran in the sense that they're no more likely to choose it than weak players. I don't follow you here. The point is that new players choose terran because of the campaign, and some of them later in their career switch race. The switch is important, because that WILL cause an over-representation of terrans at lower MMR. It has been known for a long time, for example by looking at sc2ranks statistics. There is no easy way to remove these types of biases from any data that we have. We also don't know how far up the leagues this bias persists. Is it only in bronze/silver? I have no idea.
The bias you're addressing is (presumably) independent of skill. The over representation of Terran at lower MMR has no bearing on an analysis correlating skill by race with MMR. He's not taking a snapshot of race at all skill levels and saying the Terran average MMR is lower, he's analyzing winrates between races but factoring in the "hidden" MMR rating. In other words, a 50% TvZ winrate may at face value appear balanced, but if the average MMR of the Terrans in that sample is statistically significantly higher than that of the Zergs, it suggests imbalance.
OP please correct me if I'm mistaken.
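The argument above can be illustrated with a small Python sketch. The game records and the `matchup_summary` helper are hypothetical. This is not the OP's program, and it assumes nothing about how Blizzard computes MMR. It only puts the raw winrate next to the average MMR gap in the same sample.

```python
# Hypothetical TvZ game records: (terran_mmr, zerg_mmr, terran_won).
games = [
    (2100, 2000, True),
    (2150, 2050, False),
    (2080, 1990, True),
    (2120, 2010, False),
]

def matchup_summary(records):
    """Return (terran_winrate, mean_mmr_gap) for a list of TvZ games."""
    n = len(records)
    winrate = sum(1 for _, _, won in records if won) / n
    gap = sum(t - z for t, z, _ in records) / n   # positive: Terrans rated higher
    return winrate, gap

winrate, gap = matchup_summary(games)
print(winrate, gap)
```

In this made-up sample the Terrans win half of the games while being rated 100 MMR higher on average, which is the situation the post describes as suggesting imbalance despite a 50% winrate.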
|
On July 11 2012 02:41 skeldark wrote:Show nested quote +On July 11 2012 02:35 Chill wrote:On July 11 2012 02:25 skeldark wrote: Than ANY balance is dismissed in your eyes. This is not a question about my data this is a question about the definition of balance! like i said in my post the data is very biased toward skill. The average of the ladder is gold. The average of my data is high diamond! Also last time i check everyone on ladder cared for balance at his level ^^ I guess so. I'm just wondering what effects from these results you want. Common sense says that balance at the highest level can be different from the average balance. If this is just for information then it's good to know, but I think a more focused approach would be more useful. Read my frist statement in the thread. I dont want anything. The last thing i want to do is to talk about how to blance or whine about units. I got the data, i publish the result. Thats all. What to do with it can everyone decide for themself. About balance on skilllevels. At the moment i get enough data that i double the data i used here in 1 week. So in 2-3 weeks i should have enough to do it only for master+ Show nested quote +On July 11 2012 02:40 Mendelfist wrote:On July 11 2012 02:18 Shiori wrote:On July 11 2012 02:17 Chill wrote:On July 11 2012 02:03 Mendelfist wrote:On July 11 2012 01:34 skeldark wrote: The chance that -race independent- stronger players pick a specific race is near 0.
That's an unsupported statement. I don't know where I've heard it, but I'm pretty sure some Blizzard representative, maybe Josh, has explicitly stated that there is a preference for low level players to choose terran. Does it look the same if you exclude for example everyone below masters? Agreed. The whole basis for this project is defeated by one realistic (in my eyes) claim that is dismissed. Also, why do we care about average balance? If Zerg is easier than Terran from Bronze - Masters, does it really matter to the members of this forum? The fact that low level players might have a preference to choose Terran is not the same as saying that low level talented players have a tendency to choose Terran. If it's just low level players in general choosing Terran, then the average will be sustained by the fact that more untalented and more talented players will be choosing Terran. So the OP's claim is correct, because he qualified it by saying "stronger" players aren't more likely to choose Terran in the sense that they're no more likely to choose it than weak players. I don't follow you here. The point is that new players choose terran because of the campaign, and some of them later in their career switch race. The switch is important, because that WILL cause an over-representation of terrans at lower MMR. It has been known for a long time, for example by looking at sc2ranks statistics. There is no easy way to remove these types of biases from any data that we have. We also don't know how far up the leagues this bias persists. Is it only in bronze/silver? I have no idea. There is one way to remove this type of biases. Its very complicated but i try to explain it. you have different amount of group a and b but want to compare them. You first add all the values of a and all values of b. now comes the trick: divide them by the amount! And suddenly it dont care how many there are in the group any more. The miracle of average.... 
In that case I don't think you understood me. You ignored my sentence about race switching.
Take 100 coins. Place them on the table so that 70% are heads and 30% are tails. Then grab a random number of them and throw them up in the air. You will now most likely find fewer than 70% heads and more than 30% tails. This is what's happening on the ladder.
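The coin example can be worked through as an expectation, sketched here in Python. It assumes fair coins and that the grabbed coins are picked at random; `expected_heads` is a hypothetical helper name.

```python
def expected_heads(total, heads, flipped):
    """Expected number of heads after re-throwing `flipped` randomly chosen coins.

    Untouched coins keep the original share of heads on average;
    re-thrown fair coins land heads half of the time.
    """
    untouched_heads = heads * (total - flipped) / total
    rethrown_heads = flipped / 2
    return untouched_heads + rethrown_heads

for k in (0, 20, 50, 100):
    print(k, expected_heads(100, 70, k))
```

Under these assumptions the expectation is 70 - 0.2k heads for k re-thrown coins, so re-throwing 20 coins already moves it from 70 to 66.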
|
On July 11 2012 02:51 sevencck wrote:Show nested quote +On July 11 2012 02:40 Mendelfist wrote:On July 11 2012 02:18 Shiori wrote:On July 11 2012 02:17 Chill wrote:On July 11 2012 02:03 Mendelfist wrote:On July 11 2012 01:34 skeldark wrote: The chance that -race independent- stronger players pick a specific race is near 0.
That's an unsupported statement. I don't know where I've heard it, but I'm pretty sure some Blizzard representative, maybe Josh, has explicitly stated that there is a preference for low level players to choose terran. Does it look the same if you exclude for example everyone below masters? Agreed. The whole basis for this project is defeated by one realistic (in my eyes) claim that is dismissed. Also, why do we care about average balance? If Zerg is easier than Terran from Bronze - Masters, does it really matter to the members of this forum? The fact that low level players might have a preference to choose Terran is not the same as saying that low level talented players have a tendency to choose Terran. If it's just low level players in general choosing Terran, then the average will be sustained by the fact that more untalented and more talented players will be choosing Terran. So the OP's claim is correct, because he qualified it by saying "stronger" players aren't more likely to choose Terran in the sense that they're no more likely to choose it than weak players. I don't follow you here. The point is that new players choose terran because of the campaign, and some of them later in their career switch race. The switch is important, because that WILL cause an over-representation of terrans at lower MMR. It has been known for a long time, for example by looking at sc2ranks statistics. There is no easy way to remove these types of biases from any data that we have. We also don't know how far up the leagues this bias persists. Is it only in bronze/silver? I have no idea. The bias you're addressing is (presumably) independent of skill.
No, it's not, because the bias is dependent on time (in any given time period there is always a chance that a player will switch race), and MMR is also time-dependent. Therefore you will find a correlation between MMR and race.
|
On July 11 2012 02:51 sevencck wrote:Show nested quote +On July 11 2012 02:40 Mendelfist wrote:On July 11 2012 02:18 Shiori wrote:On July 11 2012 02:17 Chill wrote:On July 11 2012 02:03 Mendelfist wrote:On July 11 2012 01:34 skeldark wrote: The chance that -race independent- stronger players pick a specific race is near 0.
That's an unsupported statement. I don't know where I've heard it, but I'm pretty sure some Blizzard representative, maybe Josh, has explicitly stated that there is a preference for low level players to choose terran. Does it look the same if you exclude for example everyone below masters? Agreed. The whole basis for this project is defeated by one realistic (in my eyes) claim that is dismissed. Also, why do we care about average balance? If Zerg is easier than Terran from Bronze - Masters, does it really matter to the members of this forum? The fact that low level players might have a preference to choose Terran is not the same as saying that low level talented players have a tendency to choose Terran. If it's just low level players in general choosing Terran, then the average will be sustained by the fact that more untalented and more talented players will be choosing Terran. So the OP's claim is correct, because he qualified it by saying "stronger" players aren't more likely to choose Terran in the sense that they're no more likely to choose it than weak players. I don't follow you here. The point is that new players choose terran because of the campaign, and some of them later in their career switch race. The switch is important, because that WILL cause an over-representation of terrans at lower MMR. It has been known for a long time, for example by looking at sc2ranks statistics. There is no easy way to remove these types of biases from any data that we have. We also don't know how far up the leagues this bias persists. Is it only in bronze/silver? I have no idea. The bias you're addressing is (presumably) independent of skill. The over representation of Terran at lower MMR has no bearing on an analysis correlating skill by race with MMR. 
He's not taking a snapshot of race at all skill levels and saying the Terran average MMR is lower (that would be easy to do, and wouldn't prove anything since the ladder is biased for 50% anyway), he's analyzing winrates between races but factoring in the "hidden" MMR rating. In other words, a 50% TvZ winrate may at face value appear balanced, but if the average MMR of the Terrans in that sample is statistically significantly higher than that of the Zergs, it suggests imbalance. OP please correct me if I'm mistaken.
No, that's basically it. Mendelfist, I don't throw these coins. What I do is take the average weight of the 70% heads coins and the average weight of the 30% tails coins. And then I say: the coins that show heads are 10g heavier than the coins that show tails. I don't care how many of each are on the table.
I saw some posts about winrates. The advantage of this method is: I ignore winrates! I use Blizzard's skill function, the same method they use to get ladder balance data!
Another way to get pro trends is to watch the MMR change of the highest players on ladder, broken down by race.
|
Nice work,
We need more people like you, and less idiots that talk about shit they don't know.
|
On July 11 2012 03:06 Ambre wrote: Nice work,
We need more people like you, and less idiots that talk about shit they don't know.
Unfortunately the way the world works, we're getting less of the former while the latter are breeding like rats.
|
On July 11 2012 02:58 skeldark wrote:Show nested quote +On July 11 2012 02:51 sevencck wrote:On July 11 2012 02:40 Mendelfist wrote:On July 11 2012 02:18 Shiori wrote:On July 11 2012 02:17 Chill wrote:On July 11 2012 02:03 Mendelfist wrote:On July 11 2012 01:34 skeldark wrote: The chance that -race independent- stronger players pick a specific race is near 0.
That's an unsupported statement. I don't know where I've heard it, but I'm pretty sure some Blizzard representative, maybe Josh, has explicitly stated that there is a preference for low level players to choose terran. Does it look the same if you exclude for example everyone below masters? Agreed. The whole basis for this project is defeated by one realistic (in my eyes) claim that is dismissed. Also, why do we care about average balance? If Zerg is easier than Terran from Bronze - Masters, does it really matter to the members of this forum? The fact that low level players might have a preference to choose Terran is not the same as saying that low level talented players have a tendency to choose Terran. If it's just low level players in general choosing Terran, then the average will be sustained by the fact that more untalented and more talented players will be choosing Terran. So the OP's claim is correct, because he qualified it by saying "stronger" players aren't more likely to choose Terran in the sense that they're no more likely to choose it than weak players. I don't follow you here. The point is that new players choose terran because of the campaign, and some of them later in their career switch race. The switch is important, because that WILL cause an over-representation of terrans at lower MMR. It has been known for a long time, for example by looking at sc2ranks statistics. There is no easy way to remove these types of biases from any data that we have. We also don't know how far up the leagues this bias persists. Is it only in bronze/silver? I have no idea. The bias you're addressing is (presumably) independent of skill. The over representation of Terran at lower MMR has no bearing on an analysis correlating skill by race with MMR. 
He's not taking a snapshot of race at all skill levels and saying the Terran average MMR is lower (that would be easy to do, and wouldn't prove anything since the ladder is biased for 50% anyway), he's analyzing winrates between races but factoring in the "hidden" MMR rating. In other words, a 50% TvZ winrate may at face value appear balanced, but if the average MMR of the Terrans in that sample is statistically significantly higher than that of the Zergs, it suggests imbalance. OP please correct me if I'm mistaken. No. thats basic it. Mendelfist i dont throw this coins. What i do is: take the avg weight of the 70% head coins and the avg weight of the the 30% tail coins . And than i say : the coins that show head are 10g heavier than the coins that show tail. I don't care how many of each are on the table.
No, but the ladder throws coins (edit: or rather the users of the ladder). I don't understand how you can't see this. Imagine that when you buy the game you don't even have a choice. You are forced to choose terran. Then when you have played for a month the other choices open up. It's pretty clear then that low MMR's will be over-represented by terrans, right?
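The thought experiment above can be put into a small simulation, sketched here in Python. Everything in it is an assumption made for illustration: the switch chance, the MMR gained per month, the noise, and the `simulate` helper itself. None of it comes from ladder data. Race has no effect on MMR in this model, so it shows only that such a mechanism could lower the Terran average without any imbalance, not that it does so on the real ladder.

```python
import random

def simulate(players=20000, switch_chance=0.1, seed=1):
    """Everyone starts as Terran; MMR grows with time played, not with race.

    All numbers are made up for the thought experiment.
    """
    rng = random.Random(seed)
    mmr_by_race = {"T": [], "other": []}
    for _ in range(players):
        months = rng.randint(0, 23)                    # time since buying the game
        mmr = 1000 + 30 * months + rng.gauss(0, 100)   # improves with practice only
        # One chance per month played to leave Terran for good.
        switched = any(rng.random() < switch_chance for _ in range(months))
        mmr_by_race["other" if switched else "T"].append(mmr)
    return {race: sum(v) / len(v) for race, v in mmr_by_race.items()}

averages = simulate()
print(averages)
```

In this model the remaining Terrans are on average newer accounts, so their average MMR comes out lower than that of the players who switched.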
|
On July 11 2012 03:09 Mendelfist wrote:Show nested quote +On July 11 2012 02:58 skeldark wrote:On July 11 2012 02:51 sevencck wrote:On July 11 2012 02:40 Mendelfist wrote:On July 11 2012 02:18 Shiori wrote:On July 11 2012 02:17 Chill wrote:On July 11 2012 02:03 Mendelfist wrote:On July 11 2012 01:34 skeldark wrote: The chance that -race independent- stronger players pick a specific race is near 0.
That's an unsupported statement. I don't know where I've heard it, but I'm pretty sure some Blizzard representative, maybe Josh, has explicitly stated that there is a preference for low level players to choose terran. Does it look the same if you exclude for example everyone below masters? Agreed. The whole basis for this project is defeated by one realistic (in my eyes) claim that is dismissed. Also, why do we care about average balance? If Zerg is easier than Terran from Bronze - Masters, does it really matter to the members of this forum? The fact that low level players might have a preference to choose Terran is not the same as saying that low level talented players have a tendency to choose Terran. If it's just low level players in general choosing Terran, then the average will be sustained by the fact that more untalented and more talented players will be choosing Terran. So the OP's claim is correct, because he qualified it by saying "stronger" players aren't more likely to choose Terran in the sense that they're no more likely to choose it than weak players. I don't follow you here. The point is that new players choose terran because of the campaign, and some of them later in their career switch race. The switch is important, because that WILL cause an over-representation of terrans at lower MMR. It has been known for a long time, for example by looking at sc2ranks statistics. There is no easy way to remove these types of biases from any data that we have. We also don't know how far up the leagues this bias persists. Is it only in bronze/silver? I have no idea. The bias you're addressing is (presumably) independent of skill. The over representation of Terran at lower MMR has no bearing on an analysis correlating skill by race with MMR. 
He's not taking a snapshot of race at all skill levels and saying the Terran average MMR is lower (that would be easy to do, and wouldn't prove anything since the ladder is biased for 50% anyway), he's analyzing winrates between races but factoring in the "hidden" MMR rating. In other words, a 50% TvZ winrate may at face value appear balanced, but if the average MMR of the Terrans in that sample is statistically significantly higher than that of the Zergs, it suggests imbalance. OP please correct me if I'm mistaken. No. thats basic it. Mendelfist i dont throw this coins. What i do is: take the avg weight of the 70% head coins and the avg weight of the the 30% tail coins . And than i say : the coins that show head are 10g heavier than the coins that show tail. I don't care how many of each are on the table. No, but the ladder throws coins. I don't understand how you can't see this. Imagine that when you buy the game you don't even have a choice. You are forced to choose terran. Then when you have played for a month the other choices open up. It's pretty clear then that low MMR's will be over-represented by terrans, right? Except there's absolutely no evidence that this is occurring.
|
On July 11 2012 03:09 Mendelfist wrote:Show nested quote +On July 11 2012 02:58 skeldark wrote:On July 11 2012 02:51 sevencck wrote:On July 11 2012 02:40 Mendelfist wrote:On July 11 2012 02:18 Shiori wrote:On July 11 2012 02:17 Chill wrote:On July 11 2012 02:03 Mendelfist wrote:On July 11 2012 01:34 skeldark wrote: The chance that -race independent- stronger players pick a specific race is near 0.
That's an unsupported statement. I don't know where I've heard it, but I'm pretty sure some Blizzard representative, maybe Josh, has explicitly stated that there is a preference for low level players to choose terran. Does it look the same if you exclude for example everyone below masters? Agreed. The whole basis for this project is defeated by one realistic (in my eyes) claim that is dismissed. Also, why do we care about average balance? If Zerg is easier than Terran from Bronze - Masters, does it really matter to the members of this forum? The fact that low level players might have a preference to choose Terran is not the same as saying that low level talented players have a tendency to choose Terran. If it's just low level players in general choosing Terran, then the average will be sustained by the fact that more untalented and more talented players will be choosing Terran. So the OP's claim is correct, because he qualified it by saying "stronger" players aren't more likely to choose Terran in the sense that they're no more likely to choose it than weak players. I don't follow you here. The point is that new players choose terran because of the campaign, and some of them later in their career switch race. The switch is important, because that WILL cause an over-representation of terrans at lower MMR. It has been known for a long time, for example by looking at sc2ranks statistics. There is no easy way to remove these types of biases from any data that we have. We also don't know how far up the leagues this bias persists. Is it only in bronze/silver? I have no idea. The bias you're addressing is (presumably) independent of skill. The over representation of Terran at lower MMR has no bearing on an analysis correlating skill by race with MMR. 
He's not taking a snapshot of race at all skill levels and saying the Terran average MMR is lower (that would be easy to do, and wouldn't prove anything since the ladder is biased for 50% anyway), he's analyzing winrates between races but factoring in the "hidden" MMR rating. In other words, a 50% TvZ winrate may at face value appear balanced, but if the average MMR of the Terrans in that sample is statistically significantly higher than that of the Zergs, it suggests imbalance. OP please correct me if I'm mistaken. No. thats basic it. Mendelfist i dont throw this coins. What i do is: take the avg weight of the 70% head coins and the avg weight of the the 30% tail coins . And than i say : the coins that show head are 10g heavier than the coins that show tail. I don't care how many of each are on the table. No, but the ladder throws coins. I don't understand how you can't see this. Imagine that when you buy the game you don't even have a choice. You are forced to choose terran. Then when you have played for a month the other choices open up. It's pretty clear then that low MMR's will be over-represented by terrans, right? As his data is already biased to the users of his MMR calculation (and their opponents), we can rule out the "I'm so new I know only terran crowd". If that's not enough for you, a cut-off eliminating all data points below diamond league should do it, right?
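A cut-off like the one suggested above could be applied before averaging, as in this Python sketch. The `race_difference` helper, the sample accounts and the threshold value are hypothetical. The real MMR boundary for diamond league is not given in this thread, so 2000 is only a placeholder.

```python
def race_difference(accounts, min_mmr=None):
    """Difference between each race's average MMR and the overall average.

    `accounts` is a list of (race, mmr) pairs; `min_mmr` is an optional cut-off.
    """
    if min_mmr is not None:
        accounts = [(race, mmr) for race, mmr in accounts if mmr >= min_mmr]
    overall = sum(mmr for _, mmr in accounts) / len(accounts)
    by_race = {}
    for race, mmr in accounts:
        by_race.setdefault(race, []).append(mmr)
    return {race: sum(v) / len(v) - overall for race, v in by_race.items()}

# Hypothetical accounts, not real ladder data.
sample = [("T", 900), ("T", 1000), ("T", 2100),
          ("Z", 2000), ("Z", 2200), ("P", 2300)]

unfiltered = race_difference(sample)
filtered = race_difference(sample, min_mmr=2000)
print(unfiltered)
print(filtered)
```

In this made-up sample the two low-rated Terran accounts pull the Terran average far below the others; with the cut-off applied they are excluded and the gap shrinks.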
|
On July 11 2012 02:17 Chill wrote:Show nested quote +On July 11 2012 02:03 Mendelfist wrote:On July 11 2012 01:34 skeldark wrote: The chance that -race independent- stronger players pick a specific race is near 0.
That's an unsupported statement. I don't know where I've heard it, but I'm pretty sure some Blizzard representative, maybe Josh, has explicitly stated that there is a preference for low level players to choose terran. Does it look the same if you exclude for example everyone below masters? Agreed. The whole basis for this project is defeated by one realistic (in my eyes) claim that is dismissed. Also, why do we care about average balance? If Zerg is easier than Terran from Bronze - Masters, does it really matter to the members of this forum?
It's only taken from Diamond, Masters, and Grandmasters of US/EU, he said that in his conclusion.
|
On July 11 2012 03:12 Shiori wrote:Show nested quote +On July 11 2012 03:09 Mendelfist wrote:On July 11 2012 02:58 skeldark wrote:On July 11 2012 02:51 sevencck wrote:On July 11 2012 02:40 Mendelfist wrote:On July 11 2012 02:18 Shiori wrote:On July 11 2012 02:17 Chill wrote:On July 11 2012 02:03 Mendelfist wrote:On July 11 2012 01:34 skeldark wrote: The chance that -race independent- stronger players pick a specific race is near 0.
That's an unsupported statement. I don't know where I've heard it, but I'm pretty sure some Blizzard representative, maybe Josh, has explicitly stated that there is a preference for low level players to choose terran. Does it look the same if you exclude for example everyone below masters? Agreed. The whole basis for this project is defeated by one realistic (in my eyes) claim that is dismissed. Also, why do we care about average balance? If Zerg is easier than Terran from Bronze - Masters, does it really matter to the members of this forum? The fact that low level players might have a preference to choose Terran is not the same as saying that low level talented players have a tendency to choose Terran. If it's just low level players in general choosing Terran, then the average will be sustained by the fact that more untalented and more talented players will be choosing Terran. So the OP's claim is correct, because he qualified it by saying "stronger" players aren't more likely to choose Terran in the sense that they're no more likely to choose it than weak players. I don't follow you here. The point is that new players choose terran because of the campaign, and some of them later in their career switch race. The switch is important, because that WILL cause an over-representation of terrans at lower MMR. It has been known for a long time, for example by looking at sc2ranks statistics. There is no easy way to remove these types of biases from any data that we have. We also don't know how far up the leagues this bias persists. Is it only in bronze/silver? I have no idea. The bias you're addressing is (presumably) independent of skill. The over representation of Terran at lower MMR has no bearing on an analysis correlating skill by race with MMR. 
He's not taking a snapshot of race at all skill levels and saying the Terran average MMR is lower (that would be easy to do, and wouldn't prove anything since the ladder is biased for 50% anyway), he's analyzing winrates between races but factoring in the "hidden" MMR rating. In other words, a 50% TvZ winrate may at face value appear balanced, but if the average MMR of the Terrans in that sample is statistically significantly higher than that of the Zergs, it suggests imbalance. OP please correct me if I'm mistaken. No. thats basic it. Mendelfist i dont throw this coins. What i do is: take the avg weight of the 70% head coins and the avg weight of the the 30% tail coins . And than i say : the coins that show head are 10g heavier than the coins that show tail. I don't care how many of each are on the table. No, but the ladder throws coins. I don't understand how you can't see this. Imagine that when you buy the game you don't even have a choice. You are forced to choose terran. Then when you have played for a month the other choices open up. It's pretty clear then that low MMR's will be over-represented by terrans, right? Except there's absolutely no evidence that this is occurring.
Do you really mean that I need evidence that new players who just bought the game often choose terran as their first race?
|
I think ladder balance is important. It really lowers the amount of fun I have when I play against clearly worse players and still only have a 50% WR.
|
Is it possible to create a graph displaying your balance data/figures vs skill level (MMR is a nice blizzard proxy for that or perhaps per level [bronze, platinum, etc])?
|
On July 11 2012 03:16 Mendelfist wrote: Do you really mean that I need evidence that new players who just bought the game often choose terran as their first race?
You need evidence that new players who just bought the game choose Terran as their race only to switch a few months into playing, and you need to contrast it with evidence showing that players who pick P/Z after buying switch less or not at all.
|
On July 11 2012 03:09 Mendelfist wrote: No, but the ladder throws coins. I don't understand how you can't see this. Imagine that when you buy the game you don't even have a choice. You are forced to choose terran. Then when you have played for a month the other choices open up. It's pretty clear then that low MMR's will be over-represented by terrans, right?
On July 11 2012 03:12 Shiori wrote: Except there's absolutely no evidence that this is occurring.
The point is valid. That would lower the Terran average MMR.
But:
A) http://sc2ranks.com/stats/league/all/1/all
B) Even if it were so, I have so few bronze to gold users, and they don't ladder much, so I have even fewer bronze to gold opponents. I think I could filter them out and not lose many accounts.
C) I still get the right imbalance data, because this would be built-in imbalance! To show you this point: imagine you were not allowed to play Z anymore once you had been top 1k on the KR ladder. We would not see any Zerg players in big tournaments, so the game would be imbalanced! It is impossible to calculate these factors out correctly. Impossible for every system, not only the one I use. No method could do something against this, because it is valid imbalance data.
|
Hmm this is pretty interesting. I'm still a little bit skeptical because you didn't get the data from an SRS from all battle.net players. However, the sample data is probably going to be good enough.
I completely agree that the "race x has better players" argument doesn't hold in high sample sizes.
|
On July 11 2012 03:12 Shiori wrote: Except there's absolutely no evidence that this is occurring.
There is not a lot of evidence to chew on in the OP either. I see a ton of numbers with no league basis or any grounded data for me to latch on to. The MMR stats do not even tell me who is in what league or any information on the players themselves, beyond their primary race. They do not even show if they were off-racing in that specific match. The data itself is filled with players who have only played one game in a specific time frame.
I am not sure what to think of the findings, but my efforts to dig into his methods have not yielded the results I was expecting.
|
On July 11 2012 03:15 furerkip wrote: It's only taken from Diamond, Masters, and Grandmasters of US/EU, he said that in his conclusion.
Mainly, not only!
On July 11 2012 03:17 archonOOid wrote: is it possible to create a graph displaying your balance data vs the skill level (MMR is a nice blizzard proxy for that or perhaps per level [bronze, platinum, etc])?
Need more dataaaaaaaaaaa. Help me and send it: http://www.teamliquid.net/forum/viewmessage.php?topic_id=351748
|
On July 11 2012 03:14 Thrombozyt wrote: As his data is already biased to the users of his MMR calculation (and their opponents), we can rule out the "I'm so new I know only terran" crowd. If that's not enough for you, a cut-off eliminating all data points below diamond league should do it, right?
No, we can't rule that out. His data includes all leagues. And as I already said, if there is a bias for beginners to choose terran, there is no easy way to tell how high this bias persists. My gut feeling tells me if we include only masters and up any remaining bias would be utterly negligible, but that's only a feeling. I have no numbers to back it up. It depends on how often people race switch, for example.
|
I must say that this is some impressive work!
|
On July 11 2012 02:31 Chill wrote: We can't just dismiss it. Imagine, for whatever reason, that there is a strong bias for new players to automatically choose Terran. The remaining players try all the races and determine which of the three fit their styles, making them more likely to win. Because you can imagine a situation where Zerg and Protoss average win rates are higher than Terran it must be addressed.
how about the fact that, this season, Terran is the least represented race in GM in the history of the league? Isn't GM purely based on MMR?
|
On July 11 2012 02:58 skeldark wrote: No. thats basic it. Mendelfist i dont throw this coins. What i do is: take the avg weight of the 70% head coins and the avg weight of the 30% tail coins. And than i say: the coins that show head are 10g heavier than the coins that show tail. I don't care how many of each are on the table.
Wait. Was this supposed to make me laugh? Cuz I did.
I laughed real hard.
|
On July 11 2012 03:20 Plansix wrote: There is not a lot of evidence to chew on in the OP either. I see a ton of numbers with no league basis or any grounded data for me to latch on to. The MMR stats do not even tell me who is in what league or any information on the players themselves, beyond their primary race. They do not even show if they were off-racing in that specific match. The data itself is filled with players who have only played one game in a specific time frame. I am not sure what to think of the findings, but my efforts to dig into his methods have not yielded the results I was expecting.
The data is linked. If you want the full data over all games, you can find it here: http://www.teamliquid.net/forum/viewmessage.php?topic_id=334561
I show you the MMR. This is way more accurate and correct than the blinking league icons, which come with an MMR range of over +-600.
I don't show single games! I show SC2 accounts, not games, in this analysis! How many games one player has is totally unimportant! The MMR shows the skill of the account.
On July 11 2012 03:23 TsGBruzze wrote: I must say that this is some impressive work!
Thank you!
On July 11 2012 03:28 danl9rm wrote: Wait. Was this supposed to make me laugh? Cuz I did. I laughed real hard.
Depends on which part made you laugh.
@mendel http://sc2ranks.com/stats/league/all/1/all so we can end this discussion. The "everyone is Terran" situation in the low leagues was in the first 6 months of SC2. Not any more.
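For readers following the argument, the per-race figure being discussed (average MMR of a race's accounts minus the average over all accounts) can be sketched roughly as below. This is only an illustration under assumptions, not the actual MMR tool: the function name `race_offsets`, the sample accounts, and the choice to weight every account equally are all hypothetical.

```python
from collections import defaultdict


def race_offsets(accounts):
    """Return {race: average MMR of that race minus overall average MMR}.

    `accounts` is an iterable of (race, mmr) pairs, one entry per account
    (hypothetical input format; games are deliberately not counted).
    """
    accounts = list(accounts)
    overall = sum(mmr for _, mmr in accounts) / len(accounts)

    by_race = defaultdict(list)
    for race, mmr in accounts:
        by_race[race].append(mmr)

    return {race: sum(values) / len(values) - overall
            for race, values in by_race.items()}


# Made-up accounts, only to show the shape of the calculation.
sample = [("T", 2000.0), ("T", 2100.0), ("Z", 2200.0), ("P", 2300.0)]
offsets = race_offsets(sample)
for race in sorted(offsets):
    print(race, round(offsets[race], 2))
```

Each account counts once here regardless of how many games it played. How the real data handles accounts that play several races is not stated in the thread, so the sketch ignores that case.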
|
On July 11 2012 03:17 Shiori wrote: You need evidence for new players who just bought the game and choose Terran as their race only to switch a few months into playing and contrast it with evidence showing that players who pick P/Z after buying switch less or not at all.
There is no need for P/Z to switch less for this phenomenon to occur. Try the experiment with the coins. If there is an initial bias for whatever reason, and people then randomly switch race for whatever reason, it's unlikely that the switches will preserve the initial bias. This will cause a change of race distribution over time. That's all it takes.
|
On July 11 2012 03:29 Mendelfist wrote: There is no need for P/Z switching less for this phenomenon to occur. Try the experiment with the coins. If there is an initial bias for whatever reason, and people then randomly switch race for whatever reason, it's unlikely that the switches will preserve the initial bias. This will cause a change of race distribution over time. That's all it takes.
Which only matters if some of the coins are more valuable than others and only the more valuable ones are showing up, so to speak.
|
United States, 4991 Posts
It's interesting data, but is it possible to get the same results of people over (say) 2200 MMR? That's a bit above the master league cutoff IIRC (would exclude "diamond / low master"), and it would probably help address a lot of the complaints that lower leagues may be biasing it / are irrelevant to the question of balance.
|
On July 11 2012 03:33 Shiori wrote: Which only matters if some of the coins are more valuable than others and only the more valuable ones are showing up, so to speak.
Er, what? Now you lost me again. The point is that it's impossible to tell the change of race distribution due to random race switching from the change of race distribution due to some races having it easier to move up the ladder.
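The coin experiment above can be tried out in a few lines. This is only a rough sketch under made-up assumptions: the starting share of Terran, the monthly switch chance, and the way MMR grows with months played are all hypothetical numbers, not anything measured on the ladder. Race never affects MMR in this toy model, so any gap between the races comes purely from the initial bias plus random switching.

```python
import random
from statistics import mean

def simulate(n_players=30000, p_start_terran=0.6, p_switch=0.03,
             max_months=24, seed=1):
    """Toy model: MMR depends only on months played, never on race."""
    rng = random.Random(seed)
    races = ["T", "Z", "P"]
    by_race = {r: [] for r in races}
    for _ in range(n_players):
        months = rng.randint(0, max_months)  # how long this player has played
        # Hypothetical initial bias: most new players start as Terran.
        race = "T" if rng.random() < p_start_terran else rng.choice(["Z", "P"])
        for _ in range(months):
            if rng.random() < p_switch:      # random switch, for whatever reason
                race = rng.choice(races)
        mmr = 1000 + 40 * months + rng.gauss(0, 200)  # race plays no role here
        by_race[race].append(mmr)
    overall = mean(m for ms in by_race.values() for m in ms)
    # Difference to the overall average MMR, per race.
    return {r: mean(ms) - overall for r, ms in by_race.items()}

result = simulate()
print(result)
```

In this sketch Terran ends up below the overall average and Zerg/Protoss above it, even though no race is stronger than another. That is the effect being argued about: from the averages alone, it looks the same as an imbalance.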
|
I am not a statistics guru but...
I think the question that really needs to be answered with this data, is as follows:
Does Race X on average win against Race Y even when Race Y has a higher MMR?
In other words, the Null Hypothesis is that Race does not matter. A higher skilled player should beat a lower skilled player regardless of the race.
This would show that certain races win over others even if the other player is more skilled. I don't think a flat average of races is really going to say anything.
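A minimal sketch of how that question could be put to a game list, assuming each game is stored as (winner race, winner MMR, loser race, loser MMR). That layout and the sample games are hypothetical, not the format of the OP's file. It counts, per matchup, how often the lower-MMR player won.

```python
from collections import defaultdict

def upset_rates(games):
    """Return {(underdog_race, favourite_race): share of games the underdog won}."""
    played = defaultdict(int)
    upsets = defaultdict(int)
    for w_race, w_mmr, l_race, l_mmr in games:
        if w_race == l_race or w_mmr == l_mmr:
            continue  # mirror matchup or equal rating: there is no underdog
        if w_mmr < l_mmr:
            key = (w_race, l_race)   # the lower-rated player won
            upsets[key] += 1
        else:
            key = (l_race, w_race)   # the higher-rated player won
        played[key] += 1
    return {key: upsets[key] / n for key, n in played.items()}

# Hypothetical sample games.
games = [
    ("Z", 1500, "T", 1600),
    ("T", 1600, "Z", 1500),
    ("Z", 1480, "T", 1550),
    ("T", 1700, "P", 1650),
]
rates = upset_rates(games)
print(rates)
```

Under the null hypothesis that race does not matter, the upset rate for (X, Y) and for (Y, X) should be roughly the same at comparable MMR gaps.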
|
On July 11 2012 03:41 Smancer wrote: I am not a statistics guru but...
I think the question that really needs to be answered with this data, is as follows:
Does Race X on average win against Race Y even when Race Y has a higher MMR?
In other words, the Null Hypothesis is that Race does not matter. A higher skilled player should beat a lower skilled player regardless of the race.
This would show that certain races win over others even if the other player is more skilled. I don't think a flat average of races is really going to say anything.
Better skilled is the key word. Who is better skilled? The guy with the higher MMR? The point is that the imbalance is already in the MMR. Because you play a strong race, your MMR is higher than your real skill. But we don't know what your real skill is; all we know is your MMR: a method to detect your real skill in a balanced game!
So I detect imbalance by searching for differences in the average MMR per race, assuming that all races should have equal skill. Some people say this assumption is wrong, but if it is wrong, the game is by default imbalanced across the player base. Also, there is no objective argument against it.
An example: you play race X, which is underpowered. Because of this, your in-game skill (MMR) is below your "real skill". You now face a player of race Y, which is overpowered. You both have the same in-game skill (MMR) and trade wins and losses. His real skill is below yours, but the imbalance lets you both have the same in-game skill (MMR). You would not detect the problem with your method. The system will always show you a perfect world.
There is one other method that you can use to show trends: look at the change in a race's MMR over time. Do players of race Z lose MMR? Do players of race X gain MMR? This will happen after a patch. But perhaps that is not imbalance; perhaps it corrects an imbalance that was there from the beginning.
The method I use is the only one that can give you a balance indicator. People also look at tournament results, but tournament brackets are just another MMR system. The winner plays the winner, so the better players are matched against each other. In a tournament everybody starts at 0 "MMR", and every time you advance you rise in "MMR". So looking at race placements in tournaments is exactly what I do here on ladder. The problem is that there are far more ladder games than tournament games, and the ladder MMR system is far more accurate than the tournament (min-max) system.
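For readers who want to see the indicator spelled out, here is a small sketch of "difference to average MMR per race". It assumes one (race, MMR) pair per account and takes the overall average over all accounts; both the data layout and the sample numbers are hypothetical, and the exact averaging used for the posted results may differ.

```python
from collections import defaultdict
from statistics import mean

def race_mmr_offsets(accounts):
    """accounts: list of (race, mmr). Returns race -> average race MMR minus overall average."""
    overall = mean(mmr for _, mmr in accounts)
    per_race = defaultdict(list)
    for race, mmr in accounts:
        per_race[race].append(mmr)
    return {race: mean(values) - overall for race, values in per_race.items()}

# Hypothetical accounts, not taken from the real data file.
accounts = [("T", 1500), ("T", 1520), ("Z", 1560), ("P", 1540)]
offsets = race_mmr_offsets(accounts)
print(offsets)
```

A race with a negative offset sits below the data average; whether that reflects game design or the player base is exactly what the thread is arguing about.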
|
On July 11 2012 03:41 Smancer wrote: I am not a statistics guru but...
I think the question that really needs to be answered with this data, is as follows:
Does Race X on average win against Race Y even when Race Y has a higher MMR?
In other words, the Null Hypothesis is that Race does not matter. A higher skilled player should beat a lower skilled player regardless of the race.
This would show that certain races win over others even if the other player is more skilled. I don't think a flat average of races is really going to say anything.
I think that would mainly show the instability in a race's play. Not the "imbalance", because the player with the higher mmr is already benefiting from the imbalance of his race.
|
I'm not sure what you are trying to show me. There is a clear over-representation of terrans in bronze and silver compared to the other leagues.
|
I find it sad that a community member must do these calculations and post them. Then Dustin B. just says in an interview that everything in every ladder and server is 50-50, and that winrates in every matchup, early game, late game, whatever, are still 50-50. Then he says they are monitoring a situation where last month there was a 0.5% imbalance. And all of that without any facts.
I hope you get more data (masters+). Presenting clear facts based on data is never wrong. Everybody can then make up their own mind about what's the cause and balance and so on. Keep doing what you're doing.
|
It's impressive work, however there are some flaws with it.
Your calculation doesn't take practice time into account. In general the foreign scene plays less time and in a less disciplined way than the Koreans. If protoss and zerg can show better results with less time-input than terran, does this mean T is underpowered? In my opinion, no it doesn't (although maybe the race needs some changes), the only balance that really matters is the balance when players of every race are putting as much time and effort into the game as possible, i.e. the professional players.
Additionally, the metagame follows the pro-scene, and there are hardly any foreign terran community figureheads to lead the scene. I know there are a lot of Korean terrans, but a lot of foreign players seem to be disinterested or biased against the Korean players for whatever reasons (maybe cos gsl is on at odd times for some people).
|
On July 11 2012 04:58 Zrana wrote: Your calculation doesn't take practice time into account. In general the foreign scene plays less time and in a less disciplined way than the Koreans. If protoss and zerg can show better results with less time-input than terran, does this mean T is underpowered? In my opinion, no it doesn't (although maybe the race needs some changes), the only balance that really matters is the balance when players of every race are putting as much time and effort into the game as possible, i.e. the professional players.
You nailed it. Worse macro (effective multitasking) due to lower time spent practicing, i.e., consistently showing higher avg min unspent/min shouldn't land players at the same MMR just based on different race choice.
le: completely anecdotal, but most of my income/min // unspent/min in tvp is somewhere around 2200/1200 vs 2200/2400, in nailbitingly close games.
|
On July 11 2012 04:44 Mendelfist wrote:I'm not sure what you are trying to show me. There is a clear over-representation of terrans in bronze and silver compared to the other leagues.
I think the point was that this game isn't new any more. There aren't that many new players being added to the ladder, relative to release day. So a much smaller proportion of this "over-representation" could be made up of the bad/new players. Further, there is no way to say why there are more Terrans at Bronze and why they drop off as you go up in leagues. As time progresses, the "noob" bias becomes much less significant. It's definitely still there; only OP with league breakdowns can tell us.
|
On July 11 2012 05:39 slane04 wrote: I think the point was that this game isn't new any more. There aren't that many new players being added to the ladder, relative to release day. So a much smaller proportion of this "over-representation" could be made up of the bad/new players. Further, there is no way to say why there are more Terrans at Bronze and why they drop off as you go up in leagues. As time progresses, the "noob" bias becomes much less significant. It's definitely still there; only OP with league breakdowns can tell us.
You have way better data than a league breakdown: you have MMR. If I went by leagues, the data would be so inaccurate that I could do nothing with it.
I found a mistake. I put many opponents into the file more than once. This mistake is race-independent and should not affect the result. However, I will correct it and rerun the calculation. It could take several hours until it is done...
|
On July 11 2012 05:43 skeldark wrote: You have way better data than a league breakdown: you have MMR. If I went by leagues, the data would be so inaccurate that I could do nothing with it. I found a mistake. I put many opponents into the file more than once. This mistake is race-independent and should not affect the result. However, I will correct it and rerun the calculation. It could take several hours until it is done...
Yep, sorry, that's what I meant. I meant the percentage of MMRs which would place you in Bronze that was included in the basket of MMRs that you used to calculate imbalance.
|
|
I can look this up in the file, but I know my data pool, so I can already tell you that bronze and silver are negligibly small.
These are accounts, not games, and I don't have race data for most of them, so I can't use them. Also, like I said, I made a mistake and added many accounts twice. I am correcting it at the moment.
The game file is in the other thread and includes 100,000 games, and none of these games are mine personally!
And the random runs have nothing to do with the result. I don't know what you guys are talking about, but I think you are on the wrong track...
|
Well, nothing we didn't know, but damn, you seem like a smart guy. I'm impressed by your work on this and the previous ones.
|
um... your random run and the terran so called "statistically significant" difference... Um, can I get some t-statistics to check for the random error? I mean.. I don't see any in your post, so I'm wondering where you're deriving the "significance" from. Also, the notion of skill independent of balance is a tricky matter in terms of interpreting data... but that's outside the scope of the post I believe anyway...
|
On July 11 2012 06:17 Aletheia27 wrote: um... your random run and the terran so called "statistically significant" difference... Um, can I get some t-statistics to check for the random error? I mean.. I don't see any in your post, so I'm wondering where you're deriving the "significance" from. Also, the notion of skill independent of balance is a tricky matter in terms of interpreting data... but that's outside the scope of the post I believe anyway...
um... calculate... um... yourself.
It's all there. I thought it was enough to show it's above 99.9%, but feel free to get the exact number.
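For anyone who does want a t-statistic, one common choice is Welch's t-test on the MMR values of two races. To be clear, this is not necessarily how the 99.9% figure in the OP was obtained; it is just a standard way to check a difference in means, sketched with made-up numbers.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t-statistic and approximate degrees of freedom for two samples."""
    va = variance(a) / len(a)   # squared standard error of mean(a)
    vb = variance(b) / len(b)   # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical MMR samples for two races.
terran = [1500, 1520, 1480, 1510]
zerg = [1540, 1560, 1530, 1570]
t, df = welch_t(terran, zerg)
print(t, df)
```

The p-value then comes from a t-distribution table, or from SciPy's `scipy.stats.ttest_ind(a, b, equal_var=False)` if SciPy is installed.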
|
Nice job. But balance at (average) diamond isn't that interesting. Even high master/gm at NA/EU/SEA almost don't matter, but it would be nice to do it there.
|
On July 11 2012 06:23 Tuczniak wrote: Nice job. But balance at (average) diamond isn't that interesting. Even high master/gm at NA/EU/SEA almost don't matter, but it would be nice to do it there.
Do you play SC2? Are you a Korean grandmaster? If yes and no: would you like to have EU grandmaster skill and play against bronze guys because you play the wrong race? I extrapolate to show you your own extrapolation.
This game is not for pro players! This game is for everyone! If this game were only for pro players, I would not be here.
Btw, most people ignore how close the numbers actually are.
|
MMR is the best representation of data he could have possibly gathered for this particular project..I honestly think there are a lot of angry people here who are reading what they want to read and making a lot of silly moot points.
You know who you are.
|
u really should get a Quality Poster Star
|
So... are these stats a product of imbalance or metagame?
|
On July 11 2012 06:30 nkr wrote:So... are these stats a product of imbalance or metagame? 
Neither? The data used is very broad and has too many factors. As it is not based on games, but on active accounts, there are an overwhelming number of factors that could be in effect. I don't disagree with the idea, but the execution is not focused enough to provide any real findings.
|
On July 11 2012 06:30 nkr wrote:So... are these stats a product of imbalance or metagame? 
Ah, but is there a difference?
|
Is it possible to show your results for specific matchups? PvZ, TvZ, and TvP? I feel like without that this data could be slightly flawed. Also, are you weighing certain matchups more often than others because of race distribution? Or do you have an even number of matchups when you take this data?
|
i think do masters only as a subgroup, GM as a subgroup, and the same for korea only.
|
On July 11 2012 06:26 CrY. wrote: MMR is the best representation of data he could have possibly gathered for this particular project..I honestly think there are a lot of angry people here who are reading what they want to read and making a lot of silly moot points.
You know who you are.
^^
On July 11 2012 06:28 Zeon0 wrote: u really should get a Quality Poster Star
I think I collected too many "enemies" for this... check my post history ^^
On July 11 2012 06:40 Itsmedudeman wrote: Is it possible to show your results for specific matchups? PvZ, TvZ, and TvP? I feel like without that this data could be slightly flawed. Also, are you weighing certain matchups more often than others because of race distribution? Or do you have an even number of matchups when you take this data?
No. I look at race only, not at single games or matchups.
On July 11 2012 06:42 ThePlayer33 wrote: i think do masters only as a subgroup, GM as a subgroup, and the same for korea only.
Masters would be possible with more data. But Korea I don't see happening; I just have too few users on the KR server.
|
Glad you decided to release these, skeldark.
Any plans to retroactively gather people's race?
|
On July 11 2012 06:50 InfCereal wrote: Glad you decided to release these, skeldark.
Any plans to retroactively gather people's race?
I would need a bot for that and would have to spam the web server, but technically it's no problem.
|
|
I don't think it's that terran is underpowered, but I do think that it's much harder to play terran because of the higher micro requirement. Think about it:
TvZ: Marine splitting versus surrounding and fungals. TvP: Late game, terran has to #1 move back from storms, #2 focus fire vikings, #3 EMP templars. Versus... storming. (and maybe FF)
Don't get me wrong, it's not that there aren't TONS of different things Z and P have to do too, but the CONSEQUENCES of failing this micro are a lot more severe for terran: they often flat out lose! Mistakes by the other races are punished much less.
Yes, I play terran, I kinda had to get it out there =) That's my 2 cents on why the data looks like it does.
|
|
On July 11 2012 01:34 skeldark wrote: The chance that -race independent- stronger players pick a specific race is near 0.
On July 11 2012 02:03 Mendelfist wrote: That's an unsupported statement. I don't know where I've heard it, but I'm pretty sure some Blizzard representative, maybe Josh, has explicitly stated that there is a preference for low level players to choose terran. Does it look the same if you exclude for example everyone below masters?
On July 11 2012 02:17 Chill wrote: Agreed. The whole basis for this project is defeated by one realistic (in my eyes) claim that is dismissed. Also, why do we care about average balance? If Zerg is easier than Terran from Bronze - Masters, does it really matter to the members of this forum?
I know the mindset around here is to ignore balance outside of Code-S and pro tournaments, but even though average balance is insignificant, it is not uninteresting. At least I find it very interesting as a casual competitor. I'll never experience top-level balance and I rarely watch tournaments, so really all that's left for me is average balance. Maybe that's why I find it more interesting than most. Also it's refreshing to see some hard data instead of just whining. I think this forum should encourage that kind of data-driven analysis.
|
Isnt it possible to make a statement about balance just by looking at the winrates of GM/Master players in each match-up?
If due to a patch you lose more when playing a certain race your MMR should drop until you get back to an average of 50% winrate all three matchups combined. So lets say someone has 55% TvT, 55% TvP and 40% TvZ winrate he averages out at 50% winrate (like he would with the current ladder system). Now a patch happens and TvZ becomes much harder (lets say 30% winrate) what would happen is that the players MMR would drop therefore increasing his winrates in all 3 match-ups because he plays weaker players. You could just see how 'balanced' the game is by looking at the differences in winrates between the match-ups.
If one matchup is favoring a certain race what would happen is that you get a lot of people that have a lower winrate in that matchup and a higher in the other two since theyre kind of artificially doomed to play players below their level even though they might be able to win 50% or more vs players of higher mmr.
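The arithmetic in this post can be checked quickly. The sketch below makes two simplifying assumptions that the post does not state: all three matchups are played equally often, and a drop in MMR raises the winrate in every matchup by the same amount. Both are guesses for illustration only.

```python
def overall_winrate(winrates):
    """Average winrate, assuming every matchup is played equally often."""
    return sum(winrates.values()) / len(winrates)

def after_mmr_settles(winrates):
    """Shift every matchup by the same amount until the overall winrate is 50%."""
    shift = 0.5 - overall_winrate(winrates)
    return {matchup: w + shift for matchup, w in winrates.items()}

before = {"TvT": 0.55, "TvP": 0.55, "TvZ": 0.40}   # averages out at 50%
patched = {"TvT": 0.55, "TvP": 0.55, "TvZ": 0.30}  # TvZ gets harder after a patch
settled = after_mmr_settles(patched)
spread = max(settled.values()) - min(settled.values())
print(overall_winrate(before), settled, spread)
```

Under these assumptions the overall winrate returns to 50%, but the gap between the best and worst matchup grows from 15 to 25 percentage points, which is the kind of per-matchup difference the post suggests looking at.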
|
On July 11 2012 04:45 Jarree wrote: I find it sad that a community member must do these calculations and post them. Then Dustin B. just says in an interview that everything in every ladder and server is 50-50, and that winrates in every matchup, early game, late game, whatever, are still 50-50. Then he says they are monitoring a situation where last month there was a 0.5% imbalance. And all of that without any facts. I hope you get more data (masters+). Presenting clear facts based on data is never wrong. Everybody can then make up their own mind about what's the cause and balance and so on. Keep doing what you're doing.
This is so true... Blizzard obviously has access to this information. Instead it takes someone reverse engineering and datamining to get some real information. Hearing statements like that from DB, compared to the data available, just screams incompetence.
Kudos to the OP for the hard work though.
|
-Terran is statistically significantly underpowered.
-Zerg is statistically significantly overpowered.
-Protoss is possibly overpowered, but not significantly.
ggwp, nothing more to add
|
On July 11 2012 06:26 skeldark wrote: Do you play SC2? Are you a Korean grandmaster? If yes and no: would you like to have EU grandmaster skill and play against bronze guys because you play the wrong race? I extrapolate to show you your own extrapolation. This game is not for pro players! This game is for everyone! If this game were only for pro players, I would not be here. Btw, most people ignore how close the numbers actually are.
Yes, I play; I'm not GM. But as long as I have a 50% winrate and games are fun, that's all that matters for a non-pro player like me.
Of course it would be stupid if you were losing to some 20apm guy, but the game is pretty well balanced and that's not the case. At mid master (me) and below you can't really tell your opponent's skill, since your play is covered in mistakes and your opponent's too.
|
United Arab Emirates, 439 Posts
This is awesome, props OP. You done work.
But everyone needs to keep in mind this reflects METAGAME balance. I look forward to seeing, if OP or someone else can keep this thing updated, how new metagame changes affect this balance. I would love to have seen the effect of Infestor/Ling vs T, or the 1/1/1, or "Stephano Style" Roach Max vs Protoss, and all the other crazy metagame changes we have had.
|
UPDATE after a 3-hour calculation:
-Removed some broken accounts
-Removed double accounts
-Removed random players
-Removed all accounts with less than 1000 MMR (bronze, silver)
-Removed all data without race information
Accounts left: 5590
Result: The deviation is very high because of the low account count, but I am still 99.31% sure that Terran is underpowered: -30.09
Don't forget this is only about 2 average wins on ladder.
I will wait for more data and perhaps collect the race data of the other 30k accounts before I make a new run.
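The cleanup steps in this update can be sketched as a simple filter. The account layout (a dict with `id`, `race`, `mmr`) and the sample entries are hypothetical, and "broken accounts" are not modeled because the post does not say what makes an account broken. The last helper converts an MMR gap into average ladder wins using the +16/-16 figure from the OP.

```python
def clean_accounts(accounts, min_mmr=1000):
    """Drop duplicates, random/unknown-race players, and accounts below min_mmr."""
    seen = set()
    kept = []
    for acc in accounts:
        if acc["id"] in seen:
            continue                      # double account entry
        seen.add(acc["id"])
        if acc.get("race") not in ("T", "Z", "P"):
            continue                      # random player or missing race data
        if acc["mmr"] < min_mmr:
            continue                      # roughly bronze/silver
        kept.append(acc)
    return kept

def mmr_gap_in_wins(gap, mmr_per_win=16):
    """Express an MMR gap as a number of average ladder wins."""
    return gap / mmr_per_win

# Hypothetical sample accounts.
sample = [
    {"id": 1, "race": "T", "mmr": 1500},
    {"id": 1, "race": "T", "mmr": 1500},   # duplicate
    {"id": 2, "race": "R", "mmr": 1800},   # random player
    {"id": 3, "race": "Z", "mmr": 900},    # below 1000 MMR
    {"id": 4, "race": None, "mmr": 1700},  # no race information
    {"id": 5, "race": "P", "mmr": 1600},
]
kept = clean_accounts(sample)
print([acc["id"] for acc in kept], mmr_gap_in_wins(30.09))
```

With 16 MMR per win, a gap of 30.09 MMR works out to a bit under two average wins.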
|
|
I might not understand because it is late, and I am quite tired, but isn't that well known already ? Sc2ranks has shown a deficit of Terran in the upper leagues for months now, and a surplus of Terran in the lower leagues. And MMR is related to the league, isn't it? So the Terran MMR is gonna be lower. Am I missing something ?
|
|
On July 11 2012 07:57 Psychobabas wrote: skeldark, thank you
You're welcome.
On July 11 2012 08:00 VyingsP wrote: I might not understand because it is late, and I am quite tired, but isn't that well known already ? Sc2ranks has shown a deficit of Terran in the upper leagues for months now, and a surplus of Terran in the lower leagues. And MMR is related to the league, isn't it? So the Terran MMR is gonna be lower. Am I missing something ?
mmr is not related to leagues. Leagues are related to mmr. ^^
A smaller population doesn't have to mean a lower average MMR, but it is an indicator. I just wanted to show a new method that is more reliable than the win/loss ratio.
|
I think the maps would affect these results a bit. The current map pool doesn't seem very favorable to early Terran aggression.
On the other note, excellent work skeldark!
|
SoCal, 8908 Posts
you had me until you claimed that MMR has something to do with balance - MMR is an indication of skill, not balance.
i could cheese my way to a better MMR, but does that mean that the game is balanced? uncertain.
i could use one build to get a better MMR, but does that mean that the game is balanced? uncertain.
you take the average MMR of players across gold-GM (if i read correctly), yet somehow this is an indication of balance? that is absurd. balance can only be determined under ideal conditions at the highest level of play and even then, humans are all capable of mistakes.
these are just a few of the problems i have with anyone using MMR (calculated skill) as a means of attempting to prove balance. watch some high level gameplay if you want to figure out whether or not the game is balanced bc, in my opinion, this study is completely irrelevant.
|
pretty damn good post, thx bro !
|
|
SoCal, 8908 Posts
On July 11 2012 08:31 BluemoonSC wrote: you had me until you claimed that MMR has something to do with balance - MMR is an indication of skill, not balance.
On July 11 2012 08:34 monkybone wrote: That doesn't make any sense.
updated my post to provide clarity. MMR is quite literally a rating assigned to you based on how well you play and takes in no other factors except the wins and losses you have against other players with comparable MMRs and adjusts based on those wins and losses. it has nothing to do with balance and has been that way since the beta. i honestly dont understand why we're still looking at it.
|
On July 11 2012 08:38 BluemoonSC wrote: updated my post to provide clarity. MMR is quite literally a rating assigned to you based on how well you play and takes in no other factors except the wins and losses you have against other players with comparable MMRs and adjusts based on those wins and losses. it has nothing to do with balance and has been that way since the beta. i honestly dont understand why we're still looking at it.
Come on, man, at least read his post. He went out of his way to explain this.
|
SoCal, 8908 Posts
On July 11 2012 08:40 Shiori wrote: Come on, man, at least read his post. He went out of his way to explain this.
i read his post. MMR is a number based on wins and losses, nothing more. anything could happen in those games. i could play 10 games vs cheese and lose all 10, thus my MMR drops.
i could use one build that the metagame hasn't caught up with for 10 games and win all 10, thus causing my MMR to rise.
these are just a few examples (as i explained above) that make this post relatively unbased.
|
He missed the massive systematic error due to metagame issues.
|
On July 11 2012 06:37 Zrana wrote: On July 11 2012 06:30 nkr wrote: So... are these stats a product of imbalance or metagame? Ah, but is there a difference?
Yes. Absolutely. If a race is dominating another because a strategy or timing or micro trick or whatever hasn't been figured out, that doesn't mean it's broken and needs to be fixed by Blizzard. However, that's what has ended up happening since launch and they step in any time players get stuck on something for too long. The result has left us with....a rather strange game. All rushes and timing attacks and openers have been nerfed into the ground which has seriously screwed terran as I believe the race was literally built around them.
So in short, I do believe the issues we see these days are all very real balance problems but it's only because the game has changed so much from what it was originally meant to be.
|
On July 11 2012 08:31 BluemoonSC wrote: you had me until you claimed that MMR has something to do with balance - MMR is an indication of skill, not balance.
i could cheese my way to a better MMR, but does that mean that the game is balanced? uncertain.
i could use one build to get a better MMR, but does that mean that the game is balanced? uncertain.
you take the average MMR of players across gold-GM (if i read correctly), yet somehow this is an indication of balance? that is absurd. balance can only be determined under ideal conditions at the highest level of play and even then, humans are all capable of mistakes.
these are just a few of the problems i have with anyone using MMR (calculated skill) as a means of attempting to prove balance. watch some high level gameplay if you want to figure out whether or not the game is balanced bc, in my opinion, this study is completely irrelevant.
You and the likes are outliers, who should represent an insignificant minority that will barely affect the overall number of accounts. The majority of people will play the game as it is. Yes, some Terrans might only use ladder to practice builds, but so do some Zergs and some Protoss. In the bigger picture, where we have a lot of data, those small spikes should be smoothed out and balance out.
|
On July 11 2012 08:52 canikizu wrote: On July 11 2012 08:31 BluemoonSC wrote: you had me until you claimed that MMR has something to do with balance - MMR is an indication of skill, not balance.
i could cheese my way to a better MMR, but does that mean that the game is balanced? uncertain.
i could use one build to get a better MMR, but does that mean that the game is balanced? uncertain.
you take the average MMR of players across gold-GM (if i read correctly), yet somehow this is an indication of balance? that is absurd. balance can only be determined under ideal conditions at the highest level of play and even then, humans are all capable of mistakes.
these are just a few of the problems i have with anyone using MMR (calculated skill) as a means of attempting to prove balance. watch some high level gameplay if you want to figure out whether or not the game is balanced bc, in my opinion, this study is completely irrelevant. You and the likes are outliners, which should represent an insignificant minority that will barely affect the overall number of accounts. Majority of people will play the game as it is, yes, some Terran might only use ladder to practice builds, but so do some Zergs, and some Protoss. Overall in the bigger picture where we have a lot of data, those small spikes should be smoothen, and balance out.
you can throw as much math as you want at the problem, it doesn't change the fact that you're looking at data that is dependent on player skill and doesn't look at what actually happens inside each individual game.
MMR only looks at wins and losses.
|
On July 11 2012 08:57 BluemoonSC wrote: On July 11 2012 08:52 canikizu wrote: On July 11 2012 08:31 BluemoonSC wrote: you had me until you claimed that MMR has something to do with balance - MMR is an indication of skill, not balance.
i could cheese my way to a better MMR, but does that mean that the game is balanced? uncertain.
i could use one build to get a better MMR, but does that mean that the game is balanced? uncertain.
you take the average MMR of players across gold-GM (if i read correctly), yet somehow this is an indication of balance? that is absurd. balance can only be determined under ideal conditions at the highest level of play and even then, humans are all capable of mistakes.
these are just a few of the problems i have with anyone using MMR (calculated skill) as a means of attempting to prove balance. watch some high level gameplay if you want to figure out whether or not the game is balanced bc, in my opinion, this study is completely irrelevant. You and the likes are outliners, which should represent an insignificant minority that will barely affect the overall number of accounts. Majority of people will play the game as it is, yes, some Terran might only use ladder to practice builds, but so do some Zergs, and some Protoss. Overall in the bigger picture where we have a lot of data, those small spikes should be smoothen, and balance out. you can throw as much math as you want at the problem, it doesn't change the fact that you're looking at data that is dependent on player skill and doesn't look what actually happens inside each individual game. MMR only looks at wins and losses.
Skill can easily be inferred from wins and losses.
|
On July 11 2012 09:01 Shiori wrote: On July 11 2012 08:57 BluemoonSC wrote: On July 11 2012 08:52 canikizu wrote: On July 11 2012 08:31 BluemoonSC wrote: you had me until you claimed that MMR has something to do with balance - MMR is an indication of skill, not balance.
i could cheese my way to a better MMR, but does that mean that the game is balanced? uncertain.
i could use one build to get a better MMR, but does that mean that the game is balanced? uncertain.
you take the average MMR of players across gold-GM (if i read correctly), yet somehow this is an indication of balance? that is absurd. balance can only be determined under ideal conditions at the highest level of play and even then, humans are all capable of mistakes.
these are just a few of the problems i have with anyone using MMR (calculated skill) as a means of attempting to prove balance. watch some high level gameplay if you want to figure out whether or not the game is balanced bc, in my opinion, this study is completely irrelevant. You and the likes are outliners, which should represent an insignificant minority that will barely affect the overall number of accounts. Majority of people will play the game as it is, yes, some Terran might only use ladder to practice builds, but so do some Zergs, and some Protoss. Overall in the bigger picture where we have a lot of data, those small spikes should be smoothen, and balance out. you can throw as much math as you want at the problem, it doesn't change the fact that you're looking at data that is dependent on player skill and doesn't look what actually happens inside each individual game. MMR only looks at wins and losses. Skill can easily be inferred from wins and losses.
that is not necessarily true, as blizzard's matchmaking system on the ladder is designed to give you a 50% win ratio.
|
wowowow, that's a pretty big statement.
|
i don't see how MMR has anything to do with balance. balance means that, assuming equal skill, each race wins 50% of the time. since you have no way to show that skill is equal (all the Zergs in your data could be high diamond, all your Protoss mid diamond and all Terrans high platinum), you have no way to infer balance from them.
obviously your races are not split like that, but my point is: since you have no way to tell that the games in your data had equal-level opponents, you can't say you made a formula for determining balance.
|
Blizzard probably has better statistical methods than OP. I'll give them the benefit of the doubt.
|
Where the hell are you guys getting the idea that this is somehow a formula for balance? It's a simple tool that was used for calculating the hidden MMR trait that we've been trying to find out how to see for years.
He's just releasing the info, not saying anything about balance. Talk about reading into what's not there... it's got to be at least half the moronic posts already in this thread.
|
On July 11 2012 09:19 sCCrooked wrote: Where the hell are you guys getting the idea that this is somehow a formula for balance? Its a simple tool that was used for calculating the hidden MMR trait that we've been trying to find out how to see for years.
He's just releasing the info, not saying anything about balance. Talk about reading into what's not there... its got to be at least half the moronic posts already in this thread.
then he shouldn't use words like "balance", "underpowered" and "overpowered" if he's not talking about balance ;;
|
You forgot to account for Blizzard's "player skill," which states that all races are always balanced, so whoever has the worst MMR is actually just a bunch of noobs.
|
On July 11 2012 08:31 BluemoonSC wrote: you had me until you claimed that MMR has something to do with balance - MMR is an indication of skill, not balance.
i could cheese my way to a better MMR, but does that mean that the game is balanced? uncertain.
i could use one build to get a better MMR, but does that mean that the game is balanced? uncertain.
you take the average MMR of players across gold-GM (if i read correctly), yet somehow this is an indication of balance? that is absurd. balance can only be determined under ideal conditions at the highest level of play and even then, humans are all capable of mistakes.
these are just a few of the problems i have with anyone using MMR (calculated skill) as a means of attempting to prove balance. watch some high level gameplay if you want to figure out whether or not the game is balanced bc, in my opinion, this study is completely irrelevant.
Dude, you either don't understand what OP wrote or didn't read it.
People that cheese/do one build also don't matter, since they should be equally represented in all three races.
You suggest watching some high-level play. Well, that's what I did, and it supports what the OP has concluded from his statistics.
|
I'm confused here - why are there 4 random groups? Why not just lump them together?
But im still to 99,31 % secure Terran is underpowered -30.09
Where the hell are you getting these numbers from? I'm assuming this 99.3 comes from some application of the normal distribution - have you thoroughly established normality in MMR? Based on my understanding of the MMR system, you would not expect normality... As a stats junkie, I'm very suspicious of the statistics here.
|
I don't think many people understand what he was trying to accomplish. Nowhere was it stated that this was absolute or that there weren't other factors to consider when discussing race balance, but the amount of data he used makes this a very valid indicator of balance. Sure, it's a higher average skill than the actual average, but the fact is that as skill drops, balance plays a smaller role in who wins or loses. It's the same reason higher-level players hate it when a bronze player starts calling one race OP from their own experience: if both players are making many, many mistakes, balance is a far smaller issue than who plays better and makes fewer mistakes. In fact, one could say that at this point in the game any balance whine is pointless, as there are very few players who actually play flawlessly and probably zero players who have completely figured out the game or the race they play. The fact is that the metagame plays a much bigger role in game balance than the actual units themselves. No one thought snipe was overpowered until Terrans started abusing it; in other words, the metagame changed, so the balance changed. Blizz decided to nerf snipe, but who is to say whether this was a fair call? The whole debate is very nebulous, and I would put my trust in solid data as an example of balance before some random guy's opinion, based on his experience and biased by his race, any day.
|
Not sure what type of stats is this overgeneralisation
1) Can you show the breakdown by race for some MMR intervals, to know the distribution of data used to derive these results?
2) Is it possible to further separate into submitter vs opponent data (I believe the MMR is calculated for both players in a replay?) for the breakdown in (1)?
3) If you can identify the submitter, then should we look at w/l ratio? At least to be certain that they submit a representative sample of their games and not ONLY their wins?
4) What does deviation mean?? Does it just mean the deviation of MMR of the players within the replay? There seems to be some distortion going on, just by looking at random, either through submission bias or an imbalanced distribution of "MMR".
Timeline : Season 8 , start - 10/07
Deviation: 1000 times 4 random groups:
100% are in the area +-33.81
99.15% are in the area +-25
96.6% are in the area +-20
88.88% are in the area +-15
70.3% are in the area +-10
39.7% are in the area +-5
Race groups:
Terran: -44.83
Protoss: +19.83
Zerg: +38.53
Random: -187.70
|
On July 11 2012 08:57 BluemoonSC wrote: On July 11 2012 08:52 canikizu wrote: On July 11 2012 08:31 BluemoonSC wrote: you had me until you claimed that MMR has something to do with balance - MMR is an indication of skill, not balance.
i could cheese my way to a better MMR, but does that mean that the game is balanced? uncertain.
i could use one build to get a better MMR, but does that mean that the game is balanced? uncertain.
you take the average MMR of players across gold-GM (if i read correctly), yet somehow this is an indication of balance? that is absurd. balance can only be determined under ideal conditions at the highest level of play and even then, humans are all capable of mistakes.
these are just a few of the problems i have with anyone using MMR (calculated skill) as a means of attempting to prove balance. watch some high level gameplay if you want to figure out whether or not the game is balanced bc, in my opinion, this study is completely irrelevant. You and the likes are outliners, which should represent an insignificant minority that will barely affect the overall number of accounts. Majority of people will play the game as it is, yes, some Terran might only use ladder to practice builds, but so do some Zergs, and some Protoss. Overall in the bigger picture where we have a lot of data, those small spikes should be smoothen, and balance out. you can throw as much math as you want at the problem, it doesn't change the fact that you're looking at data that is dependent on player skill and doesn't look what actually happens inside each individual game. MMR only looks at wins and losses. Every stocks you buy and sell, every money exchange international companies make, or even every items you buy, are independent transactions. There're many reasons you do it, whether it's impulse, invest, take care of tax, secret's company motive, ...v.v.v, but if you collect the data(which of course is dependent on participant's data) over the course of one month, one year, you will have a pretty good idea of which company stock is good, how strong an economy is, or how popular an item is. That's how statistic and chart is supposed to do. No matter what an individual does, as long as the huge majority have 1 common goal (invest to make money, buy items for self-satisfaction), you should be able to produce a data good enough to have an idea how it is going.
It works the same way here. As long as the huge majority of players have one common goal, which is to win the game, the data should give you an idea (a trend) of how the races are doing. MMR = player skill + race features. So if the average Terran's MMR is lower than the average Zerg's, either Terran players are dumber than Zerg players, or the race features they can take advantage of are weaker than Zerg's. It's that simple.
Now, like any other data, this data will not tell you exactly which Zerg unit is overpowered or which Terran unit is underpowered. That's not its job. So if you are looking for answers to questions such as "On average, what units do Terrans make in order to win?", "On average, at what game length do Terrans win the most?", or "On average, which units prove most effective for Terran to counter BL/infestor?", you won't find them here. This only answers: with this data pool, on average, a Terran is xx MMR lower than he's supposed to be and has xx fewer wins than he's supposed to have; or, on average, a Zerg is yy MMR higher than he's supposed to be and has yy more wins than he's supposed to have. Does that indicate imbalance? Maybe, maybe not. You decide.
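To make the "MMR = player skill + race features" idea above concrete, here is a minimal Python sketch. It assumes a purely additive model and that every race draws its skill from the same distribution; the function name, the Gaussian skill distribution and the offsets are all made up for illustration and are not the OP's method or data.

```python
import random

def simulate_race_offsets(offsets, players_per_race=20000, seed=0):
    """Simulate MMR = skill + race offset and return each race's
    difference to the overall average MMR (hypothetical model)."""
    rng = random.Random(seed)
    mmr_by_race = {}
    for race, offset in offsets.items():
        # Assumption: all races share one skill distribution (mean 1500, sd 400).
        mmr_by_race[race] = [rng.gauss(1500, 400) + offset
                             for _ in range(players_per_race)]
    all_mmrs = [m for mmrs in mmr_by_race.values() for m in mmrs]
    overall = sum(all_mmrs) / len(all_mmrs)
    return {race: sum(mmrs) / len(mmrs) - overall
            for race, mmrs in mmr_by_race.items()}

# Hypothetical offsets, chosen only for illustration (they sum to zero).
diffs = simulate_race_offsets({"T": -40.0, "P": 10.0, "Z": 30.0})
for race, diff in diffs.items():
    print(f"{race}: {diff:+.2f}")
```

If skill really is distributed the same way in every race, the per-race differences land close to the offsets that were put in. If it is not, the same numbers could come from who picks which race, which is exactly the objection raised elsewhere in the thread.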
|
On July 11 2012 09:20 BluemoonSC wrote: On July 11 2012 09:19 sCCrooked wrote: Where the hell are you guys getting the idea that this is somehow a formula for balance? Its a simple tool that was used for calculating the hidden MMR trait that we've been trying to find out how to see for years.
He's just releasing the info, not saying anything about balance. Talk about reading into what's not there... its got to be at least half the moronic posts already in this thread. then he shouldnt use words like "balance" "underpowered" and "overpowered" if he's not talking about balance ;;
Yes, the words underpowered and balance imply things that the OP is not addressing. You cannot take into account a player like "BadHabit" who made it into GM by only six pooling (on a bet, but he still did it). There are too many other factors that affect MMR to say anything beyond "players of X skill level with Y race seem to see a dip in MMR at Z point". But this data does not prove that terran is harder than protoss or zerg, like some posters are saying.
|
On July 11 2012 10:25 Plansix wrote: On July 11 2012 09:20 BluemoonSC wrote: On July 11 2012 09:19 sCCrooked wrote: Where the hell are you guys getting the idea that this is somehow a formula for balance? Its a simple tool that was used for calculating the hidden MMR trait that we've been trying to find out how to see for years.
He's just releasing the info, not saying anything about balance. Talk about reading into what's not there... its got to be at least half the moronic posts already in this thread. then he shouldnt use words like "balance" "underpowered" and "overpowered" if he's not talking about balance ;; Yes, the word underpowered and balance infer things that the OP is not addressing. You cannot take into account a player like "BadHabit" who made it into GM by only six pooling(on a bet, but he still did it). There are to many other factors that effect MMR to say anything beyond "players of X skill level with Y race seem the a dip in MMR at Z point". But this data does not prove that terran is harder than protoss or zerg, like some posters are saying. 1 player is statistically irrelevant, on average players will play standard games, that's why we need a big sample which is what we have here, people making very bad points to deny this without thinking them through made this thread really annoying to read.The other matter has long since been settled, regardless of data(you should just try them out),you had better give up by now.
|
Higher MMR does not imply higher skill. MMR is just a point system based on wins (with an elo system), like the normal point system. The only difference is that it's a global rating, rather than having divisions and crap like that. It is also more likely to change dramatically than the normal point system -- so if you lose 20 games in a row, your MMR will plummet, but your actual points won't fall that much. This provides the ability for a player to go against more appropriately skilled opponents immediately, without their points exploding or suffering as a result.
MMR was used in WoW, and was visible. The idea that "warriors have higher MMR on average, therefore they are more skilled" would have been a laughable statement. There's no difference in SC2, except that MMR is mystified here because Blizzard hid it.
Furthermore, you can't go the other way either, to say that higher MMR on average means less skill, because of:
A) Metagame. Players don't use the best possible strategies available to their race (in the future, newer better strats will be developed, meaning that worse ones are being used in the present)
B) Unequal distribution of race among skill levels (for example, very skilled players don't play random as often, because it's not viable for competition, so you'll get a lower average MMR for random necessarily as a result)
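For readers unfamiliar with what "an elo system" means in the post above, a textbook Elo update can be sketched as follows. Blizzard's actual MMR formula is not public, so the K-factor of 32 and the scale of 400 are assumptions taken from classic Elo; K = 32 is picked only because it reproduces the +16 / -16 per game between equal opponents that the OP reports.

```python
def expected_score(rating_a, rating_b, scale=400.0):
    """Textbook Elo win expectation for player A (scale is an assumption)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))

def update(rating_a, rating_b, a_won, k=32.0):
    """Return both players' new ratings after one game (k is an assumption)."""
    exp_a = expected_score(rating_a, rating_b, 400.0)
    delta = k * ((1.0 if a_won else 0.0) - exp_a)
    return rating_a + delta, rating_b - delta

# Two equally rated players: the winner gains 16, the loser drops 16.
print(update(1500.0, 1500.0, a_won=True))  # (1516.0, 1484.0)
```

The update only sees who won and what the two ratings were, which is the point several posters make: nothing about what happened inside the game enters the number.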
|
On July 11 2012 10:34 IshinShishi wrote: On July 11 2012 10:25 Plansix wrote: On July 11 2012 09:20 BluemoonSC wrote: On July 11 2012 09:19 sCCrooked wrote: Where the hell are you guys getting the idea that this is somehow a formula for balance? Its a simple tool that was used for calculating the hidden MMR trait that we've been trying to find out how to see for years.
He's just releasing the info, not saying anything about balance. Talk about reading into what's not there... its got to be at least half the moronic posts already in this thread. then he shouldnt use words like "balance" "underpowered" and "overpowered" if he's not talking about balance ;; Yes, the word underpowered and balance infer things that the OP is not addressing. You cannot take into account a player like "BadHabit" who made it into GM by only six pooling(on a bet, but he still did it). There are to many other factors that effect MMR to say anything beyond "players of X skill level with Y race seem the a dip in MMR at Z point". But this data does not prove that terran is harder than protoss or zerg, like some posters are saying. 1 player is statistically irrelevant, on average players will play standard games, that's why we need a big sample which is what we have here, people making very bad points to deny this without thinking them through made this thread really annoying to read.The other matter has long since been settled, regardless of data(you should just try them out),you had better give up by now.
Well you showed me with that clever word play right there. You mean to tell me that a sample size of "1" is too small to provide any real evidence. I guess my example is invalid. Wait, what is the definition of example again?
Example - one of a number of things, or a part of something, taken to show the character of the whole
So one could say that the one player I named in my previous post was just one of many players who are playing strangely on the ladder for any number of reasons. The data does not account for people who just play weird or "wrong". Like players who always go mech, regardless of the match up.
|
On July 11 2012 02:17 Chill wrote: On July 11 2012 02:03 Mendelfist wrote: On July 11 2012 01:34 skeldark wrote: The chance that -race independent- stronger players pick a specific race is near 0.
That's an unsupported statement. I don't know where I've heard it, but I'm pretty sure some Blizzard representative, maybe Josh, has explicitly stated that there is a preference for low level players to choose terran. Does it look the same if you exclude for example everyone below masters? Agreed. The whole basis for this project is defeated by one realistic (in my eyes) claim that is dismissed. Also, why do we care about average balance? If Zerg is easier than Terran from Bronze - Masters, does it really matter to the members of this forum?
This could go the other way too. The most skilled players could choose what they perceive to be the most powerful race which, if it were Terran, as it has been for many seasons, could offset the results (although not really, because there are naturally fewer of the more skilled than the lesser). That being said, with enough samples at dia+ I think the effect of lower-skilled players picking Terran doesn't really apply as much.
I think members of this forum will obsess about the relative difficulty of one race to another at all levels, and I think OP chose a great way to attack the problem, as opposed to fickle ranting.
(avg: 3 victorys behind) OP: Could you explain this please? If I played a 100 games as Terran v Zerg with same MMR, about what would be my win%?
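One plausible reading of "3 victories behind" is simply the MMR gap divided by the average MMR change per game. The sketch below assumes that reading; the 16 MMR per game is the OP's stated average, the -44.83 is the Terran figure quoted earlier in the thread, and the function name is made up.

```python
MMR_PER_GAME = 16.0  # OP: one average win/loss on ladder is +16 / -16 MMR

def games_behind(mmr_diff, mmr_per_game=MMR_PER_GAME):
    """Express an MMR difference as a number of average ladder results."""
    return mmr_diff / mmr_per_game

print(round(games_behind(-44.83), 1))  # -2.8
```

On this reading the figure says nothing about win% at equal MMR. The matchmaker pairs players so that equal MMR means roughly even odds, which is why the question about 100 games of Terran v Zerg cannot be answered from the gap alone.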
|
On July 11 2012 08:57 BluemoonSC wrote: On July 11 2012 08:52 canikizu wrote: On July 11 2012 08:31 BluemoonSC wrote: you had me until you claimed that MMR has something to do with balance - MMR is an indication of skill, not balance.
i could cheese my way to a better MMR, but does that mean that the game is balanced? uncertain.
i could use one build to get a better MMR, but does that mean that the game is balanced? uncertain.
you take the average MMR of players across gold-GM (if i read correctly), yet somehow this is an indication of balance? that is absurd. balance can only be determined under ideal conditions at the highest level of play and even then, humans are all capable of mistakes.
these are just a few of the problems i have with anyone using MMR (calculated skill) as a means of attempting to prove balance. watch some high level gameplay if you want to figure out whether or not the game is balanced bc, in my opinion, this study is completely irrelevant. You and the likes are outliners, which should represent an insignificant minority that will barely affect the overall number of accounts. Majority of people will play the game as it is, yes, some Terran might only use ladder to practice builds, but so do some Zergs, and some Protoss. Overall in the bigger picture where we have a lot of data, those small spikes should be smoothen, and balance out. you can throw as much math as you want at the problem, it doesn't change the fact that you're looking at data that is dependent on player skill and doesn't look what actually happens inside each individual game. MMR only looks at wins and losses.
In a perfectly balanced game, if you averaged MMR data by race, you would expect the average MMR of each race to be about the same, allowing for small random variation. As I understand it, OP's MMR data shows that the game is on average basically rating Terrans as being less skilled than Protoss or Zerg, and his results show this difference to be much greater than what would be expected from random variation. Assuming that in reality all three races are equally skilled (a reasonable assumption), how do you rationalize this MMR disparity? It suggests that Terran might indeed be UP, or at the very least is really struggling due to the metagame?
Since any imbalance gets inherently factored into the MMR, by extension we can infer certain things. Put simply, if all three races are equally skilled in reality (an assumption), and the game is on average rating Terrans as less skilled than Zergs (through MMR), then it implies that on average a Terran at equal MMR to a Zerg is in reality at least equally skilled, and quite probably more skilled than that Zerg. Thus, hypothetically, if this "averaged" T and Z at equal MMR ratings trade wins at a 50% rate, it in no way demonstrates that the game is balanced, despite a 50% winrate at equal MMR, since the imbalance is already factored into the MMR. I believe OP's point is that the way they're gauging "balance" may inherently be flawed.
Not really completely sure about any of it though, I'm still thinking about it all.
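The argument above can be put into numbers with a small sketch. It assumes the additive model suggested elsewhere in the thread (observed MMR = underlying skill + a per-race offset) and plugs in the Terran and Zerg figures quoted earlier. The model, the function name and the example MMR of 2000 are assumptions for illustration, not something the OP's data confirms.

```python
def implied_skill(mmr, race_offset):
    """Invert the assumed model: observed MMR = skill + race offset."""
    return mmr - race_offset

# Differences to average MMR quoted earlier in the thread.
offsets = {"T": -44.83, "Z": 38.53}

# A Terran and a Zerg who both sit at the same (hypothetical) MMR of 2000.
terran_skill = implied_skill(2000.0, offsets["T"])
zerg_skill = implied_skill(2000.0, offsets["Z"])
gap = terran_skill - zerg_skill
print(round(gap, 2))
```

Under these assumptions the two players trade wins evenly while the Terran is the stronger player by the sum of the two offsets, so a 50% winrate at equal MMR would look balanced even when it is not.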
|
On July 11 2012 10:54 sevencck wrote: On July 11 2012 08:57 BluemoonSC wrote: On July 11 2012 08:52 canikizu wrote: On July 11 2012 08:31 BluemoonSC wrote: you had me until you claimed that MMR has something to do with balance - MMR is an indication of skill, not balance.
i could cheese my way to a better MMR, but does that mean that the game is balanced? uncertain.
i could use one build to get a better MMR, but does that mean that the game is balanced? uncertain.
you take the average MMR of players across gold-GM (if i read correctly), yet somehow this is an indication of balance? that is absurd. balance can only be determined under ideal conditions at the highest level of play and even then, humans are all capable of mistakes.
these are just a few of the problems i have with anyone using MMR (calculated skill) as a means of attempting to prove balance. watch some high level gameplay if you want to figure out whether or not the game is balanced bc, in my opinion, this study is completely irrelevant. You and the likes are outliners, which should represent an insignificant minority that will barely affect the overall number of accounts. Majority of people will play the game as it is, yes, some Terran might only use ladder to practice builds, but so do some Zergs, and some Protoss. Overall in the bigger picture where we have a lot of data, those small spikes should be smoothen, and balance out. you can throw as much math as you want at the problem, it doesn't change the fact that you're looking at data that is dependent on player skill and doesn't look what actually happens inside each individual game. MMR only looks at wins and losses. In a perfectly balanced game, if you averaged MMR data by race, you would expect the average MMR of each race to be about the same, allowing for small random variation. As I understand it, OP's MMR data shows that the game is on average basically rating Terrans as being less skilled than Protoss or Zerg, and his results show this difference to be much greater than what would be expected from random variation. Assuming that in reality all three races are equally skilled (a reasonable assumption), how do you rationalize this MMR disparity? It suggests that Terran might indeed be UP, or at the very least is really struggling due to the metagame? Since any imbalance gets inherently factored into the MMR, by extension we can infer certain things. Put simply, if all three races are equally skilled in reality (an assumption), and the game is on average rating Terrans as less skilled than Zergs (through MMR), then it implies that on average a Terran at equal MMR to a Zerg is in reality at least equally skilled, and quite probably more skilled than that Zerg. 
Thus, hypothetically, if this "averaged" T and Z at equal MMR ratings trade wins at a 50% rate, it in no way demonstrates that the game is balanced, despite a 50% winrate at equal MMR, since the imbalance is already factored into the MMR. I believe OP's point is that the way they're guaging "balance" may inherently be flawed. Not really completely sure about any of it though, I'm still thinking about it all. There's no reason to believe that the races have equally skilled players on average at any level of play.
|
I started quoting some of the people who posted recently, to explain what I did and argue, but then I realised that most of it is already explained in the OP and that most people lack a minimum understanding of skill systems, statistics and math. I know it must sound cocky, but I can't bring myself to answer most of the posts, because it's so obvious the writer will not understand it anyway. If someone has a valid point or a specific problem understanding something, I will argue over my methods and explain them. But I will not explain fundamental basics or argue about what skill is or why a skill system doesn't show skill numbers. If you have a better system to measure skill, sell it to Blizzard and all the others out there. You will get rich, I promise!
I was sceptical about publishing it and I can't say I'm surprised by the reactions, but many people here should really sit back and think a little before they post. If you did not understand the basics, perhaps it's not a good idea to criticise points, because most likely you don't know what you are talking about ...
Sorry, but I had to say that.
Now to some critical posts that have specific points:
On July 11 2012 10:15 lazyitachi wrote: Not sure what type of stats this overgeneralisation is.
1) Can you show the breakdown by race for some MMR intervals, to know the distribution of the data used to derive these results?
2) Is it possible to further separate into submitter vs opponent data (I believe the MMR is calculated for both players in a replay?) for the breakdown in (1)?
3) If you can identify the submitter, then should we look at w/l ratio? At least to be certain that they submit a representative sample of their games and not ONLY their wins?
4) What does deviation mean?? Does it just mean the deviation of MMR of the players within the replay? Seems to be some distortion going on just by looking at random, either through submission bias or imbalanced distribution of "MMR".
1) There is a data file in the OP, but it had some bugs. I will create a new one with more data.
2) It is.
3) Quantity doesn't matter, only quality. One good game is enough to calculate MMR. I also explained why problems in the calculation don't affect the result.
4) I create random groups to prove that the data is not biased towards anything other than race, and to measure the % chance that my results are wrong.
On July 11 2012 09:34 lolcanoe wrote: I'm confused here - why are there 4 random groups? Why not just lump them together? Where the hell are you getting these numbers from? I'm assuming this 99.3 comes from some application of the normal distribution - have you thoroughly established normality in MMR? Based on my understanding of the MMR system, you would not expect normality... As a stats junky, I'm very suspicious of the statistics here.
4 because I had 4 races (with random); 2 runs had only 3. It doesn't have to be this number, it is only there to get close to the number of accounts per race. MMR is normal by definition (if nothing goes wrong).
Whether this data is normal I cannot tell; I did not test it yet.
It's just the % chance that random data would create such a result, depending on the tests. I'm not 100% sure if you can do it this way, but I don't see why not. I'm no statistics guru in any way. I am not terrible at it, but it's many years ago that I worked with it.
Basically I just produce random data to show that the data is not biased and how "stable" it is. I think there are better ways to show this, but this is the fastest with a reasonable result.
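The random-group idea described above resembles what statisticians call a permutation test. The sketch below is one standard way to do it, not necessarily the procedure the program actually runs. The function name `random_group_p_value` and all MMR values are invented for illustration.

```python
import random
from statistics import mean


def random_group_p_value(group_a, group_b, runs=2000, seed=0):
    """Share of random regroupings whose mean gap is at least the observed one."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    size_a = len(group_a)
    hits = 0
    for _ in range(runs):
        rng.shuffle(pooled)  # forget the race labels
        gap = abs(mean(pooled[:size_a]) - mean(pooled[size_a:]))
        if gap >= observed:
            hits += 1
    return hits / runs


# Invented MMR values, for illustration only (not the thread's data).
terran = [1400, 1410, 1390, 1420, 1405, 1395, 1415, 1385]
zerg = [1600, 1610, 1590, 1620, 1605, 1595, 1615, 1585]
p_value = random_group_p_value(terran, zerg)
print(f"chance that random groups differ this much: {p_value:.4f}")
```

A small value means that randomly assigned groups almost never differ as much as the real race groups do. Unlike a test built on the normal distribution, this does not require the MMR values to be normally distributed.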
On July 11 2012 10:48 cskalias.pbe wrote: (avg: 3 victorys behind) OP: Could you explain this please? If I played a 100 games as Terran v Zerg with same MMR, about what would be my win%?
You get 16 MMR for a win on average. I just pointed that out to put the number in context.
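As a worked example of that conversion, an MMR gap can be divided by the average gain per win. This is only a sketch: `wins_equivalent` is a made-up helper, and the -45.24 figure is the Terran value for July 2012 (no MMR filter) listed in the opening post, not necessarily the number behind the "3 victories" remark.

```python
MMR_PER_WIN = 16.0  # average MMR change per ladder win, as stated in the opening post


def wins_equivalent(mmr_gap, mmr_per_win=MMR_PER_WIN):
    """Express an MMR gap as a number of average ladder wins."""
    return mmr_gap / mmr_per_win


# Terran, July 2012, no MMR filter (value taken from the opening post).
print(f"{wins_equivalent(-45.24):.1f} wins")
```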
For every run my computer calculates for 3 hours, and it will take much longer with more data. I have to optimise the data structure sooner or later. But it's not easy to just change a little part and post the result...
When I have more data I will put more effort into outputting a good source file, so people who want to can calculate it themselves. At the moment the whole calculation is in memory and it's not that easy to output between steps.
|
On July 11 2012 02:01 Malaz wrote: Ok time for me as a random player to start crying about balance. Buff random already Blizzard!!
Yes, buff its chance to play each race: 80% Protoss, 19% Zerg, 1% Terran.
|
On July 11 2012 08:31 BluemoonSC wrote: you had me until you claimed that MMR has something to do with balance - MMR is an indication of skill, not balance.
i could cheese my way to a better MMR, but does that mean that the game is balanced? uncertain.
i could use one build to get a better MMR, but does that mean that the game is balanced? uncertain.
you take the average MMR of players across gold-GM (if i read correctly), yet somehow this is an indication of balance? that is absurd. balance can only be determined under ideal conditions at the highest level of play and even then, humans are all capable of mistakes.
these are just a few of the problems i have with anyone using MMR (calculated skill) as a means of attempting to prove balance. watch some high level gameplay if you want to figure out whether or not the game is balanced bc, in my opinion, this study is completely irrelevant.
On July 11 2012 08:52 canikizu wrote: You and the likes are outliers, which should represent an insignificant minority that will barely affect the overall number of accounts. The majority of people will play the game as it is. Yes, some Terrans might only use ladder to practice builds, but so do some Zergs and some Protoss. In the bigger picture, where we have a lot of data, those small spikes should be smoothed out and balance out.
On July 11 2012 08:57 BluemoonSC wrote: you can throw as much math as you want at the problem, it doesn't change the fact that you're looking at data that is dependent on player skill and doesn't look at what actually happens inside each individual game. MMR only looks at wins and losses.
On July 11 2012 10:54 sevencck wrote: In a perfectly balanced game, if you averaged MMR data by race, you would expect the average MMR of each race to be about the same, allowing for small random variation. As I understand it, OP's MMR data shows that the game is on average basically rating Terrans as being less skilled than Protoss or Zerg, and his results show this difference to be much greater than what would be expected from random variation. Assuming that in reality all three races are equally skilled (a reasonable assumption), how do you rationalize this MMR disparity? It suggests that Terran might indeed be UP, or at the very least is really struggling due to the metagame? Since any imbalance gets inherently factored into the MMR, by extension we can infer certain things. Put simply, if all three races are equally skilled in reality (an assumption), and the game is on average rating Terrans as less skilled than Zergs (through MMR), then it implies that on average a Terran at equal MMR to a Zerg is in reality at least equally skilled, and quite probably more skilled than that Zerg.
Thus, hypothetically, if this "averaged" T and Z at equal MMR ratings trade wins at a 50% rate, it in no way demonstrates that the game is balanced, despite a 50% winrate at equal MMR, since the imbalance is already factored into the MMR. I believe OP's point is that the way they're gauging "balance" may inherently be flawed. Not really completely sure about any of it though, I'm still thinking about it all.
On July 11 2012 10:59 Buddhist wrote: There's no reason to believe that the races have equally skilled players on average at any level of play.
Then there was no reason to believe Terran needed to be nerfed a year ago based on the GSL results, Terran might well have had the more skilled players. Basically, there is a very good reason to believe that the races have equally skilled players on average. It's the basis for any attempt to balance the game.
|
On July 11 2012 09:19 sCCrooked wrote: Where the hell are you guys getting the idea that this is somehow a formula for balance? Its a simple tool that was used for calculating the hidden MMR trait that we've been trying to find out how to see for years.
He's just releasing the info, not saying anything about balance. Talk about reading into what's not there... its got to be at least half the moronic posts already in this thread.
On July 11 2012 09:20 BluemoonSC wrote: then he shouldnt use words like "balance" "underpowered" and "overpowered" if he's not talking about balance ;;
On July 11 2012 10:25 Plansix wrote: Yes, the words underpowered and balance imply things that the OP is not addressing. You cannot take into account a player like "BadHabit" who made it into GM by only six pooling (on a bet, but he still did it). There are too many other factors that affect MMR to say anything beyond "players of X skill level with Y race seem to see a dip in MMR at Z point". But this data does not prove that terran is harder than protoss or zerg, like some posters are saying.
Sure it does. Players like your example, BadHabit, are an incredibly small part of the overall sample. The vast majority of players do not have a one-trick pony that would normally remove race balance from the game.
I don't see how people can look at such a staggering set of data like this and still draw the conclusion that it doesn't prove anything. I guess people can stare at overwhelming evidence and still disagree with it because it contrasts with their own personal opinion.
I'm a Protoss player, and I would argue Terran is noticeably weaker than the other 2 races. In tournaments Zerg seems to be doing the best, but in my opinion PvZ slightly favors Protoss.
|
What a pointless thread, you call Terran underpowered just because they're higher populated in bronze. People pick Terran when they start the game because it's the standard race, the race from the campaign. WTF does that have with balance to do?
|
On July 11 2012 11:15 ErAsc2 wrote: What a pointless thread, you call Terran underpowered just because they're higher populated in bronze. People pick Terran when they start the game because it's the standard race, the race from the campaign. WTF does that have with balance to do?
i gree protoss players very smart players pick protoss because they already skilled thats why terran only bronce, if you look bitbybit win many game when lurker under land.
|
I would just like people to keep in mind that when Blizzard talks about statistical balance analysis they do account for skill. I don't know how they do that, but when they say 50/50 win rates etc., according to their statements they do adjust for skill, something that OP has not done.
Nevertheless, good job OP very nice of you to gather this data for us.
|
On July 11 2012 11:15 ErAsc2 wrote: What a pointless thread, you call Terran underpowered just because they're higher populated in bronze. People pick Terran when they start the game because it's the standard race, the race from the campaign. WTF does that have with balance to do?
I just don't understand people like you. You come into this thread with no idea what it is about. You obviously did not read the OP, or if you did, you did not understand a single word, but still you think you have to post how pointless it is...
On July 11 2012 11:16 BuddhaMonk wrote: I would just like people to keep in mind that when Blizzard talks about statistical balance analysis they do account for skill. I don't know how they do that, but when they say 50/50 win rates etc., according to their statements they do adjust for skill, something that OP has not done.
Nevertheless, good job OP very nice of you to gather this data for us.
I account only for skill, not for win ratio! Blizzard does the same. The point is that the guys who give the interviews are not the guys who calculate it, and most of the time they just repeat the few words they can remember from the last meeting.
|
On July 11 2012 11:15 ErAsc2 wrote: What a pointless thread, you call Terran underpowered just because they're higher populated in bronze. People pick Terran when they start the game because it's the standard race, the race from the campaign. WTF does that have with balance to do?
On July 11 2012 11:17 skeldark wrote: I just dont understand people like you. You come in this thread have no idea what it is about. obvious did not read the op and if did not understand a single word but still you think you have to post how pointless it is....
Meh, don't worry about posts like that. This thread is above my head, but I've enjoyed thinking about it all. I'm sure lots of others have as well. I'm glad you posted it.
|
oh is this not master league only, i was worried for a sec
|
On July 11 2012 11:15 ErAsc2 wrote: What a pointless thread, you call Terran underpowered just because they're higher populated in bronze. People pick Terran when they start the game because it's the standard race, the race from the campaign. WTF does that have with balance to do?
On July 11 2012 11:17 skeldark wrote: I just dont understand people like you. You come in this thread have no idea what it is about. obvious did not read the op and if did not understand a single word but still you think you have to post how pointless it is....
On July 11 2012 11:16 BuddhaMonk wrote: I would just like people to keep in mind that when Blizzard talks about statistical balance analysis they do account for skill. I don't know how they do that, but when they say 50/50 win rates etc., according to their statements they do adjust for skill, something that OP has not done.
Nevertheless, good job OP very nice of you to gather this data for us.
On July 11 2012 11:17 skeldark wrote: I account only for skill not for winratio! blizzard does the same. The point is the guys that give the interviews are not the guys who calculate it and most of the time just repeat the few words they can remember from last meeting.
How are you adjusting for skill? MMR/Average MMR is not what Blizzard is talking about when they say adjusting for skill.
|
On July 11 2012 11:15 ErAsc2 wrote: What a pointless thread, you call Terran underpowered just because they're higher populated in bronze. People pick Terran when they start the game because it's the standard race, the race from the campaign. WTF does that have with balance to do?
On July 11 2012 11:17 skeldark wrote: I just dont understand people like you. You come in this thread have no idea what it is about. obvious did not read the op and if did not understand a single word but still you think you have to post how pointless it is....
On July 11 2012 11:16 BuddhaMonk wrote: I would just like people to keep in mind that when Blizzard talks about statistical balance analysis they do account for skill. I don't know how they do that, but when they say 50/50 win rates etc., according to their statements they do adjust for skill, something that OP has not done.
Nevertheless, good job OP very nice of you to gather this data for us.
On July 11 2012 11:17 skeldark wrote: I account only for skill not for winratio! blizzard does the same. The point is the guys that give the interviews are not the guys who calculate it and most of the time just repeat the few words they can remember from last meeting.
On July 11 2012 11:24 BuddhaMonk wrote: How are you adjusting for skill? MMR/Average MMR is not what Blizzard is talking about when they say adjusting for skill.
Yeah i know they do it with white rabbit blood. oO I use the blizzard skill calculation system. You perhaps know it under the name MMR...
On July 11 2012 11:15 ErAsc2 wrote: What a pointless thread, you call Terran underpowered just because they're higher populated in bronze. People pick Terran when they start the game because it's the standard race, the race from the campaign. WTF does that have with balance to do?
On July 11 2012 11:17 skeldark wrote: I just dont understand people like you. You come in this thread have no idea what it is about. obvious did not read the op and if did not understand a single word but still you think you have to post how pointless it is....
On July 11 2012 11:20 sevencck wrote: Meh, don't worry about posts like that. This thread is above my head, but I've enjoyed thinking about it all. I'm sure lots of others have as well. I'm glad you posted it.
cant... cant hold it. The totalbiscuit in me wants out
|
On July 11 2012 11:15 ErAsc2 wrote: What a pointless thread, you call Terran underpowered just because they're higher populated in bronze. People pick Terran when they start the game because it's the standard race, the race from the campaign. WTF does that have with balance to do?
On July 11 2012 11:17 skeldark wrote: I just dont understand people like you. You come in this thread have no idea what it is about. obvious did not read the op and if did not understand a single word but still you think you have to post how pointless it is....
On July 11 2012 11:16 BuddhaMonk wrote: I would just like people to keep in mind that when Blizzard talks about statistical balance analysis they do account for skill. I don't know how they do that, but when they say 50/50 win rates etc., according to their statements they do adjust for skill, something that OP has not done.
Nevertheless, good job OP very nice of you to gather this data for us.
On July 11 2012 11:17 skeldark wrote: I account only for skill not for winratio! blizzard does the same. The point is the guys that give the interviews are not the guys who calculate it and most of the time just repeat the few words they can remember from last meeting.
On July 11 2012 11:24 BuddhaMonk wrote: How are you adjusting for skill? MMR/Average MMR is not what Blizzard is talking about when they say adjusting for skill.
What is MMR other than (relative) skill?
|
On July 11 2012 11:15 ErAsc2 wrote: What a pointless thread, you call Terran underpowered just because they're higher populated in bronze. People pick Terran when they start the game because it's the standard race, the race from the campaign. WTF does that have with balance to do?
On July 11 2012 11:17 skeldark wrote: I just dont understand people like you. You come in this thread have no idea what it is about. obvious did not read the op and if did not understand a single word but still you think you have to post how pointless it is....
On July 11 2012 11:16 BuddhaMonk wrote: I would just like people to keep in mind that when Blizzard talks about statistical balance analysis they do account for skill. I don't know how they do that, but when they say 50/50 win rates etc., according to their statements they do adjust for skill, something that OP has not done.
Nevertheless, good job OP very nice of you to gather this data for us.
On July 11 2012 11:17 skeldark wrote: I account only for skill not for winratio! blizzard does the same. The point is the guys that give the interviews are not the guys who calculate it and most of the time just repeat the few words they can remember from last meeting.
On July 11 2012 11:24 BuddhaMonk wrote: How are you adjusting for skill? MMR/Average MMR is not what Blizzard is talking about when they say adjusting for skill.
On July 11 2012 11:28 skeldark wrote: Yeah i know they do it with white rabbit blood. oO I use the blizzard skill calculation system. You perhaps know it under the name MMR...
On July 11 2012 11:15 ErAsc2 wrote: What a pointless thread, you call Terran underpowered just because they're higher populated in bronze. People pick Terran when they start the game because it's the standard race, the race from the campaign. WTF does that have with balance to do?
On July 11 2012 11:17 skeldark wrote: I just dont understand people like you. You come in this thread have no idea what it is about. obvious did not read the op and if did not understand a single word but still you think you have to post how pointless it is....
On July 11 2012 11:20 sevencck wrote: Meh, don't worry about posts like that. This thread is above my head, but I've enjoyed thinking about it all. I'm sure lots of others have as well. I'm glad you posted it.
On July 11 2012 11:28 skeldark wrote: cant... cant hold it. The totalbiscuit in me wants out
Blizzard has stated that they do additional adjustments for skill beyond MMR. Is it really so inconceivable that Blizzard would do that? They do have access to way more data than you after all...
|
On July 11 2012 11:15 ErAsc2 wrote: What a pointless thread, you call Terran underpowered just because they're higher populated in bronze. People pick Terran when they start the game because it's the standard race, the race from the campaign. WTF does that have with balance to do?
On July 11 2012 11:17 skeldark wrote: I just dont understand people like you. You come in this thread have no idea what it is about. obvious did not read the op and if did not understand a single word but still you think you have to post how pointless it is....
On July 11 2012 11:16 BuddhaMonk wrote: I would just like people to keep in mind that when Blizzard talks about statistical balance analysis they do account for skill. I don't know how they do that, but when they say 50/50 win rates etc., according to their statements they do adjust for skill, something that OP has not done.
Nevertheless, good job OP very nice of you to gather this data for us.
On July 11 2012 11:17 skeldark wrote: I account only for skill not for winratio! blizzard does the same. The point is the guys that give the interviews are not the guys who calculate it and most of the time just repeat the few words they can remember from last meeting.
On July 11 2012 11:24 BuddhaMonk wrote: How are you adjusting for skill? MMR/Average MMR is not what Blizzard is talking about when they say adjusting for skill.
On July 11 2012 11:31 Jimmeh wrote: What is MMR other than (relative) skill?
It's one measure of relative skill, not the only one. I'm only telling you what Blizzard has stated in the past.
|
Wow impressive amount of work done. I was so sick and tired of hearing how most races are 50% win rate knowing that is completely meaningless. MMR actually means something.
It's funny to see some zergs come in here and yell at you with nonsensical crap
|
I'm really not understanding how the OP draws his conclusions.
Is he comparing the win rates of races where players have different MMRs? As in, Zerg is overpowered because players with lower MMRs are beating players with higher MMRs? If so, the conclusions are laughably overreaching. Despite all the esteem given to MMR, it's a terrible indicator of skill because it's based on win rates and averaged across the race. To put it concisely: balance dictates win rates, which dictate MMR, which the OP is using to determine balance. It's totally circular.
Also, certain races are just plain easier to win with using lower skill. Some races rely more on luck. How many Protoss wins can be attributed to a lucky DT timing? How many TvZs have been won by getting one medivac in the right place at the right time? It's widely accepted that Protoss is the easiest race to play and Zerg is the hardest. How does that factor into the OP's findings? Naniwa, for one, has said that the immortal sentry PvZ all-in is far easier to execute than it is to stop (though I think this description could be applied to most Protoss attacks, and to attacking in general, which helps Protoss the most since they have the strongest attacks).
It's also easier to cheese with certain races, and, assuming that a cheese win is a non-skill based win, that would give Protosses another undeserved boost in win rates, since they are doubtless the biggest cheesers. The OP treats all wins as equally legitimate, when many are clearly bullshit. I play Terrans on the ladder all the time who refuse to guard against a 6 pool, saying they'd rather lose. They go for a super greedy opening that plain straight up loses to a potential counter build. Others refuse to guard against DT openings. How are those games legitimate? These players will never be able to win against the same opponent twice!
I also totally reject the notion that each race receives an equal degree of skilled and unskilled players. Heck, just comparing the Korean to the foreigner Terrans one can see a readily apparent skill gap, one that isn't there with Protoss and Zerg.
Even then, most newcomers gravitate to Terran or Protoss (because of the campaign/because of the instant easiness). I have more than one friend who has abandoned SC2 entirely because Zerg was just too difficult to play.
Last, Zerg recently received a fairly significant buff, which means that, if the buff did what it was supposed to do, Zergs SHOULD be winning over higher MMR opponents right now. That was the point of the buff! To move Zergs up the ladder and give them higher tournament representation! In other words, something would be wrong if Zergs WEREN'T winning more! Did the OP take this into account? Did he calculate the win rates before and after the patch separately?
These are the issues I have with the OP's method.
Edit: I would also love to see how this relates to the maps. Many of the maps in the pool have severe balance issues, which always affect Zerg most heavily. But those maps are being slowly weeded out and as more balanced maps enter the pool we see Zergs winning more. Most recently Korhal Compound and Metalopolis were removed (both of which were terrible for Zerg if they spawned close positions on Metalopolis). Every season the map changes have been a subtle buff to Zerg. How do the recent map changes affect the OP's findings?
|
On July 11 2012 11:35 Iron_ wrote: Wow impressive amount of work done. I was so sick and tired of hearing how most races are 50% win rate knowing that is completely meaningless. MMR actually means something.
It's funny to see some zergs come in here and yell at you with nonsensical crap
We can't let the other races think that we are too strong, so there is no other choice left for us but to throw crap! And many TeamLiquid users help us disguise our crap among tons of theirs.
|
Hi, great post! I found this to be very informative and interesting, very well done.
I do have some questions about your methods though, and please forgive me if you have already addressed these.
If I have found the correct formulas you are using, you appear to be assuming an ELO rating system? I was under the impression, after listening to a speech by Josh Menke at UCI, that the MMR is actually determined using Gaussian Density Filtering? Is there a source that someone can point me to that clears this up? Regardless, your method should provide a decent approximation of MMR anyway, and ELO is certainly a valid ranking system in its own right.
You also seem to have ignored the possibility of confounding factors (again, just say so if I missed it).
EDIT: Also, you mentioned in a comment that you don't know if players are normally distributed, but doesn't ELO assume a normal distribution? Assuming a similar distribution, I don't think it would affect it too significantly though.
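For readers who have not seen the Elo formulas mentioned here, a minimal sketch follows. It uses the common logistic form of the Elo expectation rather than the normal-distribution form, and k = 32 is an assumption chosen only because it reproduces the +16/-16 swing for an even match quoted in the opening post. Nothing here is confirmed to be how Blizzard computes MMR.

```python
def expected_score(rating, opponent, scale=400.0):
    """Logistic Elo expectation: the predicted win share for `rating`."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / scale))


def updated_rating(rating, opponent, won, k=32.0):
    """Rating after one game; `k` (assumed) is the maximum swing per game."""
    result = 1.0 if won else 0.0
    return rating + k * (result - expected_score(rating, opponent))


print(updated_rating(1500.0, 1500.0, won=True))   # 1516.0
print(updated_rating(1500.0, 1500.0, won=False))  # 1484.0
print(round(expected_score(1600.0, 1500.0), 2))   # 0.64
```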
|
On July 11 2012 11:38 VediVeci wrote: Hi, great post! I found this to be very informative and interesting, very well done.
I do have some questions about your methods though, and please forgive me if you have already addressed these.
If I have found the correct formulas you are using, you appear to be assuming an ELO rating system? I was under the impression, after listening to a speech by Josh Menke at UCI, that the MMR is actually determined using Gaussian Density Filtering? Is there a source that someone can point me to that clears this up? Regardless, your method should provide a decent approximation of MMR anyway, and ELO is certainly a valid ranking system in its own right.
You also seem to have ignored the possibility of confounding factors (again, just say so if I missed it).
If you are interested in the MMR calculation: here you find a lot of information about how to calculate DMMR from ladder points (DMMR = MMR that has not been cleaned of its division yet): http://www.teamliquid.net/forum/viewmessage.php?topic_id=332391
Confounding factors: I tried to prove with the random runs that, if they exist, they are smaller than the "imbalance". It's most likely not the best way to do so, but it was the first and fastest way that came to my mind.
EDIT: Also, you mentioned in a comment that you don't know if players are normally distributed, but doesn't ELO assume a normal distribution? Assuming a similar distribution, I don't think it would affect it too significantly though.
Exactly. The player/MMR base is normal by definition. Have a look at my program:
![[image loading]](http://i.imgur.com/9Ag8n.jpg)
Whether this race data is normal too, I don't know, but I think so. Someone can test it.
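One way someone could run such a test is the Jarque-Bera statistic, which checks skewness and kurtosis against a normal distribution. This is a swapped-in textbook technique, not anything the program is known to do, and the p-value below relies on the large-sample chi-square approximation. The two samples are synthetic, generated only to show the behaviour.

```python
import math
import random


def jarque_bera(values):
    """Jarque-Bera statistic and its asymptotic p-value (chi-square, 2 dof)."""
    n = len(values)
    centre = sum(values) / n
    m2 = sum((v - centre) ** 2 for v in values) / n
    m3 = sum((v - centre) ** 3 for v in values) / n
    m4 = sum((v - centre) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    statistic = n / 6.0 * (skewness ** 2 + excess_kurtosis ** 2 / 4.0)
    p_value = math.exp(-statistic / 2.0)  # survival function of chi-square, 2 dof
    return statistic, p_value


rng = random.Random(1)
bell_shaped = [rng.gauss(1500.0, 300.0) for _ in range(5000)]  # synthetic MMRs
flat = [rng.uniform(0.0, 3000.0) for _ in range(5000)]         # synthetic MMRs
print("bell-shaped sample:", jarque_bera(bell_shaped))
print("flat sample:", jarque_bera(flat))
```

A very small p-value is evidence against normality; a large one only means the test found no clear departure from it.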
|
So...
On average, protoss have +20 MMR, zerg have +40 MMR and terran have -30 MMR. Which is, in fact, a very small difference (1 or 2 games).
Yet if you include lower leagues, then the terran get further behind. But it is claimed that many newcomers will auto-pick terran, thus pulling the whole race down MMR-wise.
Well... sigh. I wanted to very subtly use these data to legitimately cry about how hard it is to play terran (am gold...) and feel better about myself. Not this time apparently. Still, FU protoss and zerg mid-game AOE. FU with your T3 too.
|
you don't have to be a genius to figure out that Terran is really under powered.... keep nerfin terran blizzard, i haven't played a TvT in ages...
|
On July 11 2012 11:50 XenOsky- wrote: you don't have to be a genius to figure out that Terran is really under powered.... keep nerfin terran blizzard, i haven't played a TvT in ages...
Uh... this is a *woosh* if I ever saw one. Read the OP at least before you post.
|
The MMR deviation is very tiny compared to overall MMR (it's about a win with some bonus pool for terran being UP), and with an average of 2 wins per ladder account it means nothing.
|
On July 11 2012 11:55 shabinka wrote: The MMR deviation is very tiny compared to overall MMR (it's about a win with some bonus pool for terran being UP), and with an average of 2 wins per ladder account it means nothing.
Yeah... I guess one has to play Terran to actually feel the pain T__T
Or look at how many Terrans make it to the TSL qualifiers... well i'm sorry, I digress.
|
On July 11 2012 11:55 shabinka wrote: The MMR deviation is very tiny compared to overall MMR (it's about a win with some bonus pool for terran being UP), and with an average of 2 wins per ladder account it means nothing.
The deviation of the MMR is an independent error and evens itself out on average... Oh, and the MMR is cleaned of bonus pool!
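The claim that independent errors even out can be put in numbers with the standard error of a mean. The sketch assumes every account's MMR estimate is off by an independent error of the same typical size; the 100 MMR figure is invented, while 2840 is the above-master race account count listed in the opening post. It says nothing about systematic errors that hit one race more than another.

```python
import math


def error_of_average(per_account_error, accounts):
    """Standard error of a mean of `accounts` independent, equally noisy values."""
    return per_account_error / math.sqrt(accounts)


# 100 MMR of noise per account is an invented figure for illustration.
print(f"{error_of_average(100.0, 2840):.2f} MMR")
```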
|
On July 11 2012 01:34 skeldark wrote: The chance that -race independent- stronger players pick a specific race is near 0.
On July 11 2012 02:03 Mendelfist wrote: That's an unsupported statement. I don't know where I've heard it, but I'm pretty sure some Blizzard representative, maybe Josh, has explicitly stated that there is a preference for low level players to choose terran. Does it look the same if you exclude for example everyone below masters?
On July 11 2012 02:17 Chill wrote: Agreed. The whole basis for this project is defeated by one realistic (in my eyes) claim that is dismissed. Also, why do we care about average balance? If Zerg is easier than Terran from Bronze - Masters, does it really matter to the members of this forum?
On July 11 2012 02:18 Shiori wrote: The fact that low level players might have a preference to choose Terran is not the same as saying that low level talented players have a tendency to choose Terran. If it's just low level players in general choosing Terran, then the average will be sustained by the fact that more untalented and more talented players will be choosing Terran. So the OP's claim is correct, because he qualified it by saying "stronger" players aren't more likely to choose Terran in the sense that they're no more likely to choose it than weak players.
On July 11 2012 02:31 Chill wrote: We can't just dismiss it. Imagine, for whatever reason, that there is a strong bias for new players to automatically choose Terran. The remaining players try all the races and determine which of the three fit their styles, making them more likely to win. Because you can imagine a situation where Zerg and Protoss average win rates are higher than Terran it must be addressed.
On July 11 2012 02:39 Shiori wrote: That sounds incredibly unlikely in the sense that you're suggesting, though. When Blizzard says that more new players choose Terran, they don't say or even suggest that those people choose their race in a different way than the people who choose other races. It's entirely possible (and likely, I'd say) that most people pick Terran because they're the protagonists and because they're human beings.
The people who pick Zerg/Protoss at the noob level don't know enough about their "styles" to make a choice that really affects their ability to win, because it's not actually very clear from the start what the styles of the races even are. The people who pick P/Z are probably motivated by the same thing that motivates players to pick Terran: they think the race is cool. It just so happens that there are fewer of them because aliens are less appealing than humans. Besides, even if there were a strong bias for new players to automatically choose Terran, talented RTS players are talented RTS players, and weak RTS players are weak. I don't believe that players have an inbuilt magical bias to one race that influences them so much that they'd be incapable of playing the other races at a high level. If Terran is just more appealing to human beings in general, then it's going to attract all sorts with random distribution, meaning that the average Terran player isn't going to be any better than the average P/Z player because it's just a larger sample but is still evenly distributed. Until you can show that there's a race which has a baffling number of good players but almost no bad players, the point is moot.
This whole talent thing you're going on about is pretty silly to me. Talent means nothing until the absolute highest level of play. I'm talking S class level of play. Like Flash level of play. The idea of a "talented RTS player" vs an "untalented RTS player" is nonsense.
"Until the very top, in almost anything, all that matters is how much work you put in. The only problem is most people can't work hard, even at the things they do enjoy, much less the things they don't have a real passion for." -Greg 'IdrA' Fields
|
Skeldark is an awesome guy.
Whatever people say, good contribution nonetheless.
Also, I really feel like I have 80 IQ while reading your discussions about calculation, deviation etc. etc.
|
Balance gets better over time? Users make the most of the balance, not the game producer? Your race is weaker now but might be better tomorrow? Not with the constant buffing/nerfing, imo.
|
thanks for your contribution and effort.
|
This is the best, most awesome post on TL in recent times. I hate those people who come in with zero knowledge of statistics and start flaming, just like in other balance threads.
To prove something, the OP needs i) independent variables, ii) dependent variables, and iii) logical assumptions (KEYWORD).
I think skeldark has made enough logical assumptions and used the most suitable calculation method to reduce the number of dependent variables here. What he is left with is just the true skill (dependent variable) and the races (independent variable). Apart from some unknown quantities, which he assumed play only a minor role, the stats pretty much show the truth of the current state.
Every stat means something. The underpowered/overpowered 'argument' is just one of the possible explanations of his conclusion. Don't just look at the words 'underpowered' and 'overpowered' and start arguing. Understand the concept behind this study first.
Good work OP, I really appreciate the thinking and the time you spent on it.
|
On July 11 2012 11:37 _Search_ wrote: I'm really not understanding how the OP draws his conclusions.
Is he comparing the win rates of races where players have different MMRs? As in, Zerg is overpowered because players with lower MMRs are beating players with higher MMRs? If so, the conclusions are laughably overreaching. Despite all the esteem given to MMR, it's a terrible indicator of skill because it's based on win rates and averaged across the race. To put it concisely: balance dictates win rates, which dictate MMR, which the OP is using to determine balance. It's totally circular.
Also, certain races are just plain easier to win with using lower skill. Some races rely more on luck. How many Protoss wins can be attributed to a lucky DT timing? How many TvZs have been won by getting one medivac in the right place at the right time? Its widely accepted that Protoss is the easiest race to play and Zerg is the hardest. How does that factor into the OPs findings? Naniwa, for one, has said that the immortal sentry PvZ allin is far easier to execute than it is to stop, (though I think this description could be applied to most Protoss attacks, and to attacking in general, which helps Protoss the most since they have the strongest attacks).
It's also easier to cheese with certain races, and, assuming that a cheese win is a non-skill based win, that would give Protosses another undeserved boost in win rates, since they are doubtless the biggest cheesers. The OP treats all wins as equally legitimate, when many are clearly bullshit. I play Terrans on the ladder all the time who refuse to guard against a 6 pool, saying they'd rather lose. They go for a super greedy opening that plain straight up loses to a potential counter build. Others refuse to guard against DT openings. How are those games legitimate? These players will never be able to win against the same opponent twice!
I also totally reject the notion that each race receives an equal degree of skilled and unskilled players. Heck, just comparing the Korean to the foreigner Terrans one can see a readily apparent skill gap, one that isn't there with Protoss and Zerg.
Even then, most newcomers gravitate to Terran or Protoss (because of the campaign/because of the instant easiness). I have more than one friend who has abandoned SC2 entirely because Zerg was just too difficult to play.
Last, Zerg recently received a fairly significant buff, which means that, if the buff did what it was supposed to do, Zergs SHOULD be winning over higher MMR opponents right now. That was the point of the buff! To move Zergs up the ladder and give them higher tournament representation! In other words, something would be wrong if Zergs WEREN'T winning more! Did the OP take this into account? Did he calculate the win rates before and after the patch separately?
These are the issues I have with the OP's method.
Edit: I would also love to see how this relates to the maps. Many of the maps in the pool have severe balance issues, which always affect Zerg most heavily. But those maps are being slowly weeded out and as more balanced maps enter the pool we see Zergs winning more. Most recently Korhal Compound and Metalopolis were removed (both of which were terrible for Zerg if they spawned close positions on Metalopolis). Every season the map changes have been a subtle buff to Zerg. How do the recent map changes affect the OPs findings?
I'm sorry bro, but you could've saved a lot of time. You aren't even close to understanding how the OP came to his results, yet you post this wall of text.
Also, your rant about luck is pretty ridiculous tbh. If getting a medivac in the right position or DTs is luck, I guess we should all just roll the dice at the beginning of the game.
|
First: nice work on putting these together! Must have been a lot of work.
I'm fine with everything you do, up to the point where you go from average MMR to balance. As many others. You have, very neatly, shown that the average MMR is lower for terran than for zerg. No more, no less.
Why are there more terrans at lower MMR? I don't know. Because they are UP? Maybe. Because casual (bad) players are more likely to pick terran due to single player? Maybe. Because the good players switch away from terran as they perceive them as UP? Maybe. Because people switch race from terran as they get better? Maybe. Something else? Could be!
Some comments:
1) Your result is essentially the same as in the sc2ranks link you provide. I know that MMR is not exactly identical to league, but I think everyone here can agree that if there are more of a race at lower MMR, then that will very likely reflect in more of that race also being in lower leagues. And this is in fact what we see.
I even did a short calculation: look at the number of players for the three races, in gold and above (to compare to your second calculation). Assign a player in gold 0 points, platinum 1 point, diamond 2 points, masters 3 points and GM 4 points. This is some sort of toy rating, where each point corresponds to a league. I don't know exactly how the MMR is divided into leagues; is one league roughly 1000 MMR? If so, then each point would correspond to around 1000 MMR. GM works differently ofc, but with so few people in GM (in the sc2ranks sample), it shouldn't matter much. Take the average number of points for each race: Toss: 1.026, Terran: 1.023, Zerg: 1.047. Again, this shows that zerg is a bit above terran, and toss somewhere in between. If indeed a league corresponds to 1000 MMR (does it?), then the difference zerg-terran is 0.024 leagues = 24 MMR, which is consistent with your 30 +- 10. If a league corresponds to much more or less than 1000 MMR, enough to bring the 0.024 much outside the 30 +- 10 you have, there is a discrepancy. This could potentially be a matter of the different samples, as your sample is more weighted towards higher levels as I understand. So here the agreement in the value is not important, but rather the general trend that zerg is stronger than terran, and toss a bit undecided in between.
2) Random shows a huge signal. You are fine with going from terran has lower average MMR to terran being UP. By the same argument you would conclude that random is horribly underpowered. And you see on sc2ranks that there are a lot more randoms in the lower leagues (again, consistent with your results). This again is presented with the list of possible explanations above.
I think most agree that random is indeed a bit UP, in the sense that a player with a given time put into training would do the worst with random. However, I would guess that the strongest factor would be that high level players tend to switch away from random because it is UP. If 25% of the strong players would play random, I think the MMR signal would be much smaller. But this is my personal thought only, so nevermind.
Point being, this very strong signal maybe would open you to the possibility of other important factors than balance that can influence the average MMR. But well, nothing conclusive, just a little case study, don't take this point too seriously. 
3) Then I'm also a bit curious about the way you estimate the error. Why 4 groups? With more groups, you would get larger error, with fewer groups you would get a smaller one. Seems a bit arbitrary. Why not just calculate the standard deviation and calculate the error from that? You should have enough statistics to use the central limit theorem. Anyway, I think you would get similar values, I just got a bit curious. 
4) A better measure is what Blizzard does: look at win rates in different matchups, compensating for MMR difference. I don't think you have the information to do that in your program? This method ofc has its problems as well, and no matter what Blizzard says, I don't believe that they can tell if a race is OP, or if the better players just happen to play that race. And your very small difference in average MMR (consistent with the very small signal in sc2ranks) would probably only give a very small difference in win percentage, well within the 45% to 55% range Blizzard is aiming for. But that is a different story.
5) No offence meant. The original MMR calculation is a great program (gj!), and it's really cool that you find more uses for it! I just think that you got a bit carried away in the interpretation at a certain point. Also, I should mention that I don't want to claim anything about balance: I don't want to say that any race is or is not UP or OP. Cheers.
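The toy league rating from point 1 can be sketched in a few lines of Python. This is only an illustration: the function names and the example player counts are made up (they are not the sc2ranks numbers), and the conversion to MMR assumes every league is equally wide, which the thread itself questions.

```python
# Points per league as described in point 1: gold = 0 ... grandmaster = 4.
LEAGUE_POINTS = {"gold": 0, "platinum": 1, "diamond": 2, "master": 3, "grandmaster": 4}


def toy_league_rating(counts):
    """Average league points for one race, given {league: player_count}."""
    total_players = sum(counts.values())
    if total_players == 0:
        raise ValueError("no players in sample")
    total_points = sum(LEAGUE_POINTS[league] * n for league, n in counts.items())
    return total_points / total_players


def rating_gap_in_mmr(rating_a, rating_b, mmr_per_league):
    """Convert a gap in toy rating to MMR, ASSUMING equal-width leagues."""
    return (rating_a - rating_b) * mmr_per_league


# Hypothetical counts for one race, for illustration only.
example = {"gold": 500, "platinum": 300, "diamond": 150, "master": 45, "grandmaster": 5}
print(toy_league_rating(example))

# The zerg-terran gap quoted in the post, under the 1000 MMR per league guess.
print(rating_gap_in_mmr(1.047, 1.023, 1000))
```

If a league is closer to 500 MMR wide, the same 0.024 gap shrinks to about 12 MMR, so the conversion factor matters more than the toy rating itself.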
|
On July 11 2012 11:55 shabinka wrote: The MMR deviation is very tiny compared to overall MMR (it's about a win with some bonus pool for terran being UP), and with an average of 2 wins per ladder account it means nothing.
Statistics don't work that way. The T/Z MMR values DO mean something. P maybe, maybe not.
|
This is very very interesting. I'll wait until a higher confidence is achieved before I state anything about balance based on this data.
To the people talking about maps having an effect on the balance: they do, just not as much as you're giving them weight for, since you're matched up against the player first and then the map is picked.
|
On July 11 2012 12:57 Niazger wrote: On July 11 2012 11:37 _Search_ wrote: I'm really not understanding how the OP draws his conclusions.
Is he comparing the win rates of races where players have different MMRs? As in, Zerg is overpowered because players with lower MMRs are beating players with higher MMRs? If so, the conclusions are laughably overreaching. Despite all the esteem given to MMR, it's a terrible indicator of skill because it's based on win rates and averaged across the race. To put it concisely: balance dictates win rates, which dictate MMR, which the OP is using to determine balance. It's totally circular.
Also, certain races are just plain easier to win with using lower skill. Some races rely more on luck. How many Protoss wins can be attributed to a lucky DT timing? How many TvZs have been won by getting one medivac in the right place at the right time? Its widely accepted that Protoss is the easiest race to play and Zerg is the hardest. How does that factor into the OPs findings? Naniwa, for one, has said that the immortal sentry PvZ allin is far easier to execute than it is to stop, (though I think this description could be applied to most Protoss attacks, and to attacking in general, which helps Protoss the most since they have the strongest attacks).
It's also easier to cheese with certain races, and, assuming that a cheese win is a non-skill based win, that would give Protosses another undeserved boost in win rates, since they are doubtless the biggest cheesers. The OP treats all wins as equally legitimate, when many are clearly bullshit. I play Terrans on the ladder all the time who refuse to guard against a 6 pool, saying they'd rather lose. They go for a super greedy opening that plain straight up loses to a potential counter build. Others refuse to guard against DT openings. How are those games legitimate? These players will never be able to win against the same opponent twice!
I also totally reject the notion that each race receives an equal degree of skilled and unskilled players. Heck, just comparing the Korean to the foreigner Terrans one can see a readily apparent skill gap, one that isn't there with Protoss and Zerg.
Even then, most newcomers gravitate to Terran or Protoss (because of the campaign/because of the instant easiness). I have more than one friend who has abandoned SC2 entirely because Zerg was just too difficult to play.
Last, Zerg recently received a fairly significant buff, which means that, if the buff did what it was supposed to do, Zergs SHOULD be winning over higher MMR opponents right now. That was the point of the buff! To move Zergs up the ladder and give them higher tournament representation! In other words, something would be wrong if Zergs WEREN'T winning more! Did the OP take this into account? Did he calculate the win rates before and after the patch separately?
These are the issues I have with the OP's method.
Edit: I would also love to see how this relates to the maps. Many of the maps in the pool have severe balance issues, which always affect Zerg most heavily. But those maps are being slowly weeded out and as more balanced maps enter the pool we see Zergs winning more. Most recently Korhal Compound and Metalopolis were removed (both of which were terrible for Zerg if they spawned close positions on Metalopolis). Every season the map changes have been a subtle buff to Zerg. How do the recent map changes affect the OPs findings?
I'm sorry bro, but you could've saved a lot of time. You aren't even close to understanding how the OP came to his results, yet you post this wall of text.
Also, your rant about luck is pretty ridiculous tbh. If getting a medivac in the right position or DTs is luck, I guess we should all just roll the dice at the beginning of the game.
No u.
Rather than just saying, "you're wrong" why don't you say something that might show how I'm wrong?
And if you think there is no risk-to-reward skew in this game, then you've never faced a TvP where the Protoss hid a pylon in your base.
|
On July 11 2012 13:06 Cascade wrote: First: nice work on putting these together! Must have been a lot of job.
I'm fine with everything you do, up to the point where you go from average MMR to balance. As many others. You have, very neatly, shown that the average MMR is lower for terran than for zerg. No more, no less.
Why are there more terrans at lower MMR? I don't know. Because they are UP? Maybe. Because casual (bad) players are more likely to pick terran due to single player? Maybe. Because the good players switch away from terran as they perceive them as UP? Maybe. Because people switch race from terran as they get better? Maybe. Something else? Could be
True that. But this is balance! It's a question of how you define balance. Even if the problem is not in the unit design, it disrupts the balance of the races = imbalance. Perhaps I use the word too mathematically.
Some comments: 1) Your result is essentially the same as in the sc2ranks link you provide. I know that MMR is not exactly identical to league, but I think everyone here can agree that if there are more of a race at lower MMR, then that will very likely reflect in more of that race also being in lower leagues. And this is in fact what we see. I even did a short calculation: look at the number of players for the three races, in gold and above (to compare to your second calculation). Assign a player in gold 0 points, platinum 1 point, diamond 2 points, masters 3 points and GM 4 points. This is some sort of toy rating, where each point correspond to a league. I don't know exactly how the MMR are divided into leagues, is one league roughly 1000 MMR? If so, then each point would correspond to around 1000 MMR. GM works differently ofc, but with so few people in GM (in the sc2ranks sample), it shouldn't matter much. Take the average number of points for each race: Toss: 1.026 terran: 1.023 zerg: 1.047 Again, this shows that zerg is a bit above terran, and toss somewhere in between. If indeed a league corresponds to 1000 MMR (does it?), then the difference zerg-terran is 0.024 leagues = 24 MMR, which is consistent with your 30 +- 10. If a league corresponds to much more or less than 1000 MMR, enough to bring the 0.024 much outside the 30 +- 10 you have, there is a discrepancy. This could potentially be a matter of the different samples, as your sample is more weighted towards higher levels as I understand. So here the agreement in the value is not important, but rather the general trend that zerg is stronger than terran, and toss a bit undecided in between.
A league is not 1000 MMR. Not 100% (promotion offset != league offset), but close: [plot in spoiler]
The main point is valid. You can do it with leagues in general, but someone could come with the argument that all of race x are high in their league and all of race y are low, so that's why this way is more accurate. But overall it's the same, I agree.
2) Random shows a huge signal. You are fine with going from terran has lower average MMR to terran being UP. By the same argument you would conclude that random is horribly underpowered. And you see on sc2ranks that there are a lot more randoms in the lower leagues (again, consistent with your results). This again is presented with the list of possible explanations above. I think most agree that random is indeed a bit UP, in the sense that a player with a given time put into training would do the worst with random. However, I would guess that the strongest factor would be that high level players tend to switch away from random because they are UP. If 25% of the strong players would play random, I think the MMR signal would me much smaller. But this is my personal thought only, so nevermind. Point being, this very strong signal maybe would open you to the possibility of other important factors than balance that can influence the average MMR. But well, nothing conclusive, just a little case study, don't take this point too seriously.  Its like the first point more a question of definition of balance.
3) Then I'm also a bit curious about the way you estimate the error. Why 4 groups? With more groups, you would get larger error, with fewer groups you would get a smaller one. Seems a bit arbitrary. Why not just calculate the standard deviation and calculate the error from that? You should have enough statistics to use the central limit theorem. Anyway, I think you would get similar values, I just got a bit curious. 
4, because 4 races = near to the size of the race groups = near to the same data value before I take the average. This way is not optimal. I know that, and this is a valid criticism. Here are the reasons why I did not test for a standard distribution, normalise and calculate it: I was lazy ... and the random test data is calculated by my computer while I drink coffee ... My point is, I think the random test data shows the error %. It's a less exact way, but in the end I do the same. I will publish a better data file with more accounts. This whole thing is a side project of my MMR calculator.
4) A better measure is what blizzard does. Namely, look at win rates in different matchups, compensating for MMR difference. I don't think you have the information to do that in your program? This method ofc has it's problems as well, and no matter what blizzard says, I don't believe that they can tell if a race is OP, or if the better players just happen to play that race. And your very small difference in average MMR (consistent with the very small signal in sc2ranks) would probably only give a very small difference in win percentage. Well within the 45% to 55% range blizzard is aiming for. But that is a different story.
I have this data: the MMR of both players, the matchup and the result. And I agree that allowing +-5% leaves room for great imbalance.
5) No offence meant. The original MMR calculation is a great program (gj!), and it's really cool that you find more uses for it! I just think that you got a bit carried away in the interpretation at a certain point. Also, I should mention that I don't want to claim anything about balance: I don't want to say that any race is or is not UP or OP. Cheers.
No offence taken ^^. I appreciate your post. It's a nice break from explaining what an average does or what the difference between a dependent and an independent error is.
@ _Search_ the answer to your first question is: No. So I stopped reading there, because it was explained in the OP. I'm sorry, that's not your fault, but I get bored explaining the same thing over and over again. Perhaps I should write more text in the OP, but I thought it was pretty clear what I did. Also, if one race is easier to play and to win with, this is by definition imbalance.
|
I believe the team at Blizzard has already done something like this. Anyway, wow, so much effort put into this.
When Blizzard PR says the game is balanced, they obviously know it's not; in fact they have come to the same conclusion, that terran is weakest, then protoss, with zerg strongest. If you look at all the tourneys and qualifiers (OSL included), you see pretty much the exact same zerg dominance, but you can just imagine that Blizzard employees would say everything's fine to keep their jobs and for damage control. Unless you have done this work as part of your work portfolio... (statistical gurus are already employed at Blizz)
|
Unfortunately the conclusions being drawn from these statistics are completely unfounded as the assumptions made are too broad. To say that MMR deviation between races represents an accurate depiction of balance would have the precondition of every player being the same skill level as every other player at all points in time. It can just as easily (and in the same unfounded manner) be concluded that terran players are on average less skilled.
Some well performed data collection though, so kudos.
In order to draw any form of semi-accurate balance based conclusions from the data you would need to compare average variation in rate of change of each _individual_ player's (in relation to THEMSELVES) MMR on a timescale and then find trends that can be correlated to balance patches. Even then the results would not necessarily be a depiction of actual racial balance, rather a combination of both racial and meta-game balance.
An interesting study nonetheless, and from it we can conclude that... T MMR is on average slightly lower than P / Z in the gold-gm leagues. Good to know (I guess).
|
On July 11 2012 13:30 skeldark wrote: On July 11 2012 13:06 Cascade wrote: First: nice work on putting these together! Must have been a lot of job.
I'm fine with everything you do, up to the point where you go from average MMR to balance. As many others. You have, very neatly, shown that the average MMR is lower for terran than for zerg. No more, no less.
Why are there more terrans at lower MMR? I don't know. Because they are UP? Maybe. Because casual (bad) players are more likely to pick terran due to single player? Maybe. Because the good players switch away from terran as they perceive them as UP? Maybe. Because people switch race from terran as they get better? Maybe. Something else? Could be
True that. But this is balance! It's a question of how you define balance. Even if the problem is not in the unit design, it disrupts the balance of the races = imbalance. Perhaps I use the word too mathematically.
Yes, I think you confuse a lot of people if you let the word imbalance include effects such as single player leading casual players to pick terran. To define balance, I would use something like having an infinite number of equally talented players (whatever that means) train a certain (large) amount of time with one race each, and then let them play an infinite number of games. And I think most people would use similar definitions.
If you use the word "balance" in a very different way, I suggest you to be very clear with what you mean in the OP, or better, use a different word. 
Show nested quote +Some comments: 1) Your result is essentially the same as in the sc2ranks link you provide. I know that MMR is not exactly identical to league, but I think everyone here can agree that if there are more of a race at lower MMR, then that will very likely reflect in more of that race also being in lower leagues. And this is in fact what we see. I even did a short calculation: + Show Spoiler +Look at the number of players for the three races, in gold and above (to compare to your second calculation). Assign a player in gold 0 points, platinum 1 point, diamond 2 points, masters 3 points and GM 4 points. + Show Spoiler +This is some sort of toy rating, where each point correspond to a league. I don't know exactly how the MMR are divided into leagues, is one league roughly 1000 MMR? If so, then each point would correspond to around 1000MR. GM works differently ofc, but with so few people in GM (in the sc2ranks sample), it shouldn't matter much. Take the average number of points for each race: Toss: 1.026 terran: 1.023 zerg: 1.047 Again, this shows that zerg is a bit above terran, and toss somewhere in between. If indeed a league corresponds to 1000 MMR (does it?), then the difference zerg-terran is 0.024 leagues = 24 MMR, which is consistent with your 30 +- 10. If a league corresponds to much more or less than 1000MMR, enough to bring the 0.024 much outside the 30 +- 10 you have, there is a discrepancy. This could potentially be a matter of the different samples, as your sample is more weighted towards higher levels as I understand. So here the agreement in the value is not important, but rather the general trend that zerg is stronger than terran, and toss a bit undecided in between. a league is not 1000 MMR Not 100% (promotion offset != league offset ) but close : + Show Spoiler +Thee main point is valid. 
You can do it with leagues in general, but someone could come with the argument that all of race x are high in their league and all of race y are low, so that's why this way is more accurate. But overall it's the same, I agree.
Ok, so if a league is roughly 1000 MMR (as the error is about a third of the signal, we don't need to be more accurate than between 800 and 1200 I think), it means that the distribution of players in your calculation and the sc2ranks distribution both give the same result. And that, as you say, the distributions within the leagues don't do funky stuff. As expected, I guess, but nice to get confirmation from your more accurate method. Edit: oops, now I understand your plot. The lines are the leagues? So it is more like 500 points on average? And I shouldn't have used linearly increasing steps of points for the different leagues. Anyway, close enough I guess. Same ballpark.
Show nested quote +2) Random shows a huge signal. You are fine with going from terran has lower average MMR to terran being UP. By the same argument you would conclude that random is horribly underpowered. And you see on sc2ranks that there are a lot more randoms in the lower leagues (again, consistent with your results). This again is presented with the list of possible explanations above. I think most agree that random is indeed a bit UP, in the sense that a player with a given time put into training would do the worst with random. However, I would guess that the strongest factor would be that high level players tend to switch away from random because they are UP. If 25% of the strong players would play random, I think the MMR signal would me much smaller. But this is my personal thought only, so nevermind. Point being, this very strong signal maybe would open you to the possibility of other important factors than balance that can influence the average MMR. But well, nothing conclusive, just a little case study, don't take this point too seriously.  Its like the first point more a question of definition of balance. haha, yes, not really sure where I wanted to go with the randoms. 
3) Then I'm also a bit curious about the way you estimate the error. Why 4 groups? With more groups, you would get larger error, with fewer groups you would get a smaller one. Seems a bit arbitrary. Why not just calculate the standard deviation and calculate the error from that? You should have enough statistics to use the central limit theorem. Anyway, I think you would get similar values, I just got a bit curious.
4, because 4 races = near to the size of the race groups = near to the same data value before I take the average. This way is not optimal. I know that, and this is a valid criticism. Here are the reasons why I did not test for a standard distribution, normalise and calculate it: I was lazy ... and the random test data is calculated by my computer while I drink coffee ... My point is, I think the random test data shows the error %. It's a less exact way, but in the end I do the same. I will publish a better data file with more accounts. This whole thing is a side project of my MMR calculator.
Ok, I'd find it much easier to calculate the standard deviation than to program the split runs. Just take the average of the squared MMR as well, and the rest is a few lines of plus and minus. I guess you are a faster programmer than I am though. ^^
I agree that it is "good enough" despite maybe not being perfect.
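The shortcut of keeping the average of the squared MMR next to the plain average might look like this in Python. This is a minimal sketch, not the OP's program: the function name and the sample values are invented, and it assumes the MMR values in one race group are independent, so that the standard error of the mean is the standard deviation divided by sqrt(n).

```python
import math


def mean_and_standard_error(values):
    """One pass over the data, tracking only n, sum(x) and sum(x^2)."""
    n = 0
    total = 0.0
    total_sq = 0.0
    for v in values:
        n += 1
        total += v
        total_sq += v * v
    if n < 2:
        raise ValueError("need at least two values")
    mean = total / n
    # Population variance is E[x^2] - E[x]^2; n / (n - 1) makes it the sample variance.
    variance = (total_sq / n - mean * mean) * n / (n - 1)
    variance = max(variance, 0.0)  # guard against tiny negative rounding errors
    return mean, math.sqrt(variance / n)


# Hypothetical MMR values for one race group, for illustration only.
mean, err = mean_and_standard_error([1500, 1600, 1700, 1800])
print(mean, err)
```

The sum-of-squares form can lose precision when the spread is tiny compared to the mean; for MMR values in the low thousands that should not be a practical problem.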
4) A better measure is what blizzard does. Namely, look at win rates in different matchups, compensating for MMR difference. I don't think you have the information to do that in your program? This method ofc has it's problems as well, and no matter what blizzard says, I don't believe that they can tell if a race is OP, or if the better players just happen to play that race. And your very small difference in average MMR (consistent with the very small signal in sc2ranks) would probably only give a very small difference in win percentage. Well within the 45% to 55% range blizzard is aiming for. But that is a different story.
I have this data. mmr of both players the matchup and the result. And i agree that setting the +- 5% allow for great inbalance. Maybe that would be a better analysis, because then you could see if terrans at a certain MMR struggle the most in TvP or TvZ. TvT should be 50%, and TvZ + TvP (weighted by player frequency) should average to 50% as well (or they would not be at that rank). But it should be possible to see what of the other two races each race has the most problems with.
Let's see if you can reproduce blizzards result first. After that the sky is the limit!  Unless, ofc, you are lazy. 
Show nested quote +5) No offence meant. The original MMR calculation is a great program (gj!), and it's really cool that you find more uses for it! I just think that you got a bit carried away in the interpretation at a certain point.  Also, I should mention that I don't want to claim anything about balance. + Show Spoiler + I don't want to say that any race is or is not UP or OP.  Cheers. No offence taken ^^. I appreciate your post. Its a nice break from explaining what average does or what the diffrence between an depending and independent error is asking myself what they teach at school in some country's... Mmm, I hear you. I mean, I'm fine with people not knowing statistics. It's hard, and not everyone should be required to be an expert to post. I just wish sometimes that people were a bit more aware of what they do and don't know. Then again, I think I myself also sometimes post a bit too confidently in areas that I'm not an expert on, so I can't blame anyone really. But it does discourage this kind of posts, no doubt.
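Point 4 above suggests looking at win rates per matchup while compensating for MMR difference. A minimal sketch of that idea follows. The tuple layout `(race_a, mmr_a, race_b, mmr_b, a_won)` and the `max_mmr_gap` cutoff are assumptions made for illustration; simply dropping games between unevenly rated players is a crude stand-in for a real MMR correction and is not claimed to be what Blizzard does.

```python
from collections import defaultdict

def matchup_win_rates(games, max_mmr_gap=50):
    """Win rate per (race, opponent_race), using only evenly matched games.

    games: iterable of (race_a, mmr_a, race_b, mmr_b, a_won) tuples
           (hypothetical layout, not the format of the real datafile).
    max_mmr_gap: games with a larger MMR difference are skipped.
    """
    wins = defaultdict(int)
    total = defaultdict(int)
    for race_a, mmr_a, race_b, mmr_b, a_won in games:
        if abs(mmr_a - mmr_b) > max_mmr_gap:
            continue  # crude compensation: ignore lopsided pairings
        # Count the game once from each player's perspective.
        # For mirrors both keys coincide, so the rate is 50% by construction.
        total[(race_a, race_b)] += 1
        wins[(race_a, race_b)] += 1 if a_won else 0
        total[(race_b, race_a)] += 1
        wins[(race_b, race_a)] += 0 if a_won else 1
    return {pair: wins[pair] / total[pair] for pair in total}
```

Running this separately per MMR band would show which of the other two races each race struggles against at a given level.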
|
On July 11 2012 13:57 Cascade wrote:Show nested quote +On July 11 2012 13:30 skeldark wrote:On July 11 2012 13:06 Cascade wrote: First: nice work on putting these together! Must have been a lot of job.
I'm fine with everything you do, up to the point where you go from average MMR to balance. As many others. You have, very neatly, shown that the average MMR is lower for terran than for zerg. No more, no less.
Why are there more terrans at lower MMR? I don't know. Because they are UP? Maybe. Because casual (bad) players are more likely to pick terran due to single player? Maybe. Because the good players switch away from terran as they perceive them as UP? Maybe. Because people switch race from terran as they get better? Maybe. Something else? Could be
True that. But this is balance! its a question how you define balance. But even if the problem is not in the unit design it disrupt the balance of the races = inbalance. Perhaps i use the word to mathematical. YEs, I think you confuse a lot of people if you let the word imbalance include effects such that single player leading casual players to pick terran. To define balance, I would use something like having an infinite number of equally talented players (whatever that means) train a certain (large) amount of time with one race each, and then let them play an infinite number of games.  And I think most people would use similar definitions. If you use the word "balance" in a very different way, I suggest you to be very clear with what you mean in the OP, or better, use a different word.  Show nested quote +Some comments: 1) Your result is essentially the same as in the sc2ranks link you provide. I know that MMR is not exactly identical to league, but I think everyone here can agree that if there are more of a race at lower MMR, then that will very likely reflect in more of that race also being in lower leagues. And this is in fact what we see. I even did a short calculation: + Show Spoiler +Look at the number of players for the three races, in gold and above (to compare to your second calculation). Assign a player in gold 0 points, platinum 1 point, diamond 2 points, masters 3 points and GM 4 points. + Show Spoiler +This is some sort of toy rating, where each point correspond to a league. I don't know exactly how the MMR are divided into leagues, is one league roughly 1000 MMR? If so, then each point would correspond to around 1000MR. GM works differently ofc, but with so few people in GM (in the sc2ranks sample), it shouldn't matter much. Take the average number of points for each race: Toss: 1.026 terran: 1.023 zerg: 1.047 Again, this shows that zerg is a bit above terran, and toss somewhere in between. 
If indeed a league corresponds to 1000 MMR (does it?), then the difference zerg-terran is 0.024 leagues = 24 MMR, which is consistent with your 30 +- 10. If a league corresponds to much more or less than 1000MMR, enough to bring the 0.024 much outside the 30 +- 10 you have, there is a discrepancy. This could potentially be a matter of the different samples, as your sample is more weighted towards higher levels as I understand. So here the agreement in the value is not important, but rather the general trend that zerg is stronger than terran, and toss a bit undecided in between. a league is not 1000 MMR Not 100% (promotion offset != league offset ) but close : + Show Spoiler +Thee main point is valid. You can do it with leagues in generell but someone could come with the argument (all race x are high in the league all race y are low) so thats why this way is more accurate. But overall its the same i agree. ok, so if a league is roughly 1000MMR (as the error is about a third of the signal, we don't need to be more accurate than between 800 and 1200 I think), it means that the distribution of players in your calculation and the sc2ranks distribution both gives the same result. And that, as you say, the distributions within the leagues don't do funky stuff. I guess expect, but nice to get confirmation from your more accurate method. edit: oops, now I understand your plot. the lines are the leagues? So it is more like 500 points on average? And I shouldn't have used linearly increasing steps of points for the different leagues. Anyways, close enough I guess. Same ballpark.Show nested quote +2) Random shows a huge signal. You are fine with going from terran has lower average MMR to terran being UP. By the same argument you would conclude that random is horribly underpowered. And you see on sc2ranks that there are a lot more randoms in the lower leagues (again, consistent with your results). This again is presented with the list of possible explanations above. 
I think most agree that random is indeed a bit UP, in the sense that a player with a given time put into training would do the worst with random. However, I would guess that the strongest factor would be that high level players tend to switch away from random because they are UP. If 25% of the strong players would play random, I think the MMR signal would me much smaller. But this is my personal thought only, so nevermind. Point being, this very strong signal maybe would open you to the possibility of other important factors than balance that can influence the average MMR. But well, nothing conclusive, just a little case study, don't take this point too seriously.  Its like the first point more a question of definition of balance. haha, yes, not really sure where I wanted to go with the randoms.  Show nested quote +3) Then I'm also a bit curious about the way you estimate the error. Why 4 groups? With more groups, you would get larger error, with fewer groups you would get a smaller one. Seems a bit arbitrary. Why not just calculate the standard deviation and calculate the error from that? You should have enough statistics to use the central limit theorem. Anyway, I think you would get similar values, I just got a bit curious.  4 because 4 races = near to the size of the racegroups = near to the same datavalue before i take the average. This way is not optimal. I know that and this is a valid critic. Here are the reasons why i did not test on standart , normalise and calculated it : i was lazy ... and the random testdata is calculated by my computer with me drinking coffee meanwhile ... My point is i think the random testdata show the error %. Its a not so exact way but in the end i do the same. I will publish a better datafile with more accounts. This hole thing is a site project of my mmr calculator ok, I'd find it much easier to calculate standard deviation than programming the split runs. 
Just take the average of the squared MMR as well, and the rest is a few lines of plus and minus. I guess you are faster programmer than I am though. ^^ I agree that it is "good enough" despite maybe not being perfect. Show nested quote +
4) A better measure is what blizzard does. Namely, look at win rates in different matchups, compensating for MMR difference. I don't think you have the information to do that in your program? This method ofc has it's problems as well, and no matter what blizzard says, I don't believe that they can tell if a race is OP, or if the better players just happen to play that race. And your very small difference in average MMR (consistent with the very small signal in sc2ranks) would probably only give a very small difference in win percentage. Well within the 45% to 55% range blizzard is aiming for. But that is a different story.
I have this data. mmr of both players the matchup and the result. And i agree that setting the +- 5% allow for great inbalance. Maybe that would be a better analysis, because then you could see if terrans at a certain MMR struggle the most in TvP or TvZ. TvT should be 50%, and TvZ + TvP (weighted by player frequency) should average to 50% as well (or they would not be at that rank). But it should be possible to see what of the other two races each race has the most problems with. Let's see if you can reproduce blizzards result first. After that the sky is the limit!  Unless, ofc, you are lazy.  Show nested quote +5) No offence meant. The original MMR calculation is a great program (gj!), and it's really cool that you find more uses for it! I just think that you got a bit carried away in the interpretation at a certain point.  Also, I should mention that I don't want to claim anything about balance. + Show Spoiler + I don't want to say that any race is or is not UP or OP.  Cheers. No offence taken ^^. I appreciate your post. Its a nice break from explaining what average does or what the diffrence between an depending and independent error is asking myself what they teach at school in some country's... Mmm, I hear you. I mean, I'm fine with people not knowing statistics. It's hard, and not everyone should be required to be an expert to post. I just wish sometimes that people were a bit more aware of what they do and don't know. Then again, I think I myself also sometimes post a bit too confidently in areas that I'm not an expert on, so I can't blame anyone really. But it does discourage this kind of posts, no doubt.
- A league is nowhere near 1000 MMR; look at the picture. Gold to platinum is only 250 MMR.
- Yes, perhaps I should use a different word. But which one?
- So you can calculate this very fast? In that case, here is the latest datafile: skeletor.jimmeh.com/mmr/balance.csv
New results (after removing everyone under 1k):
Maxerror : 38.7191574666374
ERRORCOUNT :
41.54333333333333% in 5
72.81111111111112% in 10
89.8111111111111% in 15
97.02666666666667% in 20
99.36666666666667% in 25
99.88666666666667% in 30
Race...
T: -28.938886080105476
P 23.43063954261379
Z 0.36671387478577344
Analyse DONE
Zerg and Protoss switch roles! This happened in 2 runs as well, but every time Terran stays way under.
|
When people can choose from any color for their car, silver generally "wins" (i.e. silver is the most popular car color). Therefore the color silver is imba, because definitionally things that are popular and have a high "win" rate are imba.
When asked how it could be that any color could be imba, we say - people pick silver, so it is by definition imba! (Weren't you following the argument?)
--
More or less this whole thread is about redefining imbalance. Go through the OP, see how many assumptions are made.
|
MMR distribution by races. Click for full version.
![[image loading]](http://s17.postimage.org/n60jmstyz/image.jpg)
Number of players: 2014 Zerg, 1784 Protoss, 1516 Terran
The server does matter, as MMR is not comparable across servers. I've decided to remove KR and SEA and keep EU and NA, as they are closest to each other in terms of MMR, and that's where most of our data comes from.
|
On July 11 2012 14:29 redruMBunny wrote: When people can choose from any color for their car, silver generally "wins" (i.e. silver is the most popular car color). Therefore the color silver is imba, because definitionally things that are popular and have a high "win" rate are imba.
When asked how it could be that any color could be imba, we say - people pick silver, so it is by definition imba! (Weren't you following the argument?) More or less this whole thread is about redefining imbalance. Go through the OP, see how many assumptions are made.
The difference is that I don't look at which car is popular; I look at which car drives faster. Silver cars drive faster than black ones, so I come to the conclusion that the color affects the speed. And now you say: that's because more people drive silver. I would now ask: why do you think that more people choosing a car makes it faster? Can you back this assumption up, like I did with mine?
|
On July 11 2012 14:18 skeldark wrote:Show nested quote +On July 11 2012 13:57 Cascade wrote:On July 11 2012 13:30 skeldark wrote:On July 11 2012 13:06 Cascade wrote: First: nice work on putting these together! Must have been a lot of job.
I'm fine with everything you do, up to the point where you go from average MMR to balance. As many others. You have, very neatly, shown that the average MMR is lower for terran than for zerg. No more, no less.
Why are there more terrans at lower MMR? I don't know. Because they are UP? Maybe. Because casual (bad) players are more likely to pick terran due to single player? Maybe. Because the good players switch away from terran as they perceive them as UP? Maybe. Because people switch race from terran as they get better? Maybe. Something else? Could be
True that. But this is balance! its a question how you define balance. But even if the problem is not in the unit design it disrupt the balance of the races = inbalance. Perhaps i use the word to mathematical. YEs, I think you confuse a lot of people if you let the word imbalance include effects such that single player leading casual players to pick terran. To define balance, I would use something like having an infinite number of equally talented players (whatever that means) train a certain (large) amount of time with one race each, and then let them play an infinite number of games.  And I think most people would use similar definitions. If you use the word "balance" in a very different way, I suggest you to be very clear with what you mean in the OP, or better, use a different word.  Some comments: 1) Your result is essentially the same as in the sc2ranks link you provide. I know that MMR is not exactly identical to league, but I think everyone here can agree that if there are more of a race at lower MMR, then that will very likely reflect in more of that race also being in lower leagues. And this is in fact what we see. I even did a short calculation: + Show Spoiler +Look at the number of players for the three races, in gold and above (to compare to your second calculation). Assign a player in gold 0 points, platinum 1 point, diamond 2 points, masters 3 points and GM 4 points. + Show Spoiler +This is some sort of toy rating, where each point correspond to a league. I don't know exactly how the MMR are divided into leagues, is one league roughly 1000 MMR? If so, then each point would correspond to around 1000MR. GM works differently ofc, but with so few people in GM (in the sc2ranks sample), it shouldn't matter much. Take the average number of points for each race: Toss: 1.026 terran: 1.023 zerg: 1.047 Again, this shows that zerg is a bit above terran, and toss somewhere in between. 
If indeed a league corresponds to 1000 MMR (does it?), then the difference zerg-terran is 0.024 leagues = 24 MMR, which is consistent with your 30 +- 10. If a league corresponds to much more or less than 1000MMR, enough to bring the 0.024 much outside the 30 +- 10 you have, there is a discrepancy. This could potentially be a matter of the different samples, as your sample is more weighted towards higher levels as I understand. So here the agreement in the value is not important, but rather the general trend that zerg is stronger than terran, and toss a bit undecided in between. a league is not 1000 MMR Not 100% (promotion offset != league offset ) but close : + Show Spoiler +Thee main point is valid. You can do it with leagues in generell but someone could come with the argument (all race x are high in the league all race y are low) so thats why this way is more accurate. But overall its the same i agree. ok, so if a league is roughly 1000MMR (as the error is about a third of the signal, we don't need to be more accurate than between 800 and 1200 I think), it means that the distribution of players in your calculation and the sc2ranks distribution both gives the same result. And that, as you say, the distributions within the leagues don't do funky stuff. I guess expect, but nice to get confirmation from your more accurate method. edit: oops, now I understand your plot. the lines are the leagues? So it is more like 500 points on average? And I shouldn't have used linearly increasing steps of points for the different leagues. Anyways, close enough I guess. Same ballpark.2) Random shows a huge signal. You are fine with going from terran has lower average MMR to terran being UP. By the same argument you would conclude that random is horribly underpowered. And you see on sc2ranks that there are a lot more randoms in the lower leagues (again, consistent with your results). This again is presented with the list of possible explanations above. 
I think most agree that random is indeed a bit UP, in the sense that a player with a given time put into training would do the worst with random. However, I would guess that the strongest factor would be that high level players tend to switch away from random because they are UP. If 25% of the strong players would play random, I think the MMR signal would me much smaller. But this is my personal thought only, so nevermind. Point being, this very strong signal maybe would open you to the possibility of other important factors than balance that can influence the average MMR. But well, nothing conclusive, just a little case study, don't take this point too seriously.  Its like the first point more a question of definition of balance. haha, yes, not really sure where I wanted to go with the randoms.  3) Then I'm also a bit curious about the way you estimate the error. Why 4 groups? With more groups, you would get larger error, with fewer groups you would get a smaller one. Seems a bit arbitrary. Why not just calculate the standard deviation and calculate the error from that? You should have enough statistics to use the central limit theorem. Anyway, I think you would get similar values, I just got a bit curious.  4 because 4 races = near to the size of the racegroups = near to the same datavalue before i take the average. This way is not optimal. I know that and this is a valid critic. Here are the reasons why i did not test on standart , normalise and calculated it : i was lazy ... and the random testdata is calculated by my computer with me drinking coffee meanwhile ... My point is i think the random testdata show the error %. Its a not so exact way but in the end i do the same. I will publish a better datafile with more accounts. This hole thing is a site project of my mmr calculator ok, I'd find it much easier to calculate standard deviation than programming the split runs. 
Just take the average of the squared MMR as well, and the rest is a few lines of plus and minus. I guess you are faster programmer than I am though. ^^ I agree that it is "good enough" despite maybe not being perfect.
4) A better measure is what blizzard does. Namely, look at win rates in different matchups, compensating for MMR difference. I don't think you have the information to do that in your program? This method ofc has it's problems as well, and no matter what blizzard says, I don't believe that they can tell if a race is OP, or if the better players just happen to play that race. And your very small difference in average MMR (consistent with the very small signal in sc2ranks) would probably only give a very small difference in win percentage. Well within the 45% to 55% range blizzard is aiming for. But that is a different story.
I have this data. mmr of both players the matchup and the result. And i agree that setting the +- 5% allow for great inbalance. Maybe that would be a better analysis, because then you could see if terrans at a certain MMR struggle the most in TvP or TvZ. TvT should be 50%, and TvZ + TvP (weighted by player frequency) should average to 50% as well (or they would not be at that rank). But it should be possible to see what of the other two races each race has the most problems with. Let's see if you can reproduce blizzards result first. After that the sky is the limit!  Unless, ofc, you are lazy.  5) No offence meant. The original MMR calculation is a great program (gj!), and it's really cool that you find more uses for it! I just think that you got a bit carried away in the interpretation at a certain point.  Also, I should mention that I don't want to claim anything about balance. + Show Spoiler + I don't want to say that any race is or is not UP or OP.  Cheers. No offence taken ^^. I appreciate your post. Its a nice break from explaining what average does or what the diffrence between an depending and independent error is asking myself what they teach at school in some country's... Mmm, I hear you. I mean, I'm fine with people not knowing statistics. It's hard, and not everyone should be required to be an expert to post. I just wish sometimes that people were a bit more aware of what they do and don't know. Then again, I think I myself also sometimes post a bit too confidently in areas that I'm not an expert on, so I can't blame anyone really. But it does discourage this kind of posts, no doubt. - league is not near to 1000MMR look at the picture. gold to platinum is only 250 mmr -yes perhaps i should use a diffrent word . But witch one. - so you can calculate this very fast? 
in this case lastest datafile: skeletor.jimmeh.com/mmr/balance.csv New results are ( after removing everyone under 1k) Maxerror : 38.7191574666374 ERRORCOUNT : 41.54333333333333% in 5 72.81111111111112% in 10 89.8111111111111% in 15 97.02666666666667% in 20 99.36666666666667% in 25 99.88666666666667% in 30 Race... T: -28.938886080105476 P 23.43063954261379 Z 0.36671387478577344 Analyse DONE Zerg and Protoss switch role! halppend in 2 run also but everytime terran stay way under.
How about "uneven player distribution"? Or just write out "different MMR averages" each time; you don't use the B-word that many times.
Hmm, to calculate the errors I just need one run, but I need both averages: <MMR> and <MMR^2>, where <X> means the average value of X over all players. Then you calculate the variance
V = <MMR^2> - <MMR>^2
and the standard deviation as the square root of that:
S = sqrt(<MMR^2> - <MMR>^2)
Then you get the standard error by dividing by the square root of the number of samples (for that variable, i.e. race):
error = S/sqrt(N)
And here it is important that the samples are really independent. No duplicates etc. As there are some systematic effects (due to the same player sending in a lot of games, for example), N doesn't really represent the number of independent samples, so this will be an underestimate of the errors. Add an (arbitrary) factor 2 to the error (i.e., use N/4 in place of N) and you should be pretty safe.
Should be fast, but maybe you need to rerun to get the <MMR^2> for the different races. No need to run divided in groups or anything though.
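The recipe above fits in one pass over the data. Here is a minimal sketch, assuming the input is a list of `(race, mmr)` pairs with one entry per account; the function name and the `safety_factor` parameter are made up for illustration, and `safety_factor=2` corresponds to the arbitrary "use N/4" suggestion. It uses the plain <MMR^2> - <MMR>^2 variance from the post rather than the N-1 sample variance, which makes no practical difference at these sample sizes.

```python
import math
from collections import defaultdict

def standard_errors(samples, safety_factor=1.0):
    """Return {race: (mean_mmr, standard_error)} from (race, mmr) pairs.

    Accumulates N, sum(MMR) and sum(MMR^2) per race in a single pass.
    safety_factor=2.0 inflates the error as if only N/4 samples
    were independent (an arbitrary, conservative choice).
    """
    n = defaultdict(int)
    sum_x = defaultdict(float)
    sum_x2 = defaultdict(float)
    for race, mmr in samples:
        n[race] += 1
        sum_x[race] += mmr
        sum_x2[race] += mmr * mmr

    result = {}
    for race, count in n.items():
        mean = sum_x[race] / count
        # V = <MMR^2> - <MMR>^2, clamped at 0 against rounding noise
        variance = max(sum_x2[race] / count - mean * mean, 0.0)
        error = safety_factor * math.sqrt(variance) / math.sqrt(count)
        result[race] = (mean, error)
    return result
```

Two race averages can then be called different only if their gap is clearly larger than the combined errors.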
|
On July 11 2012 14:39 Not_That wrote: MMR distribution by races. Click for full version. ![[image loading]](http://s17.postimage.org/n60jmstyz/image.jpg) Amount of players: 2014 Zerg 1784 Protoss 1516 Terran The server does matter as MMR is non comparable cross servers. I've decided to remove KR and SEA and keep EU and NA as they are closest to each other in terms of MMRs, and that's where most of our data comes from.
Cool! Can you do 100 or even 200 MMR granularity to make it easier to read? :o) We are not trying to see any structure smaller than 200 MMR anyway.
|
On July 11 2012 14:53 Cascade wrote: On July 11 2012 14:39 Not_That wrote: MMR distribution by races. Click for full version. ![[image loading]](http://s17.postimage.org/n60jmstyz/image.jpg) Amount of players: 2014 Zerg 1784 Protoss 1516 Terran The server does matter as MMR is non comparable cross servers. I've decided to remove KR and SEA and keep EU and NA as they are closest to each other in terms of MMRs, and that's where most of our data comes from. Cool! Can you do 100 or even 200 granularity to make it easier to read? :o) We are not trying to see any structure smaller than 200 MMR anyway.
Oh, and can you normalize the plots as well? So that the bins read "% of players" or something instead. Makes it a lot easier to compare. Sorry. :o)
Now all you can see is that there are more zerg players.
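What is being asked for here, coarser bins plus per-race normalization, could look roughly like this. It is only a sketch: the `(race, mmr)` input format and the function name are assumptions, and the default bin width of 200 is taken from the request above.

```python
from collections import Counter, defaultdict

def normalized_histogram(players, bin_width=200):
    """Return {race: {bin_start: percent_of_that_race}}.

    players: iterable of (race, mmr) pairs (hypothetical layout).
    Each race's bins sum to 100%, so the curves can be compared
    even when one race simply has more players in the sample.
    """
    counts = defaultdict(Counter)
    for race, mmr in players:
        bin_start = int(mmr // bin_width) * bin_width
        counts[race][bin_start] += 1

    result = {}
    for race, bins in counts.items():
        total = sum(bins.values())
        result[race] = {b: 100.0 * c / total for b, c in sorted(bins.items())}
    return result
```

With the bins expressed as a share of each race, a shift of one race toward lower MMR shows up as a displaced curve instead of being hidden behind the raw player counts.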
|
On July 11 2012 14:51 Cascade wrote:Show nested quote +On July 11 2012 14:18 skeldark wrote:On July 11 2012 13:57 Cascade wrote:On July 11 2012 13:30 skeldark wrote:On July 11 2012 13:06 Cascade wrote: First: nice work on putting these together! Must have been a lot of job.
I'm fine with everything you do, up to the point where you go from average MMR to balance. As many others. You have, very neatly, shown that the average MMR is lower for terran than for zerg. No more, no less.
Why are there more terrans at lower MMR? I don't know. Because they are UP? Maybe. Because casual (bad) players are more likely to pick terran due to single player? Maybe. Because the good players switch away from terran as they perceive them as UP? Maybe. Because people switch race from terran as they get better? Maybe. Something else? Could be
True that. But this is balance! its a question how you define balance. But even if the problem is not in the unit design it disrupt the balance of the races = inbalance. Perhaps i use the word to mathematical. YEs, I think you confuse a lot of people if you let the word imbalance include effects such that single player leading casual players to pick terran. To define balance, I would use something like having an infinite number of equally talented players (whatever that means) train a certain (large) amount of time with one race each, and then let them play an infinite number of games.  And I think most people would use similar definitions. If you use the word "balance" in a very different way, I suggest you to be very clear with what you mean in the OP, or better, use a different word.  Some comments: 1) Your result is essentially the same as in the sc2ranks link you provide. I know that MMR is not exactly identical to league, but I think everyone here can agree that if there are more of a race at lower MMR, then that will very likely reflect in more of that race also being in lower leagues. And this is in fact what we see. I even did a short calculation: + Show Spoiler +Look at the number of players for the three races, in gold and above (to compare to your second calculation). Assign a player in gold 0 points, platinum 1 point, diamond 2 points, masters 3 points and GM 4 points. + Show Spoiler +This is some sort of toy rating, where each point correspond to a league. I don't know exactly how the MMR are divided into leagues, is one league roughly 1000 MMR? If so, then each point would correspond to around 1000MR. GM works differently ofc, but with so few people in GM (in the sc2ranks sample), it shouldn't matter much. Take the average number of points for each race: Toss: 1.026 terran: 1.023 zerg: 1.047 Again, this shows that zerg is a bit above terran, and toss somewhere in between. 
If indeed a league corresponds to 1000 MMR (does it?), then the difference zerg-terran is 0.024 leagues = 24 MMR, which is consistent with your 30 +- 10. If a league corresponds to much more or less than 1000MMR, enough to bring the 0.024 much outside the 30 +- 10 you have, there is a discrepancy. This could potentially be a matter of the different samples, as your sample is more weighted towards higher levels as I understand. So here the agreement in the value is not important, but rather the general trend that zerg is stronger than terran, and toss a bit undecided in between. a league is not 1000 MMR Not 100% (promotion offset != league offset ) but close : + Show Spoiler +Thee main point is valid. You can do it with leagues in generell but someone could come with the argument (all race x are high in the league all race y are low) so thats why this way is more accurate. But overall its the same i agree. ok, so if a league is roughly 1000MMR (as the error is about a third of the signal, we don't need to be more accurate than between 800 and 1200 I think), it means that the distribution of players in your calculation and the sc2ranks distribution both gives the same result. And that, as you say, the distributions within the leagues don't do funky stuff. I guess expect, but nice to get confirmation from your more accurate method. edit: oops, now I understand your plot. the lines are the leagues? So it is more like 500 points on average? And I shouldn't have used linearly increasing steps of points for the different leagues. Anyways, close enough I guess. Same ballpark.2) Random shows a huge signal. You are fine with going from terran has lower average MMR to terran being UP. By the same argument you would conclude that random is horribly underpowered. And you see on sc2ranks that there are a lot more randoms in the lower leagues (again, consistent with your results). This again is presented with the list of possible explanations above. 
I think most agree that random is indeed a bit UP, in the sense that a player with a given amount of time put into training would do the worst with random. However, I would guess that the strongest factor is that high level players tend to switch away from random because it is UP. If 25% of the strong players played random, I think the MMR signal would be much smaller. But this is my personal thought only, so never mind. Point being, this very strong signal might open you to the possibility that factors other than balance can influence the average MMR. But well, nothing conclusive, just a little case study, don't take this point too seriously.
Like the first point, it is more a question of the definition of balance.
haha, yes, not really sure where I wanted to go with the randoms.
3) Then I'm also a bit curious about the way you estimate the error. Why 4 groups? With more groups you would get a larger error, with fewer groups a smaller one. Seems a bit arbitrary. Why not just calculate the standard deviation and calculate the error from that? You should have enough statistics to use the central limit theorem. Anyway, I think you would get similar values, I just got a bit curious.
4 because 4 races = near to the size of the race groups = near to the same data value before I take the average. This way is not optimal. I know that, and this is a valid criticism. Here are the reasons why I did not test against the standard distribution, normalise and calculate it: I was lazy ... and the random test data is calculated by my computer while I drink coffee ... My point is that I think the random test data shows the error %. It is a less exact way, but in the end I do the same. I will publish a better datafile with more accounts. This whole thing is a side project of my MMR calculator.
ok, I'd find it much easier to calculate the standard deviation than to program the split runs. Just take the average of the squared MMR as well, and the rest is a few lines of plus and minus. I guess you are a faster programmer than I am though. ^^ I agree that it is "good enough" despite maybe not being perfect.
4) A better measure is what blizzard does. Namely, look at win rates in different matchups, compensating for MMR difference. I don't think you have the information to do that in your program? This method ofc has its problems as well, and no matter what blizzard says, I don't believe that they can tell if a race is OP, or if the better players just happen to play that race. And your very small difference in average MMR (consistent with the very small signal in sc2ranks) would probably only give a very small difference in win percentage. Well within the 45% to 55% range blizzard is aiming for. But that is a different story.
I have this data: the MMR of both players, the matchup and the result. And I agree that allowing +- 5% leaves room for great imbalance.
Maybe that would be a better analysis, because then you could see if terrans at a certain MMR struggle the most in TvP or TvZ. TvT should be 50%, and TvZ + TvP (weighted by player frequency) should average to 50% as well (or they would not be at that rank). But it should be possible to see which of the other two races each race has the most problems with. Let's see if you can reproduce blizzard's result first. After that the sky is the limit! Unless, ofc, you are lazy.
5) No offence meant. The original MMR calculation is a great program (gj!), and it's really cool that you find more uses for it! I just think that you got a bit carried away in the interpretation at a certain point. Also, I should mention that I don't want to claim anything about balance. I don't want to say that any race is or is not UP or OP. Cheers.
No offence taken ^^. I appreciate your post. It's a nice break from explaining what an average does or what the difference between a dependent and an independent error is, and from asking myself what they teach at school in some countries...
Mmm, I hear you. I mean, I'm fine with people not knowing statistics. It's hard, and not everyone should be required to be an expert to post. I just wish sometimes that people were a bit more aware of what they do and don't know. Then again, I think I myself also sometimes post a bit too confidently in areas that I'm not an expert on, so I can't blame anyone really. But it does discourage this kind of post, no doubt.
- A league is not near to 1000 MMR, look at the picture. Gold to platinum is only 250 MMR.
- Yes, perhaps I should use a different word. But which one?
- So you can calculate this very fast?
In this case, latest datafile: skeletor.jimmeh.com/mmr/balance.csv
New results (after removing everyone under 1k):
Maxerror : 38.7191574666374
ERRORCOUNT :
41.54333333333333% in 5
72.81111111111112% in 10
89.8111111111111% in 15
97.02666666666667% in 20
99.36666666666667% in 25
99.88666666666667% in 30
Race... T: -28.938886080105476 P 23.43063954261379 Z 0.36671387478577344
Analyse DONE
Zerg and Protoss switch roles! That happened in 2 runs as well, but every time terran stays way under.
How about "uneven player distribution"? Or just write out "different MMR averages" each time, you don't use the B-word that many times.
Hmm, to calculate the errors I just need one run, but I need both averages: <MMR> and <MMR^2>, where <X> means the average value of X over all players. Then you calculate the variance
V = <MMR^2> - <MMR>^2
and the standard deviation as the square root of that:
S = sqrt(<MMR^2> - <MMR>^2)
Then you get the standard error by dividing by the square root of the number of samples (for that variable, ie race):
error = S/sqrt(N)
And here it is important that the samples are really independent. No duplicates etc. As there are some systematic effects (due to the same player sending in a lot of games, for example), N doesn't really represent the number of independent samples. So this will be an underestimate of the errors. Add an (arbitrary) factor 2 to the error (ie, use N/4) and you should be pretty safe. Should be fast, but maybe you need to rerun to get the <MMR^2> for the different races. No need to run divided in groups or anything though.
Like I said, I linked you the data. When do you post the result?
I understand you cannot just post something about balance without backing it up. But I think I backed it up reasonably. Not perfect, but reasonable. And to be honest, I'm running out of time for today.
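For anyone who wants to try the single-run error estimate described above, here is a minimal Python sketch. It assumes one MMR value per player for a given race; the function name `standard_error`, the `inflate` parameter and the sample values are made up for illustration and are not part of the MMR tool.

```python
import math

def standard_error(values, inflate=1.0):
    """Return (mean, standard deviation, standard error) for a list of MMR values.

    Uses the two averages from the post: <MMR> and <MMR^2>.
    `inflate` is the arbitrary safety factor (2.0 corresponds to using N/4).
    """
    n = len(values)
    mean = sum(values) / n                        # <MMR>
    mean_sq = sum(v * v for v in values) / n      # <MMR^2>
    variance = max(mean_sq - mean ** 2, 0.0)      # V = <MMR^2> - <MMR>^2
    std = math.sqrt(variance)                     # S = sqrt(V)
    return mean, std, inflate * std / math.sqrt(n)  # error = S / sqrt(N)

# Hypothetical MMR values, one per player of a single race.
sample = [1500.0, 1700.0, 1600.0, 1800.0, 1400.0]
mean, std, err = standard_error(sample)
safe_mean, safe_std, safe_err = standard_error(sample, inflate=2.0)
```

Calling it once per race gives the three averages with their errors. The estimate is only as good as the independence of the samples, so duplicates from the same account should be removed first.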
|
On July 11 2012 14:55 skeldark wrote:
Like i said. I linked you the data. When do you post the result  I understand you can not just post something about balance without backing it up. But i think i backed it up reasonable. Not perfect but reasonable. And to be honest im running out of time for today.
ahaha, ok. Let me dust off my MS office analysis skills, brb.
I need to leave soon as well, just quick analysis!!
|
T is underrepresented (28.5%) and skewed towards lower leagues, while Z is overrepresented (37.8%) in the sample. Max and min MMR can deviate by more than 1500 for a single submitter. The data is right-skewed towards higher leagues, i.e. not normal.
It seems the data or methodology is not producing any consistent basis for analysis. Remember the first rule of modelling: garbage in, garbage out.
|
On July 11 2012 14:53 Cascade wrote:
On July 11 2012 14:39 Not_That wrote:
MMR distribution by races. Click for full version.
![[image loading]](http://s17.postimage.org/n60jmstyz/image.jpg)
Amount of players: 2014 Zerg, 1784 Protoss, 1516 Terran
The server does matter as MMR is non comparable cross servers. I've decided to remove KR and SEA and keep EU and NA as they are closest to each other in terms of MMRs, and that's where most of our data comes from.
Cool! Can you do 100 or even 200 granularity to make it easier to read? :o) We are not trying to see any structure smaller than 200 MMR anyway.
Here you go:
![[image loading]](http://s8.postimage.org/5s1v1o3tt/image.jpg)
We tried having % of total players on the y axis. The problem with that is that it doesn't have information regarding the amount of players. The dots at the edges of the graph look very strange, for example 100% of players above 3200 are Protoss. Obviously it's not very useful. We could snip the edges of the graph, but where? How many players are enough? Are 21 players between 2700 and 2750 enough? etc.
|
On July 11 2012 15:13 Not_That wrote:
Here you go: ![[image loading]](http://s8.postimage.org/5s1v1o3tt/image.jpg) We tried having % of total players on the y axis. The problem with that is that it doesn't have information regarding the amount of players. The dots at the edges of the graph look very strange, for example 100% of players above 3200 are Protoss. Obviously it's not very useful. We could snip the edges of the graph, but where? How many players are enough? Are 21 players between 2700 and 2750 enough? etc.
Thanks!
I mean % of the zerg players in that bin. That is, (number of zergs in that bin)/(number of zergs total). Just like you have plotted now, only divide all zerg entries by the number of zerg players, etc. Right now the zerg plot is higher in mid-range, but it is not clear if that is because a larger fraction of zergs have mid-range MMR, or if there are just more zergs.
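A rough Python sketch of that per-race normalisation, in case it helps. The input format (a list of `(race, mmr)` pairs), the function name `race_fractions` and the example players are assumptions for illustration only; the real datafile may be laid out differently.

```python
from collections import Counter

def race_fractions(players, bin_width=200):
    """Return {race: {bin_start: fraction of that race's players in the bin}}."""
    counts = {}          # race -> Counter of players per bin
    totals = Counter()   # race -> total number of players
    for race, mmr in players:
        bin_start = int(mmr // bin_width) * bin_width
        counts.setdefault(race, Counter())[bin_start] += 1
        totals[race] += 1
    # Divide each bin by the race total, not by the total of all players.
    return {
        race: {b: c / totals[race] for b, c in sorted(bins.items())}
        for race, bins in counts.items()
    }

# Hypothetical players: four zergs and two terrans.
players = [("Z", 1510), ("Z", 1590), ("Z", 1820), ("Z", 2250),
           ("T", 1450), ("T", 1530)]
fractions = race_fractions(players)
```

With this normalisation the bars of each race sum to 1, so a race with more players in the sample no longer looks higher everywhere.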
|
ok, fixed the errors for you: http://www.megafileupload.com/en/file/360098/balance-ods.html
results: toss: <MMR> = 1662 samples: 1881 standard deviation: 541 standard error: 12
terran: <MMR> = 1619 samples: 1598 standard deviation: 504 standard error: 13
zerg: <MMR> = 1655 samples: 2113 standard deviation: 419 standard error: 9
Note that the actual error probably is larger than that though, as there are correlations in the sample. Ignoring that, and setting the terran MMR as zero: terran: 0 +- 13 toss: 43 +- 12 zerg: 36 +- 9
The difference in MMR between zerg and terran: 36 +- 16 toss and terran: 43 +- 18 zerg and toss: 7 +- 15
Taking into account that the errors are underestimates, the signal is barely significant. Maybe 90% or so. Need more data. 
edit: and with that I'm gone. I'll be back tomorrow. Hope I didn't do any stupid mistakes in the hurry. it should all be in the file.
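The +- values on the differences above are consistent with adding the two standard errors in quadrature, which presumes the race samples are independent. A small Python sketch using only the numbers posted above (the dictionary layout and the function name `difference` are just for illustration):

```python
import math

# (mean MMR, standard error) as posted above.
results = {"toss": (1662, 12), "terran": (1619, 13), "zerg": (1655, 9)}

def difference(a, b):
    """Difference of two means and its error, added in quadrature."""
    (mean_a, err_a), (mean_b, err_b) = results[a], results[b]
    return mean_a - mean_b, math.sqrt(err_a ** 2 + err_b ** 2)

for a, b in [("zerg", "terran"), ("toss", "terran"), ("toss", "zerg")]:
    diff, err = difference(a, b)
    print(f"{a} - {b}: {diff} +- {err:.0f}")
```

This reproduces the 36 +- 16, 43 +- 18 and 7 +- 15 given in the post. If the samples are correlated, as noted above, the real errors are larger.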
|
On July 11 2012 15:35 Cascade wrote:
Thanks! I mean % of the zerg players in that bin. That is, (number of zergs in that bin)/(number of zergs total). Just like you have plotted now, only divide all zerg entries with the number of zerg players, etc. Now the zerg plot is higher in mid-range, but it is not clear if that is because a larger fraction of zergs have mid-range MMR, or if there are just more zergs.
Good thinking.
Same graph normalized, each bar representing the percentage of players of each race in the bin:
![[image loading]](http://s10.postimage.org/aut44y479/image.jpg)
Edit: Corrected colors
|
On July 11 2012 16:05 Not_That wrote:
Good thinking. Same graph normalized, each bar representing the percentage of players of each race in the bin: ![[image loading]](http://s11.postimage.org/utfwxg0vz/image.jpg)
Nice!
Now just put the error bars back on that plot, and it's perfect! *leaving*
|
On July 11 2012 15:47 Cascade wrote:
ok, fixed the errors for you: http://www.megafileupload.com/en/file/360098/balance-ods.html
results: toss: <MMR> = 1662 samples: 1881 standard deviation: 541 standard error: 12
terran: <MMR> = 1619 samples: 1598 standrad deviation: 504 standard error: 13
zerg: <MMR> = 1655 samples: 2113 standard deviation: 419 standard error: 9
Note that the actual error probably is larger than that though, as there are correlations in the sample. Ignoring that, and setting the terran MMR as zero: terran: 0 +- 13 toss: 43 +- 12 zerg: 36 +- 9
The difference in MMR between zerg and terran: 36 +- 16 toss and terran: 43 +- 18 zerg and toss: 7 +- 15
Taking into account that the errors are underestimates, the signal is barely significant. Maybe 90% or so. Need more data.
edit: and with that I'm gone. I'll be back tomorrow. Hope I didn't do any stupid mistakes in the hurry. it should all be in the file.
This is a two-tailed test. You did not account for the P-value. Assuming high df, 1.036 SE is equivalent to only being 70% certain, 1.96 SE is 95% certain and 3.08 SE is 99.8% certain.
Assuming the data is correct, one might argue that terran is 95% certain to be underpowered, BUT as I mentioned the methodology and data seem to be suspect. If you show me that each person has less than 100 - 200 MMR movement then I would be more assured, but it seems that the calculation of the MMR is so suspect that it generates MMR movements of up to 1000+++. Lol.. Math. GIGO
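The SE-to-certainty numbers quoted above can be checked with a few lines of Python. This is a sketch under the "high df" assumption from the post, i.e. it uses the normal distribution instead of the t distribution; the function name `two_sided_confidence` is made up for illustration.

```python
import math

def two_sided_confidence(z):
    """Probability that a standard normal value lies within +-z."""
    return math.erf(abs(z) / math.sqrt(2))

for z in (1.036, 1.96, 3.08):
    print(f"{z} SE -> {100 * two_sided_confidence(z):.1f}% two-sided")
```

The same function can be applied to any difference divided by its error, for example the 36 +- 16 between zerg and terran, keeping in mind that the errors in the post are stated to be underestimates.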
|
Good job, nice statistics.
Is it possible to determine whether a matchup is random or not (especially mirrors) by looking at win rates by MMR? What I mean: in ZvZ, how often does a player with a (let's say 200 points) smaller MMR win? Would that be a good way to detect randomness?
|
On July 11 2012 16:20 lazyitachi wrote:
This is a two-tail test. You did not account for P-value. Assuming high df, 1.036 SE is equivalent to only being 70% certain, 1.96 SE is 95% certain and 3.08 SE is 99.8% certain. Assuming the data is correct, one might argue that terran is 95% certain to be underpowered BUT as I mentioned the methodology and data seems to be suspect. I f you show me that each person has less than 100 - 200 MMR movement then I would be more assured but seems that the calculation of the MMR is so suspect to generate movement of MMR up to 1000+++. Lol.. Math. GIGO
the standard deviation is over the distribution of all players, not for a single player. each sample is one player.
|
On July 11 2012 16:15 Cascade wrote:
Nice! Now just put the error bars back on that plot, and it's perfect!  *leaving*
How do I figure out error margins for a graph with granularity? Fixed colors btw.
|
On July 11 2012 15:47 Cascade wrote:
ok, fixed the errors for you: http://www.megafileupload.com/en/file/360098/balance-ods.html
results: toss: <MMR> = 1662 samples: 1881 standard deviation: 541 standard error: 12
terran: <MMR> = 1619 samples: 1598 standrad deviation: 504 standard error: 13
zerg: <MMR> = 1655 samples: 2113 standard deviation: 419 standard error: 9
Note that the actual error probably is larger than that though, as there are correlations in the sample. Ignoring that, and setting the terran MMR as zero: terran: 0 +- 13 toss: 43 +- 12 zerg: 36 +- 9
The difference in MMR between zerg and terran: 36 +- 16 toss and terran: 43 +- 18 zerg and toss: 7 +- 15
Taking into account that the errors are underestimates, the signal is barely significant. Maybe 90% or so. Need more data.
edit: and with that I'm gone. I'll be back tomorrow. Hope I didn't do any stupid mistakes in the hurry. it should all be in the file.
When I calculated the t-test statistic for comparing the sample means of Zerg and Terran MMR I got 2.32258065, which is something around P ≈ 0.01 for the one-sided test and P ≈ 0.02 for the two-sided test. So there is significant evidence that Zerg players have a different (higher) MMR than terran players.
However, the original data wasn't collected through a simple random sample (SRS) of battle.net players (although I don't really think it's going to matter in this case). Additionally, terran mean MMR isn't really independent of zerg mean MMR. I'd have to think more about this to see if it affects the validity of the results. Including bronze league MMR in the calculation of mean MMR is dangerous because bronze league mostly consists of terran, lowering mean terran MMR.
The real question is if that 36 MMR difference between Zerg and Terran is indicative of imbalance. The OP mentioned that this difference was about 3 games in favor of the zerg. I think the MMR difference is pretty negligible even though Zerg do have statistically significant higher MMR than terran.
What we should do is look at the mean MMR for zerg and terran at higher levels of MMR and see if Zerg has statistically significantly higher MMR. Which, according to this graph below, could be possible. However, the assumption that all races have equally skilled players might not hold at smaller sample sizes of higher MMR players. Click on this picture http://i50.tinypic.com/213m4pl.jpg.
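For reference, a Python sketch of how such a test statistic can be computed from the summary numbers Cascade posted. The post above does not say which t-test variant was used, so the unequal-variance (Welch) form here is an assumption, and the p-values use the normal approximation, which should be close with samples this large. The function names are made up for illustration.

```python
import math

def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """t statistic for the difference of two sample means (unequal variances)."""
    se = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / se

def normal_p_values(t):
    """One-sided and two-sided p-values using the normal approximation."""
    one_sided = 0.5 * math.erfc(abs(t) / math.sqrt(2))
    return one_sided, 2 * one_sided

# Zerg vs terran: mean, standard deviation and sample count from the post.
t = welch_t(1655, 419, 2113, 1619, 504, 1598)
p_one, p_two = normal_p_values(t)
print(f"t = {t:.2f}, one-sided p = {p_one:.3f}, two-sided p = {p_two:.3f}")
```

The same two functions could be run on a subset of the data, for example only players above a chosen MMR, which is the comparison suggested above. The independence caveats from the earlier posts still apply.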
|
On July 11 2012 16:23 Cascade wrote:
the standard deviation is over the distribution of all players, not for a single player. each sample is one player.
So if I take 1000000000 faulty data then my data is now correct? Logic?
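The standard errors quoted from Cascade's post can be recomputed from the sample sizes and standard deviations alone. Below is a minimal Python sketch, assuming independent samples and a normal approximation (the quoted post itself warns that correlations in the sample make the real error larger). The function names are made up for illustration and are not taken from anyone's spreadsheet.

```python
import math

# Per-race summary statistics as quoted in the thread:
# (mean MMR, number of samples, standard deviation).
STATS = {
    "toss": (1662, 1881, 541),
    "terran": (1619, 1598, 504),
    "zerg": (1655, 2113, 419),
}

def standard_error(race):
    """Standard error of the mean: sd / sqrt(n)."""
    _, n, sd = STATS[race]
    return sd / math.sqrt(n)

def difference(race_a, race_b):
    """Mean difference (a - b) and its standard error, assuming independent samples."""
    diff = STATS[race_a][0] - STATS[race_b][0]
    se = math.hypot(standard_error(race_a), standard_error(race_b))
    return diff, se

def two_sided_p(z):
    """Two-sided p-value under a normal approximation (assumption: large samples)."""
    return math.erfc(abs(z) / math.sqrt(2))

if __name__ == "__main__":
    for a, b in [("zerg", "terran"), ("toss", "terran"), ("zerg", "toss")]:
        diff, se = difference(a, b)
        z = diff / se
        print(f"{a} - {b}: {diff} +- {se:.1f}, z = {z:.2f}, "
              f"two-sided p = {two_sided_p(z):.3f}")
```

Rounded to whole numbers, the standard errors come out as the 12, 13 and 9 quoted above, and the errors on the differences as 16, 18 and 15.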
|
On July 11 2012 16:36 NoobCrunch wrote:
On July 11 2012 15:47 Cascade wrote:
ok, fixed the errors for you: http://www.megafileupload.com/en/file/360098/balance-ods.html
results:
toss: <MMR> = 1662, samples: 1881, standard deviation: 541, standard error: 12
terran: <MMR> = 1619, samples: 1598, standard deviation: 504, standard error: 13
zerg: <MMR> = 1655, samples: 2113, standard deviation: 419, standard error: 9
Note that the actual error probably is larger than that though, as there are correlations in the sample. Ignoring that, and setting the terran MMR as zero:
terran: 0 +- 13
toss: 43 +- 12
zerg: 36 +- 9
The difference in MMR between
zerg and terran: 36 +- 16
toss and terran: 43 +- 18
zerg and toss: 7 +- 15
Taking into account that the errors are underestimates, the signal is barely significant. Maybe 90% or so. Need more data.
edit: and with that I'm gone. I'll be back tomorrow. Hope I didn't do any stupid mistakes in the hurry. it should all be in the file.
When I calculated the T-test statistic for comparing the sample means of Zerg and Terran MMR I got 2.32258065, which is something around P < 0.01 for the one-sided test. Even for a two-sided test, P < 0.02. There is significant evidence that Zerg players have different (higher) MMR than terran players. However, the original data wasn't collected through an SRS of battle.net players (although I don't really think it's going to matter in this case). Additionally, terran mean MMR isn't really independent of zerg mean MMR. I'd have to think more about this to see if this affected the validity of the results. Including bronze league MMR in the calculation of mean MMR is dangerous because bronze league mostly consists of terran, lowering mean terran MMR. The real question is if that 36 MMR difference between Zerg and Terran is indicative of imbalance. I think the MMR difference is pretty negligible even though Zerg do have statistically significantly higher MMR than terran.
What we should do is look at the mean MMR for zerg and terran at higher levels of MMR and see if Zerg has statistically significantly higher MMR. Which, according to this graph below, could be possible. However, the assumption that all races have equally skilled players might not hold at smaller sample sizes of higher MMR players. Click on this picture http://i50.tinypic.com/213m4pl.jpg.
welcome on board 
On July 11 2012 16:23 graNite wrote: Good job, nice statistics.
Is it possible to determine whether a matchup is random or not (especially mirrors) by looking at winrates by MMR? What I mean: in ZvZ, how often does a player with (let's say 200 points) smaller MMR win? Would that be a good way to detect randomness?
I don't know if I understand the question.
I could (not right now, but in theory) tell you how big the chance is just by looking at the MMR of the two players, because the MMR includes the win percentage the skill system gives the players. I don't understand what you mean by randomness here.
|
This gives us the facts we all knew before. We all know that terran is by far the hardest race to play, especially if you don't have the multitasking and micro that korean terrans have.
|
On July 11 2012 16:42 lazyitachi wrote:
On July 11 2012 16:23 Cascade wrote:
On July 11 2012 16:20 lazyitachi wrote: If you show me that each person has less than 100 - 200 MMR movement then I would be more assured but seems that the calculation of the MMR is so suspect to generate movement of MMR up to 1000+++. Lol.. Math. GIGO
the standard deviation is over the distribution of all players, not for a single player. each sample is one player.
So if I take 1000000000 faulty data then my data is now correct? Logic?
Warning: I speak physics language when it comes to these matters, so it may take some translation if the terminology among statisticians differs.
Assuming by "faulty" you mean "high uncertainty," then yes, many more samples will give you a much more accurate estimate of the value of the mean. For a normal distribution (maybe for all distributions? I don't know), the uncertainty scales with 1/sqrt(n), so if your one-standard-deviation uncertainty in a single MMR measurement were 1000, and you averaged 10000 of them, you'd get a one-standard-deviation uncertainty of 10 in the value of your average.
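The 1/sqrt(n) scaling in the paragraph above can be checked numerically. A small Python sketch follows; the helper names are illustrative, and the simulation uses a uniform distribution on purpose, since the scaling holds for independent samples from any distribution with finite variance, not only a normal one.

```python
import math
import random
import statistics

def standard_error_of_mean(single_sd, n):
    """Uncertainty of an average of n independent measurements."""
    return single_sd / math.sqrt(n)

def simulated_standard_error(n, trials, rng):
    """Empirical spread of the mean of n uniform(-1, 1) draws over many trials."""
    means = [
        statistics.fmean(rng.uniform(-1.0, 1.0) for _ in range(n))
        for _ in range(trials)
    ]
    return statistics.stdev(means)

if __name__ == "__main__":
    # The example from the post: single-measurement sd of 1000, 10000 measurements.
    print(standard_error_of_mean(1000, 10000))  # 10.0

    # A uniform(-1, 1) variable has standard deviation 1 / sqrt(3).
    rng = random.Random(1)
    predicted = standard_error_of_mean(1 / math.sqrt(3), 400)
    observed = simulated_standard_error(400, 500, rng)
    print(f"predicted {predicted:.4f}, observed {observed:.4f}")
```

Note that this only covers random scatter. A systematic bias shared by all samples does not shrink with n, which is the distinction the next paragraphs draw.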
I gather, though, based on your prior posts, that you're probably familiar with this, and that by "faulty" you instead mean some kind of systematic bias. To call out that his data includes MMR values tracking up and down over a wide range doesn't suggest systematic bias, though. That's just what you'd expect with a high random variance in an individual's result, which is typical of this kind of system and doesn't invalidate the measurement. (If it did, this type of system wouldn't be used so widely for rating players.)
What he's done is start with the assumption that there is no systematic bias in his data collection that would cause one race to tend to be better than another across the board as a result of how he's collected the data. I think that's reasonable -- even though his players are self-selected, there's nothing to suggest that one race would self-select more or differently than another, let alone in a way that varies across skill level. Then, he's calculated the likelihood that the differences in MMR distribution he's seeing could occur randomly.
If you're going to argue that there is systematic bias in his data collection that means his data collection prefers picking lower-skilled Terrans and higher-skilled Protoss, fair enough, but you haven't yet described how you think that might be the case.
So, since there's a low chance that the variation in race's average scores could occur randomly, that result is either (a) a result of some kind of race-specific systematic bias in his data collection, the nature of which has yet to be described, or (b) due to an outside factor.
I don't see where there could be race-specific bias in his data collection, unless one's going to make an argument about personalities of different race's players, their need for personal validation, and their likelihood to install his software. That'll be a fun argument, so I look forward to it, but let's put that aside for now.
Outside factors could include all kinds of things. I, personally, think that even given the data set's strong emphasis on higher leagues, there could be some residual impact to lower league players preferring Terran. The average difference in race's MMR in his analysis is not large at all, a few games' difference, and if there's a huge preference toward Terran among bronze and silver players, that might weight Terran ever so slightly toward the lower end even among a higher-league population, since some of those new players will no doubt improve quickly, wind up Diamond TL readers, and install his software.
Then there are all the potential game design reasons. In a broad sense, I'd suggest that Protoss as a race seems slightly simpler to play overall than Zerg or Terran, and Zerg's in the middle, with Terran most complex, so that might be all of it right there. (Note that I am NOT saying that these are large differences, and for the kind of numbers the OP is seeing, I don't think they'd have to be.) That kind of general observation seems to me far more likely to explain population-wide differences than fiddly meta-game arguments or this or that unit having been buffed or nerfed, since those kinds of things are most likely to have a big impact at the high end.
Anyway, I think the OP's clear that he's not making an observation about racial balance in the sense most people on here mean it, which is that two players with equal secondary indicators of skill (things like reaction time, click accuracy, APM, etc) would wind up at different MMRs. All he's saying is the distributions differ slightly. No amount of staring at his data, by itself, will show where this difference comes from, we can only guess at it.
Finally, I have to disagree with this guy:
Then Dustin B. just says in an interview that everything in every ladder and server is 50-50 and in winrates in every matchup early game late game what ever still 50-50. Then he says they are monitoring a situation where last month there was a 0,5% imbalance. And everything without zero facts.
I've followed a lot of Blizzard's commentary about racial balance and they've never said anything like that. What they have said is:
1) They have a way to estimate win rates that factors out skill-related biases. The Blizzard matchmaking designer talks a little bit about how this works (in general terms) in his Q&A session after his presentation at UCI that Excalibur_Z linked here:
http://www.teamliquid.net/forum/viewmessage.php?topic_id=195273&currentpage=62#1228
It's worth watching if someone has interest in the statistical techniques Blizzard uses internally.
2) They've said that they don't consider variances of up to 5% either direction in win-loss rates a big deal from a balance standpoint, because the numbers vary by that much from month to month on their own.
3) Just because Blizzard doesn't share their entire data set with you doesn't mean that they don't have knowledgeable statisticians (like the designer in that video Excalibur_Z linked) doing the work.
|
Many people forget that the people who give the interviews are not the same people who actually do the work. You should never take what Dustin says literally. He just repeats the parts he remembers, or thinks he remembers, from the last meeting.
So, since there's a low chance that the variation in race's average scores could occur randomly, that result is either (a) a result of some kind of race-specific systematic bias in his data collection, the nature of which has yet to be described, or (b) due to an outside factor.
I don't see where there could be race-specific bias in his data collection, unless one's going to make an argument about personalities of different race's players, their need for personal validation, and their likelihood to install his software. That'll be a fun argument, so I look forward to it, but let's put that aside for now.
This is the point I try to make all the time. If you check the posts here, it is the main point people don't understand, and I have run out of ideas for how to explain it differently.
|
The samples have huge differentials in their MMR movement. Why is that so? Shouldn't it be stable if the method is calculating your true MMR? It's obvious then the comparison does not take into account the fact that the MMR is not the true MMR because it is still moving.
Why then take the average of the MMR? Should it not be the true representative MMR i.e. the latest MMR and not the average over the 100000 games submitted given the fact that the MMR calculation itself is suspect to have such big differences in MMR?
If the MMR calculation itself is suspect then how can I even be sure that for those small sampling per person (i.e. 0 MMR movement most likely) which makes up a large portion of the data is even calculating the correct MMR. Those differences itself can contribute way more than the 50 - 60 points of MMR thus invalidating this exercise as a whole.
Interesting look though but I feel it only measures if MMR for certain race is higher which tells nothing of balance. More interesting obs is people who don't even understand the scope just agree or disagree based on their own bias. Typical in TL anyways.
|
This data is completely useless to say if a race is better than other for so many obvious reasons. Just to name a few:
- All races have a certain "potential" that is constant within a given patch. The variable is the human skill. So to find out the race potential, you should minimize the errors coming from the human factor, so if you want to find out what race may be better, you should check the average top players. This is pretty obvious, but for some reason, lower league players do not understand that the difference between top players and low leaguers is exponential.
- Better players use ladder in different ways, almost always not playing at their best (like in a tourney). Some days they may train an unpolished new build, some days they test some weak points in their game, and so on. Ask any top player. They never play their best in ladder; ladder is to test, train and improve for real games (tourneys).
- A lot of top level active players use ladder in about 50% of their games (aka a lot of custom training with partners). So for these players (as I said, gm/top masters, aka the relevant players for balance) their MMR is almost always not up to date.
Do these same statistics but with every ro32 of top tournaments and there you will have some significant data for balance.
|
Edit: It's a mistake to call the OP's measure MMR, since it's not being used for matchmaking. It's a skill measure that's meant to work similarly to Blizzard's hidden, actual MMR based on what they've said about how that works.
On July 11 2012 18:09 lazyitachi wrote: The samples have huge differentials in their MMR movement. Why is that so? Shouldn't it be stable if the method is calculating your true MMR? It's obvious then the comparison does not take into account the fact that the MMR is not the true MMR because it is still moving.
Elo-like systems such as the OP's or Blizzard's MMR system use a Bayesian model to estimate skill based on the difference between predicted likelihood of a win/loss event before a game and its result. This leads to a few reasons that some players will have much more stable MMRs than others:
1) New players tend to have MMRs that move rapidly. This means their MMRs are very uncertain, but that's also accounted for in the system by taking note of their having played few games.
2) Some players play risky strategies that are more susceptible to minor differences in their opponents' play, so their results are less predictable from game to game, leading to a higher-uncertainty MMR.
3) In the most broken cases, you may have multiple players of different skill levels playing the same account, causing a permanently high MMR uncertainty.
Why then take the average of the MMR? Should it not be the true representative MMR i.e. the latest MMR and not the average over the 100000 games submitted given the fact that the MMR calculation itself is suspect to have such big differences in MMR?
It's definitely not valid to average MMR over multiple of one player's games. Only the latest MMR matters as that's a cumulative estimate based on their entire history. However, that's not what the OP is doing -- he's measuring an average across accounts, not across games.
(Edit: I misspoke about this, and I think the OP should have a look at changing how he's looking at multiple data points for each player. See below in the thread.)
If the MMR calculation itself is suspect
It's not -- a Bayesian estimate of skill is going to be well-behaved as long as it's applied to a normal distribution of skills.
then how can I even be sure that for those small sampling per person (i.e. 0 MMR movement most likely) which makes up a large portion of the data is even calculating the correct MMR.
Remember that he's not sampling a particular player's MMR more than once, since MMR is a cumulative statistical estimate already. If a large majority of the players in the data set had only played one or two games, though, that might be a valid concern because Elo-like MMRs need several games to stabilize.
Edit: I misspoke about this too.
Interesting look though but I feel it only measures if MMR for certain race is higher which tells nothing of balance. More interesting obs is people who don't even understand the scope just agree or disagree based on their own bias. Typical in TL anyways.
You're right that using only one skill measure it's not possible to distinguish between race differences that come from balance and race differences that come from other causes (such as, say, too many low-level Terrans due to the campaign's emphasis on that race.)
Blizzard's approach to this (in general terms) is to look at race differences using multiple skill measures, not just MMR. The video in Excalibur_Z's post that I linked above talks about this a bit.
|
On July 11 2012 18:13 Belha wrote: This data is completely useless to say if a race is better than other for so many obvious reasons.
Not exactly. What the data says is that the population playing the races have different likelihoods to win. The OP makes no statement about why this is the case.
- All races have a certain "potential" that is constant within a given patch. The variable is the human skill. So to find out the race potential, you should minimize the errors coming from the human factor, so if you want to find out what race may be better, you should check the average top players. This is pretty obvious, but for some reason, lower league players do not understand that the difference between top players and low leaguers is exponential.
While I agree it's interesting to ask whether the distributions are different among top players, I don't agree that this tells you anything about each race's "potential." Top players are as susceptible as anyone to fads in race selection or strategy choices that have nothing to do with game design. (Note that following fads isn't necessarily a game-losing strategy, but it might be for a time because the weight of peer reinforcement can be pretty strong even in the face of evidence that what everyone is doing is bad.) Note also that I'm not saying that game choices driven by fads are bad play -- in fact they can be self-reinforcing and nevertheless yield stronger performance because pros are highly aware of what other pros are doing and make decisions about their own play based on that.
- Better players use ladder in different ways, almost always not playing at their best (like in a tourney). Some days they may train an unpolished new build, some days they test some weak points in their game, and so on. Ask any top player. They never play their best in ladder; ladder is to test, train and improve for real games (tourneys).
- A lot of top level active players use ladder in about 50% of their games (aka a lot of custom training with partners). So for these players (as I said, gm/top masters, aka the relevant players for balance) their MMR is almost always not up to date.
These would be issues for a study that focused on the ladder distribution of very top players. This one does not.
Do these same statistics but with every ro32 of top tournaments and there you will have some significant data for balance.
This has been done to death, and posted elsewhere on TL. However, the data sets are so small that the results are all over the place from month to month and don't say much about either the game or the players.
|
Why then take the average of the MMR? Should it not be the true representative MMR i.e. the latest MMR and not the average over the 100000 games submitted given the fact that the MMR calculation itself is suspect to have such big differences in MMR?
It's definitely not valid to average MMR over multiple of one player's games. Only the latest MMR matters as that's a cumulative estimate based on their entire history. However, that's not what the OP is doing -- he's measuring an average across accounts, not across games.
I do. For the users, I use more than one game, not just the latest. That's about 5-10% of the data. What's wrong with it? The last value would do too, and would be more current.
PS: I admire your patience. I lost mine on page 2....
|
Look at his data file. It says AVG mmr, Min MMR, Max MMR. Each row is one player. There is no deviation for each row if there is only one single game. It is also not possible to have 0 MMR differential for a standard error estimation (I doubt so many people submit gazillion games).
Unless it is not what it says it is, it seems he is taking the average across multiple games and also showing that the MMR calculation is highly unstable thus any player with small sample will have inaccurate MMR (as shown by the high deviation of MMR for a single person).
Please correct me if the data means something else. The header cannot be so badly mislabelled???
|
Lysenko, the uncertainty value associated with SC2 accounts behaves much more simply than you may think. It likely starts high and drops down to a minimum, which it reaches after a certain number of games, or once it dips down to it initially.
Have a look at this graph:
![[image loading]](http://s14.postimage.org/8w8e4juct/astraflame.jpg) (The red line is his MMR estimate, which rises by an average of 16 per win and drops by 16 per loss. The bars are the MMR of his opponents and their league.)
This is a guy who leveled a low Bronze account to mid Master. He went on something like 138-11 wins-losses streak. His uncertainty value didn't change from the minimum value despite the massive shift in skill. This means that the MMR number alone gives you all the data you need (and that exists) about a player (with the exception of new players). There's also a weighted moving average of his MMR over the last X number of games, but that's only used for league placement, and since leagues are entirely and utterly meaningless, we can safely ignore it.
|
On July 11 2012 18:35 lazyitachi wrote: Look at his data file. It says AVG mmr, Min MMR, Max MMR. Each row is one player. There is no deviation for each row if there is only one single game. It is also not possible to have 0 MMR differential for a standard error estimation (I doubt so many people submit gazillion games).
Unless it is not what it says it is, it seems he is taking the average across multiple games and also showing that the MMR calculation is highly unstable thus any player with small sample will have inaccurate MMR (as shown by the high deviation of MMR for a single person).
Please correct me if the data means something else. The header cannot be so badly mislabelled???
The deviation of the MMR does not matter at all for this calculation. How is the deviation of the MMR a race-dependent factor? It's 100% independent of everything else, so it evens out.
Even if it were dependent, which it is not, it is 100% not race-dependent. I feel like I explain the same fact over and over again. If you want to say my data is wrong, you have to show a RACE-DEPENDENT mistake. Any race-independent mistake does not matter at all! I can add random numbers to every MMR point and would come to the same result!
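The claim about adding random numbers can be illustrated with a quick simulation. This is only a sketch under stated assumptions: the two populations are synthetic (drawn from normal distributions with the means and spreads quoted earlier in the thread), the noise is zero-mean and drawn the same way for both races, and all names are hypothetical.

```python
import random
import statistics

def race_gap(mmr_a, mmr_b):
    """Difference between the mean MMR of two groups of players."""
    return statistics.fmean(mmr_a) - statistics.fmean(mmr_b)

def demo(seed=0, n=20000, noise_sd=100.0):
    """Return the gap before and after adding race-independent noise."""
    rng = random.Random(seed)
    # Synthetic populations, NOT the real data set.
    zerg = [rng.gauss(1655, 419) for _ in range(n)]
    terran = [rng.gauss(1619, 504) for _ in range(n)]
    # The same kind of error is added to every value, regardless of race.
    noisy_zerg = [m + rng.gauss(0, noise_sd) for m in zerg]
    noisy_terran = [m + rng.gauss(0, noise_sd) for m in terran]
    return race_gap(zerg, terran), race_gap(noisy_zerg, noisy_terran)

if __name__ == "__main__":
    clean, noisy = demo()
    print(f"gap without noise: {clean:.1f}, gap with noise: {noisy:.1f}")
```

The gap stays the same on average, although race-independent noise still widens the error bars, so it is not entirely free: with a small sample it can hide a real difference or produce an apparent one by chance.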
|
On July 11 2012 18:35 lazyitachi wrote: Look at his data file. It says AVG mmr, Min MMR, Max MMR. Each row is one player. There is no deviation for each row if there is only one single game. It is also not possible to have 0 MMR differential for a standard error estimation (I doubt so many people submit gazillion games).
Unless it is not what it says it is, it seems he is taking the average across multiple games and also showing that the MMR calculation is highly unstable thus any player with small sample will have inaccurate MMR (as shown by the high deviation of MMR for a single person).
Please correct me if the data means something else. The header cannot be so badly mislabelled???
No, you're right, and I misspoke. (I edited my posts to reflect this btw.)
Generally these systems provide for a measurement of both value and uncertainty for the MMR value. How that uncertainty value is calculated is beyond my knowledge of these kinds of systems. The short version is that I don't know whether looking at a standard deviation for variation of MMR over time is a valid estimate of that uncertainty. I'm inclined to say it's not that simple because a series of MMR values over time are not independent measurements, but I don't know how close it comes.
Because they aren't independent measurements, though, I'd say using only the latest MMR is the right way to treat the OP's data set. (As a simple case, if you include all the games for player X as their MMR travels from 0 to wherever they stabilize, if you average all of them you'll get half the player's actual skill level, which is not a correct result.)
(Edit: Apologies to any statisticians who are reading, who will already know this, but normal distributions and the use of standard deviation to measure the likelihood of a measurement taking place all assume that each measurement is entirely independent of each other. A series of MMR measurements for one player are never independent, because if my current value is 1500 and I play a game, my new value is guaranteed to be close to 1500 regardless of my result. It's not going to be 250 or 750 or 3000.)
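The "half the player's actual skill level" case above is easy to reproduce. The sketch below assumes a deliberately simplified player who starts at 0, gains a fixed 16 MMR per game (the average step quoted in the thread) and stops at a target value; real MMR histories are noisier, and the function name is made up for illustration.

```python
def climb(start, target, step=16):
    """MMR after each game for a player who wins every game until reaching target."""
    values = [start]
    while values[-1] < target:
        values.append(min(values[-1] + step, target))
    return values

if __name__ == "__main__":
    history = climb(0, 1600)
    average = sum(history) / len(history)
    latest = history[-1]
    print(f"games: {len(history) - 1}, average MMR: {average}, latest MMR: {latest}")
```

Here the average over the whole history is 800 while the latest value is 1600, so averaging over the climb understates the player by half.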
|
On July 11 2012 18:39 Not_That wrote: This is a guy who leveled a low Bronze account to mid Master. He went on something like 138-11 wins-losses streak. His uncertainty value didn't change from the minimum value despite the massive shift in skill. This means that the MMR number alone gives you all the data you need (and that exists) about a player (with the exception of new players). There's also a weighted moving average of his MMR over the last X number of games, but that's only used for league placement, and since leagues are entirely and utterly meaningless, we can safely ignore it.
Thanks for the graph. The cases I was talking about were looking at differences in MMR uncertainties for people with stable MMRs, though. Because a Bayesian system like this uses the error measure to push one's MMR around, the uncertainty should be stable (and high) until a player's MMR curve flattens out at their actual skill, which seems to not happen in the range of your graph.
|
On July 11 2012 16:23 graNite wrote: Good job, nice statistics.
Is it possible to determine whether a matchup is random or not (especially mirrors) by looking at winrates by MMR? What I mean: in ZvZ, how often does a player with (let's say 200 points) smaller MMR win? Would that be a good way to detect randomness?
It's theoretically possible, and I think this is one correct way of doing it. Collect data on the proportion of players who beat an opponent 200 MMR higher than them in a mirror match. Compare that to the proportion of players who beat an opponent 200 MMR higher than them across all matchups; this is your estimate of how often, on average, you would expect someone 200 MMR lower to beat that opponent. You don't necessarily need the same sample size for both distributions, but assume that you have it. You would then do a two-proportion test with the null hypothesis that the proportion of players who beat an opponent 200 MMR higher is the same in the mirror match as in all matchups. If the proportion in the mirror matchup is statistically significantly closer to 0.5 than the proportion for all matchups, then you know that the matchup is more "random". This is because as the proportion moves away from 0.5 in either direction, the standard deviation goes down, and "randomness" goes down with it.
TLDR:
1. Take a sample and obtain the proportion of players who win when rated 200 MMR below their opponent, for both all matchups and the mirror.
2. Do a proportion test to compare the two proportions.
3. Because of the standard deviation formula for a single proportion (assuming that you used the same sample size), if the mirror matchup is statistically significantly closer to 0.5 than all matchups, you know that the standard deviation is larger and that the matchup is more "random" than the other matchups.
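The proportion test described above could look roughly like this in Python. It is a sketch of a standard pooled two-proportion z-test under a normal approximation; the counts in the usage example are invented for illustration and do not come from the thread's data.

```python
import math

def proportion_sd(p, n):
    """Standard deviation of a sample proportion; largest at p = 0.5."""
    return math.sqrt(p * (1 - p) / n)

def two_proportion_z_test(wins_a, n_a, wins_b, n_b):
    """Pooled two-proportion z-test; returns (p_a, p_b, z, two-sided p-value)."""
    p_a = wins_a / n_a
    p_b = wins_b / n_b
    pooled = (wins_a + wins_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_a, p_b, z, p_value

if __name__ == "__main__":
    # Hypothetical counts of games won by the player rated 200 MMR lower.
    mirror_upsets, mirror_games = 420, 1000
    all_upsets, all_games = 350, 1000
    p_mirror, p_all, z, p_value = two_proportion_z_test(
        mirror_upsets, mirror_games, all_upsets, all_games
    )
    print(f"mirror: {p_mirror:.3f}, all: {p_all:.3f}, z = {z:.2f}, p = {p_value:.4f}")
```

With these made-up numbers the mirror upset rate is closer to 0.5 and the difference is significant, which under the reasoning above would point to a more "random" mirror matchup.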
|
On July 11 2012 18:35 lazyitachi wrote: Look at his data file. It says AVG mmr, Min MMR, Max MMR. Each row is one player. There is no deviation for each row if there is only one single game. It is also not possible to have 0 MMR differential for a standard error estimation (I doubt so many people submit gazillion games).
Unless it is not what it says it is, it seems he is taking the average across multiple games and also showing that the MMR calculation is highly unstable thus any player with small sample will have inaccurate MMR (as shown by the high deviation of MMR for a single person).
Please correct me if the data means something else. The header cannot be so badly mislabelled???
MMR is not so unstable. On average it changes by 16 per game, compared to ~12 for adjusted ladder points. Here's my MMR graph:
![[image loading]](http://s12.postimage.org/ka7706v9l/image.jpg)
That's reasonably stable considering the win and loss streaks I had.
A case can be made why the last MMR value should be used for the purpose of this thread, but in general if you want to describe a player's skill, for players who have been playing for a while, the average of their MMR is probably a more accurate description of their skill than the MMR value they happen to sit at in any given moment.
|
On July 11 2012 18:49 Lysenko wrote:
On July 11 2012 18:39 Not_That wrote: This is a guy who leveled a low Bronze account to mid Master. He went on something like 138-11 wins-losses streak. His uncertainty value didn't change from the minimum value despite the massive shift in skill. This means that the MMR number alone gives you all the data you need (and that exists) about a player (with the exception of new players). There's also a weighted moving average of his MMR over the last X number of games, but that's only used for league placement, and since leagues are entirely and utterly meaningless, we can safely ignore it.
Thanks for the graph. The cases I was talking about were looking at differences in MMR uncertainties for people with stable MMRs, though. Because a Bayesian system like this uses the error measure to push one's MMR around, the uncertainty should be stable (and high) until a player's MMR curve flattens out at their actual skill, which seems to not happen in the range of your graph.
But it's not high. I'm telling you his uncertainty value throughout the games is the minimum value possible for a player.
|
On July 11 2012 18:52 Not_That wrote: A case can be made why the last MMR value should be used for the purpose of this thread, but in general if you want to describe a player's skill, for players who have been playing for a while, the average of their MMR is probably a more accurate description of their skill than the MMR value they happen to sit at in any given moment.
Using mean and standard deviation to characterize a measure like MMR where each measurement is strictly dependent on the previous one is a fundamental statistical error. It might be close to valid but my guess is that it's absolutely enough to screw up measures of the likelihood of random results matching one's data set.
The reason is that independent measurements generally follow normal, or Gaussian, distributions, which peak at the mean and have well-defined behavior away from the mean based on standard deviation, while measurements that depend on previous ones follow other asymmetric distributions like Poisson that require different measures for uncertainty and for which (critically) the mean doesn't define the peak.
|
On July 11 2012 18:52 Not_That wrote:
On July 11 2012 18:35 lazyitachi wrote: Look at his data file. It says AVG mmr, Min MMR, Max MMR. Each row is one player. There is no deviation for each row if there is only one single game. It is also not possible to have 0 MMR differential for a standard error estimation (I doubt so many people submit gazillion games).
Unless it is not what it says it is, it seems he is taking the average across multiple games and also showing that the MMR calculation is highly unstable thus any player with small sample will have inaccurate MMR (as shown by the high deviation of MMR for a single person).
Please correct me if the data means something else. The header cannot be so badly mislabelled???
MMR is not so unstable. On average it changes by 16 per game, compared to ~12 for adjusted ladder points. Here's my MMR graph:
![[image loading]](http://s12.postimage.org/ka7706v9l/image.jpg)
That's reasonably stable considering the win and loss streaks I had.
A case can be made why the last MMR value should be used for the purpose of this thread, but in general if you want to describe a player's skill, for players who have been playing for a while, the average of their MMR is probably a more accurate description of their skill than the MMR value they happen to sit at in any given moment.
I don't think it's really going to matter as long as the sample size is large enough.
|
On July 11 2012 18:55 Not_That wrote: But it's not high. I'm telling you his uncertainty value throughout the games is the minimum value possible for a player.
How are you calculating the uncertainty?
Edit: I hope to God that I am not going to have to learn all the math related to uncertainty measures in Elo-like systems to resolve this discussion.... lol
|
On July 11 2012 19:01 Lysenko wrote: On July 11 2012 18:55 Not_That wrote: But it's not high. I'm telling you his uncertainty value throughout the games is the minimum value possible for a player. How are you calculating the uncertainty?
When we searched for it in the data we found out: it's just not there. After 40+ games it must be so close to 0 (or rather 1) that we cannot detect it.
There is no difference between a player who has played 40 games and loses 200 in a row and a guy who trades wins and losses every game. Both gain/lose the same MMR relative to the MMR difference.
For a new player it's there, but we don't have much data on totally new accounts. So for the time being we can just ignore it.
I still don't understand why. "Uncertainty drops with every game and reaches a neutral factor fast" is way too simple to be helpful. But it makes our MMR calculation a lot easier.
|
On July 11 2012 19:02 skeldark wrote: There is no difference between a player who has played 40 games and loses 200 in a row and a guy who trades wins and losses every game. Both gain/lose the same MMR relative to the MMR difference.
That's not what uncertainty in a player's MMR is. Elo-like systems always deduct the same number of points from the loser that they award to the winner, because if they don't do this then there's score inflation or deflation. (Edit: Some systems like this may fiddle with these numbers in early games to try to get to a stable result faster, and if that's what yours is doing, that would explain why it looks like something is "converging to 0" over 40 or so games.)
Edit: Uncertainty in a player's skill score is a measure of how accurately the Elo-like system is predicting the results of a game. If, over a series of 3 games, the system predicts a 50% win rate and a player wins no games, that places a lower bound on the uncertainty of the player's score.
In Not_That's plot, that player's uncertainty is very high, because the MMR is greatly underestimating chances of a win. (In Blizzard's system, it will usually go for an opponent whose predicted win/loss is close to 50%, and the actual numbers through the whole run were 90%+ wins.)
That high uncertainty means that either the player's MMR is much different from their actual MMR (which is the case there) or they have been extremely lucky.
Edit 2: These types of systems don't adjust how many points they do or don't award based on the players' uncertainties. However, players with higher uncertainties might get matched against a wider range of opponents, so that they're more likely to get even games now and then, or to get them to their optimal score faster. Some of the strategies for this in Blizzard's system are discussed in that UCI talk Excalibur_Z linked in the thread I linked above.
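To make the "same points taken from the loser as given to the winner" point concrete, here is a minimal sketch of a plain Elo update. It is not Blizzard's formula; the 400-point scale and K = 32 are assumptions, chosen only because K = 32 reproduces the +16/-16 per even game quoted in this thread.

```python
def expected_score(rating_a, rating_b):
    """Classic Elo expectation that A beats B (assumed 400-point logistic scale)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a, rating_b, a_won, k=32.0):
    """Return the new (rating_a, rating_b) after one game.

    The same delta is added to A and removed from B, so the sum of the
    two ratings never changes: no inflation, no deflation.
    """
    delta = k * ((1.0 if a_won else 0.0) - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Even match: the winner gains 16, the loser drops 16.
even_winner, even_loser = elo_update(1500.0, 1500.0, a_won=True)

# Upset: the favourite (1700) loses to the underdog (1500).
fav_after, dog_after = elo_update(1700.0, 1500.0, a_won=False)
print(even_winner, even_loser, fav_after, dog_after)
```

In the upset case the favourite loses more than 16 points, but the underdog gains exactly that amount, so the pool total is unchanged.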
|
On July 11 2012 19:07 Lysenko wrote: On July 11 2012 19:02 skeldark wrote: There is no difference between a player who has played 40 games and loses 200 in a row and a guy who trades wins and losses every game. Both gain/lose the same MMR relative to the MMR difference. That's not what uncertainty in a player's MMR is. Elo-like systems always deduct the same number of points from the loser that they award to the winner, because if they don't do this then there's score inflation or deflation. Edit: Uncertainty in a player's skill score is a measure of how accurately the Elo-like system is predicting the results of a game. If, over a series of 3 games, the system predicts a 50% win rate and a player wins no games, that places a lower bound on the uncertainty of the player's score. In Not_That's plot, that player's uncertainty is very high, because the MMR is greatly underestimating chances of a win. (In Blizzard's system, it will usually go for an opponent whose predicted win/loss is close to 50%, and the actual numbers through the whole run were 90%+ wins.) That high uncertainty means that either the player's MMR is much different from their actual MMR (which is the case there) or they have been extremely lucky.
You misunderstood me.
When we talk about uncertainty we talk about the countermeasures of the system against such a case. (We back-engineered for so long that we see things from the other side ^^)
And there are none! The player drops his MMR by losing, then he fights his way up again and the system still thinks he would win 50%. Obviously he will win far more. So a system should have a high uncertainty value for such a player to correct these cases. Blizzard's system doesn't have it. That's why we see the long way to fight up again.
If it had such a value for "old" players we would have noticed by now. I wanted to say: he wins/loses as many points as we expect him to. So there is no other function (the uncertainty correction). That's what we mean by "we can ignore it": there is nothing to back-engineer.
|
On July 11 2012 18:58 Lysenko wrote: On July 11 2012 18:52 Not_That wrote: A case can be made why the last MMR value should be used for the purpose of this thread, but in general if you want to describe a player's skill, for players who have been playing for a while, the average of their MMR is probably more accurate description of their skill than the MMR value they happen to sit at in any given moment. Using mean and standard deviation to characterize a measure like MMR where each measurement is strictly dependent on the previous one is a fundamental statistical error. It might be close to valid but my guess is that it's absolutely enough to screw up measures of the likelihood of random results matching one's data set. The reason is that independent measurements generally follow normal, or Gaussian, distributions, which peak at the mean and have well-defined behavior away from the mean based on standard deviation, while measurements that depend on previous ones follow other asymmetric distributions like Poisson that require different measures for uncertainty and for which (critically) the mean doesn't define the peak.
I'm pretty sure it doesn't really matter if MMR is strictly dependent on previous MMR value. It's the same thing as height or age or something like that. Height is dependent on a previous height value (like you're 5' when you're 10 and you're 5'1" when you're 11). It doesn't really matter it's just data.
I'm not even sure how to begin to frame this problem in a binomial or poisson setting.
|
On July 11 2012 19:16 skeldark wrote: Blizzard's system doesn't have it. That's why we see the long way to fight up again.
Blizzard's system does maintain such measures internally, but it's tuned to be a lot less than optimal in stabilizing a player's scores because the optimal tuning results in lots of crushing losses early for new players, and they're interested in giving new players a more fun experience. This is all described in that UCI presentation.
Regardless -- averaging a series of a single player's MMRs is always a mistake because the distribution won't be centered on the mean for that player. Unless you go back and correct that in your analysis, it's really not possible to draw any conclusions from your numbers.
|
On July 11 2012 19:24 Lysenko wrote: On July 11 2012 19:16 skeldark wrote: Blizzard's system doesn't have it. That's why we see the long way to fight up again.
Blizzard's system does maintain such measures internally, but it's tuned to be a lot less than optimal in stabilizing a player's scores because the optimal tuning results in lots of crushing losses early for new players, and they're interested in giving new players a more fun experience. This is all described in that UCI presentation. Regardless -- averaging a series of a single player's MMRs is always a mistake because the distribution won't be centered on the mean for that player. Unless you go back and correct that in your analysis, it's really not possible to draw any conclusions from your numbers.
I understand your point, but it doesn't affect the result at all because it's a race-independent value that is equal for all races. I will remove it anyway because it costs a lot of calculation time and it's better to have actual data.
|
On July 11 2012 19:22 NoobCrunch wrote: I'm pretty sure it doesn't really matter if MMR is strictly dependent on previous MMR value. It's the same thing as height or age or something like that. Height is dependent on a previous height value (like you're 5' when you're 10 and you're 5'1" when you're 11). It doesn't really matter it's just data.
One individual's height doesn't follow a normal distribution either!
It's perfectly valid to compare multiple individual's heights as independent measurements, but you can't take (for example) all the lifetime measures of a person's height, average them together, and then say that's a more accurate measure of their height than the latest single measurement. You can certainly get away with measuring a person's height five times in five minutes and averaging those, because the systematic error (due to growth or shrinkage) is likely to be very small in those five minutes.
However, in the case of Elo-like skill numbers, if you average, say, the last 50 values for a player, the differences between those 50 values are NOT random because they contain cumulative changes in skill over those 50 games. You have to take the latest measurement.
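A rough sketch of why the average lags: assume a player whose true skill is 1900 starts at a rating of 1500 and is always matched against opponents at their current rating. Everything here is an assumption for illustration (plain Elo expectation, K = 32, and the rating moving by its expected change each game instead of random wins and losses); it is not the ladder's real algorithm.

```python
def win_probability(skill, opponent):
    """Elo-style chance that a player of `skill` beats `opponent` (assumed 400 scale)."""
    return 1.0 / (1.0 + 10 ** ((opponent - skill) / 400.0))


def simulate_expected_climb(true_skill=1900.0, start=1500.0, games=100, k=32.0):
    """Rating history when each game moves the rating by its expected change.

    The system predicts 50% (opponent rated equal to the player's rating),
    the player actually wins with p_win, so the expected change is
    k * (p_win - 0.5).
    """
    rating = start
    history = []
    for _ in range(games):
        p_win = win_probability(true_skill, rating)
        rating += k * (p_win - 0.5)
        history.append(rating)
    return history


history = simulate_expected_climb()
latest = history[-1]
average = sum(history) / len(history)
print(f"latest={latest:.0f} average={average:.0f} true=1900")
```

The averaged value still contains all the early games played far below the player's level, so it sits further from 1900 than the latest value does.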
Anyway, I'd note a few things at this point. You don't have to figure all of this out for yourself. First is that according to the Wikipedia page on Elo (references are included there of course) the distribution of strict Elo ratings follows a logistical rather than a normal distribution across multiple players.
http://en.wikipedia.org/wiki/Elo_rating_system
Second is that the more advanced Glicko system, which is closer in design to the Starcraft 2 MMR, does explicitly define an uncertainty measure based on accuracy of its matchmaking predictions. This might be useful to at least see how the people doing this for a living think about these issues:
http://en.wikipedia.org/wiki/Glicko_rating_system
Finally, I believe that Starcraft's system is closer still to Microsoft's TrueSkill, though there are some differences. All the math is here, but I haven't read it closely enough to understand it:
http://research.microsoft.com/en-us/projects/trueskill/details.aspx
|
On July 11 2012 19:29 skeldark wrote: I understand your point, but it doesn't affect the result at all because it's a race-independent value that is equal for all races. I will remove it anyway because it costs a lot of calculation time and it's better to have actual data.
The problem is that the impact of such a mistake is cumulative depending on the specifics of your data set. So, if you have more players in a race or you have more games per player for certain races, you'll get more or less inaccurate results based on that. If you're not using statistical methods that converge to accurate when you add more data, you are certain to get large systematic errors compared to what you'd expect from a random dataset.
|
|
With last MMR value instead of average and the new accounts that got uploaded.
Maxerror: 35.417392849208
ERRORCOUNT:
44.294444444444444% in 5
76.26333333333334% in 10
92.32222222222222% in 15
98.15333333333334% in 20
99.71111111111111% in 25
99.96555555555555% in 30
Race...
T: -26.988169227922526
P: 23.766341174516356
Z: -1.3573738479972235
Analyse DONE
Not much change.
@Lysenko btw you find us often on TL teamspeak. Easier to discuss in voice than in text.
|
I was speaking of simpler systems like Elo. TrueSkill is a hell of a lot more complicated and they may well do that as a means of trying to get the number of games to an accurate estimate down (which was a primary design goal) but they have to be careful to do that in a way that ensures no inflation or deflation.
|
On July 11 2012 19:42 skeldark wrote: @Lysenko btw you find us often on TL teamspeak. Easier to discuss in voice than in text.
Thanks!! I may join you sometime, but I find it worthwhile to discuss here because having to go back and correct any earlier misstatements keeps me honest. :D
|
On July 11 2012 19:45 Lysenko wrote: On July 11 2012 19:42 skeldark wrote: @Lysenko btw you find us often on TL teamspeak. Easier to discuss in voice than in text.
Thanks!! I may join you sometime, but I find it worthwhile to discuss here because having to go back and correct any earlier misstatements keeps me honest. :D
We're talking about statistics. There is no room for being honest here.
check edit with new data above.
|
On July 11 2012 19:37 Lysenko wrote: On July 11 2012 19:22 NoobCrunch wrote: It's perfectly valid to compare multiple individual's heights as independent measurements, but you can't take (for example) all the lifetime measures of a person's height, average them together, and then say that's a more accurate measure of their height than the latest single measurement. You can certainly get away with measuring a person's height five times in five minutes and averaging those, because the systematic error (due to growth or shrinkage) is likely to be very small in those five minutes.
I'm pretty sure it doesn't matter if Lysenko only calculated the latest part of someone's MMR rating. If you think of someone's MMR oscillating like a sine wave around some average point, then (for a large sample) when you happen to pick someone at the height of that wave you will also have picked someone that is at the bottom of that wave. In the long run for a large sample, the net effect will be zero. In the case of someone rapidly climbing or declining on the ladder there will obviously be some error due to not using the average. However, there's no reason to believe that this happens more often for zerg, protoss, or terran so the comparative impact is unclear.
http://i50.tinypic.com/213m4pl.jpg - that's the distribution
I can kind of see the logarithmic dip at the end but I would say that using a normal probability model is perfectly fine.
|
On July 11 2012 19:44 Lysenko wrote: I was speaking of simpler systems like Elo. TrueSkill is a hell of a lot more complicated and they may well do that as a means of trying to get the number of games to an accurate estimate down (which was a primary design goal) but they have to be careful to do that in a way that ensures no inflation or deflation.
I think you're talking out of your ass. There is no inflation or deflation in TrueSkill. Why would there be?
Also, the above poster claiming that independence implies mean = peak and dependence implies mean != peak is also talking out of his ass.
Independence is a property that is possessed by sets of random variables. Any set of independent or dependent random variables can have any probability distribution you want, regardless of whether mean = peak or not. There is no connection between independence and mean = peak.
Finally, in TrueSkill, the skill (or MMR) is explicitly modeled by a normal distribution.
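A quick numerical sketch of that last point, using made-up data and only the Python standard library. The first series is independent but skewed (exponential, so its mean sits well away from its peak); the second is strongly dependent from one value to the next (an assumed AR(1) process with coefficient 0.9) but symmetric around its mean.

```python
import random
import statistics


def lag1_autocorrelation(xs):
    """Sample correlation between consecutive values of a series."""
    m = statistics.fmean(xs)
    num = sum((a - m) * (b - m) for a, b in zip(xs, xs[1:]))
    den = sum((x - m) ** 2 for x in xs)
    return num / den


rng = random.Random(0)
n = 20000

# Independent draws, asymmetric distribution: mean clearly above the median.
independent_skewed = [rng.expovariate(1.0) for _ in range(n)]

# Each value depends on the previous one, yet the distribution is symmetric.
dependent_symmetric = []
x = 0.0
for _ in range(n):
    x = 0.9 * x + rng.gauss(0.0, 1.0)
    dependent_symmetric.append(x)

for name, xs in (("independent", independent_skewed),
                 ("dependent", dependent_symmetric)):
    gap = statistics.fmean(xs) - statistics.median(xs)
    print(name, "mean-median gap:", round(gap, 3),
          "lag-1 autocorrelation:", round(lag1_autocorrelation(xs), 3))
```

Dependence and asymmetry are separate properties; what dependence does break is the usual standard-error arithmetic, which assumes each value carries fresh information.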
|
On July 11 2012 20:22 NoobCrunch wrote: On July 11 2012 19:37 Lysenko wrote: On July 11 2012 19:22 NoobCrunch wrote: It's perfectly valid to compare multiple individual's heights as independent measurements, but you can't take (for example) all the lifetime measures of a person's height, average them together, and then say that's a more accurate measure of their height than the latest single measurement. You can certainly get away with measuring a person's height five times in five minutes and averaging those, because the systematic error (due to growth or shrinkage) is likely to be very small in those five minutes. I'm pretty sure it doesn't matter if Lysenko only calculated the latest part of someone's MMR rating. If you think of someone's MMR oscillating like a sine wave around some average point, then (for a large sample) when you happen to pick someone at the height of that wave you will also have picked someone that is at the bottom of that wave. In the long run for a large sample, the net effect will be zero. In the case of someone rapidly climbing or declining on the ladder there will obviously be some error due to not using the average. However, there's no reason to believe that this happens more often for zerg, protoss, or terran so the comparative impact is unclear. http://i50.tinypic.com/213m4pl.jpg - that's the distribution I can kind of see the logarithmic dip at the end but I would say that using a normal probability model is perfectly fine.
MMR should not be sampled. A sample of MMR isn't a set of independent observations, like the heights of a random group of people, to which the usual techniques of statistical inference can be applied.
MMR is an updated belief. It is the prior belief of skill, updated by the evidence given by whether you win or lose a game. It's the "best Bayesian belief" about a player's skill.
Take the last recorded MMR. Do not sample MMR and average it.
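A toy illustration of "updated belief", assuming a simple Gaussian belief about skill that is updated after each observed performance. The prior, the noise level and the performance numbers are all invented, and this is not the real MMR formula. The point it shows: the last belief already equals the estimate you would get from all the evidence at once, while the average of the intermediate beliefs does not.

```python
def update_belief(mean, var, observation, obs_var):
    """One Bayesian update of a Gaussian belief with Gaussian observation noise."""
    gain = var / (var + obs_var)
    return mean + gain * (observation - mean), (1.0 - gain) * var


def batch_posterior_mean(prior_mean, prior_var, observations, obs_var):
    """Posterior mean computed from all observations in a single step."""
    precision = 1.0 / prior_var + len(observations) / obs_var
    return (prior_mean / prior_var + sum(observations) / obs_var) / precision


# Hypothetical numbers: prior belief 1500 +/- 350, noisy per-game performances.
prior_mean, prior_var, obs_var = 1500.0, 350.0 ** 2, 200.0 ** 2
performances = [1800.0, 1750.0, 1900.0, 1850.0, 1820.0, 1880.0]

mean, var = prior_mean, prior_var
belief_history = []
for perf in performances:
    mean, var = update_belief(mean, var, perf, obs_var)
    belief_history.append(mean)

last_belief = belief_history[-1]
averaged_beliefs = sum(belief_history) / len(belief_history)
all_at_once = batch_posterior_mean(prior_mean, prior_var, performances, obs_var)
print(round(last_belief, 1), round(all_at_once, 1), round(averaged_beliefs, 1))
```

Averaging the history counts the early evidence several times and drags the estimate back toward the prior.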
|
Good work skeldark. It's obvious Terran is severely underpowered now with all the tournament results lately. However, your statistics confirm how weak Terran really is.
|
On July 11 2012 13:27 _Search_ wrote: On July 11 2012 12:57 Niazger wrote: On July 11 2012 11:37 _Search_ wrote: I'm really not understanding how the OP draws his conclusions.
Is he comparing the win rates of races where players have different MMRs? As in, Zerg is overpowered because players with lower MMRs are beating players with higher MMRs? If so, the conclusions are laughably overreaching. Despite all the esteem given to MMR, it's a terrible indicator of skill because it's based on win rates and averaged across the race. To put it concisely: balance dictates win rates, which dictate MMR, which the OP is using to determine balance. It's totally circular.
Also, certain races are just plain easier to win with using lower skill. Some races rely more on luck. How many Protoss wins can be attributed to a lucky DT timing? How many TvZs have been won by getting one medivac in the right place at the right time? Its widely accepted that Protoss is the easiest race to play and Zerg is the hardest. How does that factor into the OPs findings? Naniwa, for one, has said that the immortal sentry PvZ allin is far easier to execute than it is to stop, (though I think this description could be applied to most Protoss attacks, and to attacking in general, which helps Protoss the most since they have the strongest attacks).
It's also easier to cheese with certain races, and, assuming that a cheese win is a non-skill based win, that would give Protosses another undeserved boost in win rates, since they are doubtless the biggest cheesers. The OP treats all wins as equally legitimate, when many are clearly bullshit. I play Terrans on the ladder all the time who refuse to guard against a 6 pool, saying they'd rather lose. They go for a super greedy opening that plain straight up loses to a potential counter build. Others refuse to guard against DT openings. How are those games legitimate? These players will never be able to win against the same opponent twice!
I also totally reject the notion that each race receives an equal degree of skilled and unskilled players. Heck, just comparing the Korean to the foreigner Terrans one can see a readily apparent skill gap, one that isn't there with Protoss and Zerg.
Even then, most newcomers gravitate to Terran or Protoss (because of the campaign/because of the instant easiness). I have more than one friend who has abandoned SC2 entirely because Zerg was just too difficult to play.
Last, Zerg recently received a fairly significant buff, which means that, if the buff did what it was supposed to do, Zergs SHOULD be winning over higher MMR opponents right now. That was the point of the buff! To move Zergs up the ladder and give them higher tournament representation! In other words, something would be wrong if Zergs WEREN'T winning more! Did the OP take this into account? Did he calculate the win rates before and after the patch separately?
These are the issues I have with the OP's method.
Edit: I would also love to see how this relates to the maps. Many of the maps in the pool have severe balance issues, which always affect Zerg most heavily. But those maps are being slowly weeded out and as more balanced maps enter the pool we see Zergs winning more. Most recently Korhal Compound and Metalopolis were removed (both of which were terrible for Zerg if they spawned close positions on Metalopolis). Every season the map changes have been a subtle buff to Zerg. How do the recent map changes affect the OP's findings?
I'm sorry bro but you could've saved a lot of time. You aren't even close to understanding how the OP came to his results yet you post this wall of text. Also your rant about luck is pretty retarded tbh. If getting a medivac in the right position/DTs are luck I guess we all should just roll the dice at the beginning of the game.
No u. Rather than just saying, "you're wrong" why don't you say something that might show how I'm wrong? And if you think there is no risk-to-reward skew in this game then you've never faced a TvP where the Protoss hid a pylon in your base.
Well, In your initial post you make it pretty obvious that you don't understand the basis of the OP so it's kind of pointless to argue with you.
The risk/reward might be skewed but that's not luck in my opinion. The risk of losing a pylon in your opponent's base, which will win you the game if you get 4 gates up, is kind of skewed, but then again that has nothing to do with luck if you forget to check for pylons/don't watch the probe.
Also MMR is =/= skill. If MMR was = skill all three races should have the exact same MMR unless somehow all the "skilled" players only played one race.
|
Yes it is great to have statistics to support what most players knew all along, Terran is UP, especially against Zerg.
But statistics is just part of the picture, just look at how much more the Terran needs to do each game in order to win. Terran need to attack constantly in order to avoid late game at all cost. Meanwhile Protoss and even more Zerg can just lean back, defend and macro up.
The difference in required micro is close to absurd. What Zerg and Protoss really would need are units that are worthless without micro but good with micro. That in itself would even out the game.
|
Meh. Doesn't say a lot in my opinion. Nobody should use anything below Korean GM for statistics. Below Gold Zerg is very weak because the players are lacking the mechanics. Protoss is quite easy on the other hand so lower level players are much better with it. I play Protoss and Terran btw.
|
Reaffirms and confirms that blizzard is most qualified to gauge balance statistically.
|
On July 11 2012 21:55 Aunvilgod wrote: Meh. Doesn't say a lot in my opinion. Nobody should use anything below Korean GM for statistics. Below Gold Zerg is very weak because the players are lacking the mechanics. Protoss is quite easy on the other hand so lower level players are much better with it. I play Protoss and Terran btw.
So only Korean progamers are allowed to play a balanced game? Sorry but SC2 is also for gold zergs. If you need to practice 14h a day to enjoy a balanced game, there is a huge problem.
I'm a mid master Terran on EU and even if I enjoy the pro scene a lot, I don't give a fuck about the Korean GM ladder. I play for myself and I wish I could play an even game against a protoss or a zerg.
Anyway, awesome thread, I quite enjoyed reading the whole thing.
|
Best approach to measure racial balance I came across. I'm relieved to see that the differences seem to be relatively small.
Hope you can gather enough data to calculate solid numbers for master+ soon. In the end that is the level most people here care about.
I wonder whether we will be able to see changes caused by future patches as well. Or even see which regions can adapt faster to changes. Keep it up.
ps: If you expand your OP with information regarding the methods you might be able to avoid some discussions.
|
On July 11 2012 09:34 lolcanoe wrote: I'm confused here - why are there 4 random groups? Why not just lump them together?
But I'm still 99.31% sure Terran is underpowered: -30.09
Where the hell are you getting these numbers from? I'm assuming this 99.3 comes from some application of the normal distribution - have you thoroughly established normality in MMR? Based on my understanding of the MMR system, you would not expect normality... As a stats junky, I'm very suspicious of the statistics here.
4 because I had 4 races (with Random); 2 runs had only 3. It doesn't have to be this number; it's only to get roughly close to the number of accounts per race. MMR is normal by definition (if nothing goes wrong).
If this data is normal I cannot tell; I did not test yet.
It's just the % chance that random data would create such a result, depending on the tests. I'm not 100% sure if you can do it this way but I don't see why not. I'm no statistics guru in any way. I am not terrible at it, but it's many years since I worked with it.
Basically I just produce random data to show that the data is not biased and how "stable" it is. I think there are better ways to show this, but this is the fastest with a reasonable result.
Man.. have you really taken stats before? Does that data look normal to you? Look at the graph for fucks sake. You have a pretty severe right skew without a very defined bell!
You can't use your deviation calculations if you don't establish normality. At least make note of the huge gaping inaccuracy here! You really need to leave detailed instructions of what each value means and how each one was calculated. Where exactly does each number come from? Where does 99% confidence come from? What do you mean by "-29"...?
Lastly, if you don't know what tests you are running or if they are correct, how the fuck can you make conclusions from them? I respect good statistical work, but the largest mistake in stats is to make claims that don't have their nuances clearly understood.
|
This is an amazing project. Thank you for putting it together. I'm so intrigued reading this stuff. AMAZING WORK MAN!
|
On July 11 2012 02:08 Stiluz wrote: It would be interesting if you could divide this by league too.
Yes PLEASE would love to see the break out by league.
I would bet that as skill increases so does balance
|
On July 11 2012 23:31 lolcanoe wrote: <Rants>
Welcome to TL standard of statistics. Don't hold your breath for correct application.
Assumes shape is normal
Player MMR can fluctuate min max by 1k MMR
It still does not account for error on each individual player MMR
|
On July 11 2012 23:00 xian_ wrote: Best approach to measure racial balance I came across. I'm relieved to see that the differences seem to be relatively small.
Hope you can gather enough data to calculate solid numbers for master+ soon. In the end that is the level most people here care about.
I wonder whether we will be able to see changes caused by future patches as well. Or even see which regions can adapt faster to changes. Keep it up.
ps: If you expand your OP with information regarding the methods you might be able to avoid some discussions.
Then the OP would be 200 pages long. There will always be missing information and there are always people who don't think and just assume what they want. I cannot explain the whole MMR calculation and all the basics of skill systems in this post. If you know something about it you are able to ask reasonable questions. The rest....
Oh btw: @lolcanoe
Based on my understanding of the MMR system, you would not expect normality... As a stats junky, I'm very suspicious of the statistics here.
I hope you're trolling...
|
my theory: intelligent players like lasers -> players who like lasers pick protoss -> clever players are better than stupid players -> Protoss seems op
User was warned for this post
|
Edited some parts of the OP to make it clearer what I did. Before you guys start the whole standard deviation question again, read through this thread. Read the statistics discussion. Read it again. Then think for 15 minutes about everything.
Then read your own post again ...
Back to the important points: -I don't think that only intelligent players like lasers!
|
On a side note to Not_That graphs: Pls use red for Zerg and yellow for Protoss. I guess most of the ppl here are used to that bc of liquipedia.
|
On July 12 2012 01:10 OrbitalPlane wrote: On a side note to Not_That graphs: Pls use red for Zerg and yellow for Protoss. I guess most of the ppl here are used to that bc of liquipedia.
Already corrected.
|
On July 12 2012 01:10 OrbitalPlane wrote: On a side note to Not_That graphs: Pls use red for Zerg and yellow for Protoss. I guess most of the ppl here are used to that bc of liquipedia.
Yea i agree they should change it, its big problem, how can i recognize races if COLORS are wrong.
|
On July 12 2012 01:10 OrbitalPlane wrote: On a side note to Not_That graphs: Pls use red for Zerg and yellow for Protoss. I guess most of the ppl here are used to that bc of liquipedia.
Protoss icon is green on stream list, and that's the color I associate Protoss with based on my experience with TL. I imagine I'm not the only one.
|
On July 12 2012 01:19 Not_That wrote: On July 12 2012 01:10 OrbitalPlane wrote: On a side note to Not_That graphs: Pls use red for Zerg and yellow for Protoss. I guess most of the ppl here are used to that bc of liquipedia. Protoss icon is green on stream list, and that's the color I associate Protoss with based on my experience with TL. I imagine I'm not the only one.
That is not correct. I use the Teamliquid dark theme and for me it's Teal!
|
I calculate the average skill of an race not the generell popularity!
What is generell popularity? Is it like general popularity?
|
On July 12 2012 01:24 treekiller wrote: What is generell popularity? Is it like general popularity?
No! The general of the popularity has nothing to do with this.
|
On July 11 2012 20:36 paralleluniverse wrote: MMR should not be sampled. A sample of MMR isn't a set of independent observations like the height of a random group of people for which the usual techniques of statistical inference can be applied to.
Exactly. You can't apply the statistics of normal distributions to data sets whose points are not independent measurements. The problem isn't that it works in some cases and not others -- in fact the problem is that the more math you do with the wrong assumptions, the bigger your systematic error is in your result.
I need to take a much closer look at how the OP is deriving his results. He asserts that correcting this didn't yield much change, but also it looks like he's not exactly doing a straight-up statistical analysis either, but instead comparing his results to some kind of monte carlo simulation... and of course those kinds of simulations require a whole different level of inspection to validate that I don't see having happened here.
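For readers wondering what a "compare against random groups" check could look like, here is a minimal sketch of a permutation test on the per-race offset from the overall average. This is not the OP's program. The function names are made up for illustration and the data below is synthetic, generated only so the example runs.

```python
import random


def race_offset(mmrs, races, race):
    """Mean MMR of one race minus the mean MMR of all accounts."""
    overall = sum(mmrs) / len(mmrs)
    group = [m for m, r in zip(mmrs, races) if r == race]
    return sum(group) / len(group) - overall


def permutation_p_value(mmrs, races, race, iterations=1000, seed=0):
    """Share of random relabelings whose offset is at least as extreme as observed.

    Shuffling the race labels keeps group sizes and the MMR values intact,
    so no assumption about the shape of the MMR distribution is needed.
    """
    rng = random.Random(seed)
    observed = race_offset(mmrs, races, race)
    shuffled = list(races)
    hits = 0
    for _ in range(iterations):
        rng.shuffle(shuffled)
        if abs(race_offset(mmrs, shuffled, race)) >= abs(observed):
            hits += 1
    return observed, hits / iterations


# Synthetic accounts (NOT ladder data): Terran centered 100 MMR lower.
data_rng = random.Random(1)
races = ["T"] * 200 + ["P"] * 200 + ["Z"] * 200
centers = {"T": 1450.0, "P": 1550.0, "Z": 1550.0}
mmrs = [data_rng.gauss(centers[r], 100.0) for r in races]

observed, p_value = permutation_p_value(mmrs, races, "T")
print(f"T offset: {observed:.1f}, share of random groupings as extreme: {p_value:.3f}")
```

A test like this only says the offset is unlikely under random labeling; it assumes one independent value per account and says nothing about why the offset exists.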
|
On July 12 2012 00:08 lazyitachi wrote: Assumes shape is normal Player MMR can fluctuate min max by 1k MMR It still does not account for error on each individual player MMR
Those last two are not necessarily problems with the analysis of a population, but the concerns of the guy you quoted are spot on.
|
On July 12 2012 00:12 skeldark wrote: Oh btw: @lolcanoe Based on my understanding of the MMR system, you would not expect normality... As a stats junky, I'm very suspicious of the statistics here. i hope you troll...
No, he's right -- Elo, at least, usually follows a logistic distribution across a population. I pointed that out earlier, btw.
http://en.wikipedia.org/wiki/Logistic_distribution
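For reference, a short sketch of the logistic distribution from the linked article, next to the standard Elo expected-score formula, which has the same S-shaped form in base 10. The symbols (location \mu, scale s, ratings R_A and R_B) are the usual textbook ones and do not come from this thread; whether Blizzard's MMR follows either form is not established here.

```latex
% Logistic distribution: CDF, density and variance
F(x;\mu,s) = \frac{1}{1 + e^{-(x-\mu)/s}}, \qquad
f(x;\mu,s) = \frac{e^{-(x-\mu)/s}}{s\,\bigl(1 + e^{-(x-\mu)/s}\bigr)^{2}}, \qquad
\operatorname{Var}(X) = \frac{s^{2}\pi^{2}}{3}

% Elo expected score of player A against player B
E_A = \frac{1}{1 + 10^{(R_B - R_A)/400}}
```

The logistic density has heavier tails than a normal density with the same variance, which is one reason a normality assumption would need checking.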
|
On July 11 2012 23:31 lolcanoe wrote:Show nested quote +On July 11 2012 09:34 lolcanoe wrote: I'm confused here - why are there 4 random groups? Why not just lump them together?
Where the hell are you getting these numbers for? I'm assuming this 99.3 comes from some application of the normal distribution - have you been thorough established normality in MMR? Based on my understanding of the MMR system, you would not expect normality... As a stats junky, I'm very suspicious of the statistics here. Show nested quote + 4 because i had 4 races ( with random ) 2 run had only 3 . Dont have be this number only to get kind of close to the amount of accounts of the races. MMR is normal by definition ( if nothing goes wrong)
If this data is normal i can not tell i did not test yet
Its just the % of the chance that random data would create such an result. depending on tests. Im not 100% sure if you can do it this way but i dont see why not. Im no statistic guru in any way. I am not terrible in it but its many years ago i worked with it.
Basic i just produce random data to show that the data is not biased and how "stable" it is. I think there are better ways to show this but this is the fastest with an reasonable result.
Man.. have you really taken stats before? Does that data look normal to you? Look at the graph for fucks sake. You have a pretty severe right skew without a very defined bell! You can't use your deviation calculations if you don't establish normality. At least make note of the huge gaping inaccuracy here! You really need to leave detailed instructions of what each value means and how each one was calculated. Where exactly does each number come from? Where does 99% confidence come from? What do you mean by "-29"...? Lastly, if you don't now what tests you are running or if they are the correct, how the fuck can you make conclusions from them? I respect good statistical work, but the largest mistake in stats is to make claims that don't have their nuances clearly understood.
Just give up, arguing stats on the TL is just worthless. Also there is definately a language barrier thing in this case. Overall a good experiment, I completely agree though that the execution and claims made here have no foundation.
|
On July 12 2012 01:53 Lysenko wrote:Show nested quote +On July 12 2012 00:12 skeldark wrote:Oh btw: @lolcanoe Based on my understanding of the MMR system, you would not expect normality... As a stats junky, I'm very suspicious of the statistics here. i hope you troll... No, he's right -- Elo, at least, usually follows a logistic distribution across a population. I pointed that out earlier, btw. http://en.wikipedia.org/wiki/Logistic_distribution
Beside all of this. People act like they have a problem with the way i analysed the data. But no one had one single point where i did something wrong. You can say: "i did not understand what you did at point X, can you please explain it and i allways did" You can say : i think you make an mistake at point Y because this will result in this error ... and i will check it.
But what i see so far is this:
- this is wrong because its wrong
- Skill that dont take apm into account is no skill / you only look at win streaks you should check for the skill not the mmr
- Lets talk about something total off topic about statistic...
Often people start with so obvious wrong arguments that its hard for me to force myself to read the rest. I did not see an single post that point out an mistake. If you want to atack my analyse you are welcome, but you have to bring a point. And with you i dont mean you personal.
Whatever mmr is, it have nothing to do with the analyse result! I think i show pretty simple that the "race-groups average" is to high to be a random mistake. And there is nothing in the data that is depending on the player race. So the race value must be biased. = not balanced with the whole data.
If i just come up with a random number that i write behind each account and i accidental get such results on the first try every-time with different data i must be:
A ) very very lucky or B) my random number is not random it shows an biased in the racedata.
Yes im not not exactly doing a straight-up book statistical analysis. Thats what i said in the op from the beginning. But i do an statistical analyse.
|
On July 12 2012 02:17 skeldark wrote:Show nested quote +On July 12 2012 01:53 Lysenko wrote:On July 12 2012 00:12 skeldark wrote:Oh btw: @lolcanoe Based on my understanding of the MMR system, you would not expect normality... As a stats junky, I'm very suspicious of the statistics here. i hope you troll... No, he's right -- Elo, at least, usually follows a logistic distribution across a population. I pointed that out earlier, btw. http://en.wikipedia.org/wiki/Logistic_distribution Beside all of this. People act like they have a problem with the way i analysed the data. But no one had one single point where i did something wrong. You can say: "i did not understand what you did at point X, can you please explain it and i allways did" You can say : i think you make an mistake at point Y because this will result in this error ... and i will check it.But what i see so far is this. -this is wrong because its wrong -Skill that dont take apm into account is no skill / you only look at win streaks you should check for the skill not the mmr - Lets talk about something total off topic about statistic... Often people start with so obvious wrong arguments that its hard for me to force myself to read the rest. I did not see an single post that point out an mistake. If you want to atack my thesis you are welcome, but you have to bring a point. And with you i dont mean you personal.
Whatever mmr is, it have nothing to do with the analyse result! I think i show pretty simple that the "race-groups average" is to high to be a random mistake. And there is nothing in the data that is depending on the player race. So the race value must be biased. = not balanced with the whole data. If i just come up with a random number that i write behind each account and i accidental get such results, im A ) very very very lucky or B) my random number is not random it shows an biased in the racedata.
I think the main problem people are having is that they have no "baseline" to compare your final results to. -26.98 for Terran is a fine result, but I don't know what that means compared to other months or moments in time. If you did the same test for the ladder 3 months after SC2's release and showed the results side by side, I think people would have a better idea of what the data means. Or if you did it monthly and we could see trends.
I am not claiming the results don't show anything, but people need raw data to be grounded with some real world examples of what it means. Then when it changes, they can apply the data to the changes they see taking place.
|
On July 12 2012 02:38 Plansix wrote:Show nested quote +On July 12 2012 02:17 skeldark wrote:On July 12 2012 01:53 Lysenko wrote:On July 12 2012 00:12 skeldark wrote:Oh btw: @lolcanoe Based on my understanding of the MMR system, you would not expect normality... As a stats junky, I'm very suspicious of the statistics here. i hope you troll... No, he's right -- Elo, at least, usually follows a logistic distribution across a population. I pointed that out earlier, btw. http://en.wikipedia.org/wiki/Logistic_distribution Beside all of this. People act like they have a problem with the way i analysed the data. But no one had one single point where i did something wrong. You can say: "i did not understand what you did at point X, can you please explain it and i allways did" You can say : i think you make an mistake at point Y because this will result in this error ... and i will check it.But what i see so far is this. -this is wrong because its wrong -Skill that dont take apm into account is no skill / you only look at win streaks you should check for the skill not the mmr - Lets talk about something total off topic about statistic... Often people start with so obvious wrong arguments that its hard for me to force myself to read the rest. I did not see an single post that point out an mistake. If you want to atack my thesis you are welcome, but you have to bring a point. And with you i dont mean you personal.
Whatever mmr is, it have nothing to do with the analyse result! I think i show pretty simple that the "race-groups average" is to high to be a random mistake. And there is nothing in the data that is depending on the player race. So the race value must be biased. = not balanced with the whole data. If i just come up with a random number that i write behind each account and i accidental get such results, im A ) very very very lucky or B) my random number is not random it shows an biased in the racedata. I think the main problem people are having is they have no "baseline" to compair your final results too. -26.98 for terran is a fine result, but I don't know what that means compaired to other months or moments in time. If you did the same test for the ladder 3 months after SC2s release and showed the results, side by side, I think people would have a better idea of what the data means. Or if you did it monthly and we could see trends. I am not claiming the results don't show anything, but people need raw data to be grounded with some real world examples of what it means. Then when it changes, they can apply the data to the changes they see taking place. good point. I dont know it either. I will do this. At the moment i only have data of my last patch because all before did not upload the race data. So from now on i get new data every day. I can do the exact same with a new dataset.
I use this moment to say sorry, if i offended people that make valid points. But you guys have to understand me. I see a lot of post that are so completely wrong that you start to get really annoyed.
|
Are we sure arithmetic mean is the way to go here? One of the problems with arithmetic mean is that it's vulnerable to being skewed by outliers on the extremes. Just looking at your chart, the top 2 data points are Protoss and the bottom 2 data points are Terran. It seems plausible to me that the races of those outlying data points might be random (might get different outlying races in a different sample), and that they might have undue influence on the results.
Another issue that might be worth looking into is the possibility of different trends at different points in the distribution. People at TL really only care about balance from masters and above, but the majority of the data used in your analysis is coming from diamond and below. SC2 is a totally different game in diamond than it is in high masters, and it seems plausible that the races would have different strength levels at the differently levels accordingly. For example. marine vs baneling balance is hugely dependent on skill level. In diamond and below, banelings crush marines, but as you approach GM, marines perform much better. I don't mean to pile work on you since I'm not willing to do it, but if someone is interested and motivated, it would probably be worth isolating skill level ranges and seeing what you find.
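A minimal sketch of the outlier concern raised above, comparing the arithmetic mean with two more robust summaries. The MMR values are made up for illustration and `trimmed_mean` is a hypothetical helper, not something the OP's tool is known to contain.

```python
from statistics import mean, median

def trimmed_mean(values, cut=0.1):
    """Mean after dropping the lowest and highest `cut` fraction of values."""
    ordered = sorted(values)
    k = int(len(ordered) * cut)
    kept = ordered[k:len(ordered) - k] if k else ordered
    return mean(kept)

# Hypothetical MMR values for one race: nine similar accounts and one
# extreme account near the top of the ladder.
sample = [1500, 1510, 1520, 1530, 1540, 1550, 1560, 1570, 1580, 3200]

plain = mean(sample)            # pulled upwards by the single 3200 account
middle = median(sample)         # ignores how extreme the top account is
trimmed = trimmed_mean(sample)  # drops one value at each end before averaging
```

The same three summaries could be computed separately per skill bracket to see whether the race differences change between, say, diamond and masters.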
|
On July 12 2012 02:47 skeldark wrote:Show nested quote +On July 12 2012 02:38 Plansix wrote:On July 12 2012 02:17 skeldark wrote:On July 12 2012 01:53 Lysenko wrote:On July 12 2012 00:12 skeldark wrote:Oh btw: @lolcanoe Based on my understanding of the MMR system, you would not expect normality... As a stats junky, I'm very suspicious of the statistics here. i hope you troll... No, he's right -- Elo, at least, usually follows a logistic distribution across a population. I pointed that out earlier, btw. http://en.wikipedia.org/wiki/Logistic_distribution Beside all of this. People act like they have a problem with the way i analysed the data. But no one had one single point where i did something wrong. You can say: "i did not understand what you did at point X, can you please explain it and i allways did" You can say : i think you make an mistake at point Y because this will result in this error ... and i will check it.But what i see so far is this. -this is wrong because its wrong -Skill that dont take apm into account is no skill / you only look at win streaks you should check for the skill not the mmr - Lets talk about something total off topic about statistic... Often people start with so obvious wrong arguments that its hard for me to force myself to read the rest. I did not see an single post that point out an mistake. If you want to atack my thesis you are welcome, but you have to bring a point. And with you i dont mean you personal.
Whatever mmr is, it have nothing to do with the analyse result! I think i show pretty simple that the "race-groups average" is to high to be a random mistake. And there is nothing in the data that is depending on the player race. So the race value must be biased. = not balanced with the whole data. If i just come up with a random number that i write behind each account and i accidental get such results, im A ) very very very lucky or B) my random number is not random it shows an biased in the racedata. I think the main problem people are having is they have no "baseline" to compair your final results too. -26.98 for terran is a fine result, but I don't know what that means compaired to other months or moments in time. If you did the same test for the ladder 3 months after SC2s release and showed the results, side by side, I think people would have a better idea of what the data means. Or if you did it monthly and we could see trends. I am not claiming the results don't show anything, but people need raw data to be grounded with some real world examples of what it means. Then when it changes, they can apply the data to the changes they see taking place. good point. I dont know it either. I will do this. At the moment i only have data of my last patch because all before did not upload the race data. So from now on i get new data every day. I can do the exact same with a new dataset. I use this moment to say sorry, if i offended people that make valid points. But you guys have to understand me. I see a lot of post that are so completely wrong that you start to get really annoyed.
If you want a really sharp contrast, try to get results from the 4-5 rax reaper era. Even if you could only get a small amount of data from good players at that time, you could trim the current data down so that the results are comparable to each other.
You could also open with a line like: "You want to see data that proves imbalance? You can't handle the data that proves imbalance!" Then show us the data.
|
On July 12 2012 02:17 skeldark wrote: Beside all of this. People act like they have a problem with the way i analysed the data. But no one had one single point where i did something wrong.
You were averaging series' of players MMR and then averaging those into your data set, assuming incorrectly that they were independent data points. That's one error. You say it doesn't have a significant impact, but you came back with that answer suspiciously fast, which tells me you didn't really understand the nature of the concern or what you'd have to do to check that you were correctly applying your statistics everywhere. This is such a fundamental type of problem that it calls into question everything else that you've done behind the scenes (particularly since you don't show your work leading to your results.)
I think you need to take a deep breath and think about what we're actually saying.
Statistical methods like you're using are based on assumptions about how and what kind of data you collect. If you do not collect the data in a way that matches the assumptions of your statistics, you will not get meaningful results. You'll either come up with a measure of how good your numbers are that doesn't really mean what you think it does, or you'll get errors that add up as you collect more data, or you'll fit a curve that doesn't apply to the data set you're collecting.
You're looking at concerns about these things, valid concerns, and saying "well the effect is small, I can hand-wave it all away." No, you really can't. You have to go back to where you started, check every assumption, learn about the statistics you don't know, and convince the rest of us that you applied them all correctly by showing us your math.
You said earlier in this conversation that you couldn't show us everything because it would be "200 pages." Well, in my view, that's 200 pages in which to make a fundamental mistake. If you can't share every step, there's no way to check your work or understand what you've done. There's no way to know whether you're onto something great or just pulling meaningless numbers out of nowhere.
Don't take this personally -- you may have a great data set on your hands. But your responses to the concerns we're raising do not inspire confidence that you understand the issues involved in doing what you're doing.
Yes im not not exactly doing a straight-up book statistical analysis. Thats what i said in the op from the beginning. But i do an statistical analyse.
It's all or nothing. You either double and triple check that your data set matches the assumptions of the statistical method you're using, or your numbers don't mean anything.
Finally: I'm certainly not in any way offended by anything you've said. I want you to succeed. But, if you're going to predict the likelihood of your result being random, you're deep into actual statistical analysis, and you'd better do it correctly or it just doesn't mean anything.
|
It's all or nothing. You either double and triple check that your data set matches the assumptions of the statistical method you're using, or your numbers don't mean anything.
A check can prove whether they are meaningful, but it doesn't make them meaningful! I think that is the main problem. You wait for the kind of proof you are used to seeing. So you assume it's not meaningful even if it's obvious that it is. I think I found a way to convince you. That was the reason why I did not care about the average of MMR: it does not matter what MMR is. Forget sc2, MMR and races for a second:
Given:
Data A. Data A was created without knowledge of P.
Property P. Property P was collected without knowledge of A.
90,000 randomly sorted data groups of A produced values between -25 and +25 in 99.55% of cases.
P-sorted data groups of data A produced: T: -53.68, P: 11.87, Z: 31.52
P-sorted data groups of data B (subgroup of A) produced: T: -27.70, P: 17.49, Z: 3.82
P-sorted data groups of data C (subgroup of A) produced: T: -43.51, P: 0.37, Z: 34.93
Is this not statistically significant, that Data A is biased towards P?
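The random-groups check described above resembles a shuffle (permutation) test. The sketch below is an assumption about how such a check could look, not the OP's actual routine: it shuffles the race labels and counts how often the largest group difference is at least as big as the observed one. The function names and the `runs` and `seed` parameters are hypothetical.

```python
import random
from statistics import mean

def group_diffs(mmrs, labels):
    """Mean MMR of each label group minus the mean MMR of all accounts."""
    overall = mean(mmrs)
    groups = {}
    for mmr, label in zip(mmrs, labels):
        groups.setdefault(label, []).append(mmr)
    return {label: mean(values) - overall for label, values in groups.items()}

def shuffle_test(mmrs, labels, runs=2000, seed=0):
    """Return (observed largest |difference|, share of shuffles reaching it)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    observed = max(abs(d) for d in group_diffs(mmrs, labels).values())
    shuffled = list(labels)
    hits = 0
    for _ in range(runs):
        rng.shuffle(shuffled)  # breaks any link between label and MMR
        largest = max(abs(d) for d in group_diffs(mmrs, shuffled).values())
        if largest >= observed:
            hits += 1
    return observed, hits / runs
```

The returned share plays the role of a P-value: a small value means random labels rarely produce differences as large as the real race labels do. It treats each account as one independent data point, so the independence concern raised earlier in the thread still applies.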
|
|
On July 12 2012 03:24 skeldark wrote: So you assume its not meaningful even if its obvious that it is.
If you haven't shown us that your data set fits the assumptions used by the model you're using to analyze it, nothing is "obvious."
|
On July 12 2012 04:54 Lysenko wrote:Show nested quote +On July 12 2012 03:24 skeldark wrote: So you assume its not meaningful even if its obvious that it is.
If you haven't shown us that your data set fits the assumptions used by the model you're using to analyze it, nothing is "obvious." Look my example again. Forget sc2. forget mmr.
Data A, any data. I assume its significant. you agree or not?
I think you misunderstand what im doing here. I dont get paid for it and i have no benefit from it at all. I dont care at all witch race is imba. I have the data and so i publish it.
You can say its meaningless for you because i did not do it in the way you are used to do it. But in this case why dont you do it your way and publish it here?
|
On July 12 2012 05:22 skeldark wrote:Show nested quote +On July 12 2012 04:54 Lysenko wrote:On July 12 2012 03:24 skeldark wrote: So you assume its not meaningful even if its obvious that it is.
If you haven't shown us that your data set fits the assumptions used by the model you're using to analyze it, nothing is "obvious." look my example again. Data A any data. 0 Assumptions. You agree its significant or not? Statistical signficance is quantifiable - this isn't a matter of agreement or disagreement. Set a P-value, do a single greater than/less than comparison, and you either have it or you don't. If you knew statistics well you'd be asking for better tests than accusing me of "trolling".
Can you please just layout what exactly happenned here? Just lay out how you are getting to the .-29.87 etc values. You take a group. Average the MMR... and then? What's this number coming form exactly? Is it the difference from the mean of all groups?
Lastly, statitistical studies are NEVER perfect, whether it be sampling, distribution fitting, or model assumptions. We're not holding you to a perfect standard - what you need to do is clearly layout the ASSUMPTIONS you made while compiling the data so the readers can determine whether or not these assumptions invalidate the conclusion.
|
On July 12 2012 05:26 lolcanoe wrote:Show nested quote +On July 12 2012 05:22 skeldark wrote:On July 12 2012 04:54 Lysenko wrote:On July 12 2012 03:24 skeldark wrote: So you assume its not meaningful even if its obvious that it is.
If you haven't shown us that your data set fits the assumptions used by the model you're using to analyze it, nothing is "obvious." look my example again. Data A any data. 0 Assumptions. You agree its significant or not? Statistical signficance is quantifiable - this isn't a matter of agreement or disagreement. Set a P-value, do a single greater than/less than comparison, and you either have it or you don't. If you knew statistics well you'd be asking for better tests than accusing me of "trolling". Can you please just layout what exactly happenned here? Just lay out how you are getting to the .-29.87 etc values. You take a group. Average the MMR... and then? What's this number coming form exactly? Is it the difference from the mean of all groups? Lastly, statitistical studies are NEVER perfect, whether it be sampling, distribution fitting, or model assumptions. We're not holding you to a perfect standard - what you need to do is clearly layout the ASSUMPTIONS you made while compiling the data so the readers can determine whether or not these assumptions invalidate the conclusion.
Like I said... I take the group of one race, average its MMR, subtract the total average MMR of the data from it, and get a value: the difference between the race group and the expectation. I assume: when there is no imbalance, the average of all Terran players = the average of all players.
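A minimal sketch of that calculation, assuming one (race, MMR) pair per account. The sample values and the `race_differences` helper are made up for illustration and are not the real ladder data or the tool's code.

```python
from statistics import mean

# Hypothetical (race, MMR) pairs, one per account; not real ladder data.
accounts = [
    ("T", 1400.0), ("T", 1500.0),
    ("P", 1600.0), ("P", 1700.0),
    ("Z", 1500.0), ("Z", 1600.0),
]

def race_differences(rows):
    """Return {race: mean MMR of that race - mean MMR of all accounts}."""
    overall = mean(mmr for _, mmr in rows)
    by_race = {}
    for race, mmr in rows:
        by_race.setdefault(race, []).append(mmr)
    return {race: mean(values) - overall for race, values in by_race.items()}

diffs = race_differences(accounts)
```

Under the stated assumption of no imbalance, every entry in `diffs` would be close to zero.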
Serious do i own you anything? because you act like it. Read your post in this thread,
You waste all your time attacking my points and dont take a single second on doing something useful yourself. You say to agree you have to do x and than tell me to do it?
If you come up and say: i calculated it and i can prove your wrong. that would be fine!
We are not at work here or at university. We are on a community forum. If you are interested in it and think its important to calculate it why dont you just do it?
|
|
On July 12 2012 05:57 monkybone wrote: I think average MMR is a good statistic and does signify balance. But it assumes one key thing: the activity level of the players of each race is equally distributed. I.e. Zerg, Terran and Protoss have proportional amounts of players playing at each activity level. By focusing on diamond and above I believe this is something we can assume. Taking in every player, the activity level distribution will presumably be heavily skewed to the left for Terran, meaning a lower average MMR.
You can think of it as this: suppose each race did have proportional activity levels equal average MMR. Now add a large group of people with a low activity level for one race, say Terran. This will significantly decrease the average MMR for the race.
"the activity level of the players of each race is equally distributed " I know thats not the case. My data is from active users because i collect the gamedata ingame life.
What you describe is not a mistake in finding imbalance. It is imbalance. If I have fewer players of one race at a higher skill level than overall, this is imbalance. There can be many reasons for it (people don't play because the race is boring at a high level / people don't play because the design favours another race / there are no people of this race at this level at all), but it's all imbalance.
That's why I pointed it out in the op: I detect imbalance, not the reason for it. Imbalance in the data doesn't have to be imbalance in the game design. But this is not a problem with my method; it is a general problem whatever method you use. And not an unsolvable one.
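A small worked example of the composition effect monkybone describes in the quote: three races that are identical by construction, then an extra group of low-activity Terran accounts. All numbers are invented for illustration.

```python
from statistics import mean

# Hypothetical accounts: every regular player sits at the same MMR,
# so the three races are perfectly balanced by construction.
regulars = {"T": [1500] * 100, "P": [1500] * 100, "Z": [1500] * 100}

# Hypothetical extra group: 50 low-activity Terran accounts at a lower MMR.
casual_terran = [1100] * 50

def diffs_to_overall(groups):
    """Mean MMR of each race minus the mean MMR of all accounts."""
    everyone = [m for values in groups.values() for m in values]
    overall = mean(everyone)
    return {race: mean(values) - overall for race, values in groups.items()}

before = diffs_to_overall(regulars)
mixed = dict(regulars, T=regulars["T"] + casual_terran)
after = diffs_to_overall(mixed)
```

Before the extra group is added all three differences are zero; afterwards Terran sits below the overall average and the other two races above it, although no regular player's MMR changed. Whether to call that imbalance or a sampling effect is exactly the disagreement in this exchange.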
|
|
1) In sc2 at the moment the activity level of the players of each race is NOT equally distributed.
2) My data is only from active players because my tool collects games live, while the user is playing.
"what you describe is not an mistake to find imbalance. It is imbalance." ? I dont know how to explain this different than i did.
|
On July 12 2012 05:22 skeldark wrote: But in this case why dont you do it your way and publish it here?
I'm avoiding doing this because I don't currently know the relevant math well enough to do it, and even despite that it's a huge project.
Serious do i own you anything? because you act like it. Read your post in this thread,
Of course you don't owe us anything. However, if you want to use an extremely sophisticated statistical method to convince someone of something, you'll have to go through each step carefully to show that you're doing it correctly.
Note that even the basic idea by itself of computing an Elo rating is a very complex statistical method. Proceeding to take those ratings and make statements about them in the aggregate just compounds the complexity.
You took this all on yourself when you started this project, and it's fine if you don't care to finish it, but it's not ok to use methods like these with less rigor than they require and then turn around and say "Ok, I broke all of this, you guys fix it for me."
|
On July 12 2012 06:17 Lysenko wrote:Show nested quote +On July 12 2012 05:22 skeldark wrote: But in this case why dont you do it your way and publish it here? I'm avoiding doing this because I don't currently know the relevant math well enough to do it, and even despite that it's a huge project. I understand. Same for me. this is a site project. The time i have i use to work on the mmr analyser. lolcanoe ? how long do you need?
On July 12 2012 06:17 Lysenko wrote:Show nested quote +On July 12 2012 05:22 skeldark wrote: But in this case why dont you do it your way and publish it here? I'm avoiding doing this because I don't currently know the relevant math well enough to do it, and even despite that it's a huge project. Show nested quote +Serious do i own you anything? because you act like it. Read your post in this thread, Of course you don't owe us anything. However, if you want to use an extremely sophisticated statistical method to convince someone of something, you'll have to go through each step carefully to show that you're doing it correctly. Note that even the basic idea by itself of computing an Elo rating is a very complex statistical method. Proceeding to take those ratings and make statements about them in the aggregate just compounds the complexity. You took this all on yourself when you started this project, and it's fine if you don't care to finish it, but it's not ok to use methods like these with less rigor than they require and then turn around and say "Ok, I broke all of this, you guys fix it for me."
Thats not what i did at all and exactly what im talking about. If you want to put it this way than i would say: i show the way 
But serious, look at my data and just tell me,
do you belief based on what i posted that the mmr data is depending on the race?
And if you dont want to answer that question because i did not prove it in the way you want it to be proven Answer this one:
Is this data more accurate than all the other methods to measure unbalance we saw in the past? tldr winratio, how many of race x are in top y or tournament z...
|
|
|
basic yes but instead of having 6 overlapping skill- steps i have 3000+ Also grandmaster is the worst example because its not at all top 200. But that is an different story.
|
On July 12 2012 06:38 skeldark wrote: if you talk about my mmr-calculator: it shows your MMR. For that i have to know nothing about the Bayesian skill rating models, becaues i back-engineer towards it.
I think you focuss to much on the skill system here. Thats not what this is about...
I edited my post to remove my comments on this because I figured you did something like that and I didn't want to be unfair or over-critical.
|
On July 12 2012 06:40 Lysenko wrote:Show nested quote +On July 12 2012 06:38 skeldark wrote: if you talk about my mmr-calculator: it shows your MMR. For that i have to know nothing about the Bayesian skill rating models, becaues i back-engineer towards it.
I think you focuss to much on the skill system here. Thats not what this is about...
I edited my post to remove my comments on this because I figured you did something like that and I didn't want to be unfair or over-critical. 
Edit my too. Btw i bet the blizzard guys have a lot of fun reading our discussion 
|
Well, I am not sure you have considered the basic statistical requirements.
As far as I can tell, you want to draw conclusions about a bigger data set from a smaller sample.
Define:
What is the smaller sample?
What is the whole data set you want to draw conclusions about?
How do you select the smaller sample from the bigger data set? (Does each element in the whole data set have the same chance to get into the sample?)
What statistical distribution underlies the value you want to analyze?
What is the probability of error in your sample?
How big does your sample have to be? (With what level of statistical significance are you working?)
What is your hypothesis? What do you want to prove or disprove?
You can't just crunch some numbers (I don't want to downplay your work) and say that they state balance overall. They may tell us about the balance of your smaller sample, but are we sure they also tell us anything about overall balance?
|
Interesting results. Thank you for doing this. ^_^
|
On July 12 2012 06:49 freetgy wrote: Well i am not sure if you have your the basic statistical things were considered:
As far as i am concerned you want to draw conclusions from a smaller sample size to draw a conclusion to a bigger sample size.
Define:
What is the smaller sample size?
What is the whole data set you want to draw conclusions about?
How do you select the smaller sample size from the bigger data set? (has each element in the whole data set the same chance to get into the sample?)
What statistical distribution underlies the value you want to analyze?
- my collected data
- all active sc2 players
- No; the explanation of why and how this affects the result is in the op, as you will see when you start reading it
- I have no clue whatsoever! And before you run amok now, start by reading the op and then the last 5 pages of the discussion
How many of you guys are out there? Are you the one that wants to calculate it? or are you the one that just ask his 4 questions? data source in linked in the op.
On July 12 2012 06:52 Tsunami49 wrote: Interesting results. Thank you for doing this. ^_^ Thank you very much.
|
- How did you collect your data? does it meet the criteria of a random sample? (http://en.wikipedia.org/wiki/Random_sample)
- Define all active sc2 players ("ladder-balance-data"???)
- I read the op at least twice
If you want to use you statistical methods to prove something, be clear at what you are analysing
|
On July 12 2012 06:59 freetgy wrote:
- How did you collect your data? does it meet the criteria of a random sample? (http://en.wikipedia.org/wiki/Random_sample)
- Define all active sc2 players ("ladder-balance-data"???)
- I read the op at least twice
If you want to use you statistical methods to prove something, be clear at what you are analysing
then why did you ask questions that are answered in the op? If you did not understand what you read there should read it 3 time.
"Define all active sc2 players..." Serious some people here...
|
Also grandmaster is the worst example because it's not at all top 200. But that is a different story.
True, but it shows that above silver, Terran is underrepresented globally in every league.
I guess the next step is to integrate mmr calculations into an online database, and then into sc2gears...
|
So the OP tried proving imbalance by using ladder MMR data that can only be obtained from people running his addon. This gives a very small sample size, as can be seen from the data he provided. I was expecting more statistics to go with his claims: maybe average MMRs for the 3 races, standard deviations, outliers, maybe the total number of players he has in his data. Also, saying "The deviation shows, that the diffrence of the race-values are to big, to be explained with an random errors." is a really weak statement when not backed up with data showing how likely it is to get such a deviation, perhaps with a p-value.
|
Ok guys, last time: Is this a university paper about SC2 balance? No! Do I get money for it? No! Do I care which race is imba because I want to whine? No!
Besides all of that: before you act like you can test me, make sure you understand what I did; don't randomly assume stuff. Look at the post above me.
I publish the data I collected with my own program, which I wrote to back-calculate MMR. I found a very interesting anomaly in the race data. I programmed a quick test routine to show this anomaly and that it is very unlikely to come from a random source. If you are interested in making it a university paper, I will not stop you. I give you all my source data and you don't even have to say "thank you".
If someone has something PRODUCTIVE to add, just PM me.
On July 12 2012 07:04 Natespank wrote:
Also grandmaster is the worst example because its not at all top 200. But that is an different story.
True, but it shows that above silver Terran is under represented globally in every league. I guess the next step is to integrate mmr calculations into an online database, and then into sc2gears...
? That's not the next step, that's where I have my data from! I did this 1 month ago. http://www.teamliquid.net/forum/viewmessage.php?topic_id=334561
|
On July 12 2012 07:02 skeldark wrote:
On July 12 2012 06:59 freetgy wrote: - How did you collect your data? does it meet the criteria of a random sample? (http://en.wikipedia.org/wiki/Random_sample) - Define all active sc2 players ("ladder-balance-data"???) - I read the op at least twice
If you want to use you statistical methods to prove something, be clear at what you are analysing
then why did you ask questions that are answered in the op? If you did not understand what you read there should read it 3 time.
"Define all active sc2 players..." Serious some people here...
No, you do not understand: the statistics you used have to at least be made clear, so that we can understand what you drew conclusions about.
1) As far as I understand, people manually uploaded their replays to your tool. Which means you are only sampling from this data set and can therefore only draw conclusions about this data set, not about "ladder-balance".
2) Given this, is it ensured that your data set is unbiased? One basic assumption that could be made is that people at the extremes are overrepresented in your data set, because those are the ones most interested in balance and active on this forum, while the real average player is most likely underrepresented.
Don't get offended just because we ask questions to see if your work is solid. I appreciate your work, but it is your presentation that is lacking.
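To make the selection-bias concern in point 2 concrete, here is a minimal sketch. The MMR values and the upload weights are invented for illustration and have nothing to do with the actual data set; the only assumption is that stronger players are more likely to end up in the sample.

```python
def weighted_mean(values, weights):
    """Mean of `values` where each value counts in proportion to its weight."""
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical population: one player at each MMR level.
population_mmr = [1000, 1200, 1400, 1600, 1800, 2000]

# Hypothetical relative chance of each player ending up in the sample.
# Assumption for this sketch only: higher-rated players participate more.
upload_weight = [1, 1, 2, 3, 5, 8]

true_mean = sum(population_mmr) / len(population_mmr)
expected_sample_mean = weighted_mean(population_mmr, upload_weight)

print(f"population mean:      {true_mean:.1f}")
print(f"expected sample mean: {expected_sample_mean:.1f}")
```

With these made-up weights the expected sample mean sits well above the population mean, even though every individual MMR value is measured correctly. The bias comes purely from who gets into the sample.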
|
On July 11 2012 11:42 skeldark wrote:
On July 11 2012 11:38 VediVeci wrote: Hi, great post! I found this to be very informative and interesting, very well done.
I do have some questions about your methods though, and please forgive me if you have already addressed these.
If I have found the correct formulas you are using, you appear to be assuming an ELO rating system? I was under the impression, after listening to speech from Josh Menke at UCI, that the MMR is actually a determined using Gaussian Density Filtering? Is there a source that someone can point me to clearing this up? Regardless, your method should provide a decent approximation of MMR anyway, and ELO is certainly a valid ranking system in its own right.
if you are interested in the mmr calculation: Here you find a lot of information about how to calculate DMMr from ladderpoints. (DMMR = mmr not cleaned from his division yet) http://www.teamliquid.net/forum/viewmessage.php?topic_id=332391
On July 11 2012 11:38 VediVeci wrote: EDIT: Also, you mentioned in a comment that you don't know if players are normally distributed, but doesn't ELO assume normal distribution? Assuming a similar distribution I don't think it would affect it too significantly though
Exactly. the player/mmr base is normal by definition: have a look at my program: ![[image loading]](http://i.imgur.com/9Ag8n.jpg) If this race data is , i dont know but i think so. Someone can test it.
I looked at the link to the MMR article that you gave, but I'm still unclear that ELO is in fact the method used for MMR, since it didn't cite source material. ELO is generally considered to be a bit outdated at this point; most major ranking systems use other, albeit similar, methods. Can someone point me to a link from Blizzard here? In this speech by Josh Menke, Blizzard's head rankings guy, he seems to imply (around 45:00) that MMR is actually calculated using Gaussian Density filtering.
http://www.ics.uci.edu/~develop/Lectures/CGVW031412_MP4 360p (16x9).mp4.
Also, player skill is not necessarily normally distributed. Chess skill, for example, is more closely modelled by a logistic distribution.
That being said, I don't necessarily think these issues are enough to invalidate your work even if I'm correct (if I am right, I'll look into it more and try to check).
|
That's my problem. You come here and ask your questions but did not check the information first. And you are not the first, you are nr. 100.
"As far as i understand, people did manually upload their replays to your tool."
Wrong. They upload bnet data while playing.
"Whichs means you only use sampling out of this data"
Wrong. I mostly use the opponents they play, not the users who have my program.
"Because of this is ensured that your data set is unbiased?"
Wrong. The opponents are unbiased, but their skill range is not, because user skill = opponent skill.
"One basic assumption that could be made is that in your data set your people on the extremes are over represented"
Yes. My userbase is overrepresented at the higher skill levels. That's clearly pointed out in the OP.
Sorry to be harsh, but I really regret publishing this by now.
|
On July 12 2012 07:15 VediVeci wrote:
On July 11 2012 11:42 skeldark wrote:
On July 11 2012 11:38 VediVeci wrote: Hi, great post! I found this to be very informative and interesting, very well done.
I do have some questions about your methods though, and please forgive me if you have already addressed these.
If I have found the correct formulas you are using, you appear to be assuming an ELO rating system? I was under the impression, after listening to speech from Josh Menke at UCI, that the MMR is actually a determined using Gaussian Density Filtering? Is there a source that someone can point me to clearing this up? Regardless, your method should provide a decent approximation of MMR anyway, and ELO is certainly a valid ranking system in its own right.
if you are interested in the mmr calculation: Here you find a lot of information about how to calculate DMMr from ladderpoints. (DMMR = mmr not cleaned from his division yet) http://www.teamliquid.net/forum/viewmessage.php?topic_id=332391 EDIT: Also, you mentioned in a comment that you don't know if players are normally distributed, but doesn't ELO assume normal distribution? Assuming a similar distribution I don't think it would affect it too significantly though
Exactly. the player/mmr base is normal by definition: have a look at my program: ![[image loading]](http://i.imgur.com/9Ag8n.jpg) If this race data is , i dont know but i think so. Someone can test it.
I looked at the link to the MMR article that you gave, but I'm still unclear that ELO is in fact the method used for MMR since it didnt cite source material. ELO is generally considered to be a bit outdated at this point, most major ranking systems use other, albeit similar, methods. Can someone point me to a link from blizzard here? In this speech by Josh Menke, blizzards head rankings guy, he seems to imply (around 45:00) that MMR is actually calculated using Gaussian Density filtering.
http://www.ics.uci.edu/~develop/Lectures/CGVW031412_MP4 360p (16x9).mp4.
Also, player skill is not necessarily normally distributed. Chess skill, for example, is more closely modelled by a logistic distribution.
That being said, I don't necessarily think these issues are enough to invalidate your work even if I'm correct (if I am right, I'll look into it more and try to check).
Its more likely ELO not Bayesian inference!
I did not want to discuss this to the end, because I tried to make the point that it does not matter for the race value. I know it's outdated; I don't understand it myself. I know the Gaussian bell curve could be wrong. Don't overrate the picture though. It's only 1 picture in the program, to give people an idea of where they are on the ladder. It's not used to calculate the MMR value.
Also, whatever system it is, it is not that smooth. Blizzard had to correct the offset several times to get back to the 20/20/20, and it may be that they have given up on this by now.
|
On July 12 2012 07:24 skeldark wrote:
On July 12 2012 07:15 VediVeci wrote:
On July 11 2012 11:42 skeldark wrote:
On July 11 2012 11:38 VediVeci wrote: Hi, great post! I found this to be very informative and interesting, very well done.
I do have some questions about your methods though, and please forgive me if you have already addressed these.
If I have found the correct formulas you are using, you appear to be assuming an ELO rating system? I was under the impression, after listening to speech from Josh Menke at UCI, that the MMR is actually a determined using Gaussian Density Filtering? Is there a source that someone can point me to clearing this up? Regardless, your method should provide a decent approximation of MMR anyway, and ELO is certainly a valid ranking system in its own right.
if you are interested in the mmr calculation: Here you find a lot of information about how to calculate DMMr from ladderpoints. (DMMR = mmr not cleaned from his division yet) http://www.teamliquid.net/forum/viewmessage.php?topic_id=332391 EDIT: Also, you mentioned in a comment that you don't know if players are normally distributed, but doesn't ELO assume normal distribution? Assuming a similar distribution I don't think it would affect it too significantly though
Exactly. the player/mmr base is normal by definition: have a look at my program: ![[image loading]](http://i.imgur.com/9Ag8n.jpg) If this race data is , i dont know but i think so. Someone can test it. I looked at the link to the MMR article that you gave, but I'm still unclear that ELO is in fact the method used for MMR since it didnt cite source material. ELO is generally considered to be a bit outdated at this point, most major ranking systems use other, albeit similar, methods. Can someone point me to a link from blizzard here? In this speech by Josh Menke, blizzards head rankings guy, he seems to imply (around 45:00) that MMR is actually calculated using Gaussian Density filtering. http://www.ics.uci.edu/~develop/Lectures/CGVW031412_MP4 360p (16x9).mp4. Also, player skill is not necessarily normally distributed. Chess skill, for example, is more closely modelled by a logistic distribution. That being said, I don't necessarily think these issues are enough to invalidate your work even if I'm correct (if I am right, I'll look into it more and try to check). Its ELO not Bayesian inference! did not want to discuss this to the end because i tryed to make clear that it does not matter for the race value. I know its outdated i dont understand it myself. I know the guass could be wrong. Dont overrate the picture tho. Its only 1 picture in the prog to give the people an idea where they are on the ladder. Its not used to calculate the mmr value. Also whatever system it is the offest correction of the past show us the graph is not so smooth., Blizzard had to correct the offset several time to get back to the 20/20/20 and it can be that they gave up on this by now.
Ok, I'm still not convinced that MMR is ELO but that's just an aside really. The work you did was impressive and I don't mean to diminish it by asking, I just wanted to satisfy my own curiosity. Even if Blizzard doesn't use ELO, your conclusions are still valid in terms of relative race strength given the data, and their model would also reflect this.
And oops, I meant to remove the picture, I wasn't considering it in my arguments though anyway.
Edit: added "wasn't" to final sentence.
|
On July 12 2012 06:20 skeldark wrote:
I understand. Same for me. this is a site project. The time i have i use to work on the mmr analyser. lolcanoe ? how long do you need?
Whether this is a community forum or a university-level discussion doesn't change the validity or importance of the critiques here. Where your argument is posted unfortunately has no bearing on the scrutiny it deserves.
And why haven't I been helpful? Because I wanted to see if the self-admitted arrogance here was justified or not. I wanted to confirm my suspicion that you actually do not have a strong grasp of statistics and that I wasn't just misinterpreting your calculations. Now that my suspicions are confirmed, let me clearly lay out the problems that I see.
1. Run an Anderson–Darling test on the data. This can be done with 3 clicks in Minitab, which will automatically give you a p-value for whether or not the data is normal. If you cannot run this test, or it tells you that normality is problematic, note in the OP that your test assumes normality but that normality was not verified.
2. The specific question here is whether or not one race has a significantly higher MMR average than another. What your current test is actually testing for (although somewhat incorrectly) is whether or not the sample average varies significantly from the population mean. If executed correctly, this test also has application to understanding balance, but it doesn't answer the specific question. The specific question should be tested with a very simple 2-sample t-test (google it), run 3 times: T vs Z, P vs Z, and T vs P. This is a much better fit for the question and lets you avoid the further confusion of taking another average.
3. In these calculations, independence between populations is a fair concern and should likewise be noted.
4. Finally, be very clear about your conclusion. Your data allows you to conclude that the average skill rating of a certain race is potentially different from the skill rating of another. It is yet another jump to equate this difference with a balance problem, because of a potential correlation-versus-causation problem (i.e.: does Terran make players bad, or do bad players pick Terran?). Unfortunately, there's no way to resolve this concern with the data that you possess, so you'll have to make note of this caveat as well.
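For reference, the pairwise comparison from point 2 could be sketched roughly like this. It is not the OP's test routine. It uses Welch's variant of the two-sample t-test (which does not assume equal variances) and only the Python standard library; the p-value comes from a normal approximation, which is an assumption that is reasonable only for large samples such as thousands of accounts per race. The MMR lists are invented placeholders.

```python
import math
from statistics import NormalDist, mean, variance


def welch_t_test(sample_a, sample_b):
    """Welch's two-sample t-test for a difference in means.

    Returns (t, df, p). The two-sided p-value uses a normal approximation
    to the t distribution, so it is only trustworthy for large samples.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error contributed by each sample.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    p = 2.0 * (1.0 - NormalDist().cdf(abs(t)))
    return t, df, p


# Made-up MMR values, for illustration only (far too few for a real test).
mmr_by_race = {
    "T": [1480, 1510, 1495, 1530, 1470, 1505],
    "Z": [1540, 1525, 1560, 1515, 1550, 1535],
    "P": [1500, 1520, 1490, 1545, 1510, 1525],
}

# One test per race pair, as suggested in point 2.
for a, b in [("T", "Z"), ("P", "Z"), ("T", "P")]:
    t, df, p = welch_t_test(mmr_by_race[a], mmr_by_race[b])
    print(f"{a} vs {b}: t = {t:.2f}, df = {df:.1f}, p ~ {p:.4f}")
```

Running three tests on the same data also raises the chance of a false positive, so a stricter threshold per test (for example a Bonferroni-style 0.05 / 3) would be a sensible precaution. None of this addresses points 1, 3, or 4.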
|
You know, you would think that given that he basically published his data for anyone else to verify, all of the statistical geniuses who keep hammering on him would go do what real mathematicians would do and analyse the data yourself. It seems like a lot of the people who are doing so are desperate to prove something - I'm not sure what, but they remind me of people desperately coming up with ways to disprove supersymmetry in the wake of a Higgs boson discovery.
I suspect they don't like your conclusions, skeldark, though they amount to little more than a few wins in either direction.
|
On July 12 2012 07:51 Evangelist wrote: You know, you would think that given that he basically published his data for anyone else to verify, all of the statistical geniuses who keep hammering on him would go do what real mathematicians would do and analyse the data yourself. It seems like a lot of the people who are doing so are desperate to prove something - I'm not sure what, but they remind me of people desperately coming up with ways to disprove supersymmetry in the wake of a Higgs boson discovery.
I suspect they don't like your conclusions, skeldark, though they amount to little more than a few wins in either direction.
Your cynicism is understandable but misplaced.
The complete data package (with all the updates) is not as transparently presented as you'd think, and it's a large concern to those of us who don't want to attempt conclusions from potentially misinformed, incomplete, or misunderstood data.
|
On July 12 2012 07:55 lolcanoe wrote:
On July 12 2012 07:51 Evangelist wrote: You know, you would think that given that he basically published his data for anyone else to verify, all of the statistical geniuses who keep hammering on him would go do what real mathematicians would do and analyse the data yourself. It seems like a lot of the people who are doing so are desperate to prove something - I'm not sure what, but they remind me of people desperately coming up with ways to disprove supersymmetry in the wake of a Higgs boson discovery.
I suspect they don't like your conclusions, skeldark, though they amount to little more than a few wins in either direction.
Your cynicism is understandable but misplaced. The complete data package (with all the updates) is not as transparently presented as you'd think, and it's a large concern to those of us who don't want to attempt conclusions from potentially misinformed, incomplete, or misunderstood data.
And what exactly are you attempting to prove? That there isn't a sum total of 5-6 wins difference between terran and zerg and that random players don't have a significantly decreased chance to win games at reasonably high levels of play? Anyone and their donkey can conclude that terran winrates are likely to be lower because TvP can only be reliably won before 20 minutes and both early and late game TvZ are, as of last patch, completely fucked! The fact that this effect has been dampened on the ladder suggests balance is more robust than we all thought!
Do you have a better method of representing this data than has been presented? Do you actually have the means to accurately calculate Blizzard MMR? Do you also have the ability to correct this data to a form more likely to be accurate? Now I am not going to pretend to be a mathematician as I'm a physicist and we do retard maths, but there are plenty of people who are. Maybe you do have valid critiques, but valid critiques should come with viable solutions rather than simple accusations of incompetence and assumptions.
|
On July 12 2012 07:51 Evangelist wrote: You know, you would think that given that he basically published his data for anyone else to verify, all of the statistical geniuses who keep hammering on him would go do what real mathematicians would do and analyse the data yourself. It seems like a lot of the people who are doing so are desperate to prove something - I'm not sure what, but they remind me of people desperately coming up with ways to disprove supersymmetry in the wake of a Higgs boson discovery.
I suspect they don't like your conclusions, skeldark, though they amount to little more than a few wins in either direction.
Spoken like a true physicist
|
On July 12 2012 07:48 lolcanoe wrote:
On July 12 2012 06:20 skeldark wrote:
I understand. Same for me. this is a site project. The time i have i use to work on the mmr analyser. lolcanoe ? how long do you need?
Whether this is a community forum or a university-level discussion doesn't change the validity or importance of the critiques here. Where your argument is posted unfortunately has no bearing on the scrutiny it deserves. And why I haven't been helpful? Because I wanted to see if the self-admitted arrogance here was justified or not. I wanted to confirm my suspicions that you actually do not have a strong grasp of statistics and I wasn't just misinterpreting your calculations. Now that my suspicions are confirmed, let me clearly layout problems that I see. 1. Run an Anderson–Darling test on the data. This can be done with 3 clicks through Minitab which will automatically give you a P-value for whether or not the data is normal. If you cannot run this test or it tells you that your normality is problematic - note in the OP that your test assumes normality but was not verified to be normal. 2. The specific question here is whether or not one race has a signficantly higher MMR average than another. What your current test is actually testing for (although somewhat incorrectly), is whether or not the sample average varies significantly from the population mean. If executed correctly, this test also has application to understanding balance, but it doesn't answer the specific question. The specific question should be tested for under a very simple 2 sample t test (google it) and be tested 3 times - tz, pz, and zt. This is a much better test to fit the question and allows you to ignore the further confusion of taking another average. 3. In these calculations, independence between populations is a fair concern - and should likewise be noted. 4. Finally, be very clear about your conclusion. Your data allows you to conclude that the average skill rating of a certain race is potentially different than the skill rating of another. 
It is yet another jump to equate this difference to a problem in a balance, due to a potential cause-correlation problem (ie: Does terran make players bad, or do bad players pick terran?). Unfortunately, there's no way to resolve this concern with the data that you possess, so you'll have to make note of this caveat as well.
His methodology is flawed, and obviously he could never publish this or anything, but it seems to be a significant improvement over most posts discussing balance. I didn't look through the calculations process very much, but it seems likely that his conclusions are meaningful (if not rigorous). And since blizzard is not going to use these results to inform their balance discussions, "likely meaningful" isn't an unreasonable standard.
|
On July 12 2012 08:04 Evangelist wrote:
On July 12 2012 07:55 lolcanoe wrote:
On July 12 2012 07:51 Evangelist wrote: You know, you would think that given that he basically published his data for anyone else to verify, all of the statistical geniuses who keep hammering on him would go do what real mathematicians would do and analyse the data yourself. It seems like a lot of the people who are doing so are desperate to prove something - I'm not sure what, but they remind me of people desperately coming up with ways to disprove supersymmetry in the wake of a Higgs boson discovery.
I suspect they don't like your conclusions, skeldark, though they amount to little more than a few wins in either direction. Your cynicism is understandable but misplaced. The complete data package (with all the updates) is not as transparently presented as you'd think, and it's a large concern to those of us who don't want to attempt conclusions from potentially misinformed, incomplete, or misunderstood data. And what exactly are you attempting to prove? That there isn't a sum total of 5-6 wins difference between terran and zerg and that random players don't have a significantly decreased chance to win games at reasonably high levels of play? Anyone and their donkey can conclude that terran winrates are likely to be lower because TvP can only be reliably won before 20 minutes and both early and late game TvZ are, as of last patch, completely fucked! The fact that this effect has been dampened on the ladder suggests balance is more robust than we all thought! Do you have a better method of representing this data than has been presented? Do you actually have the means to accurately calculate Blizzard MMR? Do you also have the ability to correct this data to a form more likely to be accurate? Now I am not going to pretend to be a mathematician as I'm a physicist and we do retard maths, but there are plenty of people who are. Maybe you do have valid critiques, but valid critiques should come with viable solutions rather than simple accusations of incompetence and assumptions.
This is a bit ridiculous. If you say "I looked out across a large field and didn't see any curvature, so the earth is flat," I don't have to tell you what shape the world is to tell you that your methodology is flawed. lolcanoe isn't saying that the conclusions are necessarily untrue, just that they aren't valid.
|
And what exactly are you attempting to prove?
A statistician has no agenda outside assuring the proper use of statistics, or in the next best case, qualifying the statistics to incorporate important assumptions. Respectfully, the OP himself has noted that there should be no pre-test bias.
Anyone and their donkey can conclude that terran winrates are likely to be lower because TvP can only be reliably won before 20 minutes and both early and late game TvZ are, as of last patch, completely fucked! The fact that this effect has been dampened on the ladder suggests balance is more robust than we all thought!
You're right. "Can" being the key word. They "can" do what they want. And yes, they are entitled to their beliefs and dogmas, just as you are, but until data has been established, your verbal analysis is to me little better than a 4-year-old's conclusion that Terran is completely fucked because it starts with a "T".
Do you have a better method of representing this data than has been presented?
Yes. See previous post.
Do you actually have the means to accurately calculate Blizzard MMR? Do you also have the ability to correct this data to a form more likely to be accurate?
I've never found issue with his MMR calculations but neither have I attempted an analysis.
Maybe you do have valid critiques, but valid critiques should come with viable solutions rather than simple accusations of incompetence and assumptions.
You aren't a fundamentalist Christian by chance, are you? Because I've seen this before. "God is real! Fuck science because you guys don't have a valid explanation for the creation of universe either!".
No, I don't need to present an alternative solution even though I have. A viable critique does not need to come with a viable solution - ie, I could point out that you're too stupid to be in this debate, but I'm not qualified to provide a solution because I wasn't an abortion doctor prior to your birth.
|
On July 12 2012 07:24 skeldark wrote: Its more likely ELO not Bayesian inference!
If you mean Blizzard's system, it's not Elo. You really need to watch that video we've been linking.
|
On July 12 2012 08:04 Evangelist wrote: valid critiques should come with viable solutions rather than simple accusations of incompetence and assumptions.
Skeldark's done a lot of work that's potentially interesting, but it has some problems. Whether someone goes back and correctly analyzes his data or not, I think it's in everyone's interest to avoid a situation where in six months people are running around this forum saying "Six months ago, Skeldark proved <whatever>" when in fact the effort wasn't rigorous enough to prove anything.
People are too ready to accept the bottom-line result of someone with charts and numbers that they don't understand or have time to understand, and I think making a case that there are well-founded, legitimate questions about how this was done will ensure that it spurs further (and hopefully more rigorous) analysis instead of a lot of "common knowledge" with a shaky foundation.
It does not make my or lolcanoe's concerns less valid that we don't have the time or inclination to conduct a complete study of the data skeldark has collected.
There may be something there, but we won't really know unless someone looks at it all much more rigorously than the OP has.
|
I just realized that Skeldark was using average MMR and not the latest observed MMR of players.
|
I'm a terran in mid masters and I just played some games as zerg, just for shits and giggles. I was ripping those terrans so hard. I felt kind of bad A-moving my entire army at him, especially when he was trying his heart out dropping everywhere. It just feels like I don't have to work very hard for those Ws with zerg.
User was warned for this post
|
On July 12 2012 08:57 NoobCrunch wrote: I just realized that Skeldark was using average MMR and not the latest observed MMR of players.
He did post numbers correcting this after I called him on it -- it's buried in one of the posts deep in the thread. This still doesn't answer my larger concerns but that was a good change.
|
I did a study of my own. I compiled some incidences of HIV/ SIV amongst various primates that supports my null - climbing trees or walking on 4 limbs is why it is not observed in non-human primates. Thanks a lot for all your help. I was not biased in my data collection and I am happy to discover the cure for AIDS.
|
On July 12 2012 08:41 Lysenko wrote:
On July 12 2012 07:24 skeldark wrote: Its more likely ELO not Bayesian inference!
If you mean Blizzard's system, it's not Elo. You really need to watch that video we've been linking.
In his defence, it's not Bayesian inference either; it's Gaussian Density filtering (I don't believe that the latter is a subset of the former, though I could be wrong; Gaussian Density filtering is over my head right now). Either way, it seems that Blizzard also keeps track using an ELO ranking system in parallel? Going back and analysing a player's performance using ELO shouldn't be a huge issue here though, if you can start with the right data, which I'm not convinced he did. Furthermore, looking through the calculations he used (from another thread, links in the OP), they appear to be only a crude approximation of ELO, i.e. not using the correct update formulas. I think that would still be sufficient, though, to draw some non-rigorous conclusions from the data.
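For anyone who wants to compare against the textbook version, the standard Elo update looks like the sketch below. It does not claim to be what Blizzard uses or what the OP's program computes. The K-factor of 32 is an assumption, picked only because it reproduces the +16/-16 per game between equally rated players quoted in the OP.

```python
def elo_expected(rating_a, rating_b):
    """Expected score of player A against player B (between 0 and 1)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a, rating_b, score_a, k=32.0):
    """Return both new ratings after one game.

    score_a is 1.0 if A won and 0.0 if A lost.
    k is the K-factor; 32 is an assumed value, not a known Blizzard constant.
    """
    delta = k * (score_a - elo_expected(rating_a, rating_b))
    # Whatever A gains, B loses, so the update is zero-sum.
    return rating_a + delta, rating_b - delta


# Equal ratings: the winner gains half of K, the loser gives up the same.
print(elo_update(1500.0, 1500.0, 1.0))

# A favourite beating an underdog gains less than half of K.
print(elo_update(1700.0, 1500.0, 1.0))
```

The second call shows the part a crude approximation tends to miss: the size of the update depends on the rating gap, not just on who won.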
|
On July 12 2012 09:04 Lysenko wrote:
On July 12 2012 08:57 NoobCrunch wrote: I just realized that Skeldark was using average MMR and not the latest observed MMR of players.
He did post numbers correcting this after I called him on it -- it's buried in one of the posts deep in the thread. This still doesn't answer my larger concerns but that was a good change.
I honestly don't think that using average or latest mmr makes any difference.
My only issue is that the original post doesn't really show what he did which was essentially a simple t-test for comparing means (mean mmr) from two different samples (zerg and terran). The p-values for these tests were low meaning that zerg players had statistically significantly higher mmr than terran players. I'm still thinking about the independence stuff in between ladder games and I even busted out some of my old statistics textbooks.
The question I have is: if you observe a zerg with 600 mmr, does that change the probability of finding a terran with low mmr? Since MMR is a zero-sum game, a zerg with 600 mmr means that someone else (or some group) must have taken that mmr away, and you would be more likely to find a terran with higher mmr. Since observing a zerg with lower mmr changes the probability of finding a terran or protoss with higher mmr, does that mean that independence is violated? If so, is it ok to do the test?
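A toy simulation makes the zero-sum point visible. Everything here is an assumption made for the sketch: a closed pool of players, coin-flip results, and a fixed transfer of 16 points per game instead of a real rating formula.

```python
import random


def simulate_pool(n_players=200, n_games=20000, k=16.0, seed=1):
    """Play random games in a closed pool with a fixed point transfer.

    Every player starts at 1500. Each game moves k points from the loser
    to the winner, so the pool total never changes.
    """
    rng = random.Random(seed)
    ratings = [1500.0] * n_players
    for _ in range(n_games):
        a, b = rng.sample(range(n_players), 2)
        # Coin-flip outcome; real results would depend on skill.
        winner, loser = (a, b) if rng.random() < 0.5 else (b, a)
        ratings[winner] += k
        ratings[loser] -= k
    return ratings


ratings = simulate_pool()
print("pool total:", sum(ratings))
print("lowest / highest:", min(ratings), max(ratings))
```

Because the total is fixed, the ratings cannot be fully independent: knowing all but one of them pins down the last. For exchangeable players under a fixed total, the pairwise correlation works out to -1/(N-1), which is tiny for a pool the size of a ladder. Whether the real ladder is close enough to zero-sum for this to apply is a separate question that the sketch does not answer.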
I've done a lot of projects and stuff in the past where major assumptions were violated and it was ok to deviate from them.
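For anyone who wants to see what the comparison of means described above looks like in practice, here is a minimal sketch of a two-sample (Welch) t statistic. The MMR lists are synthetic numbers made up for illustration, not the thread's data, and `welch_t` is a hypothetical helper name. Note that the test assumes independent observations, which is exactly the assumption being questioned here.

```python
import math
import random
import statistics

def welch_t(sample_a, sample_b):
    """Welch's t statistic for mean(sample_b) - mean(sample_a)."""
    mean_a = statistics.fmean(sample_a)
    mean_b = statistics.fmean(sample_b)
    # Sample variances (n - 1 in the denominator).
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    standard_error = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (mean_b - mean_a) / standard_error

# Synthetic stand-ins for per-player MMR values (assumed, not real data).
rng = random.Random(42)
terran = [rng.gauss(1500, 300) for _ in range(2000)]
zerg = [rng.gauss(1600, 300) for _ in range(2000)]

t = welch_t(terran, zerg)
# For samples this large the t distribution is close to a standard normal,
# so a normal approximation gives a rough two-sided p-value.
p = 2 * (1 - statistics.NormalDist().cdf(abs(t)))
print(f"t = {t:.2f}, approximate p = {p:.3g}")
```

A large |t| only says the two means differ by more than independent sampling noise would explain; it says nothing about why they differ.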
|
On July 12 2012 09:24 NoobCrunch wrote: I honestly don't think that using average or latest mmr makes any difference.
It's never correct to average one player's MMR over time, because the MMR is already cumulative of all previous games. The simple case is that a new player's MMR ramps smoothly from 0 up to their actual skill level -- the average will be half what it should be.
More generally, using average and standard deviation to characterize measurements that aren't independent is not correct, and each MMR data point for a single player depends very strongly on the previous one.
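The "simple case" above can be checked with a few lines of arithmetic. This is a sketch under an assumed, idealized history: a rating that climbs in a straight line from 0 to a made-up skill level of 1600 over 101 games. `ramp_history` is a hypothetical helper, not part of any tool discussed in this thread.

```python
def ramp_history(true_skill, games):
    """Hypothetical MMR history rising linearly from 0 to true_skill."""
    return [true_skill * i / (games - 1) for i in range(games)]

history = ramp_history(true_skill=1600.0, games=101)

average_mmr = sum(history) / len(history)  # what time-averaging reports
latest_mmr = history[-1]                   # what the ladder currently believes

print(f"average over time: {average_mmr:.1f}")
print(f"latest value:      {latest_mmr:.1f}")
```

The time average lands at half of the final value, which is why the latest observed MMR is the better summary for a player who is still converging.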
The question I have is if you observe a zerg with 600 mmr does that change the probability of finding a terran with low mmr.
Yes, that's probably why Elo is better fit by a logistic distribution than a normal distribution, if I had to guess.
As far as the impact of violating your own assumptions -- you can get away with it only in one of two cases, either where you do the work correctly and demonstrate that the results didn't change much, or you make a numerical estimate of the magnitude of the error that your violating the assumption will introduce, and show that it's small and stable as you add to your data set.
If you don't do either of those, however roughly, it's just not possible to know what the impact is. In any case, the result for racial differences that the OP claims is quite small in an absolute sense, and could easily be the result of a very small systematic error.
|
On July 12 2012 09:40 Lysenko wrote:
On July 12 2012 09:24 NoobCrunch wrote: I honestly don't think that using average or latest mmr makes any difference.
It's never correct to average one player's MMR over time, because the MMR is already cumulative of all previous games. The simple case is that a new player's MMR ramps smoothly from 0 up to their actual skill level -- the average will be half what it should be. More generally, using average and standard deviation to characterize measurements that aren't independent is not correct, and each MMR data point for a single player depends very strongly on the previous one.
I know but we're not dealing with time series data.
|
On July 12 2012 09:43 NoobCrunch wrote: I know but we're not dealing with time series data.
He was taking a single player's MMR and averaging multiple values of it from different times, yes.
|
I find Skeldark's data to be accurate and his methods/assumptions are reasonable. It is not surprising that Terran is underpowered, as this is also consistent with the latest tournament results for pro-Terran players.
|
wut, what happened to this thread?  Is this TL peer review? 
My main concern is still your formulation of the conclusions:
1)Terran is significant underpowered compared to the total data pool We can not tell if this unbalance comes from design or other reason. still will be a trigger to say that the queen range increase broke the game.
After discussing it with you, I know what you mean. But I certainly didn't at the first read, and neither did most of the other readers, it seems. And that comes from you using the words "unbalance" and "underpowered" to mean lower mean MMR, while everyone else uses those words to refer to the design of the game. You tried to clarify in the second line, but that only makes it confusing, as it seems to contradict the first line under the standard use of the word "underpowered". You agreed with me that the lower MMR did not necessarily mean that the stats of the units were flawed, but that is still not how most will read your OP.
If you instead would write something like:
1) Terran has on average a lower MMR than the other races. This can be due to a large number of reasons, for example, but not limited to:
- Terran is by design a weaker race, and harder to win games with.
- Lower level players tend to play Terran more than higher level players.
- Players tend to start with Terran, and then switch race as they get better.
- Dustin Browder manually hacks into the ladder and decreases the MMR of Terran players.
It is impossible to tell from this analysis what the reason is.
I think you would avoid a lot of the trouble you've ended up in in this thread.
|
On July 12 2012 10:53 Cascade wrote:
- Terran is by design a weaker race, and harder to win games with.
- Lower level players tend to play Terran more than higher level players.
- Players tend to start with Terran, and then switch race as they get better.
- Dustin Browder manually hacks into the ladder and decreases the MMR of Terran players.
I support this list. Of course, I think that we should also add the caveat that even though it can't be established from this data, option 1 is significantly more likely than, say, option 4, and option 2 is probably the least likely given what we've seen in the days of GomTvT. In a sense, all of them are possible options, but Occam's Razor would do well to help us find the correct reason once we decide on the proper way to frame the data. Option 3, for instance, can pretty much be considered false until any evidence supporting it is brought forward, because it introduces an additional condition. Options 1 and 2 are the simplest because they are simple reformulations of the conclusion "Terran has the [weakest] MMR" vis-à-vis players behaving the way players might be expected to behave.
In order to show 3, for example, and for it to be significant, you'd need to show that the rate of race switching is higher for players who start Terran than for players who start Zerg, which seems rather difficult to do. Further, it doesn't seem like there's any reason a priori to believe that something about the Terran race makes people more likely to switch away from it, while there are very intuitive reasons to think that more players would choose Terran from the start (i.e. human beings being biased toward human beings etc).
Not really making an argument here, just saying that simply because these possibilities are equal from the point of view of the analysis itself, many of them can be readily dismissed.
|
On July 12 2012 11:03 Shiori wrote:
On July 12 2012 10:53 Cascade wrote:
- Terran is by design a weaker race, and harder to win games with.
- Lower level players tend to play Terran more than higher level players.
- Players tend to start with Terran, and then switch race as they get better.
- Dustin Browder manually hacks into the ladder and decreases the MMR of Terran players.
I support this list. Of course, I think that we should also add the caveat that even though it can't be established from this data, option 1 is significantly more likely than, say, option 4, and option 2 is probably the least likely given what we've seen in the days of GomTvT. In a sense, all of them are possible options, but Occam's Razor would do well to help us find the correct reason once we decide on the proper way to frame the data. Option 3, for instance, can pretty much be considered false until any evidence supporting it is brought forward, because it introduces an additional condition. Options 1 and 2 are the simplest because they are simple reformulations of the conclusion "Terran has the [weakest] MMR" vis-à-vis players behaving the way players might be expected to behave.
In order to show 3, for example, and for it to be significant, you'd need to show that the rate of race switching is higher for players who start Terran than for players who start Zerg, which seems rather difficult to do. Further, it doesn't seem like there's any reason a priori to believe that something about the Terran race makes people more likely to switch away from it, while there are very intuitive reasons to think that more players would choose Terran from the start (i.e. human beings being biased toward human beings etc).
Not really making an argument here, just saying that simply because these possibilities are equal from the point of view of the analysis itself, many of them can be readily dismissed.
Yeah, you can make arguments for and against the different reasons. And no matter how straightforward and reasonable the arguments may seem to you, you risk a 10 page heated discussion about it. Possibly involving Higgs bosons and Christian religion.
Just saying that the OP would probably be better off keeping his head out of that beehive. We get that discussion enough anyway, and he has enough material to present a much cleaner OP without going there.
edit: see, with the current formulation, we get posts like the one below.
|
wish this showed the number of people who got their rank from skill and who got their rank due to imbalances in the game. Regardless, this data is great, and I don't feel so bad for playing Terran now that I know I am playing a race that is at a disadvantage. Then again, if you ask any Protoss player they will say Terran is OP lol
Keep up the good work Skeldark!!
|
Searched through the new posts:
- Not a single valid argument why the numbers are wrong.
- Not a single calculation over my source data. (You can ignore my results; I published the source data, so you can analyse it yourself.)
MMR: You tell me to look at information that you only know because not_that or I discovered it. You say the MMR I analyse is not 100% correct, without understanding that it is race independent. Besides, none of you know how I analyse MMR, how I correct deviation, or that the method has been working flawlessly for 100.000 games by now. Any possible mistake I make in the MMR calculation doesn't affect the result of this calculation, because my MMR calculation is race independent. A simple fact that I point out in the OP and most here ignore.
Definition of imbalance: I am not responsible for people who misinterpret my data. Many of you "statistic guys" do so too! You complain that my definition of imbalance is not the one that people on TL use.
I thought this was clear to people with a statistics background, but to point it out clearly: I detect an imbalance in MMR values. Not the reason, because that is not mathematically traceable. Not for me, not for Blizzard, for no one.
|
On July 12 2012 09:14 VediVeci wrote: In his defence it's not Bayesian inference either, its Gaussian Density filtering (I don't believe that the latter is a subset of the former though I could be wrong, Gaussian Density filtering is over my head right now).
The language is a little confusing, because Bayes defined a particular optimization technique for making probabilistic guesses about a situation's outcome, but a much wider class of techniques are referred to as "Bayesian" because they are mathematically or philosophically similar. In fact, I'd call Elo "Bayesian" by the latter definition, because it's trying to converge on the prediction that's most likely to be accurate, but it's just not using an explicit error function to do so.
Anyway, from the guy's talk at UCI, I got the impression that Gaussian density filtering was just a particular technique applied in the context of a Bayesian algorithm, and not actually the name for the entire matching technique. However, I don't really know about that.
|
On July 12 2012 19:18 skeldark wrote: Searched through the new posts:
- Not a single valid argument why the numbers are wrong.
- Not a single calculation over my source data. (You can ignore my results; I published the source data, so you can analyse it yourself.)
MMR: You tell me to look at information that you only know because not_that or I discovered it. You say the MMR I analyse is not 100% correct, without understanding that it is race independent. Besides, none of you know how I analyse MMR, how I correct deviation, or that the method has been working flawlessly for 100.000 games by now. Any possible mistake I make in the MMR calculation doesn't affect the result of this calculation, because my MMR calculation is race independent. A simple fact that I point out in the OP and most here ignore.
Definition of imbalance: I am not responsible for people who misinterpret my data. Many of you "statistic guys" do so too! You complain that my definition of imbalance is not the one that people on TL use.
I thought this was clear to people with a statistics background, but to point it out clearly: I detect an imbalance in MMR values. Not the reason, because that is not mathematically traceable. Not for me, not for Blizzard, for no one.
I didn't read much of this thread because it seems to be mostly arguments about definitions or your methods and such... but unless the datafile you posted is wrong, you have a sample of 5592 players which makes any analysis useless.
Get 10 times as much data and there might be a statistical value in it.
|
On July 12 2012 19:18 skeldark wrote: Searched through the new posts: - Not a single valid argument why the numbers are wrong.
The point isn't your numbers, it's your technique. You can accidentally apply the wrong technique and get the right numbers, but the point is that if you do that, we'll never know. If your technique isn't correct, the whole thing is untrustworthy.
- Not a single calculation over my source data. (You can ignore my results; I published the source data, so you can analyse it yourself.)
You're the one who did the work, you really need to correct these problems yourself or have your work disregarded by people who understand the statistics involved.
Edit: I can't even make a half-assed attempt to look at your data until the weekend, sorry. Even then, I can't promise being able to put in the time to do a proper analysis.
|
On July 12 2012 19:33 Morfildur wrote: I didn't read much of this thread because it seems to be mostly arguments about definitions or your methods and such... but unless the datafile you posted is wrong, you have a sample of 5592 players which makes any analysis useless.
That's not really a fair criticism. 5592 players is a huge sample, even by the standards of much more complicated studies in fields like medicine. Only Blizzard will ever do better than that.
|
On July 12 2012 19:36 Lysenko wrote:
On July 12 2012 19:33 Morfildur wrote: I didn't read much of this thread because it seems to be mostly arguments about definitions or your methods and such... but unless the datafile you posted is wrong, you have a sample of 5592 players which makes any analysis useless.
That's not really a fair criticism. 5592 players is a huge sample, even by the standards of much more complicated studies in fields like medicine. Only Blizzard will ever do better than that.
It's less than 2% of the total player base, it would be about a fourth of the masters players. That's not a huge sample...
|
On July 12 2012 19:34 Lysenko wrote: You're the one who did the work, you really need to correct these problems yourself or have your work disregarded by people who understand the statistics involved.
No, I'm not.
There are no problems. You just misunderstand what I did. I show an imbalance in MMR values per race that cannot be explained by random mistakes. That's all I did. If you want something else, do it yourself. Arguments about the skill system have NOTHING to do with my calculation. If you don't understand this fact, then you don't understand what I did.
OFFTOPIC: the video is old. Whatever method they use, I use too, because I back-engineer their MMR value. I don't calculate it on my own.
|
On July 12 2012 19:38 Morfildur wrote:
On July 12 2012 19:36 Lysenko wrote:
On July 12 2012 19:33 Morfildur wrote: I didn't read much of this thread because it seems to be mostly arguments about definitions or your methods and such... but unless the datafile you posted is wrong, you have a sample of 5592 players which makes any analysis useless.
That's not really a fair criticism. 5592 players is a huge sample, even by the standards of much more complicated studies in fields like medicine. Only Blizzard will ever do better than that.
It's less than 2% of the total player base, it would be about a fourth of the masters players. That's not a huge sample...
My data is biased towards Master (near 40%, I think), but it's only a question of time.
I get 5k games per day with 5k potential new accounts.
|
On July 12 2012 19:38 Morfildur wrote: It's less than 2% of the total player base, it would be about a fourth of the masters players. That's not a huge sample...
It's far more than enough to draw inferences about the population as a whole. Generally the uncertainty of uncorrelated aggregate data from a given sample size improves with 1/sqrt(n), so his sample can be analyzed to maybe 99% accuracy and the entire Starcraft population to about 99.8% accuracy. Not a big difference.
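The 1/sqrt(n) rule of thumb above can be worked through directly. This is a rough sketch, assuming the 5592-player sample mentioned in this thread and taking "less than 2% of the total player base" to mean a population of roughly 5592 / 0.02 players (an assumption used only to reproduce the ballpark figures).

```python
import math

def relative_uncertainty(n):
    """Rule-of-thumb relative uncertainty for n uncorrelated data points."""
    return 1.0 / math.sqrt(n)

sample_size = 5592
# Assumed population size: the sample is taken to be about 2% of all players.
population_size = round(sample_size / 0.02)

for label, n in [("sample", sample_size), ("population", population_size)]:
    u = relative_uncertainty(n)
    print(f"{label:>10}: n = {n:>7}, uncertainty = {u:.2%}, accuracy = {1 - u:.1%}")
```

Multiplying the sample size by fifty only moves the estimate from roughly 98.7% to roughly 99.8%, which is the "not a big difference" point. The rule only holds for uncorrelated data, so it says nothing about systematic errors.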
|
On July 12 2012 19:38 skeldark wrote: I show an imbalance in MMR values per race that cannot be explained by random mistakes.
The mistakes we're pointing out in your analysis are systematic, not random. If they were random, they wouldn't be problematic. Systematic errors can often produce a result where you THINK you have pinned things down to a certain accuracy but you're actually off by a much larger amount. That's why it's such a big deal.
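The difference between the two kinds of error can be illustrated with a small simulation. This is a sketch on synthetic numbers only: both races are drawn from the same made-up distribution, the noise level of 25 and the bias of 20 are arbitrary assumptions, and `mean_gap` is a hypothetical helper.

```python
import random
import statistics

def mean_gap(group_a, group_b):
    """Difference between the two group means."""
    return statistics.fmean(group_a) - statistics.fmean(group_b)

rng = random.Random(0)
# Two synthetic groups drawn from the same distribution (no real effect).
terran = [rng.gauss(1500, 300) for _ in range(20000)]
zerg = [rng.gauss(1500, 300) for _ in range(20000)]
baseline = mean_gap(terran, zerg)

# Random error: independent noise on every single measurement.
noisy_terran = [m + rng.gauss(0, 25) for m in terran]
noisy_zerg = [m + rng.gauss(0, 25) for m in zerg]
random_shift = mean_gap(noisy_terran, noisy_zerg) - baseline

# Systematic error: every value in one group is off in the same direction.
biased_terran = [m - 20 for m in terran]
systematic_shift = mean_gap(biased_terran, zerg) - baseline

print(f"shift from random error:     {random_shift:+.2f}")
print(f"shift from systematic error: {systematic_shift:+.2f}")
```

Per-measurement noise mostly cancels in the average, while a bias that depends on the group passes straight through to the reported gap, and collecting more data does not shrink it.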
Arguments about the skill-system have NOTHING to do with my calculation.
If the skill system you're using (which I gather is Elo) produces a different distribution of results than you're assuming in your calculations (and the difference between a logistic distribution and a normal distribution is small but real), then you could EASILY be off by 20 or 30 Elo points in your estimates of the uncertainty of your average.
the video is old. Whatever method they use i use too because i back-engineer their mmr value. I dont calculate it on my own.
That makes no sense. Scores between these different skill rating systems don't translate from one to the other. Also, we're all pretty sure the information in that video has not changed in a long time.
|
On July 12 2012 20:07 Lysenko wrote:
On July 12 2012 19:38 skeldark wrote: I show an imbalance in MMR values per race that cannot be explained by random mistakes.
The mistakes we're pointing out in your analysis are systematic, not random. If they were random, they wouldn't be problematic. Systematic errors can often produce a result where you THINK you have pinned things down to a certain accuracy but you're actually off by a much larger amount. That's why it's such a big deal.
If the skill system you're using (which I gather is Elo) produces a different distribution of results than you're assuming in your calculations (and the difference between a logistic distribution and a normal distribution is small but real), then you could EASILY be off by 20 or 30 Elo points in your estimates of the uncertainty of your average.
the video is old. Whatever method they use i use too because i back-engineer their mmr value. I dont calculate it on my own.
That makes no sense. Scores between these different skill rating systems don't translate from one to the other. Also, we're all pretty sure the information in that video has not changed in a long time.
I think I finally understand your problem. You think I run my own skill system!
I don't use any skill system! I didn't write a skill system and blindly assume it is the same one Blizzard uses! I'm NOT calculating the skill, I back-engineer it. I don't care what system generates the number. Forget all the technical details and just imagine I have direct access to Blizzard's ladder DB. No need to know the function if you know the result of the function!
I am off by +-25 Elo points, most likely even more. And it does not matter!
I can add random numbers to ANY MMR point and my argument still stands! That's what I'm talking about. The mistake in MMR doesn't affect the result! I take BLIZZARD's MMR number; I don't calculate the MMR number myself!
|
skeldark i feel so sorry for you...
i can only hope that i am just part of a silent majority who got it from the op. The part that usually doesn't feel the need to write anything.
You have done interesting work!
@Imbalance: How can you not get that? If a race is picked by all casuals (reducing its average MMR), then this itself is a form of imbalance - maybe they took this race because it looks uber-awesome, so a graphical imbalance, or it got tons of tutorials, so an information imbalance.
And no, this doesn't tell us if one race is stronger in a theoretical game-design-scenario, but no one claimed that to begin with.
|
OK I went back and re-read in detail your writeup of how your actual tool works. I had mistakenly believed that you were actually calculating your own Elo scores for players.
What you're reverse engineering is the adjusted point value. Trying to infer something from this about actual skill ratings has some problems:
1) You are assuming the MMR is Elo, when it's absolutely not. This is explained clearly in the UCI video.
2) There is a 1:1 conversion between MMR and adjusted point score, which are the units you're backing out with your tool. Seeing backed-out adjusted point scores is interesting, but that 1:1 conversion is not necessarily linear, and if it's not linear then you can't necessarily make assumptions about the distributions of the underlying MMR. I mean, you can't do that AT ALL. You don't know how that conversion works.
3) The more complex skill rating systems use the uncertainty value as well as the skill number to adjust a player's score. The MMR can move by a different amount than adjusted points over the short term, because the use of difference between MMR and adjusted points provides long term pressure for adjusted points to catch up with a changing MMR. So, what your tool is doing only works for relatively stable MMR numbers.
4) Your monte carlo simulation doesn't capture actual uncertainty of the underlying MMR. The fact that adjusted points tracks MMR with a lag (as I mentioned in 3) means that fully random walks of adjusted points don't capture what the real system will do in any case. It's very possible that the differences you see between races would be much smaller than typical MMR uncertainties (which you can't see or measure) yet very unlikely with your randomly-generated pseudo-matchups.
Bottom line is that while your data regarding league boundaries in terms of adjusted points makes some sense, analyzing this data for racial differences is simply impossible because there's not necessarily a definable relationship (in the absence of information we're missing) between adjusted points and win likelihood.
|
On July 12 2012 20:13 skeldark wrote: I don't use any skill system! I didn't write a skill system and blindly assume it is the same one Blizzard uses!
You can't make ANY statistical analysis of a skill system you don't know the details of. Different skill systems produce different distributions of player skill ratings. For example, if I took all the players and rated them from 1 to 1,000,000 or whatever I'd have a flat distribution. Elo produces a logistic distribution. Ideally the distribution is normal, but in the absence of real information you don't know that.
Edit: This is a small issue. You're probably not going all too wrong by assuming a normal distribution. The bigger problem in the analysis is that changes in adjusted points don't track MMR as fast as MMR moves, so you have no way to estimate or take into account the accuracy of individual MMR numbers. The short version is that adjusted points will SEEM more accurate than MMR values would because they change less fast.
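To make the "adjusted points SEEM more accurate" point concrete, here is a toy sketch. It assumes, purely for illustration, that points follow the hidden rating by closing a fixed fraction of the gap each game (simple exponential smoothing). That rule is a stand-in of my own, not Blizzard's actual mechanism, and `track_with_lag`, the rate of 0.1 and the jitter of 50 are all made up.

```python
import random
import statistics

def track_with_lag(values, rate):
    """Follow a series by closing a fixed fraction of the gap each step."""
    tracked = [values[0]]
    for value in values[1:]:
        previous = tracked[-1]
        tracked.append(previous + rate * (value - previous))
    return tracked

rng = random.Random(1)
# Hypothetical hidden rating that jitters around 1500 from game to game.
hidden_mmr = [1500 + rng.gauss(0, 50) for _ in range(2000)]
points = track_with_lag(hidden_mmr, rate=0.1)

print(f"spread of hidden rating: {statistics.pstdev(hidden_mmr):.1f}")
print(f"spread of lagged points: {statistics.pstdev(points):.1f}")
```

The lagged series is visibly smoother than the series it follows, so an uncertainty estimated from the lagged values would understate how much the underlying rating really moves.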
|
Can't believe this thread is still going on.
Problem statement: OP wants to measure e-peen size and compare between different races.
Data gathering: OP constructs a e-peen measuring tool that user directly inserts on a voluntary basis. User base are high hormone individuals who are very interested to know how big their e-peen is. Therefore, these individuals are most likely already at a higher percentile of e-peen length compared to the general population.
Measurement: Users measure e-peen whenever they are playing by themselves. This play will be contested with another user. The winner will have longer e-peen and vice versa thus the true size of e-peen is estimated based on such repetition. Some e-peen have been observed to fluctuate in size up to 1000 inches in a few days. How can it be? Can the true size of e-peen be so volatile? Should it not be stable? Seems like some measurement are taken while in the state of flaccidity.
If indeed the measurement needs multiple observation to settle on a true e-peen then does that mean any single observations is then unreliable and not credible? But we like e-peen so the more is better. Let's not care about that.
Methodology: The statistics earlier is based on the average over a long period of time before the individual has truly established his true e-peen thus if there are any upwards biased (likely because they are all looking to gain the next level of e-peen recognition), the value will be severely underestimated. Not to mention those single observations from more outdated time. Later it is changed to be the latest e-peen measures. Pssh.. we should ignore testing anyway let alone use the correct test tool.
Summary: Human have the shortest e-peens. Humanoid aliens bits and bug tentacles have imbalanced in size.
|
This is pretty cool. I wonder how accurate it actually is and what Blizzard uses for their own methods
|
1) You are assuming the MMR is Elo, when it's absolutely not. This is explained clearly in the UCI video.
You know that the guy who found the video is the same guy who wrote the f-function for the adjusted points? not_that! And we have more sources about it than the video. We use the f-function and we can SEE it fits. We can prove it fits! What do you think we did in the last month? We validated the f-function and other parts of the back-engineering. We did not just come up with it. We looked at 100.000 games and analysed them for many months!
2) There is a 1:1 conversion between MMR and adjusted point score, which are the units you're backing out with your tool.
No, there is not! Adjusted points are one part of it, together with other values. No 1:1 ratio. You did not understand the f-function. Also, the f-function is only 10% of the work of finding out the MMR. It's way more complicated than that.
3) The more complex skill rating systems use the uncertainty value as well as the skill number to adjust a player's score.
No. We thought so too, but we found no evidence for it in the data. It acts as predicted.
4) Your monte carlo simulation doesn't capture actual uncertainty of the underlying MMR.
It doesn't even have to. It doesn't need MMR.
I could come up with the color of my coffee instead of MMR. If it produced the result I published, then the race in SC2 affects the color of my coffee! The fact that you are still talking about distributions of skill functions tells me that you did not understand what I'm doing here. Because it has nothing to do with skill functions!
My MMR calculation doesn't prove the result! The result proves my MMR calculation! That's the main point you don't understand!
|
On July 12 2012 20:23 Lysenko wrote: What you're reverse engineering is the adjusted point value.
No, you are not giving him enough credit. He is reverse engineering the actual MMR value, and as far as I can tell he has succeeded. It is quite revolutionary work (and not very simple).
|
On July 12 2012 20:34 Mendelfist wrote:
On July 12 2012 20:23 Lysenko wrote: What you're reverse engineering is the adjusted point value.
No, you are not giving him enough credit. He is reverse engineering the actual MMR value, and as far as I can tell he has succeeded. It is quite revolutionary work (and not very simple).
What he's reverse engineered are MMR values mapped back into adjusted points and then mapped from there into an Elo-like point system. The problem is that the mapping between MMR and adjusted points may not behave well for the case where a player's not in equilibrium. That may not be a problem for a player with stable MMR, but across a large population many of the players won't be in equilibrium at any particular time, and the interesting information is in those players.
This difference may not affect the averages very much, but it definitely will affect an estimate of how likely a particular difference between scores is in random play. He's doing this monte carlo simulation to guess how likely those differences between races are, but his monte carlo simulation doesn't capture nonlinearity in the MMR -> adjusted points relationship when a player is NOT in equilibrium.
|
I like how protoss is the only race to show up at 3000+ mmr.
It's really hard to quantify, "Imbalance" because of how difficult it is to factor in individual player skill. I really appreciate your effort even if your sample size is somewhat small, but then again I can imagine how large a pain in the ass it is to collect that many replays.
|
On July 12 2012 20:42 Lysenko wrote:
On July 12 2012 20:34 Mendelfist wrote:
On July 12 2012 20:23 Lysenko wrote: What you're reverse engineering is the adjusted point value.
No, you are not giving him enough credit. He is reverse engineering the actual MMR value, and as far as I can tell he has succeeded. It is quite revolutionary work (and not very simple).
What he's reverse engineered are MMR values mapped back into adjusted points and then mapped from there into an Elo-like point system. The problem is that the mapping between MMR and adjusted points may not behave well for the case where a player's not in equilibrium. That may not be a problem for a player with stable MMR, but across a large population many of the players won't be in equilibrium at any particular time, and the interesting information is in those players.
You change the topic, but what you describe is not the case. I tried to explain that earlier. We searched for this because we thought exactly the same. The strange thing is: we did not find it. The players act as predicted without it!
You try to argue that our MMR calculation is wrong. You can do so in the thread about the MMR calculation! But all my graphs and the data collected over the last month prove you wrong!
But this is off topic and has NOTHING to do with what I did here. I don't know how else to explain it to you.
On July 12 2012 20:44 cydial wrote: I like how protoss is the only race to show up at 3000+ mmr.
It's really hard to quantify, "Imbalance" because of how difficult it is to factor in individual player skill. I really appreciate your effort even if your sample size is somewhat small, but then again I can imagine how large a pain in the ass it is to collect that many replays.
I wrote this program the rest is automatic: http://www.teamliquid.net/forum/viewmessage.php?topic_id=334561
The very top players have nothing to do with the analysis. It's just that I only know a few top players' races. But these are 1-3 players and don't affect the result. I have data on many more, but not their races yet.
|
On July 12 2012 20:46 skeldark wrote: But this is offtopic and have NOTHING to do with what i did here. I dont know how to explain it else to you than i did
It's not off topic at all. All it's about is whether your monte carlo simulation accurately estimates the likelihood of those differences occurring randomly. Your simulation doesn't take everything we know about the system into account. That's a huge part of your thread here.
Edit: To estimate this accurately, your simulation would have to take into account the ACTUAL variability of Blizzard's ACTUAL MMR system, and then track that MMR with points the way Blizzard's system does. You're not doing any of this, and as far as I can tell, it's not possible to do it either.
|
On July 12 2012 20:42 Lysenko wrote: What he's reverse engineered are MMR values mapped back into adjusted points and then mapped from there into an Elo-like point system. The problem is that the mapping between MMR and adjusted points may not behave well for the case where a player's not in equilibrium.
I'm not sure what you are getting at. There are a lot of special cases where the mapping cannot be done for various reasons, but he is aware of this. I don't think there is a problem with unstable players. In principle I think your MMR can be calculated quite accurately from only one game, even if you are a new player.
|
On July 12 2012 20:51 Mendelfist wrote: In principle I think your MMR can be calculated quite accurately from only one game, even if you are a new player.
This can only be true if your MMR is in equilibrium. If your MMR is changing rapidly, meaning that the current MMR estimate for you is very different from the actual value, one game won't get you there.
|
On July 12 2012 20:53 Lysenko wrote: On July 12 2012 20:51 Mendelfist wrote: In principle I think your MMR can be calculated quite accurately from only one game, even if you are a new player. This can only be true if your MMR is in equilibrium. If your MMR is changing rapidly, meaning that the current MMR estimate for you is very different from the actual value, one game won't get you there.
My data prove you wrong. See PM. I don't think we are making any progress in this discussion. You assume many wrong things about our MMR calculation and bring up problems we solved a long time ago. And none of this has anything to do with the data in this thread.
Update: I have about 10,000 data points with race by now! Will update the OP when the calculation is done.
|
On July 12 2012 20:53 Lysenko wrote: On July 12 2012 20:51 Mendelfist wrote: In principle I think your MMR can be calculated quite accurately from only one game, even if you are a new player. This can only be true if your MMR is in equilibrium. If your MMR is changing rapidly, meaning that the current MMR estimate for you is very different from the actual value, one game won't get you there.
Ok, maybe we are talking about different things. I'm only talking about the MMR estimate that the matchmaking system uses, not some actual skill value which the MMR theoretically should converge to. I don't even think this actual value should be called MMR. That's very confusing.
I probably also should let skeldark speak for himself. He is already covered in tar and feathers, while I'm not, yet. :-)
|
Listen, we're getting into details that are beyond what I can reasonably talk about without going back and reviewing the entire thing from end to end. Let's put this discussion off until the weekend and I'll go through all the work with fresh eyes and continue the discussion then.
|
On July 12 2012 20:56 skeldark wrote: On July 12 2012 20:53 Lysenko wrote: On July 12 2012 20:51 Mendelfist wrote: In principle I think your MMR can be calculated quite accurately from only one game, even if you are a new player. This can only be true if your MMR is in equilibrium. If your MMR is changing rapidly, meaning that the current MMR estimate for you is very different from the actual value, one game won't get you there. My data prove you wrong.
Mendelfist was right, I was speaking about the "actual" skill value that MMR is trying to estimate, so our wires are crossed here. It's 5 a.m. in California, so a bad time for me to be posting about this.
|
On July 12 2012 21:12 Lysenko wrote: On July 12 2012 20:56 skeldark wrote: On July 12 2012 20:53 Lysenko wrote: On July 12 2012 20:51 Mendelfist wrote: In principle I think your MMR can be calculated quite accurately from only one game, even if you are a new player. This can only be true if your MMR is in equilibrium. If your MMR is changing rapidly, meaning that the current MMR estimate for you is very different from the actual value, one game won't get you there. My data prove you wrong. Mendelfist was right, I was speaking about the "actual" skill value that MMR is trying to estimate, so our wires are crossed here. It's 5 a.m. in California, so a bad time for me to be posting about this.
I call the theoretical value that the system tries to find "real skill". We have come up with a lot of word definitions by now to avoid this kind of confusion in discussions. ^^ Sometimes I use them and forget that others don't know my special definitions.
E.g.:
Dmmr = division MMR (not yet cleaned from ladder offsets)
Ammr = analysed MMR (my end result)
Cmmr = capped MMR
MMR = the end result of Blizzard's function
ommr = the MMR of the opponent
pmmr = the MMR of the player
|
I see three potential problems that could confound the data presented and I'm sure some, if not all, have been mentioned before. Please correct me if any of my assumptions are incorrect.
The data measured is for all leagues and all users of your tool, which, depending on your view of how the game should be balanced, may or may not be relevant. Personally, I believe that the game should be balanced around the higher levels of play, but that's up to Blizzard ultimately.
Also, non-game balance/design factors can affect this measure and would be really hard to actually account for. I'm not saying they exist, but if they do, they would be hard to identify and account for. Maybe there are more Terran players in lower leagues due to it being the race in the campaign. (these players are most likely not using your tool, but it's an example of the kind of bias that wouldn't be accounted for by this measure) There are several other of these outside influences that could affect the data.
The other thing, which could be seen as both a positive and negative, is the fluidity of this data. You mention this in the OP, but I wanted to talk about it a little bit. Because the averages are so close and MMR changes so rapidly, this data becomes a snapshot of a period in time. The validity of the data disappears in a very short time after it is published. It literally may already have changed. If you combine that with the fact that Terran metagame is in-flux for one of their matchups and you may get data that suggests that game design changes are necessary, when in reality it will correct itself over time.
All this being said, I very much appreciate the work that you have done with the MMR tool and putting together this data. I hope that good things can come of it and hope that the community can be a little more respectful of people who put time and effort into making it better.
|
On July 12 2012 22:01 TrippSC2 wrote: I see three potential problems that could confound the data presented and I'm sure some, if not all, have been mentioned before. Please correct me if any of my assumptions are incorrect.
The data measured is for all leagues and all users of your tool, which, depending on your view of how the game should be balanced, may or may not be relevant. Personally, I believe that the game should be balanced around the higher levels of play, but that's up to Blizzard ultimately.
Also, non-game balance/design factors can affect this measure and would be really hard to actually account for. I'm not saying they exist, but if they do, they would be hard to identify and account for. Maybe there are more Terran players in lower leagues due to it being the race in the campaign. (these players are most likely not using your tool, but it's an example of the kind of bias that wouldn't be accounted for by this measure) There are several other of these outside influences that could affect the data.
The other thing, which could be seen as both a positive and negative, is the fluidity of this data. You mention this in the OP, but I wanted to talk about it a little bit. Because the averages are so close and MMR changes so rapidly, this data becomes a snapshot of a period in time. The validity of the data disappears in a very short time after it is published. It literally may already have changed. If you combine that with the fact that Terran metagame is in-flux for one of their matchups and you may get data that suggests that game design changes are necessary, when in reality it will correct itself over time.
All this being said, I very much appreciate the work that you have done with the MMR tool and putting together this data. I hope that good things can come of it and hope that the community can be a little more respectful of people who put time and effort into making it better.
1) It's mostly the opponents. I collect data on my users and on everyone they played.
2) Data balance doesn't have to be design balance. But this point is valid for every method. You can't tell the reason for the imbalance, only that the data is out of balance.
3) Agreed. I cannot tell yet how this will turn out. I will publish monthly or two-week snapshots, depending on how much data I get in. We will see.
4) Thank you. I don't have time to make all the stats from my data. But I collected game length too, so someone can make statistics about game length versus win ratio.
|
On July 11 2012 01:50 1st_Panzer_Div. wrote: Whoah, rechecked that, you have 149,000 games of data. And you are claiming 4% of that is you as well?
So you have 5900 games of your own in this?
And why did you run the random deviation tests than only running 1,000 games, and not at least equal to the 149,000. (You actually should run random monte carlo's for whatever the estimated current userbase is to get some mock battle.net ladders from a perfectly balanced game). I could easily pick 1,000 games out of your current data and show significant imbalance towards any of the three races.
Also what are the dates your data is from?
This is a cool idea... the deviation bit is just not nearly enough random games to be accurate. What program did you use to run these tests, and was it length of the test that prevented you from doing a few hundred thousand?
Someone is crying.
User was temp banned for this post.
|
On July 12 2012 07:48 lolcanoe wrote: 1. Run an Anderson–Darling test on the data. This can be done with 3 clicks through Minitab which will automatically give you a P-value for whether or not the data is normal. If you cannot run this test or it tells you that your normality is problematic - note in the OP that your test assumes normality but was not verified to be normal.
2. The specific question here is whether or not one race has a signficantly higher MMR average than another. What your current test is actually testing for (although somewhat incorrectly), is whether or not the sample average varies significantly from the population mean. If executed correctly, this test also has application to understanding balance, but it doesn't answer the specific question. The specific question should be tested for under a very simple 2 sample t test (google it) and be tested 3 times - tz, pz, and zt. This is a much better test to fit the question and allows you to ignore the further confusion of taking another average.
3. In these calculations, independence between populations is a fair concern - and should likewise be noted.
4. Finally, be very clear about your conclusion. Your data allows you to conclude that the average skill rating of a certain race is potentially different than the skill rating of another. It is yet another jump to equate this difference to a problem in a balance, due to a potential cause-correlation problem (ie: Does terran make players bad, or do bad players pick terran?). Unfortunately, there's no way to resolve this concern with the data that you possess, so you'll have to make note of this caveat as well.
At least do the easy part and fix 1 and 2, and note very carefully what test was run (which STD's did you use?) to calculate statistical signficance.
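The pairwise comparison suggested in point 2 can be sketched roughly as follows. This is only an illustration, not the test anyone in the thread actually ran: it uses Welch's unequal-variance form of the two-sample t-test, the function names and the `mmr_by_race` input layout are made up for the example, and the p-value uses a normal approximation that is only reasonable for large samples.

```python
import math
from itertools import combinations
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance two-sample t-test."""
    na, nb = len(a), len(b)
    va = variance(a) / na  # squared standard error of mean(a)
    vb = variance(b) / nb  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


def approx_two_sided_p(t):
    """Two-sided p-value from the normal approximation (large samples only)."""
    return math.erfc(abs(t) / math.sqrt(2))


def pairwise_tests(mmr_by_race):
    """Run one test per race pair; mmr_by_race maps a race label to a list of MMR values."""
    results = {}
    for r1, r2 in combinations(sorted(mmr_by_race), 2):
        t, df = welch_t(mmr_by_race[r1], mmr_by_race[r2])
        results[(r1, r2)] = (t, df, approx_two_sided_p(t))
    return results
```

With three races this yields three tests (P vs T, P vs Z, T vs Z). It still assumes the samples are independent, which is the concern raised in point 3.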
|
On July 12 2012 20:48 Lysenko wrote: On July 12 2012 20:46 skeldark wrote: But this is offtopic and have NOTHING to do with what i did here. I dont know how to explain it else to you than i did
It's not off topic at all. All it's about is whether your monte carlo simulation accurately estimates the likelihood of those differences occurring randomly. Your simulation doesn't take everything we know about the system into account. That's a huge part of your thread here. Edit: To estimate this accurately, your simulation would have to take into account the ACTUAL variability of Blizzards ACTUAL MMR system, and then track that MMR with points the way Blizzard's system does. You're not doing any of this, and as far as I can tell, it's not possible to do it either.
I asked him about this on the 1st page. His simulation is not a monte carlo according to him, and he claims to have designed his program and model himself. To me as soon as he said that he designed it himself, I just gave up, and realized there was no further point arguing about it.
|
On July 13 2012 00:23 lolcanoe wrote: On July 12 2012 07:48 lolcanoe wrote: 1. Run an Anderson–Darling test on the data. This can be done with 3 clicks through Minitab which will automatically give you a P-value for whether or not the data is normal. If you cannot run this test or it tells you that your normality is problematic - note in the OP that your test assumes normality but was not verified to be normal.
2. The specific question here is whether or not one race has a signficantly higher MMR average than another. What your current test is actually testing for (although somewhat incorrectly), is whether or not the sample average varies significantly from the population mean. If executed correctly, this test also has application to understanding balance, but it doesn't answer the specific question. The specific question should be tested for under a very simple 2 sample t test (google it) and be tested 3 times - tz, pz, and zt. This is a much better test to fit the question and allows you to ignore the further confusion of taking another average.
3. In these calculations, independence between populations is a fair concern - and should likewise be noted.
4. Finally, be very clear about your conclusion. Your data allows you to conclude that the average skill rating of a certain race is potentially different than the skill rating of another. It is yet another jump to equate this difference to a problem in a balance, due to a potential cause-correlation problem (ie: Does terran make players bad, or do bad players pick terran?). Unfortunately, there's no way to resolve this concern with the data that you possess, so you'll have to make note of this caveat as well. At least do the easy part and fix 1 and 2, and note very carefully what test was run (which STD's did you use?) to calculate statistical signficance. 1) I dont assume normality.
I show that 99.99% of random values are in a range +- x and my value is outside of range x. So it's very unlikely that my value is random! THAT'S ALL. You call yourselves statistics freaks but fail to understand this simple method!
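For readers trying to follow the method, here is one way such a check could look. This is a guess at the general idea, not skeldark's actual routine: it draws random subsets of the pooled MMR values, records how far each subset average lands from the overall average, and reports the bound that the chosen share of random subsets stays within. The function name, parameters, and defaults are all hypothetical.

```python
import math
import random
from statistics import mean


def random_difference_bound(pool, subset_size, coverage=0.9999, trials=20000, seed=0):
    """Bound x such that roughly `coverage` of random subset means lie within +-x of the pool mean."""
    rng = random.Random(seed)
    overall = mean(pool)
    # Absolute distance of each random subset's mean from the overall mean
    diffs = sorted(
        abs(mean(rng.sample(pool, subset_size)) - overall) for _ in range(trials)
    )
    # Smallest recorded distance that covers the requested share of trials
    index = min(trials - 1, max(0, math.ceil(coverage * trials) - 1))
    return diffs[index]


def looks_non_random(observed_diff, bound):
    """True if the observed difference falls outside the +-bound range."""
    return abs(observed_diff) > bound
```

The subset size should match the number of accounts of the race being tested, since smaller subsets scatter more widely around the overall average. Resolving a 99.99% bound also needs many more than 10,000 trials to be stable.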
If you want to do a more complex test with the data, feel free to do so! I won't stop you! It's not on me to prove anything. I publish data. If you want to prove something, prove it. Why do so many people think I have to do something? Do you pay me?
Sorry if I'm harsh, but this thread is full of people who don't understand anything but act like they know what they are talking about. So it's hard for me to filter every time who has a good point and who just wants to look smart.
|
I love observing stats. Interesting that Terran's highest moments are in early MMR, Zerg's is near the middle, Protoss near the high middle, and then it flattens out decently.
|
Update:
Result:
TIME Filter: only between 1 Jan 1970 00:00:00 GMT - 12 Jul 2012 16:52:47 GMT
Datasize: 10063
Average MMR: 1593.1
Min difference to be significant: 90%: +-16, 99%: +-24, 99.99%: +-36
Difference to average MMR per race: T: -53.08 P: 11.18 Z: 32.05

TIME Filter: only between 1 Jan 1970 00:00:00 GMT - 12 Jul 2012 16:52:47 GMT
MMR Filter: Only Master+
Datasize: 2278
Average MMR: 2278.03
Min difference to be significant: 90%: +-15, 99%: +-23, 99.99%: +-35
Difference to average MMR per race: T: -24.42 P: 14.98 Z: 3.69
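To make the figures above easier to read, this small sketch compares each per-race difference with the stated minimum differences. It only restates the posted numbers. The helper names are invented for the example, and it assumes the same thresholds apply to every race, which the post does not spell out.

```python
def highest_level_exceeded(diff, thresholds):
    """Return the strictest level whose threshold |diff| exceeds, or None if it exceeds none."""
    exceeded = None
    # Walk the thresholds from loosest to strictest and keep the last one passed
    for label, bound in sorted(thresholds.items(), key=lambda kv: kv[1]):
        if abs(diff) > bound:
            exceeded = label
    return exceeded


def classify(dataset):
    """Map each race to the strictest significance level its difference exceeds."""
    return {
        race: highest_level_exceeded(diff, dataset["thresholds"])
        for race, diff in dataset["diffs"].items()
    }


# Numbers copied from the result block above
all_players = {
    "thresholds": {"90%": 16, "99%": 24, "99.99%": 36},
    "diffs": {"T": -53.08, "P": 11.18, "Z": 32.05},
}
master_plus = {
    "thresholds": {"90%": 15, "99%": 23, "99.99%": 35},
    "diffs": {"T": -24.42, "P": 14.98, "Z": 3.69},
}

if __name__ == "__main__":
    print(classify(all_players))
    print(classify(master_plus))
```

Read this way, only the Terran difference passes the 99.99% threshold in the unfiltered data, and in the Master+ data the Protoss and Zerg differences stay below even the 90% threshold.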
The deviation shows that the differences between the race values are too big to be explained by random error.
So we come to the conclusion:
1) Terran has a significantly lower average MMR compared to the total data pool. We cannot tell whether this imbalance comes from design or from some other reason.
2) The imbalance is small. An average win on ladder is +16 MMR.
3) The data is biased towards EU/US and towards higher skill-rate.
README before writing a long post about why you think this is not scientific statistical proof:
This is not a university paper about SC2 balance. I don't get money for this. I don't personally care which race is OP or not. I publish the data I collected with my own program, which I wrote to back-calculate MMR. I found some very interesting anomalies in the race data. I programmed a quick test routine to show these anomalies and that they are very unlikely to come from a random source. I show that 99.99% of random values are in a range +- x and my value is outside of range x. So it's very unlikely that my value is random! If you want to do a more complex test with the data, feel free to do so!
Source Data
If you read the text carefully, I think you will agree that this is not perfect, but a way better method than TLPD win ratios or random tournament results.
It says a lot about this community that over 30 people tell me what I should do and 0 people do something with the data.
|
TL BANHAMMER Quote from lazyitachi removed! Which post? Looks like my ignore list is well made. He was already on the list before he wrote that 
|
You say in your OP that you were able to calculate the mmr very accurately. Is the so to speak official mmr, used by the bnet, somehow observable? I thought it was not. If it is not, how do you know that your results are very accurate?
Great thread. =)
|
Sorry, didn't read the 18 pages all, but who is the Protoss with this highest MMR?
Crazy Crawling btw, really nice to see what you can actually do with some time
|
On July 13 2012 02:40 Junichi wrote: You say in your OP that you were able to calculate the mmr very accurately. Is the so to speak official mmr, used by the bnet, somehow observable? I thought it was not. If it is not, how do you know that your results are very accurate?
Great thread. =)
I can observe it because I'm a fucking genius. ^^ Nah, it was months of work to find a way and I did not do it alone: http://www.teamliquid.net/forum/viewmessage.php?topic_id=334561
Good question. It's very hard to judge that. Promotion and demotion are one way: I know before the promotion that a player should get promoted, and he really gets promoted. Also, I know that the opponent should be close to the player (in the end, that is matchmaking), so I can see whether this is the case on average. Not always in practice, because I sometimes judge the opponent based on the player, so I would only observe my own mistake.
The main way to see it is by analysing the game data. We could find out many special rules that we would never have been able to see if we didn't have accurate values. Like MMR caps.
Also, I can test single parts of the process, like my tier analyser. The second someone plays a master player, I know the exact tier (master only has 1). So I take a high diamond player from my user base and predict the MMR of his next game. Then he plays a master player and I calculate the MMR independently of his history. After that I check how close the prediction from the tier analyser is to reality. For the last values I checked, I was not more than 5 MMR off.
But it's not perfect. My data say incontroll is one of the best US ladder players and that's obviously a bug ^^
|
This is a chunk of data you analyzed there! Thanks for your insights.
I have to disagree in one point, though:
2) Why mistakes in the MMR calculation don't affect the result.
First: the accuracy of my mmr calculation is very good. But i can be wrong in some points or for some users. However nothing in the calculation takes the race into account. So every mistake in mmr calculation is independent from the race average mmr result!
Well, imagine the following situation: A few months ago all races were perfectly balanced, except that Terran was too strong. Then Terran got nerfed, so finally all races are in perfect balance. Naturally Terran players will now start to fall in their rankings a little. This is of course reflected in the MMR value. But what if you put too much weight on the soaring/sinking factor? Then your calculated MMR for all those sinking Terran players would be even lower than the "correct" MMR value. This would lead to a misjudgment of your data even though you don't specifically check for the race.
I'm not saying any of the above is true. I'm just trying to say that the conclusion (I-dont-check-for-races => (implies) Wrong-MMRs-Do-No-Harm-Or-Good-To-A-Specific-Race) does not always have to be true.
Btw: I love the way you presented your data. Totally unbiased and even hinting towards the smallness of the difference.
|
On July 13 2012 03:05 AKnopf wrote: This is a chunk of data you analyzed there! Thanks for your insights. I have to disagree in one point, though: 2) Why mistakes in the MMR calculation don't affect the result.
First: the accuracy of my mmr calculation is very good. But i can be wrong in some points or for some users. However nothing in the calculation takes the race into account. So every mistake in mmr calculation is independent from the race average mmr result! Well, imagine the following situation: A few months ago all races were perfectly balanced, except that Terran was too strong. Then Terran got nerfed, so finally all races are in perfect balance. Naturally Terran players will now start to fall in their rankings a little. This is of course reflected in the MMR value. But what if you put too much weight on the soaring/sinking factor? Then your calculated MMR for all those sinking Terran players would be even lower than the "correct" MMR value. This would lead to a misjudgment of your data even though you don't specifically check for the race. I'm not saying any of the above is true. I'm just trying to say that the conclusion (I-dont-check-for-races => (implies) Wrong-MMRs-Do-No-Harm-Or-Good-To-A-Specific-Race) does not always have to be true. Btw: I love the way you presented your data. Totally unbiased and even hinting towards the smallness of the difference.
That is a good point. I wrote a long post about why this is theoretically not possible, but when I think about it, it is. I don't have a sinking factor, so the example can't happen. I calculate each game anew.
But I get the point that I could make a mistake that is race-biased without even knowing the race. I have to think about what this could be and whether any of these factors affect my calculation. At the moment I don't see such a point, and if there were one we would have noticed it in the datasets already. But I realise that "we would have noticed" is not proof against your point.
--- Side note for the people with whom I discussed that the MMR calculation doesn't matter for the result: this still holds. The statistically significant difference in the result would still be significant. This would even prove that race is a factor the MMR depends on. Also, this kind of mistake would not show up in any statistical analysis.
A very good point and the first valid criticism of my method that I have seen.
|
On July 13 2012 02:40 Junichi wrote: You say in your OP that you were able to calculate the mmr very accurately. Is the so to speak official mmr, used by the bnet, somehow observable? I thought it was not. If it is not, how do you know that your results are very accurate?
The answer is that it's "kind of" observable. You can figure out the relationship between the MMR of one player and the adjusted point score of another. If the second player has a fairly stable point score and has played a lot of games, then you can assume their MMR is stable and in equilibrium with their point score. Then, you look at how many points another player vs. them gains or loses. If the gain or loss is 12 points, then they have the same MMR in units of adjusted points.
What you can't really measure from a single game between two players is how the MMR probability function works for players with different MMRs. What I mean by this is that if two players gain or lose 12 points after a game, they'll have a 50/50 win/loss rate vs. each other, but if two players play and the point differential is +10/-14 if the better player wins, what's the likelihood of a win or loss?
Having a large enough data set, like skeldark is collecting, can potentially answer that question. I haven't read what they've written closely enough to know if they've backed that out, but it should be possible (just by, for example, selecting all the games that result in a +10/-14 result among players with stable point values, and looking at the percentage results.)
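The selection described above might look something like this in code. It is a sketch under assumptions: each game record is taken to already carry the favoured player's point offer (gain on a win, loss on a defeat) and the outcome, which is a made-up record layout and not the format of skeldark's database. Restricting to players with stable point values would have to happen before this step.

```python
from collections import defaultdict


def win_rate_by_point_offer(games):
    """Group games by (points_if_win, points_if_loss) and return the observed win fraction per group.

    games: iterable of (points_if_win, points_if_loss, won) tuples, one per game,
    all from the same player's perspective.
    """
    tally = defaultdict(lambda: [0, 0])  # offer -> [wins, total games]
    for gain, loss, won in games:
        bucket = tally[(gain, loss)]
        if won:
            bucket[0] += 1
        bucket[1] += 1
    return {offer: wins / total for offer, (wins, total) in tally.items()}
```

Comparing the observed fraction for, say, the `(10, -14)` group with the 50% seen in `(12, -12)` games would give an empirical point on the win-probability curve. Running the same grouping per league would address the Diamond-versus-Bronze question.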
There's also the potential possibility that +10/-14 games have a different win likelihood in Diamond than they do in Bronze. It might be possible to back that out from the data as well, but I'm guessing that everyone's assumed that's not the case. I don't have an opinion, just mentioning the possibility for completeness.
Edit: Given skeldark's answer above, it looks like they haven't done the kind of analysis I described here, but combined with looking at promotions and demotions, this kind of analysis might help provide more useful information from this data set.
Note that I'm not criticizing the data collection or some of the aggregate info they've extracted from it, my concerns mostly focus on the monte carlo simulation and analysis in this particular OP. I think the rest of their work is very interesting stuff.
|
On July 13 2012 03:39 Lysenko wrote: On July 13 2012 02:40 Junichi wrote: You say in your OP that you were able to calculate the mmr very accurately. Is the so to speak official mmr, used by the bnet, somehow observable? I thought it was not. If it is not, how do you know that your results are very accurate? The answer is that it's "kind of" observable. You can figure out the relationship between the MMR of one player and the adjusted point score of another. If the second player has a fairly stable point score and has played a lot of games, then you can assume their MMR is stable and in equilibrium with their point score. Then, you look at how many points another player vs. them gains or loses. If the gain or loss is 12 points, then they have the same MMR in units of adjusted points. What you can't really measure from a single game between two players is how the MMR probability function works for players with different MMRs. What I mean by this is that if two players gain or lose 12 points after a game, they'll have a 50/50 win/loss rate vs. each other, but if two players play and the point differential is +10/-14 if the better player wins, what's the likelihood of a win or loss? Having a large enough data set, like skeldark is collecting, can potentially answer that question. I haven't read what they've written closely enough to know if they've backed that out, but it should be possible (just by, for example, selecting all the games that result in a +10/-14 result, and looking at the percentage results. There's also the potential possibility that +10/-14 games have a different win likelihood in Diamond than they do in Bronze. It might be possible to back that out from the data as well, but I'm guessing that everyone's assumed that's not the case. I don't have an opinion, just mentioning the possibility for completeness.
I understand the points you are making. We checked all this months ago. The f function that not_that published in his thread is version 3.
If there were something like you mention, we would have seen it a long, long time ago. We searched for it. +24 -1 games have a high derivation, but the f function still gives results that fit in the picture of 12/-12 games. Any of these factors would show up in a not expected MMR for a single game, which would be noticeable even for only 1 player and would for sure have been noticed in our 100k game database. If you had been there 2 months ago, you could have helped us a lot to figure all this out.
My main point of defence against criticism of the MMR method is: it has worked in practice without mistakes for 1 month now.
Besides that, the f function is one part, not the whole MMR calculation. It only gives you the DMMR, and even this only if some special rules are not active.
|
On July 13 2012 03:45 skeldark wrote: +24 -1 games have a high derivation but the f function still give results that fit in the picture of 12/-12 games . Any of this factors would show up in an not expected mmr for an single game that be noticeable even for only 1 player and for sure noticed in our 100k gamedatabase.
Can you explain this part again? I do not understand what you're saying.
Edit: Not sure what "high derivation" means. Also, I don't understand what a "not expected MMR" is.
|
The f function calculates the DMMR depending on adjusted points and change points. It sees how far the opponent's MMR is away from someone's adjusted points by means of the change points. You look at the skill function and how it acts. But that's not the important part. We reverse-engineer. We look at what causes the change, not at the difference between the players and how we would have to change them. Blizzard already did this! We don't need to calculate the MMR, we just have to READ it.
Not_that found a function that acts like the one we can observe through its results (the point change is the result of this function).
We calculate back to the start value (DMMR) that the function used to get to this point change. The "derivation" (I mean deviation) of the start value is higher if the point change is far away from 12/-12. That has nothing to do with the deviation of the skill function!
We cannot see exactly where the start value (DMMR) is (the function loses information because it calculates a small number out of a big number), but we can see the range where it is.
A not expected MMR is a value that we know cannot be correct, e.g. you won a game and your MMR falls. Falling and rising after wins and losses was the main indicator for finding the f function! We calculate the MMR before a game, and before the next game. If there were any mistake in the function in one of the data points, we would see a rise after a loss or a fall after a win. We don't have such a point in any game!
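The sanity check described here can be sketched as a simple scan over one player's game history. The record layout (the MMR calculated before each game plus the result of that game) and the function name are assumptions made for this illustration; the real tool may store things differently.

```python
def find_unexpected_points(history):
    """Return indices of games after which the calculated MMR moved the wrong way.

    history: chronological list of (mmr_before_game, won) tuples for one player.
    A game is flagged if the player won but the next calculated MMR is lower,
    or lost but the next calculated MMR is higher.
    """
    flagged = []
    for i in range(len(history) - 1):
        mmr_before, won = history[i]
        mmr_next = history[i + 1][0]  # MMR calculated before the following game
        if won and mmr_next < mmr_before:
            flagged.append(i)
        elif not won and mmr_next > mmr_before:
            flagged.append(i)
    return flagged
```

An empty result for every player is what the post claims to observe. Note that this only catches errors that flip the direction of a change, not errors in its size.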
All this is explained in not_that's thread. You should post your question there. NT is also better at explaining what he did than I am.
|
On July 13 2012 01:06 skeldark wrote: On July 13 2012 00:23 lolcanoe wrote: On July 12 2012 07:48 lolcanoe wrote: 1. Run an Anderson–Darling test on the data. This can be done with 3 clicks through Minitab which will automatically give you a P-value for whether or not the data is normal. If you cannot run this test or it tells you that your normality is problematic - note in the OP that your test assumes normality but was not verified to be normal.
2. The specific question here is whether or not one race has a signficantly higher MMR average than another. What your current test is actually testing for (although somewhat incorrectly), is whether or not the sample average varies significantly from the population mean. If executed correctly, this test also has application to understanding balance, but it doesn't answer the specific question. The specific question should be tested for under a very simple 2 sample t test (google it) and be tested 3 times - tz, pz, and zt. This is a much better test to fit the question and allows you to ignore the further confusion of taking another average.
3. In these calculations, independence between populations is a fair concern - and should likewise be noted.
4. Finally, be very clear about your conclusion. Your data allows you to conclude that the average skill rating of a certain race is potentially different than the skill rating of another. It is yet another jump to equate this difference to a problem in a balance, due to a potential cause-correlation problem (ie: Does terran make players bad, or do bad players pick terran?). Unfortunately, there's no way to resolve this concern with the data that you possess, so you'll have to make note of this caveat as well. At least do the easy part and fix 1 and 2, and note very carefully what test was run (which STD's did you use?) to calculate statistical signficance. 1) I dont assume normality. I show that 99.99% of random values are in a range +- x and my value is outsite of range x. So its very unlikely that my value is random! THATS ALL. You call yourself statistic freaks but fail to understand this simple method! I fail to understand because you failed to explain.
What you did:
You take a large pool of data. Find average.
You take a subset of that data. Find average.
Subtract the difference. Then what? What do you mean by you "showed" 99.99% of the random values are +-x? From what, the average? By random values, do you mean data points?
If my understanding is correct, you are implying that 99+% of the data lies between +-25 of the mean. That's NOT what the graph here: http://postimage.org/image/n60jmstyz/ shows at all. You need to be more thorough in your explanation and your calculations.
And once again, it's not a more sophisticated test, but you should not be comparing dependent sample averages. You should be comparing average MMR of t to average of z, t to p, and p to z, and so forth. So you want 3 tests to see if the averages are different from each other, not testing if a single race varies significantly from the average of all races.
|
On July 13 2012 04:06 lolcanoe wrote: [...]
NO. None of this is true. I'll try to explain.
Given:
Data A. Data A was created without knowledge of P.
Property P. Property P was collected without knowledge of A.
90,000 randomly sorted data groups of A produced values between -25 and +25 in 99.55% of the cases.
P-sorted data groups of data A produced: P1: -53.68 P2: 11.87 P3: 31.52
P-sorted data groups of data B (subgroup of A) produced: P1: -27.70 P2: 17.49 P3: 3.82
P-sorted data groups of data C (subgroup of A) produced: P1: -43.51 P2: 0.37 P3: 34.93
Data A is obviously significantly biased towards P!
btw you find this in the OP...
And once again, it's not a more sophisticated test, but you should not be comparing dependent sample averages. You should be comparing average MMR of t to average of z, t to p, and p to z, and so forth. So you want 3 tests to see if the averages are different from each other, not testing if a single race varies significantly from the average of all races.
Adding or subtracting the same number from two numbers doesn't change the difference between them:
A - C = X
B - C = Y
X - Y = A - B
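Written out as a one-line derivation, the point is that the shared average C cancels. The second line plugs in the P1 and P2 offsets quoted for data A in this thread (-53.68 and 11.87), purely as arithmetic; which race each label stands for is not stated here.

```latex
\begin{aligned}
X - Y &= (A - C) - (B - C) = A - B \\
X - Y &= -53.68 - 11.87 = -65.55
\end{aligned}
```

So the gap between two race averages can be read off the offsets directly, without knowing C.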
|
|
On July 13 2012 04:14 monkybone wrote: Why does uneven skill distribution not affect average MMR in an ideal situation with perfect balance?
Because a situation with perfect balance = even skill distribution.
When I talk about balance, I talk about an even skill distribution of the races. That is the balance of the property (race) of the data (account skill).
This DOESN'T have to be game design balance. The latter is a social term and cannot be calculated because it's not even clearly defined. If all Terran pro players are ill and cannot play, is the game still balanced? I say no. You could say yes. It's not a mathematical question.
|
On July 13 2012 01:30 skeldark wrote: If you read the text careful, i think will agree that this is not perfect but a way better method than tldp win-ratios or random tournament results.
This is something I can completely agree with, that the method used, regardless of the many faults I find with it, is much more significant than the tldp win-ratios.
You also said it's not a university paper, and I think that's what most people are looking for: a much more detailed and broken-down explanation. I think in general I get what you did with your actual data; my main issue is with the model that you used to determine what an acceptable SD would be under a balanced model. But as you used your own program, I don't think there's any point in going further down that road.
With that: cool idea, but I disagree with your final analysis and explanation of the results you found from the data, and thus disagree that it proves anything.
I do want to say that with your large amount of data, over the course of its collection, your data does show that Terran tends to be slightly lower MMR than the other two. This could be for many reasons, including that at the start of your collection Terran was over-balanced and then readjusted lower after you started collecting data; it could fit within a proper standard deviation with a different (better) model of b.net; or many other things.
Thank you for posting your data as well. Final question: when did you start collecting data? Was it in fact 1970 (lol, jk), or was it on May 13th, 2012 (best date I can find from what you've posted)?
|
On July 13 2012 04:10 skeldark wrote: [...]
What do you mean by data group - how large is each of these data groups you are talking about? And why are we dealing with data groups instead of lumping all the data into one sum?
As for the A-C = X mumbo jumbo, technically you're right in the sense that testing t vs p and t vs z will utilize all the data, but you're missing the point I'm trying to make.
You're currently using A - avg(a,b,c), when you should be testing a against c directly to avoid confusion.
|
On July 13 2012 04:21 1st_Panzer_Div. wrote: [...] I do want to say that with your large amount of data, that over the course of it's collection, that your data does show that Terran tends to be slightly lower MMR than the other two
That is all I ever said!
Thank you for posting your data as well, final question, when did you start collecting data, was it in fact 1970 (lol, jk) or was it on May 13th, 2012 (best date I can find from what you've posted)
Mid season 7, but with races only since the last patch of my program, so 2 weeks ago.
|
|
On July 13 2012 04:32 monkybone wrote: [...] Of course the game could be balanced even in absurd situations where all Terran players were complete scrubs or any other skill distribution.
Now you are talking about game balance. So what is game balance? Define...
I say the fact that all Terrans are scrubs = imbalance. It's imbalanced because all Terrans are scrubs. My definition of balance = not all Terrans are scrubs!
You say it's balanced because the reason is the player, not the game.
The reason can be anything. That is not a mathematical value!
You are free to find statistical methods or social analyses of what the reason for imbalance in the data is. That's not what I did. I look at IF there is imbalance in the data, not what causes it.
|
You seem to be ignoring the more important questions about defining what you mean by a "data group" and skipping to what you like rehashing 100 times.
|
On July 13 2012 04:46 lolcanoe wrote: You seem to be ignoring the more important questions about defining what you mean by a "data group" and skipping to what you like rehashing 100 times.
A data group is a group of the data pool. A subgroup.
I cannot mix everything together because I have 3 different races. So to find out the average of one race, I have to take all players of only this race.
Data group Terran = all Terran players in the data.
Data group random = a random subgroup. A random data group is when I give every player a random number and then take the group where this random number is 1.
I don't know how I can simplify it more.
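The random-group idea described above can be sketched roughly as follows. This is only an illustration under assumptions: the function names (`random_group`, `random_offsets`, `share_within`) are made up here, the choice of three groups is a guess at "take the group where this random number is 1", and the MMR pool is synthetic, not the thread's data.

```python
import random
from statistics import mean


def random_group(mmrs, n_groups, rng):
    """Give every player a random number in 1..n_groups and keep those who drew 1."""
    if not mmrs:
        raise ValueError("need at least one MMR value")
    group = []
    while not group:  # redraw in the rare case that nobody drew a 1
        group = [m for m in mmrs if rng.randint(1, n_groups) == 1]
    return group


def random_offsets(mmrs, n_groups=3, trials=1000, seed=0):
    """Offset (group mean - overall mean) for `trials` random groups."""
    rng = random.Random(seed)
    overall = mean(mmrs)
    return [mean(random_group(mmrs, n_groups, rng)) - overall for _ in range(trials)]


def share_within(offsets, bound):
    """Fraction of offsets whose absolute value is at most `bound`."""
    return sum(abs(o) <= bound for o in offsets) / len(offsets)


if __name__ == "__main__":
    rng = random.Random(1)
    # Synthetic MMR values for illustration only, NOT the thread's data.
    pool = [rng.gauss(1500, 400) for _ in range(3000)]
    offsets = random_offsets(pool)
    print(f"share of random groups within +-25: {share_within(offsets, 25):.4f}")
```

A race's offset would then be compared against the spread of these random offsets: if almost no random group lands as far from the overall average as the race group does, the race offset is unlikely to be chance alone.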
|
|
On July 13 2012 04:50 skeldark wrote: [...]
Yes, but what's the statistical value of creating random data groups?
And what do you mean you didn't mix them together - aren't your data values calculated by average(t) - average(t,z,p)? And you did use a player-weighted average, right?
|
On July 13 2012 04:53 monkybone wrote: [...] Didn't you read the whole post?
You want to make a statistic about how much someone is improving compared to how much he trains. I did not see what this has to do with what I did; that's why I ignored it. I think we are talking about different topics here. You think it's more interesting to see the MMR change over training; I calculated the total MMR average. What should I say to this? Yes, it could be interesting to see how the MMR changes with more games depending on the race.
|
Sorry if you answered this before somewhere, but do you have a rough estimate of the MMR ranges of each league? I'd really like to see that as it would be very interesting.
|
I failed to read about how you cancelled out effects like disconnects or other cases in which the game is not representative :o
|
On July 11 2012 01:34 skeldark wrote:
[...]
Result
TIME Filter: only between 1 Jan 1970 00:00:00 GMT - 12 Jul 2012 16:52:47 GMT
Datasize: 10063
Average MMR: 1593.1
Min Difference to be significant: 90%: +-16, 99%: +-24, 99.99%: +-36
[...]
TIME Filter: only between 1 Jan 1970 00:00:00 GMT - 12 Jul 2012 16:52:47 GMT
MMR Filter: Only Master+
Datasize: 2278
Average MMR: 2278.03
Min Difference to be significant: 90%: +-15, 99%: +-23, 99.99%: +-35
[...]
The significance interval is lower for the larger sample? That doesn't make sense to me.
Someone help me out here please.
|
On July 13 2012 05:00 hunts wrote: Sorry if you answered this before somewhere, but do you have a rough estimate of the MMR ranges of each league? I'd really like to see that as it would be very interesting.
Depends on the server (US / EU).
#START PROMOTE_OFFSETS
bronze - master
0,754,1050,1280,1536,1993,
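To make the offsets above easier to read, here is a small lookup sketch. It assumes the six numbers are the lower MMR bounds of bronze, silver, gold, platinum, diamond and master in that order, and that a value exactly on a bound belongs to the higher league; neither is confirmed in the post, and the values are said to differ by server. `league_for` is a made-up helper name.

```python
from bisect import bisect_right

# Assumed mapping: one offset per league, bronze through master.
LEAGUES = ["bronze", "silver", "gold", "platinum", "diamond", "master"]
PROMOTE_OFFSETS = [0, 754, 1050, 1280, 1536, 1993]


def league_for(mmr):
    """Return the league whose assumed lower bound is the highest one not above `mmr`."""
    index = bisect_right(PROMOTE_OFFSETS, mmr) - 1
    return LEAGUES[max(index, 0)]  # values below 0 are treated as bronze


if __name__ == "__main__":
    for value in (500, 754, 1600, 2300):
        print(value, league_for(value))
```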
On July 13 2012 05:00 BBS wrote: I failed to read about how you cancelled out effects like disconnects or other cases in which the game is not representative :o
I don't think that disconnects are race-dependent. Why should one race have more disconnects than another?
|
On July 13 2012 05:03 xian_ wrote: [...] The significance interval is lower for the larger sample? That doesn't make sense to me. Someone help me out here please.
The smaller sample has a much lower range (only master vs. all skill ranges). I asked myself the same question when I saw the output ^^
|
On July 13 2012 05:00 BBS wrote: I failed to read about how you cancelled out effects like disconnects or other cases in which the game is not representative :o
I don't think that disconnects are race-dependent. Why should one race have more disconnects than another?
Indeed, concerning the sample size, it may not be relevant, that not all races are played equally distributed in different countries, which have their own respective disconnect-rates. So even if there were like 70% of the Russians playing Terran and suffer from an increased disconnect-rate of say 100%, that might not really bias the sample at all. Might become a point, when you crack the trillion-observation-sample-size :D
|
On July 13 2012 04:53 lolcanoe wrote: [...]
I find it pretty disturbing that I have to requote myself every time I ask questions about the data itself. It's so hard to get to the bottom of how you calculated the confidence interval that I'm suspecting that you don't understand what's going on yourself.
First we argue about normality, the next moment you tell me it's not based on normality at all. I press further to ask how you are getting the intervals then, and you tell me that 99.99% of "random groups" fall in a certain range. Now there's good reason to ask why these random groups exist, what they are, and what value they add, but once again you seem to be dodging what's important here.
Just by looking at the MMR graph alone, it seems like we have a pretty high standard deviation regardless of what assumptions we are making. I suppose what you really want (since you are dealing with averages) is the standard error of the mean, i.e. the standard deviation adjusted by sqrt(n).
But I don't see any of that happening here...
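For reference, the standard error of the mean mentioned above is the sample standard deviation divided by sqrt(n). A minimal sketch, with `standard_error` as a made-up helper name and made-up example values:

```python
from math import sqrt
from statistics import stdev


def standard_error(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return stdev(values) / sqrt(len(values))


if __name__ == "__main__":
    # Made-up MMR values for illustration only.
    sample = [1450, 1510, 1620, 1380, 1700, 1555]
    print(f"standard error: {standard_error(sample):.2f}")
```

Because the spread of the values sits in the numerator and sqrt(n) in the denominator, a smaller sample with a much narrower spread can still end up with a similar or smaller standard error than a larger, wider one.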
Finally for those of you who keep asking us to do the questions ourselves, the complete data package doesn't seem to be readily available. It also seems to be constantly changing...
|
On July 13 2012 05:08 BBS wrote: [...]
Seriously guys... this sort of strain would have an entirely negligible effect on anything at all. If you studied statistics you'd know that proportional errors are pretty independent of sample size, so if it's a negligible effect to begin with it'll continue to be that way even as the data pool grows.
|
On July 13 2012 05:29 lolcanoe wrote: [...]
What you see on the graph is the deviation of the player base in total. It has nothing to do with my calculation. I did not say it's not normal. I don't know what it is.
I add the random groups to show that my data is not just a random point, because it is outside of the range that random points would produce. I try to come up with simple examples and I just don't know how to explain it differently than I did.
The "mix" was a quote of you, because you asked something about why there are different groups. I am trying to explain to you that with 3 different races I need 3 different groups, one for each race, or I cannot calculate their averages.
Your questions are kind of random. I can't explain every sentence I wrote in the OP to you, and then you want me to explain every sentence of the explanation. I think I explained decently what I did. I don't see any way to explain it better. I even made a simple example of the calculation.
|
On July 13 2012 05:03 skeldark wrote: [...]
TY, is there any information on how high masters MMR goes or about where GM starts? Or is that not available due to the nature of GM. I believe there's still a cap on masters MMR though, wonder if that number is available somewhere.
|
On July 13 2012 05:47 hunts wrote: [...]
^^ I edited it out of the copy-paste because this value changes all the time and is not very accurate. +800 on master is a good rule of thumb for GM. But GM is... fucked up. It's not top 200 at all. It's just a bad system.
There is no cap for master MMR; only lower leagues have a cap, in that they cannot fall under 0 of their tier. If you mean a maximum MMR, we are not sure if it exists. If it exists, this would be terrible because it would destroy the skill function on the high end. We just hope Blizzard is clever enough not to build one in, or to build it only into the match-finding algorithm.
|
On July 13 2012 05:38 skeldark wrote: I add the random groups to show that my data is not just a random point, because it is outside of the range that random points would produce. I try to come up with simple examples and I just don't know how to explain it differently than I did.
Jesus christ I don't think an explanation could be more obfuscated than that.
On July 13 2012 05:38 skeldark wrote: Your questions are kind of random. I can't explain every sentence I wrote in the OP to you, and then you want me to explain every sentence of the explanation. I think I explained decently what I did. I don't see any way to explain it better. I even made a simple example of the calculation.
The questions are not random. What's random is your procedure. Your tests are inconsistent with any conventional test and the conclusions are based on pseudo-stats at best.
In light of the bullshit here, let me post very very concisely what you can do to the data to make this test statistically more sound.
Assume normality - but add a disclaimer that normality is an assumption that you do not take lightly.
Take T, Z, P data pools and calculate the following:
Mean: Average MMR of each race.
SD: Use a SD calculator to find the standard deviations of each race.
Test two races at a time (start with T vs Z, etc), using the 2 sample t test. See http://ccnmtl.columbia.edu/projects/qmss/the_ttest/twosample_ttest.html
Test the t-statistic against a P of .01.
Post the data with caveats about the conclusion and caveats about the normality assumption.
- Done.
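The steps above could be sketched like this with only the Python standard library. Two things are swapped in and should be read as assumptions: the statistic is Welch's unequal-variance form of the two-sample t test (the post does not say which form to use), and the cutoff for P = .01 comes from a normal approximation, which is only reasonable for large samples. `welch_t` and `differs` are made-up names and the data is synthetic.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, variance


def welch_t(a, b):
    """Welch's t statistic for the difference between the means of a and b."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def differs(a, b, alpha=0.01):
    """Two-sided check of |t| against a normal-approximation critical value."""
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(welch_t(a, b)) > critical


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic per-race MMR pools for illustration only, NOT the thread's data.
    pools = {
        "T": [rng.gauss(1500, 400) for _ in range(2000)],
        "Z": [rng.gauss(1550, 400) for _ in range(2000)],
        "P": [rng.gauss(1540, 400) for _ in range(2000)],
    }
    for x, y in (("T", "Z"), ("T", "P"), ("P", "Z")):
        t = welch_t(pools[x], pools[y])
        print(f"{x} vs {y}: t = {t:.2f}, differs at .01: {differs(pools[x], pools[y])}")
```

With real data, each race's pool would hold one MMR value per account, and the caveats about normality and about reading a difference as imbalance still apply.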
|
On July 13 2012 05:29 lolcanoe wrote: [...]
I'm all for the rigorous application of statistics, but I think you may be a bit too high up in your ivory tower here. This is clearly not publishable work; on that everyone seems to agree. It is, however, clever manipulation of the data which returned some interesting results. There comes a point where you need to accept this for what it is, because you aren't getting anywhere with what you're doing.
|
On July 13 2012 05:52 lolcanoe wrote: On July 13 2012 05:38 skeldark wrote: I add the random groups to show that my data is not just a random point, because it is outside of the range that random points would produce. I try to come up with simple examples and I just don't know how to explain it differently than I did.
Jesus christ I don't think an explanation could be more obfuscated than that. On July 13 2012 05:38 skeldark wrote: Your questions are kind of random. I can't explain every sentence I wrote in the OP to you and then have you ask me to explain every sentence of the explanation. I think I explained decently what I did. I don't see any way to explain it better. I even made a simple example of the calculation. The questions are not random. What's random is your procedure. Your tests are inconsistent with any conventional test and the conclusions are based on pseudo-stats at best. In light of the bullshit here, let me post very very concisely what you can do to the data to make this test statistically more sound.
Assume normality - but add a disclaimer that normality is an assumption that you do not take lightly.
Take T, Z, P data pools and calculate the following:
Mean: Average MMR of each race.
SD: Use a SD calculator to find the standard deviations of each race.
Test two races at a time (start with T vs Z, etc), using the 2 sample t test. See http://ccnmtl.columbia.edu/projects/qmss/the_ttest/twosample_ttest.html
Test the t-statistic against a P of .01.
Post the data with caveats about the conclusion and caveats about the normality assumption.
- Done.
Nice!
Post when you are done. Datasource is linked in the op.
You could already have done it in all the time you spent complaining about what I did. But wait, telling other people what you think they should do is more fun than doing the work yourself, right?
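For readers trying to follow the "random groups" argument quoted above, here is a rough sketch of what such a check could look like. It is only an interpretation of the description in this thread, not the OP's actual code: the function names and the toy numbers are made up for illustration, and the pool is synthetic rather than the thread's data file.

```python
import random

def group_mean_offset(group, everyone):
    """Difference between a group's mean and the mean of the whole pool."""
    return sum(group) / len(group) - sum(everyone) / len(everyone)

def random_group_range(everyone, group_size, trials=1000, seed=0):
    """Smallest and largest offset seen among `trials` randomly drawn groups."""
    rng = random.Random(seed)
    offsets = [
        group_mean_offset(rng.sample(everyone, group_size), everyone)
        for _ in range(trials)
    ]
    return min(offsets), max(offsets)

# Made-up toy pool, NOT the thread's data: one group is deliberately shifted down.
toy_rng = random.Random(1)
others = [toy_rng.gauss(1600, 500) for _ in range(2000)]
shifted = [toy_rng.gauss(1400, 500) for _ in range(1000)]
everyone = others + shifted

observed = group_mean_offset(shifted, everyone)
low, high = random_group_range(everyone, len(shifted))
print(f"observed offset: {observed:.1f}")
print(f"random groups:   {low:.1f} .. {high:.1f}")
```

If the observed offset of a race lies well outside the range that randomly drawn groups of the same size produce, the offset is unlikely to be a pure sampling accident. This is essentially a randomization test, which is a different route to the same question the t-test addresses.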
|
Excellent work. It's cool to see that at a certain skill range one race may appear more dominant. But with improvement of your own skill comes balance. Nice to see.
|
On July 13 2012 05:53 VediVeci wrote: Im all for the rigorous application of statistics, but I think you may be a bit too high up in your ivory tower here. This is clearly not publishable work, on that everyone seems to agree. It is however, clever manipulation of the data which returned some interesting results. There comes a point where you need to accept this for what it is because you aren't getting anywhere with what you're doing.
If by clever manipulation you mean random incoherent manipulation then fine. But there is no ivory tower here - all these basic tests are taught in every college-level stats course.
The t-test, SD, mean, and normality caveats aren't complex concepts that require much explanation.
As for running the test, I've only been avoiding it because the data itself has gone through so many changes and critiques over the past days. No point in running the test on data that is still being accumulated or filtered.
Skeldark, assure me that the datafile in the OP is up to date with the most recent corrections and I'll do the test as soon as I get back from work.
|
On July 13 2012 06:10 lolcanoe wrote: On July 13 2012 05:53 VediVeci wrote: Im all for the rigorous application of statistics, but I think you may be a bit too high up in your ivory tower here. This is clearly not publishable work, on that everyone seems to agree. It is however, clever manipulation of the data which returned some interesting results. There comes a point where you need to accept this for what it is because you aren't getting anywhere with what you're doing. If by clever manipulation you mean random incoherent manipulation then fine. But there is no ivory tower here - all these basic tests are taught in every college-level stats course. The t-test, SD, mean, and normality caveats aren't complex concepts that require much explanation. As for running the test, I've only been avoiding it because the data itself has gone through so many changes and critiques over the past days. No point in running the test on data that is still being accumulated or filtered. Skeldark, assure me that the datafile in the OP is up to date with the most recent corrections and I'll do the test as soon as I get back from work.
It was updated 1h ago; it is a comma-separated CSV file. Apart from that it has not changed since I created the topic, so search for another excuse. I also missed the point where you asked me for a stable file so you could run some tests. I think I missed it among your complaints.
The source file is never filtered; the filters are applied after the file is saved.
source file
Good luck.
|
On July 13 2012 05:03 skeldark wrote: On July 13 2012 05:00 hunts wrote: Sorry if you answered this before somewhere, but do you have a rough estimate of the MMR ranges of each league? I'd really like to see that as it would be very interesting. Depends on the server (US / EU). #START PROMOTE_OFFSETS bronce - master 0,754,1050,1280,1536,1993, On July 13 2012 05:00 BBS wrote: I failed to read about how you canceled out effects like disconnects or other cases in which the game is not representative :o I don't think that disconnects are race-dependent. Why should one race have more disconnects than another?
Because the sample sizes of your Terran/Protoss/Zerg groups aren't really close to equal, so if Terran (the smaller group) has the same number of disconnects as Protoss or Zerg, then it affects Terran more?
Or since disconnects discourage play time, I can see how it's mostly meaningless, since you had more active members participating than inactive.
|
On July 13 2012 06:19 furerkip wrote: On July 13 2012 05:03 skeldark wrote: On July 13 2012 05:00 hunts wrote: Sorry if you answered this before somewhere, but do you have a rough estimate of the MMR ranges of each league? I'd really like to see that as it would be very interesting. Depends on the server (US / EU). #START PROMOTE_OFFSETS bronce - master 0,754,1050,1280,1536,1993, On July 13 2012 05:00 BBS wrote: I failed to read about how you canceled out effects like disconnects or other cases in which the game is not representative :o I don't think that disconnects are race-dependent. Why should one race have more disconnects than another? Because the sample sizes of your Terran/Protoss/Zerg groups aren't really close to equal, so if Terran (the smaller group) has the same number of disconnects as Protoss or Zerg, then it affects Terran more?
If you assume that the smaller group, Terran, has the same number of disconnects as the bigger group, Zerg, you are assuming that Terran has a higher percentage of disconnects than Zerg.
Like I said, I just don't see why the race you play should affect the number of disconnects.
But my program doesn't track this anyway:
if you disconnect or leave the game too fast, my program can't detect the game and will not work ^^
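As a side note on the promotion offsets quoted in this exchange: if they are read as the lower MMR bound of each league, they can be turned into a small lookup. This is only a sketch under that reading. The league names are an assumption (the post only says "bronce - master"), `league_for` is a made-up helper, and the values reportedly differ between the US and EU servers.

```python
from bisect import bisect_right

# Offsets as quoted in the thread; they reportedly differ between US and EU.
PROMOTE_OFFSETS = [0, 754, 1050, 1280, 1536, 1993]
# League names are an assumption: the post only says "bronce - master".
LEAGUES = ["Bronze", "Silver", "Gold", "Platinum", "Diamond", "Master"]

def league_for(mmr: float) -> str:
    """Return the highest league whose offset is at or below `mmr`."""
    index = bisect_right(PROMOTE_OFFSETS, mmr) - 1
    return LEAGUES[max(index, 0)]  # anything below the first offset counts as Bronze

print(league_for(1600))  # Diamond
print(league_for(1993))  # Master
```

Grandmaster is left out on purpose, since no reliable offset for it is given in this thread.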
|
On July 13 2012 05:51 skeldark wrote: If you mean an Maximum MMR we are not sure if it exist. When it exists this would be terrible because it would destroy the skillfunction on the high end. We just hope blizzard is clever enough not to build one in or build it only in the match finding algorithm.
The designer in the UCI video clearly stated that a maximum MMR cap exists, and it appears to be an actual MMR cap, not just a tweak to match-finding. I'm guessing there must be a reason that they strictly capped the MMR.
What I've speculated in the past is that an MMR cap at the top is required to balance off the fact that there's now an MMR floor at 0, because there were bugs in early seasons relating to Bronze players winding up with negative MMRs and never being able to go positive again. Capping the MMR at both ends may be necessary to prevent a floor from causing constant MMR inflation. I don't know whether that's actually the case, though, it's just a guess.
|
Hey skeldark just wanted to say thanks for all the work you've done and the info that you've provided. Great thread.
|
On July 13 2012 06:28 Lysenko wrote: On July 13 2012 05:51 skeldark wrote: If you mean an Maximum MMR we are not sure if it exist. When it exists this would be terrible because it would destroy the skillfunction on the high end. We just hope blizzard is clever enough not to build one in or build it only in the match finding algorithm. The designer in the UCI video clearly stated that a maximum MMR cap exists, and it appears to be an actual MMR cap, not just a tweak to match-finding. I'm guessing there must be a reason that they strictly capped the MMR. What I've speculated in the past is that an MMR cap at the top is required to balance off the fact that there's now an MMR floor at 0, because there were bugs in early seasons relating to Bronze players winding up with negative MMRs and never being able to go positive again. Capping the MMR at both ends may be necessary to prevent a floor from causing constant MMR inflation. I don't know whether that's actually the case, though, it's just a guess.
They fixed the 0 floor differently, with a very stupid patch. Your MMR can still go under 0; the only difference is that the system doesn't show it to you. The problem is not only in Bronze tier 0 but in all tiers below Master! It is done so that no one has to feel bad because he sees 0 points. The adjusted points don't follow the skill points anymore. This also makes sure the f function always gives you points. It was never a problem of the skill system; it's a problem of the system that hides the MMR. We had to throw half of our data away because of this, and it delayed our work, because no one knew of this tier cap before and we calculated wrong results the whole time.
The reason they would patch in a high MMR cap is that top-level players don't find any opponents, because the next opponent online is out of range for the match-finding system.
This, however, would screw up the whole skill-point scale at the top! And the only reasons someone would do such a patch are:
- being too lazy to write a patch for the match-finding system
- not being aware of how bad a cap would be for the system
A clever solution would be:
- change the match-finding system so that it is looser at the top, or, if you are lazy,
- apply the cap only inside the match-finding system, when you pass it the players' skill points, and don't change the skill rating system at all
Because they only changed the frontend (player points in the bnet profile) for the under-0 problem, I think they are clever enough not to cap in the backend for the high-MMR problem.
My theory is that they capped in the beginning and, when they realised their mistake, brought out the "loose matchmaking" patch 1 month ago. But this patch was also needed because of the drastic drop in SC2 players.
My data so far shows no clear cap at the top, but I can't say for sure yet.
On July 13 2012 06:48 geoIOPS wrote: Hey skeldark just wanted to say thanks for all the work you've done and the info that you've provided. Great thread. thank you.
|
On July 13 2012 06:50 skeldark wrote: Your mmr can still go under 0 the only diffrence is , he dont show it to you. The problem is not only for tier 0 bronze, for all tiers under master! So noone have to feel bad because he see 0 points.
No, you don't understand. The 0 point I'm talking about was the 0 point for the underlying MMR value, not the displayed point value. This only affected low Bronze, and caused a bug where low Bronze players could not ever increase their MMR, because negative MMR values were misinterpreted as very large positive values when the system would update MMR scores after a game. This has been fixed by placing a hard floor at 0 MMR (which is not the same as 0 displayed points and does NOT affect anyone in any higher league.)
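To illustrate the kind of bug described here, the following sketch shows how a negative value turns into a huge positive one when it is read as an unsigned integer, and how a hard floor avoids that. The 32-bit width is purely an assumption for the illustration; nothing in this thread says how Blizzard actually stores MMR. The 16-point loss is the average per-game change quoted in the OP.

```python
def as_uint32(value: int) -> int:
    """Reinterpret a signed integer as an unsigned 32-bit value (two's complement)."""
    return value & 0xFFFFFFFF

def apply_floor(value: int, floor: int = 0) -> int:
    """Clamp a rating so that it never drops below `floor`."""
    return max(value, floor)

mmr = 10
after_loss = mmr - 16           # -6 as a signed number

print(as_uint32(after_loss))    # 4294967290: a "very large positive value"
print(apply_floor(after_loss))  # 0: the hard floor described above
```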
My data shows so far no clear cap at the top. But cant say for sure yet
Whatever your data says, the designer is on record stating that there is in fact an actual MMR cap. If you come to the conclusion that your data says otherwise, you're reading it wrongly.
Incidentally, the issues with GM league started when they instituted this cap, probably because the cap affected more than the top 200 players.
The reason that I speculate that inflation may be the reason for the MMR cap is that by placing a hard floor at 0, players who lose but do not go below that floor will nevertheless be adding more MMR points into the system every time they lose, which essentially causes upward pressure on all the other scores. Adding a complementary cap at the high end would cancel this out. This is true whether MMR is zero sum or whether there are other mechanisms in place to handle inflation.
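The inflation argument can be checked on a toy model. The sketch below assumes a plain zero-sum exchange of 16 points per game (the average change quoted in the OP), which is a simplification and not Blizzard's actual update rule; `play` and its parameters are made up for the illustration.

```python
def play(ratings, winner, loser, stake=16, floor=0, cap=None):
    """Move `stake` points from loser to winner, respecting a floor and an optional cap."""
    ratings[loser] = max(ratings[loser] - stake, floor)
    gained = ratings[winner] + stake
    ratings[winner] = gained if cap is None else min(gained, cap)

# Floor only: the loser is already at 0, so the winner's 16 points are new.
floor_only = {"low": 0, "high": 100}
play(floor_only, winner="high", loser="low")
print(sum(floor_only.values()))  # 116, up from 100

# Floor and cap: the winner is already at the cap, so nothing is created here.
floor_and_cap = {"low": 0, "high": 100}
play(floor_and_cap, winner="high", loser="low", cap=100)
print(sum(floor_and_cap.values()))  # 100
```

Whether a cap cancels the floor's effect across a whole ladder depends on how many games hit each bound, so this only shows the mechanism, not the net result.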
Incidentally, the developers of this system are not lazy, and they do simulate extensively the effects of their changes before implementing them. However, they do have to balance a large number of competing design goals, which means any choice they make won't be perfect.
|
|
US DATA ONLY
Average MMR and STD per race:
Terran: 1559.214909, 546.131097
Protoss: 1620.764863, 509.5809733
Zerg: 1672.129547, 495.3121321
TWO SAMPLE T-TEST RESULTS
T vs Z: t-stat = -5.693, p = .0000001386
P vs Z: t-stat = -2.872, p = 0.00472
T vs P: t-stat = -3.03, p = .00238
Histogram of T MMR for normality check:
![Histogram of T MMR](http://i.imgur.com/FNzvx.png)
Anderson-Darling Test for Normality (T only)
![Anderson-Darling test for normality, T only](http://i.imgur.com/7kVqA.png)
With a p slightly greater than .05, we cannot reject normality of the data. However, the weakness of this statistic indicates that normality should be scrutinized in the interpretation.
Assumptions
- MMR is an independent, fair indicator of skill.
- MMR is approximately normal.
- There is no sampling bias between races, however there is a sampling bias towards higher average skill.
- Cause-effect cannot be established by this test.
With over 99% confidence, we can reject the null hypothesis that the averages are equal in all 3 matchups. This is not surprising given the quantity of data, in addition to a maximum 7% difference between T and Z in average MMR.
The data for T appears approximately normal, but the study does not conclusively show that MMR is normal.
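For anyone who wants to redo this kind of test from summary statistics, here is a minimal sketch. It uses the Welch form of the two-sample t-statistic and a normal approximation for the p-value, which are choices made for this illustration and not necessarily what was used for the numbers above. The sample sizes are placeholders, because they are not stated in this post, so the output will not reproduce the posted t-stats.

```python
import math

def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Two-sample t-statistic without assuming equal variances (Welch)."""
    standard_error = math.sqrt(sd_a**2 / n_a + sd_b**2 / n_b)
    return (mean_a - mean_b) / standard_error

def two_sided_p_normal(t):
    """Two-sided p-value using the normal approximation (reasonable for large n)."""
    return math.erfc(abs(t) / math.sqrt(2))

# Means and SDs are the posted US values; the sample sizes are PLACEHOLDERS,
# because the post does not state them.
N_TERRAN = 1000  # hypothetical
N_ZERG = 1000    # hypothetical
t = welch_t(1559.214909, 546.131097, N_TERRAN, 1672.129547, 495.3121321, N_ZERG)
p = two_sided_p_normal(t)
print(f"t = {t:.3f}, p = {p:.2e}")
```

With the real per-race sample sizes from the data file plugged in, the same two functions give a quick cross-check of the T vs Z, P vs Z and T vs P comparisons.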
|
On July 13 2012 07:33 Lysenko wrote: On July 13 2012 06:50 skeldark wrote: Your mmr can still go under 0 the only diffrence is , he dont show it to you. The problem is not only for tier 0 bronze, for all tiers under master! So noone have to feel bad because he see 0 points. No, you don't understand. The 0 point I'm talking about was the 0 point for the underlying MMR value, not the displayed point value. This only affected low Bronze, and caused a bug where low Bronze players could not ever increase their MMR, because negative MMR values were misinterpreted as very large positive values when the system would update MMR scores after a game. This has been fixed by placing a hard floor at 0 MMR (which is not the same as 0 displayed points and does NOT affect anyone in any higher league.) Whatever your data says, the designer is on record stating that there is in fact an actual MMR cap. If you come to the conclusion that your data says otherwise, you're reading it wrongly. Incidentally, the issues with GM league started when they instituted this cap, probably because the cap affected more than the top 200 players. The reason that I speculate that inflation may be the reason for the MMR cap is that by placing a hard floor at 0, players who lose but do not go below that floor will nevertheless be adding more MMR points into the system every time they lose, which essentially causes upward pressure on all the other scores. Adding a complementary cap at the high end would cancel this out. This is true whether MMR is zero sum or whether there are other mechanisms in place to handle inflation. Incidentally, the developers of this system are not lazy, and they do simulate extensively the effects of their changes before implementing them. However, they do have to balance a large number of competing design goals, which means any choice they make won't be perfect.
I do understand very well, and I know what I'm talking about.
The 0 line problem was not just an unsigned int problem. If that were the case, the fix would be 1 line of code.
About the cap that exists in all leagues: the problem is that the f function awards points depending on the opponent's adjusted points. If your MMR is low, it still gives a minimum point change, so you always gain points even if you don't deserve them. This way the system makes sure you stay above 73 points at all times. Without it you would fall to 0 (bad, because people cry), and the f function would give you so few points for a win that you would lose them again quickly and stay at 0 (people cry more). The MMR itself is not affected because it is above 0, but because Blizzard subtracts the tier offset from it, the adjusted points follow the DMMR, and that value can easily go negative! The cap makes sure the adjusted points cannot follow the DMMR below 73. In Bronze tier 0 there is nothing left to subtract, but you still stay above 73.
And this is not the only bug in the system. They have more and tried to fix them. On a few they gave up because they realised that the fix would cause other problems.
All this only to hide the MMR.
We can neither confirm nor deny a 0 line, because Blizzard allows no one to go under the 73 mark of his tier, so 73 is the minimum MMR we can see in our data.
About the Blizzard video: if we learned one thing, it is not to trust Blizzard! In the beginning we calculated with the numbers Blizzard published on Battle.net about the leagues. We took them as an indicator until we realised they cannot be true. It's just plain impossible: if they are true, you run into logical contradictions. For a very long time we thought the mistake was on our side.
Now that we have the real offsets, we can tell that the published numbers are totally wrong. I don't know if they wanted to fool us, if they made a mistake (they publish them every season, though), or if someone posted them who did not know what he was talking about.
And would you say in an interview:
"We invented this very complex system so we can hide the real skill from the user, but this system had a lot of bugs, so we had to put a cap in this system so people don't drop even if their skill value drops!" Or would you say: "We made a cap for MMR at 0"? And that's not even wrong. They made a cap, but for DMMR. But this guy will for sure not talk about DMMR and how they hide information from the user.
I can say for sure that if there is an MMR cap at the top, no player I track has reached it yet! Pro players still have different values on the ladder (and I have most of them in my DB). Look at the data file; even in the small one I use here you can see it.
---
I know you don't trust my statistics skills. But when it comes to the ladder system, I have studied it for months and I know a lot about it. All the arguments you bring up I have already discussed with others for a very long time.
You, however, saw the video that we found a long time ago. It is one of our many sources on MMR and by far not the best one.
|
On July 13 2012 06:10 lolcanoe wrote: On July 13 2012 05:53 VediVeci wrote: Im all for the rigorous application of statistics, but I think you may be a bit too high up in your ivory tower here. This is clearly not publishable work, on that everyone seems to agree. It is however, clever manipulation of the data which returned some interesting results. There comes a point where you need to accept this for what it is because you aren't getting anywhere with what you're doing. If by clever manipulation you mean random incoherent manipulation then fine. But there is no ivory tower here - all these basic tests are taught in every college-level stats course. The t-test, SD, mean, and normality caveats aren't complex concepts that require much explanation. As for running the test, I've only been avoiding it because the data itself has gone through so many changes and critiques over the past days. No point in running the test on data that is still being accumulated or filtered. Skeldark, assure me that the datafile in the OP is up to date with the most recent corrections and I'll do the test as soon as I get back from work.
Requiring someone to have a college education is a bit of an ivory tower buddy. And his manipulations were clever, even if they were wrong.
|
|
On July 13 2012 08:13 VediVeci wrote: Requiring someone to have a college education is a bit of an ivory tower buddy.
I'm not requiring anyone to have anything. My criticisms are objectively based on the analysis and not the source.
There is no ivory tower here. I've proven that my methods can be applied in a statistically coherent and easily understandable way, so your accusations that my suggestions are impractical (or "ivory tower") are pretty moot.
|
On July 13 2012 08:13 VediVeci wrote: On July 13 2012 06:10 lolcanoe wrote: On July 13 2012 05:53 VediVeci wrote: Im all for the rigorous application of statistics, but I think you may be a bit too high up in your ivory tower here. This is clearly not publishable work, on that everyone seems to agree. It is however, clever manipulation of the data which returned some interesting results. There comes a point where you need to accept this for what it is because you aren't getting anywhere with what you're doing. If by clever manipulation you mean random incoherent manipulation then fine. But there is no ivory tower here - all these basic tests are taught in every college-level stats course. The t-test, SD, mean, and normality caveats aren't complex concepts that require much explanation. As for running the test, I've only been avoiding it because the data itself has gone through so many changes and critiques over the past days. No point in running the test on data that is still being accumulated or filtered. Skeldark, assure me that the datafile in the OP is up to date with the most recent corrections and I'll do the test as soon as I get back from work. Requiring someone to have a college education is a bit of an ivory tower buddy. And his manipulations were clever, even if they were wrong.
Don't worry, I'm very confident I'm better educated than him. And I did not manipulate any data. You can argue that you want more, but not that it is wrong. It did not follow the standard way, and many people did not understand what I did, but not understanding something is not valid proof that it is wrong.
With over 99% confidence, we can reject the null hypothesis that the averages are equal in all 3 matchups. This is not surprising given the quantity of data, in addition to a maximum 7% difference between T and Z in average MMR.
Time for an excuse, isn't it?
On July 13 2012 08:21 lolcanoe wrote: On July 13 2012 08:13 VediVeci wrote: Requiring someone to have a college education is a bit of an ivory tower buddy. I'm not requiring anyone to have anything. My criticisms are objectively based on the analysis and not the source. There is no ivory tower here. I've proven that my methods can be applied in a statistically coherent and easily understandable way, so your accusations that my suggestions are impractical (or "ivory tower") are pretty moot.
You did not only attack the analysis. You attacked every word I wrote, and in a very BM way from the beginning. You came into this thread and ranted nonstop about what you don't like and what you want me to do, and acted like I have to do the things you want.
Having said all this:
Thank you very much for your analysis. I will link it in the OP!
|
On July 13 2012 08:14 monkybone wrote: On July 13 2012 07:41 lolcanoe wrote: US DATA ONLY Terran Average MMR, STD 1559.214909, 546.131097 Protoss Average MMR, STD 1620.764863, 509.5809733 Zerg Average MMR, STD 1672.129547, 495.3121321 TWO SAMPLE T-TEST RESULTS T-Stat, T vs Z T-Stat = -5.693 P = .0000001386 T-Stat, P vs Z T-Stat : -2.872 P = 0.00472 T-Stat, T vs P Tstat = -3.03 p = .00238 Histogram of T MMR for normality check: ![Histogram of T MMR](http://i.imgur.com/FNzvx.png) Anderson-Darling Test for Normality (T only) ![Anderson-Darling test for normality, T only](http://i.imgur.com/7kVqA.png) With a p slightly greater than .05, we cannot reject normality of the data. However, the weakness of this statistic indicates that normality should be scrutinized in the interpretation. Assumptions - MMR is an independent, fair indicator of skill. - MMR is approximately normal. - There is no sampling bias between races, however there is a sampling bias towards higher average skill. - Cause-effect cannot be established by this test. With over 99% confidence, we can reject the null hypothesis that the averages are equal in all 3 matchups. This is not surprising given the quantity of data, in addition to a maximum 7% difference between T and Z in average MMR. The data for T appears approximately normal, but the study does not conclusively show that MMR is normal. Very nice. But why does it give such different results? As far as I can see there is over a 100 MMR difference between Z and T here.
US only.
|
On July 13 2012 02:09 monkybone wrote: On July 12 2012 20:26 lazyitachi wrote: Can't believe this thread is still going on. Problem statement: OP wants to measure e-peen size and compare between different races.
Data gathering: OP constructs a e-peen measuring tool that user directly inserts on a voluntary basis. User base are high hormone individuals who are very interested to know how big their e-peen is. Therefore, these individuals are most likely already at a higher percentile of e-peen length compared to the general population.
Measurement: Users measure e-peen whenever they are playing by themselves. This play will be contested with another user. The winner will have longer e-peen and vice versa thus the true size of e-peen is estimated based on such repetition. Some e-peen have been observed to fluctuate in size up to 1000 inches in a few days. How can it be? Can the true size of e-peen be so volatile? Should it not be stable? Seems like some measurement are taken while in the state of flaccidity.
If indeed the measurement needs multiple observation to settle on a true e-peen then does that mean any single observations is then unreliable and not credible? But we like e-peen so the more is better. Let's not care about that.
Methodology: The statistics earlier is based on the average over a long period of time before the individual has truly established his true e-peen thus if there are any upwards biased (likely because they are all looking to gain the next level of e-peen recognition), the value will be severely underestimated. Not to mention those single observations from more outdated time. Later it is changed to be the latest e-peen measures. Pssh.. we should ignore testing anyway let alone use the correct test tool.
Summary: Human have the shortest e-peens. Humanoid aliens bits and bug tentacles have imbalanced in size.
Wow, what an ignorant post...
Please tell me what is ignorant about it. It quite accurately describes what he did and wants to prove: "Ladder Balance". His MMR reliability is not validated anyway; it is just his own conjecture at this point. It's not like he is measuring and comparing against any actual number, and neither can he prove that one observation can yield the correct number. If it could, then there is no way someone's "MMR" could change by a significant degree, which means his method is flawed or he did not properly account for unusable data. Given that so much of the data is single observations, quite frankly his whole calculation is unreliable anyway.
|
On July 13 2012 08:24 skeldark wrote: dont worry im very confident im better educated than him. and i did not manipulate any data also you can argue you want more but not that it is wrong. It did not follow the standard way and many people did not understood what i did but not understanding is not a valid prove for wrong.
The standard "way" is standard for a reason - you set baseline standards for comparisons so that you can make meaningful conclusions. Conclusions that both the community and fellow statisticians can agree with. If you want to move away from a standard you better have a spectacular reason to do so, and I just haven't seen strong evidence that conventional methods aren't going to work here.
Likewise, my education is really not too important - but since you brought it up, I work at one of the US's top credit rating firms, have a finance/math background, and performed excellently in statistics coursework. Inconsistently presented statistics would ultimately have large implications for anyone in my career field (the SEC would probably come after us). For investment banks, small miscalculations in risk statistics, like the underestimate of tail risk in the housing crisis, could result in billions in losses. So unless you have a PhD in math or specialized in stats throughout college, I doubt you'll have a better grasp of the concepts here. Well actually, I doubt it simply because of the way you attempted the analyses.
On July 13 2012 08:24 skeldark wrote: time for an excuse isnt it?
Sorry? I'm not sure I understand you, which appears to be a recurring theme here. Are you making fun of the fact that I qualified my conclusions with significant data or are you congratulating yourself for guessing correctly the first time around?
On July 13 2012 08:24 skeldark wrote: You did not only attack the analyse. You attacked every word i wrote and that in a very bm way from the beginning. You came in this thread and nostop rant what you dont like and what you want me to do and act like i have to do stuff that you want to have.
Misuse of statistics is a huge problem and creates unnecessary skepticism towards legitimate conclusions. Just look around at how many people in this thread immediately turned their heads and said "it's just stats, i'd rather make arbitrary judgments on how pros are doing". You have to draw the right conclusions from the right data. My "attacks" were pretty concise, my concerns were legitimate and fixable, and I put my money where my mouth was by proving it. Now you can say what you wanted to say with a much higher degree of precision and certainty.
As far as the tone of my remarks? I share the blame for letting things get personal. But as friendly advice I've learned (the hard way) in my workplace, there is an inverse correlation between the receptiveness of the presenter and the harshness of the critiques received.
|
|
On July 13 2012 05:51 skeldark wrote: On July 13 2012 05:47 hunts wrote: On July 13 2012 05:03 skeldark wrote: On July 13 2012 05:00 hunts wrote: Sorry if you answered this before somewhere, but do you have a rough estimate of the MMR ranges of each league? I'd really like to see that as it would be very interesting. Depends on the server (US / EU). #START PROMOTE_OFFSETS bronce - master 0,754,1050,1280,1536,1993, On July 13 2012 05:00 BBS wrote: I failed to read about how you canceled out effects like disconnects or other cases in which the game is not representative :o I don't think that disconnects are race-dependent. Why should one race have more disconnects than another? TY, is there any information on how high masters MMR goes or about where GM starts? Or is that not available due to the nature of GM. I believe there's still a cap on masters MMR though, wonder if that number is available somewhere. ^^ I edited it out of the copy-paste because this value changes all the time and is not very accurate. +800 on Master is a good rule of thumb for GM. But GM is... fucked up. It's not top 200 at all. It's just a bad system. There is no cap for Master MMR; only lower leagues have a cap so that they cannot fall under 0 of their tier. If you mean a maximum MMR, we are not sure if it exists. If it exists, this would be terrible because it would destroy the skill function at the high end. We just hope Blizzard is clever enough not to build one in, or to build it only into the match-finding algorithm.
If I'm not mistaken there is an upper MMR cap though. They introduced one after the incident with HuK playing on TLO's account because it took him too long to find games at his MMR, but I'm not sure if they ever took it out.
|
On July 13 2012 10:03 lolcanoe wrote: Misuse of statistics is a huge problem and creates unnecessary skepticism toward legitimate conclusions. Just look around at how many people in this thread immediately turned their heads and said "it's just stats, I'd rather make arbitrary judgments on how pros are doing". You have to draw the right conclusions from the right data. My "attacks" were pretty concise, my concerns were legitimate and fixable, and I put my money where my mouth was by proving it. Now you can say what you wanted to say with a much higher degree of precision and certainty.
This deserves to be repeated.
I'm a little taken aback at the OP accusing the statistician at Blizzard who developed the bulk of the system they use now of actively misrepresenting how their system works. If secrecy about their methods were so important to them, he simply wouldn't be out in public talking about their system.
I'm not saying that there's no value in what these guys have done, but the fact that there's a multilayer system of MMR -> MMR-derived adjusted point equivalent value -> adjusted points -> displayed points offers a lot of opportunity for mistaken interpretations about the deeper levels to creep in when analyzing what one can see on the surface.
|
On July 13 2012 10:11 hunts wrote: If I'm not mistaken there is an upper MMR cap though. They introduced one after the incident with huk playing on TLOs account because it took him too long to find games with his MMR, but I'm not sure if they ever took it out.
There is absolutely, positively a hard cap on Blizzard's matchmaking rating number. That's been stated explicitly by the developers. That doesn't mean there's necessarily a cap on the "MMR" value that the OP has derived from what's visible in the system, but he seems to not agree with the idea that those two numbers may not be equivalent.
As I indicated before, I'll have to wait until this weekend to put the time in to decide what I really think about this. Until then, though, I'll believe the developers over the OP on matters where they differ.
|
First off I'd like to point out that the normality of the data doesn't really matter because of the Central Limit Theorem, so please stop discussing that like it matters.
Continuing with lolcanoe's analysis, I found the 99% confidence intervals for the difference in mean for each group.
Race Results
For US: ZvT (62.0, 164.6) PvT (8.9, 115.0) ZvP (3.3, 99.4)
For EU: ZvT (19.6, 108.6) PvT (18.3, 113.2) ZvP (-45.3, 42.0)
US and EU: ZvT (51.5, 118.8) PvT (28.9, 99.6) ZvP (-11.1, 53.2)
As for US vs EU, the 99% confidence interval for the mean difference in MMR is: (21.9, 77.1)
For each interval a positive difference indicates the mean of the first population is higher than the second, so for US vs EU it reads: at 99% confidence, the mean MMR of the US player base is between 21.9 and 77.1 MMR higher than that of the EU player base.
The meaning of a 99% confidence interval for the difference of means is as follows: if we were to repeatedly pick random samples of the same size* from each population and construct an interval from each pair of samples in the same way, about 99% of those intervals would contain the true difference of the population means.
*By same size I mean the same sizes as were sampled to construct the interval, so if the interval were constructed by sampling 10 Zergs and 15 Protosses, it would be random samples of 10 and 15, respectively.
I've provided the MATLAB code I used for the analysis if anyone can run it and wants to do analysis on future data:
Helper Function
function [lower,upper] = findInterval(pop1,pop2,confidence)
% Welch (unequal variance) confidence interval for mean(pop1) - mean(pop2)
mu1 = mean(pop1); mu2 = mean(pop2);
s1 = std(pop1); s2 = std(pop2); % sample standard deviation (N-1 normalization)
n1 = length(pop1); n2 = length(pop2);
delta = mu1 - mu2; % named delta to avoid shadowing the built-in diff
% Welch-Satterthwaite degrees of freedom
df = (s1^2/n1 + s2^2/n2)^2/((s1^2/n1)^2/(n1-1)+(s2^2/n2)^2/(n2-1));
tcrit = tinv(1-(1-confidence)/2,df);
halfrange = tcrit*sqrt(s1^2/n1 + s2^2/n2);
lower = delta-halfrange;
upper = delta+halfrange;
end
Main script
%script for calculating balance

%get data from file (would be ez if OP hadn't put quotes in the .csv, BAD!)
fid = fopen('balance.csv');
str = char(fread(fid))';
fclose(fid);

omitFirstLine = '(?<=\n).*';
stripped = str( regexp(str,omitFirstLine,'once'):end ); %strip first line
rawdata = textscan(stripped, '%s %s %d', 'delimiter',' \t\n,"',...
    'MultipleDelimsAsOne', 1);

%define some constants (not saying protoss #1)
protoss=1; zerg=2; terran=3;
US = 1; EU = 2;

%combine into one big array: column 1 = server, 2 = race, 3 = MMR
col = length(rawdata{3}); %number of accounts
data = zeros(col, 3);
data(:,3) = rawdata{3};
for i=1:col
    if ( rawdata{1}{i}(1) == 'U')
        data(i,1) = US;
    else
        data(i,1) = EU;
    end
    if ( rawdata{2}{i} == 'z')
        data(i,2) = zerg;
    elseif ( rawdata{2}{i} == 'p')
        data(i,2) = protoss;
    else
        data(i,2) = terran;
    end
end

%define filters
tF = data(:,2) == terran;
pF = data(:,2) == protoss;
zF = data(:,2) == zerg;
uF = data(:,1) == US;
eF = data(:,1) == EU;

%construct the 99% confidence intervals based on a two-sided t-test
confidence = 0.99;
place = eF | uF; %lets you quickly change if US, EU, or both (uF | eF)
%zerg vs protoss
[zpLower,zpUpper] = findInterval( data(zF & place,3), data(pF & place,3),confidence );
%zerg vs terran
[ztLower,ztUpper] = findInterval( data(zF & place,3), data(tF & place,3),confidence );
%protoss vs terran (protoss first, so the sign matches the PvT results above)
[ptLower,ptUpper] = findInterval( data(pF & place,3), data(tF & place,3),confidence );
%US vs EU
[UsEuLower,UsEuUpper] = findInterval( data(uF,3), data(eF,3), confidence);
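For anyone without MATLAB, here is a minimal Python sketch of the same Welch-style interval. It is not the code that produced the numbers above: it uses only the standard library and swaps the t critical value for a normal one, an assumption that is only reasonable for large samples like the thousands of accounts in the list. The name welch_interval and the toy numbers are made up for illustration; the tiny lists only show the call shape.

```python
import math
from statistics import NormalDist, mean, variance


def welch_interval(pop1, pop2, confidence=0.99):
    """Approximate CI for mean(pop1) - mean(pop2) with unequal variances.

    Assumption: samples are large, so the normal critical value stands in
    for the t critical value used in the MATLAB helper.
    """
    n1, n2 = len(pop1), len(pop2)
    # variance() is the sample variance (divides by n - 1)
    se = math.sqrt(variance(pop1) / n1 + variance(pop2) / n2)
    zcrit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    delta = mean(pop1) - mean(pop2)
    return delta - zcrit * se, delta + zcrit * se


# Toy data, not ladder data
zerg = [1500, 1620, 1710, 1580, 1660, 1490, 1750, 1600]
terran = [1450, 1530, 1600, 1480, 1570, 1420, 1640, 1510]
lower, upper = welch_interval(zerg, terran, 0.99)
print(f"ZvT 99% interval: ({lower:.1f}, {upper:.1f})")
```

As in the MATLAB version, a positive interval means the first group has the higher mean.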
|
Skeldark, don't get down on those who are using weak and illogical arguments to disprove your work. Your work is top notch and objective and I don't see any flaws.
The nerfs have taken their toll on Terran, making Terran the weakest race and non-competitive. Terrans are struggling in all the top tournaments lately because they are not competitive anymore.
|
On July 13 2012 10:23 Jadoreoov wrote: First off I'd like to point out that the normality of the data doesn't really matter because of the Central Limit Theorem, so please stop discussing that like it matters. Continuing with lolcanoe's analysis, I found the 99% confidence intervals for the difference in mean for each group.
For US: ZvT (62.0, 164.6) PvT (8.9, 115.0) ZvP (3.3, 99.4)
For EU: ZvT (19.6, 108.6) PvT (18.3, 113.2) ZvP (-45.3, 42.0)
US and EU: ZvT (51.5, 118.8) PvT (28.9, 99.6) ZvP (-11.1, 53.2)
As for US vs EU, the 99% confidence interval for the mean difference in MMR is: (21.9, 77.1)
Nice work, though it might be nice to narrow it down to a 95% CI to get a slightly better measurement I think. I'm too lazy to do it though :D
|
Done:
95% confidence intervals for the EU and US combined: ZvT: (59.5, 110.7) PvT (37.3, 91.2) ZvP (-3.7, 45.5)
US vs EU (28.5, 70.5)
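As a rough cross-check, a symmetric interval can be moved from one confidence level to another by scaling its half-width with the ratio of the critical values. The sketch below does this with normal critical values, which is an assumption (the MATLAB code uses t critical values, which should be nearly identical at these sample sizes). The name rescale_interval is made up for illustration.

```python
from statistics import NormalDist


def rescale_interval(lower, upper, old_conf, new_conf):
    """Rescale a symmetric CI to a new confidence level (normal approximation)."""
    nd = NormalDist()
    z_old = nd.inv_cdf(1 - (1 - old_conf) / 2)
    z_new = nd.inv_cdf(1 - (1 - new_conf) / 2)
    center = (lower + upper) / 2
    half = (upper - lower) / 2 * z_new / z_old
    return center - half, center + half


# 99% intervals for EU and US combined, as posted earlier in the thread
posted_99 = {"ZvT": (51.5, 118.8), "PvT": (28.9, 99.6), "ZvP": (-11.1, 53.2)}
for matchup, (lo, hi) in posted_99.items():
    lo95, hi95 = rescale_interval(lo, hi, 0.99, 0.95)
    print(f"{matchup}: ({lo95:.1f}, {hi95:.1f})")
```

The rescaled values should land within a few tenths of the 95% intervals posted above; the remaining gap comes from rounding in the posted numbers.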
|
On July 13 2012 10:23 Jadoreoov wrote: First off I'd like to point out that the normality of the data doesn't really matter because of the Central Limit Theorem, so please stop discussing that like it matters.
No. No. No. No... More misinformation. Normal distributions are indeed pretty prevalent in the real world, and the central limit theorem is a good rule of thumb, but it's these sorts of assumptions that have lost certain financial entities billions as well.
Take stock price returns: approximately normal, but with a fat left tail. If you used a normal distribution you would severely underestimate the possibility of total disaster and hence under-price risk. Hence, returns are best modeled with a modified distribution to account for the extremes. Or take waiting times in a queue, where you have a very long right tail but a distinctly left-weighted distribution (think about it, you have a minimum of 0, but a max of infinity, with a peak that is much closer to the left than the right).
Most of all, we are dealing with an entirely man-made distribution here. If you counted by league only, you'd have 20 20 20 20 20 EVENLY distributed. For MMR, the shape of the curve is determined ENTIRELY by modeling software. If Blizzard wanted to, they could create a distribution of any type. With our data we can only guess the distribution and approximate our statistics under reasonable normal guidelines (after establishing that normality is a possible model).
Hope this makes sense, and I really encourage you to keep this in mind, especially if you ever plan to work on Wall Street in your lifetime.
|
Yes, the MMR cap exists. A floor likely also exists.
Don't get defensive when other community members demand more thorough data or a stronger analysis. Understanding the ladder is a communal effort. lolcanoe and Lysenko bring up salient points that should be addressed in order to produce more concrete hypotheses, even if this means refuting existing hypotheses.
We call the reverse-engineered values (points -> adj.pts -> adj.pts with offsets removed) "MMR" because that's the closest representation of MMR we have. We know that the "actual" hidden MMR factors in an uncertainty value when determining the degree of change after a match, but it's unlikely that will ever be deciphered.
The league and division offsets used by the MMR tool are not exact, but they're somewhat close. Still, this introduces a margin of error. This is probably mitigated by the volume of data, and even the relatively arbitrary values that are calculated can be used when compared to each other for the purposes of gauging race balance, because the margin of error applies universally to each race and matchup.
One thing I want to be very careful about is considering any part of this interpretation as "final" data. Every other person who has posted theories about how the ladder works in the past has fallen into the same trap of interpreting his data incorrectly until it fits his conclusions, so it's important we don't repeat that mistake. The data must remain impartial. The only additional information we have about the ladder comes from Josh himself.
Also a special side note: the ladder isn't 20/20/20/20/18/2 anymore. There were some offset corrections and I don't know the new targeted distribution, but I would say conservatively it's closer to 20/20/20/20/16/4. I don't expect Blizzard to release the new target values.
|
@lolcanoe
The issue wasn't whether the distribution itself was close to normal at all. It can be the most skewed thing in the world. The issue is that the sample size is very large, so the distribution of the SAMPLING MEAN is approximately normal.
In probability theory, the central limit theorem (CLT) states that, given certain conditions, the mean of a sufficiently large number of independent random variables, each with finite mean and variance, will be approximately normally distributed.
Student's t-test assumes that the distribution of the sample mean is approximately normal, but makes no assumptions regarding the underlying distribution of the data itself.
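The point about the sample mean can be illustrated with a small simulation. This is only a sketch under stated assumptions: the exponential distribution is a stand-in for "the most skewed thing in the world", not MMR data, and the sample size n = 200 and the helper name skewness are made up for illustration.

```python
import random
from statistics import mean, pstdev


def skewness(values):
    """Population skewness: the third standardized moment."""
    m, s = mean(values), pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


rng = random.Random(2012)  # fixed seed so the run is repeatable

# A strongly right-skewed stand-in population (exponential), not MMR data
raw = [rng.expovariate(1.0) for _ in range(20000)]

# Means of many samples of size n drawn from the same distribution
n = 200
sample_means = [mean([rng.expovariate(1.0) for _ in range(n)]) for _ in range(2000)]

print(f"skewness of raw values:   {skewness(raw):.2f}")
print(f"skewness of sample means: {skewness(sample_means):.2f}")
```

The raw values stay heavily skewed no matter how many are drawn, while the means of samples of size n come out close to symmetric.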
|
Oh, it's nice that you guys are redoing what I did back at page 10, but now with more statistics. 
- Yes, I think we have enough statistics, and the distribution is well behaved enough so that central limit theorem will give a sufficiently accurate estimate of the statistical error.
- However, it does assume that the samples are uncorrelated. OP, you said that you removed duplicates from the list, but do you think there can be other correlations in the list of samples? You probably know best exactly what is in the list. If there are still correlations, it means that the error should be larger than what you get from a central limit analysis. But it seems like the (small) signal will still be significant, even if the error is increased a bit. Hopefully there shouldn't be large correlations in there?
|
On July 13 2012 12:15 Jadoreoov wrote: @lolcanoe
The issue wasn't whether the distribution itself was close to normal at all. It can be the most skewed thing in the world. The issue is that the sample size is very large, so the distribution of the SAMPLING MEAN is approximately normal.
You should scroll down the page you quoted.
"In a specific type of t-test, these conditions are consequences of the population being studied, and of the way in which the data are sampled. For example, in the t-test comparing the means of two independent samples, the following assumptions should be met:
- Each of the two populations being compared should follow a normal distribution. This can be tested using a normality test, such as the Shapiro-Wilk or Kolmogorov–Smirnov test, or it can be assessed graphically using a normal quantile plot.
- If using Student's original definition of the t-test, the two populations being compared should have the same variance (testable using F test, Levene's test, Bartlett's test, or the Brown–Forsythe test; or assessable graphically using a Q-Q plot). If the sample sizes in the two groups being compared are equal, Student's original t-test is highly robust to the presence of unequal variances. Welch's t-test is insensitive to equality of the variances regardless of whether the sample sizes are similar.
- The data used to carry out the test should be sampled independently from the two populations being compared. This is in general not testable from the data, but if the data are known to be dependently sampled (i.e. if they were sampled in clusters), then the classical t-tests discussed here may give misleading results."
(http://en.wikipedia.org/wiki/Student's_t-test#Assumptions) Keep in mind we are using a two-sample t-test here... you did scroll down right?
|
On July 11 2012 14:39 Not_That wrote: MMR distribution by races. Image: http://s17.postimage.org/n60jmstyz/image.jpg Amount of players: 2014 Zerg 1784 Protoss 1516 Terran The server does matter as MMR is non comparable cross servers. I've decided to remove KR and SEA and keep EU and NA as they are closest to each other in terms of MMRs, and that's where most of our data comes from.
On July 11 2012 14:53 Cascade wrote: Cool! Can you do 100 or even 200 granularity to make it easier to read? :o) We are not trying to see any structure smaller than 200 MMR anyway.
On July 11 2012 15:13 Not_That wrote: Here you go: Image: http://s8.postimage.org/5s1v1o3tt/image.jpg We tried having % of total players on the y axis. The problem with that is that it doesn't have information regarding the amount of players. The dots at the edges of the graph look very strange, for example 100% of players above 3200 are Protoss. Obviously it's not very useful. We could snip the edges of the graph, but where? How many players are enough? Are 21 players between 2700 and 2750 enough? etc.
On July 11 2012 15:35 Cascade wrote: Thanks! I mean % of the zerg players in that bin. That is, (number of zergs in that bin)/(number of zergs total). Just like you have plotted now, only divide all zerg entries with the number of zerg players, etc. Now the zerg plot is higher in mid-range, but it is not clear if that is because a larger fraction of zergs have mid-range MMR, or if there are just more zergs.
On July 11 2012 16:05 Not_That wrote: Good thinking. Same graph normalized, each bar representing the percentage of players of each race in the bin: Image: http://s11.postimage.org/utfwxg0vz/image.jpg
On July 11 2012 16:15 Cascade wrote: Nice! Now just put the error bars back on that plot, and it's perfect! *leaving*
On July 11 2012 16:27 Not_That wrote: How do I figure out error margins for a graph with granularity? Fixed colors btw.

Sorry, missed this post... The error is sqrt(N) in each bin, before normalisation. Then when you rescale, just scale the error with the same factor. Equivalently, the relative error in each bin is 1/sqrt(N). N is the number of entries in that bin btw.
That way, when you group up bins, you can expect the relative error to go down by a factor of 2 if you go from 50 to 200 granularity (four times as many entries per bin).
When N gets too low (rule of thumb: it is ok down to N = 20), this error estimate starts becoming a bit shaky, but for a plot like this, it is good enough. Below N = 20, we won't be able to see much anyway I think, so the bin can just say that there are not enough statistics.
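The recipe above can be sketched in a few lines of Python. The function names, the made-up counts and the choice to return None for sparse bins are assumptions for illustration, not part of the plotting code used for the graphs.

```python
import math


def normalized_bins(counts, min_count=20):
    """Return (fraction, error) per bin, or None where the bin is too sparse.

    counts: number of players of one race in each MMR bin.
    The error is sqrt(N) before normalisation, scaled by the same factor.
    """
    total = sum(counts)
    result = []
    for n in counts:
        if n < min_count:
            result.append(None)  # not enough statistics
        else:
            result.append((n / total, math.sqrt(n) / total))
    return result


def rebin(counts, factor):
    """Merge every `factor` neighbouring bins (4 to go from 50 to 200 MMR)."""
    return [sum(counts[i:i + factor]) for i in range(0, len(counts), factor)]


# Made-up counts for one race in 50-MMR bins
counts_50 = [12, 25, 40, 64, 81, 100, 90, 70, 36, 25, 16, 9]
for frac, err in filter(None, normalized_bins(rebin(counts_50, 4))):
    print(f"{100 * frac:.1f}% +/- {100 * err:.1f}%")
```

Merging four bins roughly quadruples N per bin, which is where the factor 2 in the relative error comes from.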
|
On July 13 2012 12:40 lolcanoe wrote:
On July 13 2012 12:15 Jadoreoov wrote: @lolcanoe The issue wasn't whether the distribution itself was close to normal at all. It can be the most skewed thing in the world. The issue is that the sample size is very large, so the distribution of the SAMPLING MEAN is approximately normal.
You should scroll down the page you quoted. (http://en.wikipedia.org/wiki/Student's_t-test#Assumptions) Keep in mind we are using a two-sample t-test here... you did scroll down right?

No need for that tone imo. We are all working together here as far as I know.
Yes, for these probability calculations to be mathematically accurate, you need normal distributions. But according to the central limit theorem, the more you sample any distribution, the more it will look like a normal distribution. The better behaved (i.e., normal distribution-like) the distribution is, the faster the convergence. So while these errors are not 100% mathematically accurate, with a distribution that is well behaved like this (no strong tails), and with sample sizes of thousands, they are close enough.
|
On July 13 2012 08:21 lolcanoe wrote:
On July 13 2012 08:13 VediVeci wrote: Requiring someone to have a college education is a bit of an ivory tower buddy.
I'm not requiring anyone to have anything. My criticisms are objectively based on the analysis and not the source. There is no ivory tower here. I've proven that my methods can be applied in a statistically coherent and easily understandable way, so your accusations that my suggestions are impractical (or "ivory tower") are pretty moot.

I'm not arguing that your methods aren't better, they probably are (I didn't read your post very closely). Your attacks have been pretty consistently derisive, rude, and especially condescending though, in my opinion. And I know it's not a smoking gun, but his results seem pretty consistent with yours, so he didn't do too poorly.
And I'm glad you have such good insight into how the financial crisis happened and can tell us about it. Now that you're on the case we can rest assured it won't happen again!!
And skeldark, when I say you "manipulated" the data, I don't mean you did anything negative, I just mean you performed a series of calculations or "manipulations" on the data.
Edit: clarity
|
All this talk just to deny the simple truth that terran is in rough shape. Sc2 WOL is abandonware to Blizzard now.
User was temp banned for this post.
|
On July 13 2012 13:03 VediVeci wrote: I'm not arguing that your methods aren't better, they probably are (I didn't read your post very closely). Your attacks have been pretty consistently derisive, rude, and especially condescending though, in my opinion. And I know it's not a smoking gun, but his results seem pretty consistent with yours, so he didn't do too poorly. Edit: clarity

He had at least a 50% chance of getting it right. I'm going to ignore the rest of the post so as not to encourage further irrelevance from posters who self-admittedly don't read things carefully.
On July 13 2012 12:59 Cascade wrote: Yes, for these probability calculations to be mathematically accurate, you need normal distributions. But according to central limit theorem, the more you sample any distribution, the more it will look like a normal distribution. The better behaved (ie, normal distribution-like) the distribution is, the faster the convergence. So while these errors are not 100% mathematically accurate, with a distribution that is well behaved like this (no strong tails), and with sample sizes of thousands, they are close enough.

Ok, let's separate the statements clearly so I can explain why your explanation is inaccurate and why his is pretty much entirely misplaced. I understand the confusion here because my high school math teacher needed to be corrected on the same misunderstanding.
Imagine a population with a distribution that is skewed in one way or another (not normally distributed). If you take a sample, and increase the sample size n in an orderly fashion, what happens? Eventually your sample is the entire population, and your sample distribution and population distribution are unsurprisingly identical! So in this one-sample situation, the shape of the sample's distribution depends on the population being sampled. If the population is normal, and only if it is, the distribution of the sample will look increasingly normal as n grows. This idea is pretty intuitive once you imagine a sample size equal to that of your population (that's exactly what's going on here). This is why a normality test is important!
The central limit theorem specifically relates to the distribution of sample means over infinitely many random samples (which isn't exactly what we have here). The distribution of sample means does NOT equal the sample distributions themselves, as you have incorrectly equated! It refers to the distribution of the AVERAGE values of each sample, and this distribution becomes increasingly normal, not as the number of samples increases but rather as n, the sample size, increases. In this regard it makes complete sense (with a formal mathematical proof) why the distribution of sample means tends toward normal irrespective of the population distribution! Please look into http://www.wadsworth.com/psychology_d/templates/student_resources/workshops/stat_workshp/cnt_lim_therm/cnt_lim_therm_02.html to understand why neither of your posts is accurate and how a completely non-normal distribution can have normally distributed sample means as n increases.
Hopefully, you'll begin to understand how you guys are misapplying CLT!
|
On July 13 2012 13:23 DwindleFlip wrote: All this talk just to deny the simple truth that terran is in rough shape. Sc2 WOL is abandonware to Blizzard now. User was temp banned for this post.

ahaha, ok guys, we are busted. We can stop all this statistics BS now. You know, the one we make up out of thin air as we type, completely baseless. We got called on the bluff, nothing more to say. Was fun while it lasted. No point in trying to pretend that analyzing data is of any use when we have people like DwindleFlip laying down the simple truth like a B40UwwwwzzZZZzz!!!11oneone
|
It's always easier to rip something apart than it is to build something... kinda like what I just did
|
On July 13 2012 13:40 lolcanoe wrote: The central limit theorem specifically relates to the distribution of sampling means and infinite random samples (which isn't exactly what we have here). The distribution of sampling means does NOT equal the sample distributions themselves! It refers to the distribution of the AVERAGE values in each sample, and this distribution becomes increasingly normal, not as the number of samples increase but rather as n, the sampling size, increases. You guys are misapplying CLT!

Ok, let me prove it for you then. My claim is that if the set of samples is large enough, we can use the normal distribution with S/sqrt(N) width to estimate the errors. For simplicity, let me prove that the 2*S/sqrt(N) interval is close to 95%:
Claim: Let the distribution f(x) have an average 0 and standard deviation S. An average X from a sufficiently large (specified in the proof) set of N samples from f(x) will fall within 2*S/sqrt(N) of the average 0 with a probability between 0.93 and 0.97.
Proof: Calculating the average x from N samples (from many different sets, each of N samples) will give a distribution of averages A_N(x) that approaches a normal distribution as N goes to infinity, centred around 0, and with a width of S/sqrt(N). This is the CLT.
Specify "sufficiently large N" such that A_N(x) is similar to a normal distribution g(x) of width S/sqrt(N): close enough that the integral of A_N from -2*S/sqrt(N) to 2*S/sqrt(N) is between 0.93 and 0.97 (it is close to 0.95 for g). As A_N approaches g as N --> infinity, this will happen for some N. The more similar f(x) is to a normal distribution, the smaller the N required.
Now take a single average X of N samples from f(x) (this would be the OP). This average is distributed according to A_N(x), and with a sufficiently large N, the probability that X is between -2*S/sqrt(N) and 2*S/sqrt(N) is larger than 0.93, and smaller than 0.97. QED.
Then at what N it reaches "sufficiently large" is a trickier matter. But I am personally convinced (from experience) that with the well behaved distribution of MMR we see, and with thousands of samples, the errors are accurate enough so that the conclusion stands. I.e., that there is a significant signal that the terran MMR is lower than the zerg MMR. Due to the finite (aawwwww) sample size there is little point in claiming confidence levels of exactly 0.99957353526452, but if this method gives a confidence level of 99.9% I think it is safe to say that you are more than 99% sure. This would also cover other errors, such as correlations in the sample (as I was nagging about earlier).
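Where "sufficiently large" kicks in can be probed with a quick simulation. The sketch below is only an illustration under stated assumptions: it uses an exponential distribution as a skewed stand-in (not MMR data), N = 400, and the standard deviation estimated from each sample instead of the true S, since that is all one has in practice. The name coverage is made up for this example.

```python
import math
import random
from statistics import mean, stdev


def coverage(draw, true_mean, n, trials, rng):
    """Fraction of trials where the sample mean lies within 2*S/sqrt(n) of true_mean."""
    hits = 0
    for _ in range(trials):
        sample = [draw(rng) for _ in range(n)]
        m = mean(sample)
        half_width = 2 * stdev(sample, m) / math.sqrt(n)
        if abs(m - true_mean) <= half_width:
            hits += 1
    return hits / trials


rng = random.Random(42)  # fixed seed so the run is repeatable
# Skewed stand-in distribution (exponential with mean 1), not MMR data
rate = coverage(lambda r: r.expovariate(1.0), true_mean=1.0, n=400, trials=1000, rng=rng)
print(f"coverage of the 2*S/sqrt(N) interval: {rate:.3f}")
```

If the claim holds for this distribution and N, the printed fraction should sit near 0.95; trying a smaller n shows how the approximation degrades.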
|
Discussion: I think it's time to forget the past and start anew. Most of us did not behave in the past as we should have (me included). Once we all agree on the main points, we can leave the personal stuff aside.
On July 13 2012 12:38 Cascade wrote: - However, it does assume that the samples are uncorrelated. OP, you said that you removed duplicates from the list, but do you think there can be other correlations in the list of samples? You probably know best exactly what is in the list. If there are still correlations, it means that the error should be larger than what you get from a central limit analysis. But it seems like the (small) signal will still be significant, even if the error is increased a bit. Hopefully there shouldn't be large correlations in there?
Duplicates: I can guarantee 100% that there are no duplicated accounts.
The profile list is generated backwards (last uploaded game first) and filtered by:
- The MMR of the account is valid
- The race of the player is known
- The player is not a random player
- The account is not already in the list
In fact there is a mistake that makes me exclude data unnecessarily: I forgot that the id is only unique per server, and I only check the id, not server+id.
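The fix described above amounts to keying the duplicate check on the pair (server, id) instead of on the id alone. A minimal sketch, assuming each profile is a dict with hypothetical `server` and `id` fields (the real tool's data layout is not shown in this thread):

```python
def dedupe_accounts(profiles):
    """Keep the first occurrence of each (server, id) pair, preserving order."""
    seen = set()
    unique = []
    for profile in profiles:
        key = (profile["server"], profile["id"])  # id alone is not unique across servers
        if key in seen:
            continue
        seen.add(key)
        unique.append(profile)
    return unique


# Hypothetical example data: the same id appears on two servers.
profiles = [
    {"server": "EU", "id": 1001, "race": "T"},
    {"server": "US", "id": 1001, "race": "Z"},  # different account, same id
    {"server": "EU", "id": 1001, "race": "T"},  # true duplicate
]
unique = dedupe_accounts(profiles)
print(len(unique))
```

Checking the id alone would have dropped the US account here, which is the unnecessary exclusion mentioned above.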
Other correlations: The only thing I can think of is that the user's MMR and the opponent's MMR are analysed in totally different ways, and the analyser for the opponent takes the result of the player into account. I can mark which data value is user data and which is opponent data. Also, all opponents of one player are obviously not far away from each other. I can also mark which opponent values were submitted by the same user.
Besides this, the analysis and collection of the MMR is very complicated. I cannot guarantee that I don't have structural mistakes somewhere that could create correlations, but at the moment I don't see such a factor.
Data: I can add some useful information to the profile list and publish it again. What I have in mind is:
- Time the game was played (sadly this is user time, not server time; I should fix this in the long term)
- An id of the user that submitted the data
- An id of the account that is shown
- A mark showing whether the data comes from a user or an opponent
- Main race of the account plus the race of the account in the last game it played
Anything else?
High MMR cap: I have some more arguments but it's off-topic and I just woke up. Let us leave this topic for now and perhaps catch up on it later.
Also a special side note: the ladder isn't 20/20/20/20/18/2 anymore. There were some offset corrections and I don't know the new targeted distribution, but I would say conservatively it's closer to 20/20/20/20/16/4. I don't expect Blizzard to release the new target values.
Totally agree with this. The data slowly moves away from normal and they try to correct it with offsets. However, I have the feeling they decided not to do so anymore because they don't want to create demotion/promotion waves. On the other hand, they could do so at season start and obviously did not at the start of season 8. For example, the platinum offsets are not equal to the silver ones, which should be the case if the data were normal. So they have already corrected with these offsets towards 20/20...
|
Sure, add all the data you can think of.
I think a more interesting analysis can be made from the list of games though. Although there we will REALLY have to think of the systematics, as each player submits many games, and what if a player that is really good at say PvZ submits 30 games? That is for another thread though. 
Do you think it is a problem that the samples are weighted by activity? Ie, if (X level) terrans feel frustrated and play less, they will face your users less often, and be less represented in the statistics (at X level). What we measure is actually not only MMR as a flat average over all players, but an average weighted by their current activity.
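To make the activity-weighting point concrete, here is a small sketch with invented numbers. Each tuple is a hypothetical player (MMR, games played in the period); none of these values come from the real data set. Sampling players through the games they appear in behaves like the weighted average, not the flat one.

```python
# Hypothetical players: (mmr, games_played). Values are made up for illustration.
players = [(1400, 5), (1600, 20), (2000, 75)]

# Flat average: every player counts once.
flat_mean = sum(mmr for mmr, _ in players) / len(players)

# Activity-weighted average: every player counts once per game played,
# which is roughly what you get when accounts are collected from game uploads.
total_games = sum(games for _, games in players)
weighted_mean = sum(mmr * games for mmr, games in players) / total_games

print(f"flat mean:     {flat_mean:.2f}")
print(f"weighted mean: {weighted_mean:.2f}")
```

In this toy case the very active high-MMR player pulls the weighted average well above the flat one, so a race whose players reduce their activity would be under-represented in the same way.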
Otherwise I'm not sure there is much more I have to say. Doing measurement of single leagues (intervals in MMR) doesn't really make sense, as it would only measure the difference in slope of the distribution for the different races. Also I won't have much access to internet over the weekend. 
cheers
|
On July 13 2012 16:23 Cascade wrote: Sure, add all the data you can think of. I think a more interesting analysis can be made from the list of games though. Although there we will REALLY have to think of the systematics, as each player submits many games, and what if a player that is really good at say PvZ submits 30 games? That is for another thread though. Do you think it is a problem that the samples are weighted by activity? Ie, if (X level) terrans feel frustrated and play less, they will face your users less often, and be less represented in the statistics (at X level). What we measure is actually not only MMR as a flat average over all players, but an average weighted by their current activity. cheers
That is true. I already noticed this when I tried to collect division data: I see the same divisions all the time, because the first players of a new season create them, and these are the guys who play all the time. The active userbase is way smaller than the total userbase, and the very small, very active userbase alone creates most of the games. It could become a problem if you make the time interval shorter. But I have a feeling this is again a question of how you define balance. If good players of one race stop playing, is this a balance indicator?
Otherwise I'm not sure there is much more I have to say. Doing measurement of single leagues (intervals in MMR) doesn't really make sense, as it would only measure the difference in slope of the distribution for the different races. Also I won't have much access to internet over the weekend.
But the difference in slope of the distribution for the different races in different MMR intervals is an interesting fact too.
The total gamedata is published in my MMR-Tool thread. I will update it soon with the race data and the game length.
|
On July 13 2012 10:23 Jadoreoov wrote: First off I'd like to point out that the normality of the data doesn't really matter because of the Central Limit Theorem, so please stop discussing that like it matters.
Continuing with lolcanoe's analysis, I found the 99% confidence intervals for the difference in mean for each group.
US and EU: ZvT (51.5, 118.8) PvT (28.9, 99.6) ZvP (-11.1, 53.2)
On July 13 2012 11:20 Jadoreoov wrote: Done:
95% confidence intervals for the EU and US combined: ZvT: (59.5, 110.7) PvT (37.3, 91.2) ZvP (-3.7, 45.5)
US vs EU (28.5, 70.5)
Shouldn't the interval in which the mean can fall become larger as you lower your level of confidence?
|
|
On July 13 2012 13:40 lolcanoe wrote: On July 13 2012 13:03 VediVeci wrote:
I'm not arguing that your methods aren't better, they probably are (I didn't read your post very closely). Your attacks have been pretty consistently derisive, rude, and especially condescending though, in my opinion. And I know it's not a smoking gun, but his results seem pretty consistent with yours, so he didn't do too poorly. Edit: clarity
He had at least a 50% chance of getting it right. I'm going to ignore the rest of the post so as not to encourage further irrelevance from posters who self-admittedly don't read things carefully.
That's the sort of stuff I'm talking about. Whether or not I gave the math in your post a thorough reading was irrelevant to mine, because I have been reading through almost everything else you've posted.
You talk down to everybody, and at least 3 people have called you out on it so far. Constructive criticism is great, but don't be so damn rude about it. This was a pretty respectful discussion, no need to be so vituperative.
|
On July 11 2012 04:01 skeldark wrote: There is one other method that you can use to show trends: you look at the change of MMR of a race over time! Do players of race Z lose MMR? Do players of race X gain MMR? This will happen after a patch. But perhaps it's not imbalance; perhaps it corrects an imbalance that was there from the beginning.
I would say this is the only way your data could become useful. Right now you aggregated some MMR stats over 42 years (just kidding, I understand it is a few months, but still quite some time). There might have been a few metagame shifts and patch changes over that time, but your data doesn't reflect them, since some of the calculated MMR values can be 3 months old and the others are from last week. If you could only count the calculated MMR from the last week, it would in fact be current balance data, and not something averaged out over many months. And if you could do it periodically, you would be able to show trends and shifts. You would also generate a lot of discussion (and by that I mean new waves of balance whine).
Edit: also, I am not sure if EU MMR and NA MMR have the same weight. These are 2 groups of accounts who never play with each other. I keep facepalming at sc2ranks who also assume that points on different ladders have the same value. At least your data doesn't mix in KR server, which would completely break everything
|
On July 13 2012 17:31 Thrombozyt wrote: On July 13 2012 10:23 Jadoreoov wrote: First off I'd like to point out that the normality of the data doesn't really matter because of the Central Limit Theorem, so please stop discussing that like it matters.
Continuing with lolcanoe's analysis, I found the 99% confidence intervals for the difference in mean for each group.
US and EU: ZvT (51.5, 118.8) PvT (28.9, 99.6) ZvP (-11.1, 53.2)
On July 13 2012 11:20 Jadoreoov wrote: Done:
95% confidence intervals for the EU and US combined: ZvT: (59.5, 110.7) PvT (37.3, 91.2) ZvP (-3.7, 45.5)
US vs EU (28.5, 70.5)
Shouldn't the interval in which the mean can fall become larger as you lower your level of confidence?
No, the 95% confidence interval should be smaller.
It is similar to someone asking you to guess a number between 0 and 100. If you guessed that it was exactly 50 you wouldn't be very confident (a narrow interval, low confidence). If you guessed that it was between 1 and 99 you would be pretty confident that you'd be correct (a wide interval, high confidence).
In each calculation the data itself gives us the same amount of uncertainty, so to be more confident in our interval we have to include a greater range of values.
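The two sets of intervals quoted above can be cross-checked against each other. For a normal-approximation interval the half-width is z times the standard error, so the 99% and 95% half-widths should differ by the ratio of the two z-values (about 1.31). The sketch below only uses the intervals posted in this thread; that they were computed with a normal (or large-sample t) approximation is an assumption.

```python
from statistics import NormalDist

# Intervals as posted earlier in the thread (EU and US combined).
ci_99 = {"ZvT": (51.5, 118.8), "PvT": (28.9, 99.6), "ZvP": (-11.1, 53.2)}
ci_95 = {"ZvT": (59.5, 110.7), "PvT": (37.3, 91.2), "ZvP": (-3.7, 45.5)}


def half_width(interval):
    low, high = interval
    return (high - low) / 2


# Two-sided critical values of the standard normal distribution.
z95 = NormalDist().inv_cdf(0.975)
z99 = NormalDist().inv_cdf(0.995)
expected_ratio = z99 / z95

ratios = {}
for matchup in ci_95:
    ratios[matchup] = half_width(ci_99[matchup]) / half_width(ci_95[matchup])
    print(f"{matchup}: 99%/95% half-width ratio = {ratios[matchup]:.3f}")

print(f"expected from z-values: {expected_ratio:.3f}")
```

The same data and the same standard error give a wider interval at 99% than at 95%, which is the point being made above.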
|
On July 13 2012 20:27 Alexj wrote: On July 11 2012 04:01 skeldark wrote: There is one other method that you can use to show trends: you look at the change of MMR of a race over time! Do players of race Z lose MMR? Do players of race X gain MMR? This will happen after a patch. But perhaps it's not imbalance; perhaps it corrects an imbalance that was there from the beginning.
I would say this is the only way your data could become useful. Right now you aggregated some MMR stats over 42 years (just kidding, I understand it is a few months, but still quite some time). There might have been a few metagame shifts and patch changes over that time, but your data doesn't reflect them, since some of the calculated MMR values can be 3 months old and the others are from last week. If you could only count the calculated MMR from the last week, it would in fact be current balance data, and not something averaged out over many months. And if you could do it periodically, you would be able to show trends and shifts. You would also generate a lot of discussion (and by that I mean new waves of balance whine). Edit: also, I am not sure if EU MMR and NA MMR have the same weight. These are 2 groups of accounts who never play with each other. I keep facepalming at sc2ranks who also assume that points on different ladders have the same value. At least your data doesn't mix in KR server, which would completely break everything
EU and NA MMR are very close to each other. I have users who have both EU and US accounts, with very similar MMR on both.
The data covers 3 weeks, not more.
|
On July 13 2012 14:34 Cascade wrote: Ok, let me prove it for you then. My claim is that if the set of samples is large enough, we can use the normal distribution with S/sqrt(N) width to estimate the errors. For simplicity, let me prove that the 2*S/sqrt(N) interval is close to 95%: Let the distribution f(x) have an average 0 and standard deviation S. An average X from a sufficiently large (specified in the proof) set of N samples from f(x) will fall within 2*S/sqrt(N) of the average 0 with a probability between 0.93 and 0.97. Proof: Calculating the average x from N samples (from many different sets, each of N samples) will give a distribution of averages A_N(x) that approaches a normal distribution as N goes to infinity, centred around 0, and with a width of S/sqrt(N). This is the CLT. Specify "sufficiently large N" such that A_N(x) is similar to a normal distribution g(x) of width S/sqrt(N). Close enough so that the integral from -2*S/sqrt(N) to 2*S/sqrt(N) is between 0.97 and 0.93 (it is close to 0.95 for g). As A_N approaches g as N-->infty, this will happen for some N. The more similar f(x) is to a normal distribution, the lower N is required. Now take a single average X from f(x), using N samples (this would be the OP). This average is distributed according to A_N(x), and with a sufficiently large N, the probability that X is between -2*S/sqrt(N) and 2*S/sqrt(N) is larger than 0.93, and smaller than 0.97. QED.
No, reread part A. The claim that the sample distribution approaches normality only applies when the population data itself is normal. This is extraordinarily intuitive as you watch your sample size approach the entire population. In your claim here, you used a standard deviation around a known average to describe a population. In our data, we do not know if SDs can be applied to the population, as the SDs we are calculating are really only accurate for Gaussian distributions.
It is a common misapplication of CLT to state that a sample size of 30 guarantees approximate normality. This iteration tends to be true only because populations tend to be normally distributed. To be mathematically precise, the correct statement is that with a sufficient amount of samples of at least size ~ 30, the distribution of the means of these samples will begin approaching normality, with only a slight regard to the original distribution.
The normality test is essential when running the two-sided t-test if you want to be thorough when dealing with an unknown population distribution. The textbook, wiki, and other websites have confirmed it. I do not understand why this question persists.
Edit: I should further add that the tendency of sampling means and samples themselves in normal populations to approach normality only occurs when the sample is RANDOMLY procured. In this case it is clearly NOT random (we have different population means vs sample means), so the normality test is ABSOLUTELY a reasonable thing to be concerned about.
|
To me, what the result here indicates is the opposite of what a lay person would think from reading the post.
I'd read your results as "this means that Terran players in general have a lower MMR". But based on your data:
Analysis
"Terran Average MMR, STD 1559.214909, 546.131097
Protoss Average MMR, STD 1620.764863, 509.5809733
Zerg Average MMR, STD 1672.129547, 495.3121321"
What the above seems to imply is that, although the average Terran player included in the study has a smaller MMR, as you go higher, MMR seems to be higher for Terrans than other races. In particular, Mean + 2*STDev (for a normal distribution, roughly the cutoff for the top 2.3%) is:
T 2651.47 P 2639.92 Z 2662.75
Giving us much different looking results. As we strive to study based on arbitrarily good players (as player skill increases over time), I would think we'd want to look more heavily at analysis of the implications of Terran's higher STDEV.
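The cutoffs above are easy to recompute from the quoted means and standard deviations. A short sketch using only the numbers quoted in this post; the normal-tail figure is added for reference and relies on the (unverified) assumption that each race's MMR is roughly normal.

```python
from statistics import NormalDist

# (mean MMR, standard deviation) per race, as quoted above.
stats = {
    "T": (1559.214909, 546.131097),
    "P": (1620.764863, 509.5809733),
    "Z": (1672.129547, 495.3121321),
}

# Mean + 2 standard deviations for each race.
cutoffs = {race: mean + 2 * sd for race, (mean, sd) in stats.items()}
for race, cutoff in cutoffs.items():
    print(f"{race}: {cutoff:.2f}")

# Share of a normal distribution lying above mean + 2 SD (one-sided tail).
tail = 1 - NormalDist().cdf(2.0)
print(f"normal tail above mean + 2 SD: {tail:.4f}")
```

The recomputed values agree with the posted ones to within rounding in the last digit. Note that Terran ends up above Protoss but still slightly below Zerg at this cutoff.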
A question
Are you sure you can assume normality here? How well do your distributions fit a normal distribution having the same mean and St dev? The reason I ask, of course, is that a T-test can only be used meaningfully on normal distributions.
If normality doesn't fit so well, I'd recommend M.A. Stephens' article on k-sample Anderson-Darling tests, which uses ranking and therefore needs only continuity as an assumption to move forward.
Edit: Link to the test I'm referring to: http://www.cithep.caltech.edu/~fcp/statistics/hypothesisTest/PoissonConsistency/ScholzStephens1987.pdf.
|
On July 13 2012 23:33 Treehead wrote:
Analysis
"Terran Average MMR, STD 1559.214909, 546.131097
Protoss Average MMR, STD 1620.764863, 509.5809733
Zerg Average MMR, STD 1672.129547, 495.3121321"
Are you sure you can assume normality here? How well do your distributions fit a normal distribution having the same mean and St dev? The reason I ask, of course, is that a T-test can only be used meaningfully on normal distributions.
If normality doesn't fit so well, I'd recommend M.A. Stephens' article on k-sample Anderson-Darling tests, which uses ranking and therefore needs only continuity as an assumption to move forward.
I'd really suggest reading my post again, as it already includes the Anderson-Darling test! See the probability plot curve and the associated p-value, which was computed using the Anderson-Darling test in Minitab. Anyways, let me be a little more articulate with what you are saying and address the points one at a time. Can we assume normality? No. However, in this case the Anderson-Darling test result is inconclusive. Keep in mind, Anderson-Darling tends to be OVERLY powerful with large sample sizes. Your best bet is actually looking at the fitted histogram to judge approximate normality yourself! To me, given the hugely significant p-values far under .01, and no strong evidence of non-normality - I'd say that we can put the majority of these concerns to rest.
Now what is more interesting is that we have massive standard deviations, and relatively low actual differences. The two-sample t-test only tests whether or not the two group means are EXACTLY equal - the magnitude of this difference should not be directly inferred from the p-value, but rather through observation. For instance, with 2 samples of size 1 billion each, even a negligible actual MMR difference would result in very low p-values. It has to be up to the interpreter to decide whether the maximum 7% difference between T and Z is effectively significant (and not just statistically significant).
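The remark about sample size and p-values can be illustrated with a large-sample z approximation to the two-sample test. This is a sketch, not the Minitab computation referred to above: the function name is made up, both groups are assumed to have the same size and the same standard deviation, and the 1 MMR difference and SD of 500 are invented for the example.

```python
import math
from statistics import NormalDist


def two_sample_p_value(mean_diff, sd, n):
    """Two-sided p-value for a difference in means between two groups of
    size n with common standard deviation sd (normal approximation)."""
    standard_error = sd * math.sqrt(2.0 / n)
    z = abs(mean_diff) / standard_error
    return 2.0 * (1.0 - NormalDist().cdf(z))


# The same negligible 1 MMR difference, at two very different sample sizes.
p_small = two_sample_p_value(mean_diff=1.0, sd=500.0, n=1_000)
p_large = two_sample_p_value(mean_diff=1.0, sd=500.0, n=1_000_000_000)

print(f"n = 1e3: p = {p_small:.3f}")
print(f"n = 1e9: p = {p_large:.3g}")
```

The difference itself is identical in both cases; only the sample size changes, which is why the p-value alone says nothing about whether a gap is large enough to matter.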
I hope that addresses your concerns.
|
Ok, I have a hard question for you guys.
If I want to publish the average MMR of the data as a timeline, what is the minimum number of profiles needed for it to still be accurate? Can someone test a weekly / monthly update?
|
On July 13 2012 23:59 lolcanoe wrote: On July 13 2012 23:33 Treehead wrote:
Analysis
"Terran Average MMR, STD 1559.214909, 546.131097
Protoss Average MMR, STD 1620.764863, 509.5809733
Zerg Average MMR, STD 1672.129547, 495.3121321"
Are you sure you can assume normality here? How well do your distributions fit a normal distribution having the same mean and St dev? The reason I ask, of course, is that a T-test can only be used meaningfully on normal distributions.
If normality doesn't fit so well, I'd recommend M.A. Stephens' article on k-sample Anderson-Darling tests, which uses ranking and therefore needs only continuity as an assumption to move forward.
I'd really suggest reading my post again, as it already includes the Anderson-Darling test! See the probability plot curve and the associated p-value, which was computed using the Anderson-Darling test in Minitab. Anyways, let me be a little more articulate with what you are saying and address the points one at a time. Can we assume normality? No. However, in this case the Anderson-Darling test result is inconclusive. Keep in mind, Anderson-Darling tends to be OVERLY powerful with large sample sizes. Your best bet is actually looking at the fitted histogram to judge approximate normality yourself! To me, given the hugely significant p-values far under .01, and no strong evidence of non-normality - I'd say that we can put the majority of these concerns to rest. Now what is more interesting is that we have massive standard deviations, and relatively low actual differences. The two-sample t-test only tests whether or not the two group means are EXACTLY equal - the magnitude of this difference should not be directly inferred from the p-value, but rather through observation. For instance, with 2 samples of size 1 billion each, even a negligible actual MMR difference would result in very low p-values. It has to be up to the interpreter to decide whether the maximum 7% difference between T and Z is effectively significant (and not just statistically significant). I hope that addresses your concerns.
My bad - you already did some of the work I suggested. Honestly, I didn't read most of the thread terribly closely except the OP, which I read over a couple times to make sure he hadn't posted anything definitive on this.
Here's the thing though. Maybe you'll get better p-values to convince ourselves of normality. But maybe you won't. 0.05-0.1 isn't bad, and if the T-test returns as good a result as stated in the OP, I doubt you'll get worse than .05 on the Anderson-Darling test if the thing is anywhere close to normal. My suggestion (which can be ignored without any hard feelings) is that if we want this to be clear of scrutiny, we can remove normality concerns by just using Anderson-Darling to compare the races to begin with, instead of saying something like "well, you can almost reject the null at a significance value of 0.05 - so hopefully the reader is convinced..." when you can just skip that part. My suspicion is that A-D results will be just as low anyway - but in a serious study (which this doesn't have to be), you'd want to post those values, and not the T-test ones, because there's likely no downside to doing so.
I completely agree with your assertion that the differences are rather low compared to the mean and stdev values. I wish this were more clearly reflected in the OP - as it would be easier to interpret for someone with a limited numerical background.
And of course, the predominant concern I always have with using statistics to begin with is that pdfs are created with the unwritten assumption that your data (and hence, winning and losing) is analogous to a random variable, which is much harder to back up than any concerns about normality. I think that this is probably the reason for the large stdev and small differences seen in the data - because as time goes on, playstyles evolve, so we aren't looking at one set of distributions, we're looking at many sets of distributions which change over time as playstyles evolve and devolve.
For example, I'm guessing 1-1-1 is still reasonably effective in master's TvP these days. Maybe next month, though, some protoss badass comes out with a build that doesn't just beat it - it CRUSHES the 1-1-1 and puts you in a good spot against other builds as well. This might show in our data as a downswing in Terran MMR, but really what's happening is a metagame shift. The pdf for MMRs of TvPers doing 1-1-1 and the pdf for MMRs of TvPers doing other builds are almost assuredly different - especially when our new TvP strat is... new. Maybe I'm wrong, but this example was a hypothetical anyway. Point is - builds are still changing quite a bit, and combining pdfs always gives us weird looking data.
Edit: I don't mean to be dismissive here. The work done is really great (and far better than other stats workups I've seen on these boards), deserves credit and it does have some meaning to it. I only include this in the discussion above for the sake of good bookkeeping on assumptions.
Also, maybe if more data continues to be gathered, enough will be obtained to use the data as a time series (which it is), rather than as a sample. Just some thoughts. Keep up the good analysis, though. I liked reading all this. Good to see some other quanty nerds in here.
|
Skeldark - the number of profiles you'd want depends on the size of the confidence interval you want at a certain mean. If you wanted to make these calculations you'd need to use Excel's solver plugin to work back from interval size to sample size. Alternatively, you could guess and check to approximate it.
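For the simple case of a single mean under the normal approximation, the back-calculation from interval size to sample size has a closed form, n = (z * S / E)^2, so no solver is strictly needed. A sketch under stated assumptions: `profiles_needed` is a made-up helper name, the SD of 520 is an assumed value in the range of the per-race standard deviations quoted in this thread, and E is the desired half-width of the interval in MMR points.

```python
import math
from statistics import NormalDist


def profiles_needed(sd, half_width, confidence=0.95):
    """Smallest n such that z * sd / sqrt(n) <= half_width for one mean."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil((z * sd / half_width) ** 2)


# Assumed SD of 520 MMR; ask for the mean to be pinned down to +/- 20 or +/- 10 MMR.
n_pm20 = profiles_needed(sd=520.0, half_width=20.0)
n_pm10 = profiles_needed(sd=520.0, half_width=10.0)

print(f"+/- 20 MMR at 95%: {n_pm20} profiles")
print(f"+/- 10 MMR at 95%: {n_pm10} profiles")
```

Halving the interval roughly quadruples the number of profiles needed. For a difference between two races the variances of both groups add, so each race would need about twice as many profiles for the same half-width.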
On July 14 2012 03:01 Treehead wrote: My suggestion (which can be ignored without any hard feelings) is that if we want this to be clear of scrutiny, we can remove normality concerns by just using Anderson-Darling to compare the races to begin with, instead of saying something like "well, you can almost reject the null at a significance value of 0.05 - so hopefully the reader is convinced..." when you can just skip that part. My suspicion is that A-D results will be just as low anyway - but in a serious study (which this doesn't have to be), you'd want to post those values, and not the T-test ones, because there's likely no downside to doing so.
My experience is that the A-D test is actually not as common as you think, especially given its tremendous sensitivity at high sample values. It's much more common to show a fitted histogram, as I've done, to show that approximate normality is fulfilled.
The purpose here is simply to show that the SDs are relevant calculations. If 1 SD covers 68% of the normalized data, but in actuality 72% of the real data, it's not a terrible problem when you're making observations over 3 SDs down the line, as the majority of your error is going to be somewhat centralized.
On July 14 2012 03:01 Treehead wrote: I completely agree with your assertion that the differences are rather low compared to the mean and stdev values. I wish this were more clearly reflected in the OP - as it would be easier to interpret for someone with a limited numerical background.
Yes. But defining effectively significant here is difficult.
On July 14 2012 03:01 Treehead wrote: And of course, the predominant concern I always have with using statistics to begin with is that pdfs are created with the unwritten assumption that your data (and hence, winning and losing) is analogous to a random variable, which is much harder to back up than any concerns about normality. I think that this is probably the reason for the large stdev and small differences seen in the data - because as time goes on, playstyles evolve, so we aren't looking at one set of distributions, we're looking at many sets of distributions which change over time as playstyles evolve and devolve.
The high SD values for lower means was surprising for me too. Typically you'd expect it to be the other way around. I would be cautious of making any real conclusions about that though...
On July 14 2012 03:01 Treehead wrote: For example, I'm guessing 1-1-1 is still reasonably effective in master's TvP these days. Maybe next month, though, some protoss badass comes out with a build that doesn't just beat it - it CRUSHES the 1-1-1 and puts you in a good spot against other builds as well. This might show in our data as a downswing in Terran MMR, but really what's happening is a metagame shift. The pdf for MMRs of TvPers doing 1-1-1 and the pdf for MMRs of TvPers doing other builds are almost assuredly different - especially when our new TvP strat is... new. Maybe I'm wrong, but this example was a hypothetical anyway. Point is - builds are still changing quite a bit.
You've left the scope and purpose of this study so I'm not sure if I should answer that.
|
Skeldark - the number of profiles you'd want depends on the size of the confidence interval you want at a certain mean. If you wanted to make these calculations you'd need to use Excel's solver plugin to work back from interval size to sample size. Alternatively, you could guess and check to approximate it.
The day I install Excel, I buy a Mac, quit programming and never look in the mirror again...
I will wait and split the data into timelines in the near future; if it works out I will just go on from there. The problem is that I gain new users and lose old ones, so my data income is not as stable as I wish.
|
On July 14 2012 03:14 lolcanoe wrote:
The high SD values for lower means was surprising for me too. Typically you'd expect it to be the other way around. I would be cautious of making any real conclusions about that though...
...
You've left the scope and purpose of this study so I'm not sure if I should answer that.
Of course I'll be cautious. When confidence cannot accurately be assessed, people tend to be overconfident when the idea is their own and overcritical when it isn't. I'd be foolish to ignore that and proceed as though I were right about my "multiple distributions" theory.
If I were right though, it wouldn't be statistically provable without knowing more about each data and qualitatively categorizing different types of games into different categories - which a person couldn't really do for thousands of games without a lot more involved. You could try to place the games in some kind of pockets based on what info is known (such as time) and perform some kind of goodness-of-fit analysis, but fitness and disparity never proves a theory, it only shows that the data is what a theory would expect - which is less than useful. When something is not statistically provable, then, it must remain as theory. You have to admit, though, that the idea of varying MMR pdfs for varying builds in varying matchups is at least qualitatively plausible, I hope.
The paragraph you mention that has "left the scope of the study" was just a random example illustrating my theory. Don't read more into it than that.
|
Nice Job in taking the time to do so and informing all of us!~
|
This data shows Blizzard needs to buff Terran to bring back balance to the game. I hope somebody at Blizzard looks at this data because they need to realize the game has balance issues at this moment.
|
Is it possible to see what average time it takes for a race to win?
For example, if TvZ win ratio in the early game is 50%, then we can say the early game is fair. But then we can see TvZ in late game is 20% win rate for Terran, then we can say Terrans are having difficulty in the late game.
|
|
On July 15 2012 16:27 themell wrote: Is it possible to see what average time it takes for a race to win?
For example, if TvZ win ratio in the early game is 50%, then we can say the early game is fair. But then we can see TvZ in late game is 20% win rate for Terran, then we can say Terrans are having difficulty in the late game.
Yes, and even way more accurately. I don't have time at the moment, but the data is there.
|
Updated the result with a lot of stats:
Result
Source Main Data
- The data is biased towards EU/US and towards higher skill-rate.
Gamescount: 125976 Sc2-Accounts: 45203
- worst to best player: 3200 MMR
- one average win/loss on Ladder: +16 / -16 MMR
TIME Filter: only between 1 Jan 1970 00:00:00 GMT - 12 Jul 2012 16:52:47 GMT
Average MMR per Race
Race account count: 15814
Data average MMR: 1539.46
Difference in average MMR per Matchup: T-P: -62.14 T-Z: -117.03 P-Z: -54.89
Average Win-ratio per Race
TvP 50.43 Games: 6700
TvZ 46.7 Games: 8118
PvZ 51.61 Games: 9189
Win-ratio per Race over Game-Time
TvP
gamelength,%race1 win,%race2 win,%of games
0,44.9,55.1,3.66
5,40.71,59.29,13.9
10,58.32,41.68,24.21
15,59.7,40.3,24.78
20,45.72,54.28,18.31
25,37.79,62.21,9.16
30,35.04,64.96,3.49
35,46.71,53.29,2.49
TvZ
gamelength,%race1 win,%race2 win,%of games
0,37.13,62.87,3.78
5,33.78,66.22,9.15
10,46.91,53.09,15.96
15,52.51,47.49,22.12
20,47.88,52.12,22.9
25,44.36,55.64,14.3
30,50.0,50.0,6.65
35,48.08,51.92,5.12
PvZ
gamelength,%race1 win,%race2 win,%of games
0,47.38,52.62,4.57
5,38.3,61.7,11.39
10,59.72,40.28,25.07
15,50.17,49.83,25.36
20,49.97,50.03,17.34
25,53.21,46.79,9.14
30,51.0,49.0,4.37
35,58.89,41.11,2.75
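As a consistency check on the tables above, the per-bucket win rates weighted by each bucket's share of games should reproduce the overall win ratio per matchup. The sketch below copies the posted numbers; reading the first column as the start of a 5-minute game-length bucket is an assumption.

```python
# (game length bucket, % wins for the first race, % of all games in the matchup)
win_by_length = {
    "TvP": [(0, 44.9, 3.66), (5, 40.71, 13.9), (10, 58.32, 24.21), (15, 59.7, 24.78),
            (20, 45.72, 18.31), (25, 37.79, 9.16), (30, 35.04, 3.49), (35, 46.71, 2.49)],
    "TvZ": [(0, 37.13, 3.78), (5, 33.78, 9.15), (10, 46.91, 15.96), (15, 52.51, 22.12),
            (20, 47.88, 22.9), (25, 44.36, 14.3), (30, 50.0, 6.65), (35, 48.08, 5.12)],
    "PvZ": [(0, 47.38, 4.57), (5, 38.3, 11.39), (10, 59.72, 25.07), (15, 50.17, 25.36),
            (20, 49.97, 17.34), (25, 53.21, 9.14), (30, 51.0, 4.37), (35, 58.89, 2.75)],
}

# Overall win ratios as posted under "Average Win-ratio per Race".
posted_overall = {"TvP": 50.43, "TvZ": 46.7, "PvZ": 51.61}

overall = {}
for matchup, rows in win_by_length.items():
    total_share = sum(share for _, _, share in rows)  # close to 100, up to rounding
    overall[matchup] = sum(win * share for _, win, share in rows) / total_share
    print(f"{matchup}: weighted {overall[matchup]:.2f} vs posted {posted_overall[matchup]}")
```

The weighted values agree with the posted overall ratios to within rounding, which suggests the per-length table and the overall table come from the same set of games. It also shows how an overall ratio near 50% can hide large swings between short and long games.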
|
This is fantastic work, well done.
I'd just like to make the (obvious) point that the concept of an 'instantaneous balance' is a bad one that should be ignored. As skeldark has said many times, one of the ways to detect imbalance is to track the MMR of the player base over time - I'd argue that this is the only reasonable way to do it. A sufficiently large sample of games determined in a small time period is rather meaningless for the 'balance' of a game, especially due to the competitive nature and the way balance is completely tied to perception.
To give an example, if you had built a sample of games in the month following the NASL season 1 final, you probably would've seen an 'imbalance' in TvP - players of those races that had equal MMRs before Puma unveiled 1/1/1 would not have a 50% winrate once 1/1/1 became common. As such there would be a short term spike in TvP winrates, and the Protoss average MMR would drop until this winrate normalised to some extent. This would produce a corresponding rise in PvZ winrates as Protoss players are getting matched against zergs with a lower MMR than they're used to facing and nothing significant has changed in the matchup.
As such, a development in the TvP matchup influences PvZ winrates, and this happens fairly consistently at all MMR ranges (with the possible exception of the bottom-end MMR range). The only way you can distinguish the development of 1/1/1 from 'imbalance' in PvZ is by monitoring the MMRs over a sufficiently long time.
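To make the knock-on effect concrete, here is a rough sketch. It assumes an Elo-style logistic win curve with a scale of 400 and a 50-point rating drop; both are assumptions picked purely for illustration, since the ladder's real MMR formula is not public and is known not to be Elo. The point is only the direction of the effect: a Protoss whose PvZ skill has not changed, but whose rating was pushed down by TvP losses, gets matched against weaker Zergs and wins more than half of those games.

```python
def expected_win(skill_gap, scale=400.0):
    """Win probability for a player whose true skill exceeds the opponent's by skill_gap.

    Uses a logistic (Elo-style) curve. This is an illustrative assumption,
    not the ladder's actual formula.
    """
    return 1.0 / (1.0 + 10.0 ** (-skill_gap / scale))


# Hypothetical number: TvP losses have pushed this Protoss player's rating
# 50 points below their (unchanged) PvZ skill.
rating_drop = 50.0

# Before the shift, matchmaking pairs them with Zergs of equal skill.
pvz_before = expected_win(0.0)
# After the shift, their Zerg opponents are 50 points weaker than they are.
pvz_after = expected_win(rating_drop)

print(f"PvZ win chance before: {pvz_before:.3f}, after: {pvz_after:.3f}")
```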
Furthermore, does this mean the game is 'imbalanced'? Not even remotely. 1/1/1 was eventually solved without significant patching (immortal range is the only really important change), but before the solution was found no one could claim to know a solution would be found, so how could we comment on balance? Well, we couldn't at the time... we needed to let games be played over a long enough period; only then, after months and months of 1/1/1 dominance, could we possibly have concluded that that particular 'strategy' was overpowered.
But the crucial point is that this works in the other direction as well. Let's assume that all 3 races have a player base with identical MMR distributions and all matchups have a 50-50 winrate. This doesn't mean the game is 'balanced' - someone might think up a strategy that causes one race to gain a significant advantage and is never overcome. Thus to determine 'balance' we need to be analysing a period of years, not months - a position we are now easily able to monitor thanks to skeldark's efforts.
But the main point I'm trying to make here is that balance is actually largely tied to perception and nothing more. The root of the problem lies in the fact that we're using one word to describe multiple concepts. If we say 'players of equal skill should get to where they are regardless of race choice,' we are being utterly foolish. What is meant by skill? Sheer mechanical speed? Strategising ability? On-the-fly decision making? There are so many factors that constitute 'skill' that you can't possibly settle on a general, universal definition.
In fact I'd like to explicitly make the point that it is a BAD thing if a player gets to exactly the same MMR with all three races - this is a sign of a one dimensional game. I am a person possessed of certain abilities - those abilities happen to align with the skillset required by one particular race more than the others - hence I play that race, and accept that if I switch race I will not perform as well.
If we then ignore people using balance whine as a crutch to justify their own poor performance, we can only begin to talk about balance 'at the highest levels of the game.' The beauty of the game lies in the fact that 'balance' is inseparable from the 'distribution' of human abilities. If we genuinely cared about the game being balanced, we would have to care about 'the best possible player of starcraft 2' - which would undoubtedly be a computer AI possessed of unlimited APM that we don't quite have the ability to code yet. All we truly care about is A) the perception that over a sufficient period of time all three of the races perform 'equally well' at the highest level of human ability (i.e. tournaments) and B) that active innovation is occurring.
I realise I've ranted on for quite some time and I must apologise, but +10 points if you managed to read this entire post.
|
I'd actually just like to follow up with a far more simple single statement that I believe cuts right to the point:
If you do not believe that the 'overpowered and imbalanced' race is the one you are playing then you've chosen the wrong race - balance is a function of the skill set required by a particular race matched to the corresponding distribution of skills in the human population. Your race should always feel like the 'easiest race' for *any* player at *any* skill level, or your abilities simply do not match up with those required by the race you've chosen. As such, the best way to determine 'balance' is actually just to look at the percentage of players in each race over a long period of time. As long as there are equal numbers of each race at any given bracket, you can flat out conclude the game is 'balanced' in the only meaningful sense of the word.
|
Yay, graphs.
|
On July 15 2012 20:53 skeldark wrote: Update the result with a lot of stats: Result
This is really cool.
Especially awesome to see TvZ 50/50 at 30 minutes, and PvZ 51/49 at 30 minutes.
|
Winrates on ladder are totally useless. Look at EU GM. Only 18% Terrans remaining. Maybe the winrate is still around 50%, because all the Terrans with a lower winrate have dropped down one league. Even if only Kas & Happy remain in EU GM, the winrate can be around 50%. However, this gives no clue about balance.
|
|
I believe varying average MMR between races is not an indicator of imbalance.
Here's why: imagine there is a race exactly like zerg that is only played by blind people, call it blind-zerg. This race would have a very low average MMR as blind people obviously can't play starcraft as well as sighted people. But because the blind-zerg race is the same as zerg it's no weaker than zerg. Thus a low average MMR per race doesn't necessarily imply that race is weaker.
In conclusion the statistics gathered in the study provided can't be used to make conclusions about how well starcraft 2 is balanced.
In practice the data is confounded by the fact that the average skill of players of different races is not necessarily the same. Dividing race MMR by number of players per race to determine average race MMR doesn't change this.
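The confounding argument can be written down as a toy model. The sketch below assumes that a player's observed MMR is simply their skill plus a race-strength term; that additive model and all the numbers in it are invented for illustration, not taken from the ladder data. It shows that the same gap in average MMR can come either from a weaker player pool or from a weaker race, so the average alone cannot tell the two apart.

```python
from statistics import mean


def average_mmr(player_skills, race_strength):
    """Average observed MMR under the toy model: MMR = skill + race strength."""
    return mean(skill + race_strength for skill in player_skills)


# Scenario 1: identical race strength, but blind-zerg has a weaker player pool.
zerg_avg = average_mmr([1400, 1500, 1600], race_strength=0)
blind_zerg_avg = average_mmr([400, 500, 600], race_strength=0)
gap_from_players = zerg_avg - blind_zerg_avg

# Scenario 2: identical player pools, but the second race is weaker by design.
strong_race_avg = average_mmr([1400, 1500, 1600], race_strength=0)
weak_race_avg = average_mmr([1400, 1500, 1600], race_strength=-1000)
gap_from_design = strong_race_avg - weak_race_avg

# Both scenarios produce the same difference in average MMR.
print(gap_from_players, gap_from_design)
```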
|
On July 18 2012 22:47 Rick Deckard wrote: I believe varying average MMR between races is not an indicator of imbalance.
But blind people aren't more likely to pick one race over another. It's not possible to imply that one race has worse players on it. At the highest levels everyone is competent and displays good skill, so your point isn't valid at high levels, or really at any level, because there is no bias in which race a handicapped or less-skilled player would pick.
|
It's kinda funny how the people this affects the least are the ones who care the most about it.
|
On July 18 2012 23:11 Zacsafus wrote: But blind people aren't more likely to pick one race over another. It's not possible to imply that one race has worse players on it. At the highest levels everyone is competent and displays good skill, so your point isn't valid at high levels, or really at any level, because there is no bias in which race a handicapped or less-skilled player would pick.
It's not necessary for blind people to pick one race over another for my point to be valid. I don't expect blind people to play starcraft. What I've shown is that differences in average MMR per race don't always imply imbalance, even significantly different average MMRs. This is proof by contradiction, it only requires a single counter example to disprove a rule.
Given that (my proof is valid that) differences in average MMR per race don't imply imbalance in general, they also don't per se imply imbalance in starcraft 2.
I'm not making a statement about the balance of the game at either high or low skill level. Just pointing out a logical flaw in the argument that different average race MMRs indicate imbalance. As best I can tell the author of the study has concluded that because average MMR per race is different, the game is therefore imbalanced. I believe that logic to be flawed.
|
|
OK, so I've gone back over the "find your MMR in one game" post by not_that and I am confused about a few things that represent a large leap from there to this post.
* Not_that's work makes a lot of sense, but it's quite clear (and stated very explicitly in that post) that you can only use that technique to find a master league player's MMR with respect to the 0 point of master league. How have you backed out MMRs for the other leagues, particularly considering how many different offsets there are per league and that you can't see the offsets?
* All of this work (in this post) seems to be overlooking the fact that Not_that's F function may actually be nonlinearly dependent on MMR. That is, "actual" MMR, where difference between two players' MMRs predicts a likelihood of win vs. loss, could map nonlinearly into the ladder-point-space "MMR" number that's being compared to adjusted points in not_that's F function. The result, if this were the case, would be that MMR changes after a game would not proportionally track the won or lost points, and the system would rely on MMR stabilization and the feedback between MMR and ladder points to force ladder points to reach equilibrium with MMR.
More to the point, if that were the case it would also cause the ladder-point-space MMR number not to follow a normal distribution, since it would be a normal distribution with a nonlinear mapping applied to it.
Edit: I forgot when I wrote this that lolcanoe's analysis seems to confirm that there is close to a normal distribution to these ladder-point-scale MMR numbers, so this is maybe a moot point.
* Finally, MMR is known not to be Elo, so if there are any points in any of these analyses that assume MMR to be Elo, those points are not valid. I only saw one place in Not_that's post where Elo came up, and it was along the lines of "oh, I saw blah blah blah in the data and that roughly reminds me of blah blah blah in Elo, so it's probably correct." That kind of thing is fine, though maybe not as good a reinforcement as he thinks.
* Edit: REALLY finally -- to the extent that skeldark is emphatic that there is no visible MMR cap in his data, that calls into question whatever process happened to the data before it wound up in ladder-point-scale MMR space, because we KNOW that there is such a cap. If that process is broken, all bets are off.
Bottom line is that I'm not sure that any of these issues are fatal to lolcanoe's analysis of the data set, assuming there's some answer to the first point, though skeldark's approach of generating tons of random games using some black-box code he wrote is a highly dubious way to interpret the data, and the question of why no MMR cap is visible is problematic. (Edit: It may simply be that the MMR cap does not affect enough people in the data set to be clearly evident.)
However, even with lolcanoe's confirmation that there is a modest amount of variation between races in this data set, there's still no way to tell why that is -- whether it comes from game design, player-originated biases in race choice, or simply players not having caught up to the current state of the game in their understanding of how, optimally, to play the races against each other. In that light, this whole discussion is a lot of heat and very little illumination.
|
On July 18 2012 23:52 monkybone wrote: Yes, this assumes a similar skill distribution for each race.
Author has basically just defined balance as average MMR, which doesn't give any information. But taking the skill distribution in consideration, then the MMR distribution gives evidence of balance.
OK. That makes sense then. Thank you for the clarification.
Personally I'll follow top tournament results for significant imbalance indications. I hope that Blizzard continues to improve the balance of WOL.
The win rates over time are interesting in the study. Thank you to whoever took the time to put that together.
|
On July 18 2012 23:58 Lysenko wrote: OK, so I've gone back over the "find your MMR in one game" post by not_that and I am confused about a few things that represent a large leap from there to this post.
I'll be happy to address the points in your post as well as future ones you may have.
The output of F gives what we refer to as dmmr, which is the difference between the player's MMR and the opponent's league and tier offset. For example, if player A plays opponent B, who is from an unknown diamond division tier and has 300 adjusted points before the game, and B loses 12 points, we can tell that A's dmmr is 300 ± 14 (the deviation of F for matches with a 12-point change) in relation to B's diamond tier offset. If B was in master we would know A's MMR right there and then; however, for all leagues with multiple tiers it is more complicated than that. This is why the MMR calculator requires more than a single match for players who play opponents below master before it becomes accurate.
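The single-game example above can be sketched as a small interval calculation. This is only an illustration of the arithmetic in the example, not the calculator's actual code: the function names are made up, and the deviation (14 for a 12-point change) is taken as a given input rather than derived from F.

```python
def dmmr_interval(opponent_adjusted_points, deviation):
    """Range of A's dmmr, relative to the opponent's (possibly unknown) tier offset."""
    return (opponent_adjusted_points - deviation,
            opponent_adjusted_points + deviation)


def mmr_interval(dmmr_range, tier_offset):
    """Shift a dmmr range by a known tier offset to get a range for A's MMR.

    tier_offset is hypothetical here; for opponents below master it is the
    unknown that the calculator has to infer from several games.
    """
    low, high = dmmr_range
    return (low + tier_offset, high + tier_offset)


# The example from the post: B had 300 adjusted points, deviation of 14.
low, high = dmmr_interval(300, 14)
print(low, high)  # 286 314
```

With a master-league opponent the offset is known (the post treats it as the zero point), so `mmr_interval((286, 314), 0)` gives A's MMR range directly.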
Once we have multiple consecutive matches of A against opponents below master, we can infer more about A's MMR by looking at the series as a whole. We know that A's MMR increases after every win and decreases after every loss. From this we can start making predictions of A's opponent's tiers in their leagues. It is a fairly difficult problem and it has taken Skeletor quite a while to get it right, but by the current version of the calculator the predictions are very accurate at high game count. You may dispute his methods, but for the current topic of discussion it's enough to say that the individual races of A or his opponents do not play a role whatsoever. So any potential problem which may exist in the calculation method is unbiased towards producing results that show one race as having a different MMR than the others.
Skeletor described his methods in several posts. You call the calculator a black box, but he has released the source code (which afaik got no attention).
On to next point: MMR is not ELO (adjusted ladder points are very similar to ELO however). We never claimed that it is, and it's irrelevant. We do not calculate MMR changes, we simply read MMR values and let the ladder system worry about how MMR behaves.
We discussed the surprising behavior of MMR several times and how the "uncertainty value" stored for each player is surprisingly the same for all players who have played some matches since buying the game, even for players who experience DRAMATIC MMR fluctuations. You can literally go from Bronze to Master with 90+% winratio and your uncertainty value will be the same as that of a player who is steady. However once again I emphasize it has nothing to do with current topic as we don't care about the uncertainty value or what system governs MMRs, we simply read MMR values.
Regarding the MMR cap, I haven't dug too deep into it. I looked at the #1 ranked GM earlier in the season and his adjusted points (around 820 at the time; it's 981 for #1 on Europe atm) were close to his MMR. That to me indicated that if that player is MMR capped, he's probably very close to the cap, as adjusted points and dMMR tend to get roughly close to each other after a player has played enough games without experiencing MMR shifts. For a player who is MMR capped we expect his adjusted points to go quite a bit higher than his MMR and towards his uncapped dMMR, the reason for which I explain in #1 in the next paragraph.
The reasons I haven't spent much time on this are:
1) If player A is MMR capped, that doesn't affect A whatsoever. It affects A's OPPONENTS. They receive fewer points for beating A and lose more points for losing to him. Assuming that only a few players at the top are capped, this would spread out across a large multitude of opponents, each suffering small impacts from playing against capped players. The capped player's ladder points would otherwise behave as though they are uncapped. They wouldn't notice.
2) We aren't trying to model MMR behavior. We're simply reading it. If MMR is capped - great, we don't care. Those who are capped are GMs anyway (and potentially a few high Masters who very much belong in GM but didn't get in due to Blizzard's strange qualification rules for it, i.e. Whitera who didn't get GM until near the end of the season last season). GM league has 1 tier - we easily read MMRs of people who play GMs, we don't care about the cap.
|
On July 19 2012 04:02 Not_That wrote: Skeletor described his methods in several posts. You call the calculator a black box, but he has released a source code (which afaik got no attention).
Are those posts in this thread? If not, can you link to them? (If they are, I'll go back and look again.)
I wasn't referring to anything he did to calculate league offsets as a black box, because I don't recall seeing the documentation of that. What I was referring to was the code he talked about using to simulate large numbers of imaginary games in an attempt to guess what the likelihood of a given distribution would be.
|
They're probably a bit dated now. This is the one linked in the calculator thread; that discussion happened there. The best way to understand how the calculator figures out the MMR of players below Master is to find Skeletor on TL teamspeak and have a chat with him directly. We're always glad when people show interest in the theoretical part, and this is certainly something that would benefit from more minds thinking about it, particularly when it comes to the as yet unexplored territories.
Regarding the MMR numbers and the distribution, if you consider a few wins worth of MMR difference between the races as something that falls within the ladder's MMR distribution without being statistically significant, I don't think the part about simulating large numbers of imaginary games adds something to the table that will make you change your mind. As I understand it, it assumes a distribution of MMRs. Whether or not that assumption makes sense considering the data, I propose some of the statisticians who put this thread to so much scrutiny will prove or disprove.
|
On July 18 2012 23:26 Rick Deckard wrote: It's not necessary for blind people pick one race over another for my point to be valid.
Actually, it is.
On July 18 2012 23:26 Rick Deckard wrote: This is proof by contradiction, it only requires a single counter example to disprove a rule.
What an awful, awful way to rehash the causation-correlation concerns.
On July 18 2012 23:26 Rick Deckard wrote: Given that (my proof is valid that) differences in average MMR per race doesn't imply imbalance in general, it also doesn't per se imply imbalance in starcraft 2.
If your "proof" had been valid, you would've been led to no conclusion, and no conclusion alone. In this particular situation, the opposite of your statement is true - differences in average MMR per race would GENERALLY suggest imbalance, however there are specific instances where causation is a concern. Keep in mind that without a time-based model, most statistical tests leave causation-correlation debates to reasonable examination. Likewise, there are concerns about sampling, MMR measurement, normality, and applicability to the highest tier of play.
Op has already noted that most players sampled have high average MMR, to lessen concerns that the MMR differences were centered around race choices of players who were new to the game, which was the predominant causation-correlation concern. Besides this concern, there seems to be no reasonable account to support the idea that there is a strong race selection-bias at hand here.
|
On July 20 2012 14:39 lolcanoe wrote: Op has already noted that most players sampled have high average MMR to lessen concerns that the MMR differences was centered around race choices of players who were new to the game, which was the predominant causation-correlation concern. Besides this concern, there seems to be no reasonable account to support the idea that there is a strong race selection-bias at hand here.
Thing is, I don't see any particular reason to expect that the impact of new-to-the-game players favoring Terran wouldn't trail off measurably all the way to master league. After all, a small but nonzero number of players get quite good at the game very fast.
Edit: Also, remember that that mechanism was just as much a factor in 2010 as today, so players who went through that process then have had plenty of time to improve their play. An excess of Terrans as one goes farther down in the distribution may simply be an echo of choices made two years ago and the fact that race choices tend to be somewhat "sticky" since people tend to like playing what they know.
|
On July 18 2012 22:47 Rick Deckard wrote: I believe varying average MMR between races is not an indicator of imbalance.
Your analogy is simply implying that every Z player is better than every T player. While, statistically speaking, this isn't impossible, it's very arrogant to even assume it a possibility.
EDIT: Basically, what lolcanoe stated.
|
Updated the data:
MMR Filter: Above Master TIME Filter: 1 Jul 2012 00:00:00 GMT - 31 Jul 2012 23:59:59 GMT T: -15.77 Z: -0.77 P: 12.23
MMR Filter: Above Master TIME Filter: 1 Aug 2012 00:00:00 GMT - 11 Aug 2012 11:47:54 GMT T: -15.24 Z: -7.24 P: 17.76
MMR Filter: No TIME Filter: 1 Jul 2012 00:00:00 GMT - 31 Jul 2012 23:59:59 GMT T: -45.24 Z: 28.76 P: 6.76
MMR Filter: No TIME Filter: 1 Aug 2012 00:00:00 GMT - 11 Aug 2012 11:47:54 GMT T: -46.82 Z: 23.18 P: 14.18
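To put these differences in perspective, they can be expressed in ladder wins, using the figure from the original post that one average win or loss moves MMR by about 16. The sketch below does this for the July rows; the function name and the dictionary layout are only illustrative.

```python
MMR_PER_WIN = 16.0  # average MMR change for one ladder win/loss, per the first post

# Difference to average MMR per race, July 2012, copied from the update above.
above_master_july = {"T": -15.77, "Z": -0.77, "P": 12.23}
no_filter_july = {"T": -45.24, "Z": 28.76, "P": 6.76}


def in_wins(diff_by_race):
    """Convert MMR differences into the equivalent number of average wins."""
    return {race: round(diff / MMR_PER_WIN, 2) for race, diff in diff_by_race.items()}


print("Above Master:", in_wins(above_master_july))
print("No MMR filter:", in_wins(no_filter_july))
```

Read this way, the above-Master Terran deficit in July is roughly one ladder game's worth of MMR, while without the MMR filter it is closer to three.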
|
On August 11 2012 21:23 skeldark wrote: Updated the data:
Interesting. Terran is UP according to the data. But now when they get faster Ravens Terran will dominate for sure.
|
On August 11 2012 21:31 MockHamill wrote: Interesting. Terran is UP according to the data. But now when they get faster Ravens Terran will dominate for sure.
I know you're trolling, I'm not sure which group you target... damn.. time to upgrade my sarcasm detector.
|
Is above Master, GM only, or inclusive of all Master and above?
|
No wonder I've been so mad on ladder lately :/
|
On August 13 2012 18:40 Whirligig wrote: Is above Master, GM only, or inclusive of all Master and above?
All Master and above. Average MMR in GM would not tell you much. The result would change if just one top player took a ladder break.
There are just too few players. If I tracked everyone who is near GM level, that would be roughly 1k accounts. But I don't have every account in my database, so there is too little data...
|
On July 18 2012 23:26 Rick Deckard wrote: It's not necessary for blind people to pick one race over another for my point to be valid. I don't expect blind people to play starcraft. What I've shown is that differences in average MMR per race don't always imply imbalance, even significantly different average MMRs. This is proof by contradiction, it only requires a single counter example to disprove a rule. Given that (my proof is valid that) differences in average MMR per race don't imply imbalance in general, they also don't per se imply imbalance in starcraft 2. I'm not making a statement about the balance of the game at either high or low skill level. Just pointing out a logical flaw in the argument that different average race MMRs indicate imbalance. As best I can tell the author of the study has concluded that because average MMR per race is different, the game is therefore imbalanced. I believe that logic to be flawed.
Your proof is not valid and you saying it is doesn't make it so...
|
On August 13 2012 22:54 SupLilSon wrote:
On July 18 2012 23:26 Rick Deckard wrote:
On July 18 2012 23:11 Zacsafus wrote:
On July 18 2012 22:47 Rick Deckard wrote:
I believe varying average MMR between races is not an indicator of imbalance.
Here's why: imagine there is a race exactly like Zerg that is only played by blind people; call it blind-zerg. This race would have a very low average MMR, as blind people obviously can't play StarCraft as well as sighted people. But because the blind-zerg race is the same as Zerg, it's no weaker than Zerg. Thus a low average MMR per race doesn't necessarily imply that the race is weaker.
In conclusion, the statistics gathered in the study provided can't be used to draw conclusions about how well StarCraft 2 is balanced.
In practice the data is confounded by the fact that the average skill of players of different races is not necessarily the same. Dividing race MMR by number of players per race to determine average race MMR doesn't change this.
But blind people aren't more likely to pick one race over another. It's not possible to imply that one race has worse players on it. At the highest levels everyone is competent and displays good skill, so your point isn't valid at high levels, or really at any level, because there is no bias in which race a handicapped or less-skilled player would pick.
It's not necessary for blind people to pick one race over another for my point to be valid. I don't expect blind people to play StarCraft. What I've shown is that differences in average MMR per race don't always imply imbalance, even significantly different average MMRs. This is proof by contradiction; it only requires a single counterexample to disprove a rule. Given that (my proof is valid that) differences in average MMR per race don't imply imbalance in general, they also don't per se imply imbalance in StarCraft 2. I'm not making a statement about the balance of the game at either high or low skill level, just pointing out a logical flaw in the argument that different average race MMRs indicate imbalance. As best I can tell, the author of the study has concluded that because average MMR per race is different, the game is imbalanced. I believe that logic to be flawed.
Your proof is not valid and you saying it is doesn't make it so...
Rick Deckard makes the mistake of redefining balance. If his example is true and only blind people play race X, and this makes race X have a lower MMR, then the game is imbalanced. Race X is weaker on average than the other races!
The REASON for the imbalance of the data is not game design in this case!
Notice that I make clear 100 times in the OP that I argue about the imbalance and not the REASON.
What these factors are you cannot tell without big studies on the topic, and even then it's more sociology than math. His theory about blind people is a valid argument. But it is obviously not the reason.
But I think he wants to show that imbalance doesn't have to come from game design, and this is true. However, a different reason for something doesn't change the reality of it.
You are free to argue about the reason; however, because the big sociology study on this topic is missing, you will not come to a conclusion.
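The point that the same numbers can come from different causes can be made concrete with a toy calculation. This is only a sketch under stated assumptions: it assumes the "difference to average MMR" figures are each race's mean MMR minus the mean over all accounts, and it uses a made-up model in which observed MMR = player skill + race bonus. The function names, the races X and Y, and all numbers are hypothetical and are not taken from the program or the ladder data.

```python
from statistics import mean


def difference_to_average(mmr_by_race):
    """Per-race mean MMR minus the mean MMR over all accounts (assumed definition)."""
    all_mmr = [m for mmrs in mmr_by_race.values() for m in mmrs]
    overall = mean(all_mmr)
    return {race: mean(mmrs) - overall for race, mmrs in mmr_by_race.items()}


def ladder(skill_by_race, race_bonus):
    """Toy model (assumption): observed MMR = player skill + race bonus."""
    return {
        race: [skill + race_bonus[race] for skill in skills]
        for race, skills in skill_by_race.items()
    }


# Scenario A ("social"): both races are equally strong by design,
# but weaker players happen to pick race X.
social = ladder(
    {"X": [1400, 1500, 1600], "Y": [1500, 1600, 1700]},
    {"X": 0, "Y": 0},
)

# Scenario B ("design"): both player pools are equally skilled,
# but playing race X costs 100 MMR by design.
design = ladder(
    {"X": [1500, 1600, 1700], "Y": [1500, 1600, 1700]},
    {"X": -100, "Y": 0},
)

diff_social = difference_to_average(social)
diff_design = difference_to_average(design)
print(diff_social)
print(diff_design)
```

Both scenarios give the same per-race differences, so in this toy model the measured imbalance in the data is real in either case, while the data alone cannot say which cause produced it.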
|
On August 13 2012 23:03 skeldark wrote:
Rick Deckard makes the mistake of redefining balance. If his example is true and only blind people play race X, and this makes race X have a lower MMR, then the game is imbalanced. Race X is weaker on average than the other races! The REASON for the imbalance of the data is not game design in this case! Notice that I make clear 100 times in the OP that I argue about the imbalance and not the REASON.
What these factors are you cannot tell without big studies on the topic, and even then it's more sociology than math. His theory about blind people is a valid argument. But it is obviously not the reason. But he is right that imbalance doesn't have to come from game design. I think that is the point he wants to make. However, a different reason for something doesn't change the reality of it.
Yea, I understood the gist of his point, that MMR differences do not necessarily correlate directly with imbalance, that other factors may be at work. But the example of blind people favoring Z is pretty ridiculous.
|
On August 13 2012 23:08 SupLilSon wrote:
Yea, I understood the gist of his point, that MMR differences do not necessarily correlate directly with imbalance, that other factors may be at work. But the example of blind people favoring Z is pretty ridiculous.
It is. But he takes an extreme point to show the existence of a group of points. That's a valid argumentation. What he did wrong is: he said he proved that this doesn't show balance, but what he really proved is: "The reason for data-balance is not game design in 100% of cases." A point I mentioned in the OP already.
However, the main point is that ANY way to calculate imbalance has this problem. There is NO way to be 100% sure the reason for the imbalance comes from the game design.
But we just have to act as if this is the case, because otherwise there would be no point in balancing the game.
|
On August 13 2012 23:17 skeldark wrote:
It is. But he takes an extreme point to show the existence of a group of points. That's a valid argumentation. What he did wrong is: he said he proved that this doesn't show balance, but what he really proved is: "The reason for data-balance is not game design in 100% of cases." A point I mentioned in the OP already. However, the main point is that ANY way to calculate imbalance has this problem. There is NO way to be 100% sure the reason for the imbalance comes from the game design. But we just have to act as if this is the case, because otherwise there would be no point in balancing the game.
Skeldark, you do realise that more than half of the discussions you end up in in this thread come from you using the word "imbalance" in a different way than everyone else on this forum?
Here on TL, when people talk about imbalance, they refer to game design. So, as I told you several times already, if you want to avoid discussions like the one I am quoting, you need to be MUCH clearer about what you mean when you use that word. Or better, use the word the same way everyone else does, and use a different term for what you mean by imbalance, like "race-dependent MMR distribution", which is a much more transparent term. By typing a few more letters for that, you will save yourself 100 times as many by avoiding replies like the ones quoted above.
I don't care if your definition makes sense or not; it is just a matter of communication. And it is not effective communication to start using a word differently from everyone else.
But if you enjoy ending up in this discussion over and over again, go ahead.
|
On August 14 2012 08:30 Cascade wrote:
Skeldark, you do realise that more than half of the discussions you end up in in this thread come from you using the word "imbalance" in a different way than everyone else on this forum? Here on TL, when people talk about imbalance, they refer to game design. So, as I told you several times already, if you want to avoid discussions like the one I am quoting, you need to be MUCH clearer about what you mean when you use that word. Or better, use the word the same way everyone else does, and use a different term for what you mean by imbalance, like "race-dependent MMR distribution", which is a much more transparent term. By typing a few more letters for that, you will save yourself 100 times as many by avoiding replies like the ones quoted above.
I don't care if your definition makes sense or not; it is just a matter of communication. And it is not effective communication to start using a word differently from everyone else. But if you enjoy ending up in this discussion over and over again, go ahead.
I use it like everyone else; all I do is look deeper into it.
What most people, you included, did not understand is: I explain that data imbalance doesn't have to be game design. However, it's pointless to assume anything else. This is valid for every game and every method of detecting imbalance.
This problem has nothing to do with this thread or my data. I just pointed out that you can never know the reason for sure. Other threads did not point this out, even though it's valid for everything: Blizzard patches, tournament results, etc. People just started to realise that and now act like it's related to the method I use or my data.
TLDR: data balance doesn't have to be game design. If it's not, there is no point in balancing a game. So we have to assume data imbalance = game design imbalance if we want to try to balance it.
PS: but yes, I'm tired of this discussion
|
On August 14 2012 10:30 skeldark wrote:
I use it like everyone else; all I do is look deeper into it. What most people, you included, did not understand is: I explain that data imbalance doesn't have to be game design. However, it's pointless to assume anything else. This is valid for every game and every method of detecting imbalance. This problem has nothing to do with this thread or my data. I just pointed out that you can never know the reason for sure. Other threads did not point this out, even though it's valid for everything: Blizzard patches, tournament results, etc. People just started to realise that and now act like it's related to the method I use or my data. TLDR: data balance doesn't have to be game design. If it's not, there is no point in balancing a game. So we have to assume data imbalance = game design imbalance if we want to try to balance it. PS: but yes, I'm tired of this discussion
The reason (one of the reasons) you (and I) are tired of this discussion is that I told you the exact same thing earlier in the thread, and at that point you acknowledged that maybe it was a poor choice of word, but you seem to have forgotten that by now.
I understand perfectly what you did. We discussed it for ages earlier in the thread, and I calculated the errors for you from your data file. Remember? 
What you (seem to) fail to understand is that when most of the people on TL use (or read) the word "imbalance" in this context, they refer to a design flaw. You know of all the "protoss imba" threads. They are not talking about sc2 newbs tending to choose terran, causing a tilted MMR distribution. They talk about flawed game play causing some races to be easier to win with at the very highest level.
You are talking about imbalance in your data (different average MMR), which as you said, may or may not be a signal of a flawed design.
Due to these two different uses of the word "imbalance", when you talk about "proving imbalance" and so on, referring to imbalance in your data, many of the TLers skimming through your text will understand it as if you are saying that you have proven that there is a design flaw in sc2. I know that is not what you mean, but I'm telling you that that is what many understand from what they read. Which is why you end up in a lot of needless discussions. So if you are tired of discussions, it may be worth the effort to use a notation that will not be misunderstood. 
I'm not saying your choice of word is technically wrong, but it is begging for misunderstandings in this context, on this forum.
anyway. cheers, gl.
|
On August 14 2012 12:59 Cascade wrote:Show nested quote +On August 14 2012 10:30 skeldark wrote:On August 14 2012 08:30 Cascade wrote:On August 13 2012 23:17 skeldark wrote:On August 13 2012 23:08 SupLilSon wrote:On August 13 2012 23:03 skeldark wrote:On August 13 2012 22:54 SupLilSon wrote:On July 18 2012 23:26 Rick Deckard wrote:On July 18 2012 23:11 Zacsafus wrote:On July 18 2012 22:47 Rick Deckard wrote: I believe varying average MMR between races is not an indicator of imbalance.
Here's why, imagine there is a race exactly like zerg but is only be played by blind people, call it blind-zerg. This race would have a very low average MMR as blind people obviously can't play starcraft as well as sighted people. But because the blind-zerg race is the same as zerg it's no weaker than zerg. Thus a low average MMR per race doesn't necessarily imply that race is weaker.
In conclusion the statistics gathered in the study provided can't be used to make conclusions about how well starcraft 2 is balanced.
In practice the data is confounded by the fact that the average skill of of players of different races is not necessarily the same. Dividing race MMR by number of players per race to determine average race MMR doesn't change this. But blind people aren't more likely to pick one race over another, its not possible to imply that one race has worse players on it, at the highest levels everyone is competent and displays good skill so your point isnt valid at high levels, or really at any level because there is no bias between the races of which a handicapped/less-skilled player would pick It's not necessary for blind people pick one race over another for my point to be valid. I don't expect blind people to play starcraft. What I've shown is that differences in average MMR per race doesn't always imply imbalance, even significantly different average MMRs. This is proof by contradiction, it only requires a single counter example to disprove a rule. Given that (my proof is valid that) differences in average MMR per race doesn't imply imbalance in general, it also doesn't per se imply imbalance in starcraft 2. I'm not making a statement about the balance of the game at either high or low skill level. Just pointing out a logical flaw in the argument that different average race MMRs indicate imbalance. As best I can tell the author of the study has concluded because average MMR per race is different therefore the game is imbalance. I believe that logic to be flawed. Your proof is not valid and you saying it is doesn't make it so... Rick Deckard makes the mistake of redefining balance. If his example is true and only blind people play race x and this would make race x have lower mmr. Than the game is imbalanced. Race x is weaker in average than the other races! The REASON for the imbalance of the data is not because of game-design in this case!Notice that i make clear 100 times in op that i argue about the imbalance and not the REASON. 
What these factors are you cannot tell without big studies on the topic, and even then it's more sociology than math. His theory about blind people is a valid argument, but it is obviously not the reason. He is right that imbalance doesn't have to come from game design; I think that is the point he wants to make. However, a different reason for something doesn't change the reality of it.
Yea, I understood the gist of his point: that MMR differences do not necessarily correlate directly with imbalance, and that other factors may be at work. But the example of blind people favoring Z is pretty ridiculous.
It is. But he takes an extreme point to show the existence of a group of points. That's valid argumentation. What he did wrong is this: he said he proved that this doesn't show balance, but what he really proved is: "The reason for data imbalance is not game design in 100% of cases." A point I mentioned in the OP already. However, the main point is that ANY way to calculate imbalance has this problem. There is NO way to be 100% sure the reason for the imbalance comes from the game design. But we have to act as if this is the case, because otherwise there would be no point in balancing the game.
Skeldark, you do realise that more than half of the discussion you end up in in this thread comes from you using the word "imbalance" in a different way than everyone else on this forum? Here on TL, when people talk about imbalance, they refer to game design. So as I told you several times already, if you want to avoid discussions like the one I am quoting, you need to be MUCH clearer about what you mean when you use that word. Or better, use the word the same way everyone else does, and use a different term for what you mean by imbalance, like "race dependent MMR distribution", which is much more transparent. By typing a few more letters, you will save yourself 100 times that in avoiding replies like the ones quoted above.
I don't care if your definition makes sense or not; it is just a matter of communication. And it is not effective communication to start using a word differently from everyone else. But if you enjoy ending up in this discussion over and over again, go ahead.
I use it like everyone else; all I do is look deeper into it. What most people, you included, did not understand is this: I explain that data imbalance doesn't have to come from game design. However, it's pointless to assume anything else. This is valid for every game and every method of detecting imbalance. This problem has nothing to do with this thread or my data. I just pointed out that you can never know the reason for sure. Other threads did not point this out, even though it's valid for everything: Blizzard patches, tournament results, etc. People have just started to realise that and now act as if it's related to the method I use or my data. TL;DR: data imbalance doesn't have to be game design. If it's not, there is no point in balancing a game. So we have to assume data imbalance = game design imbalance if we want to try to balance it. PS: but yes, I'm tired of this discussion.
The reason (one of the reasons) you (and I) are tired of this discussion is that I told you the exact same thing earlier in the thread, and at that point you acknowledged that maybe it was a poor choice of word, but you seem to have forgotten that by now. I understand perfectly what you did. We discussed it for ages earlier in the thread, and I calculated the errors for you from your data file. Remember? What you (seem to) fail to understand is that when most of the people on TL use (or read) the word "imbalance" in this context, they refer to a design flaw. You know all the "protoss imba" threads. They are not talking about SC2 newbs tending to choose Terran, causing a tilted MMR distribution. They talk about flawed gameplay making some races easier to win with at the very highest level.
You are talking about imbalance in your data (different average MMR), which, as you said, may or may not be a signal of a flawed design. Due to these two different uses of the word "imbalance", when you talk about "proving imbalance" and so on, referring to imbalance in your data, many of the TLers skimming through your text will understand it as if you are saying that you have proven that there is a design flaw in SC2. I know that is not what you mean, but I'm telling you that that is what many understand from what they read, which is why you end up in a lot of needless discussions. So if you are tired of discussions, it may be worth the effort to use a notation that will not be misunderstood. I'm not saying your choice of word is technically wrong, but it is begging for misunderstandings in this context, on this forum. Anyway, cheers, gl.
Sure, I remember, and I did before I wrote the last post. I know that I agreed with you that it was a poor choice of words, but I have changed my opinion in some ways. Most people use it to mean a design flaw because they just think: data imbalance = design imbalance. If the top 16 of all tournaments for 10 years are Terran, this is an imbalance in the data, but it doesn't have to come from game design. (An extreme example: perhaps the rules of the tournaments only allow Terran.)
BUT: My mistake was to mix this one little point about game design with the main topic, publishing my data. In the OP there is no "proven imbalance" any more. It is more a "take the data as what it is; if you don't understand it, I don't care" ^^
I kind of regret the whole thing by now. TL is just the wrong platform for this kind of post. This is by no means aimed at you.
|
Just found this thread... great work. But I couldn't really find out how you calculate the MMR.
But I love the data and the unbiased presentation of it; it shows how close everything is.
|
Skeldark, when WOW first introduced battlegrounds (yes, ~7 years ago), Horde was winning these battlegrounds significantly more across all servers. But this wasn't because the game mechanics favored Horde in any way. It just so happened that "hardcore" players were significantly more likely to pick Horde and "casuals" more likely to pick Alliance. It's definitely very possible that the same could be happening in SC2; one race might be more likely to draw in players of weaker skill level. This would cause the average MMR of that race to be lower, even though it's actually balanced.
Some data I would LOVE to see: average win rates for Random players, broken down by race. Eg: What is the win percentage for random players when they spawn as Terran? What is the win percentage for random players when they spawn as Zerg?
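For anyone who wants to try this on their own replay data, here is a minimal Python sketch of the tally being asked for. The record layout (a spawned race plus a win flag for each game played by a random-race account) and the name `random_win_rates` are assumptions for illustration only; the thread does not say how the MMR-Stats data is stored, and the sample records are made up.

```python
from collections import defaultdict


def random_win_rates(games):
    """Return the win rate per spawned race.

    `games` is an iterable of (spawned_race, won) pairs, one per game
    played by a random-race account. This layout is hypothetical.
    """
    wins = defaultdict(int)
    total = defaultdict(int)
    for race, won in games:
        total[race] += 1
        if won:
            wins[race] += 1
    return {race: wins[race] / total[race] for race in total}


# Made-up sample records, NOT real ladder data.
sample = [
    ("T", True), ("T", False), ("T", False),
    ("Z", True), ("Z", True), ("Z", False),
    ("P", True), ("P", False),
]
rates = random_win_rates(sample)
for race in sorted(rates):
    print(race, round(rates[race], 3))
```

With only a handful of random players in a sample, the per-race rates would be very noisy, so any real run would also need the game counts shown next to the percentages.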
|
On August 15 2012 08:48 whacks wrote: Skeldark, when WOW first introduced battlegrounds (yes, ~7 years ago), Horde was winning these battlegrounds significantly more across all servers. But this wasn't because the game mechanics favored Horde in any way. It just so happened that "hardcore" players were significantly more likely to pick Horde and "casuals" more likely to pick Alliance. It's definitely very possible that the same could be happening in SC2; one race might be more likely to draw in players of weaker skill level. This would cause the average MMR of that race to be lower, even though it's actually balanced.
Some data I would LOVE to see: average win rates for Random players, broken down by race. Eg: What is the win percentage for random players when they spawn as Terran? What is the win percentage for random players when they spawn as Zerg?
If one race draws weaker players, the race is obviously weaker than the others! So the game is imbalanced. In this case, Blizzard would try to buff this race to make the game balanced again. Please read the posts on this page; we just discussed this topic. To the casual argument: the first dataset shows master players only.
Just edited the op:
This is the difference to the average MMR of my ladder data, per race. No more, no less. You cannot see from any statistical game data whether the reason is game design or social aspects. Not you, not me, not Blizzard, not a single game designer! So we have the choice of paying for a global sociology study to find it out (if you can call it this way in sociology ^^) or just ASSUME it comes from game design like every game company does.
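To make "difference to the average MMR per race" concrete, here is a small Python sketch of one way such a number could be computed: take the mean MMR over all accounts, then subtract it from each race's mean. This is an assumption about the calculation, not the program's actual code, and the accounts below are invented.

```python
from statistics import mean


def mmr_difference_per_race(accounts):
    """Return {race: race mean MMR minus overall mean MMR}.

    `accounts` is a list of (race, mmr) pairs, one per account.
    Assumed layout, for illustration only.
    """
    overall = mean(mmr for _, mmr in accounts)
    by_race = {}
    for race, mmr in accounts:
        by_race.setdefault(race, []).append(mmr)
    return {race: mean(values) - overall for race, values in by_race.items()}


# Invented accounts, NOT the ladder data from this thread.
sample = [
    ("T", 2000), ("T", 2100),
    ("Z", 2200), ("Z", 2100),
    ("P", 2300), ("P", 2200),
]
diffs = mmr_difference_per_race(sample)
print(diffs)
```

Because the overall mean is weighted by how many accounts each race has, the three differences only sum to zero when the races have equal account counts, as they do in this toy sample.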
To your random data: there are only a few random players, and this would be a lot of work only to find out which race is the favourite of most random players. I hope you don't think that the favourite of randoms proves it is the strongest race for everyone because of game design. You started with social reasons. I think this example shows them perfectly.
Random players play the game totally differently from all others. My first runs with random come to the result that random players have an MMR about 100 to 200 lower than players of any single race.
Either random is the "race" that draws the players with less skill, or it is much harder to play several races. They start with an automatic advantage in SC2 and have no in-game disadvantages because of race match-ups (those would even out within a few games). So in this special case, it is even proven that the average MMR difference must come from a social reason.
PS: Sorry, even I find this part hard to read. But I don't know how to explain it differently with my limited English.
|
On August 15 2012 08:48 whacks wrote: Skeldark, when WOW first introduced battlegrounds (yes, ~7 years ago), Horde was winning these battlegrounds significantly more across all servers. But this wasn't because the game mechanics favored Horde in any way. It just so happened that "hardcore" players were significantly more likely to pick Horde and "casuals" more likely to pick Alliance. It's definitely very possible that the same could be happening in SC2; one race might be more likely to draw in players of weaker skill level. This would cause the average MMR of that race to be lower, even though it's actually balanced.
Some data I would LOVE to see: average win rates for Random players, broken down by race. Eg: What is the win percentage for random players when they spawn as Terran? What is the win percentage for random players when they spawn as Zerg?
The data is more biased towards higher skill levels, ie. not "casuals". Please read the original post and stop posting the first thing that comes to your mind.
|
On August 15 2012 11:42 plogamer wrote: On August 15 2012 08:48 whacks wrote: Skeldark, when WOW first introduced battlegrounds (yes, ~7 years ago), Horde was winning these battlegrounds significantly more across all servers. But this wasn't because the game mechanics favored Horde in any way. It just so happened that "hardcore" players were significantly more likely to pick Horde and "casuals" more likely to pick Alliance. It's definitely very possible that the same could be happening in SC2; one race might be more likely to draw in players of weaker skill level. This would cause the average MMR of that race to be lower, even though it's actually balanced.
Some data I would LOVE to see: average win rates for Random players, broken down by race. Eg: What is the win percentage for random players when they spawn as Terran? What is the win percentage for random players when they spawn as Zerg? The data is more biased towards higher skill levels, ie. not "casuals". Please read the original post and stop posting the first thing that comes to your mind.
You do realize that skeldark is the OP right? He made this algorithm and made the entire calculation. I think he knows a bit more than u do....
|
On August 15 2012 11:47 CaptainCrush wrote: On August 15 2012 11:42 plogamer wrote: On August 15 2012 08:48 whacks wrote: Skeldark, when WOW first introduced battlegrounds (yes, ~7 years ago), Horde was winning these battlegrounds significantly more across all servers. But this wasn't because the game mechanics favored Horde in any way. It just so happened that "hardcore" players were significantly more likely to pick Horde and "casuals" more likely to pick Alliance. It's definitely very possible that the same could be happening in SC2; one race might be more likely to draw in players of weaker skill level. This would cause the average MMR of that race to be lower, even though it's actually balanced.
Some data I would LOVE to see: average win rates for Random players, broken down by race. Eg: What is the win percentage for random players when they spawn as Terran? What is the win percentage for random players when they spawn as Zerg?
The data is more biased towards higher skill levels, ie. not "casuals". Please read the original post and stop posting the first thing that comes to your mind.
You do realize that skeldark is the OP right? He made this algorithm and made the entire calculation. I think he knows a bit more than u do....
He quoted whacks, not me. I think he posted while I was writing my answer and did not see it. @plogamer if you answer everyone who doesn't read the full OP like this, you have a lot to do on this website.
@CaptainCrush Just to clear this up: I did not do the algorithm alone. The dmmr calculation and the tier offsets come from Not_That. I did the tier analyser (the algorithm that tries to find out which tier a player is in) and the software program.
|
On August 15 2012 11:42 plogamer wrote: On August 15 2012 08:48 whacks wrote: Skeldark, when WOW first introduced battlegrounds (yes, ~7 years ago), Horde was winning these battlegrounds significantly more across all servers. But this wasn't because the game mechanics favored Horde in any way. It just so happened that "hardcore" players were significantly more likely to pick Horde and "casuals" more likely to pick Alliance. It's definitely very possible that the same could be happening in SC2; one race might be more likely to draw in players of weaker skill level. This would cause the average MMR of that race to be lower, even though it's actually balanced.
Some data I would LOVE to see: average win rates for Random players, broken down by race. Eg: What is the win percentage for random players when they spawn as Terran? What is the win percentage for random players when they spawn as Zerg? The data is more biased towards higher skill levels, ie. not "casuals". Please read the original post and stop posting the first thing that comes to your mind.
Easy on the nerd rage there 
What you and skeldark don't seem to be getting is that there is a self selection bias in your analysis. This self selection bias could introduce a systematic error that is prevalent at all levels of the game and cannot be eliminated by averaging large sample sizes. In the wow example that I gave, I was referring to those who hit level 60 within a few months... They spent hours playing the game everyday. And yet, even within them, the higher skilled players had a bias towards one race. This had nothing to do with design imbalance since the two races were almost exactly identical.
This is why any serious study in a peer-reviewed journal avoids self-selection bias wherever possible. The best way to do this for SC2 would be to look at data gathered from random-race players only.
Anyway, I've said my piece. If you decide to crunch the random-race data, that would be great. If not, thanks for what you've put together so far.
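To illustrate the self-selection point, here is a toy Python simulation. It is only a sketch under loud assumptions: two made-up races "A" and "B" that are mechanically identical, a logistic (Elo-style) win probability based on hidden skill, and a plain Elo rating update standing in for MMR with K = 32 so that an even match moves a rating by 16. None of this is Blizzard's real ladder formula or the method used for the data in this thread.

```python
import random


def simulate(seed=0, players_per_race=200, games=20000, k=32.0):
    """Average rating per race when the only difference is who picks which race."""
    rng = random.Random(seed)
    # Assumption: race "A" happens to attract players with higher hidden skill.
    skill_mean = {"A": 1600.0, "B": 1400.0}
    players = []
    for race, mu in skill_mean.items():
        for _ in range(players_per_race):
            players.append({"race": race,
                            "skill": rng.gauss(mu, 100.0),
                            "rating": 1500.0})

    for _ in range(games):
        p, q = rng.sample(players, 2)
        # The outcome depends only on hidden skill, never on race.
        p_true = 1.0 / (1.0 + 10 ** ((q["skill"] - p["skill"]) / 400.0))
        p_won = rng.random() < p_true
        # Elo-style update from the visible ratings (zero-sum).
        expected = 1.0 / (1.0 + 10 ** ((q["rating"] - p["rating"]) / 400.0))
        delta = k * ((1.0 if p_won else 0.0) - expected)
        p["rating"] += delta
        q["rating"] -= delta

    averages = {}
    for race in skill_mean:
        ratings = [pl["rating"] for pl in players if pl["race"] == race]
        averages[race] = sum(ratings) / len(ratings)
    return averages


averages = simulate()
print(averages)
```

In this toy model race "A" ends up with the higher average rating even though race never enters the win probability, which is the pattern a self-selection effect would produce. It says nothing about whether that is what actually happens on the SC2 ladder.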
|
On August 16 2012 01:45 whacks wrote: On August 15 2012 11:42 plogamer wrote: On August 15 2012 08:48 whacks wrote: Skeldark, when WOW first introduced battlegrounds (yes, ~7 years ago), Horde was winning these battlegrounds significantly more across all servers. But this wasn't because the game mechanics favored Horde in any way. It just so happened that "hardcore" players were significantly more likely to pick Horde and "casuals" more likely to pick Alliance. It's definitely very possible that the same could be happening in SC2; one race might be more likely to draw in players of weaker skill level. This would cause the average MMR of that race to be lower, even though it's actually balanced.
Some data I would LOVE to see: average win rates for Random players, broken down by race. Eg: What is the win percentage for random players when they spawn as Terran? What is the win percentage for random players when they spawn as Zerg?
The data is more biased towards higher skill levels, ie. not "casuals". Please read the original post and stop posting the first thing that comes to your mind.
Easy on the nerd rage there. What you and skeldark don't seem to be getting is that there is a self selection bias in your analysis. This self selection bias could introduce a systematic error that is prevalent at all levels of the game and cannot be eliminated by averaging large sample sizes. In the wow example that I gave, I was referring to those who hit level 60 within a few months... They spent hours playing the game every day. And yet, even within them, the higher skilled players had a bias towards one race. This had nothing to do with design imbalance since the two races were almost exactly identical. This is why any serious study in a peer-reviewed journal avoids self-selection bias wherever possible. The best way to do this for SC2 would be to look at data gathered from random-race players only. Anyway, I've said my piece. If you decide to crunch the random-race data, that would be great. If not, thanks for what you've put together so far.
You miss the point. My argument had nothing to do with plogamer's sentence. Social aspects are a totally different point from biased data. If you call the data biased because of the possible existence of social aspects, then every dataset in this world is biased. Please read my answer post again.
|
I play all three races at Diamond level, and this corresponds with my own personal experience. Playing Terran is like hard mode for StarCraft 2. ZvT is mind-numbingly easy: you just don't make a huge mistake, get to the late game, and then a-move and win. You don't attack, there's not much micro, and no scouting is needed.
|
I don't understand why this doesn't get more attention. Am I interpreting the numbers wrongly or does Terran show a MAJOR deficit in MMR?
|
skel can you make a graph on winrates based on mmr? so that we can see which race wins on what skill level
|
On August 26 2012 01:49 elanobissen wrote: I don't understand why this doesn't get more attention. Am I interpreting the numbers wrongly or does Terran show a MAJOR deficit in MMR?
I think that since it's within ~50 MMR out of 2000-plus, it's not that big a deal, unless I'm also grossly misunderstanding the data.
From what I understand (and correct me if I'm wrong), T loses an average of 1-3 more games a season than the others?
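As a rough sanity check of that reading, the OP states that one ladder win or loss moves MMR by about 16 on average, so an MMR gap can be converted into "games' worth" of MMR. The short Python sketch below does that for the two August 2012 Terran figures from the OP; the helper name `gap_in_games` is made up for this example.

```python
MMR_PER_GAME = 16.0  # average MMR change per ladder win or loss, as stated in the OP


def gap_in_games(mmr_difference, mmr_per_game=MMR_PER_GAME):
    """Convert an MMR gap into the equivalent number of average wins or losses."""
    return abs(mmr_difference) / mmr_per_game


# Terran differences to the average MMR for 1 Aug - 11 Aug 2012, taken from the OP.
for label, diff in [("T, no MMR filter", -46.82), ("T, 2000+ MMR", -15.24)]:
    print(f"{label}: about {gap_in_games(diff):.2f} games")
```

That works out to roughly three games' worth of MMR without a filter and about one game's worth above 2000 MMR. Note that this is a standing gap in rating at one point in time, not necessarily extra losses per season.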
|
Yeah, Skeldark found a good objective method to prove that Terrans on the whole have lesser MMR. It is funny that only posters with some kind of stipulations of his methods are those with Zerg icons.
|
omg....U cant say that terran is doing terrible in tvz and so on lately when terrans have been getting the best results lately they have been winning all the more recent tournaments besides the recent na regional terrans have won. Mlg summer arena 1, Asus rog summer, Campus EU. IEM, ESV grandfinals and just overall good results and people say how tvz zerg is so imba when terrans have been doing just fine against zerg and I feel if u really are losing every game against zerg with terran u don't know how to play your race when there are terrans like taeja with a 82% win rate in tvz. and as Sheth says u cant say your race is so bad when its doing really good.
and being a zerg player I have noticed that when I play against terran a terran player will do some of the stupidest stuff ever and then they will lose because of it and its stupid because after they will talk about balance like they lost all because they chose to play terran no you were just the dumbest player in the game so just think as ur looking through these comments there are terrans out there that do stupid stuff and then complain about balance just because they can
and another thing I remember a time where terrans complained about protoss ALOT and how tvp protoss is op blah blah blah. AND then zerg got a buff to help deal with helions so they got a queen buff and the whloe point of this buff was to kind of help when 4 helions come to your nat so they don't slip into your main because after that happens its pretty much gg. and then after the buff no changes were made for terran or protoss but yet terrans just moved on and started complaining about zerg. and now Terran are doing really well against protoss and so on so I find that to be really dumb as well.
So overall alot of terrans out there are just stupid. thank you
|
On August 26 2012 02:05 Embir wrote: Yeah, Skeldark found a good objective method to prove that Terrans on the whole have lesser MMR. It is funny that only posters with some kind of stipulations of his methods are those with Zerg icons.
It's also no coincidence that every Terran Icon seems to aggree with him; people are okay with the race they play being "proven" to be UP, because it makes them feel better about their accomplishments, and justifies troubles they have. While Zerg (and Protoss too) who have problems against Terran, when drawing from their own experience, feel that the Terran UP data doesn't fit their personal experience, so they seek to disprove it, or find something incorrect with the data.
|
On August 26 2012 02:10 CrazyF1r3f0x wrote: On August 26 2012 02:05 Embir wrote: Yeah, Skeldark found a good objective method to prove that Terrans on the whole have lesser MMR. It is funny that only posters with some kind of stipulations of his methods are those with Zerg icons.
It's also no coincidence that every Terran Icon seems to aggree with him; people are okay with the race they play being "proven" to be UP, because it makes them feel better about their accomplishments, and justifies troubles they have. While Zerg (and Protoss too) who have problems against Terran, when drawing from their own experience, feel that the Terran UP data doesn't fit their personal experience, so they seek to disprove it, or find something incorrect with the data.
You forgot about one thing in this. His method are absloutely objective.
|
On August 26 2012 02:06 System42 wrote: omg....U cant say that terran is doing terrible in tvz and so on lately when terrans have been getting the best results lately they have been winning all the more recent tournaments besides the recent na regional terrans have won. Mlg summer arena 1, Asus rog summer, Campus EU. IEM, ESV grandfinals and just overall good results and people say how tvz zerg is so imba when terrans have been doing just fine against zerg and I feel if u really are losing every game against zerg with terran u don't know how to play your race when there are terrans like taeja with a 82% win rate in tvz. and as Sheth says u cant say your race is so bad when its doing really good.
and being a zerg player I have noticed that when I play against terran a terran player will do some of the stupidest stuff ever and then they will lose because of it and its stupid because after they will talk about balance like they lost all because they chose to play terran no you were just the dumbest player in the game so just think as ur looking through these comments there are terrans out there that do stupid stuff and then complain about balance just because they can
and another thing I remember a time where terrans complained about protoss ALOT and how tvp protoss is op blah blah blah. AND then zerg got a buff to help deal with helions so they got a queen buff and the whloe point of this buff was to kind of help when 4 helions come to your nat so they don't slip into your main because after that happens its pretty much gg. and then after the buff no changes were made for terran or protoss but yet terrans just moved on and started complaining about zerg. and now Terran are doing really well against protoss and so on so I find that to be really dumb as well.
So overall alot of terrans out there are just stupid. thank you
Are u dumb, or just having difficulties with reading comprehension? Where did I say that Terrans are doing terrible? They are just doing worse on the ladder - simple as that.
Also - before huge Zerg buffs, Nestea, DRG and Fruitdealer won GSL, not even speaking about Stephano or other Zergs in foreigner tournaments who won many times before patch. You see where Im coming? Yes, fact that single player is doing good with given race doesn't prove that certain race is balanced or not.
User was temp banned for this post.
|
On August 26 2012 02:19 Embir wrote: On August 26 2012 02:06 System42 wrote: omg....U cant say that terran is doing terrible in tvz and so on lately when terrans have been getting the best results lately they have been winning all the more recent tournaments besides the recent na regional terrans have won. Mlg summer arena 1, Asus rog summer, Campus EU. IEM, ESV grandfinals and just overall good results and people say how tvz zerg is so imba when terrans have been doing just fine against zerg and I feel if u really are losing every game against zerg with terran u don't know how to play your race when there are terrans like taeja with a 82% win rate in tvz. and as Sheth says u cant say your race is so bad when its doing really good.
and being a zerg player I have noticed that when I play against terran a terran player will do some of the stupidest stuff ever and then they will lose because of it and its stupid because after they will talk about balance like they lost all because they chose to play terran no you were just the dumbest player in the game so just think as ur looking through these comments there are terrans out there that do stupid stuff and then complain about balance just because they can
and another thing I remember a time where terrans complained about protoss ALOT and how tvp protoss is op blah blah blah. AND then zerg got a buff to help deal with helions so they got a queen buff and the whloe point of this buff was to kind of help when 4 helions come to your nat so they don't slip into your main because after that happens its pretty much gg. and then after the buff no changes were made for terran or protoss but yet terrans just moved on and started complaining about zerg. and now Terran are doing really well against protoss and so on so I find that to be really dumb as well.
So overall alot of terrans out there are just stupid. thank you Are u dumb, or just having difficulties with reading comprehension? Where did I say that Terrans are doing terrible? They are just doing worse on the ladder - simple as that. Also - before huge Zerg buffs, Nestea, DRG and Fruitdealer won GSL, not even speaking about Stephano or other Zergs in foreigner tournaments who won many times before patch. You see where Im coming? Yes, fact that single player is doing good with given race doesn't prove that certain race is balanced or not. well thanks for showing me and everyone else that ur the most retarded person in the world did I quote you no did I say anything about you no I just that overall think next time please and hey dumbass terran has the most gsl title wins I hope u feel really stupid right now and embarrassed and signle player? LOL Taeja, Merineking, Alive, Supernova, MVP, MMA, all winning huge tournaments recently so I recommend u stfu because thats what most terran players were doing.
User was temp banned for this post.
|
On August 26 2012 02:13 Embir wrote: On August 26 2012 02:10 CrazyF1r3f0x wrote: On August 26 2012 02:05 Embir wrote: Yeah, Skeldark found a good objective method to prove that Terrans on the whole have lesser MMR. It is funny that only posters with some kind of stipulations of his methods are those with Zerg icons.
It's also no coincidence that every Terran Icon seems to aggree with him; people are okay with the race they play being "proven" to be UP, because it makes them feel better about their accomplishments, and justifies troubles they have. While Zerg (and Protoss too) who have problems against Terran, when drawing from their own experience, feel that the Terran UP data doesn't fit their personal experience, so they seek to disprove it, or find something incorrect with the data. You forgot about one thing in this. His method are absloutely objective. -_- I'm not saying they're right, I'm saying in explains the behavior. It doesn't matter how objective the method is, people like hearing there race is harder than the others, and don't enjoy it vice versa.
|
On August 26 2012 02:32 System42 wrote: On August 26 2012 02:19 Embir wrote: On August 26 2012 02:06 System42 wrote: omg....U cant say that terran is doing terrible in tvz and so on lately when terrans have been getting the best results lately they have been winning all the more recent tournaments besides the recent na regional terrans have won. Mlg summer arena 1, Asus rog summer, Campus EU. IEM, ESV grandfinals and just overall good results and people say how tvz zerg is so imba when terrans have been doing just fine against zerg and I feel if u really are losing every game against zerg with terran u don't know how to play your race when there are terrans like taeja with a 82% win rate in tvz. and as Sheth says u cant say your race is so bad when its doing really good.
and being a zerg player I have noticed that when I play against terran a terran player will do some of the stupidest stuff ever and then they will lose because of it and its stupid because after they will talk about balance like they lost all because they chose to play terran no you were just the dumbest player in the game so just think as ur looking through these comments there are terrans out there that do stupid stuff and then complain about balance just because they can
and another thing I remember a time where terrans complained about protoss ALOT and how tvp protoss is op blah blah blah. AND then zerg got a buff to help deal with helions so they got a queen buff and the whloe point of this buff was to kind of help when 4 helions come to your nat so they don't slip into your main because after that happens its pretty much gg. and then after the buff no changes were made for terran or protoss but yet terrans just moved on and started complaining about zerg. and now Terran are doing really well against protoss and so on so I find that to be really dumb as well.
So overall alot of terrans out there are just stupid. thank you Are u dumb, or just having difficulties with reading comprehension? Where did I say that Terrans are doing terrible? They are just doing worse on the ladder - simple as that. Also - before huge Zerg buffs, Nestea, DRG and Fruitdealer won GSL, not even speaking about Stephano or other Zergs in foreigner tournaments who won many times before patch. You see where Im coming? Yes, fact that single player is doing good with given race doesn't prove that certain race is balanced or not. well thanks for showing me and everyone else that ur the most retarded person in the world did I quote you no did I say anything about you no I just that overall think next time please and hey dumbass terran has the most gsl title wins I hope u feel really stupid right now and embarrassed and signle player? LOL Taeja, Merineking, Alive, Supernova, MVP, MMA, all winning huge tournaments recently so I recommend u stfu because thats what most terran players were doing.
Enjoy your ban.
|
On August 26 2012 03:05 CrazyF1r3f0x wrote: On August 26 2012 02:13 Embir wrote: On August 26 2012 02:10 CrazyF1r3f0x wrote: On August 26 2012 02:05 Embir wrote: Yeah, Skeldark found a good objective method to prove that Terrans on the whole have lesser MMR. It is funny that only posters with some kind of stipulations of his methods are those with Zerg icons.
It's also no coincidence that every Terran Icon seems to aggree with him; people are okay with the race they play being "proven" to be UP, because it makes them feel better about their accomplishments, and justifies troubles they have. While Zerg (and Protoss too) who have problems against Terran, when drawing from their own experience, feel that the Terran UP data doesn't fit their personal experience, so they seek to disprove it, or find something incorrect with the data. You forgot about one thing in this. His method are absloutely objective. -_- I'm not saying they're right, I'm saying in explains the behavior. It doesn't matter how objective the method is, people like hearing there race is harder than the others, and don't enjoy it vice versa.
Oh so this was only psychology off topic? Thanks.
|