|
On July 12 2012 06:38 skeldark wrote: If you are talking about my MMR calculator: it shows your MMR. For that I don't need to know anything about Bayesian skill rating models, because I back-engineer towards it.
I think you focus too much on the skill system here. That's not what this is about...
I edited my post to remove my comments on this because I figured you did something like that and I didn't want to be unfair or over-critical.
|
On July 12 2012 06:40 Lysenko wrote: On July 12 2012 06:38 skeldark wrote: If you are talking about my MMR calculator: it shows your MMR. For that I don't need to know anything about Bayesian skill rating models, because I back-engineer towards it.
I think you focus too much on the skill system here. That's not what this is about...
I edited my post to remove my comments on this because I figured you did something like that and I didn't want to be unfair or over-critical.
I edited mine too. Btw, I bet the Blizzard guys are having a lot of fun reading our discussion.
|
Well, I am not sure you have considered the basic statistical requirements.
As far as I can tell, you want to draw conclusions about a larger population from a smaller sample.
Define:
- What is the sample?
- What is the whole data set you want to draw conclusions about?
- How do you select the sample from the bigger data set? (Does each element in the whole data set have the same chance of getting into the sample?)
- What statistical distribution underlies the value you want to analyze?
- What is the probability of error for your sample?
- How big does your sample have to be? (What significance level are you working with?)
- What is the hypothesis you want to prove or disprove?
You can't just crunch some numbers (I don't want to downplay your work) and say that they show overall balance. They may tell us about balance within your sample, but are we sure they tell us anything about overall balance?
|
Interesting results. Thank you for doing this. ^_^
|
On July 12 2012 06:49 freetgy wrote: Well, I am not sure the basic statistical things were considered:
As far as I can tell, you want to draw conclusions about a larger population from a smaller sample.
Define:
- What is the sample?
- What is the whole data set you want to draw conclusions about?
- How do you select the sample from the bigger data set? (Does each element in the whole data set have the same chance of getting into the sample?)
- What statistical distribution underlies the value you want to analyze?
- My collected data.
- All active SC2 players.
- No. You will find the explanation of why, and how this affects the result, when you start reading the OP.
- I have no clue whatsoever! And before you start to run amok, read the OP and then the last 5 pages of the discussion.
How many of you guys are out there? Are you the one who wants to calculate it, or the one who just asks his 4 questions? The data source is linked in the OP.
On July 12 2012 06:52 Tsunami49 wrote: Interesting results. Thank you for doing this. ^_^ Thank you very much.
|
- How did you collect your data? Does it meet the criteria of a random sample? (http://en.wikipedia.org/wiki/Random_sample)
- Define all active SC2 players ("ladder-balance-data"???)
- I read the OP at least twice.
If you want to use statistical methods to prove something, be clear about what you are analysing.
|
On July 12 2012 06:59 freetgy wrote: - How did you collect your data? Does it meet the criteria of a random sample? (http://en.wikipedia.org/wiki/Random_sample) - Define all active SC2 players ("ladder-balance-data"???) - I read the OP at least twice.
If you want to use statistical methods to prove something, be clear about what you are analysing.
Then why did you ask questions that are answered in the OP? If you did not understand what you read there, you should read it a third time.
"Define all active sc2 players..." Seriously, some people here...
|
Also, Grandmaster is the worst example because it's not at all the top 200. But that is a different story.
True, but it shows that above Silver, Terran is underrepresented globally in every league.
I guess the next step is to integrate mmr calculations into an online database, and then into sc2gears...
|
So the OP tried proving imbalance by using ladder MMR data that can only be obtained from people running his addon. This gives a very small sample size, as can be seen especially from the data he provided. I was expecting more statistics to go with his claims: maybe average MMRs for the 3 races, standard deviations, outliers, maybe the total number of players he has in his data. Also, saying "he deviation shows, that the diffrence of the race-values are to big, to be explained with an random errors." is a really weak statement when not backed up with data showing how likely it is to get such a deviation, perhaps with a p-value.
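For anyone who wants to produce the kind of summary asked for above, here is a minimal sketch in Python. The MMR numbers are made up purely for illustration (they are not the OP's data), and the names `mmr_by_race` and `summarize` are hypothetical; the point is only to show sample size, mean and standard deviation reported per race.

```python
from statistics import mean, stdev

# Hypothetical MMR values per race (illustrative only, NOT the OP's data).
mmr_by_race = {
    "Terran": [1500, 1600, 1700],
    "Protoss": [1550, 1650, 1750],
    "Zerg": [1600, 1700, 1800],
}

def summarize(samples):
    """Return sample size, mean and sample standard deviation."""
    return {"n": len(samples), "mean": mean(samples), "sd": stdev(samples)}

summary = {race: summarize(values) for race, values in mmr_by_race.items()}
for race, s in summary.items():
    print(f"{race}: n={s['n']} mean={s['mean']:.1f} sd={s['sd']:.1f}")
```

With real data you would also want to look at outliers and at how the sample was collected before reading anything into the differences between the means.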
|
OK guys, last time: Is this a university paper about SC2 balance? No! Do I get money for it? No! Do I care which race is imba because I want to whine? No!
Besides all of that, before you act like you can test me, make sure you understand what I did; don't randomly assume stuff. Look at the post above me.
I published the data I collected with my own program, which I wrote to back-calculate MMR. I found a very interesting anomaly in the race data. I programmed a quick test routine to show this anomaly and that it is very unlikely to come from a random source. If you are interested in making it a university paper, I will not stop you. I'll give you all my source data and you don't even have to say "thank you".
If someone has something PRODUCTIVE to add, just PM me.
On July 12 2012 07:04 Natespank wrote: Also, Grandmaster is the worst example because it's not at all the top 200. But that is a different story. True, but it shows that above Silver, Terran is underrepresented globally in every league. I guess the next step is to integrate mmr calculations into an online database, and then into sc2gears...
? That's not the next step, that's where I have my data from! I did this 1 month ago. http://www.teamliquid.net/forum/viewmessage.php?topic_id=334561
|
On July 12 2012 07:02 skeldark wrote: On July 12 2012 06:59 freetgy wrote: - How did you collect your data? Does it meet the criteria of a random sample? (http://en.wikipedia.org/wiki/Random_sample) - Define all active SC2 players ("ladder-balance-data"???) - I read the OP at least twice.
If you want to use statistical methods to prove something, be clear about what you are analysing. Then why did you ask questions that are answered in the OP? If you did not understand what you read there, you should read it a third time. "Define all active sc2 players..." Seriously, some people here...
No, you do not understand: the statistics you used have to at least be made clear, so we can understand what you drew conclusions about.
1) As far as I understand, people manually uploaded their replays to your tool. Which means you only sample from this data set and can therefore only draw conclusions about this data set, not about "ladder balance".
2) Because of this, is it ensured that your data set is unbiased? One basic assumption that could be made is that in your data set people at the extremes are overrepresented, because those are the ones who are most interested in balance and are active here in this forum, while the real average is most likely underrepresented.
Don't get offended just because we ask questions to see if your work is solid. I appreciate your work, but it is your presentation that is lacking.
|
On July 11 2012 11:42 skeldark wrote: On July 11 2012 11:38 VediVeci wrote: Hi, great post! I found this to be very informative and interesting, very well done.
I do have some questions about your methods though, and please forgive me if you have already addressed these.
If I have found the correct formulas you are using, you appear to be assuming an ELO rating system? I was under the impression, after listening to a speech from Josh Menke at UCI, that MMR is actually determined using Gaussian density filtering? Is there a source that someone can point me to that clears this up? Regardless, your method should provide a decent approximation of MMR anyway, and ELO is certainly a valid ranking system in its own right.
If you are interested in the MMR calculation: here you will find a lot of information about how to calculate DMMR from ladder points. (DMMR = MMR not yet cleaned of its division.) http://www.teamliquid.net/forum/viewmessage.php?topic_id=332391 EDIT: Also, you mentioned in a comment that you don't know if players are normally distributed, but doesn't ELO assume normal distribution? Assuming a similar distribution I don't think it would affect it too significantly though
Exactly. The player/MMR base is normal by definition; have a look at my program: ![[image loading]](http://i.imgur.com/9Ag8n.jpg) Whether this race data is, I don't know, but I think so. Someone can test it. I looked at the link to the MMR article that you gave, but I'm still unclear that ELO is in fact the method used for MMR since it didn't cite source material. ELO is generally considered to be a bit outdated at this point; most major ranking systems use other, albeit similar, methods. Can someone point me to a link from Blizzard here? In this speech by Josh Menke, Blizzard's head rankings guy, he seems to imply (around 45:00) that MMR is actually calculated using Gaussian density filtering.
http://www.ics.uci.edu/~develop/Lectures/CGVW031412_MP4 360p (16x9).mp4.
Also, player skill is not necessarily normally distributed. Chess skill, for example, is more closely modelled by a logistic distribution.
That being said, I don't necessarily think these issues are enough to invalidate your work even if I'm correct (if I am right, I'll look into it more and try to check).
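To make the normal-versus-logistic point concrete, here is a small sketch comparing the two textbook Elo expected-score curves. This is only the standard chess-style formula, not a claim about what Blizzard's MMR actually does; the 400-point scale and the per-player standard deviation of 200 are the conventional chess values and are assumptions here, as are the function names.

```python
from statistics import NormalDist

def expected_logistic(r_a, r_b):
    """Expected score of A vs B under the common logistic Elo formula."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))

def expected_normal(r_a, r_b, sd=200.0):
    """Expected score of A vs B if each performance is normal with std dev sd."""
    # The difference of two independent normals has std dev sd * sqrt(2).
    return NormalDist(mu=0.0, sigma=sd * 2 ** 0.5).cdf(r_a - r_b)

# Compare both curves for a few rating gaps (assumed example ratings).
for gap in (0, 100, 200, 400):
    print(gap,
          round(expected_logistic(1500 + gap, 1500), 3),
          round(expected_normal(1500 + gap, 1500), 3))
```

The two curves agree closely for small rating gaps and drift apart in the tails, which is why the choice mostly matters for lopsided matchups.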
|
That's my problem. You come here and ask your questions but did not check the information first. And you are not the first; you are no. 100.
As far as I understand, people manually uploaded their replays to your tool. Wrong: they upload bnet data while playing.
Which means you only sample from this data. Wrong: I mostly use the opponents they play, not the users who have my program.
Because of this, is it ensured that your data set is unbiased? Wrong: the opponent is unbiased, but his skill range is not, because user skill = opponent skill.
One basic assumption that could be made is that in your data set people at the extremes are overrepresented. Yes. My userbase is overrepresented at the higher skill levels. That's clearly pointed out in the OP.
Sorry to be harsh, but I really regret publishing this by now.
|
On July 12 2012 07:15 VediVeci wrote: On July 11 2012 11:42 skeldark wrote: On July 11 2012 11:38 VediVeci wrote: Hi, great post! I found this to be very informative and interesting, very well done.
I do have some questions about your methods though, and please forgive me if you have already addressed these.
If I have found the correct formulas you are using, you appear to be assuming an ELO rating system? I was under the impression, after listening to a speech from Josh Menke at UCI, that MMR is actually determined using Gaussian density filtering? Is there a source that someone can point me to that clears this up? Regardless, your method should provide a decent approximation of MMR anyway, and ELO is certainly a valid ranking system in its own right.
If you are interested in the MMR calculation: here you will find a lot of information about how to calculate DMMR from ladder points. (DMMR = MMR not yet cleaned of its division.) http://www.teamliquid.net/forum/viewmessage.php?topic_id=332391 EDIT: Also, you mentioned in a comment that you don't know if players are normally distributed, but doesn't ELO assume normal distribution? Assuming a similar distribution I don't think it would affect it too significantly though
Exactly. The player/MMR base is normal by definition; have a look at my program: ![[image loading]](http://i.imgur.com/9Ag8n.jpg) Whether this race data is, I don't know, but I think so. Someone can test it. I looked at the link to the MMR article that you gave, but I'm still unclear that ELO is in fact the method used for MMR since it didn't cite source material. ELO is generally considered to be a bit outdated at this point; most major ranking systems use other, albeit similar, methods. Can someone point me to a link from Blizzard here? In this speech by Josh Menke, Blizzard's head rankings guy, he seems to imply (around 45:00) that MMR is actually calculated using Gaussian density filtering. http://www.ics.uci.edu/~develop/Lectures/CGVW031412_MP4 360p (16x9).mp4. Also, player skill is not necessarily normally distributed. Chess skill, for example, is more closely modelled by a logistic distribution. That being said, I don't necessarily think these issues are enough to invalidate your work even if I'm correct (if I am right, I'll look into it more and try to check). It's more likely ELO, not Bayesian inference!
I did not want to discuss this to the end because I tried to make the point that it does not matter for the race value. I know it's outdated; I don't understand it myself. I know the Gaussian bell curve could be wrong. Don't overrate the picture though. It's only one picture in the program to give people an idea of where they are on the ladder. It's not used to calculate the MMR value.
Also, whatever system it is, it is not so smooth. Blizzard had to correct the offset several times to get back to the 20/20/20, and it may be that they have given up on this by now.
|
On July 12 2012 07:24 skeldark wrote: On July 12 2012 07:15 VediVeci wrote: On July 11 2012 11:42 skeldark wrote: On July 11 2012 11:38 VediVeci wrote: Hi, great post! I found this to be very informative and interesting, very well done.
I do have some questions about your methods though, and please forgive me if you have already addressed these.
If I have found the correct formulas you are using, you appear to be assuming an ELO rating system? I was under the impression, after listening to a speech from Josh Menke at UCI, that MMR is actually determined using Gaussian density filtering? Is there a source that someone can point me to that clears this up? Regardless, your method should provide a decent approximation of MMR anyway, and ELO is certainly a valid ranking system in its own right.
If you are interested in the MMR calculation: here you will find a lot of information about how to calculate DMMR from ladder points. (DMMR = MMR not yet cleaned of its division.) http://www.teamliquid.net/forum/viewmessage.php?topic_id=332391 EDIT: Also, you mentioned in a comment that you don't know if players are normally distributed, but doesn't ELO assume normal distribution? Assuming a similar distribution I don't think it would affect it too significantly though
Exactly. The player/MMR base is normal by definition; have a look at my program: ![[image loading]](http://i.imgur.com/9Ag8n.jpg) Whether this race data is, I don't know, but I think so. Someone can test it. I looked at the link to the MMR article that you gave, but I'm still unclear that ELO is in fact the method used for MMR since it didn't cite source material. ELO is generally considered to be a bit outdated at this point; most major ranking systems use other, albeit similar, methods. Can someone point me to a link from Blizzard here? In this speech by Josh Menke, Blizzard's head rankings guy, he seems to imply (around 45:00) that MMR is actually calculated using Gaussian density filtering. http://www.ics.uci.edu/~develop/Lectures/CGVW031412_MP4 360p (16x9).mp4. Also, player skill is not necessarily normally distributed. Chess skill, for example, is more closely modelled by a logistic distribution. That being said, I don't necessarily think these issues are enough to invalidate your work even if I'm correct (if I am right, I'll look into it more and try to check). It's ELO, not Bayesian inference! I did not want to discuss this to the end because I tried to make clear that it does not matter for the race value. I know it's outdated; I don't understand it myself. I know the Gaussian could be wrong. Don't overrate the picture though. It's only one picture in the program to give people an idea of where they are on the ladder. It's not used to calculate the MMR value. Also, whatever system it is, the offset corrections of the past show us the graph is not so smooth. Blizzard had to correct the offset several times to get back to the 20/20/20, and it may be that they have given up on this by now.
Ok, I'm still not convinced that MMR is ELO but that's just an aside really. The work you did was impressive and I don't mean to diminish it by asking, I just wanted to satisfy my own curiosity. Even if Blizzard doesn't use ELO, your conclusions are still valid in terms of relative race strength given the data, and their model would also reflect this.
And oops, I meant to remove the picture, I wasn't considering it in my arguments though anyway.
Edit: added "wasn't" to final sentence.
|
On July 12 2012 06:20 skeldark wrote:
I understand. Same for me. This is a side project. The time I have, I use to work on the MMR analyser. lolcanoe? How long do you need?
Whether this is a community forum or a university-level discussion doesn't change the validity or importance of the critiques here. Where your argument is posted unfortunately has no bearing on the scrutiny it deserves.
And why haven't I been helpful? Because I wanted to see if the self-admitted arrogance here was justified or not. I wanted to confirm my suspicion that you actually do not have a strong grasp of statistics and that I wasn't just misinterpreting your calculations. Now that my suspicions are confirmed, let me clearly lay out the problems that I see.
1. Run an Anderson–Darling test on the data. This can be done with 3 clicks in Minitab, which will automatically give you a p-value for whether or not the data is normal. If you cannot run this test, or it tells you that normality is problematic, note in the OP that your test assumes normality but that normality was not verified.
2. The specific question here is whether or not one race has a significantly higher MMR average than another. What your current test is actually testing for (although somewhat incorrectly) is whether or not the sample average varies significantly from the population mean. If executed correctly, this test also has application to understanding balance, but it doesn't answer the specific question. The specific question should be tested with a very simple two-sample t-test (google it), run 3 times: tz, pz, and pt. This is a much better test to fit the question and allows you to ignore the further confusion of taking another average.
3. In these calculations, independence between populations is a fair concern and should likewise be noted.
4. Finally, be very clear about your conclusion. Your data allows you to conclude that the average skill rating of a certain race is potentially different from the skill rating of another. It is yet another jump to equate this difference with a balance problem, because of a potential correlation-versus-causation problem (i.e. does Terran make players bad, or do bad players pick Terran?). Unfortunately, there's no way to resolve this concern with the data that you possess, so you'll have to make note of this caveat as well.
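The pairwise comparison in point 2 can be sketched in a few lines of Python. This is a hedged illustration, not the OP's test routine: it uses Welch's unequal-variance variant of the two-sample t-test, the MMR samples are made up, the names `welch_t` and `approx_two_sided_p` are hypothetical, and the p-value uses a normal approximation that is only reasonable for large samples (a proper t-distribution p-value needs a stats package). It does not cover the normality check from point 1.

```python
import math
from itertools import combinations
from statistics import NormalDist, mean, variance

def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance two-sample t-test."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

def approx_two_sided_p(t):
    """Normal approximation to the two-sided p-value (large samples only)."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(t)))

# Hypothetical MMR samples per race (illustrative only, NOT the OP's data).
mmr = {
    "T": [1480, 1520, 1610, 1390, 1550],
    "P": [1600, 1580, 1650, 1490, 1620],
    "Z": [1620, 1700, 1560, 1640, 1590],
}

# Three pairwise tests: P vs T, P vs Z, T vs Z.
for r1, r2 in combinations(sorted(mmr), 2):
    t, df = welch_t(mmr[r1], mmr[r2])
    print(f"{r1} vs {r2}: t={t:.2f} df={df:.1f} p~{approx_two_sided_p(t):.3f}")
```

Even a small p-value here would only say that the averages differ in the sample; it says nothing about the cause, which is the caveat in point 4.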
|
You know, you would think that given that he basically published his data for anyone else to verify, all of the statistical geniuses who keep hammering on him would go do what real mathematicians would do and analyse the data yourself. It seems like a lot of the people who are doing so are desperate to prove something - I'm not sure what, but they remind me of people desperately coming up with ways to disprove supersymmetry in the wake of a Higgs boson discovery.
I suspect they don't like your conclusions, skeldark, though they amount to little more than a few wins in either direction.
|
On July 12 2012 07:51 Evangelist wrote: You know, you would think that given that he basically published his data for anyone else to verify, all of the statistical geniuses who keep hammering on him would go do what real mathematicians would do and analyse the data yourself. It seems like a lot of the people who are doing so are desperate to prove something - I'm not sure what, but they remind me of people desperately coming up with ways to disprove supersymmetry in the wake of a Higgs boson discovery.
I suspect they don't like your conclusions, skeldark, though they amount to little more than a few wins in either direction. Your cynicism is understandable but misplaced.
The complete data package (with all the updates) is not as transparently presented as you'd think, and it's a large concern to those of us who don't want to attempt conclusions from potentially misinformed, incomplete, or misunderstood data.
|
On July 12 2012 07:55 lolcanoe wrote: On July 12 2012 07:51 Evangelist wrote: You know, you would think that given that he basically published his data for anyone else to verify, all of the statistical geniuses who keep hammering on him would go do what real mathematicians would do and analyse the data yourself. It seems like a lot of the people who are doing so are desperate to prove something - I'm not sure what, but they remind me of people desperately coming up with ways to disprove supersymmetry in the wake of a Higgs boson discovery.
I suspect they don't like your conclusions, skeldark, though they amount to little more than a few wins in either direction. Your cynicism is understandable but misplaced. The complete data package (with all the updates) is not as transparently presented as you'd think, and it's a large concern to those of us who don't want to attempt conclusions from potentially misinformed, incomplete, or misunderstood data.
And what exactly are you attempting to prove? That there isn't a sum total of 5-6 wins difference between terran and zerg and that random players don't have a significantly decreased chance to win games at reasonably high levels of play? Anyone and their donkey can conclude that terran winrates are likely to be lower because TvP can only be reliably won before 20 minutes and both early and late game TvZ are, as of last patch, completely fucked! The fact that this effect has been dampened on the ladder suggests balance is more robust than we all thought!
Do you have a better method of representing this data than has been presented? Do you actually have the means to accurately calculate Blizzard MMR? Do you also have the ability to correct this data to a form more likely to be accurate? Now I am not going to pretend to be a mathematician as I'm a physicist and we do retard maths, but there are plenty of people who are. Maybe you do have valid critiques, but valid critiques should come with viable solutions rather than simple accusations of incompetence and assumptions.
|
On July 12 2012 07:51 Evangelist wrote: You know, you would think that given that he basically published his data for anyone else to verify, all of the statistical geniuses who keep hammering on him would go do what real mathematicians would do and analyse the data yourself. It seems like a lot of the people who are doing so are desperate to prove something - I'm not sure what, but they remind me of people desperately coming up with ways to disprove supersymmetry in the wake of a Higgs boson discovery.
I suspect they don't like your conclusions, skeldark, though they amount to little more than a few wins in either direction.
Spoken like a true physicist
|