|
On July 13 2012 04:14 monkybone wrote: Why does uneven skill distribution not affect average MMR in an ideal situation with perfect balance?
On July 13 2012 04:17 skeldark wrote: Because a situation with perfect balance = even skill distribution. When I talk about balance, I talk about an even skill distribution across the races, that is, the balance of the property (race) of the data (account skill). This does NOT have to be game design balance. The latter is a social term and can't be calculated because it's not even clearly defined. If all Terran pro players are ill and cannot play, is the game still balanced? I say no. You could say yes. It's not a mathematical question.
On July 13 2012 04:32 monkybone wrote: Of course the game could be balanced even in absurd situations where all Terran players were complete scrubs or any other skill distribution.
On July 13 2012 04:40 skeldark wrote: Now you are talking about game balance. So what is game balance? Define it... I say the fact that all Terrans are scrubs = imbalance. It's imbalanced because all Terrans are scrubs. My definition of balance = not all Terrans are scrubs! You say it's balanced because the reason is the player, not the game. The reason can be anything. That is not a mathematical value! You are free to find statistical methods or social analyses of what the reason for imbalance in the data is. That's not what I did. I look at IF there is imbalance in the data, not what causes it.
On July 13 2012 04:53 monkybone wrote: Didn't you read the whole post?
You want to make a statistic about how much someone improves compared to how much he trains. I did not see what this has to do with what I did; that's why I ignored it. I think we are talking about different topics here. You think it's more interesting to see the MMR change over training; I calculated the total MMR average. What should I say to this? Yes, it could be interesting to see how the MMR changes with more games depending on the race.
|
Sorry if you answered this before somewhere, but do you have a rough estimate of the MMR ranges of each league? I'd really like to see that as it would be very interesting.
|
I failed to read about how you canceled out effects like disconnects or other cases in which the game is not representative :o
|
On July 11 2012 01:34 skeldark wrote:
[...]
Result
TIME Filter: only between 1 Jan 1970 00:00:00 GMT - 12 Jul 2012 16:52:47 GMT
Datasize: 10063
Average MMR: 1593.1
Min Difference to be significant: 90%: +-16, 99%: +-24, 99.99%: +-36
[...]
TIME Filter: only between 1 Jan 1970 00:00:00 GMT - 12 Jul 2012 16:52:47 GMT
MMR Filter: Only Master+
Datasize: 2278
Average MMR: 2278.03
Min Difference to be significant: 90%: +-15, 99%: +-23, 99.99%: +-35
[...]
The significance interval is lower for the smaller sample? That doesn't make sense to me.
Someone help me out here please.
|
On July 13 2012 05:00 hunts wrote: Sorry if you answered this before somewhere, but do you have a rough estimate of the MMR ranges of each league? I'd really like to see that as it would be very interesting.
Depends on the server (US / EU):
#START PROMOTE_OFFSETS
bronze - master: 0,754,1050,1280,1536,1993,
On July 13 2012 05:00 BBS wrote: I failed to read about how you canceled out effects like disconnects or other cases in which the game is not representative :o
I don't think that disconnects depend on race. Why should one race have more disconnects than another?
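For anyone who wants to use the PROMOTE_OFFSETS line above, here is a minimal sketch of a lookup. It assumes the six values are the lower MMR bounds of bronze, silver, gold, platinum, diamond and master in that order (the post only says "bronze - master", so the middle league names are an assumption), and `league_for_mmr` is a made-up helper name. The values differ per server and can change over time.

```python
from bisect import bisect_right

# Lower MMR bound per league, copied from the PROMOTE_OFFSETS line above.
PROMOTE_OFFSETS = [0, 754, 1050, 1280, 1536, 1993]
# Assumed order of leagues; the source only names the first and the last.
LEAGUES = ["bronze", "silver", "gold", "platinum", "diamond", "master"]

def league_for_mmr(mmr):
    """Return the league whose lower bound is the highest one not above mmr."""
    index = bisect_right(PROMOTE_OFFSETS, mmr) - 1
    return LEAGUES[max(index, 0)]  # clamp negative MMR to the lowest league
```

Grandmaster is left out on purpose, since no offset for it is given.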
|
On July 13 2012 05:03 xian_ wrote:
On July 11 2012 01:34 skeldark wrote:
[...]
Result
TIME Filter: only between 1 Jan 1970 00:00:00 GMT - 12 Jul 2012 16:52:47 GMT
Datasize: 10063
Average MMR: 1593.1
Min Difference to be significant: 90%: +-16, 99%: +-24, 99.99%: +-36
[...]
TIME Filter: only between 1 Jan 1970 00:00:00 GMT - 12 Jul 2012 16:52:47 GMT
MMR Filter: Only Master+
Datasize: 2278
Average MMR: 2278.03
Min Difference to be significant: 90%: +-15, 99%: +-23, 99.99%: +-35
[...]
The significance interval is lower for the smaller sample? That doesn't make sense to me. Someone help me out here please.
The smaller sample has a way lower range (only master vs. all skill ranges). I asked myself the same question when I saw the output ^^
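A rough way to see why a smaller sample can still give a tighter interval: the width depends on the spread of the values as well as on the sample size. The sketch below uses a plain normal-approximation margin of error for a mean, which is not necessarily how the numbers in the output above were produced. The two sample sizes are taken from that output, but both standard deviations are made up for illustration.

```python
import math
from statistics import NormalDist

def margin_of_error(sd, n, confidence):
    """Half-width of a normal-approximation confidence interval for a mean."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sd / math.sqrt(n)

# Sample sizes come from the quoted output; the sd values are hypothetical.
all_players = margin_of_error(sd=600.0, n=10063, confidence=0.90)
masters_only = margin_of_error(sd=250.0, n=2278, confidence=0.90)

# A narrow skill range (small sd) can outweigh the smaller sample size.
print(masters_only < all_players)
```

With these made-up spreads the master-only margin comes out smaller even though it is based on fewer accounts.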
|
On July 13 2012 05:00 BBS wrote: I failed to read about how you canceled out effects like disconnects or other cases in which the game is not representative :o
On July 13 2012 05:03 skeldark wrote: I don't think that disconnects depend on race. Why should one race have more disconnects than another?
Indeed, concerning the sample size, it may not be relevant that the races are not played in equal proportions in different countries, which have their own respective disconnect rates. So even if, say, 70% of the Russians played Terran and suffered from a disconnect rate increased by 100%, that might not really bias the sample at all. It might become a point when you crack the trillion-observation sample size :D
|
On July 13 2012 04:46 lolcanoe wrote: You seem to be ignoring the more important questions about defining what you mean by a "data group" and skipping to what you like rehashing 100 times.
On July 13 2012 04:50 skeldark wrote: A data group is a group of the data pool, a subgroup. I cannot mix everything together because I have 3 different races. So to find out the average of one race I have to take all players of only this race. This is the data group, e.g. Terran: all Terran players in the data. A random data group is when I give every player a random number and then take the group where this random number is 1.
On July 13 2012 04:53 lolcanoe wrote: Yes, but what's the statistical value of creating random data groups? And what do you mean you didn't mix them together - aren't your data values calculated by average(t) - average(t,z,p)? And you did use a player-weighted average, right?
I find it pretty disturbing that I have to requote myself every time I ask questions about the data itself. It's so hard to get to the bottom of how you calculated the confidence interval that I suspect you don't understand what's going on yourself.
First we argue about normality, the next moment you tell me it's not based on normality at all. I press further to ask how you are getting the intervals then, and you tell me that 99.99% of "random groups" fall in a certain range. Now there's good reason to ask why these random groups exist, what they are, and what value they add, but once again you seem to be dodging what's important here.
Just by looking at the MMR graph alone, it seems like we have a pretty high standard deviation regardless of what assumptions we are making. I suppose what you really want (since you are dealing with averages) is the standard error of the mean, i.e. the standard deviation divided by sqrt(n).
But I don't see any of that happening here...
Finally for those of you who keep asking us to do the questions ourselves, the complete data package doesn't seem to be readily available. It also seems to be constantly changing...
|
On July 13 2012 05:00 BBS wrote: I failed to read about how you canceled out effects like disconnects or other cases in which the game is not representative :o
On July 13 2012 05:03 skeldark wrote: I don't think that disconnects depend on race. Why should one race have more disconnects than another?
On July 13 2012 05:08 BBS wrote: Indeed, concerning the sample size, it may not be relevant that the races are not played in equal proportions in different countries, which have their own respective disconnect rates. So even if, say, 70% of the Russians played Terran and suffered from a disconnect rate increased by 100%, that might not really bias the sample at all. It might become a point when you crack the trillion-observation sample size :D
Seriously guys... this sort of strain would have an entirely negligible effect on anything at all. If you studied statistics you'd know that proportional errors are pretty independent of sample size, so if it's a negligible effect to begin with it'll continue to be that way even as the data pool grows.
|
On July 13 2012 04:46 lolcanoe wrote: You seem to be ignoring the more important questions about defining what you mean by a "data group" and skipping to what you like rehashing 100 times.
On July 13 2012 04:50 skeldark wrote: A data group is a group of the data pool, a subgroup. I cannot mix everything together because I have 3 different races. So to find out the average of one race I have to take all players of only this race. This is the data group, e.g. Terran: all Terran players in the data. A random data group is when I give every player a random number and then take the group where this random number is 1.
On July 13 2012 04:53 lolcanoe wrote: Yes, but what's the statistical value of creating random data groups? And what do you mean you didn't mix them together - aren't your data values calculated by average(t) - average(t,z,p)? And you did use a player-weighted average, right?
On July 13 2012 05:29 lolcanoe wrote: I find it pretty disturbing that I have to requote myself every time I ask questions about the data itself. It's so hard to get to the bottom of how you calculated the confidence interval that I suspect you don't understand what's going on yourself. First we argue about normality, the next moment you tell me it's not based on normality at all. I press further to ask how you are getting the intervals then, and you tell me that 99.99% of "random groups" fall in a certain range. Now there's good reason to ask why these random groups exist, what they are, and what value they add, but once again you seem to be dodging what's important here. Just by looking at the MMR graph alone, it seems like we have a pretty high standard deviation regardless of what assumptions we are making. I suppose what you really want (since you are dealing with averages) is the standard error of the mean, i.e. the standard deviation divided by sqrt(n). But I don't see any of that happening here...
Finally for those of you who keep asking us to do the questions ourselves, the complete data package doesn't seem to be readily available. It also seems to be constantly changing...
What you see on the graph is the deviation of the player base in total. It has nothing to do with my calculation. I did not say it's not normal. I don't know what it is.
I added the random groups to show that my data is not just a random point, because it is outside of the range that random points would produce. I try to come up with simple examples and I just don't know how to explain it differently than I did.
The "mix" was a reference to you, because you asked something about why there are different groups. I tried to explain to you that with 3 different races I need 3 different groups, one for each race, or I cannot calculate their averages.
Your questions are kind of random. I can't explain every sentence I wrote in the OP to you, only for you to then want me to explain every sentence of the explanation. I think I explained decently what I did. I don't see any way to explain it better. I even made a simple example of the calculation.
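The random-group idea described in this post can be sketched roughly as follows. This is one reading of the description, not the actual program: draw many random groups of the same size as a race group, see how far their averages stray from the overall average, and use the distance that a given share of them stay within as the threshold. The function name, the number of trials and the fixed seed are all assumptions.

```python
import random
from statistics import mean

def random_group_range(mmrs, group_size, trials=2000, coverage=0.99, seed=0):
    """Distance from the overall mean that `coverage` of random groups stay within."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    overall = mean(mmrs)
    # Absolute deviation of each random group's average from the overall average.
    diffs = sorted(
        abs(mean(rng.sample(mmrs, group_size)) - overall) for _ in range(trials)
    )
    return diffs[min(int(coverage * trials), trials - 1)]
```

A race average would then count as "not just a random point" if its distance from the overall average is larger than the returned value for a group of that race's size.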
|
On July 13 2012 05:00 hunts wrote: Sorry if you answered this before somewhere, but do you have a rough estimate of the MMR ranges of each league? I'd really like to see that as it would be very interesting.
On July 13 2012 05:03 skeldark wrote: Depends on the server (US / EU): #START PROMOTE_OFFSETS bronze - master: 0,754,1050,1280,1536,1993,
On July 13 2012 05:00 BBS wrote: I failed to read about how you canceled out effects like disconnects or other cases in which the game is not representative :o
On July 13 2012 05:03 skeldark wrote: I don't think that disconnects depend on race. Why should one race have more disconnects than another?
TY, is there any information on how high masters MMR goes or about where GM starts? Or is that not available due to the nature of GM. I believe there's still a cap on masters MMR though, wonder if that number is available somewhere.
|
On July 13 2012 05:00 hunts wrote: Sorry if you answered this before somewhere, but do you have a rough estimate of the MMR ranges of each league? I'd really like to see that as it would be very interesting.
On July 13 2012 05:03 skeldark wrote: Depends on the server (US / EU): #START PROMOTE_OFFSETS bronze - master: 0,754,1050,1280,1536,1993,
On July 13 2012 05:00 BBS wrote: I failed to read about how you canceled out effects like disconnects or other cases in which the game is not representative :o
On July 13 2012 05:03 skeldark wrote: I don't think that disconnects depend on race. Why should one race have more disconnects than another?
On July 13 2012 05:47 hunts wrote: TY, is there any information on how high masters MMR goes or about where GM starts? Or is that not available due to the nature of GM. I believe there's still a cap on masters MMR though, wonder if that number is available somewhere.
^^ I edited it out of the copy-paste because this value changes all the time and is not very accurate. +800 on master is a good rule of thumb for GM. But GM is... fucked up. It's not top 200 at all. It's just a bad system.
There is no cap for master MMR; only the lower leagues have a cap, in that they cannot fall under 0 of their tier. If you mean a maximum MMR, we are not sure if it exists. If it exists, this would be terrible because it would destroy the skill function on the high end. We just hope Blizzard is clever enough not to build one in, or to build it only into the matchmaking algorithm.
|
On July 13 2012 05:38 skeldark wrote: I add the random groups to show that my data is not just a random point, because it is outside of the range that random points would produce. I try to come up with simple examples and I just don't know how to explain it differently than I did.
Jesus christ I don't think an explanation could be more obfuscated than that.
On July 13 2012 05:38 skeldark wrote: Your questions are kind of random. I can't explain every sentence I wrote in the OP to you, only for you to then want me to explain every sentence of the explanation. I think I explained decently what I did. I don't see any way to explain it better. I even made a simple example of the calculation.
The questions are not random. What's random is your procedure. Your tests are inconsistent with any conventional test and the conclusions are based on pseudo-stats at best.
In light of the bullshit here, let me post very very concisely what you can do to the data to make this test statistically more sound.
Assume normality - but add a disclaimer that normality is an assumption that you do not take lightly.
Take T, Z, P data pools and calculate the following:
Mean: Average MMR of each race. SD: Use a SD calculator to find the standard deviations of each race.
Test two races at a time (start with T vs Z, etc), using the 2 sample t test. See http://ccnmtl.columbia.edu/projects/qmss/the_ttest/twosample_ttest.html
Test the t-statistic against a P of .01.
Post the data with caveats about the conclusion and caveats about the normality assumption.
- Done.
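The steps above could look roughly like this in code. It is only a sketch under stated assumptions: it uses the unequal-variance (Welch) form of the two-sample t statistic, which may differ from the variant on the linked page, and it approximates the p-value with the normal distribution, which is reasonable only because each race group has thousands of accounts. `welch_t_test` and the list names in the usage note are made up.

```python
import math
from statistics import NormalDist, mean, stdev

def welch_t_test(a, b):
    """Return (t, approximate two-sided p) for the difference in means of a and b."""
    # Squared standard error of each group's mean: variance divided by group size.
    var_a = stdev(a) ** 2 / len(a)
    var_b = stdev(b) ** 2 / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(var_a + var_b)
    # Normal approximation to the t distribution (an assumption of this sketch,
    # fine for large groups, too optimistic for small ones).
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, p
```

Usage would be something like `t, p = welch_t_test(terran_mmr, zerg_mmr)`, repeated for each pair of races, calling the difference significant if `p < 0.01`. The normality caveat from the post still applies.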
|
On July 13 2012 04:46 lolcanoe wrote: You seem to be ignoring the more important questions about defining what you mean by a "data group" and skipping to what you like rehashing 100 times.
On July 13 2012 04:50 skeldark wrote: A data group is a group of the data pool, a subgroup. I cannot mix everything together because I have 3 different races. So to find out the average of one race I have to take all players of only this race. This is the data group, e.g. Terran: all Terran players in the data. A random data group is when I give every player a random number and then take the group where this random number is 1.
On July 13 2012 04:53 lolcanoe wrote: Yes, but what's the statistical value of creating random data groups? And what do you mean you didn't mix them together - aren't your data values calculated by average(t) - average(t,z,p)? And you did use a player-weighted average, right?
On July 13 2012 05:29 lolcanoe wrote: I find it pretty disturbing that I have to requote myself every time I ask questions about the data itself. It's so hard to get to the bottom of how you calculated the confidence interval that I suspect you don't understand what's going on yourself. First we argue about normality, the next moment you tell me it's not based on normality at all. I press further to ask how you are getting the intervals then, and you tell me that 99.99% of "random groups" fall in a certain range. Now there's good reason to ask why these random groups exist, what they are, and what value they add, but once again you seem to be dodging what's important here. Just by looking at the MMR graph alone, it seems like we have a pretty high standard deviation regardless of what assumptions we are making. I suppose what you really want (since you are dealing with averages) is the standard error of the mean, i.e. the standard deviation divided by sqrt(n). But I don't see any of that happening here...
Finally for those of you who keep asking us to do the questions ourselves, the complete data package doesn't seem to be readily available. It also seems to be constantly changing...
I'm all for the rigorous application of statistics, but I think you may be a bit too high up in your ivory tower here. This is clearly not publishable work; on that everyone seems to agree. It is, however, clever manipulation of the data which returned some interesting results. There comes a point where you need to accept this for what it is, because you aren't getting anywhere with what you're doing.
|
On July 13 2012 05:52 lolcanoe wrote:
On July 13 2012 05:38 skeldark wrote: I add the random groups to show that my data is not just a random point, because it is outside of the range that random points would produce. I try to come up with simple examples and I just don't know how to explain it differently than I did.
Jesus christ I don't think an explanation could be more obfuscated than that.
On July 13 2012 05:38 skeldark wrote: Your questions are kind of random. I can't explain every sentence I wrote in the OP to you, only for you to then want me to explain every sentence of the explanation. I think I explained decently what I did. I don't see any way to explain it better. I even made a simple example of the calculation.
The questions are not random. What's random is your procedure. Your tests are inconsistent with any conventional test and the conclusions are based on pseudo-stats at best. In light of the bullshit here, let me post very very concisely what you can do to the data to make this test statistically more sound. Assume normality - but add a disclaimer that normality is an assumption that you do not take lightly. Take T, Z, P data pools and calculate the following: Mean: Average MMR of each race. SD: Use a SD calculator to find the standard deviations of each race. Test two races at a time (start with T vs Z, etc), using the 2 sample t test. See http://ccnmtl.columbia.edu/projects/qmss/the_ttest/twosample_ttest.html Test the t-statistic against a P of .01. Post the data with caveats about the conclusion and caveats about the normality assumption. - Done.
Nice!
Post when you are done. Datasource is linked in the op.
You could already have done it in all the time you spent complaining about what I did. But wait, telling other people what you think they should do is more fun than doing the work yourself, right?
|
Excellent work. It's cool to see that at a certain skill range one race may appear more dominant, but with improvement of your own skill comes balance. Nice to see.
|
On July 13 2012 05:53 VediVeci wrote: I'm all for the rigorous application of statistics, but I think you may be a bit too high up in your ivory tower here. This is clearly not publishable work; on that everyone seems to agree. It is, however, clever manipulation of the data which returned some interesting results. There comes a point where you need to accept this for what it is, because you aren't getting anywhere with what you're doing.
If by clever manipulation you mean random incoherent manipulation, then fine. But there is no ivory tower here - all these basic tests are taught in every college-level stats course.
The t-test, SD, mean, and normality caveats aren't complex entities that require much explanation.
As far as running the test goes, I've only been avoiding it because the data itself has gone through so many changes and critiques over the past days. No point in running the test on data that is still being accumulated or filtered.
Skeldark, assure me that the datafile in the OP is up to date with the most recent corrections and I'll do the test as soon as I get back from work.
|
On July 13 2012 05:53 VediVeci wrote: I'm all for the rigorous application of statistics, but I think you may be a bit too high up in your ivory tower here. This is clearly not publishable work; on that everyone seems to agree. It is, however, clever manipulation of the data which returned some interesting results. There comes a point where you need to accept this for what it is, because you aren't getting anywhere with what you're doing.
On July 13 2012 06:10 lolcanoe wrote: If by clever manipulation you mean random incoherent manipulation, then fine. But there is no ivory tower here - all these basic tests are taught in every college-level stats course. The t-test, SD, mean, and normality caveats aren't complex entities that require much explanation. As far as running the test goes, I've only been avoiding it because the data itself has gone through so many changes and critiques over the past days. No point in running the test on data that is still being accumulated or filtered. Skeldark, assure me that the datafile in the OP is up to date with the most recent corrections and I'll do the test as soon as I get back from work.
It was updated 1h ago. It is a comma-separated CSV file. It has not changed since I created the topic, so search for another excuse. I also missed the point where you asked me for a stable file so you could run some tests. I think I missed it among your complaints.
The source file is never filtered; the filter comes after saving the file.
source file
Good luck.
|
On July 13 2012 05:00 hunts wrote: Sorry if you answered this before somewhere, but do you have a rough estimate of the MMR ranges of each league? I'd really like to see that as it would be very interesting.
On July 13 2012 05:03 skeldark wrote: Depends on the server (US / EU): #START PROMOTE_OFFSETS bronze - master: 0,754,1050,1280,1536,1993,
On July 13 2012 05:00 BBS wrote: I failed to read about how you canceled out effects like disconnects or other cases in which the game is not representative :o
On July 13 2012 05:03 skeldark wrote: I don't think that disconnects depend on race. Why should one race have more disconnects than another?
Because the sample sizes of your Terran/Protoss/Zerg groups aren't really close to equal, so if Terran (the smaller group) has the same number of disconnects as Protoss or Zerg, then it affects it more?
Or since disconnects discourage play time, I can see how it's mostly meaningless, since you had more active members participating than inactive.
|
On July 13 2012 05:00 hunts wrote: Sorry if you answered this before somewhere, but do you have a rough estimate of the MMR ranges of each league? I'd really like to see that as it would be very interesting.
On July 13 2012 05:03 skeldark wrote: Depends on the server (US / EU): #START PROMOTE_OFFSETS bronze - master: 0,754,1050,1280,1536,1993,
On July 13 2012 05:00 BBS wrote: I failed to read about how you canceled out effects like disconnects or other cases in which the game is not representative :o
On July 13 2012 05:03 skeldark wrote: I don't think that disconnects depend on race. Why should one race have more disconnects than another?
On July 13 2012 06:19 furerkip wrote: Because the sample sizes of your Terran/Protoss/Zerg groups aren't really close to equal, so if Terran (the smaller group) has the same number of disconnects as Protoss or Zerg, then it affects it more?
When you assume that the smaller group, Terran, has the same number of disconnects as the bigger group, Zerg, you assume that Terran has a higher percentage of disconnects than Zerg.
Like I said, I just don't see why the race you play should affect the number of disconnects.
But my program doesn't track this anyway:
if you disconnect or leave the game too fast, my program can't detect the game and will not work ^^
|