|
On September 03 2010 11:54 yourwhiteshadow wrote: On September 03 2010 10:44 Tray wrote: On September 03 2010 10:40 blacktoss wrote: Yes I can claim that. It's EXACTLY what I'm claiming and it's true. The person you're referring to did the worst math and completely ignored the fact that Terran isn't expected to be 33% representative. It should be relative to the total population of players that play terran if skill is evenly distributed between the three races, which yes, we can assume.
So his one in a billion was not even remotely accurate. In fact these numbers near the tail end are without any doubt statistically insignificant. He didn't lie, he just didn't know what he was doing.
Here's the kicker genius: You're fighting a straw man. The claim made was that Terran make up a disproportionate number of high level diamond players such that the deviation from an even distribution is not due to random chance. The claim is CORRECT. The math is CORRECT. You are the one coming in here and putting words in people's mouths and then saying "haha no u r rong I will now insult you". I don't think you know how statistics works at all. The hypothesis tested was the hypothesis that Terran should make up 33% of the racial distribution in high diamond league (I am not sure if he took into account Random). The statistical test he used showed with high confidence that this hypothesis was false. End of story, that is all the reasoning used. You talk about statistical insignificance but your criticism does not address the point's validity, it attacks it on the basis that "it is not the right hypothesis". Sorry, but that does not say anything about the conclusion drawn.
Maybe you should just look it up for yourself and get smarter instead of being like everyone else in here and demanding I teach you all how statistics work.
This is 5 seconds of google.
A typical example would be when a statistician wishes to estimate the arithmetic mean of a quantitative random variable (for example, the height of a person). Assuming that they have a random sample with independent observations, and also that the variability of the population (as measured by the standard deviation σ) is known, then the standard error of the sample mean is given by the formula:
σ / √n
As n becomes smaller and smaller, where n is the sample size, the error increases.
if n = 25, error = std/5 (std = 1) = 20%
if n = 50, error = std/7.07 = 14%
if n = 100, error = std/10 = 10%
if n = 1000, error = std/31.6 = 3%
Once again, you are full of shit. You come in here saying you are better than anyone else, throw around a few big words, and then cite google and say "hurdy hur". It doesn't matter if the variance is high, because when YOU DO THE ACTUAL TEST (instead of bullshitting with jargon), you find that p is very high. So high in fact that you MUST discard the hypothesis that "Terran is not overrepresented in high diamond". You can say "WELL SIGMA IS HIGH" but it doesn't matter. Statistical tests are robust. They take sigma into account. I am beginning to feel like you don't know anything about statistics at all.
Dude you are incredibly stupid. My analysis is correct. Assuming Terran should make up 33% of the distribution at the top is NOT correct. It should be EQUAL to the total % of players that play terran. Period. It's not debatable. As the sample drops, the error in the 'real' value increases and becomes less accurate. Period. Stay in school.
PRECISELY. you have to take into account the number of players for each race. plus, aside all this BS about whether its imba or not and how we can or cannot show it mathematically. has anyone thought blizzard's system might be IMBA or broken? it's a VERY arbitrary way of assigning "skill", points, or whatever. might one ask a totally different question, that is, "is blizzard's ranking system even set up such that in ideal conditions it will yield a gaussian distribution?"
Your question doesn't make sense. Please revise it?
|
He wants the distribution of player points to follow a bell curve. That is, about 2/3 of players fall within one standard deviation of points, 95% of players fall within 2 standard deviations, etc.
The correct answer is: No, I would hope that Blizzard's system is not set to yield a Gaussian distribution. Player skill is not Gaussian, as shown by extensive analysis of chess games.
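For reference, the "about 2/3" and "95%" figures for a bell curve can be checked directly from the normal distribution. This is only a small sketch using Python's standard library; the function name `fraction_within` is made up for illustration, and it says nothing about how ladder points are actually distributed.

```python
import math


def fraction_within(k: float) -> float:
    """Fraction of a normal distribution within k standard deviations of the mean."""
    # For a normal distribution, P(|X - mean| <= k * sd) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} sd: {fraction_within(k):.1%}")
```

One standard deviation covers roughly 68% of players under this assumption, which is where the "about 2/3" comes from.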
|
sorry, "is blizzards ranking system even set up such that in ideal conditions (a perfectly balanced game where only skill is a determining factor in wins/losses) it will yield a gaussian distribution?"
we're talking about a game that is supposed to bring in revenue for blizzard. if everyone had the perception that they are average, would they continue to play the game? let's say the avg at so and so month is 600 pts, do you want to be an average diamond player? HECK NO, you're in diamond. if you were posting about strategy, you would start off with: "hi i'm a xxx pts INSERT RACE player". what i'm trying to say is, that blizzard might intentionally want to pwn the distribution, or it might just be inherently retarded because blizzard sucks at making a ladder. also, like someone else said, this ladder thing isn't really indicative of skill. you can mass games and be at the top, i mean there is even a BONUS pool.
|
Troll eliminated. Discussion, please continue.
|
aww troll is gone while i wanted to side with his arguments, his trolling was just...trollerific
|
Thank you, Plexa. Your efforts are much appreciated.
So, uh, does anyone remember where we were before Tro- I mean Tray showed up?
|
the last 2 sections of the graph are pretty lol :D
interesting how many protoss in the mid-level ranges
|
So sad to see all the people arguing "statistical significance" with that horrible troll for 10 pages
alone the data is questionable as to its significance yes
but it's not alone is it.....?
it's paired with hundreds of pages of anecdotal evidence, hundreds of hours of personal experience playing and watching games and multiple credible expert sources indicating there is some imbalance......
so to argue that it carries no significance, statistical or otherwise, is incredibly naive or deliberately misleading
also i didn't like the troll's attitude...so arrogant and full of personal attacks, that is completely unnecessary
good job mods
|
So, I've actually been working on a similar analysis for the past few days. I'm going to bring up a few points:
1) It's entirely possible for the "sample size" to be too small, even if it represents the entire population. If you are trying to find a correlation between two variables (race preference and rank, in this case), you have ventured into the mystic realm of inferential statistics. Many inferential methods yield poor results if there aren't enough data points. (This was a major problem in my initial analysis attempts, because the low number of random players was causing problems with the method I employed).
2) I disagree with your method of grouping players by point range. It makes the upper ranges less reliable (due to smaller population, as described above), and those are the most relevant players. I chose to separate players into bins of 50, sorted by rank (as pulled from SC2Ranks). That netted me a graph like this:
As you can see, there is still a possible favoring of Terran, but it's not quite so smooth. Is it just a coincidence that all those Terrans are in the top bin?
3) There are actually ways to mathematically calculate the likelihood of this. I chose to use something called the chi-square goodness of fit test, which essentially tests whether a population matches an expected ratio. Testing each bin, I created the graph below:
The red line marks the threshold of statistical significance. As you can see the top and bottom bins do deviate from the average ratio of the whole in a significant way. This is exactly what the graph should look like if some races perform better than others.
4) So this proves Terran are imbalanced? Unfortunately, no. The method I used has some problems with it.
First, it doesn't specify which races are imbalanced.
Second, it is possible that Random players are "imbalanced," in the sense that players who focus on a specific race play better than players who focus on all races. In that case, such an "imbalance" would not reflect an actual problem with gameplay. See that significant result in bin 12? That's because there is a group of four random players in it. It doesn't take many to skew the results.
Third, this reflects only one region (North America) at only one point in time (two days ago, when I pulled this data from SC2Ranks). If further analysis showed that margin growing or shrinking, or showed different results in different regions, then a fundamental imbalance in gameplay would be unlikely.
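The binning and per-bin test described in points 2 and 3 can be sketched roughly as follows. This is an illustration, not the actual spreadsheet: the ladder below is invented data, the function name `chi_square_by_bin` is hypothetical, and the cutoff of 7.815 is the standard 5% critical value for 3 degrees of freedom, which assumes all four race preferences appear on the ladder.

```python
from collections import Counter

# 5% critical value of the chi-square distribution with 3 degrees of freedom
# (four categories: Terran, Protoss, Zerg, Random).
CRITICAL_DF3_P05 = 7.815


def chi_square_by_bin(races_by_rank, bin_size=50):
    """Chi-square goodness-of-fit statistic for each full bin of consecutive ranks.

    The expected ratio in every bin is the race ratio of the whole list.
    """
    overall = Counter(races_by_rank)
    total = len(races_by_rank)
    stats = []
    for start in range(0, total - bin_size + 1, bin_size):
        observed = Counter(races_by_rank[start:start + bin_size])
        stat = 0.0
        for race, count in overall.items():
            expected = bin_size * count / total
            stat += (observed[race] - expected) ** 2 / expected
        stats.append(stat)
    return stats


# Invented ladder of 200 players, rank 1 first: a Terran-heavy top bin
# followed by three identical, more even bins.
top_bin = ["Terran"] * 35 + ["Protoss"] * 8 + ["Zerg"] * 5 + ["Random"] * 2
other_bin = ["Terran"] * 15 + ["Protoss"] * 18 + ["Zerg"] * 13 + ["Random"] * 4
ladder = top_bin + other_bin * 3

stats = chi_square_by_bin(ladder)
for number, stat in enumerate(stats, start=1):
    flag = "significant" if stat > CRITICAL_DF3_P05 else "not significant"
    print(f"bin {number}: chi-square = {stat:.2f} ({flag})")
```

With this made-up data only the top bin crosses the threshold. As point 4 says, the statistic alone does not tell you which race is responsible for the deviation.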
I am working on this at least a few hours a day, and I hope to have something more conclusive posted this weekend.
-The Scarlet Mathematician
|
On September 03 2010 10:44 Tray wrote: Dude you are incredibly stupid. My analysis is correct. Assuming Terran should make up 33% of the distribution at the top is NOT correct. It should be EQUAL to the total % of players that play terran. Period. It's not debatable.
I know with 95% certainty that in an internet argument, the guy who is not throwing around personal insults is right and the guy who is doing that is wrong.
By the way, please consider what the reasons might be that so many highly skilled players want to use Terran when they are playing to win, and don't choose to use Zerg. (And if you think it's because of the campaign, you must be joking.)
On September 03 2010 10:56 Tray wrote: This is not a correct statement. Blizzard is not balancing the game around statistically insignificant data.
Likely they are using data of a majority of Diamond and then looking at individual games from tournaments and very top ranked players to try to manually spot what we refer to as 'cheese' and 'abuse.' This is because the game is evolving and probably does so from the top down.
LOL, they're nerfing BCs and ultras and leaving marauders alone. Blizzard's balancing techniques involve dartboards and monkeys without a doubt.
|
U guys are reading this completely wrong... Zerg is underpowered and needs fixed. That's pretty well the only thing that is conclusive from this chart.
|
On September 03 2010 16:00 Scarmath wrote: post
oh wow... please update ASAP
|
Maybe this has been provided already, but where can I get the player data in a csv file? I'd like to play around with some tests as well
|
On September 03 2010 11:59 blacktoss wrote: Your question doesn't make sense. Please revise it?
Don't bother, that argument is pointless anyways, because there is no sample size. It's the entire population above a certain pt level. Therefore sigma is zero. There is no margin of error because you have taken the whole population; there are no more people you can add to the sample to change the results.
In case anyone else doesn't know, sigma represents the % your results could be different from the actual results if you had the entire population. If the sample = population then sigma is 0.
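The two positions in this exchange can be put side by side with a small sketch. Strictly speaking, σ is the standard deviation of the population; what shrinks as n grows is the standard error σ/√n, and what drops to zero when the sample is the whole population is that standard error once the usual finite population correction is applied. The function name `standard_error` and the population size of 1000 are assumptions for illustration only. Whether the ladder should be treated as a complete population or as one outcome of a random process is exactly what is being argued, and this does not settle it.

```python
import math


def standard_error(sigma, n, population=None):
    """Standard error of a sample mean, optionally with finite population correction."""
    se = sigma / math.sqrt(n)
    if population is not None:
        # Finite population correction: sqrt((N - n) / (N - 1)), requires N > 1.
        se *= math.sqrt((population - n) / (population - 1))
    return se


# Same numbers as the quoted post (sigma = 1), no correction.
for n in (25, 50, 100, 1000):
    print(f"n = {n:4d}: standard error = {standard_error(1.0, n):.3f}")

# Hypothetical population of 1000 players: the error vanishes at n = 1000.
for n in (25, 500, 1000):
    print(f"n = {n:4d} of 1000: corrected = {standard_error(1.0, n, population=1000):.3f}")
```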
|
On September 03 2010 20:55 Flyingdutchman wrote: Maybe this has been provided already, but where can I get the player data in a csv file? I'd like to play around with some tests as well
first page. ctrl+f "csv" you'll see it.
|
@Scarmath I like what you have done, still I have some questions. What exactly do the bins represent? Have you crawled only the Diamond ladder players, and where is your cut if you have done so - or did you read out the ranking from position 1 to 1000 grouping the results per 50 each into a bin giving you 20 bins? Did you include the high point low league players, too? I am also not sure how to interpret the graph, since it still is discrete and you show us curves, have you run a density approximation over it or is it polynomial fitting? Since it ends right in the middle of the first bin I think the graph itself represents a smoothed curve fitting.
What exactly have you done with the Chi Square test (or better if you want you can post your data as csv - or give a timestamp)?
What do you think is needed to solve the random problem? We could do a very bad approximation and tell that random is 33% Terran, Zerg, Protoss (leading to some minor errors, aka it would screw some findings because gain and loss of a race would not be taken into account equally). I've read that you only watched the NA server, this may further alter your results in comparison with mine. Even since Terran is overrepresented in Korea there are some more Zerg than in the US.
Still I think the ultimate findings and your tests give some nice results (even if I think the random problem has to be solved), about general non diamond league distribution.
I'm looking forward to see your later findings.
*Edit format and made a sentence readable*
|
On September 03 2010 22:01 yourwhiteshadow wrote: On September 03 2010 20:55 Flyingdutchman wrote: Maybe this has been provided already, but where can I get the player data in a csv file? I'd like to play around with some tests as well
first page. ctrl+f "csv" you'll see it.
thanks, but the data isn't exactly in the format I was looking for. I was looking for an export of the entire diamond ladder, either over all regions, but grouped into different regions would be better. The csv in the OP just has the total number of players of each race in the different groupings (900-1000; 1001-1100; etc). Does anybody know of a way to pull the ladderdata into a csv? like player A, x1 points, player B, x2 points? edit: and the data should of course include the preferred race of the player
|
On September 03 2010 16:00 Scarmath wrote: 3) There are actually ways to mathematically calculate the likelihood of this. I chose to use something called the chi-square goodness of fit test, which essentially tests whether a population matches an expected ratio. Testing each bin, I created the graph below: The red line marks the threshold of statistical significance. As you can see the top and bottom bins do deviate from the average ratio of the whole in a significant way. This is exactly what the graph should look like if some races perform better than others.
Can you please give more detail how you did this? Depending on how you treat the data, a chi-square test may or may not be appropriate. I'm also interested in how you combined the fits for the different races.
I think it is incorrect to say the OP's binning method "makes the upper ranges less reliable (due to smaller population, as described above)". There is no reliability to the rankings -- they are what they are. It is not a random sampling of the population, it is the population.
Perhaps you could do your same analysis by running a sliding window of 20 players over the entire population? I think this would yield a very interesting graph. If you do this, you would expect about 1 in 20 points to fall beyond statistical significance; if more than 1 in 20 do at the high range, then there is truly something fishy going on. I'd also be happy to do it myself, but it looks like you already have the spreadsheet to do it easily.
|
On September 03 2010 22:28 ReplayArk wrote: @Scarmath I like what you have done, still I got some questions. What exactly do the bins represent?
The bins are segments of 50 players each. Bin 1 is ranks 1-50, bin 2 is ranks 51-100, etc.
Have you crawled only the Diamond ladder players, and where is your cut if you have done so - or did you read out the ranking from position 1 to 1000 grouping the results per 50 each into a bin giving you 20 bins?
I basically extract the data from the ranking tables on SC2Ranks.com. There is nothing stopping me from doing more or less, but 1000 seemed like a nice starting point.
Did you include the high point low league players, too?
No, that was outside the scope of what I was trying to do.
I am also not sure how to interpret the graph, since it still is discrete and you show us curves, have you run a density approximation over it or is it polynomial fitting? Since it ends right in the middle of the first bin I think the graph itself represents a smoothed curve fitting.
I smoothed the lines to make it pretty. (I've actually stopped doing this on my more recent graphs, but this was the one I had finished).
What exactly have you done with the Chi Square test (or better if you want you can post your data as csv - or give a timestamp)?
The Chi-Square test I used basically tests if an observed population distribution matches a hypothesized distribution. In this case I hypothesized that each bin's distribution matched the overall distribution.
I'll see if I can post the spreadsheet somewhere.
What do you think is needed to solve the random problem? We could do a very bad approximation and tell that random is 33% Terran, Zerg, Protoss (leading to some minor errors, aka it would screw some findings because gain and loss of a race would not be taken into account equally).
Either by testing each race preference individually (i.e. testing the success of Terran vs non-Terran), or by finding a test that can leave out random. I have three or four things I'm looking at, I just need time to slog through the math to make sure they're applicable.
I've read that you only watched the NA server, this may further alter your results in comparison with mine. Even since Terran is overrepresented in Korea there are some more Zerg than in the US.
Still I think the ultimate findings and your tests give some nice results (even if I think the random problem has to be solved), about general non diamond league distribution.
I'm looking forward to see your later findings.
Yes. I think, ultimately, Diamond is the best place to analyze balance because there is no way for good players to be promoted out of it. A comparison between Platinum and Diamond might be interesting, though.
|
Can you please give more detail how you did this? Depending on how you treat the data, a chi-square test may or may not be appropriate. I'm also interested in how you combined the fits for the different races.
I think I answered this above.
I think it is incorrect to say the OP's binning method "makes the upper ranges less reliable (due to smaller population, as described above)". There is no reliability to the rankings -- they are what they are. It is not a random sampling of the population, it is the population.
To take, as an example, the chi-square test, it relies on never expecting a population of zero, and usually expecting a population more than 5. If I set my bin size to 10 instead of 50, in many cases the expected number of random players is less than one. The rankings "are what they are," but the number of people affects how confidently we can say anything about them. Binning by point range means that you have very small bin sizes at the top, which will sap the reliability of analyzing them. That's why I prefer static bin sizes.
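To make the bin-size point concrete, here is a rough sketch of the expected counts per race for a few bin sizes. The race shares are invented for illustration (in particular the 7% for Random is an assumption, not a figure from SC2Ranks), and the helper names are hypothetical.

```python
def expected_counts(shares, bin_size):
    """Expected number of players of each race in a bin of the given size."""
    return {race: share * bin_size for race, share in shares.items()}


def smallest_expected(shares, bin_size):
    """Smallest expected count in a bin; the chi-square rule of thumb wants this near 5 or more."""
    return min(expected_counts(shares, bin_size).values())


# Hypothetical overall race shares on the ladder.
shares = {"Terran": 0.40, "Protoss": 0.31, "Zerg": 0.22, "Random": 0.07}

for bin_size in (10, 20, 50, 100):
    smallest = smallest_expected(shares, bin_size)
    print(f"bin size {bin_size:3d}: smallest expected count = {smallest:.1f}")
```

Under these assumed shares, a bin of 10 expects less than one Random player, and even a bin of 50 stays below 5 for Random, which is consistent with Random being the troublesome category.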
Perhaps you could do your same analysis by running a sliding window of 20 players over the entire population? I think this would yield a very interesting graph. If you do this, you would expect about 1 in 20 points to fall beyond statistical significance; if more than 1 in 20 do at the high range, then there is truly something fishy going on. I'd also be happy to do it myself, but it looks like you already have the spreadsheet to do it easily.
A bin size of 20 is too small (even a bin size of 25 is pushing it), but I did try this with a sliding window of 50. It was...weird. Actually, hold on, let me whip up a graph of it, so you can see what I mean...
EDIT: ...OK, that didn't take as long as I thought. Here is the racial preference by rolling bin chart:
and here are the corresponding p-values:
The graph actually says pretty much the same thing. I went back to static, separate bins, though, because I felt it was easier to visually parse.
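For anyone who wants to reproduce the rolling version, a sliding window over the rank-sorted list can be sketched like this. The function name `rolling_share` and the toy ladder are made up for illustration. One caveat worth keeping in mind: neighbouring windows share 49 of their 50 players, so the points are far from independent and the "1 in 20" expectation for separate tests does not carry over directly.

```python
def rolling_share(races_by_rank, race, window=50):
    """Share of `race` in every window of consecutive ranks, moving one rank at a time."""
    if len(races_by_rank) < window:
        raise ValueError("need at least `window` players")
    # Count the race in the first window, then update incrementally.
    count = sum(1 for r in races_by_rank[:window] if r == race)
    shares = [count / window]
    for i in range(window, len(races_by_rank)):
        count += (races_by_rank[i] == race) - (races_by_rank[i - window] == race)
        shares.append(count / window)
    return shares


# Toy ladder: 30 Terran at the top, then 70 Zerg.
toy_ladder = ["Terran"] * 30 + ["Zerg"] * 70
terran_share = rolling_share(toy_ladder, "Terran")
print(terran_share[0], terran_share[10], terran_share[-1])
```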
|