|
On July 12 2012 07:48 lolcanoe wrote: On July 12 2012 06:20 skeldark wrote:
I understand. Same for me. This is a side project. The time I have, I use to work on the MMR analyser. lolcanoe? How long do you need?
Whether this is a community forum or a university-level discussion doesn't change the validity or importance of the critiques here. Where your argument is posted unfortunately has no bearing on the scrutiny it deserves. And why haven't I been helpful? Because I wanted to see if the self-admitted arrogance here was justified or not. I wanted to confirm my suspicions that you actually do not have a strong grasp of statistics and that I wasn't just misinterpreting your calculations. Now that my suspicions are confirmed, let me clearly lay out the problems that I see.
1. Run an Anderson–Darling test on the data. This can be done with 3 clicks through Minitab, which will automatically give you a P-value for whether or not the data is normal. If you cannot run this test or it tells you that your normality is problematic - note in the OP that your test assumes normality but that the data was not verified to be normal.
2. The specific question here is whether or not one race has a significantly higher MMR average than another. What your current test is actually testing for (although somewhat incorrectly) is whether or not the sample average varies significantly from the population mean. If executed correctly, this test also has application to understanding balance, but it doesn't answer the specific question. The specific question should be tested for under a very simple 2 sample t test (google it) and be tested 3 times - tz, pz, and zt. This is a much better test to fit the question and allows you to ignore the further confusion of taking another average.
3. In these calculations, independence between populations is a fair concern - and should likewise be noted.
4. Finally, be very clear about your conclusion. Your data allows you to conclude that the average skill rating of a certain race is potentially different than the skill rating of another.
It is yet another jump to equate this difference to a problem in a balance, due to a potential cause-correlation problem (ie: Does terran make players bad, or do bad players pick terran?). Unfortunately, there's no way to resolve this concern with the data that you possess, so you'll have to make note of this caveat as well.
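As a rough illustration of point 2 above, here is a minimal Python sketch of a two-sample (Welch) t-test for one race pairing. Everything in it is an assumption made for illustration: the function name `welch_t`, the synthetic ratings, and the use of a normal approximation for the p-value (only reasonable for large samples). It does not use the thread's data and is not skeldark's calculation.

```python
import math
import random
from statistics import NormalDist, mean, variance

def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate two-sided p-value) for Welch's two-sample t-test."""
    na, nb = len(sample_a), len(sample_b)
    # Standard error of the difference in means, without assuming equal variances.
    se = math.sqrt(variance(sample_a) / na + variance(sample_b) / nb)
    t = (mean(sample_a) - mean(sample_b)) / se
    # Assumption: normal approximation to the t distribution (large samples only).
    p = 2.0 * (1.0 - NormalDist().cdf(abs(t)))
    return t, p

# Made-up ratings purely for illustration (NOT the thread's data).
rng = random.Random(0)
terran = [rng.gauss(1480, 300) for _ in range(2000)]
zerg = [rng.gauss(1520, 300) for _ in range(2000)]

t, p = welch_t(terran, zerg)
print(f"T vs Z: t = {t:.2f}, p ~ {p:.4f}")
```

The same call would be repeated for each race pairing. A low p-value here only says the two sample means differ; it says nothing about why, which is the caveat in point 4.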
His methodology is flawed, and obviously he could never publish this or anything, but it seems to be a significant improvement over most posts discussing balance. I didn't look through the calculations process very much, but it seems likely that his conclusions are meaningful (if not rigorous). And since blizzard is not going to use these results to inform their balance discussions, "likely meaningful" isn't an unreasonable standard.
|
On July 12 2012 08:04 Evangelist wrote: On July 12 2012 07:55 lolcanoe wrote: On July 12 2012 07:51 Evangelist wrote: You know, you would think that given that he basically published his data for anyone else to verify, all of the statistical geniuses who keep hammering on him would go do what real mathematicians would do and analyse the data yourself. It seems like a lot of the people who are doing so are desperate to prove something - I'm not sure what, but they remind me of people desperately coming up with ways to disprove supersymmetry in the wake of a Higgs boson discovery.
I suspect they don't like your conclusions, skeldark, though they amount to little more than a few wins in either direction. Your cynicism is understandable but misplaced. The complete data package (with all the updates) is not as transparently presented as you'd think, and it's a large concern to those of us who don't want to attempt conclusions from potentially misinformed, incomplete, or misunderstood data. And what exactly are you attempting to prove? That there isn't a sum total of 5-6 wins difference between terran and zerg and that random players don't have a significantly decreased chance to win games at reasonably high levels of play? Anyone and their donkey can conclude that terran winrates are likely to be lower because TvP can only be reliably won before 20 minutes and both early and late game TvZ are, as of last patch, completely fucked! The fact that this effect has been dampened on the ladder suggests balance is more robust than we all thought! Do you have a better method of representing this data than has been presented? Do you actually have the means to accurately calculate Blizzard MMR? Do you also have the ability to correct this data to a form more likely to be accurate? Now I am not going to pretend to be a mathematician as I'm a physicist and we do retard maths, but there are plenty of people who are. Maybe you do have valid critiques, but valid critiques should come with viable solutions rather than simple accusations of incompetence and assumptions.
This is a bit ridiculous. If you say "I looked out across a large field and didn't see any curvature, so the earth is flat," I don't have to tell you what shape the world is to tell you that your methodology is flawed. lolcanoe isn't saying that the conclusions are necessarily untrue, just that they aren't valid.
|
And what exactly are you attempting to prove?
A statistician has no agenda outside assuring the proper use of statistics, or in the next best case, qualifying the statistics to incorporate important assumptions. Respectfully, the OP himself has noted that there should be no pre-test bias.
Anyone and their donkey can conclude that terran winrates are likely to be lower because TvP can only be reliably won before 20 minutes and both early and late game TvZ are, as of last patch, completely fucked! The fact that this effect has been dampened on the ladder suggests balance is more robust than we all thought!
You're right. "Can" being the key word. They "can" do what they want. And yes, they are entitled to their beliefs and dogmas, just as you are, but until data has been established, your verbal analysis to me is little better than a 4 year old's conclusion that Terran is completely fucked because it starts with a "T".
Do you have a better method of representing this data than has been presented?
Yes. See previous post.
Do you actually have the means to accurately calculate Blizzard MMR? Do you also have the ability to correct this data to a form more likely to be accurate?
I've never found issue with his MMR calculations but neither have I attempted an analysis.
Maybe you do have valid critiques, but valid critiques should come with viable solutions rather than simple accusations of incompetence and assumptions.
You aren't a fundamentalist Christian by chance, are you? Because I've seen this before. "God is real! Fuck science because you guys don't have a valid explanation for the creation of universe either!".
No, I don't need to present an alternative solution even though I have. A viable critique does not need to come with a viable solution - ie, I could point out that you're too stupid to be in this debate, but I'm not qualified to provide a solution because I wasn't an abortion doctor prior to your birth.
|
On July 12 2012 07:24 skeldark wrote: Its more likely ELO not Bayesian inference!
If you mean Blizzard's system, it's not Elo. You really need to watch that video we've been linking.
|
On July 12 2012 08:04 Evangelist wrote: valid critiques should come with viable solutions rather than simple accusations of incompetence and assumptions.
Skeldark's done a lot of work that's potentially interesting, but it has some problems. Whether someone goes back and correctly analyzes his data or not, I think it's in everyone's interest to avoid a situation where in six months people are running around this forum saying "Six months ago, Skeldark proved <whatever>" when in fact the effort wasn't rigorous enough to prove anything.
People are too ready to accept the bottom-line result of someone with charts and numbers that they don't understand or have time to understand, and I think making a case that there are well-founded, legitimate questions about how this was done will ensure that it spurs further (and hopefully more rigorous) analysis instead of a lot of "common knowledge" with a shaky foundation.
It does not make my or lolcanoe's concerns less valid that we don't have the time or inclination to conduct a complete study of the data skeldark has collected.
There may be something there, but we won't really know unless someone looks at it all much more rigorously than the OP has.
|
I just realized that Skeldark was using average MMR and not the latest observed MMR of players.
|
I'm a terran in mid masters and I just played some games as zerg just for shits and giggles. I was ripping those terrans so hard. I felt kind of bad A-moving my entire army at him, especially when he was trying his heart out dropping everywhere. It just feels like I don't have to work very hard for those Ws with zerg.
User was warned for this post
|
On July 12 2012 08:57 NoobCrunch wrote: I just realized that Skeldark was using average MMR and not the latest observed MMR of players.
He did post numbers correcting this after I called him on it -- it's buried in one of the posts deep in the thread. This still doesn't answer my larger concerns but that was a good change.
|
I did a study of my own. I compiled some incidences of HIV/ SIV amongst various primates that supports my null - climbing trees or walking on 4 limbs is why it is not observed in non-human primates. Thanks a lot for all your help. I was not biased in my data collection and I am happy to discover the cure for AIDS.
|
On July 12 2012 08:41 Lysenko wrote: On July 12 2012 07:24 skeldark wrote: Its more likely ELO not Bayesian inference!
If you mean Blizzard's system, it's not Elo. You really need to watch that video we've been linking.
In his defence it's not Bayesian inference either, it's Gaussian Density filtering (I don't believe that the latter is a subset of the former though I could be wrong, Gaussian Density filtering is over my head right now). Either way, it seems that Blizzard also keeps track using an ELO ranking system in parallel? Going back and analysing a player's performance using ELO shouldn't be a huge issue here though, if you can start with the right data, which I'm not convinced he did. Furthermore, looking through the calculations he used (from another thread, links in the OP), they appear to be only a crude approximation of ELO, i.e. not using the correct update formulas. I think that would still be sufficient though to draw some non-rigorous conclusions from the data.
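For reference, the textbook Elo update that "the correct update formulas" presumably refers to can be sketched in a few lines of Python. The K-factor of 32 is just a common chess value picked as an assumption, and the function names are made up for the example. Nothing here is Blizzard's system or the approximation used in the OP.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expected score of A against B (logistic curve, 400-point scale)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))

def elo_update(rating_a, rating_b, score_a, k=32.0):
    """Return both players' new ratings after one game; score_a is 1, 0.5 or 0."""
    e_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - e_a)
    # What A gains, B loses: with a shared K the update is zero-sum.
    return rating_a + delta, rating_b - delta

# Hypothetical example: a 1500 player beats a 1700 player.
new_a, new_b = elo_update(1500.0, 1700.0, score_a=1.0)
print(round(new_a, 1), round(new_b, 1))
```

Because both players share the same K in this sketch, the sum of the two ratings is unchanged by a game, which is the zero-sum property people in this thread keep referring to.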
|
On July 12 2012 09:04 Lysenko wrote: On July 12 2012 08:57 NoobCrunch wrote: I just realized that Skeldark was using average MMR and not the latest observed MMR of players. He did post numbers correcting this after I called him on it -- it's buried in one of the posts deep in the thread. This still doesn't answer my larger concerns but that was a good change.
I honestly don't think that using average or latest mmr makes any difference.
My only issue is that the original post doesn't really show what he did which was essentially a simple t-test for comparing means (mean mmr) from two different samples (zerg and terran). The p-values for these tests were low meaning that zerg players had statistically significantly higher mmr than terran players. I'm still thinking about the independence stuff in between ladder games and I even busted out some of my old statistics textbooks.
The question I have is, if you observe a zerg with 600 mmr, does that change the probability of finding a terran with low mmr? Since MMR is a zero-sum game, a zerg with 600 mmr means that someone else (or groups) must have taken that mmr away and you would be more likely to find a terran with higher mmr. Since observing a zerg with lower mmr changes the probability of finding a terran or protoss with higher mmr, does that mean that independence is violated? If so, is it ok to do the test?
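Taking the zero-sum premise at face value, the size of that dependence can be estimated with a small simulation. This is only a sketch under stated assumptions: ratings are drawn as independent normals and then centred so each pool sums to zero, which is a toy model and not how any real ladder works. The function name is hypothetical. Under this model the correlation between any two players should come out close to -1/(N-1).

```python
import random

def zero_sum_correlation(n_players, trials, seed=0):
    """Empirical correlation between two players' ratings when each pool sums to zero."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(trials):
        pool = [rng.gauss(0.0, 1.0) for _ in range(n_players)]
        centre = sum(pool) / n_players
        # Subtracting the pool mean forces the ratings to sum to zero.
        xs.append(pool[0] - centre)
        ys.append(pool[1] - centre)
    mx, my = sum(xs) / trials, sum(ys) / trials
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

for n in (5, 50):
    print(n, round(zero_sum_correlation(n, 20000), 3), "theory:", round(-1 / (n - 1), 3))
```

In this toy model the dependence shrinks quickly as the pool grows, so with thousands of players it would be tiny. Whether that carries over to real ladder data is a separate question.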
I've done a lot of projects and stuff in the past where major assumptions were violated and it was ok to deviate from them.
|
On July 12 2012 09:24 NoobCrunch wrote: I honestly don't think that using average or latest mmr makes any difference.
It's never correct to average one player's MMR over time, because the MMR is already cumulative of all previous games. The simple case is that a new player's MMR ramps smoothly from 0 up to their actual skill level -- the average will be half what it should be.
More generally, using average and standard deviation to characterize measurements that aren't independent is not correct, and each MMR data point for a single player depends very strongly on the previous one.
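The "half what it should be" case is easy to check numerically. A minimal sketch, assuming a perfectly linear climb from 0 to the player's real level; the 1600 rating and 200 games are made-up numbers and `ramp_history` is a hypothetical helper:

```python
def ramp_history(true_skill, games):
    """MMR after each game for a player climbing linearly from 0 to true_skill."""
    return [true_skill * g / games for g in range(games + 1)]

# Hypothetical player: real level 1600, reached after 200 games.
history = ramp_history(true_skill=1600.0, games=200)
average_mmr = sum(history) / len(history)
latest_mmr = history[-1]
print(average_mmr, latest_mmr)
```

The time-average lands at half of the latest value, even though the latest value is the only one that reflects where the player actually is.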
The question I have is if you observe a zerg with 600 mmr does that change the probability of finding a terran with low mmr.
Yes, that's probably why Elo is better fit by a logistic distribution than a normal distribution, if I had to guess.
As far as the impact of violating your own assumptions -- you can get away with it only in one of two cases, either where you do the work correctly and demonstrate that the results didn't change much, or you make a numerical estimate of the magnitude of the error that your violating the assumption will introduce, and show that it's small and stable as you add to your data set.
If you don't do either of those, however roughly, it's just not possible to know what the impact is. In any case, the result for racial differences that the OP claims is quite small in an absolute sense, and could easily be the result of a very small systematic error.
|
On July 12 2012 09:40 Lysenko wrote: On July 12 2012 09:24 NoobCrunch wrote: I honestly don't think that using average or latest mmr makes any difference. It's never correct to average one player's MMR over time, because the MMR is already cumulative of all previous games. The simple case is that a new player's MMR ramps smoothly from 0 up to their actual skill level -- the average will be half what it should be. More generally, using average and standard deviation to characterize measurements that aren't independent is not correct, and each MMR data point for a single player depends very strongly on the previous one.
I know but we're not dealing with time series data.
|
On July 12 2012 09:43 NoobCrunch wrote: I know but we're not dealing with time series data.
He was taking a single player's MMR and averaging multiple values of it from different times, yes.
|
I find Skeldark's data to be accurate and his methods/assumptions are reasonable. It is not surprising that Terran is underpowered as this is also consistent with the latest tournament results for pro-Terran players.
|
wut, what happened to this thread? Is this TL peer review?
My main concern is still your formulation of the conclusions:
1)Terran is significant underpowered compared to the total data pool We can not tell if this unbalance comes from design or other reason. still will be a trigger to say that the queen range increase broke the game.
After discussing it with you, I know what you mean. But I certainly didn't at the first read, and neither did most of the other readers it seems. And that comes from you using the words "unbalance" and "underpowered" to mean lower mean MMR, while everyone else uses those words to refer to the design of the game. You tried to clarify in the second line, but that only makes it confusing, as it seems to contradict the first line with the standard use of the word "underpowered". You agreed with me that the lower MMR did not necessarily mean that the stats of the units were flawed, but that is still not how most will read your OP.
If you instead would write something like:
1) Terran has on average a lower MMR than the other races. This can be due to a large number of reasons, for example, but not limited to:
- Terran is by design a weaker race, and harder to win games with.
- Lower level players tend to play terran more than higher level players.
- Players tend to start with Terran, and then switch race as they get better.
- Dustin Browder manually hacks into the ladder and decreases the MMR of Terran players.
It is from this analysis impossible to tell what the reason is.
I think you would avoid a lot of the trouble you've ended up in in this thread.
|
On July 12 2012 10:53 Cascade wrote:
- Terran is by design a weaker race, and harder to win games with.
- Lower level players tend to play terran more than higher level players.
- Players tend to start with Terran, and then switch race as they get better.
- Dustin Browder manually hacks into the ladder and decreases the MMR of Terran players.
I support this list. Of course, I think that we should also add the caveat that even though it can't be established from this data, option 1 is significantly more likely than, say, option 4, and option 2 is probably the least likely given what we've seen in the days of GomTvT. In a sense, all of them are possible options, but Occam's Razor would do well to help us find the correct reason once we decide on the proper way to frame the data. Option 3, for instance, can pretty much be considered false until any evidence supporting it is brought forward, because it introduces an additional condition. Options 1 and 2 are the simplest because they are simple reformulations of the conclusion "Terran has the [weakest] MMR" vis-à-vis players behaving the way players might be expected to behave.
In order to show 3, for example, and for it to be significant, you'd need to show that the rate of race switching is higher for players who start Terran than for players who start Zerg, which seems rather difficult to do. Further, it doesn't seem like there's any reason a priori to believe that something about the Terran race makes people more likely to switch away from it, while there are very intuitive reasons to think that more players would choose Terran from the start (i.e. human beings being biased toward human beings etc).
Not really making an argument here, just saying that simply because these possibilities are equal from the point of view of the analysis itself, many of them can be readily dismissed.
|
On July 12 2012 11:03 Shiori wrote: On July 12 2012 10:53 Cascade wrote:
- Terran is by design a weaker race, and harder to win games with.
- Lower level players tend to play terran more than higher level players.
- Players tend to start with Terran, and then switch race as they get better.
- Dustin Browder manually hacks into the ladder and decreases the MMR of Terran players.
I support this list. Of course, I think that we should also add the caveat that even though it can't be established from this data, option 1 is significantly more likely than, say, option 4, and option 2 is probably the least likely given what we've seen in the days of GomTvT. In a sense, all of them are possible options, but Occam's Razor would do well to help us find the correct reason once we decide on the proper way to frame the data. Option 3, for instance, can pretty much be considered false until any evidence supporting it is brought forward, because it introduces an additional condition. Options 1 and 2 are the simplest because they are simple reformulations of the conclusion "Terran has the [weakest] MMR" vis-à-vis players behaving the way players might be expected to behave. In order to show 3, for example, and for it to be significant, you'd need to show that the rate of race switching is higher for players who start Terran than for players who start Zerg, which seems rather difficult to do. Further, it doesn't seem like there's any reason a priori to believe that something about the Terran race makes people more likely to switch away from it, while there are very intuitive reasons to think that more players would choose Terran from the start (i.e. human beings being biased toward human beings etc). Not really making an argument here, just saying that simply because these possibilities are equal from the point of view of the analysis itself, many of them can be readily dismissed.
Yeah, you can make arguments for and against the different reasons. And no matter how straight forward and reasonable the arguments may seem to you, you risk a 10 page heated discussion about it. Possibly involving Higgs bosons and Christian religion.
Just saying that the OP probably would be better off keeping his head out of that beehive. We get that discussion enough anyway, and he has enough material to present a much cleaner OP without going there.
edit: see, with the current formulation, we get posts like the one below.
|
I wish this showed the number of people who got their rank from skill and who got their rank due to imbalances in the game. Regardless, this data is great, and now I don't feel so bad for playing Terran now that I know I am playing a race that is at a disadvantage. Then again, if you ask any Protoss player they will say Terran is OP lol
Keep up the good work Skeldark!!
|
Searched through the new posts:
- Not a single valid argument why the numbers are wrong.
- Not a single calculation over my source data. (You can ignore my results; I published the source data, so you can analyse it yourself.)
MMR: You tell me to watch information that you only know because I or not_that discovered it. You say that the MMR I analyse is not 100% correct without understanding that it is race independent. Besides, none of you know how I analyse MMR or how I correct derivation, or that the method has been working flawlessly for 100,000 games by now. Any mistake I might make in the MMR calculation doesn't affect the result of this calculation, because my MMR calculation is race independent. A simple fact that I point out in the OP and most here ignore.
Definition of imbalance: I am not responsible for people who misinterpret my data. Many of you "statistics guys" do so too! You complain that my definition of imbalance is not the one that people on TL use.
I thought this was clear to people with a statistics background, but to point it out clearly: I detect imbalance in MMR values, not the reason for it, because the reason is not mathematically traceable. Not by me, not by Blizzard, not by anyone.
|