|
On May 04 2012 10:40 shaldengeki wrote: On May 04 2012 10:35 LaM wrote: On May 04 2012 10:27 shaldengeki wrote: On May 04 2012 10:17 LaM wrote: On May 04 2012 10:08 shaldengeki wrote: While I agree that there are significant issues with the way that many statistics on TL are presented - I've posted on this before and I was the guy who nudged the monthly winrates graphs to add error bars in the first place - you're not doing the discourse any favors by reposting this, I think. There are hardly any statistical arguments actually made in the post - for instance: On May 04 2012 09:00 Cyberonic wrote: The TLPD winrate graphs are praetentious and amateuristic, sorry to say it but that's how it is, the error bars there are pure bollocks and are calculated using the rules of independent probability experiments, that is to say, it is assumed that the results of every series has no effect on the others, as if you flip a coin. If they were independent, sample size would be enough by a large margin to say something, but they are not independent. Because you're dealing with players, not just games. Good players simply ruin the idea of independent experiments.
There is something deeply hypocritical about decrying statistics discussions on TL for being superficial and then totally failing to present statistical evidence for your assertion that games outcomes are not independent. One would think that the actual mathematics would be pretty trivial, so simply asserting that "they are not independent because they are players" is committing exactly the sin that you're supposedly railing against. I don't agree with you at all. I think it is pretty clear that game results are not independent outcomes. Consider a 10game match between DRG and Joe "Code B Protoss" Schmoe. DRG wins the first 9 games. Any rational, logical observer would favor him greatly to win the 10th game, right? But winrate graphs still assume the outcome of Game 10 should have a 50/50 chance of going either way, like it's just a coin flip. Now, in massive sample sizes this would be corrected by enough players from every race being better than their opponent in any series so that it would smooth out any errors, but in month long samples from a tiny group of pros the deviations don't get corrected. I think the math for showing that is extremely hard, but the logic behind it is very strong. Similar to how I'm sure you won't debate that 2+2 is valid, but you would have a pretty damn hard time mathematically proving addition to me. I think you're probably mistaking what the purpose of the winrate charts is, and what "independent outcomes" means in the context of repeated experiments. Of course you wouldn't apply the winrate charts to the situation you're describing - what they do is aggregate results across several skill levels and regions to provide a general indicator of race balance. Nobody is claiming that every single game between a protoss and a zerg has a 50/50 chance of going either way, and if this is how you're interpreting the winrate charts, that's definitely a problem on your end! 
The issue you describe with skill impacting win chances is actually not an issue of independent events at all. If the events were dependent, then the results from all prior games between all zergs and all protosses would impact the win probability of the next game between a zerg and a protoss. This is not the issue at hand in your scenario, where you're talking about skill level of each player impacting win probabilities. That's the realm of ELO, and the winrate charts make no attempt at gauging the skill levels of each player. And you are again making the mistake that winrate charts indicate win probabilities and balance, which is the whole point, THEY DON'T! You wanted error bars added to winrate charts? Why? What error? They are cataloging winrates from the past month, where is the error coming from? My mistake was in even ceding that error bars should be part of the chart and make any sense with them. They don't. I agree my explanation isn't applicable to the charts, but that isn't because my explanation is assuming things incorrectly, it's because the error bars shouldn't be there in the first place... Please remain calm. I'd love to have a level-headed discussion with you! The winrate charts indicate win probabilities aggregated across each race. This is indisputable. They provide what I believe is a general indicator of balance - I don't believe that there are statistically-significant differences in skill between races, so it stands to reason that in the aggregate, this provides some information on the balance between races. The error bars allow you to determine whether one month's average is significantly different from previous months. This is hugely important as before their addition people were making all sorts of wild claims as to how certain patches were throwing race balance off. 
Now that we can determine whether or not each month was significantly-different from previous months, we can more reasonably talk about whether or not changes to the game are having effects on winrates.
I'm very calm thanks 
It's just you seem to be very oddly ignoring the relevant points of discussion in a biased attempt to defend your error bars.
You keep making the claim, with no backing, that the win rates provide some reflection of balance. People, including you in this discussion, always like to attack the straw man that people saying winrates don't indicate balance are making that assumption because of differences in the skill level of the races. If you read the OP, it is addressing exactly why they don't show any useful information about balance, for a set of reasons completely different than the one straw man reason you do address.
I'm still confused as to what you think the error bars add to the discussion. The winrates are results, not predictions. That means the only source of error would be missing results or incorrectly entering results into the graph. But that certainly isn't how you are presenting them. In this case you seem to be advocating them so that people can see a win rate of X% with an imagined deviation indicating it falls within a range from Y% to Z% and therefore not freak out since last month the win rate was A% which also falls between Y% and Z%. That's not really a useful error bar, it's just tacking some extra shit on so that people can continue to try and draw erroneous balance conclusions from something that indicates nothing about balance.
Again, you're viewing winrate graphs as balance graphs. I think that's the root of why we disagree about the error bars.
|
I guess I was hoping for a little bit more of a comprehensive approach to this. Rather than a set of three, relatively specific, "fallacies."
I think there are a lot of people who don't think critically about statistics shown to them, but this doesn't do much to help them. You're just pointing out three possible ways mistakes could be made and then explaining them away with common sense. It's not really an approach from statistics and it's not really about statistics, it's logic. All of what you explained was logic.
Now while people certainly could use a reminder to use logic more often, this is not related to their misunderstanding of the fundamentals of statistics. From your introduction, I got the impression that you were going to explain some fundamentals of statistics and explain how you should apply them to SC2 data.
While this post does indeed mention statistics, it is just used as an arbitrary example for illustrating logical fallacies which are not fundamentally related to statistics in any way. A more apt title would be "Mistakes in reasoning."
|
Reddit is the new 4chan. That said, he has some points in the methodology and idea behind some of the things people commonly mistake; can't say I'm a fan of the way he went about it, but he uses Reddit, so that explains that.
|
All trivially obvious points. I suppose some people have these misconceptions, but this is hardly news. Based on the title, I was hoping for some deeper insights, or at least something more thoughtful, like this.
|
Are win rates an indicator of true balance? I guess not.
But can we even know what the true balance is? I guess not either.
What we use win rates for is to see if the game remains competitive, which will reflect a perceived balance. That is what almost everybody cares about more than a remote fantasy of true balance.
|
Makes solid points. If there is one thing I learned in my stats class, it is how broken statistics can get. It is a very tricky subject to understand.
|
Why do people have to be aggressive and insulting when they're trying to have an intellectual argument?
|
The working assumption of balancing is that each race has equal skill representation among the pool of players. It's a flawed assumption, of course, but there isn't any better way. The samples of professional games aren't big enough, but that's all we've got to work with; the ladder is less reliable in terms of quality. Should we be cautious in our claims? Sure. Finally, the time-interval winrates just need to be weighted by the number of games. Otherwise Tasteless is like "omg 100% winrate at 0-5 minutes", while it most likely means only one game ended that soon and it happened to be a win.
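A rough sketch of why the number of games matters for those time-interval winrates: a Wilson score interval for a win rate. The function name and the 1-game and 100-game samples are made up for illustration and are not TLPD data, and the interval itself assumes independent games, which is the very thing being argued about in this thread.

```python
import math


def wilson_interval(wins, games, z=1.96):
    """Approximate 95% Wilson score interval for a win rate.

    Assumes each game is an independent trial with the same win
    probability (an assumption, not something the thread settles).
    """
    if games <= 0:
        raise ValueError("need at least one game")
    p_hat = wins / games
    denom = 1 + z ** 2 / games
    centre = (p_hat + z ** 2 / (2 * games)) / denom
    half = z * math.sqrt(
        p_hat * (1 - p_hat) / games + z ** 2 / (4 * games ** 2)
    ) / denom
    return centre - half, centre + half


if __name__ == "__main__":
    # Hypothetical samples: one game that happened to be a win,
    # versus 60 wins out of 100 games.
    for wins, games in [(1, 1), (60, 100)]:
        low, high = wilson_interval(wins, games)
        print(f"{wins}/{games}: {low:.2f} to {high:.2f}")
```

The single-game "100%" comes out with a very wide interval, while the 100-game sample is much tighter, which is the point about weighting by game count.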
|
Even though not everyone can agree on this subject, we're likely to agree that the Blizzard dev team is trying TOO HARD to tweak the game. They screw it up the majority of the time instead of waiting and doing minor tweaks every 6 months or so. The game never gets to develop before someone gets the nerf bat just because another race is winning in the metagame. Terran continually get screwed over in late game because the early game is relatively powerful to try and make a ~50% ratio. They've shown countless times that they want a pretty statistic more than any other aspect... and they do it based off their crappy map pool of the past, which is downright idiotic. Bigger maps would've been better than the barracks +5 second build time, for example, instead of having to deal with bigger maps and the change now.
|
I was going to read the OP until I read the first post. He talks about statistics but starts off by saying twice that 99% of people don't know things. I don't think that's a proper way to start an argument about statistics. Especially if you're going to ask the "re-poster", if you will, to post that first paragraph about TL calling us out on how we aren't the best at writing threads.
|
His points are all sensible, and to me they all seemed like common sense.
|
On May 04 2012 09:00 Cyberonic wrote: This post is originally written by Drabzalver on Reddit. Since he does not have a TL account I asked and was allowed to repost if I include this: Drabzalver on Reddit: TL is the [swear word] of praetentious [swear word] where moderators reward supposed 'high quality posts' which are full of statistical and scientific garbage I just outlined just because they are 'praesented nicely' and the mods basically think that any long post with a lot of images and no swear words is intellectually advanced while often a lot of it is total garbage filled with wrong interpretations and grave statistical errors. Also, I don't have a TL account, you're welcome to repost, but do include this qualifier, should be fun,
I somewhat agree with the above statement and I think the fallacies stuff just posted was targeted at those balance whiners in TL
|
A bunch of you lot replying because of his statement and not because of the article proves his point, I'm just saying.
|
On May 04 2012 10:20 SappigeKutVolKots wrote:Okay, I wanted to stay away from this site, but I couldn't let some stuff go unanswered, I am drab, anyone can message me on reddit to verify this.Show nested quote +On May 04 2012 09:40 windsupernova wrote: As much as I agree with him in some points. I don't like how he comes off as someone pretty arrogant and doesn't even present some kind of credentials on why he understand statistics more than 99% of people.I mean for all we know he could be some arrogant College kid who just passed his 1st statistics class. I'm not making an argument from authority, I don't need credentials, even if I was a cow or an anencephalic protozoan, it doesn't matter, there is no need for credentials because I'm making an argument from reason, not from authority, I do not even need to cite any sources because my argument is purely rational, not empirical. If you ask for 'credentials' to verify this post then you lost and don't know how to verify academic literature. My credentials are irrelevant, I'm not making an argument from authority. If you do not find yourself to have the confidence to check the correctness of my argument then you shouldn't agree or disagree either way. Say to yourself 'I don't understand what he's saying', above all, don't comment on a thread whose opening you don't understand, and move on with life. Yes, I am very smug, I'm not even smug, I'm condescending, I'm not condescending because I have a higher education, I'm condescending because I'm fed up with stupidity, the arguments I put out are very easy and basic to understand and honestly, anyone reading those graphs should come to those conclusions, yet I've seen countless and countless people misinterpreting all those graphs without coming to the realization of these very basic givens, on both reddit and TL. 
I've seen 50 pages of TL posts discussing those graphics about probability of races to win at certain time intervals in matchups, and maybe 1-2 people pointed out how misleading it was because of the arguments I put out, and no one listened and other people go discuss trivial and unrelated stuff like 'sample size' while there are much bigger problems. I've seen the TLPD winrates posted on both TL and reddit and people discussed them for days and so few people initially pointed out that the lines between the graphs in the old aesthetic were completely ludicrous and they should be bars, and even fewer people were critical of the fact that the error bars were calculated by a means which assumes independent experiments, which they are not. It doesn't take a genius to see this, it just takes allowing yourself to be critical. As soon as I ask a lot of those people 'There are some grave fallacies with those stats, can you point them out?' they will most likely come with at least 80% of the shit I pointed out and probably with some things I overlooked. It doesn't take a brain, it takes not being a mindless drone and being critical of stuff that is being posted. As for credentials, I guarantee you that the people who post those TLPD winrate graphs either have no statistical credentials, or are wilfully lying to people and oversimplifying it, because it's just statistical faux pas. "But then he doesn't say how we should go about interpreting those statistics and providing proof." We should interpret them as what they are. They are the winrates for this month, it says nothing of balance or any other interpretations you can make of them. You see what you get, and the error bars are, simply put, incorrect and a statistical gaffe. I'm not sure what they are supposed to mean, they don't mean anything if the map scores aren't independent probability experiments. 
Show nested quote +That being said I do think most of the people take a really simplistic approach to statistics, but well statistics are a hard subject to tackle Nope, it's very easy, it's more that people like to see things that you can't conclude from stuff. Show nested quote +On May 04 2012 10:09 LaM wrote:On May 04 2012 10:05 Reptilia wrote: because reddit is so much better. Lol. i wouldnt be surprised if that guy had a tl account and got banned and got so butthurt he posted that. Did you read anything past the qualifier that was added on after his original post on Reddit? Doesn't look like it. Your post has nothing to do with the vast majority of his post. Or where you saying he got butthurt that he got banned from TL so he went to Reddit and wrote an intelligent post about balance statistics and how they can be misleading? All these douchey little TL > Reddit posts are the type of annoying shit that makes people think TL is pretentious anyways. I know he isn't any better for his equally douchey qualifier, but at least he followed it up with an informative, well-written post. Something I haven't seen much of from the Reddit bashers here. At least on Reddit your pathetic contributions would be downvoted enough so that I wouldn't have to waste time responding to them and could help clarify things for people who give a shit about having a meaningful discussion. As linked in the OP, I did not add that qualifier on top myself, I never added the qualifier formally, someone asked me 'Have you posted it on TL' (the OP here), I said 'Nope', he asked 'why?', I said that which he quoted. That said, I never mentioned TL in the original post, I was mainly critical of screddit and its continued misuse of statistics and it got upvoted to be the #1 post on the screddit first page. This exemplifies a quality of screddit that I feel TL heavily lacks.Edit: Also: Pepper_MD just sent you a month of reddit gold! Wasn't that nice? 
Here's a note that was included: I have degree in Stats. All I have to say is Thank You.I have no idea what reddit gold is, is it good?
Even if your argument is true, it was poorly written and therefore deserves the terrible response that you're getting.
You need to realize that most people don't really care about the content of the OP, and it's not because they're stupid. What you're writing about is a niche interest. Stupidity isn't rampant because more people (and by people I mean forum posters, which I imagine is primarily below college age, by the way) don't spend their time critically evaluating video game statistics.
It's amazing to me that you're so indignant about people reacting negatively towards your condescension, simply because your argument is true or whatever it is you think justifies your tone. You're writing as if all that matters is the accuracy of what you're saying and not how you present it. Statistics can't teach you common sense I guess
|
Lol at how pretentiously he spells pretentious. (Also, wrongly, this is English not Latin)
|
On May 04 2012 10:08 shaldengeki wrote: While I agree that there are significant issues with the way that many statistics on TL are presented - I've posted on this before and I was the guy who nudged the monthly winrates graphs to add error bars in the first place - you're not doing the discourse any favors by reposting this, I think. There are hardly any statistical arguments actually made in the post - for instance: On May 04 2012 09:00 Cyberonic wrote: The TLPD winrate graphs are praetentious and amateuristic, sorry to say it but that's how it is, the error bars there are pure bollocks and are calculated using the rules of independent probability experiments, that is to say, it is assumed that the results of every series has no effect on the others, as if you flip a coin. If they were independent, sample size would be enough by a large margin to say something, but they are not independent. Because you're dealing with players, not just games. Good players simply ruin the idea of independent experiments.
There is something deeply hypocritical about decrying statistics discussions on TL for being superficial and then totally failing to present statistical evidence for your assertion that games outcomes are not independent. One would think that the actual mathematics would be pretty trivial, so simply asserting that "they are not independent because they are players" is committing exactly the sin that you're supposedly railing against. Please reconsider reposting topics like this in the future, or at the very least, try to be productive and rigorous in your arguments if you truly want TL to be a community that is rigorous in its discussion!
I think this is really more helpful than the OP. For something that starts with a complaint about TL being pretentious, I found the post really condescending and a gross oversimplification. Yes, it is true that people on TL do awful things with statistics, but as a statistician at a University, the people that I provide numbers to at work often do just as awful things with much more serious numbers. As well, for a post decrying people's statistical knowledge, it doesn't help to begin your post by making up some stats.
Also, if you want to prove something about math, you have to use math. Don't just tell people that they are wrong, help people understand how to use stats properly.
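In that spirit, here is a toy simulation of the independence question. It is only a sketch under made-up assumptions: every series is 5 games, one side is favoured 80/20 in each series (chosen at random so the overall win rate is still 50%), and there are 100 series per month. None of these numbers come from TLPD. It compares how much the monthly win rate actually varies in that model with the standard error you would compute by treating all 500 games as independent coin flips.

```python
import math
import random


def simulate_month(rng, n_series=100, games_per_series=5):
    """Return the aggregate win rate of one simulated month."""
    wins = 0
    for _ in range(n_series):
        # Hypothetical skill gap: the favoured side wins 80% of games,
        # and which side is favoured is picked at random per series.
        p = rng.choice([0.2, 0.8])
        wins += sum(rng.random() < p for _ in range(games_per_series))
    return wins / (n_series * games_per_series)


def compare_spread(n_months=2000, n_series=100, games_per_series=5, seed=0):
    """Return (observed spread of monthly win rates, coin-flip standard error)."""
    rng = random.Random(seed)
    rates = [
        simulate_month(rng, n_series, games_per_series)
        for _ in range(n_months)
    ]
    mean = sum(rates) / n_months
    observed_sd = math.sqrt(
        sum((r - mean) ** 2 for r in rates) / (n_months - 1)
    )
    n_games = n_series * games_per_series
    # Standard error if every game were an independent 50/50 flip.
    naive_se = math.sqrt(0.5 * 0.5 / n_games)
    return observed_sd, naive_se


if __name__ == "__main__":
    observed_sd, naive_se = compare_spread()
    print(f"observed spread: {observed_sd:.4f}")
    print(f"coin-flip standard error: {naive_se:.4f}")
```

In this toy model the observed spread is noticeably larger than the coin-flip figure, because games inside a series share the same skill gap. Whether real tournament data behaves like this model is exactly what would need checking against actual results.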
|
On May 04 2012 10:59 LaM wrote: On May 04 2012 10:40 shaldengeki wrote: On May 04 2012 10:35 LaM wrote: On May 04 2012 10:27 shaldengeki wrote: On May 04 2012 10:17 LaM wrote: On May 04 2012 10:08 shaldengeki wrote: While I agree that there are significant issues with the way that many statistics on TL are presented - I've posted on this before and I was the guy who nudged the monthly winrates graphs to add error bars in the first place - you're not doing the discourse any favors by reposting this, I think. There are hardly any statistical arguments actually made in the post - for instance: On May 04 2012 09:00 Cyberonic wrote: The TLPD winrate graphs are praetentious and amateuristic, sorry to say it but that's how it is, the error bars there are pure bollocks and are calculated using the rules of independent probability experiments, that is to say, it is assumed that the results of every series has no effect on the others, as if you flip a coin. If they were independent, sample size would be enough by a large margin to say something, but they are not independent. Because you're dealing with players, not just games. Good players simply ruin the idea of independent experiments.
There is something deeply hypocritical about decrying statistics discussions on TL for being superficial and then totally failing to present statistical evidence for your assertion that games outcomes are not independent. One would think that the actual mathematics would be pretty trivial, so simply asserting that "they are not independent because they are players" is committing exactly the sin that you're supposedly railing against. I don't agree with you at all. I think it is pretty clear that game results are not independent outcomes. Consider a 10game match between DRG and Joe "Code B Protoss" Schmoe. DRG wins the first 9 games. Any rational, logical observer would favor him greatly to win the 10th game, right? But winrate graphs still assume the outcome of Game 10 should have a 50/50 chance of going either way, like it's just a coin flip. Now, in massive sample sizes this would be corrected by enough players from every race being better than their opponent in any series so that it would smooth out any errors, but in month long samples from a tiny group of pros the deviations don't get corrected. I think the math for showing that is extremely hard, but the logic behind it is very strong. Similar to how I'm sure you won't debate that 2+2 is valid, but you would have a pretty damn hard time mathematically proving addition to me. I think you're probably mistaking what the purpose of the winrate charts is, and what "independent outcomes" means in the context of repeated experiments. Of course you wouldn't apply the winrate charts to the situation you're describing - what they do is aggregate results across several skill levels and regions to provide a general indicator of race balance. Nobody is claiming that every single game between a protoss and a zerg has a 50/50 chance of going either way, and if this is how you're interpreting the winrate charts, that's definitely a problem on your end! 
The issue you describe with skill impacting win chances is actually not an issue of independent events at all. If the events were dependent, then the results from all prior games between all zergs and all protosses would impact the win probability of the next game between a zerg and a protoss. This is not the issue at hand in your scenario, where you're talking about skill level of each player impacting win probabilities. That's the realm of ELO, and the winrate charts make no attempt at gauging the skill levels of each player. And you are again making the mistake that winrate charts indicate win probabilities and balance, which is the whole point, THEY DON'T! You wanted error bars added to winrate charts? Why? What error? They are cataloging winrates from the past month, where is the error coming from? My mistake was in even ceding that error bars should be part of the chart and make any sense with them. They don't. I agree my explanation isn't applicable to the charts, but that isn't because my explanation is assuming things incorrectly, it's because the error bars shouldn't be there in the first place... Please remain calm. I'd love to have a level-headed discussion with you! The winrate charts indicate win probabilities aggregated across each race. This is indisputable. They provide what I believe is a general indicator of balance - I don't believe that there are statistically-significant differences in skill between races, so it stands to reason that in the aggregate, this provides some information on the balance between races. The error bars allow you to determine whether one month's average is significantly different from previous months. This is hugely important as before their addition people were making all sorts of wild claims as to how certain patches were throwing race balance off. 
Now that we can determine whether or not each month was significantly-different from previous months, we can more reasonably talk about whether or not changes to the game are having effects on winrates. I'm very calm thanks  It's just you seem to be very oddly ignoring the relevant points of discussion in a biased attempt to defend your error bars. You keep making the claim, with no backing, that the win rates provide some reflection of balance. People, including you in this discussion, always like to attack the straw man that people saying winrates don't indicate balance are making that assumption because of differences in the skill level of the races. If you read the OP, it is addressing exactly why they don't show any useful information about balance, for a set of reasons completely different than the one straw man reason you do address. I'm still confused as to what you think the error bars add to the discussion. The winrates are results, not predictions. That means the only source of error would be missing results or incorrectly entering results into the graph. But that certainly isn't how you are presenting them. In this case you seem to be advocating them so that people can see a win rate of X% with an imagined deviation indicating it falls within a range from Y% to Z% and therefore not freak out since last month the win rate was A% which also falls between Y% and Z%. That's not really a useful error bar, its just tacking some extra shit on so that people can continue to try and draw erroneous balance conclusions from something that indicates nothing about balance. Again, your viewing winrate graphs as balance graphs. I think that's the root of why we disagree about the error bars.
I want to stick up for the error bars. Just because the graph displays results and not predictions doesn't mean that the error bars are meaningless. If I measure something in the lab, I still place an uncertainty on that value even though it is a direct measurement. Likewise, even if I directly measure winrates, I still have an error rate on them. Why? Because not all games are included. Because these games aren't scientific outcomes and are not reproducible. Because the factors from one month to the next change and error rates are a simple way to show that. Sure, they aren't perfect, but don't think that just because they are a direct measurement that they don't have error and uncertainty attached.
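For what it's worth, the month-to-month comparison from the quoted exchange can be sketched as a simple two-proportion z-test. This is only an illustration, not how the TLPD graphs are actually computed (the thread doesn't say): the function name and the win and game counts below are hypothetical, and the test treats every game as an independent trial.

```python
import math


def two_month_z(wins_a, games_a, wins_b, games_b):
    """z statistic for the difference between two monthly win rates.

    Uses the pooled two-proportion test, which assumes independent
    games within each month.
    """
    p_a = wins_a / games_a
    p_b = wins_b / games_b
    pooled = (wins_a + wins_b) / (games_a + games_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / games_a + 1 / games_b))
    return (p_a - p_b) / se


if __name__ == "__main__":
    # Hypothetical months: 54% over 500 games, then 50% over 500 games.
    z = two_month_z(270, 500, 250, 500)
    verdict = "differs" if abs(z) > 1.96 else "does not clearly differ"
    print(f"z = {z:.2f}, so the win rate {verdict} at the 5% level")
```

With these made-up counts a four-point swing between months stays inside the noise, which is the kind of claim the error bars are meant to make checkable.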
|
You guys don't get it, his post has got nothing to do with statistics, it's about the TL community. He wants to prove his point that you create a thread in TL with a bunch of fancy writing, but the bottom line is, his true intention is to bash the people in TL. Again, I somewhat agree.
|
On May 04 2012 12:34 pOnarreT wrote: You guys don't get it, his post has got nothing to do with statistics, it's about the TL community. He wants to prove his point that you create a thread in TL with a bunch of fancy writing, but the bottom line is, his true intention is to bash the people in TL. Again, I somewhat agree. I'm sure that's why he originally named his thread "What reddit does not understand about statistics" and posted it on reddit. Clearly all to do with TL.
|
Exactly true. What the OP doesn't mention is that Blizz "balances" based on statistics gathered this way. I can't say they are foolish in interpreting the data, but the skewing of data towards balance by the matchmaker and the statistical points in the OP make balancing really hard.
|