
ZvP is imbalanced - Page 33
Elite00fm
United States548 Posts
motbob
United States12546 Posts
On October 20 2009 14:46 Elite00fm wrote:
also, there is a very high chance you miscalculated the standard deviation

OK tell me what it actually is and how you got it, please.
Elite00fm
United States548 Posts
On October 20 2009 14:46 motbob wrote:
Getting that data would be pure hell. No thanks.

Yeah good point, I guess we could estimate it at like 53-55% though
zulu_nation8
China26351 Posts
On October 20 2009 14:44 Elite00fm wrote:
set null hypothesis to the winrate zerg had for the past 5 years or so before march 1st of this year

should it be that or the average of the zvp stats over every 7 month period ever in progaming? Since it should be the same sample sizes?
zulu_nation8
China26351 Posts
motbob
United States12546 Posts
On October 20 2009 14:48 zulu_nation8 wrote:
should it be that or the average of the zvp stats over every 7 month period ever in progaming? Since it should be the same sample sizes?

No. Elite's suggestion is OK because it compares the recent win rate to the historical ZvP winrate, which might actually serve the purposes of this thread better. My method compares the recent winrate to a rate of 50%. But your suggestion doesn't make that much sense... a rate is a rate.
Heyoka
Katowice25012 Posts
motbob
United States12546 Posts
On October 20 2009 14:59 heyoka wrote:
Is a rate still a rate when you estimate the average at 50% but then go on to say your expected variance is from 1% to 99%?

Uh, yeah, that's the nature of binary data.
zulu_nation8
China26351 Posts
Elite00fm
United States548 Posts
edit: i'm a little rusty when it comes to stats rofl
motbob
United States12546 Posts
On October 20 2009 15:22 zulu_nation8 wrote:
Standard deviation means how far the mean % from other samples of 800 games in the history of progaming can deviate from the null hypothesis. Which should be something like .05 or .1. What your test proved was that basically your numbers are wrong.

Go into excel and use the command stdev on a bunch of numbers. That's the standard deviation I'm talking about. You plug that into this equation (for sigma):

[equation image not preserved]

Please don't criticize my methods again until you do a statistical test of your own. After all, you said you would.
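The two posters above may be describing different quantities: the standard deviation of individual game outcomes, and the standard error of a win rate averaged over many games. A minimal Python sketch of the difference, assuming each game is coded 1 for a Zerg win and 0 for a loss and using the 524-361 record quoted in this thread (the variable names are illustrative, not anyone's actual spreadsheet):

```python
import math
import statistics

# Encode each game as 1 (Zerg win) or 0 (Zerg loss), using the thread's totals.
games = [1] * 524 + [0] * 361
n = len(games)

p_hat = sum(games) / n               # sample win rate
sd_games = statistics.stdev(games)   # spread of individual 0/1 outcomes
se_mean = sd_games / math.sqrt(n)    # spread of the average over n games

print(f"win rate          = {p_hat:.4f}")
print(f"stdev of outcomes = {sd_games:.4f}")
print(f"standard error    = {se_mean:.4f}")
```

Under these assumptions the stdev of the raw 0/1 outcomes comes out close to 0.5, while the standard error of the win rate is far smaller, so a figure near 50% and a figure of a few percent can both be right depending on which one is meant.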
Matrijs
United States147 Posts
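If the purpose is to prove that Zergs have had the advantage over Protoss players the last 3 months or whatever time period, why isn't a null hypothesis of winrate = 50% ideal for that purpose?

If we use the historical average, we invite the argument that Zergs have had a historical advantage over Protoss players, which would corrupt our test. The goal for mapmakers should be 50% winrates over time for each race in all three matchups - why shouldn't we measure their results against that goal?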
Elite00fm
United States548 Posts
On October 20 2009 15:44 Matrijs wrote:
If the purpose is to prove that Zergs have had the advantage over Protoss players the last 3 months or whatever time period, why isn't a null hypothesis of winrate = 50% ideal for that purpose? If we use the historical average, we invite the argument that Zergs have had a historical advantage over Protoss players, which would corrupt our test. The goal for mapmakers should be 50% winrates over time for each race in all three matchups - why shouldn't we measure their results against that goal?

Because the game has always been slightly T>Z>P>T, and this sort of equilibrium has been deemed balanced. It is already assumed that zergs have had a historical advantage over protoss. What we are trying to determine is whether, in the past 7 months, the zerg winrate has risen so far above the historical figure that the probability of this occurring due to variance is very small, and whether an imbalance has in fact emerged in the matchup.
Matrijs
United States147 Posts
On October 20 2009 15:51 Elite00fm wrote:
Because the game has always been slightly T>Z>P>T, and this sort of equilibrium has been deemed balanced. It is already assumed that zergs have had a historical advantage over protoss. What we are trying to determine is whether, in the past 7 months, the zerg winrate has risen so far above the historical figure that the probability of this occurring due to variance is very small, and whether an imbalance has in fact emerged in the matchup.

I don't see this as a particularly compelling argument. If maps are sufficient to significantly alter, and even reverse, the T>Z>P>T historical pattern of imbalance, why should we accept that imbalance? Why shouldn't we aim for T=Z=P=T?

Ignoring that, it seems to me that the proper test is a one-tailed one-proportion z-test: http://www.acastat.com/Statbook/ztest1.htm

The null hypothesis would be that the Zerg winrate over the sampled period equals the historical rate, which we will approximate conservatively as 55%. The alternative hypothesis would be that the Zerg winrate over the sampled period exceeds the historical rate.

By my calculation, that test gives us a z-value around 2.5, which is easily high enough to conclude that the current Zerg winrate exceeds the historical rate, even given the conservative assumption of a 55% historical winrate.

Edit: Including my calculations so others can check my work:

Standard error = sqrt((.55)(.45)/885) ~ .01672
Z-value = (.5921-.55)/.01672 ~ 2.52
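The one-proportion z-test described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `one_proportion_z` is made up here, the sample is taken as n = 524 + 361 = 885 games from the map totals quoted in this thread, and the 55% historical rate is the poster's estimate rather than a measured figure.

```python
import math

def one_proportion_z(wins: int, n: int, p0: float) -> tuple[float, float]:
    """Return (z, one-sided p-value) for H0: p = p0 against H1: p > p0."""
    p_hat = wins / n
    se = math.sqrt(p0 * (1 - p0) / n)            # standard error under H0
    z = (p_hat - p0) / se
    p_value = 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail normal probability
    return z, p_value

# Assumed inputs: 524 Zerg wins in 885 games, historical rate guessed at 55%.
z, p = one_proportion_z(wins=524, n=885, p0=0.55)
print(f"z = {z:.3f}, one-sided p = {p:.4f}")
```

The standard error uses the null rate p0 rather than the observed rate, since the test asks how surprising the sample would be if the null hypothesis were true.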
Elite00fm
United States548 Posts
Matrijs
United States147 Posts
On October 20 2009 16:21 Elite00fm wrote:
How did you get that standard error?

The formula's in the link I posted.

SE = square root (p(q)/n)

where p = population proportion (here, the estimated historical Zerg winrate, .55), q = (1-p) and n = sample size (885 games sampled)

So, it seems to me that, yes, something has changed recently. I see several possibilities:

1) Metagame shift. Protoss players may be struggling to find a good counter for the current popular 3 hatch spire to 5 hatch hydra build. This could be either a temporary effect, which will disappear or reverse itself once Protoss players discover an effective counter, or it could be a permanent effect, if the matchup is sufficiently "played out" strategically.

2) Maps. The new maps may be more Zerg-favored in this matchup than previous maps.

3) Mechanics. No one denies that the mechanics of modern pro players are vastly superior to those in the past. It may be that improved mechanics have more of a positive impact on a Zerg's effectiveness than they do on that of a Protoss.

The bottom line, it seems to me, is that unless we see a reversal of the trend over the next few months, tournament and league organizers should start looking at ways to tweak the existing map pool to bring the matchup back into balance, regardless of the cause. A win rate of nearly 60% for one race over another is just bad for the game at the competitive level.
zulu_nation8
China26351 Posts
On October 20 2009 15:33 motbob wrote:
Go into excel and use the command stdev on a bunch of numbers. That's the standard deviation I'm talking about. You plug that into this equation (for sigma):

[equation image not preserved]

Please don't criticize my methods again until you do a statistical test of your own. After all, you said you would.

motbob i think it's pretty obvious a standard deviation of 50% is wrong. the sooner you realize this and drop the "i'm an econ major, i know stats" attitude, the faster we can move on. A win is not 100%, and a loss is not 0%. that would be the standard deviation if brood war had like 80% half wins or something, and even then it would not make sense, since there would be no statistical significance when EVERYTHING falls within the range of 0 to 1. that's why your numbers are so messed up.
Black Gun
Germany4482 Posts
On October 20 2009 13:26 motbob wrote:
OK I just found a much easier way to compile map matchup data! So when I get access to Stata, I'll have better data. I'll do this for all stats since March 1st, 2009.

Byzantium 3: 25-13
Byzantium 2: 30-11
Tears of the Moon: 1-0
New Autumn Wind: 3-1
Medusa: 34-23
Tau Cross: 7-7
Carthage 2: 2-4
Carthage: 0-1
Battle Royale: 4-5
Holy World: 4-3
Shades of Twilight: 1-3
Colosseum II: 2-4
Andromeda: 7-19 (?????)
Neo Harmony: 5-0
God's Garden: 56-44
Carthage 3: 1-0
Outsider: 41-27
Neo Medusa: 34-25
Return of the King: 47-22
Eye of the Storm: 1-1
El Niño: 1-1
Destination: 110-72 (this changed significantly since the time of the OP... EVER OSL prelims used it)
Tornado: 5-1
Outsider SE: 2-0
Moon Glaive: 2-3
Match Point: 3-4
Heartbreak Ridge: 90-64
Fighting Spirit: 6-3

Overall: 524-361, or 59.21%

the variable we are discussing here is binary, hence the estimator of the mean is the proportion p = 524/(524+361) = 0.592. the sample size is large enough to use a normal approximation.

if we assume a null-hypothesis of a balanced winrate of p0 = 50%, then in the corresponding test we need to use this p0 and not p in the formula for the standard deviation! the test statistic then is:

Z = sqrt(n)*(p - p0)/sqrt[p0*(1-p0)] = sqrt(885)*(0.5921 - 0.5)/sqrt(0.5*(1-0.5)) = 5.479 -> highly significant.

if we assume a null-hypothesis of p0 = 0.55, then we obtain a Z of 2.517 -> p-value of 0.0059, ie significant even on a confidence level of 99%. so the ZvP-winrate during that timeframe significantly exceeds 55%.
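For anyone who wants to check the overall record against the per-map list, here is a small Python sketch that re-adds the quoted numbers. The `records` dictionary is just the list above typed in by hand as (Zerg wins, Zerg losses), and the 40-game cutoff for printing per-map rates is an arbitrary choice for illustration.

```python
# ZvP records per map as (Zerg wins, Zerg losses), copied from the quoted post.
records = {
    "Byzantium 3": (25, 13), "Byzantium 2": (30, 11),
    "Tears of the Moon": (1, 0), "New Autumn Wind": (3, 1),
    "Medusa": (34, 23), "Tau Cross": (7, 7),
    "Carthage 2": (2, 4), "Carthage": (0, 1),
    "Battle Royale": (4, 5), "Holy World": (4, 3),
    "Shades of Twilight": (1, 3), "Colosseum II": (2, 4),
    "Andromeda": (7, 19), "Neo Harmony": (5, 0),
    "God's Garden": (56, 44), "Carthage 3": (1, 0),
    "Outsider": (41, 27), "Neo Medusa": (34, 25),
    "Return of the King": (47, 22), "Eye of the Storm": (1, 1),
    "El Niño": (1, 1), "Destination": (110, 72),
    "Tornado": (5, 1), "Outsider SE": (2, 0),
    "Moon Glaive": (2, 3), "Match Point": (3, 4),
    "Heartbreak Ridge": (90, 64), "Fighting Spirit": (6, 3),
}

zerg_wins = sum(w for w, _ in records.values())
zerg_losses = sum(l for _, l in records.values())
rate = zerg_wins / (zerg_wins + zerg_losses)
print(f"Overall: {zerg_wins}-{zerg_losses}, or {rate:.2%}")

# Per-map rates only mean much with enough games; 40 is an arbitrary cutoff.
for name, (w, l) in records.items():
    if w + l >= 40:
        print(f"{name}: {w}-{l} ({w / (w + l):.1%})")
```

Most maps in the list have only a handful of games, which is why pooling them into one 885-game sample is what makes the normal approximation reasonable.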
zulu_nation8
China26351 Posts
Muirhead
United States556 Posts
ZvP stats are 524-361

If you flip a coin 885 times, the chance of heads coming up 361 times or less is

(Sum(i=0 to 361) (885 C i))/(2^885)

If the coin has, say, a historical 47% chance of heads, then the chance of heads coming up 361 times or less is

Sum(i=0 to 361) (885 C i) * (.47)^i*(.53)^(885-i)

Someone can figure these out in 10 seconds with their TI-89 or Mathematica... unfortunately I can't right now. No need to hide behind fancy stats here!
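The two sums above can be evaluated without a TI-89. A minimal Python sketch, assuming "heads" means a Protoss win and writing the coin's probability as a fraction num/den so the whole sum stays in exact integer arithmetic (the helper name `binom_cdf` is made up for this example):

```python
from math import comb

def binom_cdf(k: int, n: int, num: int, den: int) -> float:
    """P(X <= k) for X ~ Binomial(n, p) with p = num/den, summed exactly."""
    numerator = sum(
        comb(n, i) * num**i * (den - num) ** (n - i) for i in range(k + 1)
    )
    # Python divides arbitrarily large integers into a float without overflow
    # as long as the quotient itself fits in a float.
    return numerator / den**n

p_even = binom_cdf(361, 885, 1, 2)     # fair coin: Protoss wins 50% of games
p_hist = binom_cdf(361, 885, 47, 100)  # assumed historical 47% Protoss winrate

print(f"P(<= 361 Protoss wins | p = 0.50) = {p_even:.3g}")
print(f"P(<= 361 Protoss wins | p = 0.47) = {p_hist:.3g}")
```

Both probabilities are exact binomial tails, so they make no normal approximation at all; they should land close to the one-sided p-values from the z-tests posted earlier in the thread.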
| ||