|
On October 21 2009 12:43 motbob wrote:
On October 21 2009 12:30 zulu_nation8 wrote: yea skill level certainly doesn't affect the outcome of a game.
Skill doesn't affect one race over another, which means it doesn't have to be taken into account in a statistical analysis. EDIT: unwise -_-
3 good Zerg players vs 3 bad Protoss players, ZvP goes 3-0 for Zerg, so "Z imba vs P" = true.
Of course it matters. You can pretend it doesn't but it does. Always. Skill level difference in the games must be taken into account for accurate analysis.
|
motbob
United States12546 Posts
On October 21 2009 12:45 zulu_nation8 wrote:
On October 21 2009 12:43 motbob wrote:
On October 21 2009 12:30 zulu_nation8 wrote: yea skill level certainly doesn't affect the outcome of a game.
Skill doesn't affect one race over another, which means it doesn't have to be taken into account in a statistical analysis. EDIT: unwise -_-
skill affects the outcome of the game, which is what you're plotting.
yes but if the independent variable doesn't affect one race over the other, the coefficient in the regression would be zero... so it wouldn't affect anything.
Now normally I would say "well, whatever, there's no harm in adjusting for skill on the off-chance that the coefficient might not be zero" but there's no easy way to measure player skill.
Uh, actually, I think I might be wrong about that. I just had an idea about how we might measure skill without majorly fucking up the data (which would happen if we used ZvP ELO).
|
motbob
United States12546 Posts
On October 21 2009 12:47 Shikyo wrote:
On October 21 2009 12:43 motbob wrote:
On October 21 2009 12:30 zulu_nation8 wrote: yea skill level certainly doesn't affect the outcome of a game.
Skill doesn't affect one race over another, which means it doesn't have to be taken into account in a statistical analysis. EDIT: unwise -_-
3 good Zerg players vs 3 bad Protoss players, ZvP goes 3-0 for Zerg, so "Z imba vs P" = true.
Of course it matters. You can pretend it doesn't but it does. Always. Skill level difference in the games must be taken into account for accurate analysis.
WTF, this argument makes no sense. If there are an equal number of good Zergs and good Protosses in the world, this will have no effect overall.
|
I think we have to assume, however, that the skill levels of the players should be approximately equal across a population this large; as long as we have a large enough n for our population we can assume a normal curve.
Anyways, I have no idea why everyone is doing all these strange tests when a simple 1-proportion z-test is all that's needed to analyze the data.
So all conditions are met for a 1-prop z-test, including a large enough n.
So we run the test using n = 885, x = 524, and a null proportion of .5.
Null hypothesis that the proportion is .5, against the alternative that it is greater than .5.
We wind up with a z-score of 5.479, which gives us a probability of 2.14e-8. That means there is an incredibly tiny chance of this proportion occurring purely by chance, and thus ZvP must be imbalanced.
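For anyone who wants to check the arithmetic, here is a minimal sketch of that 1-proportion z-test, assuming the counts quoted above (885 games, 524 Zerg wins); scipy is only used for the normal tail probability:

```python
from math import sqrt
from scipy.stats import norm

# Counts as given in the thread (assumed): 885 ZvP games, 524 Zerg wins.
n, x = 885, 524
p0 = 0.5                # null hypothesis: the matchup is even
p_hat = x / n           # observed Zerg winrate, about 0.592

# One-proportion z-test, using the null proportion for the standard error.
se = sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / se
p_value = norm.sf(z)    # one-sided: P(Z > z)

print(f"p_hat={p_hat:.3f}  z={z:.3f}  one-sided p={p_value:.2e}")
# Prints roughly z=5.479 and p=2.1e-8, matching the numbers above.
```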
|
motbob
United States12546 Posts
On October 21 2009 13:17 Traveler wrote: Anyways, I have no idea why everyone is doing all these strange tests when a simple 1-proportion z-test is all that's needed to analyze the data.
Heh. That's a fair question, but if you look a few pages back a bunch of us did z-tests. Unfortunately it didn't convince anyone.
|
On October 21 2009 09:00 motbob wrote: OK, my econometrics textbook says that my way is correct, but I think it might be wrong and that I thus did the test slightly wrong. I can see why your method of getting the SD from the null hypothesis is better than the way I'm doing it. I'll keep doing research.
The standard deviation (as you are using it, per observation) from the null hypothesis is also .49, so in this case the difference is moot.
On October 21 2009 12:20 mahnini wrote: what about individual players? practice times? team makeup? map trends, style shifts, build order revelations? there are a lot of things aren't being taken into account that should be. Ideally, I agree with all of this.
Realistically, there isn't (and won't be) enough data to account for all of the variables.
What I'd do is this:
1) Treat Maps as if they are a part of the "balance" being measured.
I think it's universally agreed that the balance of any matchup is influenced by the maps. Saying "Z>P" kinda implies that "Z>P on the maps that are being used in progaming," because obviously P>>Z on something like FPM (as a ridiculous example) and Z>>P on a map like TotM. In practice, balance can be tweaked by either making map changes that favor one race over another, or by modifying unit stats, so they are different tools for doing a similar job.
Since Blizzard is not going to repatch the game (balance-wise), if an imbalance is found then the only way to fix it will be through the maps, regardless of whether the imbalance was somehow found to be more due to maps or more due to some theorycraft issue of how the units/timings interact. (maps change possible timings and unit combinations anyway)
2) Treat Player / Practice / Team / Style / BO as a part of "skill"
If Bisu uses the "Bisu build" to win 65% pvz, while Stork uses the "Stork build" to win only 45% pvz, then the player style is already built into any reasonable way we have of measuring that player's skill in the matchup. In fact, if Stork tried "Bisu style" pvz, he wouldn't be as good at it (it's not what he usually does), and therefore you'd almost have to treat Stork playing like Bisu as if he's a different player than Stork doing Stork-style. There's really no way to separate style from skill. We can assume that players use a style that fits their individual abilities the best.
Same goes for BO. A player will also sometimes use a non-optimal build order in one game to keep the opponent honest (say, the threat of going 9-pool to prevent P from just going 14 Nexus, or occasionally going 9/10 gate to prevent the Z from using a super-greedy build of their own); that sort of decision is part of the same skill/style package.
Furthermore, I think we can assume that progamers are using the best mix of builds and strategies that a gamer today can reasonably employ. Maybe in the future there will be better metagame, but in the present this is as close to optimal on both sides as we have.
Practice, team, coaching... these all directly influence a player's real skill. Maybe a player who doesn't practice enough won't achieve his full potential, or a player who practices too much might not be rested enough to play at peak efficiency, but again this is already so deeply built into the "player" variable that it's an exercise in futility to try to separate it out. If Jaedong frequently plays in an exhausted state from over-practice, it'll already show up in his elo or whatever is being used to gauge player skill.
A test I'd like to see is to take the data that was already analyzed, and add a column for the P's ELO and the Z's ELO (perhaps once with overall elo, once with matchup-specific elo. The latter might seem preferable, but I think it could also suffer from directly influencing the dependent variable -- ie, suppose hypothetically the matchup is imbalanced, then ZvP elos are going to be artificially inflated even more than overall Z elos, so the inflated elo will appear to explain the winrate)
Then calculate a linear equation
Expected winrate (ZvP) = A + B*(P elo) + C*(Z elo)
B and C would tell us whether the bias changes as the skill level of both players increases or decreases. I have a pet theory that PvZ becomes relatively harder for the P as both players get better, even within the ranks of progamers (most would agree this is the case within a group of low-skill players, so I see no reason why the same trend wouldn't exist at least somewhat within a group of high-skill players); this would be a way to test that.
I don't know if anyone has time to do that... honestly I don't want to type in 1700+ elo entries, although once that's done, finding the equation is simple with about any software.
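To make the "simple with about any software" part concrete, here is a hedged sketch of fitting that equation with statsmodels. Nobody in the thread has actually run this; the CSV name and columns (p_elo, z_elo, z_win) are placeholders standing in for the hand-typed TLPD data:

```python
import pandas as pd
import statsmodels.api as sm

# Hypothetical input: one row per ZvP game, with each player's ELO at game time
# and z_win = 1 if the Zerg won. File name and column names are made up here.
games = pd.read_csv("zvp_games.csv")

# Expected winrate (ZvP) = A + B*(P elo) + C*(Z elo), fit as a linear probability model.
X = sm.add_constant(games[["p_elo", "z_elo"]])
model = sm.OLS(games["z_win"], X).fit()

print(model.params)     # A = const, B = coefficient on p_elo, C = coefficient on z_elo
print(model.summary())
```

If B and C were roughly equal in size and opposite in sign, skill differences would be doing the explaining; if the expected Zerg winrate rises when both ELOs go up together (i.e. B + C > 0), that would be evidence for the "harder for P at higher skill" theory.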
|
Skill does affect the outcome, but it is still okay for it not to be in the equation as long as it isn't correlated with the other things that also affect the outcome.
|
motbob
United States12546 Posts
ok this is going to depend a lot on whether I can get TLPD data in raw form or not.
...but if I do get this data, here's what I'll do.
I'll run a probit regression model with a shitload of variables. I'll have a variable for Z player ELO and P player ELO. I'll have a separate variable for each map (with entries 1 or 0). I'll add more as people suggest them (although I think signet's post above is very smart and I'll probably be using his "lumping" method of counting a bunch of different things under skill).
One of the variables will be a binary variable for whether the match was played after March 1st, 2009. The purpose of the regression will be to see whether this variable is statistically significant.
The way probit works is that it estimates the probability that the dependent variable (whether the Zerg won or not) is 1, and measures how the independent variables (maps and such) affect that probability.
Anyway, that's my plan. I'll carry it out if I can get TLPD data.
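For anyone curious what that would look like in code, here is a rough sketch (not motbob's actual script, which doesn't exist yet); it reuses the same hypothetical per-game table, now with map and date columns, and statsmodels' Probit class in place of Stata:

```python
import pandas as pd
import statsmodels.api as sm

# Hypothetical columns: z_win, z_elo, p_elo, map, date.
games = pd.read_csv("zvp_games.csv")

# Binary time dummy: 1 if the game was played after March 1st, 2009.
games["post_march"] = (pd.to_datetime(games["date"]) >= "2009-03-01").astype(int)

# One 0/1 dummy per map; drop_first avoids perfect collinearity with the constant.
map_dummies = pd.get_dummies(games["map"], prefix="map", drop_first=True, dtype=float)

X = sm.add_constant(pd.concat([games[["z_elo", "p_elo", "post_march"]], map_dummies], axis=1))
probit = sm.Probit(games["z_win"], X).fit()   # models P(z_win = 1) = Phi(X*beta)

print(probit.summary())   # the post_march coefficient and its p-value are the interesting part
```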
|
On October 21 2009 13:17 Traveler wrote: I think we have to assume, however, that the skill levels of the players should be approximately equal across a population this large; as long as we have a large enough n for our population we can assume a normal curve.
Anyways, I have no idea why everyone is doing all these strange tests when a simple 1-proportion z-test is all that's needed to analyze the data.
So all conditions are met for a 1-prop z-test, including a large enough n.
So we run the test using n = 885, x = 524, and a null proportion of .5.
Null hypothesis that the proportion is .5, against the alternative that it is greater than .5.
We wind up with a z-score of 5.479, which gives us a probability of 2.14e-8. That means there is an incredibly tiny chance of this proportion occurring purely by chance, and thus ZvP must be imbalanced.
Look at my post on page 33. I did this, only I used .55 as the normal proportion, because what we're really trying to find out is whether or not ZvP is MORE imbalanced than it has been historically. The results clearly show that ZvP is, in fact, more Zerg-favored now than it has been historically. The only remaining question is why.
It's not player skill - we've accounted for that variable by including a wide selection of Zerg and Protoss pro gamers. I don't think it's the maps - just looking at the map breakdown over the period, we can see that the newer maps aren't any more Zerg-favored than the old maps, which indicates that something else is at work here. This leads us to general strategic advantage. Clearly, Zerg pro players have relatively recently worked out a consistently effective strategy that Protoss players haven't figured out how to counter. It remains to be seen if they will be able to figure out how to counter it. That said, even if we think that they will at some point in the future, the statistics today justify an adjustment to the map pool to bring the matchup closer to 50%.
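As a sanity check on the .55-baseline version, here is the same calculation rerun against a historical 55% null, assuming the same n = 885 and x = 524 counts used earlier (the actual page-33 figures aren't reproduced here):

```python
from math import sqrt
from scipy.stats import norm

n, x = 885, 524     # same counts as earlier in the thread (assumed)
p0 = 0.55           # historical ZvP baseline instead of an even 50/50 null

z = (x / n - p0) / sqrt(p0 * (1 - p0) / n)
print(z, norm.sf(z))   # roughly z = 2.5, one-sided p = 0.006: still significant
```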
|
On October 21 2009 13:37 Matrijs wrote:
On October 21 2009 13:17 Traveler wrote: I think we have to assume, however, that the skill levels of the players should be approximately equal across a population this large; as long as we have a large enough n for our population we can assume a normal curve.
Anyways, I have no idea why everyone is doing all these strange tests when a simple 1-proportion z-test is all that's needed to analyze the data.
So all conditions are met for a 1-prop z-test, including a large enough n.
So we run the test using n = 885, x = 524, and a null proportion of .5.
Null hypothesis that the proportion is .5, against the alternative that it is greater than .5.
We wind up with a z-score of 5.479, which gives us a probability of 2.14e-8. That means there is an incredibly tiny chance of this proportion occurring purely by chance, and thus ZvP must be imbalanced.
It's not player skill - we've accounted for that variable by including a wide selection of Zerg and Protoss pro gamers.
Most of the perceived imbalance will come from the same Zerg players, though, as they will be the ones who are used more often in proleague or advance farther in individual leagues and thus play more games. Unless you are cherry-picking data it would be near impossible to get a truly random assortment of players anyway, simply because more wins = more playtime.
|
On October 21 2009 13:37 motbob wrote: ok this is going to depend a lot on whether I can get TLPD data in raw form or not.
...but if I do get this data, here's what I'll do.
I'll run a probit regression model with a shitload of variables. I'll have a variable for Z player ELO and P player ELO. I'll have a separate variable for each map (with entries 1 or 0). I'll add more as people suggest them (although I think signet's post above is very smart and I'll probably be using his "lumping" method of counting a bunch of different things under skill).
One of the variables will be a binary variable about whether the match was played after March 1st, 2009. The purpose of the regression will be to see whether this variable is statistically significant.
The way probit works is that it finds the probability that the dependent variable (whether the zerg won or not) is 1. It measures how the independent variables (maps and such) affect this.
Anyway, that's my plan. I'll carry it out if I can get TLPD data.
Isn't it the case that your variable of interest also captures everything that happens after March 1st, assuming the maps and skills are fixed? How do you then explain which one (of a lot of factors) after that point in time seems to affect the prob? By including interaction terms?
|
motbob
United States12546 Posts
On October 21 2009 14:40 economist_ wrote:
On October 21 2009 13:37 motbob wrote: ok this is going to depend a lot on whether I can get TLPD data in raw form or not.
...but if I do get this data, here's what I'll do.
I'll run a probit regression model with a shitload of variables. I'll have a variable for Z player ELO and P player ELO. I'll have a separate variable for each map (with entries 1 or 0). I'll add more as people suggest them (although I think signet's post above is very smart and I'll probably be using his "lumping" method of counting a bunch of different things under skill).
One of the variables will be a binary variable about whether the match was played after March 1st, 2009. The purpose of the regression will be to see whether this variable is statistically significant.
The way probit works is that it finds the probability that the dependent variable (whether the zerg won or not) is 1. It measures how the independent variables (maps and such) affect this.
Anyway, that's my plan. I'll carry it out if I can get TLPD data.
Isn't it the case that your variable of interest also captures everything that happens after March 1st, assuming the maps and skills are fixed? How do you then explain which one (of a lot of factors) after that point in time seems to affect the prob? By including interaction terms?
uhhh not quite sure what you're asking here.
|
If I understand correctly, you have a binary variable that tells you whether a match being played after March 1st makes a difference to the probability of Z winning against P. But the question is which factor that happens after March 1st actually leads to such a variation in the probability of winning. For instance, if maps are important, then you interact the map variables with the time dummy to find out whether that is the case.
|
motbob
United States12546 Posts
On October 21 2009 14:53 economist_ wrote: If I understand correctly, you have a binary variable that tells you whether a match being played after March 1st makes a difference to the probability of Z winning against P. But the question is which factor that happens after March 1st actually leads to such a variation in the probability of winning. For instance, if maps are important, then you interact the map variables with the time dummy to find out whether that is the case.
Stata will spit out what effect it thinks the maps have on the winrate of Z... and whatever it spits out for my time dummy should be corrected for things like maps etc.
|
But it doesn't spit out the joint effect of map and time dummy, i.e. the effect of a map after March 1st, if there is any difference in map effects before and after March 1st.
|
motbob
United States12546 Posts
On October 21 2009 15:31 economist_ wrote: But it doesn't spit out the joint effect of map and time dummy, i.e. the effect of a map after March 1st, if there is any difference in map effects before and after March 1st.
OH, you're saying that since some maps were only used after March 1st, that might skew the results.
Wow thanks, I'll have to work out how to fix that. What exactly is the variable that you think I should add to the regression?
|
On October 21 2009 14:07 mahnini wrote:
On October 21 2009 13:37 Matrijs wrote:
On October 21 2009 13:17 Traveler wrote: I think we have to assume, however, that the skill levels of the players should be approximately equal across a population this large; as long as we have a large enough n for our population we can assume a normal curve.
Anyways, I have no idea why everyone is doing all these strange tests when a simple 1-proportion z-test is all that's needed to analyze the data.
So all conditions are met for a 1-prop z-test, including a large enough n.
So we run the test using n = 885, x = 524, and a null proportion of .5.
Null hypothesis that the proportion is .5, against the alternative that it is greater than .5.
We wind up with a z-score of 5.479, which gives us a probability of 2.14e-8. That means there is an incredibly tiny chance of this proportion occurring purely by chance, and thus ZvP must be imbalanced.
It's not player skill - we've accounted for that variable by including a wide selection of Zerg and Protoss pro gamers.
Most of the perceived imbalance will come from the same Zerg players, though, as they will be the ones who are used more often in proleague or advance farther in individual leagues and thus play more games. Unless you are cherry-picking data it would be near impossible to get a truly random assortment of players anyway, simply because more wins = more playtime.
The same holds true for Protoss players, though. The best Protoss players will be used more often in proleague and will advance farther in individual leagues and will thus play more games. There's no reason to think either that this has changed from the way it was historically or that it would be different for Zerg players than it is for Protoss players.
The argument that it is player skill, when stated fully, is clearly wrong. The argument runs that, all of a sudden, about 7 months ago, all or most Zerg players suddenly got a lot better and/or all or most Protoss players suddenly got a lot worse. What, did the Protoss players all get together and agree not to practice for a while? Did the Zerg players all get together and suddenly decide that they needed to practice ZvP a lot more than they ever had before? It's just implausible. A shift in the metagame as a result of some new strategy being adopted by Zerg players is a much more plausible alternative, and I think it's the best explanation.
|
Skill matters NOT
Let's assume, for some reason, that all Zerg players, due to their linkage to the Overmind, have 2x the "raw skill" of Protoss players and win all their matches. Even in such a situation, it would make sense to use maps that favor the Protoss so that they can win; otherwise the scene would be boring.
Sure, professional leagues should reward skill, but the most important part is to generate entertainment.
So if we are getting a ton of ZvZ, it is time to do something about it.
--------
There is, frankly, no complete way to measure skill across race boundaries anyway. One could not say that Bisu is more skilled than Jaedong or even, hell, Hyuk, since the skills just don't translate perfectly over. To be truly honest one would have to guess statistically, with assumptions like "players take up all three races with roughly equal potential and their training is equally efficient" and so on, which break down when dealing with small samples like the top of the progaming scene.
|
On October 21 2009 15:42 motbob wrote:
On October 21 2009 15:31 economist_ wrote: But it doesn't spit out the joint effect of map and time dummy, i.e. the effect of a map after March 1st, if there is any difference in map effects before and after March 1st.
OH, you're saying that since some maps were only used after March 1st, that might skew the results. Wow thanks, I'll have to work out how to fix that. What exactly is the variable that you think I should add to the regression?
You simply interact the map variable with the time dummy to create a new variable, and adding this new variable to the equation will correct for the effect on the outcome of a map not played before (or after) March 1st. It would also differentiate the impact of a map before and after March 1st if that map was played in both periods. I am just taking maps as an example, and it's not a very strong hypothesis, as there is little difference in effect before and after. Similar logic could be applied to player skill, as well as practice time, type of league...
Though this is worth pursuing, I don't think the results would convince people. You have so many things to control for, and most of them might be mismeasured. The measure of skill alone would result in a HUGE debate. Even the choice of March 1st as the cutoff has not been discussed yet, or maybe I overlooked the stats already posted.
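In case the interaction idea is hard to picture, here is a hedged extension of the earlier probit sketch (same hypothetical table and column names as before, not anyone's actual script): each map dummy is multiplied by the time dummy, so every map gets its own before/after effect.

```python
import pandas as pd
import statsmodels.api as sm

games = pd.read_csv("zvp_games.csv")   # hypothetical columns: z_win, z_elo, p_elo, map, date
games["post_march"] = (pd.to_datetime(games["date"]) >= "2009-03-01").astype(int)
map_dummies = pd.get_dummies(games["map"], prefix="map", drop_first=True, dtype=float)

# Interaction terms: map dummy * time dummy, giving every map its own
# "after March 1st" shift instead of one shared time effect.
interactions = map_dummies.mul(games["post_march"], axis=0).add_suffix("_x_post")

# Note: a map used only after March 1st makes its dummy identical to its interaction
# (and a map used only before makes the interaction all zeros); drop those columns by hand.
X = sm.add_constant(pd.concat(
    [games[["z_elo", "p_elo", "post_march"]], map_dummies, interactions], axis=1))
probit = sm.Probit(games["z_win"], X).fit()
print(probit.summary())
```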
|
Somebody had to bump this after JangBi vs Kwanro. I'm sorry.
lol
|