Statistical Analysis of StarCraft 2 Balance - Page 3
Apokilipse
United States2 Posts
| ||
d_ijk_stra
United States36 Posts
You can still question the adequacy of my model, and therefore the adequacy of my estimated parameters. But at least those values come from data, not from my personal understanding of the game. Personally, I actually think P > Z, T if skills are equal, but this is what I got. | ||
Thrombozyt
Germany1269 Posts
You cannot really group different patches together, as potential 'imbalance' from a former patch will reflect on current patches. Also by using only current data (say March 2011 and onwards) but drawing from more tourneys you actually reduce the number of maps played and therefore the number of parameters you have to determine (as each map carries 3 beta values for the matchups) from a limited set of data. Edit: Changing the data set would also improve the quality of the analysis, because you wouldn't have to make the assumption that the Korean style is the 'gold standard' and rather take data from all over the world avoiding local bias. | ||
Primadog
United States4411 Posts
On May 05 2011 14:26 Nontrivial wrote: Although I'm no math major, I'm quite impressed with what I do understand. I do have one question, though: how close is this to what the balance team talked about at Blizzcon? Here is the link to what I'm referring to: Link
This paper's approach differs from the balance team's. d_ijk_stra's approach is to create a statistical model for competitive StarCraft that uses only two variables: (1) player skill and (2) map racial bias. He then shows that the model is a good fit for the GSL data. Finally, he asks whether the model demonstrates any strong racial bias (using an average of the map racial bias variable) and concludes that no significant bias has been observed thus far. What is significant here is that his approach uses competitive play data, which the community generally considers a better indicator of game balance than the ladder. Secondarily, he created a model, fitted to this data, that separates player skill from map racial preference, which is important for studying whether there is an imbalance in the game. | ||
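A rough sketch of the two-variable model summarized above may help. The exact functional form is not stated in this thread, so the logistic (Bradley-Terry style) link, the function name `win_probability`, and all numbers here are assumptions for illustration, not the paper's actual specification.

```python
import math


def win_probability(skill_a, skill_b, map_bias):
    """P(player A beats player B) under an assumed logistic model.

    skill_a, skill_b: latent skill of each player (log-odds scale).
    map_bias: the map's beta for this matchup, i.e. the advantage of
        A's race over B's race on this map (0.0 for a mirror matchup).
    """
    return 1.0 / (1.0 + math.exp(-(skill_a - skill_b + map_bias)))


# Hypothetical values, not estimates from the paper.
p_even = win_probability(1.0, 1.0, 0.0)    # equal skill, neutral map
p_biased = win_probability(1.0, 1.0, 0.3)  # equal skill, map favors A's race
```

With equal skills, any deviation from 50% comes entirely from the map term, which is why averaging the map betas per matchup can be read as a balance estimate.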
palanq
United States761 Posts
are you going to do more, or was this just for a class or something? if you are, you should scrape TLPD for broodwar proleague games or something, which would give you a lot more data, enough to do multi-period analysis and see how the parameter estimates change over time. plus you don't have as many inter-game dependencies as there are with best-of-X series. | ||
aksfjh
United States4853 Posts
The only "beef" I have with it is that it covers a rather volatile period of SC2 (with frequent patches completely changing matchups), in a region that has been predominantly Terran-based since release. Not only that, but the Protoss from that region have also failed to perform on an individual basis in individual matches. | ||
space_yes
United States548 Posts
| ||
space_yes
United States548 Posts
| ||
d_ijk_stra
United States36 Posts
On May 05 2011 14:57 Thrombozyt wrote: I guess it would be better to use a different data set, as the game has vastly changed from October 2010. With Steppes of War and Delta Quadrant still being in the map pool and many balance changes not being in place (roach range increase anyone?). You cannot really group different patches together, as potential 'imbalance' from a former patch will reflect on current patches. Also by using only current data (say March 2011 and onwards) but drawing from more tourneys you actually reduce the number of maps played and therefore the number of parameters you have to determine (as each map carries 3 beta values for the matchups) from a limited set of data. Edit: Changing the data set would also improve the quality of the analysis, because you wouldn't have to make the assumption that the Korean style is the 'gold standard' and rather take data from all over the world avoiding local bias.
I strongly agree with your comment and with space_yes's. At the time I was conducting the analysis, it was March and I didn't have a good understanding of tournaments other than GSL. Moreover, gamers in GSL were isolated from others. But I didn't have enough GSL games for each patch, so I had to aggregate them all. I also feel very uncomfortable about this. Now the situation is a little different. There are many ongoing "global" leagues like NASL/TSL, which I also enjoy watching, so I have a larger number of games from around the world and it might be enough to conduct a valid analysis. I hope I can do a follow-up analysis soon! | ||
slyboogie
United States3423 Posts
| ||
Valroth
New Zealand28 Posts
| ||
GhettoSheep
United States150 Posts
| ||
TheRabidDeer
United States3806 Posts
On May 05 2011 15:29 GhettoSheep wrote: I like how you admit that your results aren't statistically significant.
There is nothing to admit; it's stating a fact. Saying he admits to something makes it sound like it's something bad. Anyway, look forward to the next one! GL with all of your coursework! EDIT: Or, I think maybe you misunderstood what statistical significance is? | ||
d_ijk_stra
United States36 Posts
On May 05 2011 15:24 Valroth wrote: A lot of effort for a fundamentally flawed analysis. You say that you've taken player skill into account, which is something that cannot be measured statistically in matches between different races. Measuring player skill based on mirror matches and then using that to add/reduce weight to balance statistics in matches between different races is logically misleading. I found it interesting anyway.
This is a good point, but I don't think the analysis is fundamentally flawed. This model assumes that each player's skill is the same in every matchup. That may not be true, as we know from BW that some gamers are really good vs. a specific race and poor vs. another. But I think most gamers show a consistent level of skill between games, so the overall analysis may not be that misleading. And actually, without such an assumption it's impossible to quantify the balance between two races... You may still disagree with this, and then reject the results. Every statistical model makes assumptions to cope with limited data, and I think whether the assumption is valid or not is a constructive discussion. I think the assumption is not that strong... But it's reasonable to question it. I have some ideas about more sophisticated models to account for this... Hope I can show results soon. | ||
han_han
United States205 Posts
| ||
Primadog
United States4411 Posts
On May 05 2011 15:24 Valroth wrote: A lot of effort for a fundamentally flawed analysis. You say that you've taken player skill into account, which is something that cannot be measured statistically in matches between different races. Measuring player skill based on mirror matches and then using that to add/reduce weight to balance statistics in matches between different races is logically misleading. I found it interesting anyway.
There are not enough data points available to estimate every player's skill level in each particular match-up, but the tests he used showed that his model fits the dataset well despite this flaw. You also mischaracterized how skill is measured and used in the first place. When you build a statistical model, you have to make certain assumptions that may not completely reflect reality. It's the nature of dealing with any large set of data. If you believe an assumption is incorrect, create a better model and demonstrate that it fits the data better. Believing that making assumptions somehow discredits a model simply shows that you have absolutely no idea how Statistics as a hard science works. | ||
Techno
1900 Posts
On May 05 2011 13:54 d_ijk_stra wrote: Techno/ Well this is what is called the 'Latent Variable' method, which enables you to model what cannot be observed. It need not be defined or observed, although it's convenient to 'interpret' it that way. Actually the latent variable method is a very popular technique these days, although not covered in basic statistics courses (even at the graduate level). I think you confused it with random effects / hierarchical models in ANOVA. You don't really need to assume that the latent variable follows a normal distribution. Of course, without any regularization it will overfit the data, and using the assumption of a normal distribution is a good way to regularize your parameters. But you can also use other types of regularization... I used an L1 penalty for other reasons. However, I guess you may not want to discuss this much technical detail.
I really think it would have been better if you had used win rates of certain leagues, assuming skill is either not present or normally distributed, as it is debatable that skill even exists outside of winning. And if you do include skill, you should include variables like:
- Skill's effect on racial performance
- Skill's effect on this map
- Skill's effect on this strategy (perhaps strategy is a part of skill, perhaps not)
I feel like skill is a very abstract concept that cannot be precisely defined, even by God. I feel like it has no place in statistical analyses. I may be wrong, but those are just my thoughts. I mean no disrespect to your report; in fact, I respect it. | ||
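To make the regularization point in the quoted post concrete, here is a minimal sketch. It assumes a plain logistic skill model with no map term; the function names, the toy game list, and the penalty weight `lam` are hypothetical and not taken from the paper. A normal prior on skills corresponds to an L2 penalty, while the L1 penalty mentioned in the quote corresponds to a Laplace prior.

```python
import math


def neg_log_likelihood(skills, games):
    """Negative log-likelihood of (winner, loser) index pairs under a
    logistic model: P(w beats l) = sigmoid(skills[w] - skills[l])."""
    total = 0.0
    for winner, loser in games:
        diff = skills[winner] - skills[loser]
        total += math.log(1.0 + math.exp(-diff))
    return total


def penalized_loss(skills, games, lam, penalty="l1"):
    """Negative log-likelihood plus an L1 or L2 penalty on the skills."""
    if penalty == "l1":
        reg = sum(abs(s) for s in skills)
    elif penalty == "l2":
        reg = sum(s * s for s in skills)
    else:
        raise ValueError("penalty must be 'l1' or 'l2'")
    return neg_log_likelihood(skills, games) + lam * reg


# Toy data: player 0 beat player 1 three times and never lost.
games = [(0, 1)] * 3
modest, extreme = [1.0, -1.0], [5.0, -5.0]

# Without a penalty, ever more extreme skills keep lowering the loss
# (overfitting); with a penalty, the modest estimate is preferred.
unpenalized_prefers_extreme = (
    neg_log_likelihood(extreme, games) < neg_log_likelihood(modest, games)
)
l1_prefers_modest = (
    penalized_loss(modest, games, lam=1.0) < penalized_loss(extreme, games, lam=1.0)
)
```

The undefeated player is the textbook case: the unpenalized fit would push that skill toward infinity, which is the overfitting the quote refers to.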
Primadog
United States4411 Posts
| ||
awesomoecalypse
United States2235 Posts
On May 06 2011 05:04 Primadog wrote: Skill as a normally distributed variable that influences win rate is the fundamental part of games and sports ratings, dating back to the beginnings of chess Elo. Every Elo, TrueSkill, or computerized/holistic ranking system you see on major sports and gaming sites is based on the concept of skill as a measurable variable. There's nothing innovative or surprising about this assumption.
this is true, but all these systems tie skill to winrate, which is something some players dispute. a guy like IdrA would argue that cheesy players are "unskilled" even when they win, something a formula would clearly dispute. But, as someone who thinks that mindset is counterproductive nonsense, and that a win is a win, I'm all for this system. | ||
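For reference, the Elo idea mentioned above can be sketched in a few lines. This uses the commonly published logistic form of the expected score with a 400-point scale; the K-factor of 32 and the example ratings are assumptions chosen for illustration.

```python
def elo_expected(rating_a, rating_b):
    """Expected score of player A against player B (logistic Elo form)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a, rating_b, score_a, k=32.0):
    """Return both players' new ratings after one game.

    score_a is 1.0 if A won and 0.0 if A lost. Only the result enters
    the update; how the game was won plays no role.
    """
    expected_a = elo_expected(rating_a, rating_b)
    new_a = rating_a + k * (score_a - expected_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - expected_a))
    return new_a, new_b


# Two equally rated players; A wins.
new_a, new_b = elo_update(1500.0, 1500.0, 1.0)
```

Because the update sees only wins and losses, a "cheesy" win moves the rating exactly as much as any other win, which is the point of disagreement described in the post above.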
hypnobean
89 Posts
| ||