Scientific proof that SC2 is imbalanced (sorta) - Page 5

johnlee

United States 242 Posts

August 17 2010 01:25 GMT

#81

Lots of effort put in this, so propz for that.

But what you've essentially done was SIMPLY show variance between winning percentages for each race. So if the data suggested at every level all the races' winning percentages were 50%, would it mean that the game was balanced?

Nope. Not. At. All. Absolutely not.

When we're discussing the term "imbalance" in SC2 we're talking about a specific part of the game that gives an advantage to a specific race such that the other race or races must either rely on the opponent's mistakes to win or be disadvantaged in the game.

For example, when SC1 first came out, spawning pools cost 150 minerals. 4 pool was UNBELIEVABLY strong and anyone who chose to use that build order would most likely win.

BUT for different reasons such as "not wanting to be cheap" or "wanting a standard macro game" people would opt out of that option, and we'd still see a good number of protoss and terran at the top of the ladders. Does that mean the game was balanced? No.

I'm not suggesting that SC2 is balanced, nor am I suggesting that it is imbalanced. I just want to point out that imbalance should only be figured out by seeing which BO or units provide an un-overcome-able (couldn't think of a word LOL) or at least very-difficult-to-overcome advantage to a race between players of EQUAL skill. Exactly how can we quantify that? I don't know.

But we can't show imbalance by using your numbers. I feel bad cause I feel like I'm shutting down your work... but it's wrong.

All of this with due respect.

MamiyaOtaru

United States 1687 Posts

August 17 2010 01:26 GMT

#82

On August 17 2010 10:21 GagnarTheUnruly wrote:

But we would see that on a graph of race use vs. placement. In that case we would see more zerg in lower leagues and fewer in higher leagues. We don't see such a pattern, suggesting that if matchmaking is having an effect, it probably isn't a strong one. Of course there's no way to know for sure without testing it directly.

that assumes each race is chosen equally. What we do know is zerg is the least played. The percentage goes up as you go higher, but it's hard to rule out the possibility that trend is due to newbies playing them less (because they are harder lol). It's almost impossible to measure balance outside of the very top players

koswinner

United Kingdom 27 Posts

August 17 2010 01:35 GMT

#83

On August 17 2010 10:13 GagnarTheUnruly wrote:

Graphing race distribution against league level isn't a statistical test and therefore it doesn't make any assumptions. What that graph shows is that roughly equivalent numbers of games are being played by a particular race at each league level. What this suggests is that there is no sorting effect, whereby a weak race is held back in lower leagues because players that favor that race are having trouble advancing because they are losing games with that race. It is an indirect way of testing that hypothesis. Viewed in the context of the other data, it suggests (but doesn't prove) that AMM is not the only, or even an important, factor in keeping race performance even within leagues.

I totally agree that it would be great to analyze the data using player as an explicit factor, but I don't have access to that data.

I agree totally. It would be fun to do that but again I lack the data. If someone can get it for me, I'll do that analysis.

This post is not very constructive. What you're suggesting is an absurdly complex model. And please don't disparage my abilities as a scientist. I'm actually a really good scientist and I have some skill at dealing with difficult data.

I would like to be able to use a regression model to see how race, placement, and matchup affect the performance of individual players, but as I've noted repeatedly I don't have access to that data. In science when you can't get certain data you need to take indirect approaches that often involve making important assumptions. Often, there are ways to test those assumptions either directly or indirectly, but in this case the data set is extremely complicated, particularly due to match placement.

Also, I really need to emphasize that very few assumptions are required to do a chi square test. There are no distributional assumptions to the test. It simply tells us very clearly that within each of the leagues, if a match is picked at random the outcome is totally independent of which race enters that match. The test doesn't assume that the players are distributed randomly among the races or anything like that. It just tests the hypothesis that states are nonrandomly distributed among the categories being analyzed. The data show that within a league the races have quantifiably different but functionally equal chances of winning randomly selected games. This is a point of fact. There are three non-mutually exclusive possible causes for this that I can think of:

1) the balance is good
2) the matchmaking system is accommodating for poor balance
3) the matchup balance or map balance is poor but it evens out when you ignore the confounding factors

There is no way to test the third cause, so we need to suspend it for now, and defer to better judgement that it is probably happening but may not be extremely important. It's certainly a hypothesis that bears testing, however. The second cause can be tested indirectly by graphing race use frequency against league status. Since there appears to be no pattern, it suggests that the second cause is also not important. This leaves the first cause. Given consideration of the possible causes of this pattern, it is a reasonable conclusion that good balance is probably largely responsible. It also means it's ignorant to make statements like 'Terran is unbalanced,' because there is no evidence to support such a statement, and because the evidence that does exist suggests the opposite.
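For readers who want to see what a chi-square test of independence of this kind involves, here is a minimal Python sketch. The win/loss counts are made up for illustration and are not the ladder data discussed in this thread, and the function names are hypothetical. With three races and two outcomes the table has 2 degrees of freedom, and for exactly 2 degrees of freedom the upper-tail p-value has the closed form exp(-x/2), so no statistics library is needed.

```python
import math


def chi_square_independence(table):
    """Return (statistic, degrees_of_freedom) for a rows x cols table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if outcome were independent of race
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof


def p_value_dof2(statistic):
    """Upper-tail p-value of a chi-square distribution with exactly 2 d.o.f."""
    return math.exp(-statistic / 2.0)


# Hypothetical counts (NOT real ladder data): [wins, losses] per race in one league
games = {"Terran": [520, 480], "Protoss": [505, 495], "Zerg": [475, 525]}
stat, dof = chi_square_independence(list(games.values()))
p = p_value_dof2(stat)
print(f"chi2 = {stat:.2f}, dof = {dof}, p = {p:.3f}")
```

A p-value above the usual 0.05 cutoff would only say that the made-up counts are compatible with outcome being independent of race. As the list above notes, that result by itself does not separate good balance from matchmaking that compensates for imbalance.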

This is not to say that high level players like IdrA, who play in a rarefied realm with tight builds and well rehearsed timing, might not sense conditions that give certain races advantages at certain times. Certainly in BW we've witnessed major shifts of the 'metagame' that resulted in periods of dominance for the various races.

1. What I said was not a proposal to use that model to test the balance issue; that model was just backing my point that that particular variable (attachedness to race) is very likely to be a significant factor in the overly simplified model you were proposing. And I have already enlightened you on how to bypass a problem like that, i.e. for that particular problem, pick samples within the same group, and I already pointed out that data for high-end diamond is readily available.
If you know how to run a regression then I assume you know the devastating effect of omitting one significant variable, don't you? Not to mention that what you omitted is more than one significant variable. So basically what I was proposing was just a multi-factor model, which is soooo common in practice; your single-factor model is just 'absurdly oversimplified'. With such a skyrocketing error term and a tiny R-square caused by omitting significant variables, as an objective scientist I have no idea how you could claim that your overly simplified single-factor model could explain anything at all. So the logic is simple: if that model is way toooo simplified to get the result, don't claim you got the result with some scientific method.

2. 'It also means it's ignorant to make statements like 'Terran is unbalanced,' because there is no evidence to support such a statement, and because the evidence that does exist suggests the opposite.'
ROFL, 'EVIDENCE', you call the result of your overwhelmingly oversimplified model .... EVIDENCE??
And you are ignoring all the other, much more reliable indicators, like the proportion of Z at top level, or opinion polls around the world about 'the weakest race' and 'is ZvT imba'.
Nice scientist :D

petered

United States 1817 Posts

August 17 2010 01:37 GMT

#84

I am not suggesting that any of your math is incorrect, just the way you are interpreting it. You are making really big assumptions that just don't work in my mind.

Within any given game, the chosen race has very little implication on likelihood of victory. Ok, this is a true fact which you have proven.

However, you then go on to state that if there were an imbalance, it would most likely show up in the distribution of races across the different leagues. This is the spot where you are making assumptions. You are assuming little to no race changing, you are assuming (as I said before) that people of different skill levels have the same probability of choosing race X, and you are assuming that people in the same league are getting matched up against similar opponents, which we don't even know. The top players in a league might get matched up against players from the higher league more often.

I really appreciate your efforts but you just can't make conclusions from the tests you have done. Likely there is no way to determine racial imbalances from the data provided.

Bitters

Canada 303 Posts

August 17 2010 01:40 GMT

#85

Not sure if this has been mentioned yet but...

This only looks at leagues instead of ranks within. Really, this seems to be more of a test of the matchmaking system than anything.

If in each league, the top 66.66% ranks were a mix of terran and protoss, with zerg comprising the bottom 33% ranks, how would your test account for this? If the ranks were laid out like this, there would be a clear imbalance since "diamond" zergs couldn't outrank "diamond toss" and would be getting more wins from platinum players. Obviously, this is an extreme case, but it does raise an issue.

Also, looking at racial wins versus other races might be enlightening. What are the TvZ and ZvP percentage wins by race? If balanced, you would assume 50% in each case. However, with this data, you could still run into the problem of breaking it down only by leagues.

gods_basement

United States 305 Posts

August 17 2010 01:47 GMT

#86

one of the main pillars of the "Terran Imbalance" argument is that terran players are just worse than the rest of the world, so therefore the game is imbalanced, because the statistics are even when skill levels are not.

GagnarTheUnruly

United States 655 Posts

August 17 2010 01:53 GMT

#87

@koswinner

I have to admit that the fact that you personally attacked me in your post caused me not to read yours very carefully. I'll try not to repeat that mistake but I still don't know how I could accurately parameterize 'value of investment.' I suppose it would be interesting to see if people do better with one race the more they played with it to the exclusion of the other races, but this would only influence balance if it was more important in some races than others. I just don't think it's the most likely alternative explanation for the results I showed.

As far as omitting variables from a regression, obviously you reduce the predictive power of your model, but it doesn't really affect your ability to determine the importance of the effects that you can model. It's all sort of a moot point for the time being because I have no ability to get player-specific data to run a regression model.

In any case, the reason I did this was for fun. Hopefully people are enjoying this post, or at least having fun picking on me. Definitely people pointed out some things that I didn't think of, but I still think it's neat to think about the results I got and what they might imply about the state of balance of the game. And yes, I do think my results constitute 'evidence.' It's obvious you aren't convinced, but I'm glad others seem to have found some value in my little project.

@ negative feedback people: no worries, I'm not bothered. It's nice to get constructive feedback even if it's negative!

GagnarTheUnruly

United States 655 Posts

August 17 2010 01:58 GMT

#88

On August 17 2010 10:40 Bitters wrote:
Not sure if this has been mentioned yet but...

This only looks at leagues instead of ranks within. Really, this seems to be more of a test of the matchmaking system than anything.

If in each league, the top 66.66% ranks were a mix of terran and protoss, with zerg comprising the bottom 33% ranks, how would your test account for this? If the ranks were laid out like this, there would be a clear imbalance since "diamond" zergs couldn't outrank "diamond toss" and would be getting more wins from platinum players. Obviously, this is an extreme case, but it does raise an issue.

Also, looking at racial wins versus other races might be enlightening. What are the TvZ and ZvP percentage wins by race? If balanced, you would assume 50% in each case. However, with this data, you could still run into the problem of breaking it down only by leagues.

Both good points. As for the first, it seems unlikely that races would be nonrandomly distributed within leagues and not among leagues. In fact, if you go to some of the sources that have been posted, you can see that race distributions are even within leagues. As for the cause, it's hard to say, but the most parsimonious explanation is that the matchmaking system is responding to the races equally. This is consistent with a hypothesis of good general balance.

I'd really like to do the second part. Maybe someone at sc2ranks will see this and send me some raw data.

However, you then go on to state that if there were an imbalance, it would most likely show up in the distribution of races across the different leagues. This is the spot where you are making assumptions. You are assuming little to no race changing, you are assuming (as I said before) that people of different skill levels have the same probability of choosing race X, and you are assuming that people in the same league are getting matched up against similar opponents, which we don't even know. The top players in a league might get matched up against players from the higher league more often.

My reasoning is this: Let's assume that the races are imbalanced and that players have race affinity and never change affinity. If the matchmaking system works, the players with race affinity for a weak race will lose more games than the players with affinity for the strongest race. They'll have trouble rising in the ranks, and the strong-race players will rise quickly, resulting in an uneven distribution of races.

Now relax the assumption about race switching. The players who are unsuccessful will switch over to the superior race, amplifying the effect.

Now assume the matchmaking system doesn't work -- so players are randomly distributed among the leagues -- and the races are imbalanced. In that case the strong race will be more likely to win its matches within each league, because the matchmaking system isn't biasing the leagues at all. The strong race will emerge as the race with the best win %. Even if the matchmaking system could explicitly move players based on both their points and their race (which it doesn't) the only way for it to balance win ratios would be for it to place the entire spectrum of player skills for each race into each league (which we know it doesn't do for obvious reasons).

On the other hand if the game is balanced and the matchmaking system works, then we get the observed results in the most parsimonious way imaginable. Because of its parsimony, and its ability to explain the observed (and admittedly incomplete) data, it's the best starting point as a hypothesis for future investigation.

koswinner

United Kingdom 27 Posts

August 17 2010 02:07 GMT

#89

On August 17 2010 10:53 GagnarTheUnruly wrote:
@koswinner

I have to admit that the fact that you personally attacked me in your post caused me not to read yours very carefully. I'll try not to repeat that mistake but I still don't know how I could accurately parameterize 'value of investment.' I suppose it would be interesting to see if people do better with one race the more they played with it to the exclusion of the other races, but this would only influence balance if it was more important in some races than others. I just don't think it's the most likely alternative explanation for the results I showed.

As far as omitting variables from a regression, obviously you reduce the predictive power of your model, but it doesn't really affect your ability to determine the importance of the effects that you can model. It's all sort of a moot point for the time being because I have no ability to get player-specific data to run a regression model.

In any case, the reason I did this was for fun. Hopefully people are enjoying this post, or at least having fun picking on me. Definitely people pointed out some things that I didn't think of, but I still think it's neat to think about the results I got and what they might imply about the state of balance of the game. And yes, I do think my results constitute 'evidence.' It's obvious you aren't convinced, but I'm glad others seem to have found some value in my little project.

@ negative feedback people: no worries, I'm not bothered. It's nice to get constructive feedback even if it's negative!

actually.. value of investment is the easy thing to get. I believe that if we do not consider those top pros who do not play ladder seriously, the rating points could just be a valid proxy for it.

Omitting more than one significant variable and relying on a single-factor model will probably cause your R-square to go to some pathetic value with a huge error term. In practice, we throw a model like this straight into the rubbish bin instead of trying to interpret its pathetic 'predictive power'. If you are relying on such a thing to support your claim, then don't call it 'scientific', because in no way is it. Your point is just not any better than that of anyone who argues it's imbalanced based on one of the many potentially significant variables.

TheMick

Great Britain 164 Posts

August 17 2010 02:16 GMT

#90

so the percentage difference is barely noticeable (1-2%), which could just mean there are slightly better players playing terran.
good work though, and nicely laid out.

Bitters

Canada 303 Posts

August 17 2010 02:17 GMT

#91

another interesting thing to look at might be how these stats change over time. we are still less than a month after release, and people are learning new tricks and how to abuse features of their race.

if terran is overpowered due to whatever reasons, we may see these trends increase over time as their players lean towards whatever unit or composition, etc. makes them imba.

what might be an interesting stat to test (if possible) would be the average diamond league points by race. divisional ranks would be good too, however with how divisions work (like new ones) ranks themselves aren't very meaningful. summing all diamond league points by race might give a better insight on how terran is performing within the league.
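A rough sketch of the "average diamond league points by race" idea, in Python. The player records below are invented placeholders, not data from any ranking site, and the variable names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (race, ladder points) records for diamond players; NOT real data
players = [
    ("Terran", 620), ("Terran", 580), ("Terran", 600),
    ("Protoss", 610), ("Protoss", 570),
    ("Zerg", 590), ("Zerg", 550), ("Zerg", 540),
]

# Group points by race, then average within each group
points_by_race = defaultdict(list)
for race, points in players:
    points_by_race[race].append(points)

average_points = {race: mean(pts) for race, pts in points_by_race.items()}
for race, avg in average_points.items():
    print(f"{race}: {avg:.1f} average points over {len(points_by_race[race])} players")
```

Averaging rather than summing keeps the comparison from being driven by how many people play each race, which matters here because zerg is reported in this thread as the least played race.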

mahnini

United States 6862 Posts

August 17 2010 02:18 GMT

#92

On August 17 2010 10:35 koswinner wrote:

testing for attachedness to race would bring about even more headache inducing factors such as style of play, mechanical requirements, and depth of understanding. we can go on and on about missing factors but we are able to make certain conclusions with the data we already have i think. anyway like i stated before the proportion of top 100 zerg players matches that of the proportion of zerg players in the general population (if someone wants to check my math and do statistical magic on it, that'd be great).
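One way to do the requested check is a one-sample proportion test: compare the zerg share among the top 100 with the zerg share in the general population. The Python sketch below uses a normal approximation and invented numbers (a 25% population share and 23 zerg in the top 100 are assumptions for illustration, not figures from this thread).

```python
import math


def proportion_z_test(successes, n, expected_share):
    """Two-sided z-test of an observed proportion against an expected share.

    Uses the normal approximation to the binomial, which is reasonable
    when n * share and n * (1 - share) are both comfortably above 5.
    """
    observed_share = successes / n
    standard_error = math.sqrt(expected_share * (1.0 - expected_share) / n)
    z = (observed_share - expected_share) / standard_error
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical inputs: 23 zerg among the top 100, 25% zerg in the population
z, p_value = proportion_z_test(successes=23, n=100, expected_share=0.25)
print(f"z = {z:.3f}, p = {p_value:.3f}")
```

With only 100 players the test has little power: a fairly large gap between the two shares would be needed before it showed up as significant, so a non-significant result is weak evidence of a match.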

a lot of what's going on is that we have SOME concrete data that weighs in favor of the races being balanced. it's not a 100% thorough scientific study but that doesn't mean you should turn a blind eye towards it. after all, the opposing argument is simply referencing anecdotal evidence of zvt being hard and pointing out that A, B, and C top zerg players say it's imbalanced.

RMmanlots

United States 95 Posts

August 17 2010 02:19 GMT

#93

I hate to say it, but this analysis misses a critically important point. Because of Blizzard's automatic matchmaking system, all players should win about the same % of games. If a crappy player plays an OP race, they should rise in the standings until they level out at about a 50% win rate.

Basically, win rate % is not a reliable way of determining if a race is overpowered. The problem isn't Terran's win rate %, it's the considerably lower level of skill needed to achieve that %.
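The argument above can be illustrated with a toy rating model in Python. Everything here is an assumption made for the sketch: the Elo-style win formula, the idea that the matchmaker always finds an opponent exactly at the player's current rating, the additive "race bonus", and the specific numbers. It is not a description of Blizzard's actual system.

```python
def win_probability(strength, opponent_rating, scale=400.0):
    """Elo-style chance that a player of given strength beats the opponent."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - strength) / scale))


def settle(skill, race_bonus, rounds=2000, k=32.0, start=1000.0):
    """Follow the expected rating of one player under idealized matchmaking.

    Assumption: every opponent's strength equals the player's current rating,
    and the rating moves by k * (expected result - 0.5) each round.
    """
    strength = skill + race_bonus
    rating = start
    for _ in range(rounds):
        rating += k * (win_probability(strength, rating) - 0.5)
    return rating, win_probability(strength, rating)


# Hypothetical players: one more skilled, one less skilled on a stronger race
rating_a, winrate_a = settle(skill=1500.0, race_bonus=0.0)
rating_b, winrate_b = settle(skill=1400.0, race_bonus=100.0)
print(f"A: rating {rating_a:.0f}, win rate {winrate_a:.3f}")
print(f"B: rating {rating_b:.0f}, win rate {winrate_b:.3f}")
```

In this toy model both players end up at a 50% win rate and at the same rating, even though B has less skill. The imbalance is visible only in the skill needed to hold a given rating, which is exactly the quantity the ladder win percentages do not measure.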

GagnarTheUnruly

United States 655 Posts

August 17 2010 02:20 GMT

#94

On August 17 2010 11:07 koswinner wrote:
actually.. value of investment is the easy thing to get. I believe that if we do not consider those top pros who do not play ladder seriously, the rating points could just be a valid proxy for it.

Omitting more than one significant variable and relying on a single-factor model will probably cause your R-square to go to some pathetic value with a huge error term. In practice, we throw a model like this straight into the rubbish bin instead of trying to interpret its pathetic 'predictive power'. If you are relying on such a thing to support your claim, then don't call it 'scientific', because in no way is it. Your point is just not any better than that of anyone who argues it's imbalanced based on one of the many potentially significant variables.

This kind of stats argument might be best for PM's because it's drifting from the OT, but I want to make it clear that I never claimed (or should have claimed) that my data had predictive power. That's not really sensible with a simple chi-square analysis.

Regarding regressions, excluding important effect variables can cause your r2 to go down, but adding unimportant effect variables artificially inflates r2 and reduces the accuracy of other parameters due to autocorrelations and spurious effects of the excess predictors. Adding predictors will always increase r2 but it's not always a good idea to add predictors. Often it's not the r2 that's important, but the parameter estimates. It's really unimportant because I don't have the data to do a regression and I probably never will.
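Both sides of this regression argument can be checked on a small synthetic example. The Python sketch below fits ordinary least squares by solving the normal equations in plain Python; the data are invented (y depends on two correlated predictors x1 and x2, and x3 is unrelated), and the helper name `ols` is hypothetical. It shows two separate things: r2 does not go down when a predictor is added, and leaving out a predictor that is correlated with an included one shifts the estimate for the included one.

```python
def ols(columns, y):
    """Least-squares fit with intercept; returns (coefficients, r_squared)."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(rows[0])
    # Augmented normal equations: (X^T X | X^T y)
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        pivot_row = max(range(c, k), key=lambda r: abs(aug[r][c]))
        aug[c], aug[pivot_row] = aug[pivot_row], aug[c]
        pivot = aug[c][c]
        aug[c] = [v / pivot for v in aug[c]]
        for r in range(k):
            if r != c:
                factor = aug[r][c]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[c])]
    beta = [row[k] for row in aug]
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in rows]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot


# Synthetic data (an assumption for illustration): true model y = 1 + 2*x1 + 3*x2 + noise
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0]   # correlated with x1
x3 = [1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0]   # unrelated to y
noise = [0.3, -0.2, 0.1, -0.4, 0.2, -0.1, 0.4, -0.3]
y = [1.0 + 2.0 * a + 3.0 * b + e for a, b, e in zip(x1, x2, noise)]

beta_small, r2_small = ols([x1], y)            # x2 omitted
beta_full, r2_full = ols([x1, x2], y)          # correct predictors
beta_extra, r2_extra = ols([x1, x2, x3], y)    # plus an irrelevant predictor

print(f"x1 slope with x2 omitted:  {beta_small[1]:.2f}, r2 = {r2_small:.4f}")
print(f"x1 slope with x2 included: {beta_full[1]:.2f}, r2 = {r2_full:.4f}")
print(f"r2 with irrelevant x3 added: {r2_extra:.4f}")
```

In this made-up data the x1 slope is pulled well away from its true value of 2 when x2 is left out, because x1 absorbs part of x2's effect. Whether anything similar applies to the ladder data depends on whether the omitted factors are correlated with race, which this sketch cannot tell us.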

Biochemist

United States 1008 Posts

August 17 2010 02:36 GMT

#95

I love how 80% of the posts in this thread are pointing out the exact same flaw in the study.

Sixes

Canada 1123 Posts

August 17 2010 02:37 GMT

#96

On August 17 2010 07:29 Toids wrote:
Ya.... you can't pull numbers from the ladder to explain balance. You need to get data from outside of the matchmaking system.

Having reached the same conclusion I was wondering if someone had done this.

Really there is only 1 sample I can think of, all the pro games in tournaments since release (as no balance changes have been made).

Taking every game (including rounds of 16 and up or so) should give a large sample size (though still likely biased by the skill of some individuals). Interesting stats would be the win percentages, mostly the matchup specific ones (say if Z was way better against P than T) as this might avoid some of the player specific error.

Anyone feel up to it?

mahnini

United States 6862 Posts

August 17 2010 02:41 GMT

#97

On August 17 2010 11:37 Sixes wrote:

wouldn't that be far more susceptible to variance? we'd be able to rack up a lot of data but they would still be from the same 10-20 players over time.

EZjijy

United States 1039 Posts

August 17 2010 02:47 GMT

#98

The game hasn't even been out for a month yet. I'm sure the data is still premature, although those graphs do not look too bad at all. I mean, how balanced can you expect? A straight 50% is never going to be possible. There is also the variable of better players playing more of a certain race.

TheGeo

United States 51 Posts

August 17 2010 02:52 GMT

#99

I do love the statistics behind this but you don't take enough account of other things that could affect the data. Things like how a lot of the good players will play the race they think is the most "OP", which is usually Terran. The not so good players will not care because they are not nearly as concerned about winning or losing. This skews the data to favor Terran in the high levels of play.

koswinner

United Kingdom 27 Posts

August 17 2010 03:02 GMT

#100

On August 17 2010 11:18 mahnini wrote:

On August 17 2010 10:35 koswinner wrote:

On August 17 2010 10:13 GagnarTheUnruly wrote:

On August 17 2010 09:26 petered wrote:
Distribution of race amongst leagues is sadly not valid as an indicator of racial balance. It makes the key assumption that players of different skill level are picking the different races at the same distribution.

This doesn't prove that there aren't matchup imbalances.

Terran could beat Protoss 60% of the time, Protoss could beat Zerg 60% of the time, and Zerg could beat Terran 60% of the time.

At the end of the day each race would have an approximately ~50% win ratio, as supported by your graphs and charts.

However, TvP, PvZ and ZvT would all be imbalanced. The imbalances would just cancel one another out in terms of overall win ratios.
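The cancellation described in this quote is easy to verify with a few lines of Python. The 60% figures come from the quoted post's hypothetical; the equal one-third race shares and the inclusion of mirror matches at 50% are extra assumptions made for the sketch.

```python
# Hypothetical matchup win rates from the quoted example: T>P, P>Z, Z>T at 60%
matchup_win_rate = {("T", "P"): 0.6, ("P", "Z"): 0.6, ("Z", "T"): 0.6}


def p_win(race, opponent):
    """Probability that `race` beats `opponent` in this toy model."""
    if race == opponent:
        return 0.5  # mirror matchups are even by definition
    if (race, opponent) in matchup_win_rate:
        return matchup_win_rate[(race, opponent)]
    return 1.0 - matchup_win_rate[(opponent, race)]


races = ["T", "P", "Z"]
# Assumption: opponents are drawn with equal probability from the three races
opponent_share = {race: 1.0 / 3.0 for race in races}

overall = {
    race: sum(opponent_share[opp] * p_win(race, opp) for opp in races)
    for race in races
}
for race, rate in overall.items():
    print(f"{race}: overall win rate {rate:.3f}")
```

Every race comes out at 50% overall even though each non-mirror matchup is 60/40, so per-race win percentages cannot reveal this kind of imbalance; matchup-specific win rates would be needed.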

I agree totally. It would be fun to do that but again I lack the data. If someone can get it for me, I'll do that analysis.

This is just bs. You are omitting various factors in your analysis. At lower levels, when players get crushed with a certain race, they tend to change race easily, i.e. a significant variable you have omitted from that diagram is attachedness to a certain race, which is obviously positively correlated with skill level. This is just because the amount of 'investment' in a certain race increases with skill level, and the player's utility is usually a function of 'value of investment', which is something like max{value of investment in T, value of investment in P, value of investment in Z}, with the ratio (value of investment / time or effort invested) being an effective indicator of ratio balance, assuming a representative agent who is trying to maximise his utility. To avoid or minimize this problem you should either gather some reliable information about the parameter of this variable or pick a sample which excludes it, i.e. pick the 'most attached' bracket: diamond, or even high-end diamond, pro leagues and tournaments.
Picking some result and trying to interpret it as solely caused by one factor when obviously there are other factors at work is an indication that either you are very biased, i.e. have a strong incentive to distort the result in a certain direction, or your level of skill in utilizing the 'scientific method' is just horrible.
So, this is not science, just some kid trying to prove his view in the name of science with the help of a pseudo/naive/broken scientific method.
Last edit: 2010-08-17 09:38:21

Attachedness actually isn't that difficult to measure if you use some indirect way of testing it. For example you could just test the correlation between the change in proportion of each race and the opinion polls about which race is considered most powerful/imba. Even if for SC2 there isn't enough of a sample yet, we could certainly use other similar games such as WC3 and SC1, which could probably be a valid proxy. The data were available but nobody really bothered to record them. A simpler indicator could be a poll asking whether players would consider, or are considering, changing race if their race is having problems. Of course these do not distinguish between the different attachedness of differently skilled players. If you want, just do the same survey for different groups.
Your argument is valid ONLY IF the proportion of players at each level represents the balance, i.e. ONLY IF other factors are not affecting players' pick of race and change of race, and assuming each race's population has homogeneous characteristics, i.e. they have similar ability and none of them has to struggle harder to reach their status than players of other races. Then, with a more detailed breakdown of brackets, such as top 20 or top 10 and tournament-oriented top pros, it would probably be a valid test. But obviously some of the assumptions are just tooo strong/unrealistic, as players do change away from weak races to stronger ones. Just look at WC3. This effect is much less significant among top players, though, who have already invested significantly in their particular race.

So if you see quite a number of, or even a significant proportion of, the very top players of one race (the one considered weakest) changing to other races (mainly the one commonly considered imba), while no top players from other races change their race, is this just an accident? Does it say anything? How about the win rates of these top players with their respective races in major tournaments? All of these, I believe, are much better / more reliable indicators than yours, as they require much simpler/more realistic assumptions and are obvious enough to overwhelm the rest.
