|
Balance Discussion Math (Best of N format analysis)
Please refrain from discussing balance itself here. Nowhere in this thread do I use the words: Terran/Zerg/Protoss. It is unnecessary to use these terms to prove my point, or disagree with my argument.
INTRODUCTION
Many of you probably opened this thread because of the word “Balance” in the title. Balance discussion threads have been rampant for years despite the fact that the Designated Balance Discussion Thread exists. (Do balance talk there, not here.) Those balance-related threads attract many viewers and posters, as well as whiners and defenders. This is not another “this race/unit is so OP and needs a nerf” thread. Instead, this thread attempts to show mathematically that using tournament results as a source for balance discussion can be misleading if they are not used correctly. I recently posted random stuff like Artosis Pylon Art or Underground Activities in Starcraft 2, but this thread is more relevant and not as entertaining. No funny images, only graphs and tables. Thoroughness is the only common theme among all of my posts. Basic high school level math might be required to fully understand or independently confirm my work. Let me know if there is any miscalculation.
Definition
OP = overpowered. UP = underpowered.
Best of 1 = 1 win is required to beat your opponent.
Best of 3 series = 2 wins are required to beat your opponent.
Best of 5 series = 3 wins are required to beat your opponent.
N = {1, 3, 5}. Could be higher odd numbers.
x = win rate of OP race. 0.5000 < x < 1.0000
1-x = win rate of UP race. 0.0000 < 1-x < 0.5000
f(x) = win rate of OP race in best of 1 as a function of x
g(x) = win rate of OP race in best of 3 as a function of x
h(x) = win rate of OP race in best of 5 as a function of x
Race A = OP race. It wins more than 50% vs Race B. This thread assumes an OP race exists. Statistical probability is the only attribute it has, not player skill/map/meta game etc.
Race B = UP race. It wins less than 50% vs Race A.
Race C = the other race. Basically ignored so that the 1 race vs 1 race calculation becomes clear.
OPlevel = x in percentage. 50% < OPlevel < 100%
Best of N win rate Calculation
Visual Representation of Best of 3
![[image loading]](http://i.imgur.com/JHQ2u.png)
![[image loading]](http://i.imgur.com/6spw1.png)
Visual Representation of Best of 5
![[image loading]](http://i.imgur.com/4HAmP.png)
![[image loading]](http://i.imgur.com/krIE9.png)
Best of 1 win rate is f(x) = x for obvious reasons. All 3 of these functions are defined for x between 50% and 100%.
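For readers who cannot load the images, g(x) and h(x) can be written out in closed form. This is a sketch derived from the definitions above by counting every way Race A reaches 2 (or 3) wins before Race B does; it is not a transcription of the tables, so treat it as an assumption about what they show.

```latex
\begin{align*}
g(x) &= x^2 + 2x^2(1-x) = 3x^2 - 2x^3 \\
h(x) &= x^3 + 3x^3(1-x) + 6x^3(1-x)^2 = 10x^3 - 15x^4 + 6x^5
\end{align*}
```

As a worked check with x = 0.55: f(x) = 0.5500, g(x) ≈ 0.5748 and h(x) ≈ 0.5931.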
Graph
Multiple Best of N series
This section assumes ideal tournaments. One Race A player plays against Race B players twice. In a separate setting, one Race B player plays against Race A players twice. For the Race A player, it is as if winning a 4 player tournament with an ABBB race distribution. For the Race B player in the separate setting, it is as if winning a 4 player tournament with a BAAA race distribution. Since there are 4 players involved, both have a 1/4 = 25.00% chance to win their separate tournaments if the 2 races are totally balanced at 50-50. Take a look at how OPlevel affects the win rate.
![[image loading]](http://i.imgur.com/GYM0v.png)
![[image loading]](http://i.imgur.com/VlIeH.png)
Extending the idea: ABBBBBBB & BAAAAAAA 8 player tournaments with a 1/8 = 12.50% standard chance to win.
![[image loading]](http://i.imgur.com/lk8hq.png)
![[image loading]](http://i.imgur.com/RufKn.png)
These tournaments don’t happen in reality due to mirror matches and the existence of Race C. Nevertheless, these graphs essentially show how much easier it is for the OP race to place high in a tournament compared to the UP race, even with a relatively small OPlevel like 52.50% or 55.00%. A 70% OPlevel is not required to “look like” 70% OP from tournament placements. Since almost all tournaments have multiple rounds of Best of 3 or 5, the tournament format itself emphasizes OP & UP states in this way.
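The ABBB and BAAA setups can be reproduced with a few lines of code. This is a minimal sketch, not the spreadsheet behind the images: it assumes every round is a Best of 3, that series are independent, and the helper names `bo3` and `title_chance` are made up for illustration.

```python
def bo3(x: float) -> float:
    """Chance to win a best of 3 given a single-game win rate x."""
    return x * x * (3 - 2 * x)


def title_chance(series_win_rate: float, rounds: int) -> float:
    """Chance to win `rounds` series in a row (single elimination)."""
    return series_win_rate ** rounds


# Assumption: 4 player tournament = 2 rounds, both played as Best of 3.
for op_level in (0.525, 0.55, 0.60):
    a = title_chance(bo3(op_level), rounds=2)      # lone Race A player in ABBB
    b = title_chance(bo3(1 - op_level), rounds=2)  # lone Race B player in BAAA
    print(f"OPlevel {op_level:.1%}: A {a:.2%}, B {b:.2%} (balanced: 25.00%)")
```

Setting `rounds=3` gives the 8 player version, where the balanced baseline is 12.50%.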
Conclusion
Whenever you see someone argue balance with statements like “There are 5 Race A players in the Round of 8” or “7 of the last 10 tournaments were won by Race A players,” you have to take these with a grain of salt. They are trying to impose the image that Race A is super OP with a 70-80% OPlevel, when in fact Race A is likely to have only a 55% or so OPlevel. Although one cannot deny the fact that Race A is OP, it is not as OP as people make it out to be from those results. Along with player skill and other factors, the tournament format plays a role here, as I presented above. Citing recent tournament results as a source for balance discussion can be misleading because OPlevel always gets exaggerated. You always have to examine individual map-by-map results instead of how many players of which race placed high in a tournament. The racial distribution section of a tournament's Liquipedia page is not a good source to start your balance discussion.
Sadly, we only spectate tournament matches. Therefore, our image of a race tends to be exaggerated by seeing Race A players more often because they advance further, partially thanks to a format that makes Race A look more OP than its actual OPlevel, unlike 1-match-only ladder games. This may be one of the reasons why the actual overall win rate of the UP race is not as bad as some think it is. Tournaments with multiple rounds of best of N series are destined to be biased in favor of the OP race. We have no choice but to accept it. Just know that it is a biased format.
Final Thought
Race A could be any of the 3 races. Which race is Race A is 100% irrelevant to my point. It could be one race today, but it could be another race 1 year from now. My point still stands no matter which race is widely considered Race A at that particular point in time. Note that I already acknowledged that Race A is OP. Nowhere in this thread did I deny that fact. “OP race is OP anyways” goes nowhere and is missing the point.
Some might have thought this was another cool picture thread from me, but it’s not. I don’t think this thread was interesting for the majority of readers, but I hope at least some found it intriguing.
|
This is a really interesting article. Somehow this always was in my mind, but I've never done the math on it and I never assumed that the impact would be that high.
Keep up that amazing work!
|
It's fascinating just how small the OPness can be to create a large skew in tournament results. It highlights just how volatile balance is even in a two race setup. It's also funny that having more games in a series actually increases the effect of OPness when I think most people's assumption would be that it decreases the effect.
|
I like the concept and I like your conclusion, because I often thought that "overall win rate" is not really relevant to balance. Like you, I have an advanced degree in science (with a lot of math in it), which helps me understand, but it's quite difficult to make it clear to everybody. I think it still needs some more explanation because all the graphs are quite confusing (especially the X best of N series win rates). Maybe some labels on the axes would have been appreciated.
Keep up the good work, I really like what you are doing.
|
I'm confused.
Win rate x on mapA is not the same as win rate x on mapB. It should be x'. And x'' on mapC.
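The per-map idea (x, x', x'') can be folded into the series math. Below is a minimal sketch under the assumption that games are independent and each game has its own win rate; the function name `series_win_chance` and the example rates are hypothetical, not data from any tournament.

```python
def series_win_chance(game_win_rates: list[float]) -> float:
    """Chance to win a best-of-N series where game_win_rates[i] is the
    favored side's win rate on the map used for game i (N = len, odd).

    All N games are treated as played out; that does not change who
    reaches the majority of wins first.
    """
    n = len(game_win_rates)
    # dist[w] = probability of holding exactly w wins after the games so far
    dist = [1.0] + [0.0] * n
    for p in game_win_rates:
        new = [0.0] * (n + 1)
        for w in range(n):
            new[w] += dist[w] * (1 - p)  # lose this game
            new[w + 1] += dist[w] * p    # win this game
        dist = new
    return sum(dist[n // 2 + 1:])        # majority of the N games won


print(series_win_chance([0.55, 0.55, 0.55]))  # same rate on every map
print(series_win_chance([0.60, 0.55, 0.50]))  # x, x', x'' differ per map
```

With these made-up numbers the two results are nearly identical (about 0.5748 vs 0.5750), which suggests that averaging over maps is a small correction as long as the per-map rates do not differ wildly.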
|
Overall winrate is averaged across maps and racial matchups. A BoX series keeps the racial matchup consistent but changes maps throughout the series, which is not factored into this analysis at all. Winrate is not an independent random variable, it's a random process.
|
please keep making articles, very interesting!^^
|
Very interesting read. I wish more people who post about balance whine would read and consider these sorts of facts before they get all emotional and nonsensical about "the states of things".
|
I just want to say you're turning into my favourite poster in this site. A great honour, I know, and I do not use those words lightly.
|
The approach can also be translated well into individual players:
It correlates well with how a longer series helps the "OP" one of the 2 competitors more, as a higher number of matches played lowers the volatility and allows the skill difference in % (aka OPness) to shine through.
Interestingly enough, having BoX setups in competitive gaming also helps create legends, it works so that the small difference between the top players can be magnified and allow the better ones (even if better by a relatively small margin) to have runs in tournaments.
|
EDIT: pretty much what the post above mine said :D
If you change "OPlevel" to "skill level" it pretty much sums up the whole reason to do larger BoX series (the better player is more likely to win the more games they play).
I do like the point that it takes only a small level of imbalance to strongly favor an OP race deeper in a tournament... Although, maybe that actually gives some credibility to "having more race A in the ro4 onward" actually suggesting there is a (small) imbalance? o.O Granted, with the wide range of skill between players as well as other factors (mind games, jet lag, lag, w/e) it's impossible to show anything from just a few tournaments
|
On August 21 2012 23:47 Juliette wrote: please keep making articles, very interesting!^^
I fully support this.
|
Dude this is awesome! well written. I don't know if you knew this but Blizzard considers a 45% to 55% "balanced" which means that if a race is balanced according to them it would simply dominate the tournaments, very interesting to think about
|
good work, really interesting read, although my math is somewhat lackluster since i haven't used it in forever (more or less)
also this should pretty much be the same if we say Player A is let's say 4% better than Player B, so it's 54%, and therefore a Bo5 ofc gives him an edge, which is ofc why high BoX series are standard. The issue ofc being how do you measure player skill :-)
|
Orek this is very well written and shows a strong understanding of the math underlying balance. I'm certain the balance guys at Blizzard take this into consideration.
An interesting additional point: "Race A" and "Race B" can easily be replaced with "Player A" and "Player B" regardless of race. Indeed this is how we assume the best players win the tournaments they do. However, I think given this the win rates of tournaments, even of individual games/maps, is deeply confounded with the players who are playing. Every Race A vs Race B matchup is also a Player X vs Player Y matchup. How do we decide it was a Race related discrepancy and not a Player related one to explain any win rate phenomena? When you have a vast pool of individual games pulled from tournaments you have not actually randomly selected matchups... you have a deeply confounded morass from which it is hard to draw any reasonable conclusion.
|
Interesting. Makes sense.
|
Hi. OP here.
As some pointed out, yes, the result can be applied to 2 players where one is favored. Best of 5 favors overall better player more than best of 3 does. Also, introducing losers bracket to increase the number of best of N favors overall better player as well. Although I wouldn't do it here, it is possible to calculate which of the two formats favors real good players: single elimination best of 5 OR double elimination best of 3. It must depend on the number of participants.
As I expected, math-related thread didn't explode in viewership like cool picture thread did. Nevertheless, thanks for the support and positive feedback. If I don't give up on the way, I have another math project coming up as well as picture thread.
|
I find this very interesting. Wouldn't have done it myself, lazy as I am ^_^
And by the way, it's not the only argument one could use to say "talking about OPness is irrelevant".
Indeed, players aren't robots (well... I know a few guys that could argue about Mvp :D), and aren't playing exactly the same games on the same maps against the same opponents, even for the most consistent of them.
And the fact that there are different skill levels, that sometimes a player can have a glimpse of brilliance, or the exact opposite, won't help in having the 3 races at the exact same winrates of 50% in all match-ups...
Great post ! Don't think in terms of viewership for your thread, quality is relevant, not quantity ^_^
|
A comment on the presentation of your graphs. It's generally poor form to not label the ordinate and abscissa.
As for the actual content, I'm glad you took the time to do this. It's of course expected, but it's nice to put your finger on exactly how much a BoX improves the odds of the better player advancing.
|
First of all, very very detailed work by OP, and marvelous job done. But I keep getting confused by all those colours and graphs. IMO drawing all those colourful tables will confuse people instead of clarifying. Why not use probability trees instead? IMO I think they are much better diagrams than tables. Just my 2 cents =D
Anyway, a simpler way to explain will be to use the Binomial Theorem.
Let P(A) = probability that OP race wins a best of n series, P(B) = probability that UP race wins a best of n series
p = probability that OP race wins a single game
q = 1 - p = probability that UP race wins a single game
n = total number of games in the series (odd; treat all n games as played out, which does not change who wins the series)
rp = number of games won by OP race
rq = number of games won by UP race
P(A) = sum of nCrp * (p)^rp * (q)^(n-rp) for rp = (n+1)/2, ..., n
P(B) = sum of nCrq * (q)^rq * (p)^(n-rq) for rq = (n+1)/2, ..., n
We'll notice that the gap P(A) - P(B) grows as n gets larger.
|
Good post, not sure why it's counter-intuitive though. BoX series increases the chances of the better player the more X is big, and OP'ness makes one of the player artificially better, so it's logical that BoX series amplifies the difference.
|
On August 21 2012 23:31 imallinson wrote: It's fascinating just how small the OPness can be to create a large skew in tournament results. It highlights just how volatile balance is even in a two race setup. It's also funny that having more games in a series actually increases the effect of OPness when I think most people's assumption would be that it decreases the effect.
No, it's pretty obvious that if you have a better chance to win it'll be amplified over a boX (where X is 3 or greater) rather than a single game. It's because of this that bo5/bo7s are preferred; the player who has a better shot at winning (usually through being a better player, sometimes through imbalance though) has a better shot of advancing as he's more likely to win games than his opponent.
|
Interesting stuff.
Logically, this also ought to apply to player win probability - to pick a couple players at random, Grubby is far more likely to manage to win one game against MVP than a Bo3 series, and more likely to squeak out a Bo3 than string 3 wins together to take a Bo5, etc.
But - this is completely tangential to your point - this suggests there's a point at which the entertainment value of a tournament is better served by shorter series (higher probability of upsets), while "accuracy" of results (the "best player" wins) demands longer series.
|
On August 22 2012 22:52 Setev wrote: First of all, very very detailed work by OP, and marvelous job done. But I keep getting confused by all those colours and graphs. IMO drawing all those colourful tables will confuse people instead of clarifying. Why not use probability trees instead? IMO I think they are much better diagrams than tables. Just my 2 cents =D
Anyway, a simpler way to explain will be to use the Binomial Theorem.
Let P(A) = probability that OP race wins a best of n series, P(B) = probability that UP race wins a best of n series
p = probability that OP race wins a single game
q = 1 - p = probability that UP race wins a single game
n = total number of games in the series (odd; treat all n games as played out, which does not change who wins the series)
rp = number of games won by OP race
rq = number of games won by UP race
P(A) = sum of nCrp * (p)^rp * (q)^(n-rp) for rp = (n+1)/2, ..., n
P(B) = sum of nCrq * (q)^rq * (p)^(n-rq) for rq = (n+1)/2, ..., n
We'll notice that the gap P(A) - P(B) grows as n gets larger.
You are absolutely right. Thanks for your input. It's just that what is easy to do on my notebook is not necessarily what is easy to upload on TL. I probably need to use scanner for my next math project to fully deliver the work. As you pointed out, yes, Binomial Theorem plays an important role here. From Best of 5 onward, it is so much easier to calculate win rate that way. General win rate of OP race is:
![[image loading]](http://i.imgur.com/NY3hn.png) Using this equation, it is possible to calculate something like a Best of 15 win rate. Knowing this, I still thought the Excel table images were more intuitive for relatively small series of Best of 5 or less.
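For anyone who wants to try larger N without a spreadsheet, here is a minimal sketch. The formula in the image is not reproduced here; the code assumes the standard binomial sum (win at least a majority of N independent games), and `best_of_n` is a made-up helper name.

```python
from math import comb


def best_of_n(x: float, n: int) -> float:
    """Series win rate in a best of n (n odd) given single-game win rate x."""
    if n < 1 or n % 2 == 0:
        raise ValueError("n must be a positive odd number")
    need = n // 2 + 1  # wins required to take the series
    # Probability of winning at least `need` of n independent games.
    return sum(comb(n, k) * x**k * (1 - x) ** (n - k) for k in range(need, n + 1))


for n in (1, 3, 5, 7, 15):
    print(f"Best of {n}: {best_of_n(0.55, n):.4f}")
```

For n = 1, 3 and 5 this agrees with f(x), g(x) and h(x); at x = 0.55 those are 0.5500, 0.5748 and 0.5931.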
|
Interesting from a math perspective. However, I don't see how this could add to the balance discussion as it is currently. Blizzard provides the community with win rates on ladder. TLPD brings win rates from tournaments. We get the actual win rates from there. Seeing as there's no mathematical way of separating player skill from racial imbalances, these calculations quickly become irrelevant to the debate, i.e. MKP would probably beat any mid-master player while off-racing in an open bracket MLG (and I saw Welmu (normally P but off-racing as T) 2-0 Grubby on ladder).
BoN series stats would be interesting overall, because I believe the mental state of each player in the series accounts for most win rates, not the balance of the game itself. At least not in this state where the game is currently. I'd like to discover if there's varying results in terms of win orders in comparison to race in a series. e.g. Zerg has a higher probability to come back from a 2-1 deficit in a Bo5 against Terran. Combine that with various game stats like time data, etc, for each game and I'd find that much more valuable to the discussion, more in depth stats.
I think most sports fans have accepted the idea that an extended BoN beyond Bo1 favors the "better" player(s). What you concluded here, at least to me, is that a player with a >50% game win rate will see his/her series win rate climb further as the number of games played increases. As a math fan it might seem intriguing; as part of a balance discussion it just confirms what most people already figured out without the exact math to support it.
You might have targeted the math people on TL though, so don't get me wrong here. You bring up a great point in all of this, though: people's perception of imbalance is mostly based on the wrong data.
|
Good thread! If only we could make it required reading...
|
I like your thoroughness. You should make more threads. Thank you.
|
good thread. I thought though this point was quite...obvious though. just me. but i appreciate the backing of it up.
|
You are only saying that an OP race has more chance to win if there's a higher BO (Best of).
For example: no Terran in the next TSL = underpowered. This thread tells me that if the TSL qualification had been a BO5 instead of a BO3, there would be even fewer Terran.
|
So if the OP race has a 55% chance of winning 1 match, it has around a 59% chance to win a Bo5. Not that big of a deal.
|
Sorry for bumping this thread, but I had missed it and your 8% gas thread made me read it :-)
As you said, a BOX is an amplifier of probability (that is, if player A has a chance p to win against player B, with p>0.5, then player A has a chance q to win against player B in a BOX, with q>p). This is what we want in a tournament: that the best player wins. Now of course the higher X is, the higher the amplification of probability will be, but more games will be played on average. One can ask whether there are better systems than a BOX in terms of amplification per number of games.
For instance, one could look at X More Wins (XMW), where a player wins if he has X more wins than the other. If the players are of equal strength, it will take X^2 games on average to designate a winner.
This blog post is pretty interesting in this regard http://www.madore.org/~david/weblog/2012-06.html#d.2012-06-02.2051 unfortunately it is in French, but you can look at the graph, which gives in order: BO1, BO3, BO5, 2MW, BO7, BO3 of BO3, BO9, 3MW
In particular, one can see that 2MW is a bit better than a BO5 as an amplifier of probability. Moreover, if the players are of equal strength, it will require 4 games on average against 33/8 for BO5 (and more generally it will always require fewer games on average). The drawback of course is that in a BO5 there is a maximum of 5 games while in a 2MW there is no such limit, but I find the idea interesting.
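The 2MW vs BO5 comparison can be checked numerically. This is a rough sketch that assumes independent games with a fixed win chance p; the XMW formulas are the standard gambler's ruin results for a walk that starts level and stops at +X or -X, and all function names are made up for illustration.

```python
from math import comb


def xmw_win_chance(p: float, x: int) -> float:
    """Chance that the player with game win rate p gets x wins ahead first."""
    q = 1 - p
    return p**x / (p**x + q**x)


def xmw_expected_games(p: float, x: int) -> float:
    """Average number of games until someone is x wins ahead."""
    q = 1 - p
    if abs(p - q) < 1e-9:
        return float(x * x)  # equal strength: X^2 games
    return x * (2 * xmw_win_chance(p, x) - 1) / (p - q)


def bo_win_chance(p: float, n: int) -> float:
    """Chance to win a best of n (n odd)."""
    need = n // 2 + 1
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(need, n + 1))


def bo_expected_games(p: float, n: int) -> float:
    """Average number of games actually played in a best of n (n odd)."""
    q = 1 - p
    need = n // 2 + 1
    total = 0.0
    for length in range(need, n + 1):
        # The series winner takes the last game plus need-1 of the earlier ones.
        ways = comb(length - 1, need - 1)
        total += length * ways * (p**need * q ** (length - need)
                                  + q**need * p ** (length - need))
    return total


p = 0.55
print(f"2MW: win {xmw_win_chance(p, 2):.4f}, games {xmw_expected_games(p, 2):.3f}")
print(f"BO5: win {bo_win_chance(p, 5):.4f}, games {bo_expected_games(p, 5):.3f}")
```

At p = 0.5 this reproduces the 4 games vs 33/8 games figures quoted above, and at p = 0.55 the 2MW win chance (about 0.599) edges out the BO5 one (about 0.593).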
|
|
|
|