How Z>>P cause T dominance (math model)

Cascade

Australia5405 Posts

November 03 2007 02:02 GMT

Let us go back about one year. Our now beautiful strategy forum was polluted beyond all legal restriction with Z>>>>P threads, and we were happy to see more than a handful of protoss in the top 30 KESPA ranking. Zerg was not dominating though. The top 30 KESPA had as many terrans as zergs, if not more terrans. This seemed a bit strange to me at first: if zerg is the privileged race when it comes to imbalance, how come there are not more zerg players at the top? Having done my fair share of mathematics in my studies, I began to create a model of races in progaming.

I have now developed a mathematical model that explains how a Z>>P imbalance, together with smaller T>Z, P>T imbalances, causes a TERRAN dominance.

I will explain it here, for those who are interested, in three steps of increasing mathematical difficulty. If you find it too hard, just skip ahead to the conclusions at the bottom; they are written so that you can understand them without having read everything else.

[I realise that this is a loooong post, but I feel that I have found, and proved to some extent, something that can increase the understanding of progaming.]

Basic principle
The first thing is to assume that ZvP is imbalanced in zerg's favour by quite a bit, while TvZ favours T slightly and PvT favours P slightly. Whether this is true or not is NOT the topic of this thread. Instead, assume that it is the case and see what happens. Later I will leave the balance of the three matchups free to choose, see conclusions.

The principle is that zerg makes its prey go extinct. As the zergs kick all protosses out of progaming, they will no longer have any easy wins. Instead, there will be loads of zergs which the terrans can abuse due to the slight imbalance in terran's favour in the TvZ matchup. Also, there will be very few protosses left to keep the terran numbers down. So as the zergs finish off all the protosses, they will find themselves owned by the terrans, who get free rein in the absence of protoss.

Easy, right? Let us now make a first model just to formalise it.

First Toy Model

This model is not very realistic, but it will show the effect introduced above. The model builds on these assumptions:

1) There is a large pool of potential progamers of roughly equal skill.

2) To become a progamer you need to have an overall win probability of 50% or more, averaged over all other progamers.

3) The matchups are imbalanced like this:
Z beats P in 70% of the games.
P beats T in 60% of the games.
T beats Z in 60% of the games.

These assumptions are not very accurate, but they are enough to show the point.

Call the fraction of terran progamers t, the fraction of zerg progamers z, and the fraction of protoss progamers p. z, t and p are numbers between 0 and 100%, and they need to add up to 100% to account for all progamers.

z + t + p = 100%

To have a stable situation, each race must win with 50% probability. Otherwise one of the races will be on the losing end with a win rate below 50%, and will be kicked out according to 2). You can't have one race winning more than 50% without having another losing more than 50%.

Let us now try to calculate z, p and t. For that we will look at one race at a time.

Terran will of course win 50% against other terrans and 60% against Z, but only 40% against P. For that to add up to a total of 50% we need to have as many protoss as zerg progamers. If there are more protoss, then terran will win less than 50%, and if there are more zerg, then terran will win more than 50%. So

z=p

Now look at protoss. They win 50% against other P and 60% against terran, but only 30% against zerg. For those to add up to a total of 50% we need twice as many terrans as zergs, to compensate for the heavy ZvP imbalance. So here we get

t=2z

Now look at zerg. They win 50% against other Z, 70% against protoss and 40% against terran. To add up to 50% we need twice as many terrans as protosses:

t=2p

Solving these three equations (four equations with z+t+p=1) we now find:

z = 25%
t = 50%
p = 25%

This means that 50% of the progamers will be terran players! Of the rest, 25% will play toss and 25% zerg. So terran dominates completely in this scenario! There are not even more zergs than protosses! This clearly demonstrates the principle that zerg suffers from exterminating their "food", protoss. It even overestimates it. Zerg doesn't do THAT badly in the real starleagues.
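The three balance conditions above can be checked with a few lines of Python. This is a minimal sketch, assuming the toy model exactly as stated (mirror matchups are 50/50 and every race must average 50%); the function names are made up for illustration. Solving the three conditions in general gives each race a share proportional to the edge in the matchup it does not play, so terran's share grows with the ZvP edge.

```python
def toy_fractions(zvp, pvt, tvz):
    """Stable race fractions (z, t, p) in the toy model.

    zvp, pvt, tvz are the win rates of the favoured race (all above 0.5).
    """
    edge_zvp, edge_pvt, edge_tvz = zvp - 0.5, pvt - 0.5, tvz - 0.5
    if min(edge_zvp, edge_pvt, edge_tvz) <= 0:
        raise ValueError("the closed form needs the cycle Z>P, P>T, T>Z")
    total = edge_zvp + edge_pvt + edge_tvz
    # Each race's share is proportional to the edge of the matchup it is not in.
    return edge_pvt / total, edge_zvp / total, edge_tvz / total


def average_win_rates(z, t, p, zvp, pvt, tvz):
    """Overall win rate of each race against the whole field (mirrors are 50%)."""
    w_z = 0.5 * z + zvp * p + (1 - tvz) * t
    w_t = 0.5 * t + tvz * z + (1 - pvt) * p
    w_p = 0.5 * p + pvt * t + (1 - zvp) * z
    return w_z, w_t, w_p


z, t, p = toy_fractions(zvp=0.70, pvt=0.60, tvz=0.60)
print(f"Z {z:.0%}  T {t:.0%}  P {p:.0%}")
print(average_win_rates(z, t, p, 0.70, 0.60, 0.60))
```

With the 70/60/60 numbers this reproduces the 25% / 50% / 25% split, and all three average win rates come out at 50% up to rounding.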

This concludes the toy model. Let us now move on to the real model and see how it does.

Cascade's progaming model
I will here move a bit faster and not explain the details I find too trivial. If you do not follow, ask in a reply, or skip ahead to the conclusions.

The model is based on these assumptions:

1) Progamers have different skill levels s. The number of progamers at a given skill s is proportional to

e^(-a s)

That is, the number of players falls off exponentially as skill increases. "a" is a parameter that decides how quickly the number of gamers falls. A large "a" means that the very best players in the world are not THAT much better than the ones ranked around 100. A small "a" means that the top players completely own lower ranked players, even if they are not very much lower ranked.

This distribution can be discussed. Other ideas are welcome.

2) The probability of a player of skill s1 to beat a player of skill s2 is

1/( 1 + e^(s2-s1) )

[Figure: the winning probability as a function of skill difference]

This probability has the properties one would expect from a winning probability (such as the probability of the other player winning being 1 minus your winning probability). I do not need to add a parameter in the exponential, as it would essentially be the same parameter as "a" above.
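Assumptions 1) and 2) can be tried out together in Python. This is a rough sketch, not the program used for the results in this post: the function names are hypothetical, and the rank gap uses the standard result for exponentially distributed samples that the k-th and (k+1)-th best are on average 1/(a*k) apart. It treats the gap as fixed at its expected value, so the printed percentages are only a guide.

```python
import math


def win_probability(s1, s2):
    """Probability that a player of skill s1 beats a player of skill s2."""
    return 1.0 / (1.0 + math.exp(s2 - s1))


def expected_skill_gap(a, better_rank, worse_rank):
    """Expected skill difference between two ranks (rank 1 = best)
    when skills are exponentially distributed with rate a."""
    return sum(1.0 / (a * k) for k in range(better_rank, worse_rank))


# The two possible outcomes of a game always add up to 1.
assert abs(win_probability(4, 5) + win_probability(5, 4) - 1.0) < 1e-12

for a in (1.0, 4.0, 16.0):
    gap = expected_skill_gap(a, 5, 50)
    chance = win_probability(gap, 0.0)
    print(f"a = {a:4.1f}: rank 5 beats rank 50 about {chance:.0%} of the time")
```

The larger "a" is, the closer the rank 5 versus rank 50 result gets to a coin flip, which is the behaviour described above.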

3) The worst progamer of each race must have the same winning probability averaged over all other progamers.

This is very intuitive. To barely make it as a progamer you need to win at least maybe 40% of your games. Any less, and you will be kicked out. I am not saying where that limit is, but the limit is the same for all races. If a terran gets kicked out at 39%, then a zerg will too.

4) Racial imbalance is represented by giving the favoured race a skill bonus.

So for example, if a zerg player of skill 4 plays a protoss of skill 5, the zerg will get a bonus, for example 3, and the probability of the zerg winning will be calculated in 2) as if the zerg had a skill level of 4 + 3 = 7. I will not fix here how imbalanced the matchups are, but leave that for later. Also, this is a fairly reasonable assumption in my opinion.
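The skill bonus is easy to play with in code. A small sketch, assuming the formula from 2) with the bonus simply added to the favoured player's skill; the helper names are invented for this example. The second function converts a win rate between two equally skilled players into a bonus, which is not the same thing as the average matchup win rate over the whole field.

```python
import math


def win_probability_with_bonus(skill, opponent_skill, bonus=0.0):
    """Assumption 2) with the favoured player's skill raised by `bonus`."""
    return 1.0 / (1.0 + math.exp(opponent_skill - (skill + bonus)))


def bonus_for_even_skill_win_rate(win_rate):
    """Bonus that gives `win_rate` between two equally skilled players."""
    return math.log(win_rate / (1.0 - win_rate))


# The example from the text: zerg of skill 4 with bonus 3 against protoss of skill 5.
print(round(win_probability_with_bonus(4, 5, bonus=3), 3))  # 0.881

# A 70% edge between equally skilled players needs a bonus of roughly 0.85.
print(round(bonus_for_even_skill_win_rate(0.70), 2))  # 0.85
```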

Solving the model
To find the race fractions z,t,p we need to do the following:

1) Set up an equation describing the average winning probability of the worst player of each race. Call these Wz, Wt and Wp. Each of these will be a function of z, t and p.

2) Solve Wz(z,t,p)=Wt(z,t,p)=Wp(z,t,p) for z, t and p with the help of z + t + p = 1.

I have done this.

With help from the brilliant program Mathematica, but still. To give a hint of how complicated it was, and for scientific credibility, I'll show you my code.
[Spoiler: Mathematica code]

Note that you can see the expression for Wp. It is a three-line function involving hypergeometric functions and polygammas. I don't even know what a polygamma is, but Mathematica handled them.

You can also see that I solved the equations numerically.

To make it more accessible for non-mathematicians I have reformulated the skill bonuses and "a" as parameters that are more intuitive, like average ZvP winning %.
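Step 1) of "Solving the model" can also be approximated by brute-force numerical integration instead of closed-form hypergeometric functions. The sketch below is one possible reading of assumptions 1) to 4), not the Mathematica code: it assumes every race has a cutoff skill (its weakest progamer), that opponents' skills sit an exponentially distributed amount above their race's cutoff, and that bonuses are passed in per ordered pair of races. All names are hypothetical.

```python
import math

RACES = ("Z", "T", "P")


def sigmoid(x):
    """Numerically stable 1 / (1 + e^(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def worst_player_win_rate(race, cutoffs, fractions, bonus, a, steps=4000):
    """Average win rate of the weakest player of `race` against the field.

    cutoffs[r]    skill of the weakest progamer of race r (assumed)
    fractions[r]  share of progamers playing race r
    bonus[(r, q)] skill bonus race r gets against race q (0 if missing)
    """
    x_max = 40.0 / a          # the exponential tail beyond this is negligible
    dx = x_max / steps
    total = 0.0
    for opp in RACES:
        edge = cutoffs[race] + bonus.get((race, opp), 0.0) - cutoffs[opp]
        integral = 0.0
        for i in range(steps):  # midpoint rule over the opponent's extra skill x
            x = (i + 0.5) * dx
            integral += a * math.exp(-a * x) * sigmoid(edge - x) * dx
        total += fractions[opp] * integral
    return total


even = {r: 0.0 for r in RACES}
thirds = {r: 1.0 / 3.0 for r in RACES}
print(worst_player_win_rate("Z", even, thirds, bonus={}, a=1.0))
```

With no bonuses, equal cutoffs and a = 1, the result should be close to 1 - ln 2 (about 0.307), which is a handy sanity check. Step 2) would then adjust the cutoffs and fractions with a root finder until the three values Wz, Wt and Wp agree; that part is left out here.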

Conclusions, go here if you do not understand!!
I have now found a way of calculating how many zerg, terran and protoss progamers there are. To do this I use how imbalanced the three matchups are, and how big the skill difference is between top and bottom progamers.

That is:

I have a program on my computer that I've written. You give the program 4 numbers:
1) Average probability to win a ZvP
2) Average probability to win a TvZ
3) Average probability to win a PvT
4) The probability that the 5th best player in the world beats the 50th best player in the world.

With these four numbers, the program will calculate what percentage of all progamers will play zerg, terran and protoss.

As an example I will plug in typical values for about a year ago. At least this is what I guess are typical values. The beauty of this is that I can try any values I want.

Anyways, using this estimate of imbalance and skill:
Z wins 70% of ZvP.
T wins 60% of TvZ.
P wins 60% of PvT.
The 5th best player beats the 50th best player with 65% probability.
Then the program calculates that:
35.8% of all progamers will play Zerg.
42.2% of all progamers will play Terran.
22.0% of all progamers will play Protoss.

As these values are close to the reality a year ago, it hints that my model is working! I tried a few other imbalances and skill-differences with the following result:

Note the second last row: Even when ZvP is just SLIGHTLY more imbalanced than the other two matchups, protoss will be suppressed, and terran will dominate.

I invite all of you to give me sets of these four parameters, and I will calculate z, p and t for your parameters. Your dream can come true! You can finally get an answer to that "what if".

What if ZvP was balanced? What if terran would roll over toss as well? What if the top ten would dominate everyone else with 80-90% statistics?

These results could also be of interest for Korean mapmakers. I imagine they are interested in having equally many of each race (who wants to watch mirror matchups?). If they need to get rid of the terran dominance, maybe they should make sure to balance the maps in ZvP primarily? Not what they would have guessed.

Thanks for reading.
Lollipops for those that read everything!

Aurious

Canada1772 Posts

November 03 2007 02:14 GMT

I read through only a fraction of this and was enjoying myself. Will finish later. Nice analysis.

OctoPuSs

Canada5279 Posts

November 03 2007 02:17 GMT

Very interesting work man. Re-reading it to make sure I understand your post in its entirety.

Myrmidon

United States9452 Posts

November 03 2007 02:19 GMT

Pretty interesting model. I read everything except for some of the Mathematica code. T_T (school has Maple licensed instead lawl, not like I know that level of statistics anyway)

I'm just not quite sure that the imbalance was on the order of 70-30. The last row in the chart is probably more like it was.

Pressure

7326 Posts

November 03 2007 02:22 GMT

Dang
really nice work about the matchup balances
i got so fing confused around the mathematica part lol D:

The Storyteller

Singapore2486 Posts

November 03 2007 02:22 GMT

Wow, this is great. I didn't understand the more complex mathematics, of course, but the mere fact that maths could be used to calculate something like this blows me away. I have never been fond of maths, and never saw its applications in real life. Thanks to people like you, I'm beginning to change my opinions.

Wizard

Poland5055 Posts

November 03 2007 02:36 GMT

This is quite interesting, good work.

kamehameha

Ukraine152 Posts

November 03 2007 02:38 GMT

wow very nice work, i was kinda confused on some points on the first read but im sure ill get it on the second read

this model of the terran dominance only applies to tournament games tho right?

CustomXSpunjah

United States1093 Posts

November 03 2007 02:44 GMT

lol i dunno dude....u cant generalize that pvz is imba z for that much (70% chance of winning) off my own experience i beat zergs more than i beat terrans

crazie-penguin

United States1253 Posts

November 03 2007 02:52 GMT

#10

On November 03 2007 11:44 CustomXSpunjah wrote:
lol i dunno dude....u cant generalize that pvz is imba z for that much (70% chance of winning) off my own experience i beat zergs more than i beat terrans

Reread the first few paragraphs and the last paragraph. This thing is not based on 100% true facts. He is making an assumption purely for the basis of his math. He is not saying it IS the case.

kamehameha

Ukraine152 Posts

November 03 2007 02:58 GMT

#11

but his calculations are off progamer(which tend to have closer skill levels from one another) matchs not public games.. too much skill variation so it becomes unpredictable

Wasabi

United States3085 Posts

November 03 2007 02:59 GMT

#12

--- Nuked ---

Tadzio

3340 Posts

November 03 2007 03:28 GMT

#13

I think you're right, but I suspect the best way to use this would be to invert what you're doing. Create a formula where you can plug the number of T, Z and P progamers (say the top 30 or 50 progamers) in as variables and see if it accurately predicts the winning percentage of each race in each matchup. This way you can determine racial imbalance without making any generalized assumptions. If you do it this way you can determine, through observance of variance from the mean, to what extent map imbalance and personal skill affects the likely outcomes of games.

Polemarch

Canada1564 Posts

November 03 2007 04:08 GMT

#14

very interesting, nicely done!

it might be fun to add another layer into your model for per-map skill bonuses. say add another layer where the maps would be selected from a discrete uniform distribution. then each map has its own values for skill imbalances.

then you could answer questions that might be of even MORE interest to tournament organizers and mapmakers. e.g. what is the expected effect on racial distribution of having a mercury (z >>> p) or a paradoxxx (p >>> z) in the map pool? or given 4 maps in my map pool, what remaining 5th map should i pick to maximize racial balance?

edit: one more comment, for picking your 'a' parameter, maybe you should look into fitting it to data of the winning percentages of some players (say from TLPD's ELO top players). that would be more convincing, since if you can pick 'a' arbitrarily, it seems relatively easy for you to pick a value that matches last year's player distributions.

Cascade

Australia5405 Posts

November 03 2007 04:19 GMT

#15

Thanks for the feedback everyone.

On November 03 2007 11:19 Myrmidon wrote:
Pretty interesting model. I read everything except for some of the Mathematica code. T_T (school has Maple licensed instead lawl, not like I know that level of statistics anyway)

I'm just not quite sure that the imbalance was on the order of 70-30. The last row in the chart is probably more like it was.

70 is probably too much, as you both say. So give me better suggestions! I'm no sc expert, so I will need help to pick realistic values for the 4 percentages.
Also, this model uses a global average for imbalances, so one single player's stats don't really matter a lot. But if you want, it is possible to make P>Z by setting the ZvP parameter below 50%.

On November 03 2007 11:38 kamehameha wrote:
wow very nice work, i was kinda confused on some points on the first read but im sure ill get it on the second read

this model of the terran dominance only applies to tournament games tho right?

I made the model to fit the entire progaming scene. You "lose" when your team kicks you from the team because you don't deliver. But I think it could also be applied to other groups.

On November 03 2007 11:58 kamehameha wrote:
but his calculations are off progamer(which tend to have closer skill levels from one another) matchs not public games.. too much skill variation so it becomes unpredictable

Well, good thing that you can change the closeness as a parameter then.

For progamers it will in general be fairly close even when top 5 plays top 50. So set the parameter to 65% win or whatever you think is appropriate.
For the case of a bunch of random people on bnet the difference between top 5 and top 50 will probably be much bigger, so set the winning probability to like 80% instead. The model covers both cases.

On November 03 2007 12:28 Tadzio00 wrote:
I think you're right, but I suspect the best way to use this would be to invert what you're doing. Create a formula where you can plug the number of T, Z and P progamers (say the top 30 or 50 progamers) in as variables and see if it accurately predicts the winning percentage of each race in each matchup. This way you can determine racial imbalance without making any generalized assumptions. If you do it this way you can determine, through observance of variance from the mean, to what extent map imbalance and personal skill affects the likely outcomes of games.

It is a good thought, but it is not possible. The imbalances are 3 parameters, while the race fractions have only two free parameters: if you know what % of the players are zerg and terran, then you also know what % are protoss, since the rest must be protoss. The equation

z+t+p=1

removes one degree of freedom. That means that the mapping (my program) from the 3 imbalances to the race fractions does not have an inverse. Many different imbalances will correspond to the same race fractions, so it is impossible to go in the other direction.

Let me also stress that this is not a "flaw" in my model, but a general property of broodwar. Had there been only two races, it would have been possible (one free parameter on each side), while with 3 or more races it is not possible: there are n*(n-1)/2 matchups to balance, and only n-1 free race fractions.
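The first toy model already shows why the mapping cannot be inverted. In that simpler model (not the full one), solving the three balance conditions gives each race a share proportional to the edge in the matchup it does not play, so only the ratios of the edges matter. A short sketch with a made-up helper name:

```python
def toy_fractions(zvp, pvt, tvz):
    """Race fractions (z, t, p) at which every race averages 50% in the toy model."""
    edge_zvp, edge_pvt, edge_tvz = zvp - 0.5, pvt - 0.5, tvz - 0.5
    total = edge_zvp + edge_pvt + edge_tvz
    return edge_pvt / total, edge_zvp / total, edge_tvz / total


strong = toy_fractions(zvp=0.70, pvt=0.60, tvz=0.60)
mild = toy_fractions(zvp=0.60, pvt=0.55, tvz=0.55)
print(strong)
print(mild)
```

Both sets of imbalances give the same 25% / 50% / 25% split (up to rounding), so seeing those race fractions alone cannot tell you which set of imbalances produced them.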

On November 03 2007 13:08 Polemarch wrote:
very interesting, nicely done!

Maps are averaged over in this model. Adding a z>>>p map would push the ZvP percentage a bit higher. All effects such as dodging certain maps are neglected. ZvP is the percentage that a Z will beat a P over many games on different maps in different situations. So mapmakers can easily control the three imbalances by creating maps favouring the different matchups.

For the ELO thing: It is probably a good idea. I guess it would take some time since i need to check the ELO rating of every guy they played, and preferably their rating at the time of the game. :/ To see the difference of 60% or 65% you would need about 50 samples of games of players around rank 5 against players of rank around 50. Preferably different players in different matchups. Maybe you feel like contributing with that?

I'm off to sleep now. Will run any proposed percentages tomorrow.

gravity

Australia1988 Posts

November 03 2007 05:54 GMT

#16

On November 03 2007 13:19 Cascade wrote:
For the ELO thing: It is probably a good idea. I guess it would take some time since i need to check the ELO rating of every guy they played, and preferably their rating at the time of the game. :/ To see the difference of 60% or 65% you would need about 50 samples of games of players around rank 5 against players of rank around 50. Preferably different players in different matchups. Maybe you feel like contributing with that?

Why not just calculate the chance mathematically? That's the whole point of ELO ratings, after all.
For example, between the current #5 (Anytime) vs. the current #50 (Yellow) there is a difference of 172 points which gives a 73% winning probability for the #5 player. (The exact players/races don't matter of course).
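The 73% figure can be reproduced with the standard Elo expected-score formula. A minimal sketch, assuming TLPD's ratings use the usual 400-point logistic scale; the second function, which translates a rating gap into the skill units of the model's 1/(1 + e^(s2-s1)) formula, is an illustration and not something from the thread.

```python
import math


def elo_expected_score(rating_diff):
    """Expected score of the higher-rated player under the standard Elo formula."""
    return 1.0 / (1.0 + 10 ** (-rating_diff / 400.0))


def elo_diff_to_model_skill(rating_diff):
    """Skill difference that gives the same win probability in the model."""
    return rating_diff * math.log(10) / 400.0


print(f"{elo_expected_score(172):.0%}")           # 73%
print(round(elo_diff_to_model_skill(172), 2))     # 0.99
```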

gravity

Australia1988 Posts

November 03 2007 06:30 GMT

#17

Also, historically there have been 112 Protoss pros, 131 Terran pros, and 160 Zerg pros (numbers from TLPD). This is influenced by the early trend of Koreans playing Z due to the initial strong imbalance (pre-patch) in Z's favour and continuing to do so for a while afterwards, but it gives %s of about 28% P, 33% T, 40% Z, which from looking at your table seems to be a better match for a 60/40 (or less) all-time ZvP imbalance than a 70/30 one. That's a better match for the actual known results as well.

It would be interesting to have the proportions of races for players who played their first pro game from 2002 or so onwards, too.

Polemarch

Canada1564 Posts

November 03 2007 07:37 GMT

#18

yeah i was suggesting that you look into extending the per-map stuff to your model. only if you're interested in extending it, of course, but i think that could make it much more useful.

e.g. a good player would probably rather have one grossly imbalanced map out of 5 that they just throw away; rather than 5 moderately imbalanced maps. your extended model could quantify that sort of stuff in terms of overall racial balance for map pools.

for picking the 'a' parameter... i wouldn't put in THAT much work into it. here's a relatively easy way that's still based on the data. for a given value of 'a', draw the top 50 skill values out of a sample of say 10000, assign them random races, and treat those as your progamers. then have them randomly compete against each other. check the entropy of the winning percentages and see how it compares to the entropy of the winning percentages of the top 50 players in TLPD. (repeat the experiment a few times to get a better sample). adjust 'a' until they're about equal. you could probably analyze it mathematically, but a simple simulation like that might be easier.
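The simulation described above could look roughly like this in Python. It is only a sketch under stated simplifications: races and skill bonuses are left out, the spread of win rates is measured with a plain standard deviation rather than an entropy, and all names and default values are made up.

```python
import math
import random
import statistics


def beats(skill_a, skill_b):
    """Probability that skill_a beats skill_b, written to avoid overflow."""
    diff = skill_a - skill_b
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    e = math.exp(diff)
    return e / (1.0 + e)


def win_rate_spread(a, pool=10000, top=50, games=2000, seed=0):
    """Standard deviation of win rates among the `top` players of a random pool."""
    rng = random.Random(seed)
    # Draw exponentially distributed skills and keep only the best `top`.
    skills = sorted((rng.expovariate(a) for _ in range(pool)), reverse=True)[:top]
    wins = [0] * top
    played = [0] * top
    for _ in range(games):
        i, j = rng.sample(range(top), 2)   # two different players
        winner = i if rng.random() < beats(skills[i], skills[j]) else j
        wins[winner] += 1
        played[i] += 1
        played[j] += 1
    rates = [w / n for w, n in zip(wins, played) if n > 0]
    return statistics.pstdev(rates)


for a in (0.5, 1.0, 2.0, 4.0):
    print(a, round(win_rate_spread(a), 3))
```

To fit 'a', one would compare these numbers with the same statistic computed from the real win rates of the top 50 players and pick the 'a' that matches best, repeating with several seeds to average out the noise.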

Mynock

4492 Posts

November 03 2007 08:07 GMT

#19

Wow, this is an amazing article, thank you very much for that hard work! I think the model could accept some fine-tuning, but the general principle looks fine.

Now let's make this featured!

InfeSteD

United States4658 Posts

November 03 2007 08:13 GMT

#20

Really interesting, but I'll be honest, I couldn't understand the math part. I stopped math a long time ago, but anyways, the intro and conclusions and some parts explained by words rather than math made a lot of sense based on the assumptions. Get yourself hooked up with someone (or a progaming database) that has all the accurate percentages and you can further this for sure.

Nice writing, keep it up
