SKPlanet Proleague R1: Map Balance and Upsets

VGhost

United States 3620 Posts

January 01 2012 21:49 GMT

I have been trying out a system for looking at map balance. These are the results after one round of play. There is not enough data yet to drown the "noise" of cheese, random screwups, and matchup coincidences, so consider this preliminary.

My method focuses on players' records in a given matchup in recent games. I've attempted to limit it to games played in regular leagues in the past year, but made some allowances in order to get a reasonable number of games (≥ 10 if at all possible; not that 10 is a magic number, but I had to pick one). This is mostly a problem with rookies & second-year players; also EffOrt.

Based on that information I predict a win percentage between the two given players and assign a score by comparing that percentage to the actual result. As games build up, this will generate two scores for the map: an overall result score (RES), and a second indicator showing how the actual score compares to a hypothetical score if the results were the same but every game was a 50-50 matchup. I've tentatively assigned this second number the "balance" (BAL) designator, but in fact so far I am not sure which demonstrates actual balance best, if either. The combination gives a decent idea, I think.
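
The exact scoring formula is not spelled out here, so the following is only a minimal sketch under a loud assumption: that each game is scored by squared error, which makes a win in a 50-50 game worth 0.25 (the value the published map figures appear to imply). The names `game_score` and `map_scores` and the sample probabilities are hypothetical.

```python
def game_score(p_first, first_won):
    """Signed score for one game, from the first-named race's point of view.

    ASSUMPTION: squared-error scoring. A win the prediction gave probability p
    is worth (1 - p)**2, so upsets count for more than expected wins.
    """
    if first_won:
        return (1.0 - p_first) ** 2
    # The second race won with predicted probability (1 - p_first).
    return -(p_first ** 2)


def map_scores(games):
    """games: list of (p_first, first_won) tuples. Returns (RES, BAL)."""
    res = sum(game_score(p, won) for p, won in games)
    # Same results, but every game treated as a 50-50 matchup.
    even = sum(game_score(0.5, won) for _, won in games)
    return res, res - even


# Hypothetical matchup: two 80% favorites from the first race both win.
res, bal = map_scores([(0.80, True), (0.80, True)])
print(round(res, 2), round(bal, 2))  # 0.08 -0.42
```

In this made-up case RES leans toward the first race because it won twice, while BAL leans the other way because those wins were fully expected.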

In each matchup (PvT, TvZ, ZvP) a positive score indicates a map favoring the first race and a negative score indicates the map favors the second race. I also summed the absolute values to produce net scores, which represent overall balance even more loosely.
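
As a small illustration of the net scores, here is the absolute-value sum applied to the Electric Circuit figures from the map list below. The function name `net_score` is just a label for this sketch.

```python
def net_score(values):
    """Sum of absolute per-matchup scores; 0 would mean no matchup leans either way."""
    return round(sum(abs(v) for v in values), 2)


# Electric Circuit, in PvT / TvZ / ZvP order
res = [-0.03, -1.06, -0.15]
bal = [-0.03, -0.31, 0.10]
print(net_score(res), net_score(bal))  # 1.24 0.44
```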

One other note: I consider the number of mirrors for each race on a given map a good starting point for balance discussion.

THE MAPS

Chain Reaction
Mirror Bias: Zerg (5 of 9 total)
Matchups
PvT 1-1; RES -0.13, BAL -0.13
TvZ 2-0; RES 0.32, BAL -0.18
ZvP 4-3; RES 0.13, BAL -0.12
Net Results 0.58, Net Balance 0.43

Although I am really going through alphabetically, Chain Reaction allows me to talk about almost all of the features (and bugs) of the system right now. The PvT results are a nice demonstration of the problem of the small sample size. Due to the constraints of the data set, (T) BaBy's victory over (P) Wooki registers as an upset, tilting the PvT numbers to Terran. The TvZ numbers demonstrate how the system is supposed to work. Results favor Terran, but victories by BaBy and (T) Leta over SKT Zergs are no indication of balance – so the second number is weighted towards Zerg. Finally the PvZ numbers are just a string of games with no particularly surprising results – overall favoring Zerg, but with the balance figure once again indicating that Protoss is not hopeless.

We can also generate race balance guesstimates off the balance figure:

Protoss is -0.13 + 0.12 (the BAL values for or against a race) = -0.01; Terran 0.13 - 0.18 = -0.05; Zerg 0.18 - 0.12 = 0.06. This incidentally confirms the mirror results indicating the map is Z>P>T.
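
The same bookkeeping can be sketched as a function: each BAL value is added to the first-named race and subtracted from the second. The name `race_balance` and the dictionary layout are assumptions of this sketch; it is checked here against the Electric Circuit figures listed below, which it reproduces.

```python
def race_balance(bal):
    """bal maps 'PvT', 'TvZ', 'ZvP' to BAL values (positive favors the first race)."""
    totals = {"P": 0.0, "T": 0.0, "Z": 0.0}
    for matchup, value in bal.items():
        first, second = matchup[0], matchup[2]
        totals[first] += value   # credit the first-named race
        totals[second] -= value  # and debit the second
    return {race: round(total, 2) for race, total in totals.items()}


electric_circuit = {"PvT": -0.03, "TvZ": -0.31, "ZvP": 0.10}
scores = race_balance(electric_circuit)
ranking = sorted(scores, key=scores.get, reverse=True)
print(scores, ">".join(ranking))
# {'P': -0.13, 'T': -0.28, 'Z': 0.41} Z>P>T
```

Because every value is added once and subtracted once, the three race totals always sum to zero, which is a quick sanity check on any hand-computed set.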

Electric Circuit
Mirror Bias: Zerg (5 of 8 total)
Matchups
PvT 2-2; RES -0.03, BAL -0.03
TvZ 1-4; RES -1.06, BAL -0.31
ZvP 1-2; RES -0.15, BAL 0.10
Net Results 1.24, Net Balance 0.44

Electric Circuit features both the most even matchup (PvT) and least even (TvZ) of any map so far. The PvT has featured eight good players with no notable upsets, while the TvZ has not only swung to Zerg, but every Zerg to win won as a (numerical) underdog; the only Terran to win was (T) Flash over (Z) Modesty. (The one game where I question the numbers is (T) BarrackS vs (Z) ggaemo; while ggaemo's ZvT is not great, BarrackS' good "record" was mostly compiled against bad competition in prelims and Dream League.)

Overall racial: Protoss -0.13, Terran -0.28, Zerg 0.41, Z>P>T (matches mirror numbers).

Ground Zero (as a Stars fan I keep typing this Ground (Z) ZerO)
Mirror Bias: Terran (3 of 4)
PvT 3-1; RES 0.40, BAL -0.10
TvZ 5-2; RES 0.64, BAL -0.11
ZvP 1-2; RES -0.40, BAL -0.15
Net Results 1.40, Net Balance 0.42

Ground Zero nicely illustrates the problem with the pure result number. In both PvT and TvZ the wins have stacked up for one race, but in neither case have there been any significant upsets, which is reflected in the balance numbers that tell you there is not much to worry about. ZvP is a different matter: both Protoss wins have been large upsets ((P) BeSt over (Z) Jaedong and (P) By.Sun over (Z) HoeJJa), indicating either racial imbalance or a lot of practice time with (P) Bisu.
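
The gap between the two numbers can be checked from the record alone. Across the maps listed in this post, the figures are consistent with BAL = RES - 0.25 × (wins - losses), i.e. with an even game being worth 0.25. That constant is inferred from the published numbers rather than stated anywhere, and `bal_from_record` is a hypothetical helper name.

```python
def bal_from_record(res, wins, losses):
    """BAL implied by RES and a W-L record, ASSUMING an even game is worth 0.25."""
    return round(res - 0.25 * (wins - losses), 2)


# Ground Zero, as (RES, wins, losses) for the first-named race
print(bal_from_record(0.40, 3, 1))   # PvT -> -0.1
print(bal_from_record(0.64, 5, 2))   # TvZ -> -0.11
print(bal_from_record(-0.40, 1, 2))  # ZvP -> -0.15
```

So a lopsided record moves RES by itself, and BAL only moves when the wins were less expected than a coin flip.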

Overall racial: Protoss 0.05, Terran -0.01, Zerg -0.04 (P>T>Z)

Early indications are that Ground Zero may well be the best map of this year's selections.

Jade
Mirror Bias: Terran (5 of 9)
PvT 4-4; RES -0.30, BAL -0.30
TvZ 1-0; RES 0.20, BAL -0.05
ZvP 1-2; RES -0.26, BAL -0.01
Net Results 0.76, Net Balance 0.36

This map does tilt Terran, I think: other than (T) Flash over (P) Horang2, every TvP win has been at least a minor upset, culminating in (T) PianO over (P) JangBi. Not that 8 games provides a definitive answer. The single TvZ game was (T) firebathero over (Z) Action. The ZvP has been all over the place; I am not even going to try to draw a conclusion.

Overall racial: Protoss -0.29, Terran 0.25, Zerg 0.04 (T>Z>P)

This obviously needed to be said at some point: on almost every map I am looking at significantly different numbers of games in each matchup, meaning the numbers are probably not exactly analogous. I have not even begun to try to adjust for this yet, so feel free to comment, but know that I am aware of the potential problem this represents.

Outlier
Mirror Bias: Protoss (8 of 13)
PvT none
TvZ 0-1; RES -0.19, BAL 0.06
ZvP 4-3; RES -0.04, BAL -0.29

I have to say Outlier is an aptly named map. I am not sure if the teams have even bothered to try playing a Terran on it even in practice. (T) Mind was sent and lost to (Z) Jaedong but that should not be enough to discourage people.

I feel like this is a test case for the system. Every single Zerg but one ((Z) EffOrt against (P) Bisu) has been favored; when they win they don't affect balance much, and when they lose the number tilts to the underdog Protoss. The games, on the whole, have been good.

There is little to no point computing any net results – you can work them out for yourself if you insist. The short version is: not really a good map within the current expectations of play.

Sniper Ridge
Mirror Bias: Terran (3 of 5)
PvT 4-1; RES 0.67, BAL -0.08
TvZ 3-0; RES 0.43, BAL -0.32
ZvP 3-2; RES 0.16, BAL -0.09
Net Results 1.26, Net Balance 0.49

Another map providing an interesting look into the balance discussion. The TvZ games were a series of heavily favored Terrans winning. The PvT and ZvP have been a mess of relatively closely handicapped games with a significant number of minor upsets. Sniper Ridge seems like a fairly solid map that basically no one has really figured out how to play yet, in my opinion. We'll see if this holds.

Overall racial: Protoss 0.01, Terran -0.24, Zerg 0.23 (Z>P>T)

Note that those numbers in this case seem to make very little sense whatsoever.

Obviously there's some room for improvement here. I thought I'd share the project so far with you; there's one particular adjustment I'm trying to work out formulaically which I think will significantly improve accuracy especially dealing with large numbers of "odd" results in a matchup.

UPSETS!

As part of the project, I had to handicap games, so I thought I'd share the list of "upsets" this produced. We start at 45% because we have to start somewhere, and any higher than that it is not possible to make a good argument for the game being an upset.

Minor Upsets (41-45% chance of winning)
2011-11-26 (P) BeSt (41%) over (Z) Jaedong on Ground Zero
2011-11-27 (T) Reality (42%) over (P) sHy on Jade
2011-11-27 (P) Stork (44%) over (Z) ZerO on Sniper Ridge
2011-12-13 (P) Stork (43%) over (Z) by.hero on Sniper Ridge
2011-12-17 (T) TurN (42%) over (P) By.Sun on Ground Zero
2011-12-24 (P) Snow (43%) over (Z) Jaedong on Outlier
2011-12-25 (T) Bogus (43%) over (P) Bisu on Sniper Ridge
2011-12-28 (P) Stats (44%) over (T) Bogus on Sniper Ridge
2011-12-31 (P) Kal (41%) over (Z) ZerO on Chain Reaction
2012-01-01 (Z) Crazy-Hydra (44%) over (P) Stork on Chain Reaction

Significant Upsets (30-39% chance of winning)
2011-12-09 (P) By.Sun (38%) over (Z) HoeJJa on Ground Zero
2011-12-14 (P) sHy (37%) over (Z) n.Die_soO on Chain Reaction
2012-01-01 (P) Jaehoon (38%) over (Z) hyvaa on Jade

Major Upsets (<30% chance of winning)
2011-11-26* (Z) ggaemo (25%) over (T) BarrackS on Electric Circuit
2011-11-29** (T) BaBy (28%) over (P) Wooki on Chain Reaction
2011-12-25 (T) PianO (28%) over (P) JangBi on Jade

* I don't know what to make of this: BarrackS seems legitimately good at TvZ but ggaemo embarrassed him.
** While Wooki was 2-0 PvT in the previous year, BaBy's good enough that this isn't really an upset, whatever the numbers say. But it's a sweet game, so I put it in anyway.
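
For anyone who wants to bucket games the same way, here is a minimal sketch of the tier cut-offs used in the lists above. The function name `upset_tier` is made up for illustration, and because the tiers as written skip 40%, the sketch assumes 40% counts as a minor upset.

```python
def upset_tier(winner_pct):
    """Classify a result by the winner's pre-game win chance, in percent.

    Cut-offs follow the lists above. ASSUMPTION: exactly 40% is treated as
    'minor', since the listed ranges leave it unassigned.
    """
    if winner_pct > 45:
        return None  # too close to even to call it an upset
    if winner_pct >= 40:
        return "minor"
    if winner_pct >= 30:
        return "significant"
    return "major"


for name, pct in [("BeSt", 41), ("By.Sun", 38), ("ggaemo", 25)]:
    print(name, upset_tier(pct))
# BeSt minor
# By.Sun significant
# ggaemo major
```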

EDIT 1 (12-01-02): Fixed game urls; now linked in dates. FYI TLPD inside URL tags invalidates the URL.
EDIT 2 (12-01-17): Adjusted game probabilities to reflect new counting method.

Xiphos

Canada 7507 Posts

January 01 2012 22:23 GMT

thanks for the effort!

So Ground Zero is the most balanced map?

mtn

729 Posts

January 01 2012 22:35 GMT

We don't know yet... There were so few games played on each map... We will be able to tell which maps were or weren't balanced after the season, or at least mid-season.

Crunchums

United States 11144 Posts

January 01 2012 22:38 GMT

I don't know how you can draw any conclusions about map balance from such a small sample size. I think player interviews / player selection choices are much more useful than the outcomes of games. Even if Best beat Jaedong on Ground Zero, was that because the map is P > Z, or is it just that Jaedong got outplayed?

Taekwon

United States 8155 Posts

January 01 2012 22:39 GMT

I strongly disagree with your assessment of Jade.
I may come back and type my opinion when there is a more statistically significant number of games, but that's just from my perspective and playing experience.

Other than that and your GZ portion I think these are solid.
Make sure to continue to update these as the season continues!

Nice work.

TheShimmy

United States 1808 Posts

January 01 2012 23:08 GMT

Wow..

This is an incredible amount of dedication you've put forth here. Great job and keep updating please please.

shaftofpleasure

Korea (North) 1375 Posts

January 01 2012 23:14 GMT

Outlier sucks. If they bring back Outsider, I will be overjoyed.

Kiett

United States 7639 Posts

January 01 2012 23:23 GMT

Very interesting, and thanks for the hard work you put into it. Of course it's a little premature for actual conclusions, but it's never too early to set up a system that you can refine to be more accurate over time as we get more games ^^

sviatoslavrichter

United States 164 Posts

January 02 2012 07:35 GMT

Great job. There are two things you can do to make this analysis more interesting:

1) Go heavier on the stats analysis. You should look at TableSim--it's a pretty good app for doing small sample analysis.

2) Start looking at the games qualitatively. There are probably a couple of B- or A- level iCCup players that can help you with this--the central question you need to ask is "how does the map influence the playstyle/choices of the players playing it? does it unlock additional options, or constrain them more tightly?" Be aware that in this regard, the possibility of 1-time snipe builds and lots of advanced coaching will be required.

sheaRZerg

United States 613 Posts

January 02 2012 08:35 GMT


Interesting. I feel like trying to take the records into account may be difficult to do rigorously. And I suspect the uncertainty with fewer than 5 games per matchup would be huge (though I'm not even sure how you would calculate that given your method).

It will be interesting to see how this compares down the line when there is much better sample size.
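
One rough way to put a number on that uncertainty, setting the handicapping aside entirely, is a Wilson score interval on the raw win rate. This is a standard binomial interval swapped in purely for illustration; it is not part of the method in the opening post, and `wilson_interval` is a name chosen for this sketch.

```python
import math


def wilson_interval(wins, games, z=1.96):
    """Approximate 95% Wilson score interval for a win rate (z = 1.96)."""
    if games == 0:
        raise ValueError("need at least one game")
    p_hat = wins / games
    denom = 1 + z * z / games
    center = (p_hat + z * z / (2 * games)) / denom
    half = z * math.sqrt(
        p_hat * (1 - p_hat) / games + z * z / (4 * games * games)
    ) / denom
    # Clamp to [0, 1] to absorb floating-point spill at the edges.
    return max(0.0, center - half), min(1.0, center + half)


low, high = wilson_interval(3, 3)  # Sniper Ridge TvZ: Terran 3-0
print(f"{low:.2f} to {high:.2f}")  # 0.44 to 1.00
```

Even a clean 3-0 sweep leaves the plausible win rate anywhere from under 50% up to 100%, which is about as "huge" as suspected.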