|
Belgium 9950 Posts
Since there's been quite a bit of discussion in the foreign BW communities around the balance of the new ASL/Ladder season maps, I decided to do some data analysis on ladder results of the last 2 ladder seasons.
Because matchup balance sways quite a bit between skill levels, and I also didn't want to limit the sample size too much, I decided to look only at the 284,935 games played at 2200 MMR or above that lasted over a minute. I would love to filter for 2300+ MMR, but then some of the matchups on the more often vetoed maps (read: Roaring Currents) drop below 100 games and the results become less statistically reliable.
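For anyone who wants to reproduce this kind of filter, here is a minimal Python sketch. The `Game` fields are hypothetical and are not the repmastered.net schema, and reading "2200 MMR or above" as "the lower-rated player is at least 2200" is an assumption. The Wilson interval is one common way to show how shaky a small cell is; it is not necessarily what was used for the tables.

```python
import math
from dataclasses import dataclass


@dataclass
class Game:
    # Hypothetical record layout, for illustration only.
    map_name: str
    matchup: str          # e.g. "PvT", from the first race's point of view
    min_mmr: int          # MMR of the lower-rated player (assumption)
    duration_s: int       # game length in seconds
    first_race_won: bool


def keep(game, mmr_floor=2200, min_duration_s=60):
    """Keep games at or above the MMR floor that lasted over a minute."""
    return game.min_mmr >= mmr_floor and game.duration_s > min_duration_s


def wilson_interval(wins, n, z=1.96):
    """Wilson score interval (about 95% for z=1.96) for a winrate."""
    if n == 0:
        return (0.0, 1.0)
    p = wins / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return (centre - half, centre + half)


def winrates(games, **filters):
    """Return {(map, matchup): (winrate, games, interval)} after filtering."""
    tally = {}
    for g in games:
        if not keep(g, **filters):
            continue
        key = (g.map_name, g.matchup)
        wins, n = tally.get(key, (0, 0))
        tally[key] = (wins + g.first_race_won, n + 1)
    return {k: (w / n, n, wilson_interval(w, n)) for k, (w, n) in tally.items()}
```

With 100 games at a 50% winrate, `wilson_interval(50, 100)` spans roughly 40% to 60%, which is why cells below 100 games are hard to read much into.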
The results:
If anything sticks out, it's definitely the abysmal PvT stats of Litmus. A two-player map that doesn't allow gas stealing and has the most direct and narrow push path straight to your opponent's natural is a PvT nightmare.
Now, this is not supposed to be a total authority on actual map balance. As mentioned before, certain mechanics / metas / layouts affect people differently at different skill levels. It won't surprise anyone that the skill gap between an average 2300 MMR player and a pro is still enormous, and that difference in execution and knowledge can significantly skew how a certain map affects a matchup. ELOBoard, though limited in sample size for newer maps, still has the best reflection of pro-level map balance.
If anything, these stats show that this difference in skill level still has a very pronounced effect on typical match up results between Pro/Semi-Pro level and S-rank level.
To illustrate this, let's take a look at PvT on Metropolis as an example. In ladder stats Protoss actually does quite decently on this map with a 48.9% winrate. However, at the pro level this is one of the most Terran-favored maps according to ELOBoard stats, with an abysmal 41.4% winrate for Protoss.
This is confirmed by what the pros have been saying on stream and picking/vetoing in ASLs about PvT on Metropolis.
I was a bit hesitant to post this because I don't really like fueling all the matchup balance copium that goes around our community on a daily basis, even at D-rank level. I already have an idea of what to investigate next with the dataset I have available now, and expect it to have a more concrete conclusion/outcome. Feel free to drop ideas on what you think would be interesting to look at.
Finally, I want to give a big shoutout to icza from repmastered.net for providing me access to the necessary dataset. Please support him through one of many ways to donate to repmastered.net. He makes amazing tools for the BW community and deserves more recognition.
|
This is great! PvT winrates are surprisingly low. Can you also look at lower level stats? Like 1800 and 2000 MMR? Thanks!
|
this is very nice, thanks. looks like terran is favoured pretty much every time they play a game of brood war at a high level :^)
just a small note, would you mind adjusting the color gradient to be symmetrical in regards to deviation from 50%, e.g. in a way such that 40% WR has the same color as 60% WR?
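One way to get a gradient like this is to color by distance from 50% only. A minimal Python sketch follows; the function name, the red/blue hues, and the 10-point saturation cap are all made up for illustration and say nothing about how the original tables were colored.

```python
def winrate_color(winrate, max_dev=0.10, diverging=False):
    """Map a winrate in [0, 1] to a hex color that is white at 50%.

    Intensity depends only on |winrate - 0.5|, so 40% and 60% get the
    same intensity. With diverging=True, winrates below 50% use a blue
    tint instead of red, at that same intensity.
    """
    # Round away float noise so 0.45 and 0.55 give identical deviations.
    dev = round(winrate - 0.5, 6)
    # Saturate once the deviation reaches max_dev (10 points by default).
    t = min(abs(dev) / max_dev, 1.0)
    fade = round(255 * (1 - t))
    if diverging and dev < 0:
        return f"#{fade:02x}{fade:02x}ff"
    return f"#ff{fade:02x}{fade:02x}"
```

By default `winrate_color(0.40)` and `winrate_color(0.60)` return the same color; pass `diverging=True` to keep the direction visible while the intensity stays symmetrical.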
|
it's extremely interesting that:
1) protoss lose on almost every map in every matchup
2) terran win on almost every map in every matchup.
it kind of reflects the larger pattern: zerg has a weak matchup, terran does not. Protoss has a weak matchup and a slightly less weak matchup. When people complain about z>p it's not that they're wrong; rather, it's the wrong target. The pvz balance would be totally acceptable if the pvt balance more closely resembled the strong-weak matchup dichotomy that terran and zerg have. Or rather, it reflects the myth that terran has no weak matchup and protoss has no strong matchup. It's not zvp that's the problem, it's tesagi.
edit: sorry that i fucked up the pattern. im always that guy lol.
|
Can you do the same analysis on a data set from 3-4 years ago?
i.e. can we see if recent map design changes have made things better or worse
|
United States 12240 Posts
Hey RaGe. For me, the next step would be to normalize the data against MMR-based win probabilities. That is, if an MMR gap of 100 translates to a 64/36 win probability, then introduce that as a starting weight before including the race-based data for games where the MMRs of both players are 100 apart. I'm not sure what they're using to measure MMR exactly, or whether it's a custom model, but I'm sure Glicko2's default configuration will be perfectly usable for this purpose.
I had planned to (and still do eventually plan to) use this method on ShieldBattery's ladder to apply appropriate point handicaps for ladder matches according to matchup-weighted map data, but so far I don't have a large enough dataset.
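The normalization described above can be sketched in Python roughly as follows. This assumes a plain Elo-style logistic curve with a scale of 400, which happens to give about 64/36 for a 100-point gap; it is not Glicko-2 (which also tracks rating deviation) and not necessarily the model behind the ladder's MMR. The function names and the `(mmr_a, mmr_b, a_won)` tuple layout are hypothetical.

```python
def expected_win_prob(mmr_a, mmr_b, scale=400.0):
    """Elo-style expected win probability for player A (assumed model)."""
    return 1.0 / (1.0 + 10 ** ((mmr_b - mmr_a) / scale))


def rating_adjusted_edge(games):
    """Mean of (actual result - rating-expected result) for player A.

    `games` is an iterable of (mmr_a, mmr_b, a_won) tuples in which
    player A always plays the same race, e.g. Protoss in PvT on one map.
    A positive value means that race wins more often than the ratings
    alone predict; a negative value means it wins less often.
    """
    total = 0.0
    n = 0
    for mmr_a, mmr_b, a_won in games:
        total += (1.0 if a_won else 0.0) - expected_win_prob(mmr_a, mmr_b)
        n += 1
    return total / n if n else 0.0
```

For example, `expected_win_prob(2300, 2200)` is about 0.64, so a win by the 2300 player only counts as +0.36 toward their race instead of a full win.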
One other semi-annoying wrinkle that might be useful at a later time: factoring in start locations on 3 and 4-player maps, since positional matchups can play a big factor in how games play out. I'm not sure how to get this information but eventually it will be relevant.
I like your post though, I always enjoy seeing large-scale data analysis like this. Thanks for the parse!
|
balance Eclipse, toss winning too much there.
|
Insightful statistics. Thanks!
|
Belgium 9950 Posts
On November 01 2025 12:11 Excalibur_Z wrote:
Hey RaGe. For me, the next step would be to normalize the data against MMR-based win probabilities. That is, if an MMR gap of 100 translates to a 64/36 win probability, then introduce that as a starting weight before including the race-based data for games where the MMRs of both players are 100 apart. I'm not sure what they're using to measure MMR exactly, or whether it's a custom model, but I'm sure Glicko2's default configuration will be perfectly usable for this purpose.
I had planned to (and still do eventually plan to) use this method on ShieldBattery's ladder to apply appropriate point handicaps for ladder matches according to matchup-weighted map data, but so far I don't have a large enough dataset.
Yeah I thought of that, but I believe this introduces a different type of bias as well: some matchups are probably easier to be 'consistent' in / harder to have an upset in. That in itself could also be interesting to attempt to quantify, for example by plotting winrate against MMR gap across all games and comparing it with the same curve per matchup. Obvious outliers would be ZvZ and PvP, but it would be interesting to see how this measures up against less obvious ones like PvZ and PvT.
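A rough Python sketch of how that per-matchup consistency could be tabulated before plotting. The tuple layout, the function name, and the 50-point bucket width are assumptions for illustration, not something taken from the dataset.

```python
from collections import defaultdict


def favourite_winrate_by_gap(games, bucket_size=50):
    """Winrate of the higher-rated player per (matchup, MMR-gap bucket).

    `games` is an iterable of (matchup, higher_mmr, lower_mmr, favourite_won).
    Gaps are floored to multiples of `bucket_size`, so with the default a
    gap of 55 lands in bucket 50 and a gap of 110 lands in bucket 100.
    """
    tally = defaultdict(lambda: [0, 0])  # (matchup, bucket) -> [wins, games]
    for matchup, higher_mmr, lower_mmr, favourite_won in games:
        bucket = ((higher_mmr - lower_mmr) // bucket_size) * bucket_size
        cell = tally[(matchup, bucket)]
        cell[0] += 1 if favourite_won else 0
        cell[1] += 1
    return {key: wins / n for key, (wins, n) in sorted(tally.items())}
```

A matchup whose curve climbs toward 100% more slowly than the others as the gap grows would be the more upset-prone one.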
I kept the scope of this analysis pretty limited, just comparing matchups and maps against each other on pure winrate. I think even this simple analysis could probably benefit from limiting it to a certain range of MMR gaps between the players, but we need a larger sample size to be able to cut further on both MMR floor and max MMR gap and still get stat-sig results on the newer maps. I wanted to focus on this as there's been quite a bit of back and forth on specifically Roaring Currents and Litmus balance in foreign BW discords in the past months.
I might dive deeper into some of the bigger sample size maps at some point though.
On November 01 2025 12:11 Excalibur_Z wrote:
One other semi-annoying wrinkle that might be useful at a later time: factoring in start locations on 3 and 4-player maps, since positional matchups can play a big factor in how games play out. I'm not sure how to get this information but eventually it will be relevant.
I like your post though, I always enjoy seeing large-scale data analysis like this. Thanks for the parse!
Well, you correctly guessed the next thing I'm working on. I'm doing some sanity checks on my math/approach still but I should be close to release, just need some time to work another hour or 2 on it.
|