Data analysis on 70 million replays

Kraekkling

623 Posts

November 20 2025 02:25 GMT

Updated to full data set including current ladder maps

Two years ago we examined a data sample of 8 million Brood War replays from repmastered. If you're seeing this for the first time, I highly recommend reading the older post first.

The data sample has grown to 70 million replays, so it seemed like a good time to revisit. There are two parts to this. The first part is similar to what we looked at last time, i.e. how win rates for specific match-ups develop during the game, just with more data and split by map. This is now possible due to the much bigger data sample.

The second part is new: we'll take a look at how balanced spawn locations are, both for mirror and for non-mirror match-ups.

Did you ever wonder what the best spawn is as Terran in TvZ on Fighting Spirit? Did you ever feel like your timings are much more crisp when you spawn at 12 o'clock with Zerg on Dominator? Is there a difference between vertical and horizontal entrances at the natural expansion for Protoss in PvZ? Could it even be that some specific spawn locations are so much better, that an otherwise disadvantaged matchup becomes advantageous? Let's find out.

Data

The dataset comprises roughly ~~70 million~~ 28 million 1v1 ladder games played since the start of 2018, but by far the biggest part came in during the last 2 years. This is perfect, since we should care much more about recent games on new ladder maps, than what happened long in the past.

The dataset doesn't contain the complete information from each replay, only some extracted data. Build orders or income details are not available.

The dataset includes:

Player races
Game winner
Game duration
Spawn locations
Player MMR

To refine the dataset, a few filters were applied:

Game duration > 2 minutes
Exclude draws
Exclude games with afk players
Exclude games on fastest maps and similar
Exclude games on maps with fewer than ~~100.000~~ 60.000 games (changed to include new ladder maps such as Roaring Currents)
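
To make the filter list above concrete, here is a minimal sketch in Python. It is not the actual pipeline: the `Game` record, its field names, the `None`-for-draw encoding and the precomputed `has_afk` / `is_fastest` flags are all assumptions for illustration, and so is applying the per-map threshold after the other filters.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

MIN_DURATION_S = 120          # "Game duration > 2 minutes"
MIN_GAMES_PER_MAP = 60_000    # per-map threshold from the list above


@dataclass
class Game:
    # All field names are hypothetical; the real extract may look different.
    map_name: str
    duration_s: int
    winner: Optional[int]     # 0 or 1, None for a draw (assumed encoding)
    has_afk: bool             # assumed to be flagged upstream
    is_fastest: bool          # assumed flag for fastest maps and similar


def apply_filters(games, min_games_per_map=MIN_GAMES_PER_MAP):
    """Return the games that survive all filters."""
    kept = [
        g for g in games
        if g.duration_s > MIN_DURATION_S
        and g.winner is not None
        and not g.has_afk
        and not g.is_fastest
    ]
    # Assumption: the map threshold is counted on the already-filtered games.
    per_map = Counter(g.map_name for g in kept)
    return [g for g in kept if per_map[g.map_name] >= min_games_per_map]
```

The threshold is a parameter so the same function can be tried on a small sample, e.g. `apply_filters(sample, min_games_per_map=2)`.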

See below for a map frequency histogram. The map whose name shows up as rectangles is FS1.3 and FS1.4, since it's the only map with a Korean name.

Player population
Here we see how the ladder MMR is distributed between the players of different races. The vertical lines represent the mean of the distributions.

In the raw counts, we see that overall there are slightly more Protoss players than Zerg or Terran players, but the difference is not that big (~15%). The average MMR is very slightly higher for Zerg.

We can also look at this after normalizing each of the distributions. Basically this is what it would look like if there were exactly as many players of each of the races. There's not too much insight here, since the distributions almost perfectly overlap. If anything, we could say that the skill variance between players of a given race is pretty similar for all races.
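
For anyone who wants to reproduce the normalization: a minimal sketch, assuming the MMR distribution of one race is available as a list of per-bin player counts (the function names and the binning are assumptions, not the original code). Dividing by the total makes every race sum to 1, which is what removes the population difference; the mean shown as a vertical line is the count-weighted average of the bin centers.

```python
def normalize(counts):
    """Turn raw per-bin player counts into fractions that sum to 1."""
    total = sum(counts)
    if total == 0:
        return [0.0 for _ in counts]
    return [c / total for c in counts]


def weighted_mean(bin_centers, counts):
    """Mean MMR of a binned distribution (position of the vertical line)."""
    total = sum(counts)
    return sum(x * c for x, c in zip(bin_centers, counts)) / total
```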

Part 1, Win rates vs. game time

These distributions were created in the same way as described here. I think the write-up and the discussion in the comments in the old thread on what we see in the plots was quite good, so I recommend reading that.

Just to have a short explanation: the colored data points with the colored line tell us how the win rate changes during the match; the relevant legend is on the left-hand side. Additionally we have a dotted, straight green line at exactly 50% win rate, which makes it easier for us to see where the "perfect balance" is.

In the background, we have blue bars superimposed, which tell us how many games ended in a given 1-minute interval. The relevant legend is on the right side. On top of the blue bars, we have percentage numbers. These tell us which fraction of all games ended before a given point in time. For example, in the PvT graph, we can see a number of 35% above the blue bar for the 10-11 minute interval. This means that ~35% of all PvT games end before 11 minutes. Note that these percentage numbers go towards 100% as the time increases, which makes sense, since all games have to end at some point, and almost all games end somewhere before 40 minutes.
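
The three quantities in these plots (bar height, win rate, percentage label) can be sketched as follows. This assumes the win rate in each point is computed over the games that ended in that 1-minute interval, which is my reading of the description; the linked write-up is the authority on the exact method. The input format `(duration_s, a_won)` is hypothetical.

```python
from collections import defaultdict


def winrate_by_minute(games):
    """games: iterable of (duration_s, a_won) for one matchup, where a_won
    is True if the first-named race won (hypothetical input format).

    Returns {minute: (games_ended, win_rate, cumulative_fraction)}, where
    `minute` 10 means the 10-11 minute interval and cumulative_fraction is
    the share of all games that ended before the end of that interval.
    """
    ended = defaultdict(int)
    wins = defaultdict(int)
    for duration_s, a_won in games:
        minute = int(duration_s // 60)
        ended[minute] += 1
        if a_won:
            wins[minute] += 1

    total = sum(ended.values())
    rows = {}
    running = 0
    for minute in sorted(ended):
        running += ended[minute]
        rows[minute] = (ended[minute], wins[minute] / ended[minute], running / total)
    return rows
```

In this sketch the 35% label from the PvT example would be the third value of `rows[10]`.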

PvT, overall

PvT by MMR brackets

PvT on 4-player maps, cross- vs close spawn

PvZ, overall

PvZ by MMR brackets

PvZ on 4-player maps, cross- vs close spawn

TvZ, overall

TvZ by MMR brackets

TvZ on 4-player maps, cross- vs close spawn

You can find a lot more data on a per-map basis here.

When looking at these win rates vs time on a per-map basis, there's quite a bit of variation. While most maps tend to follow the very same trend as we see on the average for all maps, the deviations are actually quite significant. Especially when we compare two maps to each other, we can tell that specific builds or timings differ in their relative strength depending on the map. For example, consider TvZ on Eclipse and Neo Dark Origin, both 2-player maps.

To me, the most striking difference is the mid-game strength of Terran. While we see that Terran dominates the match-up as usual at ~9-16 minutes, this is much more pronounced on Eclipse, in agreement with the consensus that Eclipse is a particularly hard map for Zergs to get into a late-game 4gas situation.

Now if we go into specifics and smaller details, we can consider that the rush distance on Eclipse is a bit longer compared to most other 2-player-maps, so things like 2-Rax sunken busts are a bit weaker.

We can see this at the peak around 5-6-7 minutes, where the win rate shoots up for Terran. Note the peak is both lower and later in the case of Eclipse: the sunken bust hits later and is less potent. On the other hand, when we look at what happens around 7-8 minutes, we see a dip caused by Mutalisk timings. We can instantly tell that, whatever the reason, Mutalisks hit much harder on Neo Dark Origin, compared to Eclipse.

I think generally this is the most important takeaway and how this data can be used. You can consult these plots to see whether a particular build is comparably stronger or weaker on a specific map.

Part 2, Win rates by spawn location

Now, to the new part. In the following, we'll be looking at win rates for a specific spawn, on a given map, in a given matchup.

So each of the plots will have exactly that information: map, matchup, and how the spawn locations differ between each other.

Consider the example of ZvZ on Dominator.

We can see that 12 o'clock is by far the best spawn location. The red-green color gradient at the bottom tells us whether a spawn is at an advantage or at a disadvantage. For mirror match-ups this is the easiest case, since the expected win rate everywhere should be 50%. So a deviation below that turns red, and anything above becomes green. At the very center, close to 50%, the color scheme becomes white, which means that the more color you see on these plots, the more overall imbalance there is. If you see a plot with very little color, this means that all spawn locations are very similar to each other.

Next, let's consider another example, TvP on Polypoid.

The only difference here is that we no longer use the 50% as benchmark for a "balanced" spawn. Instead, the benchmark is the overall win rate (for this matchup) for this map. For example, Terran has an overall win rate of 51.1% in TvP on this map, as given in the central text box. Thus to evaluate the spawn, we consider the difference to this 51.1% to figure out what the best/worst spawns are.
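
The benchmark logic for both cases (50% for mirrors, overall map win rate otherwise) fits in a few lines. A minimal sketch, not the original code: the input is assumed to be `(spawn, won)` pairs from the perspective of one race on one map, and for mirrors I assume each game contributes one record per player.

```python
from collections import defaultdict


def spawn_deviations(records, mirror=False):
    """records: iterable of (spawn, won) for one race, map and matchup
    (hypothetical format). Returns (baseline, {spawn: win_rate - baseline}).

    Mirror match-ups use a fixed 50% baseline; otherwise the baseline is
    the overall win rate of that race in this matchup on this map.
    """
    n = defaultdict(int)
    w = defaultdict(int)
    for spawn, won in records:
        n[spawn] += 1
        if won:
            w[spawn] += 1

    if mirror:
        baseline = 0.5
    else:
        baseline = sum(w.values()) / sum(n.values())
    return baseline, {spawn: w[spawn] / n[spawn] - baseline for spawn in n}
```

Positive values correspond to the green side of the gradient, negative values to the red side.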

Another difference in this particular example is that an MMR bracket was selected, as shown in the info box. I'm using the same definition for this as last time:

A match MMR is determined by taking the average MMR of the two players (initial MMR values before the match was played). Several brackets of games are defined and compared. Additionally, a limit is imposed on the maximum difference between the two players' MMR, to ensure only games between players of comparable skill are taken into account.

I generally wanted to see how this imbalance for spawn locations behaves as we examine players of different strength. Currently, there are three brackets:

- all players
- match MMR > 1900
- match MMR > 2100

If deemed relevant, we can adjust them, but it seemed sufficient so far.
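
The bracket definition as a minimal sketch. The thresholds 1900 and 2100 are the ones listed above; the value of `MAX_MMR_DIFF` is purely hypothetical (the actual limit isn't stated here), and applying it to the "all players" bracket as well is also an assumption.

```python
# Bracket floors from the list above; None means no lower bound.
BRACKETS = {"all": None, ">1900": 1900, ">2100": 2100}

# Hypothetical value: the real maximum MMR difference is not given in the post.
MAX_MMR_DIFF = 200


def match_mmr(mmr_a, mmr_b):
    """Average of the two players' MMR before the match was played."""
    return (mmr_a + mmr_b) / 2


def brackets_for(mmr_a, mmr_b, max_diff=MAX_MMR_DIFF):
    """Names of all brackets a game falls into (empty if too lopsided)."""
    if abs(mmr_a - mmr_b) > max_diff:
        return []
    m = match_mmr(mmr_a, mmr_b)
    return [name for name, floor in BRACKETS.items() if floor is None or m > floor]
```

Note that the brackets overlap by design: a game with match MMR 2200 counts towards all three.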

Lastly, note that while previously it was sufficient to e.g. look at a graph for PvT to understand both the perspective of the Terran and the perspective of the Protoss, for these map-, matchup- and spawn-specific plots we must consider them independently. So we end up with 9 perspectives which need to be considered: ZvZ, TvT, PvP, PvZ, PvT, TvZ, TvP, ZvP, ZvT.

Now we can dive into the actual results - let me start by saying that I was very surprised by what I saw.

For a given map, more often than not, the spawn location actually seems to be the most decisive factor. I expected the differences to be maybe on the order of a few percent, but the actual differences are often above 5% and sometimes even at 10%. Isn't that crazy? Two years ago, we also examined how matchup win rates depend on cross-spawn vs close-spawn on 4-player maps, and there is a significant and measurable effect. However, pretty often it is completely dwarfed by the influence of the specific spawn.

If you think about how many of the recent balance discussions here on ZvP centered on matchup differences of like 48-52%, but it's actually more like

"yeah so for PvZ, as Protoss on Fighting Spirit, you just need to spawn top right and you are miles ahead"

and this usually persists throughout the MMR brackets or becomes even more severe. Just look at this:

What the heck? So when I first saw some of these, my first thought was that something might be wrong with the data or my methods, so I started checking things to confirm everything works as expected and I didn't mess up. I've arrived at a few cases where it should be easy to tell what is supposed to happen, and it does happen every single time. Thus I'm inclined to believe what we see in the data is overall correct. Let me show you these cases.

case 1, FS1.3 -> FS1.4

Luckily, there are a few maps in the pool where just tiny changes were applied to specific spawns. For example, when we went from FS1.3 to FS1.4, the one relevant change was the layout of the minerals on the left-side spawns. It had become known that the right-side spawns were mining faster, so we switched to "L"-shaped mineral layouts on the left side. This difference in mining rates is most relevant for Zerg, since they mine with the fewest drones.

FS1.3, ZvZ

FS1.4, ZvZ

As expected, in each of the brackets we see the advantage of the right-side spawns shrink when we compare FS1.4 to FS1.3. The same is true for other Zerg matchups when looking at 1.3 -> 1.4, where the left-side spawns become a bit better. There are some fluctuations, but the trend goes in the right direction.

case 2, FS top right spawn Protoss

I actually didn't know about this, so I was very glad when eon mentioned that top right spawn is known to be advantageous for protoss. These advantages in mining should be most relevant in the mirror matchups, and this is what we see.

PvP on FS1.3

PvZ on FS1.3

Protoss spawns top right on FS = big peepee.

It looks like the change to 1.4 partially improved this in PvP, where now top left is also a good spawn.

Not quite sure how to interpret this, maybe someone else can comment. Also there's stuff like, for example, PvT on FS1.3, where bottom right is the best spawn, and when we go to FS1.4, it's much more balanced overall, except that bottom left remains the worst spawn by far.

So there is plenty of very match-up specific stuff that I don't quite understand, but also more variables at play. In the mirror match-up, everything seems as expected.

case 3, 12 o'clock spawn Zerg on Dominator

This spawn is known for very quick mining for Zergs, also confirmed by the data at hand.

What now?

First, here's the rest of the data on spawn-specific stats. It's grouped by

matchup -> map

and then there are 3 plots each, one per MMR bracket. Feel free to re-upload them somewhere else to be able to repost or reference them here.

There are hundreds of plots overall, have fun browsing them. I'd recommend you look for the match-ups you play yourself and maps that you like in particular, and try to see whether you find anything interesting or noteworthy.

There might be better ways to structure this and present it, I'm looking for input there as well. Also if you have any specific sanity tests or cross-checks that could be applied to make sure everything works as intended, please comment. I should add that the results did not pass all my vibe checks. For example, on FS, the bottom left spawn is supposed to be particularly bad in TvZ, which we do not see in the data. Now, whether this is because something is wrong with the methods; or, because every Terran knows that this spawn is supposed to be worse for them, and they looked up how to best place turrets; or, this spot never was bad at all; I don't know.
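
One simple cross-check along those lines (a suggestion, not something that was applied to the plots here): put a rough error bar on each spawn's win rate, so that small-sample spawns aren't over-interpreted. The sketch below uses the standard normal approximation for a binomial proportion with z = 1.96 for roughly 95% coverage; it treats games as independent, which is itself an assumption.

```python
import math


def winrate_interval(wins, games, z=1.96):
    """Approximate confidence interval for a win rate.

    Normal approximation to the binomial: p +/- z * sqrt(p * (1 - p) / n).
    Reasonable for large n and p not too close to 0 or 1.
    """
    p = wins / games
    half_width = z * math.sqrt(p * (1 - p) / games)
    return p - half_width, p + half_width
```

If the benchmark win rate (50% for mirrors, the map's overall win rate otherwise) lies inside the interval for a spawn, the deviation for that spawn could just be noise.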

Regarding what all of this means for the game, I'm not sure either. Most often, there seem to be several variables involved at the same time, with overlapping effects.

- Specific spawns seem optimal for one race, but might be not optimal at all for a different race
- Some maps have wildly imbalanced spawns, while others are pretty balanced
- Quite often, the variance between specific spots is higher than the difference in the win rates for the matchup as a whole

I didn't want to theorize too much on what the underlying causes could be, e.g. is it more of a left-side-spawn vs right-side-spawn bias, or is it about top side vs bottom side, or is it specific to horizontal vs vertical entrance and only relevant in PvZ (it seems like a vertical-entrance natural expansion is MUCH better for Protoss in PvZ, compared to a horizontal entrance; it's a mostly consistent thing), etc. But I invite everyone to share their findings and what they think about the data.

Anyway, thanks again to repmastered for the data, and everyone consider donating a dollar for the free services we get. The stuff you see here is based on ~28 million ladder replays (out of the 70 million replays on repmastered, which include team games, fastest games, etc).

Soft_General_5023

110 Posts

November 20 2025 05:10 GMT

Thanks for the data analysis, very interesting.

On the highest player level group you created, MMR > 2400, the overall win rates are as follows, right?

PvT 46.25%
PvZ 47.46%
TvZ 55.73%

Soft_General_5023

110 Posts

November 20 2025 05:15 GMT

Crimson)S(hadow

Philippines596 Posts

November 20 2025 05:32 GMT

Te...

Bonyth

Poland594 Posts

November 20 2025 07:02 GMT

sa...

Ethelis

United States2397 Posts

November 20 2025 07:16 GMT

gi...

iFU.pauline

France1672 Posts

November 20 2025 07:41 GMT

The average game length for a Zerg is around 12 minutes. And the peak win rate for TvZ and PvZ is around 12 minutes. Way above 50%. How should we interpret that? WE ARE THE MOST MISUNDERSTOOD RACE!

iopq

United States1056 Posts

November 20 2025 07:51 GMT

So we can see that cross spawn is overall balanced, close spawn is not quite. Why not just make more 3 player maps?

It would be a slightly longer distance than close, but shorter distance than cross.

Abjurer

Sweden211 Posts

November 20 2025 07:51 GMT

Stellar work! Any changes you noticed in the winrate graphs compared to last time?

RedW4rr10r

Switzerland749 Posts

November 20 2025 08:55 GMT

#10

Thanks for the work, definitely some interesting data. I took a look at the previous thread from two years ago first. Now I wanted to check out the statistics here, but none of the images are working on my end "[image loading]". Is it me or did the links to the images break or something?

Malinor

Germany4733 Posts

November 20 2025 09:15 GMT

#11

Thank you so much for this. Super cool stuff.

[sc1f]eonzerg

Belgium6812 Posts

November 20 2025 12:17 GMT

#12

Any Terran player who could explain why Polypoid bottom right is such a good spawn for you? I think what top right and bottom right have in common is the same mineral lines. The sim city is different. Is being on the right side somehow beneficial for defense and the mobility of your army, compared to the left side?

Kraekkling

623 Posts

November 20 2025 12:19 GMT

#13

On November 20 2025 17:55 RedW4rr10r wrote:
Thanks for the work, definitely some interesting data. I took a look at the previous thread from two years ago first. Now I wanted to check out the statistics here, but none of the images are working on my end "[image loading]". Is it me or did the links to the images break or something?

anyone else? could you link to the exact broken image? might need to switch to another image host if this persists... thanks

[sc1f]eonzerg

Belgium6812 Posts

November 20 2025 12:28 GMT

#14

On November 20 2025 21:19 Kraekkling wrote:

anyone else? could you link to the exact broken image? might need to switch to another image host if this persists... thanks

Working fine for me.

RedW4rr10r

Switzerland749 Posts

November 20 2025 13:05 GMT

#15

On November 20 2025 21:19 Kraekkling wrote:

anyone else? could you link to the exact broken image? might need to switch to another image host if this persists... thanks

I tried a different browser (Firefox at first, then Chrome) and none of the images work. But they work on mobile (iOS). Must be on my end then, especially when I seem to be the only one. I'll check it out on my phone then