I recently delved into a database containing around 8 million replays. A huge thanks to Dakota_Fanning for granting me access to the data from repmastered.app.
Introduction
We're all aware of the slight imbalances in various matchups. For instance, there's a ~53% win rate for Terran in TvZ, with similar minor imbalances in other matchups. While these numbers can vary based on factors like maps, spawn locations, player skill, and the current meta, they are generally recognized and accepted by the community.
With the data at hand, I tried to examine how some of these factors might influence the result of a game. Specifically, I examined how win rates in non-mirror matchups change as the game progresses over time.
In essence, we'll be looking at the win rate segmented into 1-minute intervals.
For example, for the 7-8 minute interval, we only consider games that concluded between 7:00 and 8:00 in-game time. This gives us an idea of the likelihood of a particular race winning at that specific time in the game. By doing this for every 1-minute interval, we can observe the win rate over time.
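For anyone who wants to reproduce the binning, here is a minimal Python sketch of the idea. The input format (a list of `(duration_seconds, winner_race)` tuples for a single matchup) and the function name are assumptions made for illustration, not the actual repmastered.app schema.

```python
from collections import defaultdict

def win_rate_by_minute(games, race):
    """Return {minute: (win_rate, games_ended)} for one matchup.

    games: iterable of (duration_seconds, winner_race) tuples (hypothetical format).
    race:  the race whose win rate we want, e.g. "P".
    """
    wins = defaultdict(int)
    total = defaultdict(int)
    for duration_seconds, winner_race in games:
        minute = int(duration_seconds // 60)  # a game ending at 7:10 lands in bucket 7
        total[minute] += 1
        if winner_race == race:
            wins[minute] += 1
    return {m: (wins[m] / total[m], total[m]) for m in sorted(total)}

# Made-up sample data, only to show the call.
sample = [(430, "Z"), (450, "Z"), (470, "P"), (800, "P"), (830, "P"), (845, "Z")]
rates = win_rate_by_minute(sample, "P")
```

Each bucket also carries the number of games that ended in it, which is the count shown as the grey background in the plots.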
Before diving into the results, let's understand the data used. You can skip this part if you're only interested in the results.
Data
The dataset comprises roughly 8 million 1v1 games played since the start of 2018. It doesn't provide complete information from a replay but rather some extracted data.
Build order or income details were not available.
However, the dataset did include:
- Player races
- Game winner
- Game duration
- Spawn locations
- Player APM & effective APM
To refine the dataset, a few filters were applied:
- Games played after 01.01.2018
- Game duration > 2 minutes
- Exclude draws
- Exclude games with afk players
- Exclude games on fastest maps and similar
- Exclude games on maps with fewer than 20000 plays
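The filters above could be sketched roughly as follows. All field names (`played_on`, `duration_seconds`, `winner`, `has_afk_player`, `is_fastest_style_map`, `map_name`) are hypothetical, and two details are assumptions: that 01.01.2018 itself is included, and that map play counts are taken after the other filters have been applied.

```python
from collections import Counter
from datetime import date

def apply_filters(games, min_map_plays=20000):
    """games: list of dicts with the hypothetical keys named above."""
    kept = [
        g for g in games
        if g["played_on"] >= date(2018, 1, 1)   # assumption: the first day counts
        and g["duration_seconds"] > 120         # longer than 2 minutes
        and g["winner"] is not None             # assumption: a draw has no winner
        and not g["has_afk_player"]
        and not g["is_fastest_style_map"]
    ]
    # Assumption: a map's play count is measured on the already-filtered games.
    plays = Counter(g["map_name"] for g in kept)
    return [g for g in kept if plays[g["map_name"]] >= min_map_plays]
```

Counting map plays in a second pass keeps the threshold from being inflated by games that were going to be dropped anyway.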
Below is a histogram displaying the frequency of each map after applying these filters. ("13" represents Fighting Spirit 1.3)
Win rates vs. game time
As mentioned earlier, the y-axis displays the win rate (colored and linked data points, label on the plot's left side) against game time on the x-axis. A dashed, green line represents the 50% win rate, helping us identify when matchup dynamics shift.
The grey background data, with axis labels on the right, shows how many games conclude in each interval. This helps gauge the frequency of games ending at specific times and how often they progress to certain stages.
PvZ
This particular plot initially sparked my interest, so let's start here. Key observations include:
- Zerg dominates the game's early stages.
- The significant dip around 7-8 minutes can likely be attributed to hydra bust builds. This is the most striking example I've seen in the data so far of a single strategy or build having such a profound impact.
- Protoss takes the lead during the mid-game, peaking around 13-16 minutes. This surge might partly result from the hydra bust builds: games that didn't end earlier are likely ones where Protoss successfully fended off the bust, often securing a favorable position.
- From 16-20 minutes, Protoss's win rate drops, with Zerg taking the upper hand. Several factors might contribute, such as Protoss typically mining out their main and natural bases around 16 minutes, while Zerg continues mining at (at least) three bases until roughly 20 minutes. The shift also reflects Zerg's hive tech coming into play.
- Games extending beyond 30 minutes show a slight uptick for Protoss, but these instances are rare and may not be very significant.
Overall, while there weren't any major surprises, it was nice to quantify these trends. Comparing close spawn vs. cross spawn win rates will be interesting.
A key takeaway is that Protoss might benefit from trying to wrap up games while on two bases, as Zerg seems favored in later stages.
PvT
Breaking this down chronologically:
- Protoss has the edge in the game's very early phases (up to 6 minutes).
- A notable number of games conclude between 6-8 minutes, probably due to Protoss timings.
- Terran seems to have a strong timing around 10-12 minutes, which appears to be an optimal window for them to end the game.
- The small bump in Protoss's win rate around 14 minutes is not completely clear to me; it seems to wane a few minutes later. Maybe it represents games where Protoss successfully defends against Terran's 10-12 minute push?
- After 12 minutes, Terran's win rate starts to decline, with games lasting over 20 minutes favoring Protoss.
PvT is the matchup where I have the least knowledge overall, so maybe someone else will have better insights on this?
TvZ
Again, let's look at this chronologically:
- The first few minutes are dominated by zergling wins.
- Terran wins quite a bit at 5-7 minutes, probably mostly through sunken busts?
- Zerg has a strong timing at around 7 minutes, when lair tech kicks in.
- If Terran doesn't lose to the Zerg lair tech timing, they have a really great time until about 12-13 minutes, when Zerg finally gets defilers. This pre-defiler window is probably the most dominant point in the game in any matchup: Terran's win rate is very high, and a large number of games end there.
- If Zerg can stabilize, they slowly but surely flip the matchup dynamics over the 12-20 minute period.
- And thus we enter a period of Zerg advantage after 18 minutes. Note that fewer games reach this point, where Zerg can reap their advantage.
- However, there is also a clear trend: if Zerg cannot finish the game once they unlock full hive tech (defiler + upgraded ultra) at ~20 minutes, the matchup becomes balanced again as the game drags on.
Here it was cool to see how the matchup flips multiple times. Also, if you just visually compare the shape of the win rate vs time of TvZ and PvZ, you'll see striking similarities. In both matchups, Zerg seems to have strong early timings, then gets dominated in the mid-game, but comes back at a later point.
Influence of spawn locations: close vs cross-spawn
Here we use 4-player maps only and compare the win rate on close spawn positions to the diagonal (cross-spawn) case. This is of particular interest from a map-making point of view, since rush distances have a big influence on how strong certain builds are.
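One way to label a game as close or cross spawn is sketched below in Python. It assumes each 4-player map provides its four start locations as `(x, y)` tuples and that the two spawns used in a game are among them. This is an illustrative approach, not necessarily how the split was made for the plots.

```python
import math

def spawn_relation(start_locations, spawn_a, spawn_b):
    """Return "cross" for diagonal spawns, "close" for neighbouring ones.

    start_locations: the four (x, y) start positions of the map (hypothetical input).
    spawn_a, spawn_b: the two start positions actually used in the game.
    """
    if len(start_locations) != 4:
        raise ValueError("expected a 4-player map")
    cx = sum(x for x, _ in start_locations) / 4
    cy = sum(y for _, y in start_locations) / 4
    # Order the starts by their angle around the map centre.
    ordered = sorted(start_locations, key=lambda p: math.atan2(p[1] - cy, p[0] - cx))
    steps = abs(ordered.index(spawn_a) - ordered.index(spawn_b))
    # Two steps apart around the circle means the spawns face each other.
    return "cross" if steps == 2 else "close"

corners = [(10, 10), (118, 10), (118, 118), (10, 118)]
relation = spawn_relation(corners, (10, 10), (118, 118))
```

Ordering by angle rather than comparing raw distances keeps the labelling stable on maps that are not perfectly square.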
PvT
Zealot rushes are strong on close positions, as expected. Overall, Protoss seems to prefer diagonal spawns: the longer distance means it takes Terran longer to reach the Protoss base.
PvZ
Protoss feels a lot safer in the early game on diagonal spawns. Later in the game Protoss also seems to profit from diagonal spawns.
TvZ
Early Terran wins due to sunken busts are much more likely on close spawns. After that, for most of the game, the distance doesn't seem to matter at all.
Influence of effective APM
Results in brackets of effective APM.
Here we can see a few categories of players and the balance in their games. The lower-level group consists of games where both players are slower than roughly half of all players, that is, both have an effective APM below 150. Those are the red data points.
Then we have two groups where both players must be above a certain level: both above 150 in purple, and both above 180 in blue. These are the higher-level groups. Note that the 180+ group is a subset of the 150+ group, so we expect the two to be strongly correlated.
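The three groups can be written down as a small Python sketch. The function name and group labels are made up for illustration, and a game can land in more than one group because the 180+ group sits inside the 150+ group.

```python
def eapm_groups(eapm_a, eapm_b):
    """Return every effective-APM group a game belongs to (possibly none)."""
    groups = []
    if eapm_a < 150 and eapm_b < 150:
        groups.append("both below 150")   # red
    if eapm_a > 150 and eapm_b > 150:
        groups.append("both above 150")   # purple
    if eapm_a > 180 and eapm_b > 180:
        groups.append("both above 180")   # blue, a subset of purple
    return groups

mixed = eapm_groups(120, 200)  # one slow and one fast player
fast = eapm_groups(200, 190)
```

Games with one player on each side of the 150 threshold belong to no group and are left out of this comparison.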
PvT
Man wtf is happening at the 5 minute mark in the lower-level group. Is this the delayed Zealot rush hitting? Overall, slow Terrans get punished terribly by slow Protosses.
For the better groups, it seems like if Terran can survive to the 10th minute, they will have a comfortable 10 minutes of advantage to win the game. If they do not manage to win in that window, the game tilts towards Protoss again.
PvZ
Zealot rushes only work in the lower group. Hydras are deadly at all levels. It looks like the lower-group Zergs are worse at converting their late-game advantage into a win than the higher groups.
TvZ
Again, Terran profits the most from just "getting better". Looking at the 5-6 minute mark we nicely see that the higher groups gain their wins from sunken busting at an earlier time than the lower group. Terran dominance before defiler looks really scary in the higher groups.
Influence by level of play based on MMR.
Here, MMR data is used to examine the balance. A match-mmr is determined by taking the average MMR of the two players (initial MMR values before the match was played). Several brackets of games are defined and compared. Additionally, a limit is imposed on the maximum MMR difference between the players, to ensure that only games between players of comparable skill are taken into account. For example, in red are games with
- match-mmr below 1800, and an MMR difference between the two players below 200
The requirements are somewhat loosened as games of higher level are examined, as can be seen in the labels. Take note of the big error bars in some of the brackets in the later stages of the game; that is, don't draw too many conclusions from those data points.
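As a rough Python sketch, the red bracket and the size of an error bar could be computed like this. The bracket thresholds come from the example above. The binomial standard error is an assumption about how the error bars might be estimated, and all function names are hypothetical.

```python
import math

def match_mmr(mmr_a, mmr_b):
    """Average of the two players' pre-match MMR."""
    return (mmr_a + mmr_b) / 2

def in_red_bracket(mmr_a, mmr_b, max_match_mmr=1800, max_diff=200):
    """True if match-mmr is below 1800 and the players are within 200 MMR."""
    return match_mmr(mmr_a, mmr_b) < max_match_mmr and abs(mmr_a - mmr_b) < max_diff

def binomial_standard_error(wins, games):
    """Standard error of a win rate, treating each game as an independent trial.

    Assumption: this is one common estimate, not necessarily the one in the plots.
    """
    p = wins / games
    return math.sqrt(p * (1 - p) / games)

se_many = binomial_standard_error(50, 100)  # many games in the interval
se_few = binomial_standard_error(5, 10)     # same win rate, far fewer games
```

The same win rate measured on ten times fewer games gets a much wider error bar, which is why the late-game intervals are the least reliable.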
PvT
The game balance is even at an MMR above 1800 and tilts slowly towards Terran the higher we go. At the highest level the matchup is roughly 52% in Terran's favour.
PvZ
Protoss is behind in all brackets and gets dominated at the highest levels where they achieve an overall win rate of 46%.
TvZ
Again we see a clear trend where the matchup becomes more favoured towards Terran as player skill increases. It looks like high-level Zerg vs Terran is the hardest matchup in Brood War.