Lots of stats from the ASL/KSL era of pro Broodwar - Page 4

cheesehuehue

Vatican City State 90 Posts

November 17 2024 21:46 GMT

#61

Yeah I'm not reading whatever certain contrarian may have said, it's like talking with a wall.

On November 16 2024 21:41 RJBTVYOUTUBE wrote:
Because the sample size is so tiny, it's much more affected by individual player skill.

That shouldn't be a problem. @JackyVSO some suggestions that you might want to try to extract more information from the data:

- Instead of calculating a single PvZ (or ZvP) win rate, you could calculate the PvZ win rate for every Protoss player that has participated in ASL/KSL/SSL. Then average the win rate of all players. You can even use a bootstrap approach to calculate an empirical p-value and confidence interval, even with a small number of players.

For instance, you could resample each player's games with replacement (e.g. using 10k replicates). For each replicate you get an average PvZ win rate (the average of the PvZ win rates of all Protoss players), so you end up with 10k means (one per replicate). You can use the distribution of those means to calculate the confidence intervals.

Alternatively, you could use the same approach but resample matches rather than each player's games (i.e. perform the bootstrap on the pooled matches, not on specific players' matches). A third approach would be to bootstrap the players themselves (i.e. randomly leave some of the players out). It would be interesting to compare the estimates of the three approaches. If all bootstrap analyses lead to the same conclusion, then that would put the nail in the coffin. If they differ, the interpretation would be more complicated.

One caveat of the first approach is that you will probably have to run the same procedure for Zergs (ZvP win rate) and standardize the two win rates so that they sum to 100% (their sum could deviate from 100% due to unbalanced sample sizes). So for instance Adjusted_PvZ = PvZ / (PvZ + ZvP), and Adjusted_ZvP = ZvP / (PvZ + ZvP). You would need to calculate those adjusted win rates for every bootstrap replicate. Same for the other matchups.
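A minimal sketch of the three bootstrap schemes described above, using only the Python standard library. The player names and records are made up for illustration (they are not ASL/KSL/SSL data), and the helper names are hypothetical. Resampling players with replacement is one common way to "leave some players out"; treat it as an assumption about what was meant.

```python
import random
from statistics import mean

# Hypothetical PvZ records: player -> list of game results (1 = win, 0 = loss).
# Made-up numbers for illustration only.
records = {
    "protoss_a": [1] * 30 + [0] * 25,
    "protoss_b": [1] * 12 + [0] * 18,
    "protoss_c": [1] * 5 + [0] * 9,
    "protoss_d": [1] * 3 + [0] * 1,
}

def mean_of_player_rates(recs):
    """Average of each player's own win rate (every player counts equally)."""
    return mean(sum(g) / len(g) for g in recs.values())

def resample(items, rng):
    """Draw len(items) items with replacement."""
    return rng.choices(items, k=len(items))

def boot_within_players(recs, rng):
    """Approach 1: resample each player's games, then average the win rates."""
    return mean_of_player_rates({p: resample(g, rng) for p, g in recs.items()})

def boot_pooled_games(recs, rng):
    """Approach 2: pool all games, resample them, take the pooled win rate."""
    pooled = [r for g in recs.values() for r in g]
    return mean(resample(pooled, rng))

def boot_players(recs, rng):
    """Approach 3: resample whole players, keeping each one's games intact."""
    chosen = resample(list(recs), rng)
    return mean(sum(recs[p]) / len(recs[p]) for p in chosen)

def percentile_ci(stat, recs, replicates=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the given statistic."""
    rng = random.Random(seed)
    draws = sorted(stat(recs, rng) for _ in range(replicates))
    lo = draws[int(replicates * alpha / 2)]
    hi = draws[int(replicates * (1 - alpha / 2)) - 1]
    return lo, hi

def adjusted(pvz, zvp):
    """Rescale the two race-side estimates so that they sum to 1."""
    return pvz / (pvz + zvp), zvp / (pvz + zvp)

for name, stat in [("within players", boot_within_players),
                   ("pooled games", boot_pooled_games),
                   ("players", boot_players)]:
    lo, hi = percentile_ci(stat, records, replicates=2_000)
    print(f"{name}: 95% CI {lo:.3f} to {hi:.3f}")
```

With a real data set, `adjusted` would be applied inside each replicate to the Protoss-side and Zerg-side estimates before taking percentiles.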

On November 16 2024 21:41 RJBTVYOUTUBE wrote:
These stats are close enough to 50% to say the game is balanced.

No one, absolutely no one, has ever provided a systematic and unbiased* analysis that proves that the game is unbalanced. Every complaint about the game being imbalanced has been based on specific, anecdotal observations that were hand-picked.

And by systematic and unbiased I mean a strict analysis where the populations were declared ahead of time, where the inclusion and exclusion criteria (game-wise, and players-wise) were also declared ahead of time and applied equally to all races. Trying to restrict the inclusion of players of one race (Protoss) based on length of professional career, but not of other races is cherry picking: The selection criteria must be the same for all observations. And by populations I mean declaring ahead of time about whom the generalizations are being made: "All players that participated in ASL/KSL/SSL", "All players that participated in KCM", "All ladder players with a MMR above e.g. 2300", etc. Cherry picking specific observations to support an already made conclusion is nonsense.

Criticizing this analysis for only considering ASL/KSL/SSL is nonsensical. That's how the population was defined. The same analysis can be repeated in any other pre-defined population, and if all populations lead to the same conclusion then the debate is settled until anyone provides an analysis of another reasonably defined population with a different conclusion.

Kraekkling did an analysis of ladder games, defining different selection criteria (based on MMR and effective APM thresholds). The conclusions are similar: P>T>Z>P.
One thing that was missing from the ladder games analysis is that the win rates could also be calculated as the average of the win rates of players of each race. The analysis could also be restricted to main accounts playing their main race, with at least, say, X games during the last year and at least Y games in every month. The proportion of Protoss players at the top could also be compared to the proportion of Protoss players in different MMR bins, to see whether they become underrepresented (relative to the lower MMR bins) as the MMR threshold is increased, etc.
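The MMR-bin comparison suggested above could look roughly like this. The account list is made up for illustration and `race_share_by_threshold` is a hypothetical helper, not something from Kraekkling's analysis.

```python
from collections import Counter

# Hypothetical ladder snapshot: (main race, MMR) per account. Made-up values.
accounts = [
    ("P", 1850), ("Z", 1900), ("T", 1950), ("P", 2050),
    ("P", 2100), ("T", 2150), ("Z", 2200), ("T", 2320),
    ("Z", 2350), ("P", 2400), ("T", 2500), ("Z", 2550),
]

def race_share_by_threshold(accounts, thresholds):
    """For each MMR threshold, return each race's share of accounts at or above it."""
    shares = {}
    for t in thresholds:
        counts = Counter(race for race, mmr in accounts if mmr >= t)
        total = sum(counts.values())
        shares[t] = {race: counts[race] / total for race in "PTZ"} if total else {}
    return shares

for threshold, share in race_share_by_threshold(accounts, [1800, 2000, 2300]).items():
    print(threshold, {race: round(s, 2) for race, s in share.items()})
```

If a race's share shrinks steadily as the threshold goes up, that race is underrepresented at the top relative to the lower bins.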

TMNT

2734 Posts

November 18 2024 13:51 GMT

#62

That's a lot of fallacies in one post lol (once again, from the guy who always accuses others of using fallacies)

I doubt anyone here would strongly disagree with the notion of P>T>Z>P.

What the "Protoss whiners" are saying is P>T>>>Z>>P or something like that (the number of ">" in TvZ and ZvP relative to each other is up for debate though, while sometimes there are people arguing that P=T), but one thing for sure PvT is the least imba matchup among the three. At the top level mind. And it's not cherry picked observations. It's actually every time another evidence shows up, people are like 'here we go again'.

- Kespa stats from 2001-2012 (~35k games). It reads 52.1% PvT, 54.5% TvZ, 54.7% ZvP.

- Eloboard stats from 2021-2024 (~70k games). No Flash by the way. It reads 47.7% PvT, 55.1% TvZ, 52.3% ZvP.

- Sponbbang stats from 2017-2021 (which means with Flash) shows similar stats to eloboard by the way. The site is dead now but I believe there are screenshots of those stats we can even find right here in tl.

- 8 mil games (mostly ladder I think) analysis from Kraekkling. It says from 2k mmr and above only PvT win rate is less than 50%.

- Extracted data from KCM race wars since 2021 (which means no Flash) I posted in the previous page. 52.1% PvT, 52.6% TvZ, 54.3% ZvP.

The populations of those analyses are well defined, no? Not that I'm saying there is no flaws in the observed data, but that's the only thing we have to work with at the moment.
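To get a feel for how much a figure like 52.1% can move with sample size, here is a sketch of a Wilson score interval for a win rate. Only the 52.1% comes from the list above; the game counts are assumptions, because the totals quoted (~35k, ~70k) are across all matchups and the per-matchup counts are not given.

```python
from math import sqrt

def wilson_interval(win_rate, games, z=1.96):
    """Wilson score interval for a win rate observed over `games` games."""
    denom = 1 + z * z / games
    centre = (win_rate + z * z / (2 * games)) / denom
    half = (z / denom) * sqrt(
        win_rate * (1 - win_rate) / games + z * z / (4 * games * games)
    )
    return centre - half, centre + half

# 0.521 is the quoted Kespa PvT win rate; the game counts are assumed.
for games in (500, 5_000, 10_000):
    lo, hi = wilson_interval(0.521, games)
    print(f"{games:>6} games: {lo:.3f} to {hi:.3f}")
```

This treats games as independent, which is itself an approximation when a few players account for many of the games.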

It's funny because this:

Instead of calculating a single PvZ (or ZvP) win rate, you could calculate the PvZ win rate for every Protoss player that has participated in ASL/KSL/SSL. Then average the win rate of all players.

One thing that was missing from the ladder games analysis is that the win rates could also be calculated as the average of the win rates of players of each race

is very close to what I literally did on the previous page (except it's the accumulated win rate of the examined players, not the average of their individual win rates - but if we treat it the latter way, the same trend remains as well). But since this troll can't argue with that (because it contradicts his already made conclusion) he just pretends it doesn't exist and keeps spouting jargon.

- The only time the stats look different from the other observations is in this ASL/KSL/SSL thread where we have 53.4% PvT, 52.5% TvZ, 52% ZvP but as I already pointed out that's because of the combination of low sample size + uneven player pools for each race. Here's what happens if we standardize (lol) the data by comparing the accumulated win rates of the top 4, 8, 12 etc. players of each race (raw data in page 2):

Pls spare my poor excel skills. Note that only 12 Protosses have more than 10 PvT games in ASL/SSL/KSL and after that we have Noob and Brain who played a combined 5 games with a freaking win rate of 100% (hence the little bump from top 12 to top 16).

And the more shocking thing is, even if we remove Flash (best TvZ) and Soma (best ZvP) from the data, and keep all the Protoss players in, Protoss doesn't even come out on top if we look at the top 8 players: 56.9% PvT, 58.5% TvZ, 56.5% ZvP.
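A rough sketch of the top-N comparison, showing both the accumulated win rate and the average of individual win rates. The records are made up, and ranking players by games played is an assumption (the posts do not state the ranking rule).

```python
# Hypothetical PvT records per Protoss player: (name, wins, games). Made-up values.
protoss_pvt = [
    ("p1", 40, 70), ("p2", 33, 60), ("p3", 20, 45),
    ("p4", 14, 30), ("p5", 3, 3), ("p6", 2, 2),
]

def top_n(players, n):
    """Top n players, ranked by games played (an assumed ranking rule)."""
    return sorted(players, key=lambda p: p[2], reverse=True)[:n]

def accumulated_rate(players):
    """Total wins over total games: players with many games dominate."""
    return sum(w for _, w, _ in players) / sum(g for _, _, g in players)

def average_rate(players):
    """Mean of individual win rates: every player counts equally."""
    return sum(w / g for _, w, g in players) / len(players)

for n in (4, 6):
    group = top_n(protoss_pvt, n)
    print(n, round(accumulated_rate(group), 3), round(average_rate(group), 3))
```

Players with a handful of games and a 100% win rate barely move the accumulated figure but pull the average of individual rates up sharply, which is why the two methods can diverge on small pools.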

cheesehuehue

Vatican City State 90 Posts

November 18 2024 20:10 GMT

#63

On November 18 2024 22:51 TMNT wrote:
I doubt anyone here would strongly disagree with the notion of P>T>Z>P.

Then WTF are you yapping so much about??????

And you provide all the eloboard results as a single stat. Really, you didn't think of stratifying in any meaningful way? By elo, proleague, k-league, etc???? How dumb are you?????

TMNT

2734 Posts

November 18 2024 21:40 GMT

#64

On November 19 2024 05:10 cheesehuehue wrote:

Then WTF are you yapping so much about??????

Don't you know how to read? It's literally written in the very next sentences of my post:

"What the "Protoss whiners" are saying is P>T>>>Z>>P or something like that (the number of ">" in TvZ and ZvP relative to each other is up for debate though, while sometimes there are people arguing that P=T), but one thing for sure PvT is the least imba matchup among the three"

It's also a response to your original post:

However the advantage of Z over P is smaller than both the advantage of P over T, and the advantage of T over Z.

But people who whine about Protoss being the weakest race will simply reject the evidence and get angry for presenting FACTS to them. Unless you can support your whining with FACTS, SC is a balanced game. PERIOD.

Now please explain how come you had no problem announcing the part in bold with confidence after seeing a stat calculated with the same method as the other stats that I just presented and that everyone has been aware of? Suddenly there is no need for confidence intervals or anything when it supports your conclusion, eh?

Also it's funny because your first and second paragraph here kind of contradict each other lol. Obviously if Z>P>>T>>Z then Z is the weakest race. It's nice of you to not whine but the game would not be balanced in that case nonetheless.

And you provide all the eloboard results as a single stat. Really, you didn't think of stratifying in any meaningful way? By elo, proleague, k-league, etc???? How dumb are you?????

Because I'm not making an analysis while at the same time calling out people for not accepting my FACTS.

But maybe you're right. We should stratify it. So how come I haven't seen you demanding the ASL stats to be stratified into Ro24/16/8 games, or top 4/8/12 players for each race? Oh wait, I already did the latter for you without anyone asking, but you're still ignoring it, well well, because it doesn't support your conclusion.
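For what it's worth, stratifying a win rate by round is simple once each game is tagged. A sketch with made-up games and a hypothetical helper name:

```python
from collections import defaultdict

# Hypothetical PvT games: (stratum, protoss_won). Made-up values.
games = [
    ("Ro24", True), ("Ro24", False), ("Ro24", True),
    ("Ro16", False), ("Ro16", True),
    ("Ro8", False), ("Ro8", False), ("Ro8", True),
]

def stratified_rates(games):
    """Return {stratum: (win_rate, n_games)} for the tagged games."""
    tally = defaultdict(lambda: [0, 0])  # stratum -> [wins, games]
    for stratum, won in games:
        tally[stratum][0] += int(won)
        tally[stratum][1] += 1
    return {s: (wins / n, n) for s, (wins, n) in tally.items()}

for stratum, (rate, n) in stratified_rates(games).items():
    print(f"{stratum}: {rate:.1%} over {n} games")
```

Reporting the game count next to each rate matters, because every split makes the per-stratum samples smaller.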

RJBTVYOUTUBE

Netherlands 892 Posts

November 19 2024 06:08 GMT

#65

we should start analyzing data excluding outlier players from the data. top 7 for each race but exclude the top 1 and bottom 1 to reduce the influence an outlier has on the data pool. and use only eloboard for the data because it includes kcm and starleagues. exclude maps like troy, monty and minstrel that have lopsided winrates for specific match-ups. preferably use maps that are considered most balanced like Radeon, apocalypse etc.

TMNT

2734 Posts

November 19 2024 07:41 GMT

#66

On November 19 2024 15:08 RJBTVYOUTUBE wrote:
we should start analyzing data excluding outlier players from the data. top 7 for each race but exclude the top 1 and bottom 1 to reduce the influence an outlier has on the data pool. and use only eloboard for the data because it includes kcm and starleagues. exclude maps like troy, monty and minstrel that have lopsided winrates for specific match-ups. preferably use maps that are considered most balanced like Radeon, apocalypse etc.

That's one way. Although I'd add we should do both (with outliers and without outliers) and compare the results. The thing with BW statistics is no analysis can be definitive enough even if you manage to calculate p-values or confidence intervals, because you always start with observed data that has flaws. However, every stat can be suggestive. That's why I'm always of the mindset that if most pieces of information suggest the same thing, then that thing is probably true.
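Doing both and comparing could be as simple as the sketch below. The seven records are made up, and dropping the best and worst player by individual win rate is an assumption about how "top 1" and "bottom 1" would be picked.

```python
# Hypothetical (name, wins, games) for the top 7 players of one race
# in one matchup. Made-up values.
top7 = [
    ("a", 80, 100), ("b", 60, 100), ("c", 55, 100), ("d", 52, 100),
    ("e", 50, 100), ("f", 48, 100), ("g", 30, 100),
]

def pooled_rate(players):
    """Total wins over total games for the given players."""
    return sum(w for _, w, _ in players) / sum(g for _, _, g in players)

def trim_extremes(players):
    """Drop the best and the worst player by individual win rate (assumed rule)."""
    ranked = sorted(players, key=lambda p: p[1] / p[2])
    return ranked[1:-1]

print(f"with outliers:    {pooled_rate(top7):.3f}")
print(f"without outliers: {pooled_rate(trim_extremes(top7)):.3f}")
```

If the two figures tell the same story, the outliers are not driving the result; if they differ a lot, the headline number mostly reflects one or two players.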

The notion that P is the weakest race at the top level because PvT is not as favourable as TvZ or ZvP is neither new nor something I invented/pre-concluded myself. It has been analysed by others and from many years ago. For example here and here. We saw the same pattern in Kespa as we're seeing now. There is simply too much smoke for it to not be a fire.

M3t4PhYzX

Poland 4196 Posts

November 19 2024 10:02 GMT

#67

On November 18 2024 22:51 TMNT wrote:
That's a lot of fallacies in one post lol (once again, from the guy who always accuses others of using fallacies)

I doubt anyone here would strongly disagree with the notion of P>T>Z>P.

What the "Protoss whiners" are saying is P>T>>>Z>>P or something like that (the number of ">" in TvZ and ZvP relative to each other is up for debate though, while sometimes there are people arguing that P=T), but one thing for sure PvT is the least imba matchup among the three. At the top level mind. And it's not cherry picked observations. It's actually every time another evidence shows up, people are like 'here we go again'.

- Kespa stats from 2001-2012 (~35k games). It reads 52.1% PvT, 54.5% TvZ, 54.7% ZvP.

- Eloboard stats from 2021-2024 (~70k games). No Flash by the way. It reads 47.7% PvT, 55.1% TvZ, 52.3% ZvP.

- Sponbbang stats from 2017-2021 (which means with Flash) shows similar stats to eloboard by the way. The site is dead now but I believe there are screenshots of those stats we can even find right here in tl.

- 8 mil games (mostly ladder I think) analysis from Kraekkling. It says from 2k mmr and above only PvT win rate is less than 50%.

- Extracted data from KCM race wars since 2021 (which means no Flash) I posted in the previous page. 52.1% PvT, 52.6% TvZ, 54.3% ZvP.

The populations of those analyses are well defined, no? Not that I'm saying there is no flaws in the observed data, but that's the only thing we have to work with at the moment.

It's funny because this:

Some great stats right there. Thanks a lot.
