On November 16 2024 21:41 RJBTVYOUTUBE wrote:
Because the sample size is so tiny it's much more affected by player individual skill.
That shouldn't be a problem. @JackyVSO some suggestions that you might want to try to extract more information from the data:
- Instead of calculating a single PvZ (or ZvP) win rate, you could calculate the PvZ win rate for every Protoss player that has participated in ASL/KSL/SSL, then average the win rates of all players. You can even use a bootstrap approach to calculate an empirical p-value and confidence interval, even with a small number of players. For instance, you could run a bootstrap resampling (e.g. using 10k replicates) of each player's matches. For each replicate you get an average PvZ (the average of the PvZ win rates of all Protoss players), so you end up with 10k means (one per replicate). You can use the distribution of those means to calculate the confidence intervals. Alternatively, you could use the same approach but resample matches rather than players (i.e. perform the bootstrap on the pooled matches, not on specific players' matches). A third approach would be to bootstrap the players themselves (i.e. randomly leave some of the players out). It would be interesting to compare the estimates of the three approaches. If all bootstrap analyses lead to the same conclusion, that would put a nail in the coffin. If they differ, the interpretation would be more complicated. One caveat of the first approach is that you will probably have to run the same procedure for Zergs (ZvP win rate) and standardize the two win rates so that they sum to 100% (their sum could deviate from 100% because the sample sizes are unbalanced). So for instance Adjusted_PvZ = PvZ / (PvZ + ZvP), and Adjusted_ZvP = ZvP / (PvZ + ZvP). You would need to calculate those adjusted win rates for every bootstrap replicate. Same for the other matchups.
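A rough sketch of the three bootstrap variants described above, in Python with only the standard library. The player labels and win/loss counts are made up purely for illustration, the function names are my own, and the percentile interval is just one common way to read a confidence interval off the replicates. Resampling players with replacement is used here as a stand-in for "randomly leaving some players out".

```python
import random

# Hypothetical data: (wins, losses) in PvZ for each Protoss player.
PVZ = {"P1": (12, 10), "P2": (7, 9), "P3": (3, 5), "P4": (20, 14)}

def outcomes(wins, losses):
    """Expand a win/loss record into a list of 1s (wins) and 0s (losses)."""
    return [1] * wins + [0] * losses

def mean(xs):
    return sum(xs) / len(xs)

def within_player(records, rng):
    """Approach 1: resample each player's matches, then average the player win rates."""
    rates = []
    for wins, losses in records.values():
        games = outcomes(wins, losses)
        rates.append(mean(rng.choices(games, k=len(games))))
    return mean(rates)

def pooled_matches(records, rng):
    """Approach 2: resample matches from the pool, ignoring who played them."""
    pool = [g for w, l in records.values() for g in outcomes(w, l)]
    return mean(rng.choices(pool, k=len(pool)))

def over_players(records, rng):
    """Approach 3: resample players (with replacement), keeping their observed win rates."""
    players = list(records)
    sample = rng.choices(players, k=len(players))
    return mean([records[p][0] / sum(records[p]) for p in sample])

def bootstrap_ci(stat, records, n_rep=10_000, alpha=0.05, seed=0):
    """Percentile confidence interval from n_rep bootstrap replicates of `stat`."""
    rng = random.Random(seed)
    reps = sorted(stat(records, rng) for _ in range(n_rep))
    lo = reps[int(n_rep * alpha / 2)]
    hi = reps[int(n_rep * (1 - alpha / 2)) - 1]
    return lo, hi

def adjusted(pvz, zvp):
    """Rescale the two win rates so that they sum to 1 (i.e. 100%)."""
    total = pvz + zvp
    return pvz / total, zvp / total

if __name__ == "__main__":
    for stat in (within_player, pooled_matches, over_players):
        print(stat.__name__, bootstrap_ci(stat, PVZ))
```

To apply the adjustment per replicate, you would compute the ZvP statistic on the Zerg records inside the same loop and pass both values through `adjusted` before storing the replicate.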
On November 16 2024 21:41 RJBTVYOUTUBE wrote:
These stats are close enough to 50% to say the game is balanced.
No one, absolutely no one, has ever provided a systematic and unbiased* analysis that proves that the game is imbalanced. Every complaint about imbalance has been based on specific, hand-picked anecdotal observations.
*By systematic and unbiased I mean a strict analysis where the populations were declared ahead of time, and where the inclusion and exclusion criteria (for both games and players) were also declared ahead of time and applied equally to all races. Restricting the inclusion of players of one race (Protoss) based on length of professional career, but not of the other races, is cherry picking: the selection criteria must be the same for all observations. And by populations I mean declaring ahead of time whom the generalizations are about: "All players that participated in ASL/KSL/SSL", "All players that participated in KCM", "All ladder players with an MMR above e.g. 2300", etc. Cherry picking specific observations to support a conclusion that has already been made is nonsense.
Criticizing this analysis for only considering ASL/KSL/SSL is nonsensical. That's how the population was defined. The same analysis can be repeated in any other pre-defined population, and if all populations lead to the same conclusion then the debate is settled until someone provides an analysis of another reasonably defined population that reaches a different conclusion.
Kraekkling did an analysis of ladder games, defining different selection criteria (based on MMR and effective APM thresholds). The conclusions are similar: P>T>Z>P.
One thing that was missing from the ladder games analysis is that the win rates could also be calculated as the average of the win rates of the players of each race. The analysis could also be restricted to main accounts/main race only, with a minimum number of games during the last year (say X) and a minimum number of games in every month. The proportion of Protoss players at the top could also be compared to the proportion of Protoss players in different MMR bins, to see whether they become underrepresented (relative to the lower MMR bins) as the MMR threshold is increased, and so on.
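The MMR bin comparison could look something like the sketch below. The player list, the bin edges, and the function name are all invented for illustration; real ladder data would first need the main account/main race and activity filters applied.

```python
from collections import Counter, defaultdict

def race_share_by_bin(players, race, edges):
    """Share of `race` within each MMR bin.

    players: iterable of (mmr, race) pairs.
    edges: lower bounds of the bins; a player falls in the highest bin
           whose lower bound does not exceed their MMR.
    """
    counts = defaultdict(Counter)
    for mmr, r in players:
        lows = [e for e in edges if mmr >= e]
        if lows:  # players below the lowest edge are ignored
            counts[max(lows)][r] += 1
    return {low: counts[low][race] / sum(counts[low].values())
            for low in sorted(counts)}

# Hypothetical ladder snapshot: (MMR, race).
LADDER = [
    (1800, "P"), (1850, "T"), (1900, "Z"), (1950, "P"),
    (2100, "P"), (2150, "T"), (2200, "Z"), (2250, "Z"),
    (2350, "T"), (2400, "T"), (2450, "Z"), (2500, "P"),
]

if __name__ == "__main__":
    for low, share in race_share_by_bin(LADDER, "P", [1800, 2050, 2300]).items():
        print(f"MMR >= {low}: Protoss share {share:.0%}")
```

If the Protoss share drops steadily from the lower bins to the higher ones, that would be the underrepresentation pattern to look at more closely, ideally with a confidence interval per bin.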