|
Following the publication of ladder game data analysis in Kraekkling's thread, here is an extensive analysis of data from all Starleague (ASL/KSL) games played between 2016 and 2023.
The analysis is based on a new, manually compiled dataset of the 1,906 on-stage games of ASL seasons 1-16 and KSL seasons 1-4. The findings relate to many different aspects including matchup dynamics (by year, game duration, spawn locations etc.), player rankings (winrates, Elo), map balance rankings, as well as sundry other enquiries including but not limited to testing out the famous "If I win game 1, I'll win the whole series" claim.
I have written up the findings on a web page with a lot of graphs and tables. Check it out in the link below:
All the stats from the ASL/KSL era
|
While I appreciate the analysis, I think the limitation of sample size and more importantly, the cup format of ASL and KSL, make the methodology a bit flawed, which leads to some misleading conclusions.
For example, ain't no way Sylphid is the most balanced map ever made. If anything it's one of the worst. You can check its stats (with number of games in the range of thousands) on eloboard and see.
The reason: with 141 games as a total sample size, excluding the mirror matchups maybe we have like 100 games left (just hypothetically), that leaves us with 33-34 games per matchup which means you have a ratio of 17:17 for a totally balanced map (at 50%) and only a 3-game swing makes it become a totally unbalanced map with 40% win rate for one race. And that's for Sylphid with a total games of 141. As for maps like Vermeer with 46 games, I can only imagine you have a sample size of 10 for each matchup.
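A rough sketch of that point in Python, assuming the hypothetical 34 games per matchup from above (the function name and the 14-20 cutoff are just for illustration, not numbers from the dataset). It asks how often a perfectly balanced matchup would still end up 14-20 or more lopsided purely by chance:

```python
from math import comb

def prob_at_most(k, n, p=0.5):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

n = 34    # hypothetical games in one non-mirror matchup on one map
low = 14  # a 14-20 record, i.e. a 3-game swing away from 17-17

# Chance that a truly 50/50 matchup ends 14-20 or worse for either race
# (two-sided; the distribution is symmetric when p = 0.5).
p_lopsided = 2 * prob_at_most(low, n)

print(f"{low}/{n} = {low / n:.1%}")
print(f"P(at least that lopsided | true 50/50) = {p_lopsided:.2f}")
```

The probability comes out at roughly 0.4, so a 41% win rate over 34 games is entirely compatible with a balanced map.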
The second problem is the cup format which, combined with the sample size, skew the win rate of players in every direction. This is because the quality of players in the Ro24 is very different to Ro16, let alone Ro8, 4 and finals. For example, let's say a player has a record of 8-4 before the semifinals, but then loses the semi 0-4 (to Flash for example). He will end up with a win rate of 50%, which is not better than a player with a record of 2-1 in Ro24 and 1-2 in Ro16.
You would think it will eventually even out after 20 seasons, but clearly not enough. That's why you end up with stuff like:
- Mong and Sea are better TvP players than Light and Last? And Last at 7th place nonetheless.
- Ample is the 5th best TvZ player in ASL+KSL history?
- Jaehoon and Horang2 are better PvT players than Bisu and... Rain?
- Best is a better PvZ player than Bisu and Mini?
- Sacscri as the 5th best ZvT player?
- Queen only at 8th spot in the ZvP table? (probably because he lost 2-8 to Mini in two semifinals which makes his stats look worse)
Note that you don't see those irregularities in the ranking for 3 mirror matchups? That's quite self explanatory.
|
Great work! Nicely formatted, and some interesting stats to explore there. Thank you for sharing
|
|
On October 24 2023 09:38 TMNT wrote: While I appreciate the analysis, I think the limitation of sample size and more importantly, the cup format of ASL and KSL, make the methodology a bit flawed, which leads to some misleading conclusions.
For example, ain't no way Sylphid is the most balanced map ever made. If anything it's one of the worst. You can check its stats (with number of games in the range of thousands) on eloboard and see.
The reason: with 141 games as a total sample size, excluding the mirror matchups maybe we have like 100 games left (just hypothetically), that leaves us with 33-34 games per matchup which means you have a ratio of 17:17 for a totally balanced map (at 50%) and only a 3-game swing makes it become a totally unbalanced map with 40% win rate for one race. And that's for Sylphid with a total games of 141. As for maps like Vermeer with 46 games, I can only imagine you have a sample size of 10 for each matchup.
The second problem is the cup format which, combined with the sample size, skew the win rate of players in every direction. This is because the quality of players in the Ro24 is very different to Ro16, let alone Ro8, 4 and finals. For example, let's say a player has a record of 8-4 before the semifinals, but then loses the semi 0-4 (to Flash for example). He will end up with a win rate of 50%, which is not better than a player with a record of 2-1 in Ro24 and 1-2 in Ro16.
You would think it will eventually even out after 20 seasons, but clearly not enough. That's why you end up with stuff like:
- Mong and Sea are better TvP players than Light and Last? And Last at 7th place nonetheless.
- Ample is the 5th best TvZ player in ASL+KSL history?
- Jaehoon and Horang2 are better PvT players than Bisu and... Rain?
- Best is a better PvZ player than Bisu and Mini?
- Sacscri as the 5th best ZvT player?
- Queen only at 8th spot in the ZvP table? (probably because he lost 2-8 to Mini in two semifinals which makes his stats look worse)
Note that you don't see those irregularities in the ranking for 3 mirror matchups? That's quite self explanatory.
I agree with this, especially that historical sum total winrates don't translate to a skill ranking. In the case of Sea, he was one of the strongest Terrans in 2017 but he's very far from that now. The total winrates obviously don't pay any attention to time. The Elo ranking does, though, so that should give a better picture of who is stronger now.
But yes, the sample size means that most of the stats can't be interpreted as evidence of strong underlying trends, and some of them are nothing but hints. But the stats are what they are, and it's up to each of us what we will make of them.
About "- Ample is the 5th best TvZ player in ASL+KSL history?" and "- Jaehoon and Horang2 are better PvT players than Bisu and... Rain?"
Well, they have a higher winrate in those matchups. I can't measure whether they are "better" in general, and my gut instinct is definitely to agree with you that they are probably not. Maybe it's useful to think of the winrate rankings more as a scoreboard than as an accurate reflection of skill levels.
|
Good work! One request - instead of measuring how "fast" or "slow" a player is by their game length, can you only count the duration of games they won?
|
Wow this looks like it took a lot of work and you've formatted it well too! Thanks for your time and Effort
|
this is really amazing. thank you. BW is extremely balanced
|
|
On October 24 2023 16:21 Zealgoon wrote: Good work! One request - instead of measuring how "fast" or "slow" a player is by their game length, can you only count the duration of games they won?
Interesting idea. I'll note that down and let you know when I've had time to look into it. You can see the winrates of a few selected players divided into time intervals in the article, which I guess is kind of close to what you're asking. The players selected are the ones with high variance between the different intervals.
Edit: actually this was really easy to look up by modifying the other script very slightly. This is what the table of average win duration by player looks like.
|
Amazing work. Read through all of it, really interesting, but not really surprising :D
|
Well done. It must have been a huge amount of work to collect data on game length and spawn locations by hand from 2 thousand afreeca vods, huge respect for your determination!
As discussed in the thread, I agree that we should be careful not to draw too many conclusions from stats which are based on a small sample of games, like for example balance stats for some of the maps.
Overall I think there are many valuable insights here. In particular, to me it was remarkably interesting to see how much Terran seems to rely on close spawns on 4-player maps.
|
On October 24 2023 21:35 Kraekkling wrote: Well done. It must have been a huge amount of work to collect data on game length and spawn locations by hand from 2 thousand afreeca vods, huge respect for your determination!
As discussed in the thread, I agree that we should be careful not to draw too many conclusions from stats which are based on a small sample of games, like for example balance stats for some of the maps.
Overall I think there are many valuable insights here. In particular, to me it was remarkably interesting to see how much Terran seems to rely on close spawns on 4-player maps.
Thank you. Yeah, I created an interface for putting in the data and then just filled it in little by little at times when I wasn't doing anything else important, over a year or so. I'm sure there was a smarter approach 😂 But hey, it worked out.
|
Rain with an 84% PvP win rate is just crazy to me. It's also worth noting that TvT and PvP are definitely more "skilled" mirror matchups, while ZvZ, with not one player having over a 55% win rate, is definitely a coin flip matchup with the most RNG from build order wins. Sad to see Jaedong, after whom we used to nickname the matchup JvZ, at only 33% WR, which is likely due to him being older and no longer able to rely on his insane micro to level the game after a build order loss.
Also, it seems that 3 player maps are the most balanced overall for all 3 races (PvZ being the most imbalanced). This might be because 3p maps are just better for every race, or because of the RNG of spawn locations evening out the matchups (unlike 4p maps where cross spawns give no one a true advantage, 3p maps will always have someone spawn in a "favorable" position).
Really cool post overall, but I do agree with small sample sizes that some of the conclusions can't be totally drawn out from the games.
|
Average game length to me was very interesting. Fun to see most zergs average game length under 13 minutes
|
The pies and half donuts look tasty! ; D
I think we should not forget the finals either (where it really matters in terms of cash)
Terrans have won as much as the other 2 races COMBINED!!!!!!!!!!! T = Z + P ( T 10W = Z 6W + P 4W)
In terms of raw first place cash won T>Z+P
Terran: $ 613,466 (ASL $ 474,347) > Z+P Total: $ 549,839
Zerg: $ 338,040 (ASL $ 267,026)
Protoss: $ 211,799 (ASL $ 143,911)
*according to liquipedia and my calculations excluding VANT36.5 National Starleague and HungryApp Starz League with Kongdoo who are literally NOT named ASL or KSL (1 Z and 1 T victories there).
|
On October 25 2023 03:08 LUCKY_NOOB wrote: The pies and half donuts look tasty! ; D
I think we should not forget the finals either (where it really matters in terms of cash)
Terrans have won as much as the other 2 races COMBINED!!!!!!!!!!! T = Z + P ( T 10W = Z 6W + P 4W)
In terms of raw first place cash won T>Z+P
Terran: $ 613,466 (ASL $ 474,347) > Z+P Total: $ 549,839
Zerg: $ 338,040 (ASL $ 267,026)
Protoss: $ 211,799 (ASL $ 143,911)
*according to liquipedia and my calculations excluding VANT36.5 National Starleague and HungryApp Starz League with Kongdoo who are literally NOT named ASL or KSL (1 Z and 1 T victories there).
I wish this was a good way to see how terrans perform, but ASL seasons do not have the same payout each season. The majority pay around 22k; however, ASL4, which FlaSh won, paid out 52,650.
|
I totally see TMNT's point about sample size but it is a good indication about which players do better or worse in certain MUs during ASL. For example I would normally favor Light over any Zerg in Proleague or KCM as he's basically the strongest general TvZer but he seems to do worse in offline planned matches. Snow has been the strongest general Protoss player for the last year but has really struggled to get out of group stages in ASL for the last few seasons.
Also damn yea flash really was GOAT!
|
On October 24 2023 19:16 JackyVSO wrote:
On October 24 2023 16:21 Zealgoon wrote: Good work! One request - instead of measuring how "fast" or "slow" a player is by their game length, can you only count the duration of games they won?
Interesting idea. I'll note that down and let you know when I've had time to look into it. You can see the winrates of a few selected players divided into time intervals in the article, which I guess is kind of close to what you're asking. The players selected are the ones with high variance between the different intervals.
Edit: actually this was really easy to look up by modifying the other script very slightly. This is what the table of average win duration by player looks like.
Thanks for the quick reply. Free is looking a lot less slow now
|
"Reverse sweeps
The last series stat I’ve registered is the probability of reverse sweeps. In 197 Best-of-5s, we have seen six reverse sweeps. The probability of making a reverse sweep in a Best-of-5 (i.e. the probability that if you’re down 0-2, you go on to win the series) stands at 6%."
Maybe im confused, but how can 6/197 be 6%? Shouldn't it be around 3%? Or is it that only 100 games ended up being a 0-2 to start with. That was the only part that confused me here
|
Loving these stats, the timing graphs in particular are rly cool, thx for your time and effort. Is it possible to have the winrate breakdown by maps as well?
A lot of these maps are very anti Terran cus of Flash so that'll skew the winrates by a big margin, it'll be interesting to see winrates on the most standard maps. For example season 5 had insanely P favored maps, the top Terran players still end up figuring things out but life becomes way harder on the lower end Terrans.
Another interesting stat to see is the winrates based on spawn locations, specifically left side spawns vs right side spawns on older maps. The older maps had bad mineral formations on the left side (non L formations) so it would be interesting to see winrate difference between the old maps and the new maps. If it's too much work to compile that info then forget about it :D.
|
On October 25 2023 19:24 Highwinds wrote: "Reverse sweeps
The last series stat I’ve registered is the probability of reverse sweeps. In 197 Best-of-5s, we have seen six reverse sweeps. The probability of making a reverse sweep in a Best-of-5 (i.e. the probability that if you’re down 0-2, you go on to win the series) stands at 6%."
Maybe im confused, but how can 6/197 be 6%? Shouldn't it be around 3%? Or is it that only 100 games ended up being a 0-2 to start with. That was the only part that confused me here
It's as you said. I wanted to calculate the probability of a reverse sweep happening once the score is 0-2 so all series that started out 1-1 have been discounted.
So out of the times one player started out 2-0 in a Best-of-5, 6% of those times, the other player then won the last three matches, whereas in 94% of cases, the player who started out 2-0 also won the series.
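In code terms, the difference between the two denominators looks roughly like this. It is only a sketch on made-up toy series, not the real dataset or the script behind the page, and the function name and the "A"/"B" string encoding are invented for the example:

```python
from collections import Counter

def reverse_sweep_stats(series_list):
    """Each series is a string of game winners, e.g. "AABBB".

    Returns (reverse_sweeps, series_that_started_2_0).
    """
    started_2_0 = 0
    reverse_sweeps = 0
    for games in series_list:
        if games[0] != games[1]:
            continue  # 1-1 start: left out of the denominator
        started_2_0 += 1
        # In a finished Best-of-5 the series winner is the only side with 3 wins.
        series_winner = Counter(games).most_common(1)[0][0]
        if series_winner != games[0]:
            reverse_sweeps += 1
    return reverse_sweeps, started_2_0

# Toy data: six finished Best-of-5 series, one of which started 1-1.
toy = ["AAA", "AABA", "AABBB", "ABAA", "BBAAA", "BBB"]
sweeps, eligible = reverse_sweep_stats(toy)
print(f"{sweeps} reverse sweeps out of {eligible} series that started 2-0")

# Dividing by ALL Best-of-5s instead gives the ~3% figure from the question.
print(f"6 / 197 = {6 / 197:.1%}")
```

The 6% on the page uses the smaller denominator (only series that started 2-0), which is why it is about double the 6/197 figure.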
|
On October 26 2023 00:32 TT1 wrote: Is it possible to have the winrate breakdown by maps as well?
Glad you enjoyed the stats! The map balance table has the winrate for each matchup for each of the 13 most played maps. If there are any other map/winrate stats you're interested in, they shouldn't be too hard for me to look up.
Another interesting stat to see is the winrates based on spawn locations, specifically left side spawns vs right side spawns on older maps. The older maps had bad mineral formations on the left side (non L formations) so it would be interesting to see winrate difference between the old maps and the new maps. If it's too much work to compile that info then forget about it :D.
I don't have that information, unfortunately! I have only registered whether it was cross spawns or not...
|
Wow this was amazing, thanks for the work!
|
On October 26 2023 00:32 TT1 wrote: Loving these stats, the timing graphs in particular are rly cool, thx for your time and effort. Is it possible to have the winrate breakdown by maps as well?
A lot of these maps are very anti Terran cus of Flash so that'll skew the winrates by a big margin, it'll be interesting to see winrates on the most standard maps. For example season 5 had insanely P favored maps, the top Terran players still end up figuring things out but life becomes way harder on the lower end Terrans.
Another interesting stat to see is the winrates based on spawn locations, specifically left side spawns vs right side spawns on older maps. The older maps had bad mineral formations on the left side (non L formations) so it would be interesting to see winrate difference between the old maps and the new maps. If it's too much work to compile that info then forget about it :D.
Oh yeah I remember season 5, the map makers went all out to stop Flash from winning again lmao.
|
Talking about unexpected conclusions due to low sample size, Rain actually was pretty meh at PvT though. I only remember seeing brainless macro hoping to overwhelm, or zealot rushes beating fast expands.
|
I actually really enjoyed reading through that. Thank you.
|
BW is perfection. that was really awesome, thanks.
|
On October 25 2023 03:08 LUCKY_NOOB wrote: The pies and half donuts look tasty! ; D
I think we should not forget the finals either (where it really matters in terms of cash)
Terrans have won as much as the other 2 races COMBINED!!!!!!!!!!! T = Z + P ( T 10W = Z 6W + P 4W)
In terms of raw first place cash won T>Z+P
Terran: $ 613,466 (ASL $ 474,347) > Z+P Total: $ 549,839
Zerg: $ 338,040 (ASL $ 267,026)
Protoss: $ 211,799 (ASL $ 143,911)
*according to liquipedia and my calculations excluding VANT36.5 National Starleague and HungryApp Starz League with Kongdoo who are literally NOT named ASL or KSL (1 Z and 1 T victories there).
both HungryApp and VANT36.5 Starleagues should count as an ASL level event, clearly.
thx for these stats, anyway
|
thanks for this really cool I love stats
also FlaSh is clearly OP
|
The stats page has been updated following ASL 17 and 18 (SSL 1). The analysis now covers 2,107 games in total.
Soulkey now tops the Elo chart by a huge margin. Flash's return is looking more interesting than ever.
Lots of stats from the ASL/KSL era of Starcraft 1
|
I missed this the first time around. Very cool. Great work!
Imo another interesting thing would be to see who's best at reverse sweep or worst at it from the top 10 best performers (so we don't see a presumable bottom dweller score worst).
|
About seeding: you claim it gives an advantage, but did you take into account the player's overall strength?
I.e. If a player can consistently get into ro8, with seeding (or is more likely), is he also a stronger player on average, or not?
|
Very cool data and analysis.
Protoss with the overall highest winrate across KSL/ASL/SSL
As is well known, SC is pretty much a rock-paper-scissors kind of game. However the advantage of Z over P is smaller than both the advantage of P over T, and the advantage of T over Z.
But people who whine about Protoss being the weakest race will simply reject the evidence and get angry for presenting FACTS to them. Unless you can support your whining with FACTS, SC is a balanced game. PERIOD.
|
That's a lot of CAPS for something not period at all lol. It's like flipping a coin 10 times, getting heads 7 and jumping to the conclusion that this coin is not balanced. Yes 7 out of 10 is a fact but the conclusion you're jumping to is not.
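To put a number on the coin example, here is a quick sketch (plain binomial arithmetic, nothing specific to the SL data) of how often a fair coin gives 7 or more heads in 10 flips:

```python
from math import comb

n, heads = 10, 7

# Count the outcomes with 7, 8, 9 or 10 heads out of all 2**10 equally likely sequences.
p_at_least = sum(comb(n, k) for k in range(heads, n + 1)) / 2**n

print(p_at_least)  # 0.171875
```

So a perfectly fair coin does this about 17% of the time, which is nowhere near rare enough to call the coin unbalanced.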
---------------------------------------------------------------
Why? Because op doesn't treat the data properly. Sample size is the obvious problem. But he also doesn't realize the structure of the SLs and players pool can make the data a bit lopsided. The reason Protoss looks best in terms of SL win rate here is because their players pool is the most top-heavy. In details:
Right in the analysis, we see there are 80 players who have ever played in SL, but the distribution is 24 Protoss, 27 Terran, 29 Zerg. Then if you look a bit deeper into the top 45 players by win rate, we have 12 Protoss, 16 Terran, 17 Zerg.
Why does this cause a slight "inflation" in win rate for Protoss? Because the top 12 Protoss players get to play the top 12 Terran players, plus another 4 Terran players who are weaker than the first 12. Same for PvZ. If that's not clear, imagine Snow + Bisu playing Flash + Light for 100 games. Compare that to Snow + Bisu playing Flash + Light + Royal + JYJ, also for 100 games. Surely the win rate for Protoss in the second case would be higher. The impact is obviously smaller when you have 12 Protoss playing 16 Terran, instead of 2 vs 4, but there is an impact nonetheless.
In reality we obviously can't have the ideal situation of top 10 players of a race playing only the top 10 players of another race. But we can still look at the combined individual win rates to find a trend. Take the pair PvZ for example (totally arbitrary, dont have time for PvT and TvZ, others feel free to try). The global win rate in PvZ is 48% for Protoss or 52% for Zerg. But if we only look at the top players of each race (just do some copy paste to an excel sheet and work it out):
- The top 9 Protoss in the PvZ table have a combined record of 238 wins/464 games --> 51.29%.
- In contrast the top 9 Zerg have a combined record of 242/418 --> 57.9%.
- But the top 13 Zerg have a combined record of 261/481 --> 54.26%
Now you see how the extra 4 players drag the ZvP win rate down (or bump the PvZ win rate up for the ). But to be honest we don't even need data to know that the more lower ranked players we include in the calculation, the lower the win rate of the race becomes.
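The same arithmetic as a small sketch, using only the combined records quoted above. The variable names are made up, and the race labels follow my reading of the PvZ table:

```python
def pct(wins, games):
    """Win rate as a percentage."""
    return 100 * wins / games

top9_protoss = (238, 464)  # wins, games in PvZ
top9_zerg = (242, 418)     # wins, games in ZvP
top13_zerg = (261, 481)    # same table, extended by 4 more players

# What players 10-13 add on their own: the difference between the two pools.
extra_wins = top13_zerg[0] - top9_zerg[0]
extra_games = top13_zerg[1] - top9_zerg[1]

print(f"top 9 Protoss:     {pct(*top9_protoss):.2f}%")            # 51.29%
print(f"top 9 Zerg:        {pct(*top9_zerg):.2f}%")               # 57.89%
print(f"top 13 Zerg:       {pct(*top13_zerg):.2f}%")              # 54.26%
print(f"Zerg ranked 10-13: {extra_wins}/{extra_games} = "
      f"{pct(extra_wins, extra_games):.2f}%")                     # 19/63 = 30.16%
```

Those 4 extra players went 19-44 between them, and that alone pulls the combined ZvP win rate down by more than 3.5 points.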
So a fairer way to look at balance (when you have to work with this limited sample size) is to look at only the games of the top X players of each race. Let's say top 8. Why 8? To be honest you could do the same with 4 5 6 7 9 10 and the results would probably be the same. But I choose 8 here because we have the Ro8 in SL and also I think the undoubted top 8 Zerg of the modern era are Soulkey Hero Effort JD Queen Soma Action Larva, so the numbers of Terran and Protoss players should correspond.
Based on the global win rate of each player, the top 8 for Terran are Flash Last Light Sharp Rush Royal JYJ Mind and for Protoss are Snow Mini Rain Best Bisu Stork Shuttle Horang2 (actually the 8th for Protoss is Jaehoon but I replaced him with Horang2 the 9th because apparently Jaehoon didn't play any PvZ???), and the top 8 Zerg are as I mentioned above.
Extracting the data from Jacky's tables, we get this:
Okay now we can have a glimpse of the trend that we have always seen in sponbbang/eloboard. Each race has a strong and weak matchup, and overall Protoss seems to be the weakest.
On an unrelated note, you can also pick interesting tidbits from the table above. One I noticed is Best has the highest win rate in non-mirror matchups of any Protoss. Not Rain, not Mini. My man Best. In fact his win rate in non-mirror matchups is only behind the legends of the game Flash, Soulkey and Effort. You'd think he had won a title or at least been a multiple-time runner-up. But no, here's a guy who only got to the semi-finals twice and choked out of Ro8/16/24 regularly. That's partly because of his abysmal PvP, partly because he has played in all the SLs and probably regularly beaten lower ranked players in Ro24/16. (Edit: actually JYJ is also higher than him, his stats probably boosted by the one season wonder ASL15)
You can also do another version of the top 8 tables, but this time the top 8 is not based on their global win rates (including mirror matchups), but rather the specific matchups, like this:
This looks even closer to the sponbbang/eloboard trend that we have normally always seen ( > > ). And once again, the strong matchup for Protoss is not as strong as those of Terran and Zerg, while the weak matchup for Protoss is also the weakest of any non-mirror matchups.
In conclusion, there are probably still some flaws in the above analysis from me (can't avoid given the whole dataset we have to work with has intrinsic problems), but it's certainly less flawed than the basic analysis from Jacky, as I also pointed out some in post #2 of this thread.
|
As always you trying to pick on me with the dumbest logic. The problem is that you think you are smart, but you are not. You just tire people out until they tap out and say "let that guy think he is right, he will keep arguing until you stop replying", as many people have told me over PMs.
On November 12 2024 01:42 TMNT wrote: It's like flipping a coin 10 times
Here is an example of your patronizing tone, as usual. The funny thing is you use a patronizing tone to state something 1) basic (sample size) as if you were illuminated and the only one to know it (like a chimp being proud of understanding what a square is), and 2) that is not true in this case: There are over 3600 games, not 10. So as usual, you resort to using a Strawman fallacy, equating a sample size of over 3600 to a sample size of 10.
On November 12 2024 01:42 TMNT wrote: The reason Protoss looks best in terms of SL win rate here is because their players pool is the most top-heavy.
Here is another example of your "reasoning" that is clearly fallacious, product of your ignorance. You criticize OP's method, but your critique is invalid. You cannot conclude the sample is "lopsided" from the race proportions in the tournaments. Anyone with minimal training in causal inference (which you clearly lack) will know that. Why? Because you need a standardizing factor: For instance if few NEW players choose to play protoss, then they are expected to be underrepresented among tournament players (with the few participating being older). Similarly, if most NEW pro players are e.g. Terran (or Zerg), then other races will be underrepresented. The problem is choosing the appropriate standardizing factor: 1) Total number of players in ladder? 2) total number of players who have tried to qualify to an ASL/SSL? 3) Total number of players who have actually participated in KSL/ASL/SSL? Another standardizing factor? The total number of players wouldn't make sense as anyone can create an account and play for a few days. The only options that make sense are options 2 and 3. OP used option 3. To use a chimp logic that you can understand, if you throw a coin and get 7 heads, you cannot conclude that is a high or low number, you need to know the total number of tosses.
|
On November 12 2024 02:38 cheesehuehue wrote: Here is an example of your patronizing tone, as usual. The funny thing is you use a patronizing tone to state something 1) basic (sample size) as if you were illuminated and the only one to know it (like a chimp being proud of understanding what a square is), and 2) that is not true in this case: There are over 3600 games, not 10. So as usual, you resort to using a Strawman fallacy, equating a sample size of over 3600 to a sample size of 10.
For one, it's not 3600 games lol. For a guy who multiple times used words such as dumb, ignorance, chimp, etc. to personally attack me (notice that I didn't use any terms to describe you), you just stupidly added the 1242+1166+1260 games of the 3 races together to get ~3600, without realizing that they overlap, and include mirror games.
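Here is the double counting worked out as a quick sketch. It assumes each per-race total counts every game that race appears in (so a non-mirror game shows up in two totals and a mirror game in one), and it uses the three non-mirror matchup totals from the site (561, 517 and 483). Variable names are just for illustration:

```python
per_race_totals = [1242, 1166, 1260]  # games involving each race
non_mirror = [561, 517, 483]          # the three non-mirror matchups

naive_sum = sum(per_race_totals)      # what you get by adding the races up
non_mirror_total = sum(non_mirror)

# Every non-mirror game sits in two of the per-race totals, every mirror in one.
mirror_total = naive_sum - 2 * non_mirror_total
distinct_games = non_mirror_total + mirror_total

print(naive_sum)         # 3668
print(non_mirror_total)  # 1561
print(mirror_total)      # 546
print(distinct_games)    # 2107
```

The 2,107 distinct games match the total given in the update post, so the ~3600 figure is simply the same games counted more than once.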
The total games for the 3 non mirror matchups are 561+517+483. That's around ~1500 games or 500 for each matchup. Now compare that to this from eloboard with 20k+ games for each matchup and explain why you don't use this for an argument with CAPS:
|
Incredible work, thanks for your contribution! Let's see if FlaSh can gain back his ELO.
|
On November 12 2024 02:38 cheesehuehue wrote: Here is another example of your "reasoning" that is clearly fallacious, product of your ignorance. You criticize OP's method, but your critique is invalid. You cannot conclude the sample is "lopsided" from the race proportions in the tournaments. Anyone with minimal training in causal inference (which you clearly lack) will know that. Why? Because you need a standardizing factor: For instance if few NEW players choose to play protoss, then they are expected to be underrepresented among tournament players (with the few participating being older). Similarly, if most NEW pro players are e.g. Terran (or Zerg), then other races will be underrepresented. The problem is choosing the appropriate standardizing factor: 1) Total number of players in ladder? 2) total number of players who have tried to qualify to an ASL/SSL? 3) Total number of players who have actually participated in KSL/ASL/SSL? Another standardizing factor? The total number of players wouldn't make sense as anyone can create an account and play for a few days. The only options that make sense are options 2 and 3. OP used option 3. To use a chimp logic that you can understand, if you throw a coin and get 7 heads, you cannot conclude that is a high or low number, you need to know the total number of tosses.
And for this point don't expect to throw in a number of red herrings and win the argument. What new players lol? Everyone knows it's the same 30 guys or something who have been playing each other since Remastered. There's also option 4 which is called sponbbang/eloboard but I see you just pretend they didn't exist.
But let's go with option 3 which I never disagreed with in the first place, but you can either process it in the most basic way (like op did) or you can refine it like I did. Like in terms of players pool, what's better than having the same numbers of players for each race for the comparison of win rate? Whether it's top 5/10/20 doesn't matter. The important thing is it's a better method than having 5 players for one race and 10 for the other. If you can't refute that you're just a troll.
Also, for a guy who keeps trying to say it's only Soulkey that is doing good as Zerg recently (it's true), it's clear you acknowledge that he makes it lopsided for Zerg. But when another person use the same argument but for the whole lineup of a race, oh suddenly it becomes invalid lololol. In fact that's a common theme for you in this forum, being a hypocrite and lacking of self-awareness. Like you literally started the conversation with a patronizing tone, aiming at the "Protoss whiners", but then when I presented you with a post of pure analysis and no personal offence, you retaliated like a cunt.
|
Great analysis, keep up the good work!
|
Amazing analysis, and thanks for sharing!
Looking at the ELO ranking, the fact that Flash is still #2, and Effort is still #6, tells me that the decay factor should be tuned a bit more aggressively. These two haven't been active for quite a while (almost an eternity in such a competitive scene, where active players are improving by the day), and shouldn't deserve such a high spot anymore. For instance, can we say that Flash in his current form is better than Light?
I wonder what the ELO ranking will look like after the tune up.
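To illustrate what "more aggressive" could mean: I don't know how the decay on the stats page is actually implemented, so this is purely a hypothetical sketch. The standard Elo expected-score formula is real, but `decay_toward_mean`, the 1500 baseline, the K-factor and the per-season rates are all assumptions of mine:

```python
def expected_score(r_a, r_b):
    """Standard Elo expected score of player A against player B."""
    return 1 / (1 + 10 ** ((r_b - r_a) / 400))

def update(r_a, r_b, score_a, k=32):
    """Return both ratings after one game (score_a is 1 for an A win, 0 for a loss)."""
    delta = k * (score_a - expected_score(r_a, r_b))
    return r_a + delta, r_b - delta

def decay_toward_mean(rating, seasons_inactive, mean=1500.0, rate=0.1):
    """Hypothetical inactivity decay: shrink the gap to the mean by `rate` per missed season."""
    return mean + (rating - mean) * (1 - rate) ** seasons_inactive

# An inactive player rated 1800 who skips 4 seasons, under two decay settings.
print(round(decay_toward_mean(1800, 4, rate=0.10), 2))  # 1696.83
print(round(decay_toward_mean(1800, 4, rate=0.25), 2))  # 1594.92
```

With the gentler setting the inactive player keeps about two thirds of their lead over the baseline after four seasons; with the steeper one, less than a third.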
|
This is great, Jacky, thanks for your work on it.
I'm not sure if this is something you'd be inclined to do, but I'd be really curious to see the data for games from the ro8 onwards (so ro8, ro4, finals). Watching the games, it feels like there are a lot of 'mid' P players who do well each season but never really threaten to win an ASL. There seems to be a big drop off in Protoss play at the elite level, but I wonder if the data confirms it.
|
On November 12 2024 03:30 TMNT wrote:
On November 12 2024 02:38 cheesehuehue wrote: Here is another example of your "reasoning" that is clearly fallacious, product of your ignorance. You criticize OP's method, but your critique is invalid. You cannot conclude the sample is "lopsided" from the race proportions in the tournaments. Anyone with minimal training in causal inference (which you clearly lack) will know that. Why? Because you need a standardizing factor: For instance if few NEW players choose to play protoss, then they are expected to be underrepresented among tournament players (with the few participating being older). Similarly, if most NEW pro players are e.g. Terran (or Zerg), then other races will be underrepresented. The problem is choosing the appropriate standardizing factor: 1) Total number of players in ladder? 2) total number of players who have tried to qualify to an ASL/SSL? 3) Total number of players who have actually participated in KSL/ASL/SSL? Another standardizing factor? The total number of players wouldn't make sense as anyone can create an account and play for a few days. The only options that make sense are options 2 and 3. OP used option 3. To use a chimp logic that you can understand, if you throw a coin and get 7 heads, you cannot conclude that is a high or low number, you need to know the total number of tosses.
And for this point don't expect to throw in a number of red herrings and win the argument. What new players lol? Everyone knows it's the same 30 guys or something who have been playing each other since Remastered. There's also option 4 which is called sponbbang/eloboard but I see you just pretend they didn't exist.
But let's go with option 3 which I never disagreed with in the first place, but you can either process it in the most basic way (like op did) or you can refine it like I did. Like in terms of players pool, what's better than having the same numbers of players for each race for the comparison of win rate? Whether it's top 5/10/20 doesn't matter. The important thing is it's a better method than having 5 players for one race and 10 for the other. If you can't refute that you're just a troll.
Also, for a guy who keeps trying to say it's only Soulkey that is doing good as Zerg recently (it's true), it's clear you acknowledge that he makes it lopsided for Zerg. But when another person use the same argument but for the whole lineup of a race, oh suddenly it becomes invalid lololol. In fact that's a common theme for you in this forum, being a hypocrite and lacking of self-awareness. Like you literally started the conversation with a patronizing tone, aiming at the "Protoss whiners", but then when I presented you with a post of pure analysis and no personal offence, you retaliated like a cunt.
Actually, there are some "newish" players that rose up in the Remastered era. Rain and Soulkey rose up super early in the current era, so you can consider them "old" players. But JyJ took until 2018/2019 to be recognized by the other pros as an equal, and he broke through as a top Terran in 2022/2023. Royal was recognized as a top Terran all the way back in 2020, when he started performing really well in spons and then in proleagues. Before then he was considered to be a tier below the rest. Barracks was likewise on that tier until 2021/2022, when he started to set himself apart as the next potential breakthrough Terran. Speed literally had his breakthrough just this year and performs as well as a top 5-8 Terran. The biggest breakthrough story is Soma, who suddenly broke through in Castermuse Starleague season 1 if I recall correctly. That was May 2019. That same year he got 4th in KSL.
Besides them, though, we have not seen actually new players break through into the top level. We did see new players break into the "joker tier" below King tier (King tier is usually reserved for round of 16 talent). We have only seen already top players shift around a bit and have their respective moment to shine in a meta or multiple metas.
|
On November 12 2024 20:42 RJBTVYOUTUBE wrote: Actually, there are some "newish" players that rose up in the Remastered era. Rain and Soulkey rose up super early in the current era, so you can consider them "old" players. But JyJ took until 2018/2019 to be recognized by the other pros as an equal, and he broke through as a top Terran in 2022/2023. Royal was recognized as a top Terran all the way back in 2020, when he started performing really well in spons and then in proleagues. Before then he was considered to be a tier below the rest. Barracks was likewise on that tier until 2021/2022, when he started to set himself apart as the next potential breakthrough Terran. Speed literally had his breakthrough just this year and performs as well as a top 5-8 Terran. The biggest breakthrough story is Soma, who suddenly broke through in Castermuse Starleague season 1 if I recall correctly. That was May 2019. That same year he got 4th in KSL.
Besides them, though, we have not seen actually new players break through into the top level. We did see new players break into the "joker tier" below King tier (King tier is usually reserved for round of 16 talent). We have only seen already top players shift around a bit and have their respective moment to shine in a meta or multiple metas.
Yeah but what you're talking about is the topic of another conversation. All those players you mentioned are progamers with a license who had been training in teamhouses for years (before 2010) in the Kespa days. I think even Soma was in a team but didn't get his license? They were in their team's lineups for Proleague. Barracks was even in OSL once. JYJ and Speed played in OSL/MSL qualifiers but didn't get through. They played in the Sonic SL era. They belong to the pool of 30 or so players in the Remastered era as I mentioned. What changed during the period 2018-2022 is they improved their level. In that sense there's little difference between those guys and the likes of Snow/Mini (who achieved just a bit more in OSL/MSL days).
What the other guy implied is the new players who literally just start to pick a race and play the game. Tbh dude knows there is no "new" player, he just wanted to put out a hypothetical example to distract from the main argument.
It's also funny that he mentioned standardizing factor. Well surely to assess balance at the top of the game through win rate, you need a pool of equal number of players in each race? For example surely if you judge Protoss balance by using only a pool of the 6 Dragons against the entire players pool of T and Z then the Protoss race would look absurdly imba lol.
|
On November 12 2024 21:46 TMNT wrote: Yeah but what you're talking about is the topic of another conversation. All those players you mentioned are progamers with a license who had been training in teamhouses for years (before 2010) in the Kespa days. I think even Soma was in a team but didn't get his license? They were in their team's lineups for Proleague. Barracks was even in OSL once. JYJ and Speed played in OSL/MSL qualifiers but didn't get through. They played in the Sonic SL era. They belong to the pool of 30 or so players in the Remastered era as I mentioned. What changed during the period 2018-2022 is they improved their level. In that sense there's little difference between those guys and the likes of Snow/Mini (who achieved just a bit more in OSL/MSL days).
What the other guy implied is the new players who literally just start to pick a race and play the game. Tbh dude knows there is no "new" player, he just wanted to put out a hypothetical example to distract from the main argument.
It's also funny that he mentioned standardizing factor. Well surely to assess balance at the top of the game through win rate, you need a pool of equal number of players in each race? For example surely if you judge Protoss balance by using only a pool of the 6 Dragons against the entire players pool of T and Z then the Protoss race would look absurdly imba lol.
Soma had a license but no team. Only other player who had some sort of success without pro-team experience is Brain. He got a license on both terran and protoss but didn't get onto a team. Currently all the active unlicensed but considered pro players are Joker tier.
And I agree. The argumentation made by cheese looks logical superficially but is full of holes and fallacies when you look closer to the point where there's not much left to scrape for value. Protoss is not imba. Protoss has a few very strong players concentrated at the top tier, then nothing in the high tier, and then a lot of players in the middle tier section. Zerg has the same distribution. A BUNCH of top tier zergs, then no high tier, and then a few mid tier, and then a bunch of low tier.
Terran has a much more gradual/even distribution of Top, High, Mid and Low tier players compared to the others. But this also affects the winrates. And that comes purely from the pool of players being relatively small, so outliers affect the statistics much more. If we had the same number of pros as in the Kespa era, we'd probably see a better distribution amongst the tiers.
In small sample sizes outliers can skew the numbers. You can look at as many matches played between the players and get some useful data out of it. But if most of those matches happened between unequally skilled players your data will still be skewed by the outliers.
|
On November 12 2024 05:44 bochs wrote: Amazing analysis, and thanks for sharing!
Looking at the ELO ranking, the fact that Flash is still #2, and Effort is still #6, tells me that the decay factor should be tuned a bit more aggressively. These two haven't been active for quite a while (almost an eternity in such a competitive scene, where active players are improving by the day), and shouldn't deserve such a high spot anymore. For instance, can we say that Flash in his current form is better than Light?
I wonder what the ELO ranking will look like after the tune up.
I agree it might be sensible to introduce some kind of decay. There's none at all as it stands now.
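As a rough sketch of what such a decay could look like: the ratings of inactive players get pulled back toward a baseline over time. The baseline of 1500 and the one-year half-life are pure assumptions for illustration, and nothing like this is in the current calculation.

```python
BASELINE = 1500.0        # assumed rating that inactive players drift toward
HALF_LIFE_DAYS = 365.0   # assumed: half of the surplus over baseline fades per year


def decayed_rating(rating: float, days_inactive: float,
                   baseline: float = BASELINE,
                   half_life: float = HALF_LIFE_DAYS) -> float:
    """Pull a rating toward the baseline the longer a player has been inactive."""
    keep = 0.5 ** (days_inactive / half_life)  # share of the surplus that survives
    return baseline + (rating - baseline) * keep


# Hypothetical numbers, not taken from the dataset.
for days in (0, 365, 730):
    print(days, decayed_rating(1800.0, days))
```

A shorter half-life would be the "more aggressive" tuning asked for above; whether decay should kick in immediately or only after some grace period is a separate design choice.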
|
On November 11 2024 05:27 Uldridge wrote: About seeding: you claim it gives an advantage, but did you take into account the players' overall strength?
I.e. If a player can consistently get into ro8, with seeding (or is more likely), is he also a stronger player on average, or not?
You're absolutely right to point out this problem. I've tried to make sure the comparison is not too unfair by only including players who have participated in at least one season as a seeded player. However, this isn't really good enough, as players who consistently get seeded will make up a larger part of the seeded results in the comparison, whereas players who consistently qualify but have only been seeded once or twice will make up most of the non-seeded results. What I should really do is only compare individual players' results as seeded vs non-seeded, and then take the average of those.
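In code, that within-player comparison could look roughly like the sketch below. The rows and player names are made up for illustration, and `paired_seed_effect` is just a name I picked; it is not how the numbers on the page were produced.

```python
from collections import defaultdict

# Hypothetical rows: (player, was_seeded, games_won, games_played) per season.
seasons = [
    ("PlayerA", True, 12, 18),
    ("PlayerA", False, 5, 9),
    ("PlayerB", True, 4, 8),
    ("PlayerB", False, 9, 20),
    ("PlayerC", False, 3, 7),   # never seeded, so excluded from the comparison
]


def paired_seed_effect(rows):
    """Average over players of (seeded win rate - non-seeded win rate).

    Only players with at least one seeded and one non-seeded season count,
    and every such player gets the same weight.
    """
    totals = defaultdict(lambda: {True: [0, 0], False: [0, 0]})
    for player, seeded, won, played in rows:
        totals[player][seeded][0] += won
        totals[player][seeded][1] += played
    diffs = []
    for record in totals.values():
        if record[True][1] == 0 or record[False][1] == 0:
            continue  # no basis for a within-player comparison
        seeded_rate = record[True][0] / record[True][1]
        unseeded_rate = record[False][0] / record[False][1]
        diffs.append(seeded_rate - unseeded_rate)
    return sum(diffs) / len(diffs) if diffs else None


print(paired_seed_effect(seasons))
```

Because each player is compared only with himself, a consistently strong player no longer inflates the seeded side of the comparison.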
|
On November 12 2024 08:21 RowdierBob wrote: This is great, Jacky, thanks for your work on it.
I'm not sure if this is something you'd be inclined to do, but I'd be really curious to see the data for games from the Ro8 onwards (so Ro8, Ro4, finals). Watching the games, it feels like there are a lot of 'mid' P players who do well each season but never really threaten to win an ASL. There seems to be a big drop-off in Protoss play at the elite level, but I wonder if the data confirms it.
Are there any particular figures you'd like to see for only Ro.8 and onwards?
|
Also keep in mind that players who make it to the Round of 8 and beyond contribute more games to the data pool than players who crash out in the Ro24 or Ro16. In the Ro8 match alone they can potentially play MORE games than they did in the Ro24 and Ro16 combined. This also leads to an outlier player such as Soulkey or Flash having a greater impact on the winrates in the data pool, further skewing the numbers. Who a player gets on their path affects the data too. If a Soulkey-level player gets an easy Ro24, easy Ro16 and easy Ro8, they will likely contribute no losses to the data. But if they get a really difficult group of players near their own skill in the Ro16 or Ro8, they will contribute a more even set of data.
Also, if a best of set is super close in each game but only one of the players gets any wins, you get a dishonest contribution to the data. The data will see a 4-0. But in reality all 4 games were super close and could've gone either way. This is a problem that occurs in small sample sizes with two evenly skilled players. You can see it with Ultimate Battle specifically because those are best of 9. Sometimes one player starts with 3 wins, but then loses 5 games out of the next 6. Based off of the first 3 you could conclude that player absolutely dominated. But based off of the next 6 games you can conclude it was the other who dominated. But when looking at each game by itself you can conclude something entirely different.
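To put a number on how often a lopsided score shows up between dead-even players, here is a small sketch. It assumes every game is an independent coin flip, which real series are not (map order, momentum, prepared builds), so treat it as an illustration only.

```python
from math import comb


def sweep_probability(wins_needed: int, p: float = 0.5) -> float:
    """Chance a series ends with the loser on zero wins,
    if player A wins each game independently with probability p."""
    return p ** wins_needed + (1 - p) ** wins_needed


def series_win_probability(wins_needed: int, p: float) -> float:
    """Chance that player A wins a first-to-`wins_needed` series."""
    q = 1 - p
    # A wins with the opponent on k wins: the last game is A's, the
    # k losses can fall anywhere among the earlier wins_needed - 1 + k games.
    return sum(comb(wins_needed - 1 + k, k) * p ** wins_needed * q ** k
               for k in range(wins_needed))


print(sweep_probability(4))            # Bo7 between even players
print(sweep_probability(5))            # Bo9 between even players
print(series_win_probability(4, 0.6))  # Bo7 when A wins 60% of single games
```

Under that assumption, one in eight Bo7s between perfectly even players ends 4-0, so a sweep on its own says little about the matchup.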
|
Interested in seeing how much the season with Sparkle/3rd World affected Protoss' overall winrate in both matchups lol. But then again there's probably a couple of seasons that had dramatically imbalanced maps for each race when you dig deep enough.
I think the biggest problem affecting PvZ is still the Bo7 format in the Ro8 onward. Protoss can do much better in Bo1s and Bo3s, but once you get to a series the Zerg have so many more options for mind games and cheese than the Protoss has. A player like Soulkey can really abuse that.
edit - would love to see stats for each MU based on series length: Bo3, Bo5, Bo7. My prediction is that for TvZ and TvP it gets more balanced the longer the series, but for ZvP it gets more IMBA.
|
Would love to see stats drawn exclusively from KCM or from Ultimate Battle. Ultimate Battle is usually reserved for the top of the top, and KCM is sometimes a mixed bag of top and mid, but usually top.
|
On November 14 2024 00:41 Ideas wrote: edit - would love to see stats for each MU based on series length: Bo3, Bo5, Bo7. My prediction is that for TvZ and TvP it gets more balanced the longer the series, but for ZvP it gets more IMBA.
Actually there have not been a lot of Bo7s played in ASL. They only introduced it from season 11, and only from the semifinals, except for this season. For example there are only 8 Bo7 PvZ series in ASL and the series score is 4-4. Many would find it surprising that Protoss is leading 22-20 in map score. Mini (16 wins) and Soulkey (12 wins) alone account for 28 of those 42 map wins, about 67%.
It's not useful data at all if we want to talk about balance because ultimately it comes down to a few players and their form on those specific days.
And as RJB mentioned above, a 4-0 score in ASL may not suggest the dominance of the winner or the imba of the matchup like we would intuitively think, simply because of the sample size and the flow of a BoX series in a tournament. For example, Mini slapped Queen 4-1 in consecutive seasons, giving the impression of Queen being his bitch, but if you check his overall win rate vs Queen right at the times he slapped him in ASL, it's just close to 50%. Similarly Soulkey just slapped Snow 4-0 this season, but online overall they are more like 60/40 or something.
|
On November 13 2024 05:55 JackyVSO wrote: Are there any particular figures you'd like to see for only Ro.8 and onwards?
Just some general balance stats. Just a hunch that P will be a whole lot worse from ro8 onwards compared to the overall stats.
|
On November 13 2024 05:55 JackyVSO wrote: Are there any particular figures you'd like to see for only Ro.8 and onwards?
KCM and Ultimate Battle stats!
|
On November 14 2024 12:11 RJBTVYOUTUBE wrote: KCM and Ultimate Battle stats!
I just realized that KCM stats can be done with eloboard. There's always a memo "KCM" next to each entry of the KCM games so you can do an advanced search and get all the KCM games that have been played since eloboard was created. In one click.
After some copy + paste + sort in an Excel sheet, it gives me these stats, since Jun 2021:
PvT: 233-214 (52.1%)
TvZ: 251-226 (52.6%)
ZvP: 263-221 (54.3%)
The sample size is similar to that of ASL/KSL but the player pool is smaller (top 10 of each race probably). Would be nice if someone can dig up the stats before June 2021 though. Would x2 the sample size.
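To put a rough error margin on those numbers, here's a minimal sketch using a Wilson score interval. It treats every game as an independent coin flip with the same win chance, which is an assumption (the same few players meet each other over and over), so the real uncertainty is probably wider.

```python
from math import sqrt


def wilson_interval(wins: int, games: int, z: float = 1.96):
    """Wilson score interval for a win rate (z = 1.96 is roughly 95%)."""
    p = wins / games
    denom = 1 + z * z / games
    centre = (p + z * z / (2 * games)) / denom
    half = z * sqrt(p * (1 - p) / games + z * z / (4 * games * games)) / denom
    return centre - half, centre + half


# KCM records since Jun 2021, as (wins, losses) for the first-named race.
kcm = {"PvT": (233, 214), "TvZ": (251, 226), "ZvP": (263, 221)}

for matchup, (wins, losses) in kcm.items():
    low, high = wilson_interval(wins, wins + losses)
    rate = wins / (wins + losses)
    print(f"{matchup}: {rate:.1%}  95% CI {low:.1%} to {high:.1%}")
```

For ZvP the interval comes out at roughly 50% to 59%, so even the most lopsided of the three is only borderline distinguishable from an even matchup at this sample size.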
|
On November 14 2024 09:42 TMNT wrote: Actually there have not been a lot of Bo7s played in ASL. They only introduced it from season 11, and only from the semifinals, except for this season. For example there are only 8 Bo7 PvZ series in ASL and the series score is 4-4. Many would find it surprising that Protoss is leading 22-20 in map score. Mini (16 wins) and Soulkey (12 wins) alone account for 28 of those 42 map wins, about 67%.
It's not useful data at all if we want to talk about balance because ultimately it comes down to a few players and their form on those specific days.
And as RJB mentioned above, a 4-0 score in ASL may not suggest the dominance of the winner or the imba of the matchup like we would intuitively think, simply because of the sample size and the flow of a BoX series in a tournament. For example, Mini slapped Queen 4-1 in consecutive seasons, giving the impression of Queen being his bitch, but if you check his overall win rate vs Queen right at the times he slapped him in ASL, it's just close to 50%. Similarly Soulkey just slapped Snow 4-0 this season, but online overall they are more like 60/40 or something.
Wow, it feels like there have been a lot more lol. I guess recency bias from Soulkey beating Snow the last 2 seasons has really skewed my memory. Thanks for sharing!
|
The age-old P>T>Z>P shows up time and time again. The game is not perfect and maps can never balance all 3 matchups. Imba can be overcome by skill, planning, execution and luck. Every race has had multiple greats.
I'm not saying that analyzing data, getting bigger sample sizes and using more and better criteria shouldn't be done, but we can agree on 3 things:
1. Flash was the most imba to ever play the game so far, and like Superman he still had his kryptonite in Effort
2. P>T>Z>P
3. Some players can be so good at a certain matchup or map that they can overpower the imba
Also, if we keep a healthy influx of new maps, the game won't get stale, nor can we ever truly conclude imba. That's a good thing. What we can always easily see is which player is the current best performer of them all, and we only need some stats for that, no fancy analyzing. Couple that with 2 premier tournaments and we're golden.
|
On November 14 2024 22:29 TMNT wrote: I just realized that KCM stats can be done with eloboard. There's always a memo "KCM" next to each entry of the KCM games so you can do an advanced search and get all the KCM games that have been played since eloboard was created. In one click. After some copy + paste + sort in an excel sheet, it gives me these stats, since Jun 2021: PvT: 233-214 (52.1%) TvZ: 251-226 (52.6%) ZvP: 263-221 (54.3%) The sample size is similar to that of ASL/KSL but the player pool is smaller (top 10 of each race probably). Would be nice if someone can dig up the stats before June 2021 though. Would x2 the sample size.
These stats are close enough to 50% to say the game is balanced. If you removed the current season of KCM, the ZvP winrate would drop slightly, because this season the Zergs performed well and the Protoss less well.
|
On November 14 2024 00:41 Ideas wrote: Interested in seeing how much the season with Sparkle/3rd World affected Protoss' overall winrate in both matchups lol. But then again there's probably a couple of seasons that had dramatically imbalanced maps for each race when you dig deep enough.
I think the biggest problem affecting PvZ is still the Bo7 format in the Ro8 onward. Protoss can do much better in Bo1s and Bo3s, but once you get to a series the Zerg have so many more options for mind games and cheese than the Protoss has. A player like Soulkey can really abuse that.
edit - would love to see stats for each MU based on series length: Bo3, Bo5, Bo7. My prediction is that for TvZ and TvP it gets more balanced the longer the series, but for ZvP it gets more IMBA.
Here you go:
It shows almost the opposite of what you predicted. But we have very few Bo7 games in the database so those don't say very much. Here are the raw numbers:
       Bo1   Bo3   Bo5   Bo7
T<P    107    42    87    22
T>P     94    28    71    32
P<Z     97    44   102    26
P>Z     80    34   108    26
T>Z     99    30   124    42
T<Z     93    31    99    44
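If you want to turn the raw numbers above into win rates yourself, a quick sketch follows. It assumes the columns are Bo1, Bo3, Bo5 and Bo7, and that a row like T<P counts games where the Protoss beat the Terran.

```python
# Game counts copied from the raw numbers above; columns are Bo1, Bo3, Bo5, Bo7.
series_lengths = ["Bo1", "Bo3", "Bo5", "Bo7"]
games = {
    "PvT": ([107, 42, 87, 22], [94, 28, 71, 32]),    # (P wins, T wins)
    "ZvP": ([97, 44, 102, 26], [80, 34, 108, 26]),   # (Z wins, P wins)
    "TvZ": ([99, 30, 124, 42], [93, 31, 99, 44]),    # (T wins, Z wins)
}


def win_rates(first, second):
    """Win rate of the first-named race for each series length."""
    return [a / (a + b) for a, b in zip(first, second)]


for matchup, (first, second) in games.items():
    rates = win_rates(first, second)
    cells = "  ".join(f"{name} {rate:.1%}" for name, rate in zip(series_lengths, rates))
    print(f"{matchup}: {cells}")
```

With only around 50 to 90 Bo7 games per matchup, the Bo7 column will swing by several percentage points on a single series.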
|
On November 16 2024 04:18 JackyVSO wrote: Here you go: It shows almost the opposite of what you predicted. But we have very few Bo7 games in the database so those don't say very much. Here are the raw numbers:
       Bo1   Bo3   Bo5   Bo7
T<P    107    42    87    22
T>P     94    28    71    32
P<Z     97    44   102    26
P>Z     80    34   108    26
T>Z     99    30   124    42
T<Z     93    31    99    44
Because the sample size is so tiny, it's much more affected by individual player skill.
|
Yeah I'm not reading whatever certain contrarian may have said, it's like talking with a wall.
On November 16 2024 21:41 RJBTVYOUTUBE wrote: Because the sample size is so tiny its much more affected by player individual skill.
That shouldn't be a problem. @JackyVSO some suggestions that you might want to try to extract more information from the data:
- Instead of calculating a single PvZ (or ZvP) win rate, you could calculate the PvZ win rate for every Protoss player that has participated in ASL/KSL/SSL, then average the win rates of all players. You can even use a bootstrap approach to calculate an empirical p-value and confidence interval, even with a small number of players. There are three ways to resample:
  1. Resample each player's own matches (e.g. using 10k replicates). For each replicate you get an average PvZ win rate (the average of the PvZ win rates of all Protoss players), so you end up with 10k means, one per replicate. The distribution of those means gives you the confidence interval.
  2. Resample matches rather than players, i.e. perform the bootstrap on the pooled matches, not on specific players' matches.
  3. Bootstrap the players themselves, i.e. randomly leave some of the players out.
  It would be interesting to compare the estimates of the three approaches. If all bootstrap analyses lead to the same conclusion, that would put the final nail in the coffin. If they differ, the interpretation would be more complicated.
  One caveat of the first approach is that you will probably have to run the same procedure for Zergs (ZvP win rate) and standardize the win rates so that they sum to 100% (their sum could deviate from 100% due to the unbalanced sample size). So for instance Adjusted_PvZ = PvZ / (PvZ + ZvP) and Adjusted_ZvP = ZvP / (PvZ + ZvP). You would need to calculate those adjusted win rates for every bootstrap replicate. Same for the other matchups.
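Something like this minimal Python sketch would do it for the first and third resampling schemes. The records are made-up placeholders, not real results, and all function names are just mine; a percentile interval is only one of several ways to read off the confidence interval.

```python
import random
from statistics import mean

# Hypothetical PvZ records: player -> list of game results (1 = Protoss win).
records = {
    "P1": [1, 0, 1, 1, 0, 1, 0, 1],
    "P2": [0, 0, 1, 0, 1],
    "P3": [1, 1, 0, 1, 1, 0, 1, 1, 0, 1],
    "P4": [0, 1, 0],
}


def mean_of_player_rates(recs):
    """Average of each player's own win rate (every player weighs the same)."""
    return mean(mean(games) for games in recs.values())


def resample_games(recs, rng):
    """Scheme 1: redraw each player's games with replacement."""
    return {p: rng.choices(g, k=len(g)) for p, g in recs.items()}


def resample_players(recs, rng):
    """Scheme 3: redraw the players themselves with replacement."""
    picked = rng.choices(list(recs), k=len(recs))
    return {i: recs[p] for i, p in enumerate(picked)}


def bootstrap_ci(recs, resample, replicates=10_000, seed=0):
    """Percentile 95% interval of the mean player win rate."""
    rng = random.Random(seed)
    stats = sorted(mean_of_player_rates(resample(recs, rng))
                   for _ in range(replicates))
    return stats[int(0.025 * replicates)], stats[int(0.975 * replicates) - 1]


def adjusted(pvz, zvp):
    """Rescale the two per-race averages so that they sum to 100%."""
    return pvz / (pvz + zvp), zvp / (pvz + zvp)


print("point estimate:", mean_of_player_rates(records))
print("resampling games:  ", bootstrap_ci(records, resample_games))
print("resampling players:", bootstrap_ci(records, resample_players))
```

The pooled-matches scheme would be the same loop over one flat list of all games. For the adjustment, `adjusted` would be applied inside each replicate to the Protoss-side and Zerg-side averages before collecting the means.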
On November 16 2024 21:41 RJBTVYOUTUBE wrote: These stats are close enough to 50% to say the game is balanced.
No one, absolutely no one, has ever provided a systematic and unbiased* analysis that proves that the game is unbalanced. Every complaint about the game being imbalanced has been based on specific, hand-picked anecdotal observations.
And by systematic and unbiased I mean a strict analysis where the populations were declared ahead of time, where the inclusion and exclusion criteria (game-wise, and players-wise) were also declared ahead of time and applied equally to all races. Trying to restrict the inclusion of players of one race (Protoss) based on length of professional career, but not of other races is cherry picking: The selection criteria must be the same for all observations. And by populations I mean declaring ahead of time about whom the generalizations are being made: "All players that participated in ASL/KSL/SSL", "All players that participated in KCM", "All ladder players with a MMR above e.g. 2300", etc. Cherry picking specific observations to support an already made conclusion is nonsense.
Criticizing this analysis for only considering ASL/KSL/SSL is nonsensical. That's how the population was defined. The same analysis can be repeated in any other pre-defined population, and if all populations lead to the same conclusion then the debate is settled until anyone provides an analysis of another reasonably defined population with a different conclusion.
Kraekkling did an analysis of ladder games, defining different selection criteria (based on MMR and effective APM thresholds). The conclusions are similar: P>T>Z>P. One thing that was missing from the ladder games analysis is that the win rates could also be calculated as the average of the win rates of players of each race. The analysis could also be restricted to main accounts/main race only, with at least, say, X games during the last year and at least X games per month every month. The proportion of Protoss players at the top could also be compared to the proportion of Protoss players in different MMR bins, to see if they become underrepresented (relative to the lower MMR bins) as the MMR threshold is increased, etc.
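The MMR-bin comparison could be sketched roughly as below. The player list is invented toy data and the bin edges (1500, 2000, 2300) are arbitrary assumptions, so this only shows the bookkeeping, not any real result.

```python
from collections import Counter

# Hypothetical (race, MMR) pairs; made up for illustration only.
players = [
    ("P", 1600), ("T", 1700), ("Z", 1800),
    ("P", 2050), ("T", 2100), ("Z", 2200), ("Z", 2250),
    ("P", 2350), ("T", 2400), ("T", 2450), ("Z", 2500),
]


def race_share_by_bin(players, edges, race="P"):
    """Share of `race` among players in each MMR bin.

    `edges` are the lower bounds of the bins; a player falls into the
    bin with the highest lower bound that does not exceed their MMR.
    """
    totals, hits = Counter(), Counter()
    for player_race, mmr in players:
        candidates = [edge for edge in edges if mmr >= edge]
        if not candidates:
            continue  # below the lowest bin, ignored
        lower_bound = max(candidates)
        totals[lower_bound] += 1
        if player_race == race:
            hits[lower_bound] += 1
    return {edge: hits[edge] / totals[edge] for edge in sorted(totals)}


shares = race_share_by_bin(players, edges=[1500, 2000, 2300])
for edge, share in shares.items():
    print(f"MMR >= {edge}: Protoss share {share:.2f}")
```

If the share keeps dropping as the bins go up, that would point to underrepresentation at the top; if it stays flat, it would not.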
|
That's a lot of fallacies in one post lol (once again, from the guy who always accuses others of using fallacies)
I doubt anyone here would strongly disagree with the notion of P>T>Z>P.
What the "Protoss whiners" are saying is P>T>>>Z>>P or something like that (the number of ">" in TvZ and ZvP relative to each other is up for debate though, while sometimes there are people arguing that P=T), but one thing is for sure: PvT is the least imba matchup among the three. At the top level, mind. And these are not cherry-picked observations. Actually, every time another piece of evidence shows up, people are like 'here we go again'.
- Kespa stats from 2001-2012 (~35k games). It reads 52.1% PvT, 54.5% TvZ, 54.7% ZvP.
- Eloboard stats from 2021-2024 (~70k games). No Flash by the way. It reads 47.7% PvT, 55.1% TvZ, 52.3% ZvP.
- Sponbbang stats from 2017-2021 (which means with Flash) show similar numbers to eloboard. The site is dead now but I believe there are screenshots of those stats that we can find right here on TL.
- 8 mil games (mostly ladder I think) analysis from Kraekkling. It says that from 2k MMR and above, PvT is the only matchup with a win rate below 50%.
- Extracted data from KCM race wars since 2021 (which means no Flash) I posted in the previous page. 52.1% PvT, 52.6% TvZ, 54.3% ZvP.
The populations of those analyses are well defined, no? I'm not saying there are no flaws in the observed data, but that's the only thing we have to work with at the moment.
It's funny because this:
Instead of calculating a single PvZ (or ZvP) win rate, you could calculate the PvZ win rate for every Protoss player that has participated in ASL/KSL/SSL. Then average the win rate of all players.
One thing that was missing from the ladder games analysis is that the win rates could also be calculated as the average of the win rates of players of each race
is very close to what I literally did on the previous page (except it's the accumulated win rate of the examined players, not the average of their individual win rates, but if we treat it the latter way, the same trend remains as well). But since this troll can't argue with that (because it contradicts his already made conclusion) he just pretends it doesn't exist and keeps spewing jargon.
- The only time the stats look different from the other observations is in this ASL/KSL/SSL thread where we have 53.4% PvT, 52.5% TvZ, 52% ZvP but as I already pointed out that's because of the combination of low sample size + uneven player pools for each race. Here's what happens if we standardize (lol) the data by comparing the accumulated win rates of the top 4, 8, 12 etc. players of each race (raw data in page 2):
Pls spare my poor excel skills. Note that only 12 Protosses have more than 10 PvT games in ASL/SSL/KSL and after that we have Noob and Brain who played a combined 5 games with a freaking win rate of 100% (hence the little bump from top 12 to top 16).
And the more shocking thing is, even if we remove Flash (best TvZ) and Soma (best ZvP) from the data, and keep all the Protoss players in, Protoss doesn't even come out on top if we look at the top 8 players: 56.9% PvT, 58.5% TvZ, 56.5% ZvP.
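For anyone who wants to redo the top-N comparison without Excel, here is a minimal sketch of the two ways of computing it: the accumulated (pooled) win rate versus the average of individual win rates. The records are invented toy numbers, and ranking players by win rate with a minimum game count is my assumption about how a "top N" could be picked, not necessarily how the table above was built.

```python
# Hypothetical (wins, losses) records for one matchup; NOT real data.
toy_records = [(30, 10), (6, 6), (3, 0)]


def win_rate(record):
    wins, losses = record
    return wins / (wins + losses)


def top_n(records, n, min_games=10):
    """Best n players by win rate among those with more than `min_games` games."""
    eligible = [r for r in records if r[0] + r[1] > min_games]
    return sorted(eligible, key=win_rate, reverse=True)[:n]


def pooled_win_rate(records):
    """Accumulated win rate: all wins divided by all games."""
    wins = sum(w for w, _ in records)
    games = sum(w + l for w, l in records)
    return wins / games


def mean_win_rate(records):
    """Average of the individual win rates, each player weighted equally."""
    return sum(win_rate(r) for r in records) / len(records)


strict = top_n(toy_records, 3, min_games=10)  # drops the 3-0 player
loose = top_n(toy_records, 3, min_games=0)    # keeps the 3-0 player

print(f"strict: pooled {pooled_win_rate(strict):.3f}, mean {mean_win_rate(strict):.3f}")
print(f"loose:  pooled {pooled_win_rate(loose):.3f}, mean {mean_win_rate(loose):.3f}")
```

The loose version shows the same kind of bump as a couple of players with a handful of games and a 100% win rate: they barely move the pooled number but shift the average of individual win rates a lot.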
|
On November 18 2024 22:51 TMNT wrote: I doubt anyone here would strongly disagree with the notion of P>T>Z>P.
Then WTF are you yapping so much about??????
And you provide all the eloboard results as a single stat. Really, you didn't think of stratifying in any meaningful way? By Elo, Proleague, K-League, etc???? How dumb are you?????
|
On November 19 2024 05:10 cheesehuehue wrote:
On November 18 2024 22:51 TMNT wrote: I doubt anyone here would strongly disagree with the notion of P>T>Z>P.
Then WTF are you yapping so much about??????
Don't you know how to read? It's literally written in the very next sentences of my post:
"What the "Protoss whiners" are saying is P>T>>>Z>>P or something like that (the number of ">" in TvZ and ZvP relative to each other is up for debate though, while sometimes there are people arguing that P=T), but one thing for sure PvT is the least imba matchup among the three"
It's also a response to your original post:
However the advantage of Z over P is smaller than both the advantage of P over T, and the advantage of T over Z.
But people who whine about Protoss being the weakest race will simply reject the evidence and get angry for presenting FACTS to them. Unless you can support your whining with FACTS, SC is a balanced game. PERIOD.
Now please explain how come you had no problem announcing the part in bold with confidence after seeing a stat calculated with the same method as the other stats I just presented, which everyone has been aware of? Suddenly there is no need for confidence intervals or anything when it supports your conclusion, eh?
Also it's funny because your first and second paragraphs here kind of contradict each other lol. Obviously if Z>P>>T>>Z then Z is the weakest race. It's nice of you to not whine but the game would not be balanced in that case nonetheless.
And you provide all the eloboard results as a single stat. Really, you didn't think of stratifying in any meaningful way? By Elo, Proleague, K-League, etc???? How dumb are you?????
Because I'm not the one making an analysis while at the same time calling people out for not accepting my FACTS.
But maybe you're right. We should stratify it. So how come I haven't seen you demanding the ASL stats to be stratified into Ro24/16/8 games, or top 4/8/12 players for each race? Oh wait, I already did the latter for you without anyone asking, but you're still ignoring it, well well, because it doesn't support your conclusion.
|
We should start analyzing the data with outlier players excluded: take the top 7 for each race but exclude the top 1 and bottom 1 to reduce the influence an outlier has on the data pool. Use only eloboard for the data because it includes KCM and Starleagues. Exclude maps like Troy, Monty and Minstrel that have lopsided win rates for specific matchups, and preferably use maps that are considered the most balanced, like Radeon, Apocalypse etc.
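Roughly, the trimming could be done like this. The win rates are invented toy values, and ranking the "top 7" by matchup win rate is an assumption (it could just as well be by Elo); the untrimmed figure is printed alongside for comparison.

```python
# Hypothetical matchup win rates for one race's players; NOT real data.
toy_rates = [0.70, 0.62, 0.58, 0.55, 0.54, 0.52, 0.40, 0.35]


def trimmed_top(rates, top=7, trim=1):
    """Take the `top` best win rates, then drop `trim` from each end."""
    ranked = sorted(rates, reverse=True)[:top]
    return ranked[trim:len(ranked) - trim]


def mean(values):
    return sum(values) / len(values)


top7 = sorted(toy_rates, reverse=True)[:7]
kept = trimmed_top(toy_rates)

print(f"top 7 mean:         {mean(top7):.3f}")
print(f"trimmed top 7 mean: {mean(kept):.3f}")
```

With only 7 players per race, dropping 2 of them removes a big chunk of the sample, so the trimmed number is less sensitive to one dominant player but also rests on fewer players.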
|
On November 19 2024 15:08 RJBTVYOUTUBE wrote: We should start analyzing the data with outlier players excluded: take the top 7 for each race but exclude the top 1 and bottom 1 to reduce the influence an outlier has on the data pool. Use only eloboard for the data because it includes KCM and Starleagues. Exclude maps like Troy, Monty and Minstrel that have lopsided win rates for specific matchups, and preferably use maps that are considered the most balanced, like Radeon, Apocalypse etc.
That's one way. Although I'd add that we should do both (with outliers and without outliers) and compare the results. The thing with BW statistics is that no analysis can be definitive enough, even if you manage to calculate p-values or confidence intervals, because you always start with observed data that has flaws. However, every stat can be suggestive. That's why I'm always of the mindset that if most pieces of information suggest the same thing, then that thing is probably true.
The notion that P is the weakest race at the top level because PvT is not as favourable as TvZ or ZvP is neither new nor something I invented/pre-concluded myself. It has been analysed by others and from many years ago. For example here and here. We saw the same pattern in Kespa as we're seeing now. There is simply too much smoke for it to not be a fire.
|
On November 18 2024 22:51 TMNT wrote: That's a lot of fallacies in one post lol (once again, from the guy who always accuses others of using fallacies) I doubt anyone here would strongly disagree with the notion of P>T>Z>P. What the "Protoss whiners" are saying is P>T>>>Z>>P or something like that (the number of ">" in TvZ and ZvP relative to each other is up for debate though, while sometimes there are people arguing that P=T), but one thing for sure PvT is the least imba matchup among the three. At the top level mind. And it's not cherry picked observations. It's actually every time another evidence shows up, people are like 'here we go again'. - Kespa stats from 2001-2012 (~35k games). It reads 52.1% PvT, 54.5% TvZ, 54.7% ZvP. - Eloboard stats from 2021-2024 (~70k games). No Flash by the way. It reads 47.7% PvT, 55.1% TvZ, 52.3% ZvP. - Sponbbang stats from 2017-2021 (which means with Flash) shows similar stats to eloboard by the way. The site is dead now but I believe there are screenshots of those stats we can even find right here in tl. - 8 mil games (mostly ladder I think) analysis from Kraekkling. It says from 2k mmr and above only PvT win rate is less than 50%. - Extracted data from KCM race wars since 2021 (which means no Flash) I posted in the previous page. 52.1% PvT, 52.6% TvZ, 54.3% ZvP. The populations of those analyses are well defined, no? Not that I'm saying there is no flaws in the observed data, but that's the only thing we have to work with at the moment. It's funny because this: Instead of calculating a single PvZ (or ZvP) win rate, you could calculate the PvZ win rate for every Protoss player that has participated in ASL/KSL/SSL. Then average the win rate of all players.
One thing that was missing from the ladder games analysis is that the win rates could also be calculated as the average of the win rates of players of each race is very close to what I literally did in the previous page (except it's the accumulated win rate of the examined players, not the average of their individual win rates - but if we treat it the latter way, the same trend remains as well). But since this troll can't argue with that (because it contradicts his already made conclusion) he just pretends it doesn't exist and keeps spilling out jargons. - The only time the stats look different from the other observations is in this ASL/KSL/SSL thread where we have 53.4% PvT, 52.5% TvZ, 52% ZvP but as I already pointed out that's because of the combination of low sample size + uneven player pools for each race. Here's what happens if we standardize (lol) the data by comparing the accumulated win rates of the top 4, 8, 12 etc. players of each race (raw data in page 2): Pls spare my poor excel skills. Note that only 12 Protosses have more than 10 PvT games in ASL/SSL/KSL and after that we have Noob and Brain who played a combined 5 games with a freaking win rate of 100% (hence the little bump from top 12 to top 16). And the more shocking thing is, even if we remove Flash (best TvZ) and Soma (best ZvP) from the data, and keep all the Protoss players in, Protoss doesn't even come out on top if we look at the top 8 players: 56.9% PvT, 58.5% TvZ, 56.5% ZvP.
Some great stats right there. Thanks a lot.
|