On October 30 2020 08:17 Djabanete wrote: Averaging the PvZ win probabilities across all maps used in a Bo5 is not the mathematically correct way to get the overall PvZ win probability in that Bo5.
Without having worked through the correct math, I’m skeptical of the idea that tossing one heavily P-favored map into a pool of 55/45 Z-favored maps would be fair.
Edit: You could also question the validity of those win probabilities, as some have done, but even if you take them at face value you cannot perform a simple average in this context.
Example: If I play you in a Bo5 and I have a 100%, 100%, 0%, 0%, and 100% chance to win on the five maps, then I will win the Bo5 100% of the time, not 60% of the time, even though 60% is my “average” win probability each game.
While your example is right, the question here is not whether tossing one heavily P-favored map into a pool of 55/45 Z-favored maps would be fair, but how heavily P-favored that map would need to be to balance the win probability of the series as a whole.
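To make the point about series math concrete, here is a minimal Python sketch. It is not from any poster in the thread, and the function names are made up for illustration. It assumes every game is independent, that each map has a fixed PvZ win probability, and that the "pool" is simply the five maps of one Bo5. It computes the chance of winning at least three of five games, then searches for how P-favored one map would have to be to make the series 50/50 when the other four sit at 45% for P.

```python
from typing import Sequence


def series_win_prob(game_probs: Sequence[float]) -> float:
    """Probability of winning a majority of the games (e.g. 3 of 5).

    Assumes games are independent. Playing out every game gives the same
    answer as stopping once one side has clinched the series.
    """
    # dist[k] = probability of having exactly k wins so far
    dist = [1.0]
    for p in game_probs:
        nxt = [0.0] * (len(dist) + 1)
        for wins, prob in enumerate(dist):
            nxt[wins] += prob * (1.0 - p)   # lose this game
            nxt[wins + 1] += prob * p       # win this game
        dist = nxt
    needed = len(game_probs) // 2 + 1
    return sum(dist[needed:])


def balancing_map_prob(other_maps: Sequence[float], target: float = 0.5) -> float:
    """Bisect for the win probability on one extra map that brings the
    whole series to `target`. The result is clamped to the range [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if series_win_prob([*other_maps, mid]) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Djabanete's example: the per-game average is 60%, yet the series is a sure win.
print(series_win_prob([1.0, 1.0, 0.0, 0.0, 1.0]))

# Hypothetical pool: four maps at 45% for P, one map left to tune.
print(balancing_map_prob([0.45] * 4))
```

Under these assumptions the tuned map comes out a little over 70% for P. That happens to be close to what a simple average would suggest for this particular pool; the 100%/0% example shows how far the two can diverge once the per-map numbers get extreme.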
On October 30 2020 06:53 Avi-Love wrote: I think using Ultimate Battle as a sample when you have sponbbang is either ignorant or downright dishonest, why would you not use the bigger and more recent sample size to gauge balance? The map pool has shifted about 5 times since the first Ultimate Battle, thankfully Colosseum, Medusa, Cross Game and such are no longer used (and FS/CB).
Ignorant and dishonest as to what? Well, I was looking at Ultimate Battle because I thought it was a good and reliable set of data to make inferences from. I come from a psychology background where case studies are very popular, and we believe that certain things are needed in making a valid assessment, such as having a large sample size, a representative population that features variation, and parity in testing. Applied to BW and balance talk, this I think means to have a lot of games played by the top players vs other top players on many different maps, with a large and distributed number of games amongst each player/map. And I think Ultimate Battle features this in a nice, controlled setting. But since Spon matches are not exclusively between top players and top players, and feature many a lopsided beatdown by one player on another, usually on whatever the current ASL maps are, there are potential confounds that I think are more serious than any to be found in Ultimate Battle.
You're one of the more balanced posters here. Best to ignore the negativity. Keep it up!
The same side who earlier said "Thank you for pushing the discussion positively with well thought out posts" now accuses you of being "either ignorant and downright dishonest".
It's amazing how quickly tides can shift. I guess "pushing the discussion positively" only happens when we're pushing the discussion towards one side.
It was two different people that said these quotes, why does it matter that they are "the same side"?
On October 30 2020 06:53 Avi-Love wrote: I think using Ultimate Battle as a sample when you have sponbbang is either ignorant or downright dishonest, why would you not use the bigger and more recent sample size to gauge balance? The map pool has shifted about 5 times since the first Ultimate Battle, thankfully Colosseum, Medusa, Cross Game and such are no longer used (and FS/CB).
Ignorant and dishonest as to what? Well, I was looking at Ultimate Battle because I thought it was a good and reliable set of data to make inferences from. I come from a psychology background where case studies are very popular, and we believe that certain things are needed in making a valid assessment, such as having a large sample size, a representative population that features variation, and parity in testing. Applied to BW and balance talk, this I think means to have a lot of games played by the top players vs other top players on many different maps, with a large and distributed number of games amongst each player/map. And I think Ultimate Battle features this in a nice, controlled setting. But since Spon matches are not exclusively between top players and top players, and feature many a lopsided beatdown by one player on another, usually on whatever the current ASL maps are, there are potential confounds that I think are more serious than any to be found in Ultimate Battle.
You're one of the more balanced posters here. Best to ignore the negativity. Keep it up!
The same side who earlier said "Thank you for pushing the discussion positively with well thought out posts" now accuses you of being "either ignorant and downright dishonest".
It's amazing how quickly tides can shift. I guess "pushing the discussion positively" only happens when we're pushing the discussion towards one side.
Uh? I think you're making a mistake here. I said the former, not the latter. I like Light-'s posts a lot.
There are clearly certain people here who are very determined to 'shut down' discussion on valid talking points. They resort to really harsh words, including ad hominem attacks. You know who you are.
I'm surprised no one has thrown in the running "Should Protoss have a <insert home area>?" joke yet. At least that was really funny...
Anyway, some valid points have been made on 'balancing the map pool'. It's an interesting angle to focus on (rather than patching the game).
On October 30 2020 06:53 Avi-Love wrote: I think using Ultimate Battle as a sample when you have sponbbang is either ignorant or downright dishonest, why would you not use the bigger and more recent sample size to gauge balance? The map pool has shifted about 5 times since the first Ultimate Battle, thankfully Colosseum, Medusa, Cross Game and such are no longer used (and FS/CB).
Let me understand: in your opinion, ZvP is not the most lopsided matchup there is in Brood War (players and maps aside)?
I never made any such claim, quite the opposite if you look closely; I believe that the current ASL maps are actually quite fair overall in pvz, much more so than say CB, FS, Sylphid would be (and many other maps from previous seasons of ASL). My personal opinion on the subject is that the "standard 4p macro maps" are completely outdated and have been proven imbalanced in several matchups. My overall main point in this thread though was that Best lost because he played incredibly poorly, not because of his race, the mu, or the maps. Which makes sense, seeing that this is in fact the thread for discussing the ASL season 10 Ro8 day 4. I don't know why but people seem to think it's the place to balance whine and ask for patches.
On October 30 2020 06:53 Avi-Love wrote: I think using Ultimate Battle as a sample when you have sponbbang is either ignorant or downright dishonest, why would you not use the bigger and more recent sample size to gauge balance? The map pool has shifted about 5 times since the first Ultimate Battle, thankfully Colosseum, Medusa, Cross Game and such are no longer used (and FS/CB).
Ignorant and dishonest as to what? Well, I was looking at Ultimate Battle because I thought it was a good and reliable set of data to make inferences from. I come from a psychology background where case studies are very popular, and we believe that certain things are needed in making a valid assessment, such as having a large sample size, a representative population that features variation, and parity in testing. Applied to BW and balance talk, this I think means to have a lot of games played by the top players vs other top players on many different maps, with a large and distributed number of games amongst each player/map. And I think Ultimate Battle features this in a nice, controlled setting. But since Spon matches are not exclusively between top players and top players, and feature many a lopsided beatdown by one player on another, usually on whatever the current ASL maps are, there are potential confounds that I think are more serious than any to be found in Ultimate Battle.
Why would you consider that a good or reliable set of data? You would have to be ignorant to consider the rampant use of outdated irrelevant maps and "legacy showmatches" between players that were good 15 years ago reliable or good data sets. Also only a soft science would ever consider ~585? games over a 1+ year span a good sample size of anything. Why don't you look at sponbbang, which has thousands of games played, on actual meta maps, by the absolute best progamers in the world? The answer to that is either you didn't know about it (ignorance) or that data didn't support your ideas (dishonesty). In what universe can you possibly believe that Ultimate Battle is less lopsided than spon games? Spon games are literally played by the absolute best players in Korea; once in a while a lesser pro might get to participate, say someone like Shine, Miso or Killer. But you have to understand that the lower tier players tend to play other lower tier players more so than they do the absolute top; for example, Flash has 41 spon games this month, and his "worst" opponent is either Sharp or Shuttle. To me those are both top ~5(ish) players of their respective races and are infinitely more valid than the abominations in Ultimate Battle (Nada vs Reach, Britney vs Ginyuda, NAl_ra vs tossgirl, Calm vs Horang2, etc.). Lastly, I would also argue that there is a definite trend where if a player or team has already won the match, they will start semi-trolling because the games have to be played out regardless and both players just want to get out of there once the winner is determined, which is obviously bad for any sort of useful statistical analysis.
Now I think everyone should either discuss the game between Best and Zero as this thread was intended for, or you should go make a new post to discuss your great patching or balance ideas for the game. Know that there is not going to be any balance patches though, and good players are 100% okay with that, else they would not still be playing.
I haven't seen anyone in this thread saying that Best played well and lost because ZvP is not balanced; also, Brood War evidently does not need patches.
This doesn't change the fact that ZvP is Brood War's worst matchup balance-wise and that it would probably require the map pool to slightly favor Protoss, not the opposite as the current one does.
Soma and ZerO would have advanced this time in any case, given their opponents' poor performance.
On October 28 2020 20:01 RKC wrote: Interestingly, despite the 2/4 map pick, Best said in the pre-match interview something along the lines that he would likely lose if the series goes all the way.
That implies a few things: the earlier maps are more favourable, reliance on a limited 'bag of builds', lack of confidence to play long, drawn-out macro games over a series.
Best lost the series when his cannon rush failed in G1. You could sense that in his misplays from then on (losing a shuttle for no reason, gambling on 2-gates to counter mutas, etc).
No doubt Best played poorly. The question is what exactly his gameplan was, and why he didn't seem confident playing straight against Zero (despite his commendable record). Any thoughts?
I actually do want to discuss the game itself. But it just got drowned out by all the PvZ imbalance talk.
This thread literally had people making those claims, and at least two people posted that a balance patch is required and suggested changes... I assume you didn't read it very carefully? Also, what do you base your opinion on? If you look at sponbbang's stats in 2020, PvZ is the worst matchup, yes. But if you look at it since the beginning (May 2017, I believe), ZvT is actually at a lower %. I do think it's fairly obvious that balance has shifted over time based on maps and meta-shifts. For example, Benzene was, according to LP, a good map for PvZ when it was used in 2010-2011, whereas now it is the most broken ZvP map in the current map rotation. People used to consider FS perfectly balanced, and now we know that it is not.
Also, I think it's worth having a discussion about the biggest change in mu stats -- there has long been an opinion that the game is balanced via rotational imbalance, meaning z might struggle vs t, but they make up for that by beating p, etc. However, a somewhat recent trend is that terran has been winning both of their matchups, which voids that notion. For example, if you look at race win rates in 2020 rather than specific matchups, you'll get Terran at 52.9%, Zerg at 50.6% and Protoss at 46.1%. If you do the same from May 2020, it's 54.5% for T, 49.7% for Z and 45.8% for P. I guess the "obvious" solution would be to make maps worse for T without making them worse for P/Z (the how is the complex part).
On October 28 2020 20:01 RKC wrote: Interestingly, despite the 2/4 map pick, Best said in the pre-match interview something along the lines that he would likely lose if the series goes all the way.
That implies a few things: the earlier maps are more favourable, reliance on a limited 'bag of builds', lack of confidence to play long, drawn-out macro games over a series.
Best lost the series when his cannon rush failed in G1. You could sense that in his misplays from then on (losing a shuttle for no reason, gambling on 2-gates to counter mutas, etc).
No doubt Best played poorly. The question is what exactly his gameplan was, and why he didn't seem confident playing straight against Zero (despite his commendable record). Any thoughts?
I actually do want to discuss the game itself. But it just got drowned out by all the PvZ imbalance talk.
I have actually thought about the series a lot (although I still didn't bother re-watching it, so I apologize for any numerical mistakes I might make). I think that what most people missed (including the casters) was Best's choice of build in game 1. I remember several people saying that Best made a mistake in going 2 stargates, especially since he didn't produce from both of them. However, the truth is that Best has been playing that build for a while without the cannon rush (quite successfully as well). The idea of the build is to cut as many corners as possible to rush out a citadel and then a second gateway. He will attack with 4-5 zealots that have speed/+1 and a dragoon (which is built to deny scouting / kill free overlords) while then going 2 stargate +1 behind it -- the idea is that the zerg scouts the citadel with his overlord, or the speedlots with lings, which prompts a muta / sunk response. The zealots tend to still do damage, and then by the time the mutas try to do damage they all get destroyed by the corsairs. It's a cool build, and he absolutely destroyed Soulkey with it in one of the Ultimate Battle events. However, the cannon rush delayed his build a lot, and Zero did not opt for 3 hatch spire, but instead went for lair into a 4th hatch -- a build that relies on hydras to defend overlords from sairs, and the hydra den timing makes any 2 gate speedlot attack ineffective with almost no effort. Unfortunately Best put himself behind, and despite scouting the 3h lair, Zero ended up actually skipping the spire entirely -- I don't know whether he did this on purpose (prep) or if he just got a bit lucky; either way it was a rough g1 for Best.
G2 was fairly straightforward. Best rushed a sair out, had full scouting information for 50% of the game, and saw Zero doing 4 hatch lurk/ling with a low drone count (28). He decided to only get 1 cannon at his egg-wall, he went pure goons to defend it, and he rushed out a robo, a citadel, and templar archives while going up to 7 gateways total -- all on 46 probes. Turns out that protoss cannot hold a low eco all-in while greeding on probes himself, having only 1 cannon and getting almost all tech in the game (stargate + robo + observatory + citadel + templar archives). I think he should've gotten a reaver out (probably before an observer), and/or made more cannons at his egg-choke, and/or just stayed on goon tech until he felt safe enough to go for citadel/archives etc. I also feel like him panic-building cannons at his natural's min line did absolutely nothing to hold the attack, as Zero just went around and killed his production.
G3 I mean he did well but at the end of the day the map is pretty bad for pvz, especially if you play zealot heavy and zerg gets to just waltz up to your choke with his lurkers. I liked Zero morphing his lurkers where he did.
On October 29 2020 00:21 TornadoSteve wrote: I don't know if it has been pointed out, but I love the small details in ZerO's game, such as his decision not to scout with his initial overlord in Game 3.
In fact, he stopped it over the choke at his main and waited for Best's scouting probe to spot it, tricking him into reading it as the 2nd overlord hatching when he went for 9 pool.
Not a big move or anything, but it could have been if Best had pulled his probe back to confirm or try to block the hatchery expansion. On this map in particular, I can even see the benefit of delaying your 1st scouting overlord to scout the path with 2 overlords later on. Loving it.
Any thoughts about my earlier post? Is this a common thing at a higher level? I feel like ZerO's mind games are very deep and the guy is still underrated af.
On October 30 2020 06:53 Avi-Love wrote: I think using Ultimate Battle as a sample when you have sponbbang is either ignorant or downright dishonest, why would you not use the bigger and more recent sample size to gauge balance? The map pool has shifted about 5 times since the first Ultimate Battle, thankfully Colosseum, Medusa, Cross Game and such are no longer used (and FS/CB).
Let me understand: in your opinion, ZvP is not the most lopsided matchup there is in Brood War (players and maps aside)?
I never made any such claim, quite the opposite if you look closely; I believe that the current ASL maps are actually quite fair overall in pvz, much more so than say CB, FS, Sylphid would be (and many other maps from previous seasons of ASL). My personal opinion on the subject is that the "standard 4p macro maps" are completely outdated and have been proven imbalanced in several matchups. My overall main point in this thread though was that Best lost because he played incredibly poorly, not because of his race, the mu, or the maps. Which makes sense, seeing that this is in fact the thread for discussing the ASL season 10 Ro8 day 4. I don't know why but people seem to think it's the place to balance whine and ask for patches.
On October 30 2020 06:53 Avi-Love wrote: I think using Ultimate Battle as a sample when you have sponbbang is either ignorant or downright dishonest, why would you not use the bigger and more recent sample size to gauge balance? The map pool has shifted about 5 times since the first Ultimate Battle, thankfully Colosseum, Medusa, Cross Game and such are no longer used (and FS/CB).
Ignorant and dishonest as to what? Well, I was looking at Ultimate Battle because I thought it was a good and reliable set of data to make inferences from. I come from a psychology background where case studies are very popular, and we believe that certain things are needed in making a valid assessment, such as having a large sample size, a representative population that features variation, and parity in testing. Applied to BW and balance talk, this I think means to have a lot of games played by the top players vs other top players on many different maps, with a large and distributed number of games amongst each player/map. And I think Ultimate Battle features this in a nice, controlled setting. But since Spon matches are not exclusively between top players and top players, and feature many a lopsided beatdown by one player on another, usually on whatever the current ASL maps are, there are potential confounds that I think are more serious than any to be found in Ultimate Battle.
Why would you consider that a good or reliable set of data? You would have to be ignorant to consider the rampant use of outdated irrelevant maps and "legacy showmatches" between players that were good 15 years ago reliable or good data sets. Also only a soft science would ever consider ~585? games over a 1+ year span a good sample size of anything. Why don't you look at sponbbang, which has thousands of games played, on actual meta maps, by the absolute best progamers in the world? The answer to that is either you didn't know about it (ignorance) or that data didn't support your ideas (dishonesty). In what universe can you possibly believe that Ultimate Battle is less lopsided than spon games? Spon games are literally played by the absolute best players in Korea; once in a while a lesser pro might get to participate, say someone like Shine, Miso or Killer. But you have to understand that the lower tier players tend to play other lower tier players more so than they do the absolute top; for example, Flash has 41 spon games this month, and his "worst" opponent is either Sharp or Shuttle. To me those are both top ~5(ish) players of their respective races and are infinitely more valid than the abominations in Ultimate Battle (Nada vs Reach, Britney vs Ginyuda, NAl_ra vs tossgirl, Calm vs Horang2, etc.). Lastly, I would also argue that there is a definite trend where if a player or team has already won the match, they will start semi-trolling because the games have to be played out regardless and both players just want to get out of there once the winner is determined, which is obviously bad for any sort of useful statistical analysis.
Now I think everyone should either discuss the game between Best and Zero as this thread was intended for, or you should go make a new post to discuss your great patching or balance ideas for the game. Know that there is not going to be any balance patches though, and good players are 100% okay with that, else they would not still be playing.
No, in Ultimate Battle every win gives the player 100,000 won per game, and additionally players can bet on themselves for double. It makes quite a difference, since players can still earn a lot of money with a losing score. The betting mechanic indeed makes this quite a competitive event; Light and Action, for example, have trained offline for these games.
On October 30 2020 06:53 Avi-Love wrote: I think using Ultimate Battle as a sample when you have sponbbang is either ignorant or downright dishonest, why would you not use the bigger and more recent sample size to gauge balance? The map pool has shifted about 5 times since the first Ultimate Battle, thankfully Colosseum, Medusa, Cross Game and such are no longer used (and FS/CB).
Ignorant and dishonest as to what? Well, I was looking at Ultimate Battle because I thought it was a good and reliable set of data to make inferences from. I come from a psychology background where case studies are very popular, and we believe that certain things are needed in making a valid assessment, such as having a large sample size, a representative population that features variation, and parity in testing. Applied to BW and balance talk, this I think means to have a lot of games played by the top players vs other top players on many different maps, with a large and distributed number of games amongst each player/map. And I think Ultimate Battle features this in a nice, controlled setting. But since Spon matches are not exclusively between top players and top players, and feature many a lopsided beatdown by one player on another, usually on whatever the current ASL maps are, there are potential confounds that I think are more serious than any to be found in Ultimate Battle.
You're one of the more balanced posters here. Best to ignore the negativity. Keep it up!
The same side who earlier said "Thank you for pushing the discussion positively with well thought out posts" now accuses you of being "either ignorant and downright dishonest".
It's amazing how quickly tides can shift. I guess "pushing the discussion positively" only happens when we're pushing the discussion towards one side.
Thanks, appreciate it. I don't feel like there's been any real negativity. I think everyone's just trying to share their opinions; maybe it comes off the wrong way, but I don't think anyone's got bad intentions. Myself, I've always been interested in this issue, so I'm hoping to get a better understanding of it, and I hope others share their thoughts.
On October 30 2020 06:53 Avi-Love wrote: I think using Ultimate Battle as a sample when you have sponbbang is either ignorant or downright dishonest, why would you not use the bigger and more recent sample size to gauge balance? The map pool has shifted about 5 times since the first Ultimate Battle, thankfully Colosseum, Medusa, Cross Game and such are no longer used (and FS/CB).
Let me understand: in your opinion, ZvP is not the most lopsided matchup there is in Brood War (players and maps aside)?
I never made any such claim, quite the opposite if you look closely; I believe that the current ASL maps are actually quite fair overall in pvz, much more so than say CB, FS, Sylphid would be (and many other maps from previous seasons of ASL). My personal opinion on the subject is that the "standard 4p macro maps" are completely outdated and have been proven imbalanced in several matchups. My overall main point in this thread though was that Best lost because he played incredibly poorly, not because of his race, the mu, or the maps. Which makes sense, seeing that this is in fact the thread for discussing the ASL season 10 Ro8 day 4. I don't know why but people seem to think it's the place to balance whine and ask for patches.
On October 30 2020 06:53 Avi-Love wrote: I think using Ultimate Battle as a sample when you have sponbbang is either ignorant or downright dishonest, why would you not use the bigger and more recent sample size to gauge balance? The map pool has shifted about 5 times since the first Ultimate Battle, thankfully Colosseum, Medusa, Cross Game and such are no longer used (and FS/CB).
Ignorant and dishonest as to what? Well, I was looking at Ultimate Battle because I thought it was a good and reliable set of data to make inferences from. I come from a psychology background where case studies are very popular, and we believe that certain things are needed in making a valid assessment, such as having a large sample size, a representative population that features variation, and parity in testing. Applied to BW and balance talk, this I think means to have a lot of games played by the top players vs other top players on many different maps, with a large and distributed number of games amongst each player/map. And I think Ultimate Battle features this in a nice, controlled setting. But since Spon matches are not exclusively between top players and top players, and feature many a lopsided beatdown by one player on another, usually on whatever the current ASL maps are, there are potential confounds that I think are more serious than any to be found in Ultimate Battle.
Why would you consider that a good or reliable set of data? You would have to be ignorant to consider the rampant use of outdated irrelevant maps and "legacy showmatches" between players that were good 15 years ago reliable or good data sets. Also only a soft science would ever consider ~585? games over a 1+ year span a good sample size of anything. Why don't you look at sponbbang, which has thousands of games played, on actual meta maps, by the absolute best progamers in the world? The answer to that is either you didn't know about it (ignorance) or that data didn't support your ideas (dishonesty). In what universe can you possibly believe that Ultimate Battle is less lopsided than spon games? Spon games are literally played by the absolute best players in Korea; once in a while a lesser pro might get to participate, say someone like Shine, Miso or Killer. But you have to understand that the lower tier players tend to play other lower tier players more so than they do the absolute top; for example, Flash has 41 spon games this month, and his "worst" opponent is either Sharp or Shuttle. To me those are both top ~5(ish) players of their respective races and are infinitely more valid than the abominations in Ultimate Battle (Nada vs Reach, Britney vs Ginyuda, NAl_ra vs tossgirl, Calm vs Horang2, etc.). Lastly, I would also argue that there is a definite trend where if a player or team has already won the match, they will start semi-trolling because the games have to be played out regardless and both players just want to get out of there once the winner is determined, which is obviously bad for any sort of useful statistical analysis.
Now I think everyone should either discuss the game between Best and Zero as this thread was intended for, or you should go make a new post to discuss your great patching or balance ideas for the game. Know that there is not going to be any balance patches though, and good players are 100% okay with that, else they would not still be playing.
The legacy showmatches were but a small fraction of the entire sample, about 50 games. So there's really no need to focus on them as if they were representative of the event as a whole. If anything, you run the danger of misrepresenting it and being dishonest. You can take them out if you don't find them valuable, it will not make a significant difference to the numbers. You'll still have 535 games featuring the best against the best. To be statistically significant, that is more than enough. You don't need thousands of games (where are you seeing this anyway, I'd appreciate a link), especially when those thousands of games consist of disproportionate representation amongst the players and have quality of matchup issues. Lower tier players playing lower tier players, and sometimes going up against higher tier. This is something you've identified, but I think we've overlooked its effect on the map stats.
Let's say you want to look at PvT win rates on some map. I think it's important to consider whose wins and losses have contributed to those percentages, because to me there looks to be a disproportion. Look at Best, one of the best PvTers. He's played 59 games this month. But then other guys with worse PvT like Stork and Shuttle have played 72 and 96 PvTs, respectively. Their games are being counted more than Best's in the overall map stats. That is the problem I see with Spon's data: some players' wins/losses affect the overall records more, simply because they play more Spon matches, and these players are not always the best. The wins/losses of less-than-top players are not useful in determining balance.
Then there are quality-of-matchup issues. You see that almost half of Best's 59 games are from just two players - Sharp and Rush. Now sure, Sharp and Rush can be considered top Terrans, they were in the UB as well. But playing half your games against them? Ideally, you should play an equal number of games against all top players for the win rates to be more accurate, lest the win rates be padded with too many lengthy, one-sided matchups. But unfortunately that's what oftentimes happens: one guy plays another guy a ton, and the results are lopsided, like 5-16, and then he'll have games vs guys not in his league at all. These kinds of games are also not useful to determine balance, but nevertheless, it all ends up going into the percentages, and to my knowledge there's no way of filtering it out. 34 of Stork's 72 PvTs are against Shinee, Leta, Jyj, Sea, Barracks, herO (an amateur? Never heard of him), organ, and Scan. Not the cream of the crop. You can basically say half of Stork's contribution to the PvT record is not useful and is distorting the useful information. Best has 7 games vs Shinee and Sea. Only 3 against Flash. Stork has 37 games against Action and Hero, going 9-28. What do you think this is doing to the map stats? So unless you have some way of filtering these issues, I don't think Spon stats are that useful.
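The weighting concern can be illustrated with a rough Python sketch. The PvT game counts (59, 72, 96) are the ones quoted in the post; the win counts are invented placeholders purely for illustration and are not real Spon results. The function names are hypothetical as well. The sketch compares a pooled win rate, where every game counts once, with a per-player average, where every player counts once.

```python
# PvT game counts are the ones quoted in the post; the win counts are
# invented placeholders for illustration, NOT real Spon results.
records = {
    "Best": {"games": 59, "wins": 35},
    "Stork": {"games": 72, "wins": 33},
    "Shuttle": {"games": 96, "wins": 43},
}


def pooled_win_rate(recs):
    """Every game counts once, so players with more games dominate."""
    total_wins = sum(r["wins"] for r in recs.values())
    total_games = sum(r["games"] for r in recs.values())
    return total_wins / total_games


def per_player_win_rate(recs):
    """Every player counts once, however many games they played."""
    rates = [r["wins"] / r["games"] for r in recs.values()]
    return sum(rates) / len(rates)


print(f"pooled:     {pooled_win_rate(records):.3f}")
print(f"per player: {per_player_win_rate(records):.3f}")
```

With placeholder numbers like these, the pooled figure sits below the per-player figure because the players with more games also have the lower win rates. Whether that holds for the real data depends on the actual records, which the post does not give.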
Also, where did you get this idea that the players start "semi-trolling?" There's money involved in the UB. Why would anyone troll? And "my great patching or balance ideas?" I never advocated for a patch, and I don't. So try not to call others "ignorant" or "dishonest," lest you look like a hypocrite.
On October 29 2020 21:04 Avi-Love wrote: Honestly I'm starting to get really annoyed with the idiocy and misinformation in this thread, are you guys just completely out of touch with reality? Do you not follow the scene? Do you not understand the game at all?
First of all Flash's pvz has nothing to do with fucking reavers, he plays the most stock standard sair/zeal attack into zeal/ht into 8 gate and/or exp, he's been doing this for weeks. He actually had a lot of success, especially on Ringing Bloom, where he would consistently do well against the very best zergs -- it does seem like Zero started figuring out how to counter his style, though, implementing a lot of big drop (counter drop / doom drop) play with hydras. There were a couple of funny games where Zero would also drop drones and start manner hatcheries in the middle of Flash's main.
Second of all there is absolutely no need for any sort of patch, if you think there is you're blind to the evolution of the game. Sc:bw is never going to be patched, any and all need for balance changes is handled via maps, which gives more than enough room to tinker with things -- if you don't believe that, just look at how a lot of maps have completely changed the meta and mu balance throughout the ASL. Sparkle turned all of the matchups on their head, Ringing Bloom has made it more or less impossible to do 3hh, Plasma is the best map for protoss since Third World, etc etc etc.
Thirdly, this map pool is NOT "super broken" or "impossible for pvz". Since Jan 2020 the win rates on the ASL maps are as follows: Polypoid 46.7%, Eclipse 45.1%, Optimizer 47.4%, Ringing Bloom 52.2%, Benzene 37.7%, Shakuras Temple 47.7% (Spon has two of them, I took the one with the most games, I'm too lazy to merge them), Plasma 67.9%.
My quick calculator potato math gives me an average PvZ win rate of 49.24% (I also checked since July, for a more recent, but smaller sample size, and the number ends up at 49.64%). Granted, both Benzene and Plasma have low-ish game counts and I suspect that if you were to do a weighted calculation where you also took into account the number of games played, it would be a bit worse for protoss. But overall this map pool is *not* super imbalanced, nor is it the reason there is no protoss in the top 4. A FS/CB/Sylphid/Escalade type of map pool would be way closer to 40/60 than this, and would actually be potentially imbalanced, in my opinion.
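The two averages mentioned here can be sketched in a few lines of Python. The win rates are the ones listed in the post; the per-map game counts are hypothetical placeholders (the post gives none), chosen only so that Benzene and Plasma have the fewest games, as the post says they do.

```python
# PvZ win rates (%) as listed in the post above.
win_rates = {
    "Polypoid": 46.7,
    "Eclipse": 45.1,
    "Optimizer": 47.4,
    "Ringing Bloom": 52.2,
    "Benzene": 37.7,
    "Shakuras Temple": 47.7,
    "Plasma": 67.9,
}

# HYPOTHETICAL game counts, not real Spon numbers. They only encode the
# post's remark that Benzene and Plasma have comparatively few games.
game_counts = {
    "Polypoid": 900,
    "Eclipse": 700,
    "Optimizer": 500,
    "Ringing Bloom": 600,
    "Benzene": 150,
    "Shakuras Temple": 300,
    "Plasma": 100,
}


def simple_average(rates):
    """Unweighted mean: every map counts equally."""
    return sum(rates.values()) / len(rates)


def weighted_average(rates, counts):
    """Mean weighted by games played: every game counts equally."""
    total_games = sum(counts[m] for m in rates)
    return sum(rates[m] * counts[m] for m in rates) / total_games


simple = simple_average(win_rates)
weighted = weighted_average(win_rates, game_counts)
print(f"simple average:   {simple:.2f}%")
print(f"weighted average: {weighted:.2f}%")
```

The simple average reproduces the 49.24% figure. With the placeholder counts the weighted figure lands lower, in line with the post's suspicion, but the real value depends entirely on the actual game counts.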
Lastly, I honestly thought it would be painfully obvious for everyone watching that Best lost because he played badly, showed up with a ton of nerves and probably got tilted after his absolute failure to execute his own build in game 1. Best didn't lose because of the maps, or because of the match ups -- we know for a fact that he actually performs really well against Zero, and in particular he does so on these very maps. The mental gymnastics required to consider 3 games played on one day, in a high pressure LAN situation, a better sample than their individual games played over a span of 3 months are absolutely breathtaking. How can you be that delusional? And yeah Snow lost too, to a player he has been losing to consistently, on a wide variety of maps (map pools spanning several ASL/KSLs). People seem to also forget that both Snow and Best won PvZ games against top tier opponents (that they normally lose to) to even get to the ro8 in the ASL -- did you guys just forget, or does protoss winning against good zergs while being underdogs not fit into your narrative, so you choose to ignore it? (Since July, Best is 10-16 vs Action and Snow is 21-38 vs Hero in spon games)
Avi-Love gets sucked into the toilet. Shame on you..
On October 30 2020 06:53 Avi-Love wrote: I think using Ultimate Battle as a sample when you have sponbbang is either ignorant or downright dishonest, why would you not use the bigger and more recent sample size to gauge balance? The map pool has shifted about 5 times since the first Ultimate Battle, thankfully Colosseum, Medusa, Cross Game and such are no longer used (and FS/CB).
Ignorant and dishonest as to what? Well, I was looking at Ultimate Battle because I thought it was a good and reliable set of data to make inferences from. I come from a psychology background where case studies are very popular, and we believe that certain things are needed in making a valid assessment, such as having a large sample size, a representative population that features variation, and parity in testing. Applied to BW and balance talk, this I think means to have a lot of games played by the top players vs other top players on many different maps, with a large and distributed number of games amongst each player/map. And I think Ultimate Battle features this in a nice, controlled setting. But since Spon matches are not exclusively between top players and top players, and feature many a lopsided beatdown by one player on another, usually on whatever the current ASL maps are, there are potential confounds that I think are more serious than any to be found in Ultimate Battle.
You're one of the more balanced posters here. Best to ignore the negativity. Keep it up!
The same side who earlier said "Thank you for pushing the discussion positively with well thought out posts" now accuses you of being "either ignorant or downright dishonest".
It's amazing how quickly tides can shift. I guess "pushing the discussion positively" only happens when we're pushing the discussion towards one side.
Thanks, appreciate it. I don't feel like there's been any real negativity, I think everyone's just trying to share their opinions, maybe it comes off the wrong way, but I don't think anyone's got bad intentions. Myself, I've always been interested in this issue so I'm hoping to get a better understanding of it, and I hope others share their thoughts.
On October 30 2020 06:53 Avi-Love wrote: I think using Ultimate Battle as a sample when you have sponbbang is either ignorant or downright dishonest, why would you not use the bigger and more recent sample size to gauge balance? The map pool has shifted about 5 times since the first Ultimate Battle, thankfully Colosseum, Medusa, Cross Game and such are no longer used (and FS/CB).
Let me understand, in your opinion ZvP is not the most lopsided matchup there is in Brood War (players and maps aside)?
I never made any such claim, quite the opposite if you look closely; I believe that the current ASL maps are actually quite fair overall in pvz, much more so than say CB, FS, Sylphid would be (and many other maps from previous seasons of ASL). My personal opinion on the subject is that the "standard 4p macro maps" are completely outdated and have been proven imbalanced in several matchups. My overall main point in this thread though was that Best lost because he played incredibly poorly, not because of his race, the mu, or the maps. Which makes sense, seeing that this is in fact the thread for discussing the ASL season 10 Ro8 day 4. I don't know why but people seem to think it's the place to balance whine and ask for patches.
On October 30 2020 08:45 Light- wrote:
On October 30 2020 06:53 Avi-Love wrote: I think using Ultimate Battle as a sample when you have sponbbang is either ignorant or downright dishonest, why would you not use the bigger and more recent sample size to gauge balance? The map pool has shifted about 5 times since the first Ultimate Battle, thankfully Colosseum, Medusa, Cross Game and such are no longer used (and FS/CB).
Ignorant and dishonest as to what? Well, I was looking at Ultimate Battle because I thought it was a good and reliable set of data to make inferences from. I come from a psychology background where case studies are very popular, and we believe that certain things are needed in making a valid assessment, such as having a large sample size, a representative population that features variation, and parity in testing. Applied to BW and balance talk, this I think means to have a lot of games played by the top players vs other top players on many different maps, with a large and distributed number of games amongst each player/map. And I think Ultimate Battle features this in a nice, controlled setting. But since Spon matches are not exclusively between top players and top players, and feature many a lopsided beatdown by one player on another, usually on whatever the current ASL maps are, there are potential confounds that I think are more serious than any to be found in Ultimate Battle.
Why would you consider that a good or reliable set of data? You would have to be ignorant to consider the rampant use of outdated irrelevant maps and "legacy showmatches" between players that were good 15 years ago reliable or good data sets. Also only a soft science would ever consider ~585? games over a 1+ year span a good sample size of anything. Why don't you look at sponbbang that has thousands of games played, on actual meta maps, by the absolute best progamers in the world? The answer to that is either you didn't know about it (ignorance) or that data didn't support your ideas (dishonest). In what universe can you possibly believe that ultimate battle is less lopsided than spongames? Spongames are literally played by the absolute best players in Korea; once in a while a lesser pro might get to participate, say someone like Shine, Miso or Killer. But you have to understand that the lower tier players tend to play other lower tier players more so than they do the absolute top; for example, Flash has 41 spon games this month, his "worst" opponent is either Sharp or Shuttle. To me those are both top ~5(ish) players of their respective races and are infinitely more valid than the abominations in Ultimate Battle (Nada vs Reach, Britney vs Ginyuda, NAl_ra vs tossgirl, Calm vs Horang2, etc etc.) Lastly, I would also argue that there is a definite trend where if a player or team has already won the match, they will start semi-trolling because the games have to be played out regardless and both players just want to get out of there once the winner is determined, which is obviously bad for any sort of useful statistical analysis.
Now I think everyone should either discuss the game between Best and Zero as this thread was intended for, or you should go make a new post to discuss your great patching or balance ideas for the game. Know that there are not going to be any balance patches though, and good players are 100% okay with that, else they would not still be playing.
The legacy showmatches were but a small fraction of the entire sample, about 50 games. So there's really no need to focus on them as if they were representative of the event as a whole. If anything, you run the danger of misrepresenting it and being dishonest. You can take them out if you don't find them valuable, it will not make a significant difference to the numbers. You'll still have 535 games featuring the best against the best. To be statistically significant, that is more than enough. You don't need thousands of games (where are you seeing this anyway, I'd appreciate a link), especially when those thousands of games consist of disproportionate representation amongst the players and have quality of matchup issues. Lower tier players playing lower tier players, and sometimes going up against higher tier. This is something you've identified, but I think we've overlooked its effect on the map stats.
Okay this is my last time posting, you're absolutely beyond help and so delusional it hurts. First you say that legacy showmatches that make up more than 10% of your "great sample" make no difference, I know psychology is a very soft science, but come on my dude are you joking? And honestly, where do I find thousands of games? SPONBBANG. Where only players playing spon games are ranked, all of whom are progamers or at the very least absolute top amateurs -- all infinitely higher "quality" players than the jokers that played in the legacy ultimate battle events. Furthermore, how can you whine about this "disproportionate representation amongst the players" when you don't even know where to find the fucking statistics? Are you just making blind assumptions? Do you think spon games are between C ranks? Then you go on to complain about Best playing vs Sharp and Rush? They are literally some of the best terran players in the world, and yeah spon games are often determined by popularity and/or a need to practice a certain matchup (or even map). But here's the great thing, when you have thousands of games it's not going to be two people playing the same two people. But I mean luckily you brought up this issue yourself, complaining about Best vs Rush and Sharp specifically, (10-7 vs Rush, 8-3 vs Sharp) but completely neglect to mention that he also played Flash (2-1), Sorry (5-3) and Light (7-6)? These are all the very best terran players, and ALL OF THEM ARE SIGNIFICANT FOR STATISTICS.
I genuinely don't think you understand how close all progamers are in skill, you even insinuate that Stork is a much better player than Leta and JyJ when in reality they both made it further in this very ASL than he did. So no, that contribution is NOT useless nor does it distort the "useful information". Furthermore, you also decided to insinuate that Piano is a lot worse than Stork -- the very same Piano that lost 2-7 in Ultimate Battle? So when he plays Stork it's a bad sample, but when he plays Hero it's a good one? Funny considering Hero is a much better player than Stork, so "Ok" buddy. You really need to understand that it evens out, and that's why we get consistent results over large sample sizes, it's why we know for a fact that certain maps are more balanced than others by looking at spon results. It's not perfect, but it is by far the best that we have.
I think the bottom line is that your science is simply wrong here, you don't seem to recognize what actually makes a useful brood war sample and what does not. The main issue is that you think 535 is a "great" sample size; it's not. I'm sure a mathematician or statistician can elaborate on that, but I am neither. Furthermore, you seem to think that a sample size of ~585 spread over 49 *YES, FORTY NINE* different maps is a good thing (?). You also seem to think that a sample that's over, what, 1.5 years old is good? Come on, look at sponstats, you have 2582 games played on Polypoid in the last 4 months alone. That's *one* map, not 49. But okay man, keep living in your fantasy world where you think old games played on Gaema Gowon, Nostalgia, Jade, Jim Raynor's Memory, Autobahn, Cross Game, Reverse Temple, Silent Vortex and Tres Pass are good to determine the current map/mu balance. Personally I think it's ridiculous.
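Since the argument keeps coming back to sample size, here is a rough Python sketch of how much uncertainty different game counts carry. It uses the textbook normal-approximation margin of error for a proportion at 95% confidence, and it assumes games are independent and the true win rate is near 50%. Both are simplifications chosen for this sketch, not something either poster stated, and the approximation is poor for very small samples such as a 3-game set.

```python
import math


def margin_of_error(n_games, win_rate=0.5, z=1.96):
    """Approximate 95% margin of error for an observed win rate.

    Normal approximation to the binomial; assumes independent games.
    win_rate=0.5 is the worst case and gives the widest margin.
    """
    return z * math.sqrt(win_rate * (1 - win_rate) / n_games)


# Game counts mentioned in the thread: a 3-game set, the ~535-game
# Ultimate Battle sample, and 2582 games on Polypoid.
for n in (3, 535, 2582):
    moe = margin_of_error(n)
    print(f"{n:>5} games: +/- {moe * 100:.1f} percentage points")
```

Under these assumptions, 535 games pin down a single win rate to within about 4 percentage points and 2582 games to within about 2. Note that the 535 figure covers every matchup and map combined, so any per-matchup or per-map slice of it has a wider margin.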
The problem with stats is that they completely ignore the quality of players. Flash and Light have already proven to be better with protoss than most protoss mains. So you would have to remove them from the terran stats. I don't think using stats in a game like starcraft will create the most meaningful result. The difference in the balance of the maps and quality of players is just way too vast.
Clearly you're interpreting the data differently than I am, so we'll agree to disagree. Our exchange is going nowhere, nothing I try to explain seems to get through to you. You're still going on about old maps being used when they were only used in 10 or fewer games for those 5 sets of legacy matches, but apparently 5 series out of 60 has a big impact or represents the whole. Yet Spon stats including thousands of games from the lowest tier to the highest is apparently not problematic. Then you admit you're not a mathematician or statistician. Clearly not, with statements like "535 is a great sample size, it's not" while a 3-game set against Flash is "SIGNIFICANT FOR STATISTICS." Well, you might be interested to know every psychologist is required to study statistics.
To me, you just seem to cherry pick and misrepresent data to fit your beliefs, and that's a shame. So go ahead and don't post anymore, I'd rather not have to deal with someone who calls people ignorant and dishonest yet resorts to insults and false claims.
On October 30 2020 23:02 Essbee wrote: The problem with stats is that they completely ignore the quality of players. Flash and Light have already proven to be better with protoss than most protoss mains. So you would have to remove them from the terran stats. I don't think using stats in a game like starcraft will create the most meaningful result. The difference in the balance of the maps and quality of players is just way too vast.
Agreed. If we want to try to evaluate balance statistically we need to control for as many variables as possible, and so it's imperative to control for player quality by only looking at the data from the matches between the best players. Likewise, I believe map balance can be controlled for by taking games from a significant number (close to 30 or more) of maps to remove the effect of map imbalance. Random sampling is important because the effects of any one or two forces are drowned out by the noise of all the different things. If a pattern still emerges even after taking a statistically significant random sample, then it stands to reason that there is a fundamental effect going on. The hard truth for some to accept is that, throughout BW's history, ZvP has always shown the largest gap in the numbers. But as I said before, the jury is still out, the gap is not egregious, and it may very well be due to gaps in player skill.