|
It would be good to call it something other than a win rate, say 'balance indicator' or 'strength indicator'. Otherwise you will always get flak for it not actually being a true win rate.
I called it adjusted win/rates, which is exactly what it is. Not sure whether that's worse or better than 'balance indicator'.
It's taking a data set from a total population group (pro games) then weighting the wins of different races more heavily depending on the bias of the adjustor. Politicians do it all the time, totally "legit." Hark, what baseball through yonder window breaks?
Have you actually read the methodology or studied the graph? Terran had an adjusted win/rate of over 70% in early WOL!!! Good luck defending that as race bias, when the graph actually shows that all races have been OP at different times.
Could you link to the comment where you explain your method?
Well, you can find the comments and my thoughts on some of the numbers it presents here: http://www.teamliquid.net/forum/starcraft-2/255254-designated-balance-discussion-thread?page=1079
As I said previously, it's not a perfect method. Compared to the China-example, we do not have all the required information to make the best possible adjustment. But I don't think that's an argument for not coming up with any solution at all.
|
You describe your assumptions in a very vague and qualitative manner. Please provide more specific details so we can evaluate your model.
|
On September 14 2014 01:44 Salient wrote: You describe your assumptions in a very vague and qualitative manner. Please provide more specific details so we can evaluate your model.
Allow me to elaborate. I haven't put up the exact formula until now, because it's not very relevant to the assessment of whether the adjusted win/rates are useful or not. Instead, the usefulness is related to the qualitative perspective: does it make sense historically? If you go for a purely quantitative approach then you need a lot more data (Aligulac actually has that data available and could make much more accurate adjusted balance win/rates if they chose to).
The exact formula for adjusting win/rates is this: Naked win/rates in XvY * (A+1)/(B+1)
Where:
A = Percentage of all non-mirror matchups that has race Y in it
B = Percentage of all non-mirror matchups that has race X in it
This formula obviously doesn't tell you that much in itself, but compare it to the alternatives. What do you get out of looking at naked win/rates? Not a lot. What about looking at both win/rates and the distribution metric qualitatively every single month? That was what I previously did (for 2 years), and I believe it gives a more comprehensive view, but it has two disadvantages:
(1) More time-consuming (than just looking at one number)
(2) More impacted by race bias.
For instance, let's say you are a terran player and see that TvZ win/rates are 54% in Month 1, with games played by terran = 58% and games played by zerg = 64%. You might reason that balance looks decent here.
But then a couple of months later, you see that TvP win/rates = 45%, with games played by T = 62% and games played by P = 58%. You may reason here that balance is absolutely terrible, even though it mirrors the numbers from Month 1 relatively closely.
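To make the formula and the two hypothetical months above concrete, here is a minimal Python sketch. It assumes A and B are entered as fractions (0.58 rather than 58), which the formula as posted does not spell out, and the function name `adjusted_win_rate` is made up for illustration.

```python
def adjusted_win_rate(naked, share_opponent, share_own):
    """Adjusted win rate for race X in XvY: naked * (A + 1) / (B + 1).

    naked          -- naked XvY win rate as a fraction (0.54 for 54%)
    share_opponent -- A: share of non-mirror games that include race Y
    share_own      -- B: share of non-mirror games that include race X
    Assumption: shares are fractions, not whole-number percentages.
    """
    return naked * (share_opponent + 1) / (share_own + 1)


# Month 1: TvZ 54%, terran in 58% of games, zerg in 64%
month1 = adjusted_win_rate(0.54, share_opponent=0.64, share_own=0.58)
# Later month: TvP 45%, terran in 62% of games, protoss in 58%
later = adjusted_win_rate(0.45, share_opponent=0.58, share_own=0.62)

print(f"Month 1 TvZ adjusted: {month1:.1%}")
print(f"Later TvP adjusted:   {later:.1%}")
```

Under that assumption the two months come out at roughly 56% and 44%, so about the same distance from 50% in opposite directions.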
Thus, I simply see this approach as having the advantage of being consistent. No month-to-month "bias", but rather an approach that can show the change of balance over time. If someone can come up with an improved (consistent) model that takes more factors into account (I am looking at Aligulac here), I would be very happy to see that.
|
|
Yeah I think Terran is somewhat overpowered, but that's just my personal opinion. It's much like how that Fall of Terran editorial you wrote was just your opinion rather than the official position of TL, and you admitted that you were biased.
Hider and I are talking about statistical modeling rather than just opinions. So I'm willing to give the benefit of the doubt to the other side -- regardless of my personal bias or belief -- because that's how big boys debate. He could be right. In which case, I would like to see his model used even if it disproves my beliefs. That's because I ultimately care about the game more than my own partisan position.
|
On September 14 2014 02:12 Salient wrote: Yeah I think Terran is somewhat overpowered, but that's just my personal opinion.
Meanwhile, when Protoss was winning everything for months in the first half of 2014, things were fine and Terran in particular had no problem because TaeJa could beat tier3 Protoss in weekend tournaments. Those hilarious double standards of yours still don't prevent you from talking about "bias," even putting some words into my mouth in passing (where did I "admit" anything exactly?). What wouldn't you be ready to do in order to try to save the last vanishing pieces of your credibility, I wonder...
|
On September 14 2014 02:20 TheDwf wrote:
On September 14 2014 02:12 Salient wrote: Yeah I think Terran is somewhat overpowered, but that's just my personal opinion.
Meanwhile, when Protoss was winning everything for months in the first half of 2014, things were fine and Terran in particular had no problem because TaeJa could beat tier3 Protoss in weekend tournaments. Those hilarious double standards of yours still don't prevent you from talking about "bias," even putting some words into my mouth in passing (where did I "admit" anything exactly?). What wouldn't you be ready to do in order to try to save the last vanishing pieces of your credibility, I wonder...
Historical imbalance (real or not) is irrelevant to a debate about the specifics of Hider's statistical model. Your vitriol is misplaced.
|
Hider and I are talking about statistical modeling rather than just opinions. So I'm willing to give the benefit of the doubt to the other side -- regardless of my personal bias or belief -- because that's how big boys debate. He could be right. In which case, I would like to see his model used even if it disproves my beliefs. That's because I ultimately care about the game more than my own partisan position.
Regarding the specific issue you brought up (terran win/rates in HOTS): that's probably an instance where the model results may surprise some people.
I think most people first realized the terran UP'ness a couple of months into 2014, but the model estimated that terran gradually became more and more UP vs toss starting after the summer of 2013, and that Z was already a bit favored vs terran before the Widow Mine nerf.
While the latter seemed like a ridiculous statement at the time, was it - in hindsight - really such a dumb estimate? Since November 2013 terran has benefited from Banshee buffs, Hellbat buffs, a Thor buff and a Siege Tank buff (though less significant). Thus, if we assume that TvZ is relatively balanced now, the story of terran being (slightly) UP before the Widow Mine nerf actually makes sense.
On page 1079 (or 1080) I discussed some of the possible explanations for why terran was perceived (amongst the general community) to be in a better position than the numbers suggested by the adjusted win/rates.
In the end it comes down to terran suffering more than protoss during the patch-zerg, which wasn't a very well-known story.
Rather, the story was that Z was imba and P/T suffered. But while terrans fell off from the competitive scene, protosses didn't, and given the assumptions of the model (no structural reasons to expect a difference), terran would have to go through several months of win/rates above 50% vs both P and Z in order for the competitive terrans to establish themselves once again.
However, that never occurred. Protoss instead slowly figured out that they benefited quite a lot from the MSC, which allowed them to do much greedier builds than in WOL. As a consequence, protoss win/rates improved vs terran while the opposite needed to be the case (terran had to be buffed in HOTS relative to WOL vs toss).
The issue here is that the whole "adjustment process" - where win/rates need to be above 50% for a period - is quite unintuitive, and I think neither Blizzard nor the community understands this concept. As a consequence, it was believed that balance was fine in mid/late 2013.
So that's basically the (likely) explanation for why the model recognizes the UP'ness much faster than the general public did.
|
My understanding of the situation is the following: one needs to make concrete assumptions about what the actual balance in the respective matchups is, and predict the most likely outcome of race-wise tournament results based on the current player pool. The better the predictions are, the better the assumptions were. This of course is reverse-engineered mathematically, feeding tournament results into a complex mathematical formula in order to find the assumptions on race-wise balance which best support the overall tournament results. Of course the more data one has, the more accurate the assumptions will become.
I believe that Blizzard does use sophisticated methods of determining their (so-called) weighted win-rates in order to make conclusions on balance, both with ladder data and tournament data. There are structural problems with both the tournament and the ladder format as an indicator of balance in terms of pure win-rates. The problems arise in tournaments from the simple reason that it's only the ones who qualify that get to play, and only the winners who advance (like it should be). The problem with the ladder is of course that you are matched with an equally skilled opponent, which makes pure win-rates a pretty useless indicator of balance.
There are many other complications too. The balance is not time-constant due to metagame changes and patches, so this possibly needs to be accounted for as well. In addition, the balance is not necessarily constant at all skill levels.
So there are many reasons why one needs to "manipulate" the data in a certain way to make useful assessments of balance.
I personally believe that a good, simple and easy indicator of balance is the race distribution in, say, top 16 in WCS tournament formats. This has a lot of faults too, but I think much of it is intrinsically accounted for in this simplistic model, such that it can't be viewed as "skewed" data in any significant way.
|
Sorry to sound so ignorant, but... the day people realize that the truth behind all this was that most of the best mechanical players of SC2 happened to be Terrans, they will then realize that the "imba" win ratios were a consequence of "imba" population-skill distributions in the races, and consequently some of the most annoying statistical-based arguments will end.
It's a freaking game, and if I gave you the option to play the most handsome, badass looking character compared to some dirtbag useless imbecile, you would most likely pick the former. Not to say the Terran race happens to go by the standards of beauty as just described, but there definitely are some psychological aspects of humans, especially kids, coming into play when deciding what race to play.
|
On September 14 2014 03:06 emidanRKO wrote: It's a freaking game, and if I gave you the option to play the most handsome, badass looking character compared to some dirtbag useless imbecile, you would most likely pick the former. Not to say the Terran race happens to go by the standards of beauty as just described, but there definitely are some psychological aspects of humans, especially kids, coming into play when deciding what race to play.
There may be something to this at the absolute beginner level, but for pro players who practice all day it's pretty much nonsense to think this way.
|
I believe that Blizzard does use sophisticated methods of determining their (so-called) weighted win-rates in order to make conclusions on balance, both with ladder data and tournament data.
I tried to read and watch everything I could about how Blizzard approaches the statistical side and I even asked David Kim in one of his Q&As (no answer unfortunately). But I think a lot of their comments indicate that they do not have a sound way of modelling win/rates. I actually remember Dustin Browder saying at the end of the WOL beta, when siege tanks did 60 damage and the game was played on T-favored maps, that win/rates were around 50/50. No one at that time (that played the game) could make sense of that number at all.
While Blizzard does adjust win/rates on the ladder by taking into account expected win/rates, the methodology is flawed. The expected win/rate is based on MMR. If one player has a very high MMR, he will be favored over a player with a lower MMR. However, if the latter consistently wins, then it's - according to Blizzard's model - a sign that the race of the latter is favored.
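As a rough illustration of that kind of MMR-based adjustment, here is a small Python sketch. Blizzard's actual formula is not given anywhere in this thread, so the Elo-style curve, the 400-point scale, the MMR values and every name below are assumptions, not their method. It compares actual wins to the wins expected from the rating gap and sums the difference per race:

```python
def expected_win(mmr_a, mmr_b, scale=400.0):
    """Elo-style chance that player A beats player B (assumed curve, not Blizzard's)."""
    return 1.0 / (1.0 + 10 ** ((mmr_b - mmr_a) / scale))


def race_surplus(games):
    """Sum of (actual - expected) wins per race.

    games: list of (race_a, mmr_a, race_b, mmr_b, a_won) tuples.
    A positive surplus means the race wins more than its MMR predicts.
    """
    surplus = {}
    for race_a, mmr_a, race_b, mmr_b, a_won in games:
        exp_a = expected_win(mmr_a, mmr_b)
        result_a = 1.0 if a_won else 0.0
        surplus[race_a] = surplus.get(race_a, 0.0) + (result_a - exp_a)
        surplus[race_b] = surplus.get(race_b, 0.0) + (exp_a - result_a)
    return surplus


# Hypothetical games: the lower-rated protoss keeps beating the higher-rated terran
games = [("T", 2200, "P", 2000, False)] * 3
print(race_surplus(games))
```

In this toy input the protoss side ends up with a positive surplus, which a model of this kind would read as the race being favored, regardless of why the MMR gap existed in the first place.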
This adjustment approach is pretty meaningless in depicting balance issues. I am also convinced that David Kim saw my question in the Q&A, and if he knew the answer to my question, he would surely have responded. Instead, it's more likely they just hired one math guy who knew how to make equations, but lacked an understanding of how to create useful statistical adjustments.
I believe if they really had created useful adjusted win/rates, then the win/rates which David Kim occasionally publishes would deviate more from 50/50. However, they always seem to be around 50/50. Given how much the ladder distribution is skewed, I think every type of adjustment process would result in terrans having lower win/rates.
I personally believe that a good, simple and easy indicator of balance is the race distribution in, say, top 16 in WCS tournament formats. This has a lot of faults too, but I think much is intrinsically accounted for with this simplistic model.
I think the issue with that methodology is that it is way too vulnerable to variance in results and the skill level of the players. Having a couple of Korean terrans playing in the region increases the number of terrans in the top 16, regardless of balance. I think it's better to look at a larger database here and make predictions based on the assumption that the best terran player = best toss = best zerg. The 20th best zerg = 20th best toss = 20th best terran etc.
With enough data on individual competitive players, you can come up with pretty reliable adjusted win/rates.
|
On September 14 2014 03:17 Hider wrote:
I tried to read and watch everything I could about how Blizzard approaches the statistical side and I even asked David Kim in one of his Q&As (no answer unfortunately). But I think a lot of their comments indicate that they do not have a sound way of modelling win/rates. I actually remember Dustin Browder saying at the end of the WOL beta, when siege tanks did 60 damage and the game was played on T-favored maps, that win/rates were around 50/50. No one at that time (that played the game) could make sense of that number at all.
While Blizzard does adjust win/rates on the ladder by taking into account expected win/rates, the methodology is flawed. The expected win/rate is based on MMR. If one player has a very high MMR, he will be favored over a player with a lower MMR. However, if the latter consistently wins, then it's - according to Blizzard's model - a sign that the race of the latter is favored.
It should be quite obvious that this adjustment process by Blizzard is pretty terrible, and they have never given any indication that they have a more useful approach to balance. If they really had that, I think it's also a lot more likely that the win/rates which David Kim occasionally publishes would deviate more from 50/50. However, they always seem to be around 50/50. Given how much the ladder distribution is skewed, I don't think every type of adjustment process would result in terrans having lower win/rates.
I remember they slapped up a huge formula on screen once, and explained that they used it to find their "weighted" win-rates. Obviously it's not that simple, but still, it does mean that they use simplistic models when it comes to this. And without intimate knowledge of how they measure these things I don't think you can judge them as easily. They do (or did) have a mathematician on board there making these models.
On September 14 2014 03:17 Hider wrote: I think the issue with that methodology is that it is way too vulnerable to variance and skill-level of the players. Having a couple of Korean terrans playing in the region increases the amount of players in the top 16, regardless of balance. I think it's better to look at a larger database here.
Sure, but the race representations in the GSL (for example) have usually reflected the general consensus on balance. At least that is my impression. While there is much variance, especially for particular players, much of this is canceled out considering there are only three races and a very high skill density at the top in Korea.
|
I remember they slapped up a huge formula on screen once, and explained that they used it to find their "weighted" win-rates. Obviously it's not that simple, but still, without intimate knowledge of how they measure these things I don't think you can judge them as easily. They do (or did) have a mathematician on board there making these models.
Yes I watched that video like 5 times, and combined with everything else they said on the adjustment process, that's what I am basing my theory of how it works on.
Sure, but the race representations in the GSL (for example) have usually reflected the general consensus on balance. At least that is my impression. While there is much variance, especially for particular players, much of this is canceled out considering there are only three races and a very high skill density at the top in Korea.
Two things here:
(1) WCS Korea is different from WCS EU. In WCS EU, top 16 race distribution is impacted by which players were popular in 2011, as that has made them more attractive to foreign teams.
(2) I also studied the GSL distribution and it lags behind the adjusted win/rates by a lot. Yes it could predict that terran was OP in 2011, but it kept "predicting" that in early 2012 even though my model saw that the balance was fine + naked terran GSL win/rates weren't above 50% in early 2012.
In GSL/WCS Korea there is actually no reason to expect that win/rates and race representation won't go hand-in-hand, as the best protosses are matched against the best terrans, and if P is OP, then the best protosses will have a higher win/rate. That's not the case throughout most of the Aligulac win/rates, however, as many of the best players in the world are unlikely to participate in many of the smaller tournaments if they get "too good".
Given that logic, it seems more plausible that GSL naked win/rates were a more reliable indicator of balance than race distribution in early 2012. Unlike win/rates, race representation can increase due to luck. For instance, if the results of 3 bo3s for Race X are 2-1, 2-1 and 0-2, then the win/rates are 50-50. But if those are the matches which decide whether you are going to play code A or code S, you are going to see more of race X in code S next season.
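The arithmetic behind that example, as a small Python sketch (the variable names are just for illustration):

```python
# (maps won, maps lost) for race X in three bo3 series
series = [(2, 1), (2, 1), (0, 2)]

maps_won = sum(w for w, _ in series)
maps_played = sum(w + l for w, l in series)
series_won = sum(1 for w, l in series if w > l)

map_win_rate = maps_won / maps_played        # 4 / 8
series_win_rate = series_won / len(series)   # 2 / 3

print(f"map win rate:    {map_win_rate:.1%}")
print(f"series win rate: {series_win_rate:.1%}")
```

Race X wins exactly half of its maps but two of the three series, so its representation in the next round goes up without any edge in win/rate.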
As win/rates were close to 50/50 (both naked GSL and adjusted Aligulac win/rates) while terran race distribution increased in late 2011/early 2012, I think this increase is simply due to lots of terrans winning 2-1 and losing 0-2 (thus getting lucky).
I believe Blizzard buffed the Queen based on the (wrong) assumption that race distribution in Code S is a strong indicator of balance, which resulted in the long era of the patch-zerg problem. If you go through the historical numbers of the adjusted win/rates and make balance suggestions based on that, you would have avoided all of the bad patches by Blizzard.
And without intimate knowledge of how they measure these things I don't think you can judge them as easily. They do (or did) have a mathematician on board there making these models.
So what would the adjusted ladder win/rates likely look like if Blizzard took into account that lots of terrans had dropped down in leagues (race representation)? As I see it, terrans should have a lower adjusted win/rate than non-adjusted win/rate, because - if they in fact are UP - then they are being matched up against lesser skilled players.
But when you go through the previous published win/rates, you do not see that trend at all. The only trend you see is that win/rates are always very close to 50/50 (regardless of how much protoss is winning with Blink Stalkers...)
Based on that, I think it's more likely that they are adjusting win/rates based on MMR - which is exactly what the math guy in the video said they did - and not based on race representation (which they have given no indication of).
|
Blizzard's major balance changes seem to be decided by community uproar. As for the minor ones, I believe they roll a die to decide. I believe the next wave of balance changes will come after Blizzcon, and should be focused on adding more variety to the matchups. That's the perfect time to try something a bit more extreme to shake things up.
|
On September 14 2014 02:36 Hider wrote: On page 1079 (or 1080) I discussed some of the possible explanations for why terran was perceived (amongst the general community) to be in a better position than the numbers suggested by the adjusted win/rates.
In the end it comes down to terran suffering more than protoss during the patch-zerg, which wasn't a very well-known story.
I'm still contesting that in the same way I did at page 1079-1080 :p
|
On September 14 2014 03:56 Nebuchad wrote:
On September 14 2014 02:36 Hider wrote: On page 1079 (or 1080) I discussed some of the possible explanations for why terran was perceived (amongst the general community) to be in a better position than the numbers suggested by the adjusted win/rates.
In the end it comes down to terran suffering more than protoss during the patch-zerg, which wasn't a very well-known story.
I'm still contesting that in the same way I did at page 1079-1080 :p
Well you made a comment that terrans did fine in Korea. Then I looked at GSL win/rates and didn't find any support for that assertion. I even tested whether there was support for the thesis that the model underestimated the strength of the best terrans by comparing it to GSL/WCS Korea win/rates. However, since 2012, adjusted T win/rates based on Aligulac data are actually higher than terran win/rates in GSL.
Instead, the perception of toss doing poorly vs terran in 2012 is probably a consequence of race distribution in code S/WCS Korea, which I argued in my post above is a really bad indicator of balance.
|
On September 14 2014 04:00 Hider wrote:
On September 14 2014 03:56 Nebuchad wrote:
On September 14 2014 02:36 Hider wrote: On page 1079 (or 1080) I discussed some of the possible explanations for why terran was perceived (amongst the general community) to be in a better position than the numbers suggested by the adjusted win/rates.
In the end it comes down to terran suffering more than protoss during the patch-zerg, which wasn't a very well-known story.
I'm still contesting that in the same way I did at page 1079-1080 :p
Well you made a comment that terrans did fine in Korea. Then I looked at win/rates and saw that wasn't true. Terrans weren't doing well vs protoss. Instead, the perception is probably a consequence of race distribution in code S/WCS Korea, which I argued in my post above is a really bad indicator of balance.
I don't remember you showing me winrates where terrans did badly in korea vs protoss at that time. All I remember is the numbers I pulled, 136-93 for terran.
|
On September 14 2014 04:04 Nebuchad wrote:
On September 14 2014 04:00 Hider wrote:
On September 14 2014 03:56 Nebuchad wrote:
On September 14 2014 02:36 Hider wrote: On page 1079 (or 1080) I discussed some of the possible explanations for why terran was perceived (amongst the general community) to be in a better position than the numbers suggested by the adjusted win/rates.
In the end it comes down to terran suffering more than protoss during the patch-zerg, which wasn't a very well-known story.
I'm still contesting that in the same way I did at page 1079-1080 :p
Well you made a comment that terrans did fine in Korea. Then I looked at win/rates and saw that wasn't true. Terrans weren't doing well vs protoss. Instead, the perception is probably a consequence of race distribution in code S/WCS Korea, which I argued in my post above is a really bad indicator of balance.
I don't remember you showing me winrates where terrans did badly in korea vs protoss at that time. All I remember is the numbers I pulled, 136-93 for terran.
I don't know where you are getting those numbers from. Using data from all of GSL and WCS Korea (code A, code S, qualifications), I made this graph.
The graph below shows the performance difference between adjusted Aligulac win/rates and GSL win/rates. A positive value means Koreans are performing better, and a negative value means foreigners are performing better. What you see is that protoss were doing fine vs terran in 2012, but not in 2011. At the end of 2012, protoss players actually had better win/rates vs terran in GSL than according to adjusted Aligulac win/rates. Overall, however, the sample size is a bit small in 2012, so it's tough to say anything clearly. But from 2012-2014, you do not actually see any skill-cap differences for terran. Protoss players, on the other hand, have performed better vs zergs in Korea than in the foreign scene since 2012. This kinda makes sense, as we actually see very few successful foreign protosses, so perhaps protoss is the hardest race...
![[image loading]](http://i58.tinypic.com/2cyp76d.png)
Win/rates for GSL in 2012 (the five seasons)
P 50.0% 49.2% 51.1% 46.9% 48.3%
T 50.0% 50.8% 48.9% 53.1% 51.7%
Source: Aligulac.
|
|
|
|