Scientific proof that SC2 is imbalanced (sorta)

GagnarTheUnruly
Profile Joined July 2010
United States655 Posts
Last Edited: 2010-08-17 09:20:41
August 16 2010 22:13 GMT
#1
What follows is the world's first scientific analysis testing the hypothesis that SC2 is imbalanced! Hopefully this will kill any future debate on the subject (LOL yeah, that didn't happen). Just for clarification, from a functional perspective I believe the game is well balanced. I've also edited this post to incorporate recommendations from the other posters in this thread, so be aware of that if you read the other posts in the thread.

Abstract: A simple statistical analysis revealed that, you guessed it, SC2 is IMBALANCED (technically). However, the degree of imbalance is so infinitesimally small that you’d literally have to play millions of games to ever know it. Moreover, my initial analysis suggests that the balance is very consistent across all player skill levels, and I think Blizzard deserves a lot of credit for the hard work they’ve put into balancing Starcraft 2.


Introduction: Ever since the SC2 Beta was released, people have been arguing about whether and which races are imbalanced, and what should be done to fix the balance problem in Starcraft 2. There has been a plague of articles on the subject, some of which I’ve read as I trolled around the TeamLiquid forums. I decided that the best way to settle the issue once and for all was to take a scientific approach, using statistical analysis to test the hypothesis that the races in Starcraft 2 are imbalanced. I'm starting with the only data I have available, which is a table of win/loss results for each game that's been played so far. My hypothesis is that if SC2 is imbalanced, then some races will have better win/loss ratings than others.


Methods: The data I used were found in the ‘stats’ section at sc2ranks.com on the morning of August 16th. Data are reported as win percentages for each race at each league level, and in parentheses is a number that I assumed was the total number of wins in each category. To answer the question of whether Starcraft 2 races have different win/loss ratings, I used a chi-square analysis in Excel, which let me compare the observed win rate with the rate that would be expected if all races were balanced (i.e. wins were randomly distributed among races within each league). I used available win percentages to estimate overall win rates for the 50 million games of SC2 that have so far been played. I also did a somewhat wonky power analysis to explore how many games you’d have to play before noticing the imbalance. Spoiler: it’s a lot.
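For readers who want to try the same kind of test outside Excel, here is a minimal sketch of a chi-square goodness-of-fit test on wins. It assumes the test compares observed wins per race with the wins each race would get if all races won at the league-wide rate, which is one reading of the method described above. The counts are made up for illustration and are not the sc2ranks.com figures; the function name is hypothetical.

```python
# Hypothetical counts for one league; these are NOT the sc2ranks.com figures.
games = {"Terran": 400_000, "Protoss": 350_000, "Zerg": 200_000, "Random": 50_000}
wins = {"Terran": 201_000, "Protoss": 175_200, "Zerg": 99_400, "Random": 24_900}

# Critical value of the chi-square distribution for 3 degrees of freedom
# (4 races - 1) at p = 0.001.
CHI2_CRIT_DF3_P001 = 16.266


def chi_square_wins(games, wins):
    """Goodness-of-fit statistic: observed wins vs. the wins expected if every
    race won at the league-wide rate."""
    total_games = sum(games.values())
    total_wins = sum(wins.values())
    stat = 0.0
    for race in games:
        expected = total_wins * games[race] / total_games
        stat += (wins[race] - expected) ** 2 / expected
    return stat


stat = chi_square_wins(games, wins)
print(f"chi-square = {stat:.2f}, significant at p < 0.001: {stat > CHI2_CRIT_DF3_P001}")
```

This version looks at wins only. A fuller treatment would probably use a wins-and-losses table per race, but the idea is the same: the further the observed counts sit from the balanced expectation, the larger the statistic.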


Results: OK, Starcraft 2 appears to be technically slightly imbalanced, but before I discuss that I want to raise some observations. First, the league system seems to work fairly well at giving players of diverse skill sets a similarly frustrating online experience. While Bronze players are the whipping-boys of the online SC2 community, with a 42% overall win rate, things are more even in the other leagues, with win rates of 50% for Silver players, rising gradually through the leagues to 56% for the average Diamond player.

[image loading]
[image loading]

Blue shading indicates the race does better than expected, red indicates the race does worse, according to the results from my chi-square analysis (p < 0.001 where significant, n ~ 50 million, 3 d.f.). The light values indicate that differences are only detectable at sample sizes of about 1 million games played.

Regarding the imbalance issue, with the roughly 50 million games that have been uploaded into the sc2ranks.com database, it seems that Starcraft 2 is ever so slightly imbalanced (see Table 1). One thing that really jumps out is that the game is least balanced for bronze players, where Terran players have a distinct disadvantage, winning about 2% fewer games than the other races, and Protoss players have a slight advantage, winning about 1% more games than average.

Among the other leagues, however, the story is quite different. The win rate disparities, although statistically significant, are small, never exceeding a third of a percent in any direction, and averaging a disparity of about 1.5 tenths of a percent. In general, the game appears to be very well balanced from silver through diamond leagues. Within those leagues, Terran has a slight advantage (see, Terran is IMBA!), meaning you’ll win about 2 games in a thousand more often than you should. With Zerg, a thousand games would yield about 1.5 more losses than average (see Zerg is weak and IdrA is the MAN!). Protoss and random aren’t much different from average.

For Diamond-level players, Terran is the strongest race with a 56.1% win percentage and Random is the worst (55.53%). A Diamond Solo Zen Master would have to play 1801 games to win as Random and 1800 to win as Zerg, but only 1794 to win as Protoss and 1784 to win as Terran. So if you want Terran mastery, you’ll get it in 17 fewer games than a Random player! That’s enough time to take your significant other out to dinner, so if relationships are important to you, I recommend Terran. It also could help win you $2000. For sharing this knowledge with you, I expect a portion of the proceeds.
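The game counts above look like "wins needed divided by win rate". Here is a small sketch of that arithmetic, assuming the achievement takes 1000 wins per race (an assumption; the post does not state the number, but 1000 reproduces the Random figure). The function name is hypothetical.

```python
import math


def games_needed(target_wins, win_rate):
    """Expected number of games needed to collect target_wins at a fixed win rate."""
    return math.ceil(target_wins / win_rate)


TARGET_WINS = 1000  # assumption: wins required per race for the achievement

for race, rate in {"Random": 0.5553, "Terran": 0.561}.items():
    print(race, games_needed(TARGET_WINS, rate))
```

With the rounded 56.1% this gives 1783 for Terran rather than the 1784 quoted, presumably because the post worked from unrounded percentages.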

[image loading]

OK, so I’ve said that the races are imbalanced, but the imbalances are pretty darn small. Because chi-square tests become extremely sensitive at very large sample sizes, I decided to do a jury-rigged power analysis to determine just how many games you’d need to play to detect an imbalance. I basically just ran the chi-square analysis using simulated data sets of various sample sizes to see how many games you'd have to play before it was obvious that some races won more games than other races.

So, let’s say you play ten thousand games using each race equally (2500 games with each race). If you’re a Silver through Diamond player, you wouldn’t be able to detect a statistically significant difference in your win rate. A Bronze player will notice Terran has a disadvantage.

Now let’s say you keep playing until you get to a hundred thousand games. A Bronze player will notice that Protoss is a little extra strong, but players in the other leagues will still think the game is perfectly balanced. In fact, you’d have to play about a million games before you started to notice that the races were imbalanced in the diamond league.
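The sample-size argument can be sketched without any simulation, because for fixed win rates the chi-square statistic grows in direct proportion to the number of games. The snippet below is a rough illustration under stated assumptions: games split equally among four races, each race winning at exactly its given rate (no sampling noise), and win rates that are invented for the example rather than taken from the tables in this post. All function names are hypothetical.

```python
# Chi-square critical value for 3 degrees of freedom at p = 0.001.
CHI2_CRIT_DF3_P001 = 16.266


def chi_square_for_n(win_rates, n_games):
    """Chi-square statistic on wins when n_games are split equally among the
    races and each race wins at exactly its given rate (no sampling noise)."""
    per_race = n_games / len(win_rates)
    mean_rate = sum(win_rates) / len(win_rates)
    expected = per_race * mean_rate
    return sum((per_race * p - expected) ** 2 / expected for p in win_rates)


def games_to_detect(win_rates, critical=CHI2_CRIT_DF3_P001):
    """Total game count at which the noise-free statistic reaches the critical
    value. The statistic is linear in n, so one evaluation is enough."""
    per_game = chi_square_for_n(win_rates, 1.0)
    return critical / per_game


# Hypothetical win rates (Terran, Protoss, Zerg, Random), not from the post's tables.
bronze_like = [0.40, 0.43, 0.42, 0.42]
diamond_like = [0.502, 0.500, 0.4985, 0.4995]

print(f"bronze-like:  ~{games_to_detect(bronze_like):,.0f} games")
print(f"diamond-like: ~{games_to_detect(diamond_like):,.0f} games")
```

With these made-up rates the bronze-like case needs tens of thousands of games and the diamond-like case needs millions, which is the same qualitative picture as above. Real data has sampling noise on top, so treat the output as an order-of-magnitude guide only.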

To show the level of balance graphically, I used a sample size of about 100,000 games, played according to the racial proclivities of the player base (so, not all races used equally), because this was the largest sample size for which I could accurately estimate numbers of games won for each race. You can see on the log-transformed graph below that for silver through platinum players, the races are very well balanced, and that racial imbalances are only apparent in bronze level play.

[image loading]

The y-axis just represents how divergent the race is from average (0 on the graph). Higher values indicate that the race does better than expected, and lower means the race does worse. The colored regions at the top and bottom of the graph are the regions of statistical significance (p = 0.005). Mostly the differences between the races aren't significant.

Discussion: My data show that, within a league, each of the races has a roughly equal chance of winning a randomly selected game. This indicates that the balance of SC2 is probably pretty good. People have pointed out that matchmaking would cause this to happen, because it strives to set each player's win rate at 50%. That in turn would cause the win rate of each race to trend towards 50%. That being the case, poor balance would tend to result in 'weak' races getting pushed into the lower tiers of play. Because we don't see that happening either within or among leagues (data not shown), my data suggest both that the matchmaking system works well and that SC2 is inherently pretty well balanced.

Technically speaking, my results suggest that Starcraft 2 is NOT perfectly balanced, but that the degree of imbalance is so small that it is functionally imperceptible until literally hundreds of thousands of games are played. Also, the imbalance is clearly most extreme and noticeable in the Bronze league. Perhaps this should come as no surprise, because less experienced players are probably much less predictable than more experienced players, because the weakest players are likely to pick the more familiar Terran race, and because the overall level of play in each league is generally lower (in my opinion) than during the Beta, meaning that Blizzard probably didn’t get a strong chance to balance the game for players playing at the current level of Bronze play.

In short, it's my opinion that the designers of SC2 did an amazingly good job of balancing the game, and across a very wide spectrum of player skill levels. So, unless you’ve played a few hundred thousand games of Starcraft 2, rest assured that any sense you have that the game is imbalanced is probably illusory. Hopefully, this analysis will put to rest the question of balance in SC2, at least in many people’s minds.

As a possible next step I could do the same analysis for SC1, and test the hypothesis that SC1 is more balanced than SC2. Based mostly on the suggestions of other people in this thread, I also think it would be a good idea to try to do an analysis where I test to see whether there are matchup imbalances, and whether race use is really consistent among leagues.

Finally, I ask that you not take these results too seriously. This was just a little project that I did for fun. I believe that my results suggest that SC2 is pretty well balanced in general, but I had a pretty limited data set and there are a lot of factors I couldn't account for. If I'm able to get better data I'll do a more involved analysis (unless my experience with this thread burns me out utterly), but until then this is the best that I can offer! Enjoy it for what it's worth!
iEchoic
Profile Blog Joined May 2010
United States1776 Posts
Last Edited: 2010-08-16 22:20:33
August 16 2010 22:15 GMT
#2
I appreciate the effort, but this is all within the scale of like 1-2% difference. That can be attributed to variance as much as anything else. That said, this community will herald this as a great achievement because it is admittedly well-formatted and presented. Nice work, and it looks excellent, but I just read through it and can't find anywhere where you explain that this small difference is significant.

Your last section clears it up but the title and intro is misleading.

Edit: I worded this confusingly - as you said that it is very insignificant - but you say it is imbalanced, just a little bit. I think a more proper description is that there is no evidence of imbalance. This small difference isn't large enough to be attributed to race balance differences; it could be attributed to the community who plays the races, etc.
vileEchoic -- clanvile.com
GagnarTheUnruly
Profile Joined July 2010
United States655 Posts
August 16 2010 22:16 GMT
#3
The sample size is so vast that random chance can't explain the differences, but at the same time they are so small as to be meaningless.
nam nam
Profile Joined June 2010
Sweden4672 Posts
August 16 2010 22:18 GMT
#4
Get back to me when you have calculated the win percentage against respective race.

User was warned for this post
StarcraftGuy4U
Profile Joined May 2010
United States74 Posts
August 16 2010 22:18 GMT
#5
None of these stats are worthwhile because the matchmaking system does not assign people like they would in a blind study, instead it is actively adjusting the matches so that every player reaches 50%. The numbers you are pulling are worthless for this reason.
GagnarTheUnruly
Profile Joined July 2010
United States655 Posts
August 16 2010 22:21 GMT
#6
Clarified (hopefully)
Mindcrime
Profile Joined July 2004
United States6899 Posts
August 16 2010 22:23 GMT
#7
Scientific proof that the matchmaking system is working

That is all that this is. No conclusions about balance can be drawn from looking at win%, on ladder, when the matchmaking system is specifically designed so that you win about 50% of your games. :|
That wasn't any act of God. That was an act of pure human fuckery.
teamsolid
Profile Joined October 2007
Canada3668 Posts
Last Edited: 2010-08-16 22:25:03
August 16 2010 22:23 GMT
#8
On August 17 2010 07:18 StarcraftGuy4U wrote:
None of these stats are worthwhile because the matchmaking system does not assign people like they would in a blind study, instead it is actively adjusting the matches so that every player reaches 50%. The numbers you are pulling are worthless for this reason.

This. The numbers you are basing the whole analysis on are meaningless for balance purposes, because it's the direct result of a working AMM. Also Terran has the worst win-rate in Bronze league simply because many new players who just finished the campaign would be playing Terran and won't have the slightest clue about how to play properly.
SeaSmoke
Profile Joined July 2010
United States326 Posts
August 16 2010 22:23 GMT
#9
On August 17 2010 07:18 StarcraftGuy4U wrote:
None of these stats are worthwhile because the matchmaking system does not assign people like they would in a blind study, instead it is actively adjusting the matches so that every player reaches 50%. The numbers you are pulling are worthless for this reason.


This...

Matchmaking isn't random...it actively chooses players one should beat after losing.
GagnarTheUnruly
Profile Joined July 2010
United States655 Posts
August 16 2010 22:27 GMT
#10
Ah, I see.. the league system irons out imbalances by selectively keeping weaker races lower in the leagues. I hadn't thought of that. I can see if the proportion of races changes as you move through the leagues...
Toids
Profile Joined June 2010
United States17 Posts
August 16 2010 22:29 GMT
#11
Ya.... you can't pull numbers from the ladder to explain balance. You need to get data from outside of the matchmaking system.
HowardRoark
Profile Blog Joined February 2010
1146 Posts
August 16 2010 22:33 GMT
#12
What you must consider is what it means if you statistically find that Zerg is underpowered in the Diamond league, when the Diamond league includes anyone who has ever played StarCraft 1 or finished the campaign:

Imagine how much more the Zerg imbalance skyrockets when you take the top 1 percent of Diamond (where you have the pro gamers). The numbers you work with must mean that the imbalances at the pro level (which is probably even less than one per mille of the Diamond players) would look kind of bleak for the pro Zergs.

Just picture your stats if you did this on the D players on iccup: how low a win percentage the Terrans would certainly have against Protoss, when at the very top level of BW (which no foreigner has yet reached) they dominate.

(I beg pardon if this obvious fact has been pointed out earlier in this thread, then ban me, but I am just too tired to read through every post in every thread to validate a post.)
"It is really good to get the double observatory if you want to get the speed and sight range for the observer simultaneously. It's a little bit of an advanced tactic, and by advanced, I mean really fucking bad."
virgozero
Profile Joined May 2010
Canada412 Posts
August 16 2010 22:34 GMT
#13
another not-smart(dont wanna get B'd) thread based on players statistics.

1.) we know its not 100% balanced that is a given. if you thought so you must be half retarded
2.) this analysis does not account for skill or popularity of race; there are too many variables and the assumptions are too large.

Why do people insist on using rankings and statistics to call a race imbalance?

If you think something is imbalanced, then just say it and say what part; no need to bring in all the world's #s only to inevitably fail your argument. IdrA makes good cases, and he states the reason why he thinks Terran has an advantage over Zerg. You never see him go OMG LOOK AT THESE #s I FOUND !
51-50-49, therefore 51 must be OP !
neobowman
Profile Blog Joined March 2008
Canada3324 Posts
August 16 2010 22:35 GMT
#14
Isn't this math and not science?
Muirhead
Profile Blog Joined October 2007
United States556 Posts
August 16 2010 22:38 GMT
#15
All this stuff is invalidated because of blizzard's matchmaking service, which will make all but the very best and worst players on the entire ladder converge to a 50% win-rate.
starleague.mit.edu
dcberkeley
Profile Joined July 2009
Canada844 Posts
August 16 2010 22:38 GMT
#16
On August 17 2010 07:35 neobowman wrote:
Isn't this math and not science?

Scientific != science
Moktira is da bomb
tathecat563
Profile Joined April 2010
United States96 Posts
August 16 2010 22:40 GMT
#17
Can you calculate the win rate of diamond vs diamond as well? I think that might shed some more light on possible imbalance.

Since Diamond players get more wins when they play against lower-ranked players because of the skill difference, those games understate the difference between the races among Diamond-level players alone.
Hi
GagnarTheUnruly
Profile Joined July 2010
United States655 Posts
August 16 2010 22:41 GMT
#18
I wouldn't try to extrapolate my 'results' to the pro level, just because the level of play is so great and the game itself is played so differently. Also, I would guess that performance of top-tier pros on the ladder wouldn't closely correlate with tournament performance, due to differences in their play habits. I think the only way to know if SC2 is balanced at the pro level would be to compile results of tournaments, and I suspect that there haven't been enough of those to give a definitive answer (BW is pretty streak-y for certain races, for example).

Regarding the other criticisms, I was under the impression that players in the same league get roughly the same player draws as one another. Is this true, or do higher ranked players draw from a different pool of players than lower ranked players in the same league?
Andtwo
Profile Joined June 2009
United States126 Posts
August 16 2010 22:42 GMT
#19
What test did you actually use?

The problem I have with that website is that many of the super high level players simply do not ladder very much and instead have practice partners.

I think the best part of this post is the graph for bronze XD
Loser777
Profile Blog Joined January 2008
1931 Posts
Last Edited: 2010-08-16 22:44:01
August 16 2010 22:42 GMT
#20
This is certainly interesting, but one should also consider the situation from a Brood War perspective: in Brood War the win rates of pros of different races also differ, yet the game is considered to be balanced (with slight variations).

Such a statistical analysis has problems in that eventually (as we are seeing now), only the confident players of the supposedly underpowered races will keep playing that race, so you're never going to see extreme variations in win rate; people will stop playing the losing race.
6581
futoM4ki
Profile Joined May 2010
Germany73 Posts
Last Edited: 2010-08-16 22:45:45
August 16 2010 22:43 GMT
#21
On August 17 2010 07:35 neobowman wrote:
Isn't this math and not science?


XD great!!! Is math actually science? Ask Nobel :D

On August 17 2010 07:38 Muirhead wrote:
All this stuff is invalidated because of blizzard's matchmaking service, which will make all but the very best and worst players on the entire ladder converge to a 50% win-rate.


this

When you lose too much, you'll be matched against weaker opponents, even against those ranked 1 - 2 divisions below yours.

great post anyway
Do you really want chat rooms?
virgozero
Profile Joined May 2010
Canada412 Posts
Last Edited: 2010-08-16 22:44:16
August 16 2010 22:43 GMT
#22
On August 17 2010 07:35 neobowman wrote:
Isn't this math and not science?

nope.

mathematics is a form of logical deduction based on #s.

this is science because it makes logical (or i'd say illogical for this case) inductions based on facts.
Wr3k
Profile Blog Joined June 2009
Canada2533 Posts
Last Edited: 2010-08-16 22:48:48
August 16 2010 22:45 GMT
#23
On August 17 2010 07:18 StarcraftGuy4U wrote:
None of these stats are worthwhile because the matchmaking system does not assign people like they would in a blind study, instead it is actively adjusting the matches so that every player reaches 50%. The numbers you are pulling are worthless for this reason.


QFT. You shouldn't be a scientist. The match making system is designed so that you get players as close to a 50% win rate as possible. So in a hypothetical situation of one race being overpowered, the win %'s of the race will not change, merely the distribution of players. You should look at the distribution per rating by race.
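The claim that an overpowered race changes ratings but not win percentages can be checked with a toy ladder. The sketch below is an illustration only, not Blizzard's matchmaking system: it assumes a plain Elo update, pairs players with their nearest neighbour by rating, and gives one race a flat strength bonus. Every name and parameter here (`simulate_ladder`, the 100-point bonus, K = 32) is hypothetical.

```python
import random


def win_probability(strength_a, strength_b):
    """Standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((strength_b - strength_a) / 400.0))


def simulate_ladder(race_bonus=100.0, rounds=400, k=32.0, seed=1):
    rng = random.Random(seed)
    players = []
    # Every skill value goes to one "OP" and one "UP" player, so both races
    # have identical skill distributions; only the bonus differs.
    for i in range(100):
        skill = 1300.0 + 4.0 * i
        players.append({"race": "OP", "strength": skill + race_bonus,
                        "rating": 1500.0, "wins": 0, "games": 0})
        players.append({"race": "UP", "strength": skill,
                        "rating": 1500.0, "wins": 0, "games": 0})

    for _ in range(rounds):
        rng.shuffle(players)                      # random order among equal ratings
        players.sort(key=lambda p: p["rating"])   # stable sort keeps that order
        for a, b in zip(players[0::2], players[1::2]):  # pair rating neighbours
            expected_a = win_probability(a["rating"], b["rating"])
            # The real outcome depends on true strength, not on the rating.
            a_won = rng.random() < win_probability(a["strength"], b["strength"])
            score_a = 1.0 if a_won else 0.0
            a["rating"] += k * (score_a - expected_a)
            b["rating"] -= k * (score_a - expected_a)
            a["games"] += 1
            b["games"] += 1
            winner = a if a_won else b
            winner["wins"] += 1

    summary = {}
    for race in ("OP", "UP"):
        group = [p for p in players if p["race"] == race]
        summary[race] = {
            "win_rate": sum(p["wins"] for p in group) / sum(p["games"] for p in group),
            "mean_rating": sum(p["rating"] for p in group) / len(group),
        }
    return summary


result = simulate_ladder()
for race, stats in result.items():
    print(race, f"win rate {stats['win_rate']:.3f}", f"mean rating {stats['mean_rating']:.0f}")
```

In this toy model both races should end up close to a 50% win rate while the boosted race sits higher in rating, which is why the distribution of races by rating says more about balance than win percentage does.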
mahnini
Profile Blog Joined October 2005
United States6862 Posts
August 16 2010 22:45 GMT
#24
given enough games and time winrates for players on any ladder should be approaching 50% so i guess if we assume that blizzard's matchmaking system is working correctly, we can look to racial distribution at top levels for an indication of balance right?
the world's a playground. you know that when you're a kid, but somewhere along the way everyone forgets it.
Muirhead
Profile Blog Joined October 2007
United States556 Posts
August 16 2010 22:47 GMT
#25
I would think the best way Blizzard could test balance at all levels is to have separate hidden ELOs for each MU. Then they could see that the typical Diamond Z is 600 in ZvZ and 550 in ZvT, for example.
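A per-matchup rating like this could be sketched as one Elo number per matchup instead of one per player. This is a guess at what such a system might look like, not anything Blizzard has described; the class name, the starting rating of 1500 and the K-factor of 32 are all assumptions.

```python
from collections import defaultdict

K_FACTOR = 32.0        # assumed update step
START_RATING = 1500.0  # assumed starting rating for every matchup


def expected_score(rating, opponent_rating):
    """Standard Elo expected score."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


class MatchupRatings:
    """One hidden Elo rating per matchup (e.g. 'ZvT', 'ZvZ') for a single player."""

    def __init__(self):
        self.ratings = defaultdict(lambda: START_RATING)

    def record(self, matchup, opponent_rating, won):
        current = self.ratings[matchup]
        score = 1.0 if won else 0.0
        change = K_FACTOR * (score - expected_score(current, opponent_rating))
        self.ratings[matchup] = current + change


zerg_player = MatchupRatings()
zerg_player.record("ZvZ", 1500.0, won=True)
zerg_player.record("ZvT", 1500.0, won=False)
print(dict(zerg_player.ratings))
```

A persistent gap between a player's ZvZ and ZvT numbers, averaged over many players, would point at the matchup rather than the player. Which opponent rating to feed in (for example the opponent's own rating for the mirrored matchup) is left open here.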
starleague.mit.edu
ejac
Profile Blog Joined January 2009
United States1195 Posts
Last Edited: 2010-08-16 22:49:39
August 16 2010 22:48 GMT
#26
The problem with the way you're calculating the imbalances is that you're assuming an imbalance would show up as different win percentages between races. A 1000-point Terran may only be as skilled as an 800-point Zerg, but both may have 50% win records at their respective levels. It's just that the Terran has a racial imbalance allowing him to play competitively at 1000 points.

This graph: http://www.sc2ranks.com/stats/race/us/1
shows that as points go up, Terran starts to dominate the ranks more and more, and Zerg gets worse and worse.
esq>n
tathecat563
Profile Joined April 2010
United States96 Posts
August 16 2010 22:49 GMT
#27
On August 17 2010 07:41 GagnarTheUnruly wrote:
I wouldn't try to extrapolate my 'results' to the pro level, just because the level of play is so great and the game itself is played so differently. Also, I would guess that performance of top-tier pros on the ladder wouldn't closely correlate with tournament performance, due to differences in their play habits. I think the only way to know if SC2 is balanced at the pro level would be to compile results of tournaments, and I suspect that there haven't been enough of those to give a definitive answer (BW is pretty streak-y for certain races, for example).

Regarding the other criticisms, I was under the impression that players in the same league get roughly the same player draws as one another. Is this true, or do higher ranked players draw from a different pool of players than lower ranked players in the same league?


Well they have to draw from other leagues to all have >50% win rate.

With Diamond only statistics, there will be some that are <50% and over 50%.
Hi
Wr3k
Profile Blog Joined June 2009
Canada2533 Posts
Last Edited: 2010-08-16 22:52:09
August 16 2010 22:51 GMT
#28
On August 17 2010 07:48 ejac wrote:
The problem with the way you're calculating the imbalances is that you're assuming an imbalance would show up as different win percentages between races. A 1000-point Terran may only be as skilled as an 800-point Zerg, but both may have 50% win records at their respective levels. It's just that the Terran has a racial imbalance allowing him to play competitively at 1000 points.

This graph: http://www.sc2ranks.com/stats/race/us/1
shows that as points go up, Terran starts to dominate the ranks more and more, and Zerg gets worse and worse.


Yeah, this is much more scientific "proof" that Terran is in fact OP, and Z is the worst race.

I mean cmon... look at it, you would have to be blind to not see a relationship: http://www.sc2ranks.com/stats/race/us/1
mahnini
Profile Blog Joined October 2005
United States6862 Posts
August 16 2010 22:53 GMT
#29
On August 17 2010 07:51 Wr3k wrote:
On August 17 2010 07:48 ejac wrote:
The problem with the way you're calculating the imbalances is that you're assuming an imbalance would show up as different win percentages between races. A 1000-point Terran may only be as skilled as an 800-point Zerg, but both may have 50% win records at their respective levels. It's just that the Terran has a racial imbalance allowing him to play competitively at 1000 points.

This graph: http://www.sc2ranks.com/stats/race/us/1
shows that as points go up, Terran starts to dominate the ranks more and more, and Zerg gets worse and worse.


Yeah, this is much more scientific "proof" that Terran is in fact OP, and Z is the worst race.

I mean cmon... look at it, you would have to be blind to not see a relationship: http://www.sc2ranks.com/stats/race/us/1

that's ignoring the fact that as skill level gets higher sample size becomes smaller and you have to compensate for a margin of error (calculate significance or something?). 70:30 with a sample size of 1000000 is vastly different from 70:30 with a sample size of 10.
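To put numbers on the 70:30 point, here is a small sketch comparing the same split at two sample sizes. It tests against an even 50:50 baseline, which is an assumption made only for illustration (the right baseline for race shares may well be something else). It uses an exact binomial test for the tiny sample and a normal-approximation z score for the huge one; the function names are hypothetical.

```python
import math


def binomial_two_sided_p(successes, trials, p0=0.5):
    """Exact two-sided binomial p-value (doubling the smaller tail)."""
    pmf = [math.comb(trials, k) * p0 ** k * (1 - p0) ** (trials - k)
           for k in range(trials + 1)]
    lower = sum(pmf[: successes + 1])
    upper = sum(pmf[successes:])
    return min(1.0, 2.0 * min(lower, upper))


def z_score(successes, trials, p0=0.5):
    """Normal-approximation z statistic for a proportion; for large samples."""
    return (successes / trials - p0) / math.sqrt(p0 * (1 - p0) / trials)


print(f"7 of 10:          p = {binomial_two_sided_p(7, 10):.3f}")
print(f"700,000 of 1M:    z = {z_score(700_000, 1_000_000):.0f}")
```

Seven out of ten is entirely compatible with a coin flip, while 700,000 out of a million is hundreds of standard errors away from even, so the same ratio carries very different weight.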
the world's a playground. you know that when you're a kid, but somewhere along the way everyone forgets it.
LolnoobInsanity
Profile Joined May 2010
United States183 Posts
August 16 2010 22:58 GMT
#30
This doesn't show that the game is balanced at all. All this shows is that matchmaking is pretty good.
Biochemist
Profile Blog Joined February 2009
United States1008 Posts
August 16 2010 23:00 GMT
#31
Re-work your hypothesis based on your new understanding of how the matchmaking system actively creates 50% win rates and run the numbers again!

[image loading]


Isn't science fun?
The_Pacifist
Profile Blog Joined May 2010
United States540 Posts
August 16 2010 23:01 GMT
#32
Kudos to the OP. This is actually a very well-done statistics test.

Unfortunately, people have said that the matchmaking system will pair players in a way to guarantee balanced win rates. In which case...

Nevertheless, I think it's pretty interesting to compare the win rate results at the lowest leagues. I would never have expected such a large difference going from bronze to silver. For the longest time, I thought both were pretty much the same animal (as in, a Silver Leaguer only sucks a hair bit less than a Bronze Leaguer.) Either that, or Bronze leaguers suck so bad that the matchmaking system has a hard time giving them "easy wins" to even out the win rates.
GagnarTheUnruly
Profile Joined July 2010
United States655 Posts
Last Edited: 2010-08-16 23:05:13
August 16 2010 23:01 GMT
#33
On August 17 2010 07:45 mahnini wrote:
given enough games and time winrates for players on any ladder should be approaching 50% so i guess if we assume that blizzard's matchmaking system is working correctly, we can look to racial distribution at top levels for an indication of balance right?


I agree. Here's a graph of the racial distributions. The y-axis is proportion of that race in the games played pool for a certain league.

[image loading]

This indicates that racial imbalances aren't causing weak races to get held back in lower leagues. If some of the earlier criticisms were true, that matchmaking obviates differences in racial performance, we should see some races gaining prominence and others losing it as you move through the leagues. In particular, the races indicated as slightly weak in my analysis should fall out of Diamond. Comparing Silver through Diamond you can see that this isn't the case. For example, Zerg gets more common.
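Whether race shares really stay constant across leagues can itself be tested with a chi-square test of independence on a league-by-race table. The sketch below shows the calculation with invented games-played counts; the numbers are placeholders, not the data behind the graph, and the function name is hypothetical.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a table of
    counts (rows = leagues, columns = races)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Hypothetical games-played counts; columns = Terran, Protoss, Zerg, Random.
games_by_league = [
    [4000, 3500, 2000, 500],  # Silver
    [4100, 3400, 2050, 450],  # Gold
    [3950, 3450, 2150, 450],  # Platinum
    [3900, 3400, 2300, 400],  # Diamond
]
stat, dof = chi_square_independence(games_by_league)
print(f"chi-square = {stat:.1f} on {dof} degrees of freedom")
```

For a four-league, four-race table there are 9 degrees of freedom, so the statistic would be compared with the chi-square critical value for 9 d.f. (about 16.9 at p = 0.05).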

I think it's reasonable to conclude that the races are pretty balanced, but I acknowledge that some of the criticisms I'm getting are valid.

The analysis was a chi-square analysis comparing observed distributions vs. homogeneous distributions (assumed under random sorting).

Also, this is definitely science, because it uses a hypothesis-based testing approach. Math is just a tool to accomplish the science. Whether it's good science seems to be stimulating a rigorous debate LOL.

The 'real' mathematical way to test for imbalance would probably require treating players individually, and using regression-based approaches to predict performance based on race, league placement, etc. That way one could parse out the influence race has on win rate. I don't have access to that kind of data, though. The best would be to control for player as a variable, to see if players consistently perform better with certain races than others.

Edit: these^^ are games played not players active, so take the graph with the appropriate grain of salt.
uzyszkodnik
Profile Joined April 2010
Poland64 Posts
August 16 2010 23:02 GMT
#34
the results are within the margin of statistical error.
Also, you don't take into account the different builds a player could use, undiscovered ways of playing, etc.

E.g. a nice proof would be to work out a build and check at which point player A could make a heavy push on player B, how long that timing window is, etc. Then check whether two players who macro equally well end in a draw or whether one of them loses, and so on.
Wr3k
Profile Blog Joined June 2009
Canada2533 Posts
Last Edited: 2010-08-16 23:08:18
August 16 2010 23:04 GMT
#35
On August 17 2010 07:53 mahnini wrote:
On August 17 2010 07:51 Wr3k wrote:
On August 17 2010 07:48 ejac wrote:
The problem the way you're calculating the imbalances is that you're assuming that races should have different win percentages. A 1000 point terran may only be as skilled as an 800 point zerg, but both may have 50% win records at their perspective levels. It's just the terran has a racial imbalance allowing him to play competitively at 1000 points.

This graph: http://www.sc2ranks.com/stats/race/us/1
shows that as points goes up, terran starts to dominate the ranks more and more, and zerg gets worse and worse.


Yeah, this is much more scientific "proof" that Terran is in fact OP, and Z is the worst race.

I mean cmon... look at it, you would have to be blind to not see a relationship: http://www.sc2ranks.com/stats/race/us/1

that's ignoring the fact that as skill level gets higher sample size becomes smaller and you have to compensate for a margin of error (calculate significance or something?). 70:30 with a sample size of 1000000 is vastly different from 70:30 with a sample size of 10.


Yes, the sample size is very small, but there are only so many 800+ diamond players to get data from. SC2 needs to be balanced at the highest level. It's the only sample size we have. A small sample size with an analysis that actually makes sense is still infinitely better than one with a large sample size that is completely and utterly flawed. I know the sc2ranks numbers are unreliable due to their size, but there are still enough players above 600/700 to show that the difference in racial distribution is significant. The real question is whether or not players perform better with one race than another, or if more people at the top are just choosing terran. All the OP has shown with his numbers is that the matchmaking system is working properly.
stochastic
Profile Joined April 2010
United States16 Posts
Last Edited: 2010-08-16 23:04:59
August 16 2010 23:04 GMT
#36
OP, I really like what you’ve attempted here (as a probability and statistics major!).

Perhaps a more meaningful test would be a Chi-Square test of independence on the number of diamond level players of each race:

Find percentage of total players of each race (at ALL levels). Using this, find the expected number of diamond ranked players of each race under the assumption that league placement is independent of race. Use the appropriate Chi-Square test statistic to determine the likelihood that the observed proportion of diamond players of each race fits what would be expected under independence.

I think this will help alleviate bias caused by the matchmaking service, as all we care about is that it places people into diamond correctly.

Also, we will have accounted for the fact that more terran players on the whole means more terran players in diamond, if the skill level of players across races is equal.


A problem is that the skill level of players across races may not be equal. But I have to think that players of greater skill tend to choose the “stronger” race, so the relative strength of that race would be reflected in our results.
Warent
Profile Blog Joined May 2010
Sweden205 Posts
August 16 2010 23:05 GMT
#37
I'm a bit confused about your method: what are you assuming is chi-square distributed, and why? The win ratio? Wouldn't it be more appropriate to assume a normal distribution, treating each game as a binary (Bernoulli) outcome: either you win or you lose, and each event has a certain probability (hm, I get a feeling that it is more complicated than that). If that is true, the central limit theorem states that the win ratio is approximately normally distributed as the number of games increases (if my memory serves me right).
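
A minimal sketch of the normal-approximation idea above, assuming each game is an independent win/loss with a fixed win probability (which real ladder games, being matchmade, are not). The function name `win_rate_z_test` and the 70:30 records are made up for illustration; they are not data from this thread.

```python
import math

def win_rate_z_test(wins, games, p0=0.5):
    """Two-sided z-test of an observed win rate against p0 (normal approximation)."""
    p_hat = wins / games
    # Standard error of the win rate if the true win probability were p0.
    standard_error = math.sqrt(p0 * (1 - p0) / games)
    z = (p_hat - p0) / standard_error
    # Two-sided tail probability of the standard normal, via the complementary error function.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical records: the same 70:30 split at two very different sample sizes.
z_small, p_small = win_rate_z_test(7, 10)
z_large, p_large = win_rate_z_test(700, 1000)
print(f"7/10:     z = {z_small:.2f}, p = {p_small:.3f}")
print(f"700/1000: z = {z_large:.2f}, p = {p_large:.3g}")
```

The same 70% win rate is indistinguishable from a coin flip over 10 games and overwhelming over 1000. The approximation itself is rough at 10 games; an exact binomial test would be the safer choice there.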

I would also suggest that you state your assumptions in your method, and in this case a definition of imbalance would be in order. One factor that such a definition most likely has to include is the skill of the player. We assume that a matchup is imbalanced if players of equal skill lose due to their choice of race rather than to their lack of skill. For this reason I would advise against trying to use numbers in order to balance the game.
"More drones!"
Wr3k
Profile Blog Joined June 2009
Canada2533 Posts
August 16 2010 23:06 GMT
#38
On August 17 2010 08:04 stochastic wrote:
OP, I really like what you’ve attempted here (as a probability and statistics major!).

Perhaps a more meaningful test would be a Chi-Square test of independence on the number of diamond level players of each race:

Find percentage of total players of each race (at ALL levels). Using this, find the expected number of diamond ranked players of each race under the assumption that league placement is independent of race. Use the appropriate Chi-Square test statistic to determine the likelihood that the observed proportion of diamond players of each race fits what would be expected under independence.

I think this will help alleviate bias caused by the matchmaking service, as all we care about is that it places people into diamond correctly.

Also, we will have accounted for the fact that more terran players on the whole means more terran players in diamond, if the skill level of players across races is equal.


A problem is that the skill level of players across races may not be equal. But I have to think that players of greater skill tend to choose the “stronger” race, so the relative strength of that race would be reflected in our results.


Yes OP, please do, because I played starcraft in my stats classes and crammed to get a B.
Telcontar
Profile Joined May 2010
United Kingdom16710 Posts
August 16 2010 23:06 GMT
#39
statistics on each matchup would be more pertinent, and as someone already said, the way matchmaking works means you really can't get a good sense from your results. good effort though.
Et Eärello Endorenna utúlien. Sinome maruvan ar Hildinyar tenn' Ambar-metta.
mahnini
Profile Blog Joined October 2005
United States6862 Posts
Last Edited: 2010-08-16 23:10:35
August 16 2010 23:08 GMT
#40
On August 17 2010 08:04 Wr3k wrote:
On August 17 2010 07:53 mahnini wrote:
On August 17 2010 07:51 Wr3k wrote:
On August 17 2010 07:48 ejac wrote:
The problem the way you're calculating the imbalances is that you're assuming that races should have different win percentages. A 1000 point terran may only be as skilled as an 800 point zerg, but both may have 50% win records at their perspective levels. It's just the terran has a racial imbalance allowing him to play competitively at 1000 points.

This graph: http://www.sc2ranks.com/stats/race/us/1
shows that as points goes up, terran starts to dominate the ranks more and more, and zerg gets worse and worse.


Yeah, this is much more scientific "proof" that Terran is in fact OP, and Z is the worst race.

I mean cmon... look at it, you would have to be blind to not see a relationship: http://www.sc2ranks.com/stats/race/us/1

that's ignoring the fact that as skill level gets higher sample size becomes smaller and you have to compensate for a margin of error (calculate significance or something?). 70:30 with a sample size of 1000000 is vastly different from 70:30 with a sample size of 10.


Yes, the sample size is very small, but there are only so many 800+ diamond players to get data from. SC2 needs to be balanced at the highest level. It's the only sample size we have. A small sample size with an analysis that actually makes sense is still infinitely better than one with a large sample size that is completely and utterly flawed. I know the sc2ranks numbers are unreliable due to their size, but there are still enough players above 600/700 to show that the difference in racial distribution is significant. All the OP has shown with his numbers is that the matchmaking system is working properly.

saying there are enough players to make the imbalance significant doesn't make it so. there are calculations for this but i'm terrible and don't know how to do them. though the OP itself doesn't show us anything revelatory, it does give us a concrete basis off which we can make assumptions such as: blizzard's matchmaking works properly, therefore we can look towards racial distribution at certain levels to help gauge imbalance.

if zerg make up the same proportion of the top levels as of the general population, it means the ladder perceives the skill levels of those zergs to be high, which would not be the case if a certain matchup were extremely imbalanced.
the world's a playground. you know that when you're a kid, but somewhere along the way everyone forgets it.
Jameser
Profile Joined July 2010
Sweden951 Posts
Last Edited: 2010-08-16 23:10:26
August 16 2010 23:08 GMT
#41
another funny thing about those numbers,
http://www.sc2ranks.com/stats/league/all/1/all

zerg and terran behave inversely to one another when you go up and down the leagues, zerg has a higher percentage of players in the higher leagues while terran has a higher percentage in the lower leagues

if OP is correct and the game is balanced (which is a ridiculous assumption as his numbers mean nothing other than the AMM is working) then one can only conclude that noobs pick terran and pros pick zerg :D:D

overall the balanced or not discussion is pretty dumb, there's a reason blizzard holds off with the first patch after release; any balance or imbalance cannot be detected this early in the game at skill levels from bronze to mid diamond, simply because neither side is playing their race to full potential, you can only hope to notice imbalance in high level tournament play and even then you have to account for a player's subjective bias and just random strokes of chance.

tl;dr
OP means nothing
Wr3k
Profile Blog Joined June 2009
Canada2533 Posts
August 16 2010 23:13 GMT
#42
On August 17 2010 08:08 mahnini wrote:
On August 17 2010 08:04 Wr3k wrote:
On August 17 2010 07:53 mahnini wrote:
On August 17 2010 07:51 Wr3k wrote:
On August 17 2010 07:48 ejac wrote:
The problem the way you're calculating the imbalances is that you're assuming that races should have different win percentages. A 1000 point terran may only be as skilled as an 800 point zerg, but both may have 50% win records at their perspective levels. It's just the terran has a racial imbalance allowing him to play competitively at 1000 points.

This graph: http://www.sc2ranks.com/stats/race/us/1
shows that as points goes up, terran starts to dominate the ranks more and more, and zerg gets worse and worse.


Yeah, this is much more scientific "proof" that Terran is in fact OP, and Z is the worst race.

I mean cmon... look at it, you would have to be blind to not see a relationship: http://www.sc2ranks.com/stats/race/us/1

that's ignoring the fact that as skill level gets higher sample size becomes smaller and you have to compensate for a margin of error (calculate significance or something?). 70:30 with a sample size of 1000000 is vastly different from 70:30 with a sample size of 10.


Yes, the sample size is very small, but there are only so many 800+ diamond players to get data from. SC2 needs to be balanced at the highest level. It's the only sample size we have. A small sample size with an analysis that actually makes sense is still infinitely better than one with a large sample size that is completely and utterly flawed. I know the sc2ranks numbers are unreliable due to their size, but there are still enough players above 600/700 to show that the difference in racial distribution is significant. All the OP has shown with his numbers is that the matchmaking system is working properly.

saying there are enough players to make the imbalance significant doesn't make it so. there are calculations for this but i'm terrible and don't know how to do them. though the OP itself doesn't show us anything revelatory, it does give us a concrete basis off which we can make assumptions such as: blizzard's matchmaking works properly, therefore we can look towards racial distribution at certain levels to help gauge imbalance.

if zerg make up the same proportion of the top levels as of the general population, it means the ladder perceives the skill levels of those zergs to be high, which would not be the case if a certain matchup were extremely imbalanced.


I didn't say the imbalance was significant, I said there was a significant difference between the # of terrans and # of zergs in 600+ diamond.

I agree with the rest of your post.
mahnini
Profile Blog Joined October 2005
United States6862 Posts
August 16 2010 23:13 GMT
#43
i posted this in a different thread. and there was a thread that also talked about this topic specifically but i don't remember which one it was.
actually, if you disregard random as a race, the number of zerg in the top 100 is proportional to the number of zergs in the general population: ~21.75% of the general population vs 22% of the top 100. protoss is underrepresented at ~39.20% vs 32%, and terran is overrepresented at ~39.05% vs 46%.

i haven't taken stats in a while so i don't know how to calculate the margin of error or whatever but clearly zerg isn't underrepresented.
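
For what it's worth, the margin of error asked about here can be sketched in a few lines, using the shares quoted in the post. This assumes the top 100 behaves like a random sample of 100 players (it is not one, so treat the result as a rough guide) and uses the usual normal approximation for a proportion; the names `margin_of_error` and `intervals` are just for illustration.

```python
import math

# Shares quoted in the post, with random excluded.
population_share = {"zerg": 0.2175, "protoss": 0.3920, "terran": 0.3905}
top100_share = {"zerg": 0.22, "protoss": 0.32, "terran": 0.46}
N = 100  # assumption: the top-100 shares are out of exactly 100 players

def margin_of_error(share, n, z=1.96):
    """Half-width of an approximate 95% confidence interval for a proportion."""
    return z * math.sqrt(share * (1 - share) / n)

intervals = {}
for race, share in top100_share.items():
    moe = margin_of_error(share, N)
    intervals[race] = (share - moe, share + moe)
    low, high = intervals[race]
    inside = low <= population_share[race] <= high
    print(f"{race}: {share:.0%} +/- {moe:.1%}, population share inside interval: {inside}")
```

With only 100 players each share carries a margin of roughly 8 to 10 percentage points, so under these assumptions all three general-population shares fall inside the top-100 intervals.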
the world's a playground. you know that when you're a kid, but somewhere along the way everyone forgets it.
Dave[9]
Profile Blog Joined October 2003
United States2365 Posts
August 16 2010 23:16 GMT
#44
Aren't you obviously stretching the meaning of "scientific proof" when you don't even use a legitimate peer review process? Sounds like another amateur's (which doesn't measure knowledge fyi) opinion to me.
http://www.teamliquid.net/forum/viewmessage.php?topic_id=104154&currentpage=316#6317
mahnini
Profile Blog Joined October 2005
United States6862 Posts
Last Edited: 2010-08-16 23:17:44
August 16 2010 23:17 GMT
#45
but dave[9] this IS the peer review process!

we are doing science! :D
the world's a playground. you know that when you're a kid, but somewhere along the way everyone forgets it.
Gnial
Profile Blog Joined July 2010
Canada907 Posts
August 16 2010 23:18 GMT
#46
Very well written and comprehensible.

Although I agree with a lot of people that say this data can't give us a conclusive answer about balance, I think people need to realize that it doesn't mean this analysis is useless.

This is a very good analysis of one part of the equation.
1, eh? 2, eh? 3, eh?
0neder
Profile Joined July 2009
United States3733 Posts
August 16 2010 23:18 GMT
#47
So, how is your thesis different for BW?
hifriend
Profile Blog Joined June 2009
China7935 Posts
August 16 2010 23:19 GMT
#48
Lol this ought to be good. Brb reading.
Gnial
Profile Blog Joined July 2010
Canada907 Posts
August 16 2010 23:20 GMT
#49
On August 17 2010 08:08 Jameser wrote:
another funny thing about those numbers,
http://www.sc2ranks.com/stats/league/all/1/all

zerg and terran behave inversely to one another when you go up and down the leagues, zerg has a higher percentage of players in the higher leagues while terran has a higher percentage in the lower leagues

if OP is correct and the game is balanced (which is a ridiculous assumption as his numbers mean nothing other than the AMM is working) then one can only conclude that noobs pick terran and pros pick zerg :D:D

overall the balanced or not discussion is pretty dumb, there's a reason blizzard holds off with the first patch after release; any balance or imbalance cannot be detected this early in the game at skill levels from bronze to mid diamond, simply because neither side is playing their race to full potential, you can only hope to notice imbalance in high level tournament play and even then you have to account for a player's subjective bias and just random strokes of chance.

tl;dr
OP means nothing


More pros still pick Terran over Zerg. All it means is that noobs don't pick zerg.
1, eh? 2, eh? 3, eh?
ayababa
Profile Joined May 2010
Australia347 Posts
August 16 2010 23:22 GMT
#50
Looks like all the bronze Terran are getting 6 pool'ed and proxied by pylon lol. poor bronzies.
Evolving as a player is fun and rewarding. eg: figuring out you can kite a zealot by moving and stopping etc.
Well done is better than well said - Benjamin Franklin
GooseBoy
Profile Joined April 2010
United States66 Posts
August 16 2010 23:24 GMT
#51
Get back to me when you have calculated the win percentage against respective race.

User was warned for this post


No need to warn, as this man speaks the truth.
No_eL
Profile Joined July 2007
Chile1438 Posts
August 16 2010 23:27 GMT
#52
it's not only a matter of results and charts... zerg is way more difficult to play than the other races, and bad terran and protoss players are doing so well against very good zerg players that everyone is getting mad about it. It takes such an incredible effort for zerg players to maintain good positioning and win rates that i expect to see zerg dominance when the game finally reaches a plateau in balance (the two expansions to come and many future patches will change the game to create the ultimate RTS that everyone is waiting for)
Beat after beat i will become stronger.
Phant
Profile Joined August 2010
United States737 Posts
August 16 2010 23:29 GMT
#53
I think the only people who know the true state of balance from a statistics standpoint is Blizzard. In order to make an accurate analysis you would need to know info only Blizzard knows (all this stuff I hear about hidden rating...).

I'm sure everyone knows that you don't always get paired with people from your own league, and often get put into a game with higher-league players. I'm just going to make up an arbitrary "true" rating for this: let's say a Terran player has a rating of 1000. You could compare the win % of that Terran player when paired against someone within +-10 of his rating, so 990 to 1010. Then you could look at the win % against people rated 200 or more points higher. An imbalance would be obvious if Terran had a higher win rate against higher-rated opponents than any of the other races.

Of course this is just 100% speculation, I don't know how the matchmaking system works.
mahnini
Profile Blog Joined October 2005
United States6862 Posts
August 16 2010 23:31 GMT
#54
On August 17 2010 08:27 No_eL wrote:
it's not only a matter of results and charts... zerg is way more difficult to play than the other races, and bad terran and protoss players are doing so well against very good zerg players that everyone is getting mad about it. It takes such an incredible effort for zerg players to maintain good positioning and win rates that i expect to see zerg dominance when the game finally reaches a plateau in balance (the two expansions to come and many future patches will change the game to create the ultimate RTS that everyone is waiting for)

while your input may be true, frankly i think that is a cop out (and off topic!). it is near impossible to balance the difficulty of races while maintaining the diversity that starcraft does. terran was widely accepted as the most mechanically heavy race at pretty much all levels in sc1 and yet there were no large complaints about imbalance there.
the world's a playground. you know that when you're a kid, but somewhere along the way everyone forgets it.
splcer
Profile Joined October 2009
United States166 Posts
August 16 2010 23:31 GMT
#55
On August 17 2010 07:18 StarcraftGuy4U wrote:
None of these stats are worthwhile because the matchmaking system does not assign people like they would in a blind study, instead it is actively adjusting the matches so that every player reaches 50%. The numbers you are pulling are worthless for this reason.

+1, this is true, so statistics won't show any real imbalances. sadly this thread was mostly pointless, good effort though :D
That which grows fast, whithers as rapidly. That which grows slowly, endures
virgozero
Profile Joined May 2010
Canada412 Posts
August 16 2010 23:41 GMT
#56
On August 17 2010 08:08 Jameser wrote:
another funny thing about those numbers,
http://www.sc2ranks.com/stats/league/all/1/all

zerg and terran behave inversely to one another when you go up and down the leagues, zerg has a higher percentage of players in the higher leagues while terran has a higher percentage in the lower leagues

if OP is correct and the game is balanced (which is a ridiculous assumption as his numbers mean nothing other than the AMM is working) then one can only conclude that noobs pick terran and pros pick zerg :D:D

overall the balanced or not discussion is pretty dumb, there's a reason blizzard holds off with the first patch after release; any balance or imbalance cannot be detected this early in the game at skill levels from bronze to mid diamond, simply because neither side is playing their race to full potential, you can only hope to notice imbalance in high level tournament play and even then you have to account for a player's subjective bias and just random strokes of chance.

tl;dr
OP means nothing

and that's only a part of it.

there are another few billion assumptions, such as

higher points = more skilled
that everyone does the same build and every game is carried out similarly

that is, you're treating an all-in proxy reaper 800 point terran the same as a siege cliff abuse 800 point terran. Heck, they could be the same guy doing different builds at different times.

the imbalance is dependent on what people do, not on wins and losses.

Check out IdrA's win-loss ratio: he wins almost every game. Yet he is complaining about terran simply because terran does (in his eyes) hold an advantage. This has NOTHING to do with wins and losses.
FiWiFaKi
Profile Blog Joined February 2009
Canada9858 Posts
August 16 2010 23:41 GMT
#57
This scientific proof is incorrect for the reasons stated several times above. The most accurate way to find racial imbalance is:

What percentage of all Zerg players are in diamond?
What percentage of all Zergs are 600 ELO plus diamond?

The higher the number the more accurate it will be if you have a large enough sample size.
In life, the journey is more satisfying than the destination. || .::Entrepreneurship::. Living a few years of your life like most people won't, so that you can spend the rest of your life like most people can't || Mechanical Engineering & Economics Major
GagnarTheUnruly
Profile Joined July 2010
United States655 Posts
August 16 2010 23:42 GMT
#58
On August 17 2010 08:04 stochastic wrote:
OP, I really like what you’ve attempted here (as a probability and statistics major!).

Perhaps a more meaningful test would be a Chi-Square test of independence on the number of diamond level players of each race:

Find percentage of total players of each race (at ALL levels). Using this, find the expected number of diamond ranked players of each race under the assumption that league placement is independent of race. Use the appropriate Chi-Square test statistic to determine the likelihood that the observed proportion of diamond players of each race fits what would be expected under independence.

I think this will help alleviate bias caused by the matchmaking service, as all we care about is that it places people into diamond correctly.

Also, we will have accounted for the fact that more terran players on the whole means more terran players in diamond, if the skill level of players across races is equal.


A problem is that the skill level of players across races may not be equal. But I have to think that players of greater skill tend to choose the “stronger” race, so the relative strength of that race would be reflected in our results.


I have a feeling, given the large sample size, that I'd get a similar result -- there would be a strongly significant but small heterogeneity in the distribution of races as you move up a ladder.

@ Warent, to do this test I calculated the average win rate within a league, then calculated the proportion of games played in each league by each race. I multiplied the number of games played by each race by the average win rate to calculate the expected number of wins for each race if wins were homogeneously distributed (i.e. equally likely to occur for each race within a league).

Doing the statistics is simple; you just subtract the observed result (number of won games for a given race) from the expected result, square that quantity, and divide it by the expected result. Then you sum the results for each race. This gives you a standardized score for the degree of difference between the observed and expected results that's greater when the difference is greater, and that follows a chi-square distribution. You then look up the value you get on a chi-square distribution with the appropriate degrees of freedom to get your p-value.
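
The steps described above can be sketched roughly as follows. The game and win counts are invented purely for illustration (they are not the data behind the analysis), and the function names are hypothetical. The p-value line relies on the fact that with three races there are 2 degrees of freedom, and for 2 degrees of freedom the chi-square upper tail is exactly exp(-x/2).

```python
import math

def expected_wins(games_by_race, league_win_rate):
    """Expected wins per race if wins were spread evenly across races."""
    return {race: games * league_win_rate for race, games in games_by_race.items()}

def chi_square_statistic(observed, expected):
    """Sum over races of (observed - expected)^2 / expected."""
    return sum((observed[race] - expected[race]) ** 2 / expected[race] for race in observed)

# Hypothetical counts for one league (NOT real ladder data).
games = {"terran": 4000, "protoss": 3500, "zerg": 2500}
wins = {"terran": 2040, "protoss": 1730, "zerg": 1230}

league_win_rate = sum(wins.values()) / sum(games.values())
expected = expected_wins(games, league_win_rate)
statistic = chi_square_statistic(wins, expected)

# Three races give 2 degrees of freedom; for df = 2 the upper tail is exp(-x / 2).
p_value = math.exp(-statistic / 2)
print(f"chi-square = {statistic:.3f}, p = {p_value:.3f}")
```

For more than three categories the closed form no longer applies and the p-value would have to come from a chi-square table or a statistics library.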
GagnarTheUnruly
Profile Joined July 2010
United States655 Posts
Last Edited: 2010-08-16 23:56:49
August 16 2010 23:54 GMT
#59
Sorry to double post, but I wanted this to stand alone:

Judging by the quality of the discussion and the number of alternate hypotheses and research suggestions I've gotten, I must've done good science after all!

If I had to write a new conclusion based on the discussion we've had here, I'd say this:

Given the data I had access to, I did the only analysis I think that was available to me. What it showed was that there is essentially no difference in the performance of the races viewed on a per-game (not per-player) basis, at least within the top four leagues. What this suggests is that regardless of race, players in each league are having a similar experience. Within a given league, no particular race is more likely to win a given game than one of the other races. Whether it's because of inherently strong balance or good matchmaking, the system is adept at providing a balanced experience for players at a given level regardless of race.

Another worthwhile conclusion, in my opinion, is that for an individual person who feels they are the victim of race imbalances, what they are actually experiencing is just trouble with a particular matchup. These data offer what in my opinion is pretty strong evidence that race alone is not a strong determinant of success at a given player level. You can't randomly pick a game played by a platinum player and predict the outcome based on the race of that player.

I think it's really important that people recognize that these data were gathered based on games played, not players playing the games. There are some important implications of that and I think some of the criticisms of the 'model' assumptions aren't as relevant when viewed in a per-game context.

In order to really answer a question about balance I acknowledge that a more rigorous analysis is needed. I'd do it, except I've already used up all the data I have access to. If anyone wants to send me data I'll happily test alternative hypotheses. I also want everyone to know that I'm not taking this too seriously, and you shouldn't either! My tongue was in my cheek a bit when I wrote this, but I acknowledge that that doesn't communicate on the internet.
darmousseh
Profile Blog Joined May 2010
United States3437 Posts
August 17 2010 00:05 GMT
#60
You do realize that all statistics relating win percentage are irrelevant because of the matchmaking scheme? Matchmaking ensures that all players have roughly a 50% win percentage. The only way to test imbalance is to have a large number of games not using the matchmaking system.
Developer for http://mtgfiddle.com
hadoken5
Profile Joined May 2010
Canada519 Posts
August 17 2010 00:09 GMT
#61
I think I posted something like this, but I didn't do it with all the graphs like you did. Good job though!
stochastic
Profile Joined April 2010
United States16 Posts
August 17 2010 00:09 GMT
#62
You may find this site, and this particular page useful: http://rts-sanctuary.com/index.php?portal=TAD&act=sc2Statistics&cmd=05
mahnini
Profile Blog Joined October 2005
United States6862 Posts
August 17 2010 00:11 GMT
#63
On August 17 2010 09:09 hadoken5 wrote:
I think I posted something like this, but I didn't do it with all the graphs like you did. Good job though!

don't kid yourself
the world's a playground. you know that when you're a kid, but somewhere along the way everyone forgets it.
GagnarTheUnruly
Profile Joined July 2010
United States655 Posts
August 17 2010 00:12 GMT
#64
On August 17 2010 09:05 darmousseh wrote:
You do realize that all statistics relating win percentage are irrelevant because of the matchmaking scheme? Matchmaking ensures that all players have roughly a 50% win percentage. The only way to test imbalance is to have a large number of games not using the matchmaking system.


Again, I'm not testing players, I'm testing games. The data I'm using show not that players have a 50% win ratio, but that races have a 50% win ratio on a per-game basis. I'm not saying you're totally wrong, but it's important to recognize the difference and to keep in mind that per-game win rate and per-player win rate are only indirectly related.
st3roids
Profile Joined June 2010
Greece538 Posts
August 17 2010 00:13 GMT
#65
I also think imbalances occur more in the pro leagues, where good players try to exploit the different openings as terran or protoss and have the required apm and multitasking to do it.

Also, some builds are extremely powerful, but this doesn't mean that everyone knows them and performs them perfectly, or does the very same builds all the time.
petered
Profile Joined February 2010
United States1817 Posts
August 17 2010 00:26 GMT
#66
Distribution of race amongst leagues is sadly not valid as an indicator of racial balance. It makes the key assumption that players of different skill levels pick the races in the same proportions.
More specifically, it assumes that a diamond player is just as likely to choose terran as a bronze player. So the fact that zerg makes up a larger share of diamond than of bronze could suggest that zerg is not weak, or it could suggest that bronze newbies are just less likely to pick zerg.

The key to making comparisons is always to hold one factor constant. AMM makes this very very difficult to do.

The best way to compare racial imbalances, in my mind, is to collect info on random players, and their win ratios with different races. In that case you are holding the skill of the player constant, and hence you can make some progress. You still have to make some pretty big assumptions, but it will get you farther than pure win/loss ratios.
This, my friends, is the power of the Shikyo Memorial for QQ therapy thread. We make the world a better place, one chainsaw massacre prevention at a time.
Gnial
Profile Blog Joined July 2010
Canada907 Posts
August 17 2010 00:27 GMT
#67
On August 17 2010 08:54 GagnarTheUnruly wrote:
Another worthwhile conclusion, in my opinion, is that for an individual person who feels they are the victim of race imbalances, what they are actually experiencing is just trouble with a particular matchup. These data offer what in my opinion is pretty strong evidence that race alone is not a strong determinant of success at a given player level. You can't randomly pick a game played by a platinum player and predict the outcome based on the race of that player.


This doesn't prove that there aren't matchup imbalances.

Terran could beat Protoss 60% of the time, Protoss could beat Zerg 60% of the time, and Zerg could beat Terran 60% of the time.

At the end of the day each race would have an approximately ~50% win ratio, as supported by your graphs and charts.

However, TvP, PvZ and ZvT would all be imbalanced. The imbalances would just cancel one another out in terms of overall win ratios.
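
To make the arithmetic concrete, here is a minimal sketch of the 60% example. It assumes each race meets the other two equally often and ignores mirror matchups; both are simplifying assumptions added for illustration, and the numbers are the hypothetical ones from the post, not ladder data.

```python
# Hypothetical matchup win rates from the example: each race beats one
# opponent 60% of the time and therefore loses to the other 60% of the time.
win_rate = {
    ("T", "P"): 0.60,
    ("P", "Z"): 0.60,
    ("Z", "T"): 0.60,
}

def p_win(a, b):
    """Probability that race a beats race b in a non-mirror matchup."""
    if (a, b) in win_rate:
        return win_rate[(a, b)]
    return 1.0 - win_rate[(b, a)]

def overall_win_rate(race, races=("T", "P", "Z")):
    """Average win rate, assuming both opponents are met equally often."""
    opponents = [r for r in races if r != race]
    return sum(p_win(race, o) for o in opponents) / len(opponents)

overall = {r: overall_win_rate(r) for r in ("T", "P", "Z")}
for race, rate in overall.items():
    print(race, round(rate, 3))
```

Every race averages out to 50% overall even though no single matchup is even, which is why per-race win rates alone cannot rule out matchup imbalance.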
1, eh? 2, eh? 3, eh?
nihlon
Profile Joined April 2010
Sweden5581 Posts
Last Edited: 2010-08-17 00:30:05
August 17 2010 00:28 GMT
#68
On August 17 2010 08:54 GagnarTheUnruly wrote:
Another worthwhile conclusion, in my opinion, is that for an individual person who feels they are the victim of race imbalances, what they are actually experiencing is just trouble with a particular matchup. These data offer what in my opinion is pretty strong evidence that race alone is not a strong determinant of success at a given player level. You can't randomly pick a game played by a platinum player and predict the outcome based on the race of that player.


So if I compile a list of all TvZ games in Platinum, will I find a perfectly even winning percentage or not? You can't say the bolded part and then claim the game is balanced...



Banelings are too cute to blow up
koswinner
Profile Joined April 2010
United Kingdom27 Posts
Last Edited: 2010-08-17 00:46:37
August 17 2010 00:32 GMT
#69
On August 17 2010 08:01 GagnarTheUnruly wrote:
On August 17 2010 07:45 mahnini wrote:
given enough games and time winrates for players on any ladder should be approaching 50% so i guess if we assume that blizzard's matchmaking system is working correctly, we can look to racial distribution at top levels for an indication of balance right?


I agree. Here's a graph of the racial distributions. The y-axis is proportion of that race in the games played pool for a certain league.

[image loading]

This indicates that racial imbalances aren't causing weak races to get held back in lower leagues. If some of the earlier criticisms were true, that matchmaking obviates differences in racial performance, we should see some races gaining prominence and others losing it as you move through the leagues. In particular, the races indicated as slightly weak in my analysis should fall out of diamond. Comparing silver through diamond you can see that this isn't the case. For example, zerg gets more common.

I think it's reasonable to conclude that the races are pretty balanced, but I acknowledge that some of the criticisms I'm getting are valid.

The analysis was a chi-square analysis comparing observed distributions vs. homogeneous distributions (assumed under random sorting).

Also, this is definitely science, because it uses a hypothesis-based testing approach. Math is just a tool to accomplish the science. Whether it's good science seems to be stimulating a rigorous debate LOL.

The 'real' mathematical way to test for imbalance would probably require treating players individually, and using regression-based approaches to predict performance based on race, league placement, etc. That way one could parse out the influence race has on win rate. I don't have access to that kind of data, though. The best would be to control for player as a variable, to see if players consistently perform better with certain races than others.

Edit: these^^ are games played not players active, so take the graph with the appropriate grain of salt.

This is just bs. You are omitting various factors in your analysis. For example, at lower levels, when players get crushed with a certain race, they tend to change race easily. That is, a significant variable you have omitted from that diagram is attachment to a certain race, which is obviously positively correlated with skill level. This is because the amount of 'investment' in a certain race increases with skill level, and the player's utility is usually a function of 'value of investment', something like max{value of investment in T, value of investment in P, value of investment in Z}, with the ratio (value of investment / time or effort invested) serving as an effective indicator of racial balance, assuming a representative agent who is trying to maximise his utility. To avoid or minimize this problem you should either gather some reliable information about the parameter of this variable or pick a sample that excludes it, i.e. pick the 'most attached' bracket: diamond, or even high-end diamond, pro leagues and tournaments.
Picking some result and trying to interpret it as solely caused by one factor when obviously there are other factors at work is an indication that either you are very biased, i.e. have a strong incentive to distort the result towards a certain direction, or your level of skill in utilizing 'scientific method' is just horrible.
So, this is not science, just some kid trying to prove his view in the name of science with the help of a pseudo/naive/broken scientific method.
rextyrann
Profile Joined July 2009
Germany41 Posts
August 17 2010 00:44 GMT
#70
On August 17 2010 08:01 GagnarTheUnruly wrote:

The analysis was a chi-square analysis comparing observed distributions vs. homogeneous distributions (assumed under random sorting).



In none of your posts do you answer the problem of the matchmaking system, but this sentence of yours should be explanation enough. It is NOT a random sorting; that's why it is a matchmaking system. That's why the test you used doesn't apply to those stats.

But kudos for your work. It is very well presented, and apart from the wrong starting point of a random sorting it would be significant. There is just no way of analysing balance issues by stats alone unless we have all the data about matchup stats and a sample of games from outside the matchmaking system...
st3roids
Profile Joined June 2010
Greece538 Posts
Last Edited: 2010-08-17 00:54:00
August 17 2010 00:53 GMT
#71
I'd like to ask: there are literally millions of games each day on bnet.

How can you gather that data accurately without working for Blizzard?
Chronald
Profile Joined December 2009
United States619 Posts
August 17 2010 01:01 GMT
#72
Lol, you make it seem like you are trying to prove the game is imba. Yet your conclusion shows that it is so slightly imba that it isn't noticeable.

I think your theory about map-making is the right approach. To really do away with the zerg early game weakness, maps need to be bigger. But these issues will be sorted out soon.

Don't forget to check out the iCCup maps, they are way more balanced than the Blizz official ones.
Got that.
holy_war
Profile Blog Joined July 2007
United States3590 Posts
August 17 2010 01:02 GMT
#73
On August 17 2010 09:53 st3roids wrote:
I'd like to ask: there are literally millions of games each day on bnet.

How can you gather that data accurately without working for Blizzard?


Data crawling of Battle.net profiles from people's accounts.
Chimpalimp
Profile Joined May 2010
United States1135 Posts
August 17 2010 01:03 GMT
#74
This is all and stuff but you have to take percentage of people that play each race at each level into account. I am fairly sure that a much higher percentage of bronze players play Terran as any other race, because as everyone may know: Zerg is ICKY, Toss is for queers, and Terran is America.
I like money. You like money too? We should hang out.
GagnarTheUnruly
Profile Joined July 2010
United States655 Posts
Last Edited: 2010-08-17 01:18:52
August 17 2010 01:13 GMT
#75
On August 17 2010 09:26 petered wrote:
Distribution of race amongst leagues is sadly not valid as an indicator of racial balance. It makes the key assumption that players of different skill level are picking the different races at the same distribution.


Graphing race distribution against league level isn't a statistical test and therefore it doesn't make any assumptions. What that graph shows is that roughly equivalent numbers of games are being played by a particular race at each league level. What this suggests is that there is no sorting effect whereby a weak race is held back in lower leagues because players who favor that race have trouble advancing while losing games with it. It is an indirect way of testing that hypothesis. Viewed in the context of the other data, it suggests (but doesn't prove) that AMM is not the only, or even an important, factor in keeping race performance even within leagues.

I totally agree that it would be great to analyze the data using player as an explicit factor, but I don't have access to that data.

This doesn't prove that there aren't matchup imbalances.

Terran could beat Protoss 60% of the time, Protoss could beat Zerg 60% of the time, and Zerg could beat Terran 60% of the time.

At the end of the day each race would have an approximately ~50% win ratio, as supported by your graphs and charts.

However, TvP, PvZ and ZvT would all be imbalanced. The imbalances would just cancel one another out in terms of overall win ratios.


I agree totally. It would be fun to do that but again I lack the data. If someone can get it for me, I'll do that analysis.

This is just bs. You are omitting various factors in your analysis. At lower levels, when players get crushed with a certain race, they tend to change race easily. That is, a significant variable you have omitted from that diagram is attachment to a certain race, which is obviously positively correlated with skill level. This is because the amount of 'investment' in a certain race increases with skill level, and the player's utility is usually a function of 'value of investment', something like max{value of investment in T, value of investment in P, value of investment in Z}, with the ratio (value of investment / time or effort invested) serving as an effective indicator of racial balance, assuming a representative agent who is trying to maximise his utility. To avoid or minimize this problem you should either gather some reliable information about the parameter of this variable or pick a sample that excludes it, i.e. pick the 'most attached' bracket: diamond, or even high-end diamond, pro leagues and tournaments.
Picking some result and trying to interpret it as solely caused by one factor when obviously there are other factors at work is an indication that either you are very biased, i.e. have a strong incentive to distort the result towards a certain direction, or your level of skill in utilizing 'scientific method' is just horrible.
So, this is not science, just some kid trying to prove his view in the name of science with the help of a pseudo/naive/broken scientific method.


This post is not very constructive. What you're suggesting is an absurdly complex model. And please don't disparage my abilities as a scientist. I'm actually a really good scientist and I have some skill at dealing with difficult data.

I would like to be able to use a regression model to see how race, placement, and matchup affect the performance of individual players, but as I've noted repeatedly I don't have access to that data. In science when you can't get certain data you need to take indirect approaches that often involve making important assumptions. Often, there are ways to test those assumptions either directly or indirectly, but in this case the data set is extremely complicated, particularly due to match placement.

Also, I really need to emphasize that very few assumptions are required to do a chi square test. There are no distributional assumptions to the test. It simply tells us very clearly that within each of the leagues, if a match is picked at random the outcome is totally independent of the one race entering that match. The test doesn't assume that the players are distributed randomly among the races or anything like that. It just tests the hypothesis that states are nonrandomly distributed among the categories being analyzed. The data show that within a league the races have quantifiably different but functionally equal chances of winning randomly selected games. This is a point of fact. There are three non-mutually exclusive possible causes for this that I can think of:

1) the balance is good
2) the matchmaking system is accommodating for poor balance
3) the matchup balance or map balance is poor but it evens out when you ignore the confounding factors

There is no way to test the third cause, so we need to suspend it for now, and refer to better judgment that it is probably happening but may not be extremely important. It's certainly a hypothesis that bears testing, however. The second cause can be tested indirectly by graphing race use frequency with league status. Since there appears to be no pattern, it suggests that the second cause is also not important. This leaves the first cause. Given consideration of the possible causes of this pattern, it is a reasonable conclusion that good balance is probably largely responsible. It also means it's ignorant to make statements like 'Terran is unbalanced,' because there is no evidence to support such a statement, and because the evidence that does exist suggests the opposite.

This is not to say that high level players like IdrA, who play in a rarefied realm with tight builds and well rehearsed timing, might not sense conditions that give certain races advantages at certain times. Certainly in BW we've witnessed major shifts of the 'metagame' that resulted in periods of dominance for the various races.
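
For readers who have not run one, here is a minimal sketch of the kind of chi-square test discussed above, applied to a race-by-outcome table for a single league. The counts are made up for illustration and are not the data from this thread. The sketch also treats every game record as independent, which is itself an assumption. The p-value line uses the fact that for 2 degrees of freedom the chi-square tail probability is exactly exp(-x/2).

```python
import math

# Hypothetical (made-up) win/loss counts for one league; not real ladder data.
observed = {
    "Terran":  (5100, 4900),
    "Protoss": (5000, 5000),
    "Zerg":    (4900, 5100),
}

def chi_square_independence(table):
    """Pearson chi-square statistic for a rows-by-2 contingency table."""
    row_totals = {k: sum(v) for k, v in table.items()}
    col_totals = [sum(v[j] for v in table.values()) for j in range(2)]
    grand = sum(row_totals.values())
    stat = 0.0
    for k, counts in table.items():
        for j, obs in enumerate(counts):
            # Expected count if outcome were independent of race.
            expected = row_totals[k] * col_totals[j] / grand
            stat += (obs - expected) ** 2 / expected
    return stat

stat = chi_square_independence(observed)
dof = (len(observed) - 1) * (2 - 1)   # 3 races, 2 outcomes -> 2 degrees of freedom
p_value = math.exp(-stat / 2)          # exact tail probability when dof == 2

print(round(stat, 3), dof, round(p_value, 4))
```

With these made-up counts, a 51% vs. 49% split over 10,000 games per race is already statistically detectable, which is one way win rates can be quantifiably different yet close enough that few players would notice in practice.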
MamiyaOtaru
Profile Blog Joined September 2008
United States1687 Posts
Last Edited: 2010-08-17 01:21:39
August 17 2010 01:17 GMT
#76
On August 17 2010 10:18 Oddysay wrote:
Finally, someone who shows zerg are not imbalanced and don't need a buff

This shows no such thing. Matchmaking ruins this sort of analysis.

if *if* zerg is really underpowered, zerg players will get placed down until they are playing worse terrans, toss, or other zerg of a similar level until their win rate normalizes. The win rate looks normal, but says nothing about how the terrans and toss they are playing would be ranked a lot lower if they were zerg, or how the zerg would be ranked higher if he was a terran or toss.
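
The argument above can be sketched as a toy simulation. Everything here is hypothetical: the size of the Zerg handicap, the skill distribution, the logistic win model, and the assumption that ladder rating has already converged to effective strength. It is only meant to show the mechanism, not to model Battle.net.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is repeatable

ZERG_PENALTY = 0.5   # hypothetical handicap, in the same units as skill
N_PLAYERS = 3000
ROUNDS = 10

# Each player has a hidden true skill and a race. Effective strength is what
# actually decides games, so it is also what a ladder rating would track.
players = []
for _ in range(N_PLAYERS):
    race = random.choice("TPZ")
    skill = random.gauss(0.0, 1.0)
    strength = skill - (ZERG_PENALTY if race == "Z" else 0.0)
    players.append({"race": race, "skill": skill, "strength": strength})

# Assume matchmaking has already converged: rating == effective strength.
ladder = sorted(players, key=lambda p: p["strength"])

wins = {"T": 0, "P": 0, "Z": 0}
games = {"T": 0, "P": 0, "Z": 0}
for rnd in range(ROUNDS):
    start = rnd % 2  # alternate pairings so players meet both neighbours
    for i in range(start, N_PLAYERS - 1, 2):
        a, b = ladder[i], ladder[i + 1]
        # Logistic win probability based on the gap in effective strength.
        p_a = 1.0 / (1.0 + math.exp(b["strength"] - a["strength"]))
        winner = a if random.random() < p_a else b
        wins[winner["race"]] += 1
        games[a["race"]] += 1
        games[b["race"]] += 1

win_rate = {r: wins[r] / games[r] for r in "TPZ"}

def mean_skill(group):
    return sum(p["skill"] for p in group) / len(group)

# Compare hidden skill among the top 20% of the ladder.
top = ladder[-(N_PLAYERS // 5):]
zerg_top = mean_skill([p for p in top if p["race"] == "Z"])
other_top = mean_skill([p for p in top if p["race"] != "Z"])

print(win_rate, round(zerg_top, 2), round(other_top, 2))
```

In this toy setup every race sits near a 50% win rate, while the Zerg players near the top of the ladder have noticeably higher hidden skill than the Terran and Protoss players they are matched against. Because races here are picked independently of skill, Zerg also ends up rarer at the top, so whether that shows up in real data depends on who picks which race.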
Oddysay
Profile Blog Joined October 2007
Canada597 Posts
August 17 2010 01:18 GMT
#77
Finally, someone who shows zerg are not imbalanced and don't need a buff

Practice and get good; stop hoping Blizzard will fix the game so you can win
GagnarTheUnruly
Profile Joined July 2010
United States655 Posts
August 17 2010 01:21 GMT
#78
On August 17 2010 10:17 MamiyaOtaru wrote:
matchmaking kills this. if *if* zerg is really underpowered, zerg players will get placed down until they are playing worse terrans, toss, or other zerg of a similar level until their win rate normalizes. The win rate looks normal, but says nothing about how the terrans and toss they are playing would be ranked a lot lower if they were zerg, or how the zerg would be ranked higher if he was a terran or toss.


But we would see that on a graph of race use vs. placement. In that case we would see more zerg in lower leagues and fewer in higher leagues. We don't see such a pattern, suggesting that if matchmaking is having an effect, it probably isn't a strong one. Of course there's no way to know for sure without testing it directly.
blueblimp
Profile Joined May 2009
Canada297 Posts
Last Edited: 2010-08-17 01:24:49
August 17 2010 01:23 GMT
#79
On August 17 2010 09:26 petered wrote:
The best way to compare racial imbalances, in my mind, is to collect info on random players, and their win ratios with different races. In that case you are holding the skill of the player constant, and hence you can make some progress. You still have to make some pretty big assumptions, but it will get you farther than pure win/loss ratios.


Yes. In fact, not only is it the best way to evaluate racial imbalance, I claim it's the only way short of getting players to actively participate in a study. To see why this is, consider an arbitrary race, say Protoss. Looking at statistics alone, if you don't use Random, how are you going to tell the difference between "Protoss players are 2x as skilled as non-Protoss players" and "Protoss is 2x as good as other races"? It's completely impossible. I know that's a contrived example, but it's exactly the problem with evaluating Zerg balance, given that the race is pretty unfriendly to newbies.

So to sum up: please look at games random players play and evaluate their per-race stats.

Edit: I'm not saying this way is flawless either. You'll still have doubts about "well maybe random players tend to be higher-skilled at race X than race Y", but at least that's not likely to make a big difference.
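
A minimal sketch of that per-race comparison for Random players. The records, player names, and tuple layout are all made up for illustration; real data would have to come from somewhere like a profile crawl. Win rates are computed per player first, so that each player's skill is held fixed before the races are compared.

```python
from collections import defaultdict

# Hypothetical records for players who queue as Random:
# (player, race the game assigned, whether they won). Made-up data.
games = [
    ("alice", "T", True), ("alice", "T", False), ("alice", "Z", False),
    ("alice", "Z", False), ("alice", "P", True), ("alice", "P", True),
    ("bob", "T", True), ("bob", "T", True), ("bob", "Z", True),
    ("bob", "Z", False), ("bob", "P", False), ("bob", "P", True),
]

def per_player_race_win_rates(records):
    """Return {player: {race: win rate}} so skill is held fixed per player."""
    tally = defaultdict(lambda: defaultdict(lambda: [0, 0]))  # [wins, games]
    for player, race, won in records:
        tally[player][race][0] += int(won)
        tally[player][race][1] += 1
    return {
        player: {race: w / n for race, (w, n) in races.items()}
        for player, races in tally.items()
    }

def average_by_race(rates):
    """Average each race's win rate across players, weighting players equally."""
    totals = defaultdict(list)
    for races in rates.values():
        for race, rate in races.items():
            totals[race].append(rate)
    return {race: sum(v) / len(v) for race, v in totals.items()}

rates = per_player_race_win_rates(games)
by_race = average_by_race(rates)
print(by_race)
```

Weighting players equally is a design choice; it stops one very active player from dominating the result, at the cost of noisy estimates for players with few games.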
AmishNukes
Profile Joined May 2010
United States98 Posts
August 17 2010 01:23 GMT
#80
You say that the differences are small, but if you consider the fact that matchmaking is designed to keep win rates at 50% and the sample size is large you end up with small percentage-wise deviations being the sign of a much larger imbalance.
johnlee
Profile Joined June 2009
United States242 Posts
August 17 2010 01:25 GMT
#81
Lots of effort put in this, so propz for that.

But what you've essentially done was SIMPLY show variance between winning percentages for each race. So if the data suggested at every level all the races' winning percentages were 50%, would it mean that the game was balanced?

Nope. Not. At. All. Absolutely not.

When we're discussing the term "imbalance" in SC2 we're talking about a specific part of the game that gives an advantage to a specific race such that the other race or races must either rely on the opponent's mistakes to win or be disadvantaged in the game.

For example, when SC1 first came out, spawning pools were 150 minerals. 4 pool was UNBELIEVABLY strong and anyone who chose to use that build order would most likely win.

BUT for different reasons such as "not wanting to be cheap" or "wanting a standard macro game" people would opt out of that option, and we'd still see a good number of protoss and terran at the top of the ladders. Does that mean the game was balanced? No.

I'm not suggesting that SC2 is balanced, nor am I suggesting that it is imbalanced. I just want to point out that imbalance should only be figured out by seeing which BO or units provide an un-overcome-able (couldn't think of a word LOL) or at least very-difficult-to-overcome advantage to a race between players of EQUAL skill. Exactly how can we quantify that? I don't know.

But we can't show imbalance by using your numbers. I feel bad cause I feel like I'm shutting down your work... but it's wrong

All of this with due respect.
Bore
MamiyaOtaru
Profile Blog Joined September 2008
United States1687 Posts
Last Edited: 2010-08-17 01:27:40
August 17 2010 01:26 GMT
#82
On August 17 2010 10:21 GagnarTheUnruly wrote:
On August 17 2010 10:17 MamiyaOtaru wrote:
matchmaking kills this. if *if* zerg is really underpowered, zerg players will get placed down until they are playing worse terrans, toss, or other zerg of a similar level until their win rate normalizes. The win rate looks normal, but says nothing about how the terrans and toss they are playing would be ranked a lot lower if they were zerg, or how the zerg would be ranked higher if he was a terran or toss.


But we would see that on a graph of race use vs. placement. In that case we would see more zerg in lower leagues and fewer in higher leagues. We don't see such a pattern, suggesting that if matchmaking is having an effect, it probably isn't a strong one. Of course there's no way to know for sure without testing it directly.

that assumes each race is chosen equally. What we do know is zerg is the least played. The percentage goes up as you go higher, but it's hard to rule out the possibility that trend is due to newbies playing them less (because they are harder lol). It's almost impossible to measure balance outside of the very top players
koswinner
Profile Joined April 2010
United Kingdom27 Posts
August 17 2010 01:35 GMT
#83
On August 17 2010 10:13 GagnarTheUnruly wrote:
On August 17 2010 09:26 petered wrote:
Distribution of race amongst leagues is sadly not valid as an indicator of racial balance. It makes the key assumption that players of different skill level are picking the different races at the same distribution.


Graphing race distribution against league level isn't a statistical test and therefore it doesn't make any assumptions. What that graph shows is that roughly equivalent numbers of games are being played by a particular race at each league level. What this suggests is that there is no sorting effect whereby a weak race is held back in lower leagues because players who favor that race have trouble advancing while losing games with it. It is an indirect way of testing that hypothesis. Viewed in the context of the other data, it suggests (but doesn't prove) that AMM is not the only, or even an important, factor in keeping race performance even within leagues.

I totally agree that it would be great to analyze the data using player as an explicit factor, but I don't have access to that data.

This doesn't prove that there aren't matchup imbalances.

Terran could beat Protoss 60% of the time, Protoss could beat Zerg 60% of the time, and Zerg could beat Terran 60% of the time.

At the end of the day each race would have an approximately ~50% win ratio, as supported by your graphs and charts.

However, TvP, PvZ and ZvT would all be imbalanced. The imbalances would just cancel one another out in terms of overall win ratios.


I agree totally. It would be fun to do that but again I lack the data. If someone can get it for me, I'll do that analysis.

This is just bs. You are omitting various factors in your analysis. At lower levels, when players get crushed with a certain race, they tend to change race easily. That is, a significant variable you have omitted from that diagram is attachment to a certain race, which is obviously positively correlated with skill level. This is because the amount of 'investment' in a certain race increases with skill level, and the player's utility is usually a function of 'value of investment', something like max{value of investment in T, value of investment in P, value of investment in Z}, with the ratio (value of investment / time or effort invested) serving as an effective indicator of racial balance, assuming a representative agent who is trying to maximise his utility. To avoid or minimize this problem you should either gather some reliable information about the parameter of this variable or pick a sample that excludes it, i.e. pick the 'most attached' bracket: diamond, or even high-end diamond, pro leagues and tournaments.
Picking some result and trying to interpret it as solely caused by one factor when obviously there are other factors at work is an indication that either you are very biased, i.e. have a strong incentive to distort the result towards a certain direction, or your level of skill in utilizing 'scientific method' is just horrible.
So, this is not science, just some kid trying to prove his view in the name of science with the help of a pseudo/naive/broken scientific method.


This post is not very constructive. What you're suggesting is an absurdly complex model. And please don't disparage my abilities as a scientist. I'm actually a really good scientist and I have some skill at dealing with difficult data.

I would like to be able to use a regression model to see how race, placement, and matchup affect the performance of individual players, but as I've noted repeatedly I don't have access to that data. In science when you can't get certain data you need to take indirect approaches that often involve making important assumptions. Often, there are ways to test those assumptions either directly or indirectly, but in this case the data set is extremely complicated, particularly due to match placement.

Also, I really need to emphasize that very few assumptions are required to do a chi square test. There are no distributional assumptions to the test. It simply tells us very clearly that within each of the leagues, if a match is picked at random the outcome is totally independent of the one race entering that match. The test doesn't assume that the players are distributed randomly among the races or anything like that. It just tests the hypothesis that states are nonrandomly distributed among the categories being analyzed. The data show that within a league the races have quantifiably different but functionally equal chances of winning randomly selected games. This is a point of fact. There are three non-mutually exclusive possible causes for this that I can think of:

1) the balance is good
2) the matchmaking system is accommodating for poor balance
3) the matchup balance or map balance is poor but it evens out when you ignore the confounding factors

There is no way to test the third cause, so we need to suspend it for now, and refer to better judgment that it is probably happening but may not be extremely important. It's certainly a hypothesis that bears testing, however. The second cause can be tested indirectly by graphing race use frequency with league status. Since there appears to be no pattern, it suggests that the second cause is also not important. This leaves the first cause. Given consideration of the possible causes of this pattern, it is a reasonable conclusion that good balance is probably largely responsible. It also means it's ignorant to make statements like 'Terran is unbalanced,' because there is no evidence to support such a statement, and because the evidence that does exist suggests the opposite.

This is not to say that high level players like IdrA, who play in a rarefied realm with tight builds and well rehearsed timing, might not sense conditions that give certain races advantages at certain times. Certainly in BW we've witnessed major shifts of the 'metagame' that resulted in periods of dominance for the various races.


1. What I said was not actually a proposal to use that model to test the balance issue; the model was just backing my point that that particular variable (attachment to race) is very likely to be a significant factor in the overly simplified model you were proposing. And I have already enlightened you on how to bypass a problem like that: for that particular problem, pick samples within the same group, and I already pointed out that data for high-end diamond is readily available.
If you know how to run a regression then I assume you know the devastating effect of omitting one significant variable, don't you? Not to mention that what you omitted is more than one significant variable. So basically what I was proposing was just a multi-factor model, which is soooo common in practice; your single-factor model is just 'absurdly oversimplified'. With such a skyrocketing error term and a tiny R-squared caused by omitting significant variables, as an objective scientist I have no idea how you could claim that your overly simplified single-factor model explains anything at all. So the logic is simple: if that model is way too simplified to get the result, don't claim you got the result with some scientific method.

2. 'It also means it's ignorant to make statements like 'Terran is unbalanced,' because there is no evidence to support such a statement, and becasue the evidence that does exist suggests the opposite.'
ROFL, 'EVIDENCE', you call the result of your overwhelmingly over simplified model .... EVIDENCE??
And you are ignoring all the other, much more reliable indicators, like the proportion of Z at top level, or opinion polls around the world about 'the weakest race' and 'is ZvT imba'.
Nice scientist :D
petered
Profile Joined February 2010
United States1817 Posts
Last Edited: 2010-08-17 01:38:54
August 17 2010 01:37 GMT
#84
I am not suggesting that any of your math is incorrect, just the way you are interpreting it. You are making really big assumptions that just don't work in my mind.

Within any given game, the chosen race has very little implication on likelihood of victory. Ok, this is a true fact which you have proven.

However, you then go on to state that if there were an imbalance, it would most likely show up in the distribution of races to the different league. This is the spot where you are making assumptions. You are assuming little to no race changing, you are assuming(as I said before) that people of different skill level have the same probability of choosing race X, and you are assuming that people in the same league are getting matched up against similar opponents, which we don't even know. The top players in a league might get matched up against players from the higher league more often.

I really appreciate your efforts but you just can't make conclusions from the tests you have done. Likely there is no way to determine racial imbalances from the data provided.
This, my friends, is the power of the Shikyo Memorial for QQ therapy thread. We make the world a better place, one chainsaw massacre prevention at a time.
Bitters
Profile Blog Joined August 2010
Canada303 Posts
August 17 2010 01:40 GMT
#85
Not sure if this has been mentioned yet but...

This only looks at leagues instead of ranks within. Really, this seems to be more of a test of the matchmaking system than anything.

If in each league, the top 66.66% ranks were a mix of terran and protoss, with zerg comprising the bottom 33% ranks, how would your test account for this? If the ranks were laid out like this, there would be a clear imbalance since "diamond" zergs couldn't outrank "diamond toss" and would be getting more wins from platinum players. Obviously, this is an extreme case, but it does raise an issue.

Also, looking at racial wins versus other races might be enlightening. What are the TvZ and ZvP percentage wins by race? If balanced, you would assume 50% in each case. However, with this data, you could still run into the problem of breaking it down only by leagues.
gods_basement
Profile Blog Joined August 2010
United States305 Posts
August 17 2010 01:47 GMT
#86
one of the main pillars of the "Terran Imbalance" argument is that terran players are just worse than the rest of the world, so therefore the game is imbalanced, because the statistics are even when skill levels are not.
(TT~TT)
GagnarTheUnruly
Profile Joined July 2010
United States655 Posts
August 17 2010 01:53 GMT
#87
@koswinner

I have to admit that the fact that you personally attacked me in your post caused me not to read yours very carefully. I'll try not to repeat that mistake but I still don't know how I could accurately parameterize 'value of investment.' I suppose it would be interesting to see if people do better with one race the more they played with it to the exclusion of the other races, but this would only influence balance if it was more important in some races than others. I just don't think it's the most likely alternative explanation for the results I showed.

As far as omitting variables from a regression, obviously you reduce the predictive power of your model, but it doesn't really affect your ability to determine the importance of the effects that you can model. It's all sort of a moot point for the time being because I have no ability to get player-specific data to run a regression model.
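
Whether omitting a variable leaves the remaining estimates intact depends on whether the omitted variable is correlated with the ones kept in the model. Here is a minimal sketch with made-up numbers, where a hypothetical 'attachment' score `z` stands in for the omitted variable and `x` is a race indicator. The outcome is built to depend only on `z`, so the true effect of `x` is zero by construction.

```python
def ols_slope(x, y):
    """Slope from a one-variable least-squares fit of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var

# Made-up data: x is a race indicator (0 or 1), z is a hypothetical
# 'attachment' score. The outcome y equals z, so x has no real effect.
x = [0, 0, 0, 0, 1, 1, 1, 1]
z_uncorrelated = [0, 1, 0, 1, 0, 1, 0, 1]  # same spread of z in both groups
z_correlated = [0, 1, 0, 1, 1, 2, 1, 2]    # z is higher when x == 1

# Regress the outcome on x alone, leaving z out of the model.
slope_uncorrelated = ols_slope(x, [float(z) for z in z_uncorrelated])
slope_correlated = ols_slope(x, [float(z) for z in z_correlated])

print(slope_uncorrelated, slope_correlated)
```

When `z` is unrelated to `x`, leaving it out only costs predictive power and the estimated effect of `x` stays at zero. When `z` is correlated with `x`, the fit attributes the effect of `z` to `x`, so the race coefficient looks nonzero even though race does nothing in this toy data.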

In any case, the reason I did this was for fun. Hopefully people are enjoying this post, or at least having fun picking on me. Definitely people pointed out some things that I didn't think of, but I still think it's neat to think about the results I got and what they might imply about the state of balance of the game. And yes, I do think my results constitute 'evidence.' It's obvious you aren't convinced, but I'm glad others seem to have found some value in my little project.

@ negative feedback people: no worries, I'm not bothered. It's nice to get constructive feedback even if it's negative!
GagnarTheUnruly
Profile Joined July 2010
United States655 Posts
Last Edited: 2010-08-17 02:10:56
August 17 2010 01:58 GMT
#88
On August 17 2010 10:40 Bitters wrote:
Not sure if this has been mentioned yet but...

This only looks at leagues instead of ranks within. Really, this seems to be more of a test of the matchmaking system than anything.

If in each league, the top 66.66% ranks were a mix of terran and protoss, with zerg comprising the bottom 33% ranks, how would your test account for this? If the ranks were laid out like this, there would be a clear imbalance since "diamond" zergs couldn't outrank "diamond toss" and would be getting more wins from platinum players. Obviously, this is an extreme case, but it does raise an issue.

Also, looking at racial wins versus other races might be enlightening. What are the TvZ and ZvP percentage wins by race? If balanced, you would assume 50% in each case. However, with this data, you could still run into the problem of breaking it down only by leagues.


Both good points. As for the first, it seems unlikely that races would be nonrandomly distributed within leagues and not among leagues. In fact, if you go to some of the sources that have been posted, you can see that race distributions are even within leagues. As for the cause, it's hard to say, but the most parsimonious explanation is that the matchmaking system is responding to the races equally. This is consistent with a hypothesis of good general balance.

I'd really like to do the second part. Maybe someone at sc2ranks will see this and send me some raw data.

However, you then go on to state that if there were an imbalance, it would most likely show up in the distribution of races to the different league. This is the spot where you are making assumptions. You are assuming little to no race changing, you are assuming(as I said before) that people of different skill level have the same probability of choosing race X, and you are assuming that people in the same league are getting matched up against similar opponents, which we don't even know. The top players in a league might get matched up against players from the higher league more often.


My reasoning is this: Let's assume that the races are imbalanced and that players have race affinity and never change affinity. If the matchmaking system works, the players with race affinity for a weak race will lose more games than the players with affinity for the strongest race. They'll have trouble rising in the ranks, and the strong-race players will rise quickly, resulting in an uneven distribution of races.

Now relax the assumption about race switching. The players who are unsuccessful will switch over to the superior race, amplifying the effect.

Now assume the matchmaking system doesn't work -- so players are randomly distributed among the leagues -- and the races are imbalanced. In that case the strong race will be more likely to win its matches within each league, because the matchmaking system isn't biasing the leagues at all. The strong race will emerge as the race with the best win %. Even if the matchmaking system could explicitly move players based on both their points and their race (which it doesn't), the only way for it to balance win ratios would be for it to place the entire spectrum of player skills for each race into each league (which we know it doesn't do for obvious reasons).

On the other hand, if the game is balanced and the matchmaking system works, then we get the observed results in the most parsimonious way imaginable. Because of its parsimony and its ability to explain the observed (and admittedly incomplete) data, it's the best starting point as a hypothesis for future investigation.
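
The sorting argument above can be sketched as a toy model. The assumptions here are not from the thread: skill is drawn from a standard normal distribution, race is picked uniformly at random, the "strong" race (labelled T purely for illustration) gets a fixed bonus, and a working matchmaking system is approximated by simply ranking players on effective strength. It illustrates the reasoning only; it is not a model of the actual ladder.

```python
import random

def strong_race_share_in_top(n_players=30000, bonus=0.5, top_fraction=0.2, seed=0):
    """Return (overall share of T, share of T in the top bracket).

    Hypothetical toy model: effective strength = skill + bonus for race T.
    """
    rng = random.Random(seed)
    players = []
    for _ in range(n_players):
        race = rng.choice("TPZ")          # race chosen independently of skill
        skill = rng.gauss(0.0, 1.0)       # assumed skill distribution
        strength = skill + (bonus if race == "T" else 0.0)
        players.append((strength, race))
    # A "working" matchmaker is approximated by a perfect sort on strength.
    players.sort(reverse=True)
    top = players[: int(n_players * top_fraction)]
    overall = sum(race == "T" for _, race in players) / n_players
    in_top = sum(race == "T" for _, race in top) / len(top)
    return overall, in_top

overall, in_top = strong_race_share_in_top()
print(f"T share overall: {overall:.3f}, T share in top bracket: {in_top:.3f}")
```

With `bonus=0.0` the two shares should come out roughly equal, which is the "no sorting effect" pattern the post describes for the observed data.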
koswinner
Profile Joined April 2010
United Kingdom27 Posts
August 17 2010 02:07 GMT
#89
On August 17 2010 10:53 GagnarTheUnruly wrote:
@koswinner

I have to admit that the fact that you personally attacked me in your post caused me not to read yours very carefully. I'll try not to repeat that mistake, but I still don't know how I could accurately parameterize 'value of investment.' I suppose it would be interesting to see if people do better with one race the more they play it to the exclusion of the other races, but this would only influence balance if it mattered more for some races than others. I just don't think it's the most likely alternative explanation for the results I showed.

As far as omitting variables from a regression, obviously you reduce the predictive power of your model, but it doesn't really affect your ability to determine the importance of the effects that you can model. It's all sort of a moot point for the time being because I have no ability to get player-specific data to run a regression model.

In any case, the reason I did this was for fun. Hopefully people are enjoying this post, or at least having fun picking on me. People definitely pointed out some things that I didn't think of, but I still think it's neat to think about the results I got and what they might imply about the state of balance of the game. And yes, I do think my results constitute 'evidence.' It's obvious you aren't convinced, but I'm glad others seem to have found some value in my little project.

@ negative feedback people: no worries, I'm not bothered. It's nice to get constructive feedback even if it's negative!

Actually, value of investment is the easy thing to get. I believe that if we leave out the top pros who do not play the ladder seriously, rating points could serve as a valid proxy for it.

Omitting more than one significant variable and relying on a single-factor model will probably drive your R-squared down to some pathetic value with a huge error term. In practice, we throw a model like this straight into the rubbish bin instead of trying to interpret its pathetic 'predictive power'. If you are relying on such a thing to support your claim, then don't call it 'scientific', because in no way is it. Your point is no better than that of anyone who argues the game is imbalanced based on just one of the many potentially significant variables.



TheMick
Profile Joined April 2010
Great Britain164 Posts
August 17 2010 02:16 GMT
#90
So the percentage difference is barely noticeable, 1-2%; it could just mean there are slightly better players playing Terran.
Good work though, and nicely laid out.
http://eu.battle.net/sc2/en/profile/265104/1/HyperioN/ My SC2 profile!
Bitters
Profile Blog Joined August 2010
Canada303 Posts
August 17 2010 02:17 GMT
#91
another interesting thing to look at might be how these stats change over time. we are still less than a month after release, and people are learning new tricks and how to abuse features of their race.

if terran is overpowered due to whatever reasons, we may see these trends increase over time as their players lean towards whatever unit or composition, etc. makes them imba.

what might be an interesting stat to test (if possible) would be the average diamond league points by race. divisional ranks would be good too, however with how divisions work (like new ones) ranks themselves aren't very meaningful. summing all diamond league points by race might give a better insight on how terran is performing within the league.
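
A minimal sketch of the per-race points comparison suggested above. The player records are invented for illustration; real numbers would have to come from a ladder data source such as the one mentioned elsewhere in the thread.

```python
from collections import defaultdict

def mean_points_by_race(players):
    """Average league points per race from (race, points) records."""
    totals = defaultdict(lambda: [0, 0])  # race -> [sum of points, player count]
    for race, points in players:
        totals[race][0] += points
        totals[race][1] += 1
    return {race: total / count for race, (total, count) in totals.items()}

# Hypothetical diamond-league sample, not real ladder data.
sample = [("T", 600), ("T", 400), ("Z", 450), ("P", 500), ("Z", 550)]
print(mean_points_by_race(sample))
```

Averaging rather than summing keeps the comparison from simply reflecting which race has the most players in the league.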
mahnini
Profile Blog Joined October 2005
United States6862 Posts
August 17 2010 02:18 GMT
#92
On August 17 2010 10:35 koswinner wrote:
On August 17 2010 10:13 GagnarTheUnruly wrote:
On August 17 2010 09:26 petered wrote:
Distribution of race amongst leagues is sadly not valid as an indicator of racial balance. It makes the key assumption that players of different skill level are picking the different races at the same distribution.


Graphing race distribution against league level isn't a statistical test and therefore it doesn't make any assumptions. What that graph shows is that roughly equivalent numbers of games are being played by a particular race at each league level. What this suggests is that there is no sorting effect, whereupon a weak race is held back into lower leagues because players that favor that race are having trouble advancing because they are losing games with that race. It is an indirect way of testing that hypothesis. Viewed in the context of the other data, it suggests (but doesn't prove) that AMM is not the only, or even an important, factor in keeping race performance even within leagues.

I totally agree that it would be great to analyze the data using player as an explicit factor, but I don't have access to that data.

This doesn't prove that there aren't matchup imbalances.

Terran could beat Protoss 60% of the time, Protoss could beat Zerg 60% of the time, and Zerg could beat Terran 60% of the time.

At the end of the day each race would have an approximately ~50% win ratio, as supported by your graphs and charts.

However, TvP, PvZ and ZvT would all be imbalanced. The imbalances would just cancel one another out in terms of overall win ratios.


I agree totally. It would be fun to do that but again I lack the data. If someone can get it for me, I'll do that analysis.

This is just bs. You are omitting various factors in your analysis. At lower levels, when players get crushed with a certain race, they tend to change race easily. i.e. a significant variable you have omitted from that diagram is attachness to a certain race, which is obviously positively correlated with skill level. This is just because the amount of 'investment' in a certain race increases with skill level, and the players' utility is usually a function of 'value of investment', which is something like max{Value of investment in T, value of investment in P, value of investment in Z}. With the ratio of (value of investment/time or effort invested) an effective indicator of ratio balance, assuming an representative agent who is trying to maximise his utility. To avoid/minimize this problem you should either gather some reliable information about the parameter of this variable or picking some sample which will exclude this, i.e. pick the 'most attached' bracket, i.e. diamond, or even high end diamond, pro leagues and tournament.
Picking some result and trying to interpret it as solely caused by one factor when obviously there are other factors at work is an indication that either you are very biased, i.e. have a strong incentive to distort the result towards a certain direction, or your level of skill in utilizing 'scientific method' is just horrible.
So, this is not science, just some kid trying to prove his view in the name of science with the help of pseudo/naive/broken scientific method.
Last edit: 2010-08-17 09:38:21


This post is not very constructive. What you're suggesting is an absurdly complex model. And please don't disparage my abilities as a scientist. I'm actually a really good scientist and I have some skill at dealing with difficult data.

I would like to be able to use a regression model to see how race, placement, and matchup affect the performance of individual players, but as I've noted repeatedly I don't have access to that data. In science when you can't get certain data you need to take indirect approaches that often involve making important assumptions. Often, there are ways to test those assumptions either directly or indirectly, but in this case the data set is extremely complicated, particularly due to match placement.

Also, I really need to emphasize that very few assumptions are required to do a chi square test. There are no distributional assumptions to the test. It simply tells us very clearly that within each of the leagues, if a match is picked at random the outcome is totally independent of the one race entering that match. The test doesn't assume that the players are distributed randomly among the races or anything like that. It just tests the hypothesis that states are nonrandomly distributed among the categories being analyzed. The data show that within a league the races have quantifiably different but functionally equal chances of winning randomly selected games. This is a point of fact. There are three non-mutually exclusive possible causes for this that I can think of:

1) the balance is good
2) the matchmaking system is compensating for poor balance
3) the matchup balance or map balance is poor but it evens out when you ignore the confounding factors

There is no way to test the third cause, so we need to suspend it for now, and refer to better judement that it is probably happening but may not be extremely important. It's certainly a hypothesis that bears testing, however. The second cause can be tested indirectly by graphing race use frequency with league status. Since there appears to be no pattern, it suggests that the second cause is also not important. This leaves the first cause. Given consideration of the possible causes of this pattern, it is a reasonable conclusion that good balance is probably largely responsible. It also means it's ignorant to make statements like 'Terran is unbalanced,' because there is no evidence to support such a statement, and becasue the evidence that does exist suggests the opposite.

This is not to say that high level players like IdrA, who play in a rarefied realm with tight builds and well rehearsed timing, might not sense conditions that give certain races advantages at certain times. Certainly in BW we've witnessed major shifts of the 'metagame' that resulted in periods of dominance for the various races.


1. What I said was actually not a proposition of that model to test the balance issue, that model was just backing my point that that particular variable (attacheness to race) is very likely to be signficant factor in the overly simplified model you were proposing. And I have already enlightened you how to bypass problem like that. i.e. for that particular problem, pick samples within the same group, and I already pointed out that datas for high end diamond is readily available.
If you know how to run a regression then I assume you should know the devastating effect it will be in omitting one significant variable, don't you? Not to mention what you omitted is not only one significant variable.. So basically what I was proposing was just a multi-factor model, which is soooo common in practice, your single factor model is just 'absurdly oversimplified'. With such a skyrocketing error term and a tiny R-square caused by omitting significant variables, as a objective scientist I have no idea how could you claim that your overly simplified single factor model could explain anything at all. So the logic is simple, if that model is way toooo simplified to get the result, don't claim you got the result with some scientific method.

2. 'It also means it's ignorant to make statements like 'Terran is unbalanced,' because there is no evidence to support such a statement, and becasue the evidence that does exist suggests the opposite.'
ROFL, 'EVIDENCE', you call the result of your overwhelmingly over simplified model .... EVIDENCE??
And you are ignoring all the other, much more reliable indicators, like the proportion of Z at the top level, or opinion polls around the world about 'the weakest race' and 'is ZvT imba'.
Nice scientist :D

testing for attachedness to race would bring about even more headache inducing factors such as style of play, mechanical requirements, and depth of understanding. we can go on and on about missing factors but we are able to make certain conclusions with the data we already have i think. anyway like i stated before the proportion of top 100 zerg players matches that of the proportion of zerg players in the general population (if someone wants to check my math and do statistical magic on it, that'd be great).

a lot of what's going on is that we have SOME concrete data that weighs in favor of the races being balanced. it's not a 100% thorough scientific study, but that doesn't mean you should turn a blind eye towards it. after all, the opposing argument is simply referencing anecdotal evidence of zvt being hard and pointing out that A, B, and C top zerg players say it's imbalanced.
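
One way to do the "statistical magic" on the top-100 comparison is an exact binomial test. This is a sketch only: the counts in the example are hypothetical placeholders, not figures from the thread, and it assumes each top-100 slot is an independent draw from the general population.

```python
from math import comb

def binomial_two_sided_p(k, n, p):
    """Exact two-sided binomial p-value: the total probability of all
    outcomes that are no more likely than the observed count k."""
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    cutoff = pmf[k] * (1 + 1e-9)  # small tolerance for floating-point ties
    return min(1.0, sum(q for q in pmf if q <= cutoff))

# Hypothetical: 25 Zerg in the top 100, 25% Zerg in the general population.
print(binomial_two_sided_p(25, 100, 0.25))
```

A large p-value means the top-100 count is consistent with the population share; it does not prove the races are balanced.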
the world's a playground. you know that when you're a kid, but somewhere along the way everyone forgets it.
RMmanlots
Profile Joined May 2010
United States95 Posts
Last Edited: 2010-08-17 02:20:49
August 17 2010 02:19 GMT
#93
I hate to say it, but this analysis misses a critically important point. Because of Blizzard's automatic matchmaking system, all players should win about the same % of games. If a crappy player plays an OP race, they should rise in the standings until they level out at about a 50% win rate.

Basically, win rate % is not a reliable way of determining if a race is overpowered. The problem isn't Terran's win rate %, it's the considerably lower level of skill needed to achieve that %.
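
The levelling-out effect described above can be sketched with a toy Elo-style ladder. Everything here is an assumption for illustration: the logistic win curve, the K-factor, and the idea that every opponent is exactly as strong as the player's current rating. It is not Blizzard's actual matchmaking system.

```python
import random

def simulate_ladder(true_skill, games=4000, k=16.0, seed=0):
    """Return (final rating, win rate over the second half of the games)."""
    rng = random.Random(seed)
    rating = 0.0
    results = []
    for _ in range(games):
        # Assumption: the opponent's strength equals our current rating.
        p_win = 1.0 / (1.0 + 10 ** (-(true_skill - rating) / 400.0))
        won = rng.random() < p_win
        results.append(won)
        # Rating moves up on a win and down on a loss by k / 2.
        rating += k * ((1.0 if won else 0.0) - 0.5)
    late = results[games // 2:]
    return rating, sum(late) / len(late)

for skill in (-800.0, 0.0, 800.0):
    rating, late_win_rate = simulate_ladder(skill)
    print(f"skill {skill:+.0f}: rating {rating:.0f}, late win rate {late_win_rate:.3f}")
```

In this sketch strong and weak players both settle near a 50% win rate; the difference shows up in where their rating ends up, not in their win percentage.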
Do you want to live forever?
GagnarTheUnruly
Profile Joined July 2010
United States655 Posts
August 17 2010 02:20 GMT
#94
On August 17 2010 11:07 koswinner wrote:
Actually, value of investment is the easy thing to get. I believe that if we leave out the top pros who do not play the ladder seriously, rating points could serve as a valid proxy for it.

Omitting more than one significant variable and relying on a single-factor model will probably drive your R-squared down to some pathetic value with a huge error term. In practice, we throw a model like this straight into the rubbish bin instead of trying to interpret its pathetic 'predictive power'. If you are relying on such a thing to support your claim, then don't call it 'scientific', because in no way is it. Your point is no better than that of anyone who argues the game is imbalanced based on just one of the many potentially significant variables.



This kind of stats argument might be best for PM's because it's drifting from the OT, but I want to make it clear that I never claimed (or should have claimed) that my data had predictive power. That's not really sensible with a simple chi-square analysis.
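
For readers unfamiliar with the test, here is a minimal sketch of a chi-square test of independence on a race-by-outcome table. The win/loss counts are hypothetical, not the data from the original post, and the critical value is the standard one for 2 degrees of freedom at the 0.05 level.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if race and outcome were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

CRITICAL_DF2_ALPHA05 = 5.991  # chi-square critical value, df = 2, alpha = 0.05

# Hypothetical [wins, losses] rows for T, P, Z within one league.
league = [[500, 500], [510, 490], [495, 505]]
stat = chi_square_statistic(league)
print(stat, stat > CRITICAL_DF2_ALPHA05)
```

A statistic below the critical value means the table gives no grounds to reject independence between race and outcome, which is a statement about association, not about prediction.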

Regarding regressions, excluding important effect variables can cause your r2 to go down, but adding unimportant effect variables artificially inflates r2 and reduces the accuracy of other parameters due to autocorrelations and spurious effects of the excess predictors. Adding predictors will always increase r2 but it's not always a good idea to add predictors. Often it's not the r2 that's important, but the parameter estimates. It's really unimportant because I don't have the data to do a regression and I probably never will.
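
A common way to account for the fact that r2 never drops when predictors are added is the adjusted r2, which penalizes each extra predictor. The post does not mention it; this sketch uses the standard textbook formula with made-up r2 values.

```python
def adjusted_r2(r2, n_samples, n_predictors):
    """Adjusted R-squared: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)

# Hypothetical: four extra predictors lift r2 only from 0.30 to 0.31.
print(adjusted_r2(0.30, n_samples=50, n_predictors=1))
print(adjusted_r2(0.31, n_samples=50, n_predictors=5))
```

Here the raw r2 goes up while the adjusted value goes down, which is the sense in which extra predictors can make a model look better without actually improving it.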
Biochemist
Profile Blog Joined February 2009
United States1008 Posts
August 17 2010 02:36 GMT
#95
I love how 80% of the posts in this thread are pointing out the exact same flaw in the study.
Sixes
Profile Joined July 2010
Canada1123 Posts
August 17 2010 02:37 GMT
#96
On August 17 2010 07:29 Toids wrote:
Ya.... you can't pull numbers from the ladder to explain balance. You need to get data from outside of the matchmaking system.


Having reached the same conclusion I was wondering if someone had done this.

Really there is only 1 sample I can think of, all the pro games in tournaments since release (as no balance changes have been made).

Taking every game (including rounds of 16 and up or so) should give a large sample size (though still likely biased by the skill of some individuals). Interesting stats would be the win percentages, mostly the matchup specific ones (say if Z was way better against P than T) as this might avoid some of the player specific error.

Anyone feel up to it?
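
Matchup-specific rates matter because overall win rates can hide a cyclic imbalance, as in the 60% example raised earlier in the thread. A small sketch, assuming each race plays the other two equally often and ignoring mirror matches:

```python
def overall_win_rates(matchup):
    """Overall win rate per race from one-directional matchup win rates.

    matchup maps (race_a, race_b) -> probability that race_a beats race_b.
    Assumes equal numbers of games against each other race (an illustration
    only, not how ladder games are actually distributed).
    """
    races = sorted({race for pair in matchup for race in pair})
    rates = {}
    for race in races:
        versus = []
        for other in races:
            if other == race:
                continue
            if (race, other) in matchup:
                versus.append(matchup[(race, other)])
            else:
                versus.append(1.0 - matchup[(other, race)])
        rates[race] = sum(versus) / len(versus)
    return rates

# The cyclic example: T beats P, P beats Z, Z beats T, each 60% of the time.
cyclic = {("T", "P"): 0.6, ("P", "Z"): 0.6, ("Z", "T"): 0.6}
print(overall_win_rates(cyclic))
```

Every race comes out at about 50% overall even though no single matchup is balanced.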
mahnini
Profile Blog Joined October 2005
United States6862 Posts
August 17 2010 02:41 GMT
#97
On August 17 2010 11:37 Sixes wrote:
On August 17 2010 07:29 Toids wrote:
Ya.... you can't pull numbers from the ladder to explain balance. You need to get data from outside of the matchmaking system.


Having reached the same conclusion I was wondering if someone had done this.

Really there is only 1 sample I can think of, all the pro games in tournaments since release (as no balance changes have been made).

Taking every game (including rounds of 16 and up or so) should give a large sample size (though still likely biased by the skill of some individuals). Interesting stats would be the win percentages, mostly the matchup specific ones (say if Z was way better against P than T) as this might avoid some of the player specific error.

Anyone feel up to it?

wouldn't that be far more susceptible to variance? we'd be able to rack up a lot of data but they would still be from the same 10-20 players over time.
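
To put a rough number on the variance concern, here is a sketch of a normal-approximation 95% interval for a win rate. The game counts are hypothetical, and the formula assumes independent games; results from the same 10-20 players are not independent, so the real uncertainty would be larger than this suggests.

```python
from math import sqrt

def win_rate_interval(wins, games, z=1.96):
    """Normal-approximation confidence interval for a win proportion."""
    p = wins / games
    margin = z * sqrt(p * (1 - p) / games)
    return p - margin, p + margin

print(win_rate_interval(55, 100))      # small tournament sample
print(win_rate_interval(5500, 10000))  # ladder-sized sample
```

With 100 games a 55% win rate is still compatible with 50%; with 10,000 games it is not, which is why sample size and independence both matter here.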
the world's a playground. you know that when you're a kid, but somewhere along the way everyone forgets it.
EZjijy
Profile Blog Joined February 2010
United States1039 Posts
August 17 2010 02:47 GMT
#98
The game hasn't even been out for a month yet. I'm sure the data is still premature, although those graphs do not look too bad at all. I mean, how balanced can you expect it to be? A straight 50% is never going to be possible. There is also the variable of better players playing more of a certain race.
TheGeo
Profile Joined July 2010
United States51 Posts
August 17 2010 02:52 GMT
#99
I do love the statistics behind this, but you don't take enough account of other things that could affect the data. Things like how a lot of the good players will play the race they think is the most "OP", which is usually Terran. The not-so-good players will not care, because they are not nearly as concerned about winning or losing. This skews the data to favor Terran at the high levels of play.
Geo the Geo
koswinner
Profile Joined April 2010
United Kingdom27 Posts
Last Edited: 2010-08-17 03:12:23
August 17 2010 03:02 GMT
#100
On August 17 2010 11:18 mahnini wrote:
On August 17 2010 10:35 koswinner wrote:
On August 17 2010 10:13 GagnarTheUnruly wrote:
On August 17 2010 09:26 petered wrote:
Distribution of race amongst leagues is sadly not valid as an indicator of racial balance. It makes the key assumption that players of different skill level are picking the different races at the same distribution.


Graphing race distribution against league level isn't a statistical test and therefore it doesn't make any assumptions. What that graph shows is that roughly equivalent numbers of games are being played by a particular race at each league level. What this suggests is that there is no sorting effect, whereupon a weak race is held back into lower leagues because players that favor that race are having trouble advancing because they are losing games with that race. It is an indirect way of testing that hypothesis. Viewed in the context of the other data, it suggests (but doesn't prove) that AMM is not the only, or even an important, factor in keeping race performance even within leagues.

I totally agree that it would be great to analyze the data using player as an explicit factor, but I don't have access to that data.

This doesn't prove that there aren't matchup imbalances.

Terran could beat Protoss 60% of the time, Protoss could beat Zerg 60% of the time, and Zerg could beat Terran 60% of the time.

At the end of the day each race would have an approximately ~50% win ratio, as supported by your graphs and charts.

However, TvP, PvZ and ZvT would all be imbalanced. The imbalances would just cancel one another out in terms of overall win ratios.


I agree totally. It would be fun to do that but again I lack the data. If someone can get it for me, I'll do that analysis.

This is just bs. You are omitting various factors in your analysis. At lower levels, when players get crushed with a certain race, they tend to change race easily. i.e. a significant variable you have omitted from that diagram is attachness to a certain race, which is obviously positively correlated with skill level. This is just because the amount of 'investment' in a certain race increases with skill level, and the players' utility is usually a function of 'value of investment', which is something like max{Value of investment in T, value of investment in P, value of investment in Z}. With the ratio of (value of investment/time or effort invested) an effective indicator of ratio balance, assuming an representative agent who is trying to maximise his utility. To avoid/minimize this problem you should either gather some reliable information about the parameter of this variable or picking some sample which will exclude this, i.e. pick the 'most attached' bracket, i.e. diamond, or even high end diamond, pro leagues and tournament.
Picking some result and trying to interpret it as solely caused by one factor when obviously there are other factors at work is an indication that either you are very biased, i.e. have a strong incentive to distort the result towards a certain direction, or your level of skill in utilizing 'scientific method' is just horrible.
So, this is not science, just some kid trying to prove his view in the name of science with the help of pseudo/naive/broken scientific method.
Last edit: 2010-08-17 09:38:21


This post is not very constructive. What you're suggesting is an absurdly complex model. And please don't disparage my abilities as a scientist. I'm actually a really good scientist and I have some skill at dealing with difficult data.

I would like to be able to use a regression model to see how race, placement, and matchup affect the performance of individual players, but as I've noted repeatedly I don't have access to that data. In science when you can't get certain data you need to take indirect approaches that often involve making important assumptions. Often, there are ways to test those assumptions either directly or indirectly, but in this case the data set is extremely complicated, particularly due to match placement.

Also, I really need to emphasize that very few assumptions are required to do a chi square test. There are no distributional assumptions to the test. It simply tells us very clearly that within each of the leagues, if a match is picked at random the outcome is totally independent of the one race entering that match. The test doesn't assume that the players are distributed randomly among the races or anything like that. It just tests the hypothesis that states are nonrandomly distributed among the categories being analyzed. The data show that within a league the races have quantifiably different but functionally equal chances of winning randomly selected games. This is a point of fact. There are three non-mutually exclusive possible causes for this that I can think of:

1) the balance is good
2) the matchmaking system is compensating for poor balance
3) the matchup balance or map balance is poor but it evens out when you ignore the confounding factors

There is no way to test the third cause, so we need to suspend it for now, and refer to better judement that it is probably happening but may not be extremely important. It's certainly a hypothesis that bears testing, however. The second cause can be tested indirectly by graphing race use frequency with league status. Since there appears to be no pattern, it suggests that the second cause is also not important. This leaves the first cause. Given consideration of the possible causes of this pattern, it is a reasonable conclusion that good balance is probably largely responsible. It also means it's ignorant to make statements like 'Terran is unbalanced,' because there is no evidence to support such a statement, and becasue the evidence that does exist suggests the opposite.

This is not to say that high level players like IdrA, who play in a rarefied realm with tight builds and well rehearsed timing, might not sense conditions that give certain races advantages at certain times. Certainly in BW we've witnessed major shifts of the 'metagame' that resulted in periods of dominance for the various races.


1. What I said was actually not a proposition of that model to test the balance issue, that model was just backing my point that that particular variable (attacheness to race) is very likely to be signficant factor in the overly simplified model you were proposing. And I have already enlightened you how to bypass problem like that. i.e. for that particular problem, pick samples within the same group, and I already pointed out that datas for high end diamond is readily available.
If you know how to run a regression then I assume you should know the devastating effect it will be in omitting one significant variable, don't you? Not to mention what you omitted is not only one significant variable.. So basically what I was proposing was just a multi-factor model, which is soooo common in practice, your single factor model is just 'absurdly oversimplified'. With such a skyrocketing error term and a tiny R-square caused by omitting significant variables, as a objective scientist I have no idea how could you claim that your overly simplified single factor model could explain anything at all. So the logic is simple, if that model is way toooo simplified to get the result, don't claim you got the result with some scientific method.

2. 'It also means it's ignorant to make statements like 'Terran is unbalanced,' because there is no evidence to support such a statement, and becasue the evidence that does exist suggests the opposite.'
ROFL, 'EVIDENCE', you call the result of your overwhelmingly over simplified model .... EVIDENCE??
And you are ignoring all the other, much more reliable indicators, like the proportion of Z at the top level, or opinion polls around the world about 'the weakest race' and 'is ZvT imba'.
Nice scientist :D

testing for attachedness to race would bring about even more headache inducing factors such as style of play, mechanical requirements, and depth of understanding. we can go on and on about missing factors but we are able to make certain conclusions with the data we already have i think. anyway like i stated before the proportion of top 100 zerg players matches that of the proportion of zerg players in the general population (if someone wants to check my math and do statistical magic on it, that'd be great).

a lot of what's going is we have SOME concrete data that weighs in the favor of the races being balanced it's not a 100% thorough scientific study but that doesn't mean you sure turn a blind eye towards it. afterall, the opposing argument is simply referencing anecdotal evidence of zvt being hard and pointing out that A, B, and C top zerg players say it's imbalanced.


Attachedness actually isn't that difficult to measure if you test it indirectly. For example, you could test the correlation between the change in each race's share of players and the opinion polls about which race is considered most powerful/imba. Even if there isn't enough of a sample for SC2 yet, we could certainly use similar games such as WC3 and SC1, which could probably serve as valid proxies. The data were available; nobody really bothered to record them. A simpler indicator could be a poll asking whether players would consider, or are considering, changing race if their race is having problems. Of course, this does not distinguish between the attachedness of players at different skill levels. If you want that, just run the same survey for different groups.
Your argument is valid ONLY IF the proportion of players at each level represents the balance, i.e. ONLY IF other factors are not affecting players' choice of race and change of race, and assuming each race's population has homogeneous characteristics, i.e. they have similar ability and none of them has to struggle harder than the other races to reach the same standing. Then, with a more detailed breakdown of brackets, such as the top 20 or 10 and tournament-oriented top pros, it would probably be a valid test. But obviously some of these assumptions are just too strong/unrealistic, as players do switch away from weak races to stronger ones. Just look at WC3. This effect is much less significant among top players, though, who have already invested heavily in their particular race.

So if you see quite a number of, or even a significant proportion of the very top players of one race (which is considered as weakest) is changing to other races (mainly the commonly considered imba one), while no top players from other races changed their race, is this just an accident? Does it say anything? How about win rate of these top players with their respective races in major tournaments? All of them, I believe, are much better / more reliable indicator compared to yours, as they require much simpler/realistic assumptions and are obvious enough to overwhelm the rest.


Milkis
Profile Blog Joined January 2010
5003 Posts
Last Edited: 2010-08-17 21:12:04
August 17 2010 03:10 GMT
#101
Edit: looks like i need to read more carefully
TitleRug
Profile Blog Joined May 2010
United States651 Posts
August 17 2010 03:17 GMT
#102
On August 17 2010 11:36 Biochemist wrote:
I love how 80% of the posts in this thread are pointing out the exact same flaw in the study.

lol, that's true. You did a good job OP. I appreciate your effort.
coLCruncher fighting!
OhJesusWOW
Profile Joined August 2010
United Kingdom127 Posts
August 17 2010 03:22 GMT
#103
OP, the work you put into this is impressive and I appreciate the effort you put into all of it. But basically, your conclusion was that Starcraft 2 is not perfectly balanced. Honestly, I don't think any matchup other than a mirror matchup can be considered perfectly balanced. I can't say that a study on these numbers was entirely necessary, or even valid for that matter. Continue the pursuit though! Nice presentation.
Red Bull is the new Mountain Dew.
koswinner
Profile Joined April 2010
United Kingdom27 Posts
Last Edited: 2010-08-17 03:36:09
August 17 2010 03:28 GMT
#104
On August 17 2010 11:20 GagnarTheUnruly wrote:
On August 17 2010 11:07 koswinner wrote:
actually.. value of investment is the easy thing you could get. I believe if we do not consider those top pros who does not play ladders seriously, the rating point could just be an valid proxy for it.

Omitting more than one significant variable and relying on a single factored model will probably cause your R-square to go to some pathetic value with a huge error term. In practice, we throw this model like this to rubish bin directly instead of trying to interpret its pathetic 'preditictive power'. If you are relying on such thing to support your claim, then don't call it 'scientific', because in no way it is. Your point is just not any better than anyone who argue it's imbalanced based on one of the many potentially significant variable.



This kind of stats argument might be best for PM's because it's drifting from the OT, but I want to make it clear that I never claimed (or should have claimed) that my data had predictive power. That's not really sensible with a simple chi-square analysis.

Regarding regressions, excluding important effect variables can cause your r2 to go down, but adding unimportant effect variables artificially inflates r2 and reduces the accuracy of other parameters due to autocorrelations and spurious effects of the excess predictors. Adding predictors will always increase r2 but it's not always a good idea to add predictors. Often it's not the r2 that's important, but the parameter estimates. It's really unimportant because I don't have the data to do a regression and I probably never will.

Are you copying/pasting from some source? I cannot see the logic in this statement, which looks a bit irrelevant.
What I have been talking about is omitting a SIGNIFICANT VARIABLE, which I stressed throughout my argument. How is that linked to your point that adding an insignificant variable is a bad thing?
Parameter estimates? Oh yeah, you know about parameter estimates, but don't you know that parameter estimates CARRY ALMOST NO CREDIBILITY if you omit important significant variables? For example, say variable A has an actual parameter value of -5, variable B has an actual parameter value of +10, and variable C has an actual parameter value of +5. If you run a regression on A alone, you may well get an estimate that A has a positive parameter value, depending on the degree of correlation between A, B, and C, which is obviously false and completely bullshit.
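
A minimal sketch of that sign flip, using synthetic data rather than anything from the ladder. The coefficients -5, +10, and +5 are the ones from the example; the 0.8 loading that ties B and C to A, the sample size, and the seed are assumptions picked only for illustration.

```python
import random

def ols_slope(x, y):
    """Slope of a one-variable least-squares fit of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var = sum((xi - mean_x) ** 2 for xi in x)
    return cov / var

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000

# Assumption: B and C are both positively correlated with A.
a = [rng.gauss(0, 1) for _ in range(n)]
b = [0.8 * ai + 0.6 * rng.gauss(0, 1) for ai in a]
c = [0.8 * ai + 0.6 * rng.gauss(0, 1) for ai in a]

# True model from the example: A = -5, B = +10, C = +5.
y = [-5 * ai + 10 * bi + 5 * ci for ai, bi, ci in zip(a, b, c)]

# Regress on A alone, leaving B and C out of the model.
slope_a_only = ols_slope(a, y)
print("true coefficient on A: -5")
print("estimate with B and C omitted:", round(slope_a_only, 2))
```

Under these assumptions the omitted terms load onto A, so the expected estimate is about -5 + 10(0.8) + 5(0.8) = +7 instead of -5. With B and C uncorrelated with A, the estimate would stay near -5, which is why the size of the problem depends on the correlation.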

R-square is just side evidence, not the main one.

I look like an idiot trying to educate you on what a scientific piece is when yours is far from it. I'm not wasting more time on it. Enjoy your 'Scientific Study'. :D

User was warned for this post
stepover12
Profile Joined May 2010
United States175 Posts
August 17 2010 03:35 GMT
#105
@OP, it's a decent study, but ladder data may not be good, as many have pointed out.
It's better to collect your own data from tournaments where the top pros play; that is where game balance matters most.
stepover12
Profile Joined May 2010
United States175 Posts
August 17 2010 03:50 GMT
#106
On August 17 2010 12:28 koswinner wrote:
On August 17 2010 11:20 GagnarTheUnruly wrote:
On August 17 2010 11:07 koswinner wrote:
actually.. value of investment is the easy thing you could get. I believe if we do not consider those top pros who does not play ladders seriously, the rating point could just be an valid proxy for it.

Omitting more than one significant variable and relying on a single factored model will probably cause your R-square to go to some pathetic value with a huge error term. In practice, we throw this model like this to rubish bin directly instead of trying to interpret its pathetic 'preditictive power'. If you are relying on such thing to support your claim, then don't call it 'scientific', because in no way it is. Your point is just not any better than anyone who argue it's imbalanced based on one of the many potentially significant variable.



This kind of stats argument might be best for PM's because it's drifting from the OT, but I want to make it clear that I never claimed (or should have claimed) that my data had predictive power. That's not really sensible with a simple chi-square analysis.

Regarding regressions, excluding important effect variables can cause your r2 to go down, but adding unimportant effect variables artificially inflates r2 and reduces the accuracy of other parameters due to autocorrelations and spurious effects of the excess predictors. Adding predictors will always increase r2 but it's not always a good idea to add predictors. Often it's not the r2 that's important, but the parameter estimates. It's really unimportant because I don't have the data to do a regression and I probably never will.

Are you copying/pasting from some sources? As I cannot see your logic in this statement, which looks abit irrelevant.
What I have been talking is omitting a SIGNIFICANT VARIABLE, which I stressed throughout my arguement, how does it linked to your logic that adding insignificant variable is a bad thing?
Parameter estimates? Oh yea you know about parameter estimates, but don't you even know that the parameter estimates CARRIES ALMOST NO CREDIBILITY if you omitted some important significant variables? For example, variable A has an actual parameter value of -5, variable B has an actual parameter value of +10, variable C has an actual parameter value of +5, then if you run a regression on A alone you may be well gettting some estimate that A has a parameter value of some positive number, depending on the degree of correlation between ABC, which is obviously false and completely bullshit.

R-square is just a side evidence.. not the main one.

I'm looking like an idiot by trying to educate you what is a scientific piece while urs is far from it.. Not wasting more time on it. Enjoy your 'Scientific Study'. :D


I think that you are immature and have no idea what you are talking about.
Gyro
Profile Joined May 2010
Norway36 Posts
August 17 2010 03:56 GMT
#107
The game has been out like a month. And the metagame is constantly changing.
Let new strategies come and go. Let the general skill level of players rise. Let SC2 evolve a bit before looking at statistics. (that sounded cheesy =P)

And btw, the kind of statistics I'd like to see is wins by matchup. Does zerg win more against protoss than terran?
Or game length by matchup: which matchup is statistically the longest/shortest?
Which race has the shortest games on average?

Or a graph with race popularity by number of games played on the ladder, over time.
That really hurt
retro-noob
Profile Joined June 2010
110 Posts
Last Edited: 2010-08-17 03:59:24
August 17 2010 03:57 GMT
#108
OP--

Are you sure that the algorithm for assigning league (and division) placement is blind to race?

Blizzard may want ~1/3 of each division to be comprised of each race and could accomplish this by placing the top x% of each race in each division or by promoting players above a certain fixed threshold, but capping the amount of Terran at ~35% for instance and then creating a new division when additional Terrans join up.

In either of those cases, the question would be begged from the start. Blizzard would have designed for this distribution to occur by rules, not by game balance.

Also, I think it is important to note that race choice is at least partially a function of skill level with a given race OR of perceived balance advantages. Just on that point alone, a good indicator of potential imbalance would be an underrepresented race across the entire population.

It's also worth generalizing from these two points. It is possible that Blizzard has any number of mechanics behind the scene designed to help players maintain a 50% win rate. It may present you with more ZvZs if you win more at ZvZ, for example.

It may give you relatively weak terrans to play against compared to similarly skilled zerg. Because we know with certainty that Blizzard wants all players to have a 50% win ratio and to be well-distributed across leagues (at least until you get to the very top), it is reasonable to assume that other significant factors apart from balance may in part be driving these statistics.

If some of what I've said here is true, we would expect at least some of the following:
*There are fewer Zerg players overall
*Zerg players have poor winrates against Terrans in general
*Zerg players play against Terrans less often
*Zerg players play against Terrans of comparable skill less often

It's the last point that would be most subtle to detect. If Blizzard has a matchmaking system that always lets Zerg play against a Terran opponent who is actually a tick less skilled but represents the match as even, then things would seem statistically balanced all the way up to the very top of the ladder.

That, however, is where it would break down. When you get to the very top as Terran, there would be no Zergs who are a tick more skilled than you to be matched up with. This would mean that a Zerg player could cruise through the ranks and then hit a brick wall when they near the top of diamond.

Consistent with that hypothesis, one would expect the percentage of Zerg at the highest meaningful tier of skill (maybe top 5 diamonds across all divisions? maybe top 50 players on each server? or maybe that's not quite the top meaningful tier, it may need to be top 10 per server, the extreme right on the bell curve) to be underrepresented relative to the ratios found elsewhere on the ladder.

While a much smaller possible sample to examine, I think I've made a good case that it's possible that an imbalance could be cheated against, hidden and swept under the rug through the matchmaking algorithm all the way to the very top where it then could no longer be hidden.

I recommend that someone more statistics-minded than I am look there next for further analysis.
rezoacken
Profile Joined April 2010
Canada2719 Posts
August 17 2010 03:59 GMT
#109
Well, there are too many problems with your study :/ It is a good effort, but your data does not reflect what you are trying to show...

Your data are win% of players per race and per league. But what you really want to show is whether one race beats another, and this is not the same thing, because matches are not made randomly; the matchmaking system picks them.
See it like this: what would happen if Terran were far more powerful than the other two? Every diamond player would be Terran and every game a TvT, which would give you exactly 50% wins for Terran in diamond. Your method would qualify this as balanced.
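
A small sketch of that thought experiment. The game list is made up, and `win_rate_by_race` is a hypothetical helper, not anything from the OP's analysis. Every game contributes one win and one loss, so a league made up of a single race is pinned at 50% no matter how strong that race is.

```python
from collections import Counter

def win_rate_by_race(games):
    """games is a list of (winner_race, loser_race) pairs."""
    wins = Counter()
    played = Counter()
    for winner, loser in games:
        wins[winner] += 1
        played[winner] += 1
        played[loser] += 1
    return {race: wins[race] / played[race] for race in played}

# Hypothetical diamond league where everyone has switched to Terran:
# every game is a TvT, so a Terran wins and a Terran loses each time.
all_terran_diamond = [("T", "T")] * 1000

rates = win_rate_by_race(all_terran_diamond)
print(rates)
```

The per-race win rate inside a league carries no information in this case; the race shares in the league versus the general population are what would give the imbalance away.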

So for a better approach you have to take data about the population in leagues and compare it to the general population (or a part of it, like how many X there are in the top 500 of the top 100 000 players compared to how many X there are in the 100 000).
If you have clear data for, say, the top 5000 and can rank them, you can do non-parametric tests (like Wilcoxon rank tests to see a difference in average rank).
You can also work with data on the number of games a player of race X won against races X, Y, and Z, I think...

But even with more accurate data it's not that easy to treat it as proof either way, for the simple reason that what we want is for the game to be balanced at pro level, not at all levels, and there is too little pro game data to tell anything. On top of that, take into account that this kind of thing assumes the races are played at their best, which is far from the case right now... top players are still discovering different ways to play the game for the average player to follow.
Either we are alone in the Universe or we are not. Both are equally terrifying.
phuzi0n
Profile Joined April 2010
United States308 Posts
August 17 2010 04:01 GMT
#110
On August 17 2010 07:18 StarcraftGuy4U wrote:
None of these stats are worthwhile because the matchmaking system does not assign people like they would in a blind study, instead it is actively adjusting the matches so that every player reaches 50%. The numbers you are pulling are worthless for this reason.

QFT


User was warned for this post
mahnini
Profile Blog Joined October 2005
United States6862 Posts
August 17 2010 04:03 GMT
#111
On August 17 2010 12:02 koswinner wrote:
On August 17 2010 11:18 mahnini wrote:
On August 17 2010 10:35 koswinner wrote:
On August 17 2010 10:13 GagnarTheUnruly wrote:
On August 17 2010 09:26 petered wrote:
Distribution of race amongst leagues is sadly not valid as an indicator of racial balance. It makes the key assumption that players of different skill level are picking the different races at the same distribution.


Graphing race distribution against league level isn't a statistical test and therefore it doesn't make any assumptions. What that graph shows is that roughly equivalent numbers of games are being played by a particular race at each league level. What this suggests is that there is no sorting effect, whereupon a weak race is held back into lower leagues because players that favor that race are having trouble advancing because they are losing games with that race. It is an indirect way of testing that hypothesis. Viewed in the context of the other data, it suggests (but doesn't prove) that AMM is not the only, or even an important, factor in keeping race performance even within leagues.

I totally agree that it would be great to analyze the data using player as an explicit factor, but I don't have access to that data.

This doesn't prove that there aren't matchup imbalances.

Terran could beat Protoss 60% of the time, Protoss could beat Zerg 60% of the time, and Zerg could beat Terran 60% of the time.

At the end of the day each race would have an approximately ~50% win ratio, as supported by your graphs and charts.

However, TvP, PvZ and ZvT would all be imbalanced. The imbalances would just cancel one another out in terms of overall win ratios.


I agree totally. It would be fun to do that but again I lack the data. If someone can get it for me, I'll do that analysis.

This is just bs. You are omitting various factors in your analysis. At lower levels, when players get crushed with a certain race, they tend to change race easily. i.e. a significant variable you have omitted from that diagram is attachness to a certain race, which is obviously positively correlated with skill level. This is just because the amount of 'investment' in a certain race increases with skill level, and the players' utility is usually a function of 'value of investment', which is something like max{Value of investment in T, value of investment in P, value of investment in Z}. With the ratio of (value of investment/time or effort invested) an effective indicator of ratio balance, assuming an representative agent who is trying to maximise his utility. To avoid/minimize this problem you should either gather some reliable information about the parameter of this variable or picking some sample which will exclude this, i.e. pick the 'most attached' bracket, i.e. diamond, or even high end diamond, pro leagues and tournament.
Picking some result and trying to interpret it as solely caused by one factor when obviously there are other factors at work is an indication that either you are very biased, i.e. have a strong incentive to distort the result towards a certain direction, or your level of skill in utilizing 'scientific method' is just horrible.
So, this is not science, just some kid trying to prove his view in the name of science with the help of pseudo/naive/broken scientific method.
Last edit: 2010-08-17 09:38:21


This post is not very constructive. What you're suggesting is an absurdly complex model. And please don't disparage my abilities as a scientist. I'm actually a really good scientist and I have some skill at dealing with difficult data.

I would like to be able to use a regression model to see how race, placement, and matchup affect the performance of individual players, but as I've noted repeatedly I don't have access to that data. In science when you can't get certain data you need to take indirect approaches that often involve making important assumptions. Often, there are ways to test those assumptions either directly or indirectly, but in this case the data set is extremely complicated, particularly due to match placement.

Also, I really need to emphasize that very few assumptions are required to do a chi square test. There are no distributional assumptions to the test. It simply tells us very clearly that within each of the leagues, if a match is picked at random the outcome is totally independent of the one race entering that match. The test doesn't assume that the players are distributed randomly among the races or anything like that. It just tests the hypothesis that states are nonrandomly distributed among the categories being analyzed. The data show that within a league the races have quantifiably different but functionally equal chances of winning randomly selected games. This is a point of fact. There are three non-mutually exclusive possible causes for this that I can think of:

1) the balance is good
2) the matchmaking system is accomodating for poor balance
3) the matchup balance or map balance is poor but it evens out when you ignore the confounding factors

There is no way to test the third cause, so we need to suspend it for now, and refer to better judement that it is probably happening but may not be extremely important. It's certainly a hypothesis that bears testing, however. The second cause can be tested indirectly by graphing race use frequency with league status. Since there appears to be no pattern, it suggests that the second cause is also not important. This leaves the first cause. Given consideration of the possible causes of this pattern, it is a reasonable conclusion that good balance is probably largely responsible. It also means it's ignorant to make statements like 'Terran is unbalanced,' because there is no evidence to support such a statement, and becasue the evidence that does exist suggests the opposite.

This is not to say that high level players like IdrA, who play in a rarefied realm with tight builds and well rehearsed timing, might not sense conditions that give certain races advantages at certain times. Certainly in BW we've witnessed major shifts of the 'metagame' that resulted in periods of dominance for the various races.


1. What I said was actually not a proposition of that model to test the balance issue, that model was just backing my point that that particular variable (attacheness to race) is very likely to be signficant factor in the overly simplified model you were proposing. And I have already enlightened you how to bypass problem like that. i.e. for that particular problem, pick samples within the same group, and I already pointed out that datas for high end diamond is readily available.
If you know how to run a regression then I assume you should know the devastating effect it will be in omitting one significant variable, don't you? Not to mention what you omitted is not only one significant variable.. So basically what I was proposing was just a multi-factor model, which is soooo common in practice, your single factor model is just 'absurdly oversimplified'. With such a skyrocketing error term and a tiny R-square caused by omitting significant variables, as a objective scientist I have no idea how could you claim that your overly simplified single factor model could explain anything at all. So the logic is simple, if that model is way toooo simplified to get the result, don't claim you got the result with some scientific method.

2. 'It also means it's ignorant to make statements like 'Terran is unbalanced,' because there is no evidence to support such a statement, and becasue the evidence that does exist suggests the opposite.'
ROFL, 'EVIDENCE', you call the result of your overwhelmingly over simplified model .... EVIDENCE??
And you are ignoring all other much more reliable indicator, like proportion of Z at top level, or some opionion pool around the world about 'the weakest race' and 'is ZvT imba'.
Nice scientist :D

testing for attachedness to race would bring about even more headache inducing factors such as style of play, mechanical requirements, and depth of understanding. we can go on and on about missing factors but we are able to make certain conclusions with the data we already have i think. anyway like i stated before the proportion of top 100 zerg players matches that of the proportion of zerg players in the general population (if someone wants to check my math and do statistical magic on it, that'd be great).

a lot of what's going on is we have SOME concrete data that weighs in favor of the races being balanced. it's not a 100% thorough scientific study but that doesn't mean you should turn a blind eye towards it. after all, the opposing argument is simply referencing anecdotal evidence of zvt being hard and pointing out that A, B, and C top zerg players say it's imbalanced.


Attachment to a race actually isn't that difficult to test if you do it indirectly. For example, you could test the correlation between the change in each race's share of players and opinion polls about which race is considered the most powerful/imba. Even if SC2 doesn't have a large enough sample yet, we could use similar games such as WC3 and SC1, which could probably serve as a valid proxy. The data was available, but nobody bothered to record it. A simpler indicator would be a poll asking whether players would consider (or are considering) changing race if their race is having problems. Of course, this does not distinguish between levels of attachment among players of different skill. If you want that, just run the same survey for different groups.
Your argument is valid ONLY IF the proportion of players at each level reflects balance, i.e. ONLY IF no other factors affect players' choice and change of race, and assuming each race's population is homogeneous, i.e. players have similar ability and none of them have to struggle harder than players of other races to reach the same standing. Then, with a more detailed breakdown of brackets, such as the top 20 or top 10 and tournament-oriented top pros, it would probably be a valid test. But obviously some of these assumptions are far too strong/unrealistic, as players do switch away from weak races to stronger ones. Just look at WC3. This effect is much less significant among top players, though, who have already invested heavily in their particular race.

So if you see quite a number, or even a significant proportion, of the very top players of one race (the one considered weakest) switching to other races (mainly the one commonly considered imba), while no top players of the other races have switched, is this just an accident? Does it say anything? How about the win rates of these top players with their respective races in major tournaments? All of these, I believe, are much better / more reliable indicators than yours, as they require much simpler/more realistic assumptions and are obvious enough to overwhelm the rest.



i suppose a random poll would work but i would seriously question the reliability of such a poll to determine attachedness. i'm not sure i understand why my argument would only be valid with homogeneous races. my argument is that if you use a controlled group to test all races and use that to gauge attachedness you introduce inherent racial biases, as all races in starcraft are not played the same. viable play styles, necessary mechanics, and the ability of a player to understand the depth of the race will all factor in.

i think we may have different ideas of imbalance. are we talking about imbalance as in an imbalance of wins:losses or as in one race is inherently inferior to another? even now at the height of TvZ imbalance discussion we see proportional amounts of Zerg players in the top 100. even when every post in this forum makes it seem like the matchup is utterly impossible there are STILL proportional amounts of zergs in the top 100. with PvZ generally considered balanced, how come zergs aren't underrepresented?

if at the tip-top level, let's say the top 10, zerg are underrepresented and there is an imbalance in wins and losses, is that enough to claim inherent racial imbalance? it is much easier to say, "look, zergs are proportionally represented and therefore the game is inherently balanced at the moment." than it is to say, "zergs are currently underrepresented and therefore zerg is an inherently inferior race." such an extraordinary claim requires much more evidence don't you think? much more than the opinions of top gamers over a 3 week timespan at least?
the world's a playground. you know that when you're a kid, but somewhere along the way everyone forgets it.
retro-noob
Profile Joined June 2010
110 Posts
August 17 2010 04:12 GMT
#112
On August 17 2010 13:03 mahnini wrote:
if at the tip-top level, let's say the top 10, zerg are underrepresented and there is an imbalance in wins and losses, is that enough to claim inherent racial imbalance? it is much easier to say, "look, zergs are proportionally represented and therefore the game is inherently balanced at the moment." than it is to say, "zergs are currently underrepresented and therefore zerg is an inherently inferior race." such an extraordinary claim requires much more evidence don't you think? much more than the opinions of top gamers over a 3 week timespan at least?


See the post I just made for my thoughts on this.

In short, I'm proposing that a disproportionate representation of Zerg at the highest possible level (open for debate what that means) may well be the only reliable indicator we'd have of an imbalance given the known sophistication of Blizzard's matchmaking system as well as the potential unknowns in that system.
retro-noob
Profile Joined June 2010
110 Posts
August 17 2010 04:19 GMT
#113
...And I'll add on that note, that the current global top 20 according to sc2ranks.com shows:

10 Terrans
7 Protoss
2 Zerg
1 Random

So that's 50% Terran, 35% Protoss, 10% Zerg, and 5% random.
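
A rough way to gauge how unusual 2 Zerg in that list would be. This is only a sketch: the one-third Zerg share among non-Random players is an assumption (the real ladder share isn't given here), and treating the top 20 as independent draws is a simplification.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

# Counts quoted above from sc2ranks.com.
top20 = {"Terran": 10, "Protoss": 7, "Zerg": 2, "Random": 1}

# Drop Random so the comparison is between the three races only.
non_random = sum(count for race, count in top20.items() if race != "Random")

# Assumption: Zerg are one third of non-Random players overall.
assumed_zerg_share = 1 / 3

p_two_or_fewer = binom_cdf(top20["Zerg"], non_random, assumed_zerg_share)
print("P(2 or fewer Zerg among", non_random, "players):", round(p_two_or_fewer, 4))
```

With only 19 players and a cutoff chosen after looking at the list, a small tail probability here is suggestive at best, and it moves a lot if the assumed Zerg share is changed.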
mahnini
Profile Blog Joined October 2005
United States6862 Posts
August 17 2010 04:22 GMT
#114
On August 17 2010 13:12 retro-noob wrote:
On August 17 2010 13:03 mahnini wrote:
if at the tip-top level, let's say the top 10, zerg are underrepresented and there is an imbalance in wins and losses, is that enough to claim inherent racial imbalance? it is much easier to say, "look, zergs are proportionally represented and therefore the game is inherently balanced at the moment." than it is to say, "zergs are currently underrepresented and therefore zerg is an inherently inferior race." such an extraordinary claim requires much more evidence don't you think? much more than the opinions of top gamers over a 3 week timespan at least?


See the post I just made for my thoughts on this.

In short, I'm proposing that a disproportionate representation of Zerg at the highest possible level (open for debate what that means) may well be the only reliable indicator we'd have of an imbalance given the known sophistication of Blizzard's matchmaking system as well as the potential unknowns in that system.

it may eventually be, but is it even close to being reliable right now and is that alone sufficient to claim inherent racial imbalance? iloveoov once went on an 80% or so tear through the highest leagues in the world, did that mean the other races were unequipped to deal with terran? if we want to use only the best players in the world as data points wouldn't it require MUCH more time to eliminate many factors that would increase variance in results?
the world's a playground. you know that when you're a kid, but somewhere along the way everyone forgets it.
Demarini
Profile Joined May 2010
United States151 Posts
August 17 2010 04:24 GMT
#115
we already know it's imba, why all the topics about it?

User was warned for this post
retro-noob
Profile Joined June 2010
110 Posts
Last Edited: 2010-08-17 04:31:31
August 17 2010 04:28 GMT
#116
On August 17 2010 13:22 mahnini wrote:
On August 17 2010 13:12 retro-noob wrote:
On August 17 2010 13:03 mahnini wrote:
if at the tip-top level, let's say the top 10, zerg are underrepresented and there is an imbalance in wins and losses, is that enough to claim inherent racial imbalance? it is much easier to say, "look, zergs are proportionally represented and therefore the game is inherently balanced at the moment." than it is to say, "zergs are currently underrepresented and therefore zerg is an inherently inferior race." such an extraordinary claim requires much more evidence don't you think? much more than the opinions of top gamers over a 3 week timespan at least?


See the post I just made for my thoughts on this.

In short, I'm proposing that a disproportionate representation of Zerg at the highest possible level (open for debate what that means) may well be the only reliable indicator we'd have of an imbalance given the known sophistication of Blizzard's matchmaking system as well as the potential unknowns in that system.

it may eventually be, but is it even close to being reliable right now and is that alone sufficient to claim inherent racial imbalance? iloveoov once went on an 80% or so tear through the highest leagues in the world, did that mean the other races were unequipped to deal with terran? if we want to use only the best players in the world as data points wouldn't it require MUCH more time to eliminate many factors that would increase variance in results?


No one is claiming racial imbalance merely because of the stats at the top of the leagues. People are claiming the imbalance because of the nature of the games being played at the top of those leagues as well. While this thread explores a statistical approach, others have taken an analytical approach to the mechanics of the game itself, and more than a few have been cogent.

Regardless, the scenario I propose, if in fact that is a desired or undesired consequence of Blizzard's matchmaking system, would render statistical analysis of the distribution in leagues entirely moot. The same would go for analyzing the TvZ win rate across the entire game: it would probably look a lot like 50/50 with a teeny rounding error known as the top of the ladder.

In effect, Terrans on the entire ladder would be playing their TvZs with a slight handicap in how the matchup was decided, except at the very, very top, where we'd expect to see lopsided racial representation just like what we actually do see. Same for tournaments.
koswinner
Profile Joined April 2010
United Kingdom27 Posts
Last Edited: 2010-08-17 04:43:08
August 17 2010 04:32 GMT
#117
On August 17 2010 13:03 mahnini wrote:
On August 17 2010 12:02 koswinner wrote:
On August 17 2010 11:18 mahnini wrote:
On August 17 2010 10:35 koswinner wrote:
On August 17 2010 10:13 GagnarTheUnruly wrote:
On August 17 2010 09:26 petered wrote:
Distribution of race amongst leagues is sadly not valid as an indicator of racial balance. It makes the key assumption that players of different skill level are picking the different races at the same distribution.


Graphing race distribution against league level isn't a statistical test and therefore it doesn't make any assumptions. What that graph shows is that roughly equivalent numbers of games are being played by a particular race at each league level. What this suggests is that there is no sorting effect, whereupon a weak race is held back into lower leagues because players that favor that race are having trouble advancing because they are losing games with that race. It is an indirect way of testing that hypothesis. Viewed in the context of the other data, it suggests (but doesn't prove) that AMM is not the only, or even an important, factor in keeping race performance even within leagues.

I totally agree that it would be great to analyze the data using player as an explicit factor, but I don't have access to that data.

This doesn't prove that there aren't matchup imbalances.

Terran could beat Protoss 60% of the time, Protoss could beat Zerg 60% of the time, and Zerg could beat Terran 60% of the time.

At the end of the day each race would have an approximately 50% win ratio, as supported by your graphs and charts.

However, TvP, PvZ and ZvT would all be imbalanced. The imbalances would just cancel one another out in terms of overall win ratios.
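The cancellation described above can be checked with a few lines of arithmetic. This is only a sketch: the 60% figures are the hypothetical ones from the post, and it assumes each race meets the other two equally often and ignores mirror matchups.

```python
# Hypothetical matchup win rates from the post: first race beats second with this probability.
win = {
    ("T", "P"): 0.60,
    ("P", "Z"): 0.60,
    ("Z", "T"): 0.60,
}

def matchup_win_rate(a, b):
    """Probability that race a beats race b in a non-mirror matchup."""
    if (a, b) in win:
        return win[(a, b)]
    return 1.0 - win[(b, a)]

def overall_win_rate(race, races=("T", "P", "Z")):
    """Average non-mirror win rate, assuming each opposing race is met equally often."""
    opponents = [r for r in races if r != race]
    return sum(matchup_win_rate(race, o) for o in opponents) / len(opponents)

overall = {r: overall_win_rate(r) for r in ("T", "P", "Z")}
for race, rate in overall.items():
    print(race, round(rate, 2))
```

Every race comes out at 50% overall even though no single matchup is 50/50, which is why per-race win rates alone cannot rule out matchup imbalance.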


I agree totally. It would be fun to do that but again I lack the data. If someone can get it for me, I'll do that analysis.

This is just bs. You are omitting various factors in your analysis. At lower levels, when players get crushed with a certain race, they tend to change race easily, i.e. a significant variable you have omitted from that diagram is attachedness to a certain race, which is obviously positively correlated with skill level. This is just because the amount of 'investment' in a certain race increases with skill level, and the player's utility is usually a function of 'value of investment', which is something like max{value of investment in T, value of investment in P, value of investment in Z}, with the ratio (value of investment / time or effort invested) being an effective indicator of racial balance, assuming a representative agent who is trying to maximise his utility. To avoid or minimize this problem you should either gather some reliable information about the parameter of this variable or pick a sample which excludes it, i.e. pick the 'most attached' bracket: diamond, or even high-end diamond, pro leagues and tournaments.
Picking some result and trying to interpret it as solely caused by one factor when obviously there are other factors at work is an indication that either you are very biased, i.e. have a strong incentive to distort the result in a certain direction, or your level of skill in utilizing the 'scientific method' is just horrible.
So, this is not science, just some kid trying to prove his view in the name of science with the help of a pseudo/naive/broken scientific method.
Last edit: 2010-08-17 09:38:21


This post is not very constructive. What you're suggesting is an absurdly complex model. And please don't disparage my abilities as a scientist. I'm actually a really good scientist and I have some skill at dealing with difficult data.

I would like to be able to use a regression model to see how race, placement, and matchup affect the performance of individual players, but as I've noted repeatedly I don't have access to that data. In science when you can't get certain data you need to take indirect approaches that often involve making important assumptions. Often, there are ways to test those assumptions either directly or indirectly, but in this case the data set is extremely complicated, particularly due to match placement.

Also, I really need to emphasize that very few assumptions are required to do a chi square test. There are no distributional assumptions to the test. It simply tells us very clearly that within each of the leagues, if a match is picked at random the outcome is totally independent of which race is entering that match. The test doesn't assume that the players are distributed randomly among the races or anything like that. It just tests the hypothesis that states are nonrandomly distributed among the categories being analyzed. The data show that within a league the races have quantifiably different but functionally equal chances of winning randomly selected games. This is a point of fact. There are three non-mutually exclusive possible causes for this that I can think of:

1) the balance is good
2) the matchmaking system is accommodating for poor balance
3) the matchup balance or map balance is poor but it evens out when you ignore the confounding factors

There is no way to test the third cause, so we need to set it aside for now and fall back on the judgment that it is probably happening but may not be extremely important. It's certainly a hypothesis that bears testing, however. The second cause can be tested indirectly by graphing race use frequency against league status. Since there appears to be no pattern, it suggests that the second cause is also not important. This leaves the first cause. Given consideration of the possible causes of this pattern, it is a reasonable conclusion that good balance is probably largely responsible. It also means it's ignorant to make statements like 'Terran is unbalanced,' because there is no evidence to support such a statement, and because the evidence that does exist suggests the opposite.
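For readers who want to see what the chi square test of independence mentioned above actually computes, here is a minimal sketch. The win/loss counts are made up for illustration and are not the thread's data, and the test treats every game as an independent observation, which is itself an assumption.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed_count in enumerate(row):
            # Expected count if race and outcome were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed_count - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

# Made-up counts for one league: rows are T, P, Z; columns are wins, losses.
observed = [
    [510, 490],
    [500, 500],
    [490, 510],
]
stat, dof = chi_square_independence(observed)

# 5.991 is the standard 5% critical value for 2 degrees of freedom.
CRITICAL_5PCT_DOF2 = 5.991
print(round(stat, 3), dof, stat > CRITICAL_5PCT_DOF2)
```

With these invented counts the statistic is far below the critical value, so the test would not reject independence of race and outcome. That only says the data are compatible with balance within the league; it does not distinguish between the three causes listed above.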

This is not to say that high level players like IdrA, who play in a rarefied realm with tight builds and well rehearsed timing, might not sense conditions that give certain races advantages at certain times. Certainly in BW we've witnessed major shifts of the 'metagame' that resulted in periods of dominance for the various races.


1. What I said was not actually a proposal to use that model to test the balance issue; the model was just backing my point that that particular variable (attachedness to race) is very likely to be a significant factor in the overly simplified model you were proposing. And I have already enlightened you on how to bypass a problem like that, i.e. for that particular problem, pick samples within the same group, and I already pointed out that data for high-end diamond is readily available.
If you know how to run a regression then I assume you know the devastating effect of omitting one significant variable, don't you? Not to mention that what you omitted is not only one significant variable.. So basically what I was proposing was just a multi-factor model, which is soooo common in practice; your single-factor model is just 'absurdly oversimplified'. With such a skyrocketing error term and a tiny R-square caused by omitting significant variables, I have no idea how, as an objective scientist, you could claim that your overly simplified single-factor model could explain anything at all. So the logic is simple: if that model is way toooo simplified to get the result, don't claim you got the result with some scientific method.

2. 'It also means it's ignorant to make statements like 'Terran is unbalanced,' because there is no evidence to support such a statement, and becasue the evidence that does exist suggests the opposite.'
ROFL, 'EVIDENCE', you call the result of your overwhelmingly over simplified model .... EVIDENCE??
And you are ignoring all the other, much more reliable indicators, like the proportion of Z at top level, or opinion polls around the world about 'the weakest race' and 'is ZvT imba'.
Nice scientist :D

testing for attachedness to race would bring about even more headache inducing factors such as style of play, mechanical requirements, and depth of understanding. we can go on and on about missing factors but we are able to make certain conclusions with the data we already have i think. anyway like i stated before the proportion of top 100 zerg players matches that of the proportion of zerg players in the general population (if someone wants to check my math and do statistical magic on it, that'd be great).

a lot of what's going on is we have SOME concrete data that weighs in favor of the races being balanced. it's not a 100% thorough scientific study but that doesn't mean you should turn a blind eye towards it. after all, the opposing argument is simply referencing anecdotal evidence of zvt being hard and pointing out that A, B, and C top zerg players say it's imbalanced.


Attachedness actually isn't that difficult if you use some indirect way of testing it. For example you could just test the correlation between the change in proportion of each race and the opinion polls about which race is considered the most powerful/imba. Even if there isn't enough of a sample for SC2 yet, we could certainly use other similar games such as WC3 and SC1, which could probably be a valid proxy. The data was available but nobody really bothered to record it. A simpler indicator could be a poll asking whether players would consider, or are considering, changing race if their race is having problems. Of course this does not distinguish between the attachedness of players of different skill. If you want, just do the same survey for different groups.
Your argument is valid ONLY IF the proportion of players at each level represents the balance, i.e. ONLY IF other factors are not affecting players' pick of race and change of race, and assuming each race's population has homogeneous characteristics, i.e. they have similar ability and some of them do not struggle harder than players of other races to reach their status. Then with a more detailed breakdown of brackets, such as top 20 or 10 and tournament-oriented top pros, it will probably be a valid test. But obviously some of the assumptions are just tooo strong/unrealistic, as players do change away from weak races to stronger ones.. Just look at WC3. Though this effect is much less significant among top players, who have already invested significantly in their particular race.

So if you see quite a number of, or even a significant proportion of, the very top players of one race (the one considered weakest) changing to other races (mainly the one commonly considered imba), while no top players from other races change their race, is this just an accident? Does it say anything? How about the win rates of these top players with their respective races in major tournaments? All of these, I believe, are much better / more reliable indicators than yours, as they require much simpler/more realistic assumptions and are obvious enough to overwhelm the rest.



i suppose a random poll would work but i would seriously question the reliability of such a poll to determine attachedness. i'm not sure i understand why my argument would only be valid with homogeneous races. my argument is that if you use a controlled group to test all races and use that to gauge attachedness you introduce inherent racial biases as all races in starcraft are not played the same. viable play styles, necessary mechanics, and the ability of a player to understand the depth of the race will all factor in.

i think we may have different ideas of imbalance. are we talking about imbalance as in an imbalance of wins:losses or as in one race is inherently inferior to another? even now at the height of TvZ imbalance discussion we see proportional amounts of Zerg players in the top 100. even when every post in this forum makes it seem like the matchup is utterly impossible there are STILL proportional amounts of zergs in the top 100. with PvZ generally considered balanced, how come zergs aren't underrepresented?

if at the tip-top level, let's say the top 10, zerg are underrepresented and there is an imbalance in wins and losses, is that enough to claim inherent racial imbalance? it is much easier to say, "look, zergs are proportionally represented and therefore the game is inherently balanced at the moment," than it is to say, "zergs are currently underrepresented and therefore zerg is an inherently inferior race." such an extraordinary claim requires much more evidence, don't you think? much more than the opinions of top gamers over a 3 week timespan at least?

The problem with your argument about those factors is that, if you are familiar with statistical models, you know that it is only reasonable to add more variables if you know that the variable is statistically significant.. adding more insignificant variables will have a detrimental effect on your estimation. So the question is not whether the factors you added have an impact, but whether they are significant. Very likely not.

As I already pointed out long ago, if other factors (like change of race) contribute significantly to your proposed result, proportional representation, then it just doesn't carry any credibility. Maybe you didn't get it in statistical terminology, so I'll give you an easy-to-understand example: it could very well be that if zerg players never changed race, zerg would be significantly under-represented in the top 100, but with a significant number of zerg players (especially lower level) quitting their race, zerg looks proportionally represented in the top 100, or even over-represented. So this argument is just not valid. Due to the current mm system, the validity of racial representation increases exponentially the higher the level is. The problem is that with the top 10 or so, there is not enough of a sample. But consider it together with the fact that in every region's ladder, whether top 10 or top 20, Z is very under-represented and T is over-represented.
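To put numbers on the easy-to-understand example above, here is a small sketch. Every figure in it is invented for illustration (ladder size, Zerg counts, number of switchers); none of it comes from real ladder data.

```python
def share(count, total):
    """Fraction of a group made up by `count` players."""
    return count / total

# Hypothetical ladder: 1000 players, 300 of them Zerg, 20 Zerg in the top 100.
total_players = 1000
zerg_players = 300
zerg_in_top100 = 20

# (population share, top-100 share) before anyone switches race.
before = (share(zerg_players, total_players), share(zerg_in_top100, 100))

# Assume 100 lower-level Zerg players switch race; the top 100 is unchanged.
switchers = 100
zerg_after = zerg_players - switchers
after = (share(zerg_after, total_players), share(zerg_in_top100, 100))

print("before switching:", before)
print("after switching: ", after)
```

Before the switch Zerg is 30% of the ladder but 20% of the top 100; afterwards it is 20% of both, so the top 100 looks proportional even though nothing changed at the top.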

You are just comparing two very naive arguments in your last paragraph..
I have listed a lot of much better and more valid arguments/indicators, why don't you compare "look, zergs are proportionally represented and therefore the game is inherently balanced at the moment." to them?

BTW: win% between the best players of each race pretty much represents the inherent imbalance; they are almost the same thing with a large sample.

PS: your previous argument is only valid if the players of each race are homogeneous, rather than the races themselves. Because it could be that players of certain races are more hardworking/smarter than others etc., though unlikely.

Well, I'm done with this topic, spent too much time on this..

mahnini
Profile Blog Joined October 2005
United States6862 Posts
August 17 2010 05:00 GMT
#118
On August 17 2010 13:32 koswinner wrote:

The problem with your argument about those factors is that, if you are familiar with statistical models, you know that it is only reasonable to add more variables if you know that the variable is statistically significant.. adding more insignificant variables will have a detrimental effect on your estimation. So the question is not whether the factors you added have an impact, but whether they are significant. Very likely not.

As I already pointed out long ago, if other factors (like change of race) contribute significantly to your proposed result, proportional representation, then it just doesn't carry any credibility. Maybe you didn't get it in statistical terminology, so I'll give you an easy-to-understand example: it could very well be that if zerg players never changed race, zerg would be significantly under-represented in the top 100, but with a significant number of zerg players (especially lower level) quitting their race, zerg looks proportionally represented in the top 100, or even over-represented. So this argument is just not valid. Due to the current mm system, the validity of racial representation increases exponentially the higher the level is. The problem is that with the top 10 or so, there is not enough of a sample. But consider it together with the fact that in every region's ladder, whether top 10 or top 20, Z is very under-represented and T is over-represented.

You are just comparing two very naive arguments in your last paragraph..
I have listed a lot of much better and more valid arguments/indicators, why don't you compare "look, zergs are proportionally represented and therefore the game is inherently balanced at the moment." to them?

BTW: win% between the best players of each race pretty much represents the inherent imbalance; they are almost the same thing with a large sample.

PS: your previous argument is only valid if the players of each race are homogeneous, rather than the races themselves. Because it could be that players of certain races are more hardworking/smarter than others etc., though unlikely.



ok that makes sense to me but isn't that in itself very simplified? as far as i know battlenet keeps your race singular until you played more games with another race but your stats remain the same so i see what you are saying about lower levels switching races. you're right, there's definitely a possibility there to skew results.

what i was trying to say with the naive arguments is that at the moment proving anything statistically is a one way street in favor of balance, that is you can only prove balance and not imbalance (inherent racial imbalance) because not enough time has passed to do so, at least directly.

though, i'm still not sure what you mean by attachedness to a race. if you mean simply the willingness or unwillingness to switch race in the face of adversity or perceived imbalance, that seems like a very complex thing to measure, as there are many factors that would affect the outcome of that value even though they would not affect the model you are applying attachedness to. does that make sense?
the world's a playground. you know that when you're a kid, but somewhere along the way everyone forgets it.
c.Deadly
Profile Joined March 2010
United States545 Posts
August 17 2010 05:03 GMT
#119
On August 17 2010 12:10 Milkis wrote:
taking differences of percents is not a scientific study, nor is it statistically sound.

please don't just look at percents and try to frame it as statistics. this is far from how you would be approaching it. you don't even have a theory of how this actually works out nor do you consider an actual model, you just subtract differences and hope it works out

holy crap what people try to pass as statistics on the internet is absolutely appalling


This is the truth - There's no measure of certainty or variance in these statistics, and you'd have to assume the game is actually balanced to set it all to a bell curve.

What about considerations of unknown variables? What if players new to SC (and RTS games) are more attracted to Terran because of familiarity through the campaign, leading to Terrans having a much lower win% in Bronze league?
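As a rough illustration of what a measure of certainty could look like here, the sketch below puts an approximate 95% interval around a win rate and computes how often a small top bracket would look short on Zerg by chance alone. All numbers are hypothetical, the interval uses the normal approximation, and the top-10 calculation assumes each slot is filled independently, which a real ladder will not satisfy exactly.

```python
from math import comb, sqrt

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def win_rate_interval(wins, games, z=1.96):
    """Approximate 95% interval for a win rate (normal approximation)."""
    rate = wins / games
    half_width = z * sqrt(rate * (1 - rate) / games)
    return rate - half_width, rate + half_width

# Hypothetical: the same 55% win rate observed over 20 games and over 10000 games.
small = win_rate_interval(11, 20)
large = win_rate_interval(5500, 10000)

# Hypothetical: Zerg is 30% of players. Chance of 1 or fewer Zerg in a top 10
# if every slot were filled independently at that rate.
p_few_zerg = binom_cdf(1, 10, 0.30)

print("20 games:   ", tuple(round(x, 3) for x in small))
print("10000 games:", tuple(round(x, 3) for x in large))
print("P(<=1 Zerg in top 10):", round(p_few_zerg, 3))
```

The 20-game interval comfortably includes 50% while the 10000-game one does not, and a top 10 with at most one Zerg would turn up by luck about 15% of the time under these assumptions.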
Milkis
Profile Blog Joined January 2010
5003 Posts
Last Edited: 2010-08-17 05:18:15
August 17 2010 05:16 GMT
#120
The problem with your argument about those factors is that, if you are familiar with statistical models, you know that it is only reasonable to add more variables if you know that the variable is statistically significant. Adding more insignificant variables will have a detrimental effect on your estimation. So the problem is not whether the variables you added have an impact, but whether they are significant. Very likely not.


It's only reasonable to add more variables if you want to see if the variable has an effect. Statistics in the end is based on theory -- you shouldn't concentrate on the actual model itself but on the theory first. After you figure out what you think may affect the variables, you make a model with the variables.

Secondly, no. If a variable is insignificant, then all it does at best is increase the variance a bit (meaning you'll need more samples). If the sample is large enough then this isn't even a problem. You usually figure out a variable is insignificant after fitting it to a statistical model. You don't know if it's significant or not until you have done so. Missing a significant variable is a much bigger problem than adding in an insignificant one, and you can easily filter out insignificant variables. But even then, that's never for certain.

I can't even comprehend the rest of your post.

This is the truth - There's no measure of certainty or variance in these statistics, and you'd have to assume the game is actually balanced to set it all to a bell curve.


It doesn't have to do with certainty or variance, but the model has no statistical basis at all. You have to start with how you think the data is distributed, and in order to do that you need better theory.

Also being balanced has nothing to do with it being normal ("set it all to a bell curve"). Balance is a pesky, if not impossible, thing to measure and it only paints a relative picture given the strategies, maps, and the players' innate skillsets at hand. Arguing balance based on statistics will be a lot more complicated than these simplistic approaches.



GagnarTheUnruly
Profile Joined July 2010
United States655 Posts
August 17 2010 05:21 GMT
#121
On August 17 2010 14:03 c.Deadly wrote:
On August 17 2010 12:10 Milkis wrote:
taking differences of percents is not a scientific study, nor is it statistically sound.

please don't just look at percents and try to frame it as statistics. this is far from how you would be approaching it. you don't even have a theory of how this actually works out nor do you consider an actual model, you just subtract differences and hope it works out

holy crap what people try to pass as statistics on the internet is absolutely appalling


This is the truth - There's no measure of certainty or variance in these statistics, and you'd have to assume the game is actually balanced to set it all to a bell curve.

What about considerations of unknown variables? What if players new to SC (and RTS games) are more attracted to Terran because of familiarity through the campaign, leading to Terrans having a much lower win% in Bronze league?


I was going to leave this alone, but the post you're quoting definitely isn't true. A Chi-square analysis is a foundational test in statistics. It has many uses for testing distributional assumptions and it's presented here in its simplest form. It's absolutely the most appropriate test to use given the data that were available. The reason there's no variance in the statistics I used is because the Chi-square test doesn't use variance to generate its test statistic. As far as certainty goes, I have it, I just didn't report it because I thought the editorial standards of a video game website would permit me to publish a data analysis without reporting p-values, test statistics, and sample sizes. If you're curious, the Chi-squared statistics (with p-values) using the full 48.6 million game sample sizes are 58.35 for diamond (p < 0.001), 76.44 for plat (p < 0.001), 40.79 for gold (p < 0.001), 5.15 for silver (p = 0.161), and 6582.10 for bronze (p < 0.001).

I would love to account for variables like the ones you mentioned, but I can't measure them. You can't include variables in a statistical analysis if you can't measure them. Instead, the only hypothesis that I could test was one about the distribution of win probability among the races. I demonstrated that the win probability is essentially evenly distributed. For reasons discussed earlier, this suggests that the balance state is good for a vast majority of players, although further investigation would be warranted, were this actually a real study.
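
For readers who want to see the mechanics, here is a minimal sketch of a chi-square goodness-of-fit test of the kind described above. It assumes four categories (Terran, Protoss, Zerg, Random), which gives 3 degrees of freedom and is consistent with the silver figure quoted above (5.15, p = 0.161). The game and win counts are hypothetical, not the real data set, and the function names are made up for illustration.

```python
import math


def chi_square_stat(observed, expected):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_square_sf_df3(x):
    """Upper-tail p-value for a chi-square variable with exactly 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


# HYPOTHETICAL counts for a single league (assumption, not the thread's data).
games = {"Terran": 35000, "Protoss": 36000, "Zerg": 20000, "Random": 9000}
wins = {"Terran": 17400, "Protoss": 18150, "Zerg": 10050, "Random": 4400}

total_games = sum(games.values())
total_wins = sum(wins.values())
races = list(games)

# Expected wins are weighted by how many games each race played,
# i.e. the null hypothesis is "every race wins at the same rate".
observed = [wins[r] for r in races]
expected = [total_wins * games[r] / total_games for r in races]

stat = chi_square_stat(observed, expected)
p_value = chi_square_sf_df3(stat)
print(f"chi-square = {stat:.2f}, p = {p_value:.3f}")

# Sanity check against the silver figure quoted in the post.
print(round(chi_square_sf_df3(5.15), 3))  # 0.161
```

With tens of millions of games even tiny deviations produce a small p-value, so the size of the statistic says little about how large the difference in win rates actually is.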
guitarizt
Profile Blog Joined March 2009
United States1492 Posts
August 17 2010 05:57 GMT
#122
On August 17 2010 07:23 Mindcrime wrote:
Scientific proof that the matchmaking system is working

That is all that this is. No conclusions about balance can be drawn from looking at win%, on ladder, when the matchmaking system is specifically designed so that you win about 50% of your games. :|


Exactly. When you deal with every day stuff a percent here or there doesn't look that big but when you're dealing with millions of games with a system that is supposed to be evenly matching everyone up I think op's results show a lot. Still I'm not all about the numbers even though I know blizzard is. I put more weight into how people are doing in the tournaments since you have to adjust to your opponent and such. I almost view it as a shortcut to what the ladder will or should do in the future although I'm not sure if that's correct or not.
“There is nothing noble in being superior to your fellow man; true nobility is being superior to your former self.” - Hemingway
heishe
Profile Blog Joined June 2009
Germany2284 Posts
Last Edited: 2010-08-17 06:09:37
August 17 2010 06:07 GMT
#123
Lol OP, you didn't really spend all that time analyzing an AMM system? You perfectly proved that the matchmaking is working, nothing more and nothing less. Is that concept really that hard to grasp, even for math nuts?

Wait a couple more weeks and then gather data from static competition like tournaments, non-AMM leagues etc. That will be data you can use.

On August 17 2010 14:57 guitarizt wrote:
On August 17 2010 07:23 Mindcrime wrote:
Scientific proof that the matchmaking system is working

That is all that this is. No conclusions about balance can be drawn from looking at win%, on ladder, when the matchmaking system is specifically designed so that you win about 50% of your games. :|


Exactly. When you deal with every day stuff a percent here or there doesn't look that big but when you're dealing with millions of games with a system that is supposed to be evenly matching everyone up I think op's results show a lot. Still I'm not all about the numbers even though I know blizzard is. I put more weight into how people are doing in the tournaments since you have to adjust to your opponent and such. I almost view it as a shortcut to what the ladder will or should do in the future although I'm not sure if that's correct or not.


No, that is variance in the matchmaking, because it is most likely not perfect until every player has 1000+ games (it is a guessing algorithm most likely).
If you value your soul, never look into the eye of a horse. Your soul will forever be lost in the void of the horse.
Milkis
Profile Blog Joined January 2010
5003 Posts
Last Edited: 2010-08-17 21:12:26
August 17 2010 06:08 GMT
#124
Edit: and stop assuming things
Lighioana
Profile Joined March 2010
Norway466 Posts
Last Edited: 2010-08-17 06:15:30
August 17 2010 06:14 GMT
#125
Taking leagues lower than diamond into consideration when talking about balance is bad. Why is that? Because when we discuss balance we want to talk about what's possible in the game, not about what inexperienced players are doing. Even in Diamond, the players that are not within the top 25 of their league should not really be considered.
And forgive me nothing for I truly meant it all
mikado
Profile Joined April 2010
Australia407 Posts
August 17 2010 06:23 GMT
#126
Well this was a waste of time.

User was warned for this post
perditissimus
SpiciestZerg
Profile Joined August 2010
United States154 Posts
Last Edited: 2010-08-17 07:24:34
August 17 2010 07:22 GMT
#127
I'm not gunna bitch about how statistics tests i dont completely understand work, so correct me if I'm wrong.

But one criticism in your methods. If hypothetically:
Zerg won 75% against Protoss
Protoss won 75% against Terran
Terran won 75% against Zerg

wouldn't the way you analyzed it determine each race is perfectly balanced?
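
The hypothetical above can be checked with a few lines of arithmetic. This is a small sketch, assuming each race meets every race (including itself) equally often and that mirror matches are 50/50; the 75% figures are the made-up ones from this post, and the helper names are only for illustration.

```python
# P(first race beats second race): the hypothetical cyclic numbers from the post.
win_prob = {
    ("Zerg", "Protoss"): 0.75,
    ("Protoss", "Terran"): 0.75,
    ("Terran", "Zerg"): 0.75,
}
races = ["Terran", "Protoss", "Zerg"]


def p_win(a, b):
    """Win probability of race a against race b."""
    if a == b:
        return 0.5  # assumption: mirror matches are even
    if (a, b) in win_prob:
        return win_prob[(a, b)]
    return 1.0 - win_prob[(b, a)]


def overall_win_rate(race, opponent_share):
    """Aggregate win rate of a race, given how often it meets each opponent."""
    return sum(opponent_share[opp] * p_win(race, opp) for opp in races)


# Assumption: all three opponents are equally likely.
share = {r: 1.0 / 3.0 for r in races}
overall = {r: overall_win_rate(r, share) for r in races}

for race, rate in overall.items():
    print(race, round(rate, 3))
```

Every race comes out at 50% overall even though no non-mirror matchup is even, so an analysis of overall win rates alone cannot tell this situation apart from true balance.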


edit: wtf are people criticizing you for saying its imba? I'm pretty sure you showed for all practical purposes it is balanced. (assuming your methods were right)
The answer to all life's questions is more zerglings.
Certa
Profile Joined July 2010
30 Posts
Last Edited: 2010-08-17 07:43:12
August 17 2010 07:40 GMT
#128
I really don't think these stats are enough to determine imbalance, and I don't think stats that can determine imbalance will be available for at least another few years.

With that aside, I think the win/loss ratios for every race will smooth themselves out over time. Stop worrying. =)
GagnarTheUnruly
Profile Joined July 2010
United States655 Posts
August 17 2010 07:58 GMT
#129
On August 17 2010 15:08 Milkis wrote:


I will admit that I missed the part about you using chi square test to compare distributions (specifically because you didn't report it and your entire analysis was based off your ridiculous chart).


Not sure why. I say (specifically report?) it in line 4 of the methods.

Run the Chi Square again, and rather than not weighting the "random" distribution you're testing against, weight that according to the distribution of players. The site you have listed as your method has that available. This is because the base player distribution is not random. I'm guessing what you did was assume they should all get the same number of wins given the number of games played at that level.


No need to put "random" in quotes, unless other "words" need to be in quotes as well. And of course I did the analysis in the way you described, as would any competent person doing a chi squared test (which I describe elsewhere in this thread). Your guess was wrong. Why would you assume I did it wrong? Do I really need to lay it all out in a post in a Starcraft forum? If you were wondering couldn't you have asked nicely?

If you used percentages for your chi squared then there really isn't much I can say since there's too many issues arising with that because that's probably not even normalized properly -____-


Then don't say much, because I didn't use percentages. I used the raw data for 58 million games. I also describe simulating the data for fewer games elsewhere in the post.

Also never do a power test after you run a statistical test again. Because that is absolutely and utterly a meaningless number. You run Power tests only when you're designing the test, not after you run the tests. You seem to have used power to see how many games you need to run to detect imbalances... well... what kind of imbalances? 0.1% differences? Or the differences you ran and found on the chi squared? If it's the latter it's an utterly worthless figure. The former? Then you need to be arguing about what causes that instead of just pointing at some numbers and saying "oh look it's imbalanced" which is what you did and what caused most of the anger in my post. Just posting numbers and saying "look at what the numbers say" doesn't mean jack if you don't have a theory you're actually testing.


Don't give me a lecture on how to run statistical analyses. Almost everything you've said has either been patently false or based on false assumptions of my intelligence. The only reason I did the power test was because I thought it would be fun to see how many games you would have to sample before you could tell a difference in the imbalance. I thought it was neat that it took about a million games before the sample size was large enough to detect differences. I guess I should've shot myself instead, for being stupid enough to do a post hoc power test. Let me remind you that you know NOTHING about how much statistical knowledge I have and that it may not be safe to assume that you know more than I do.
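
The "how many games would you need" question can also be sketched as an up-front sample size calculation, where the effect size is chosen before looking at any data. The sketch below uses a two-sided one-sample z-test on a single win rate, which is a simplification and not the chi-square power analysis discussed in this thread; the alpha and power values and the function name are assumptions for illustration.

```python
import math
from statistics import NormalDist


def games_needed(p_alt, p_null=0.5, alpha=0.05, power=0.8):
    """Approximate number of games for a two-sided one-sample z-test
    to detect a true win rate p_alt against the null rate p_null."""
    if p_alt == p_null:
        raise ValueError("p_alt must differ from p_null")
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_power = NormalDist().inv_cdf(power)
    numerator = (z_alpha * math.sqrt(p_null * (1.0 - p_null))
                 + z_power * math.sqrt(p_alt * (1.0 - p_alt)))
    return math.ceil((numerator / abs(p_alt - p_null)) ** 2)


# Chosen effect sizes: a race winning 55%, 51%, or 50.1% instead of 50%.
for p_alt in (0.55, 0.51, 0.501):
    print(p_alt, games_needed(p_alt))
```

Under these assumptions a one-point deviation needs on the order of twenty thousand games, while a tenth of a point needs roughly two million, which is why the choice of effect size matters so much.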

Next time you run a test, decide before hand what you're actually testing. "Okay, I think 1% is imbalanced, let's test if there's a 1 % difference". All you did was literally just provide some summary statistics since you did a complete after the fact analysis rather than actually testing something.


All I did was to try to make a post I thought would be interesting to some members of the community, showing that there's no obvious statistical reason to suspect that the races are imbalanced, based on a limited data set that I discovered last night. It's not like this is my dissertation project. Now I'm pretty much through with this thread. Most people have been civil and I've tried to appreciate the discussion, but there's been a disturbing number of really hostile posters with openly bad attitudes. Is this forum always like this?
LlamaNamedOsama
Profile Blog Joined July 2010
United States1900 Posts
August 17 2010 07:59 GMT
#130
On August 17 2010 07:38 dcberkeley wrote:
On August 17 2010 07:35 neobowman wrote:
Isn't this math and not science?

Scientific != science


No, but scientific refers to a specific methodology that is not followed in the OP, which seems more aimed at being a statistical study (although many have already pointed out that it still doesn't really use statistics but rather a layman's look at numbers).
Dario Wünsch: I guess...Creator...met his maker *sunglasses*
AyJay
Profile Joined April 2010
1515 Posts
August 17 2010 08:00 GMT
#131
Wheres baller when we need charts?
GagnarTheUnruly
Profile Joined July 2010
United States655 Posts
Last Edited: 2010-08-17 08:08:32
August 17 2010 08:07 GMT
#132
On August 17 2010 16:59 LlamaNamedOsama wrote:
On August 17 2010 07:38 dcberkeley wrote:
On August 17 2010 07:35 neobowman wrote:
Isn't this math and not science?

Scientific != science


No, but scientific refers to a specific methodology that is not followed in the OP, which seems more aimed at being a statistical study (although many have already pointed out that it still doesn't really use statistics but rather a layman's look at numbers).


Science is a process by which you formulate a hypothesis based on a theory, a prior hypothesis, or an observation, and then devise a method to objectively test that hypothesis.

I heard a hypothesis that the races were imbalanced, observed a dataset that suggested that win rates were race independent, and formulated a hypothesis that win rates were race independent. I then found support for my hypothesis by analyzing a data set of 58 million cases using a time-honored statistical technique. Where does that deviate from the definition of science?

Also, I'm not a layman, and despite what you may have heard, a Chi-square analysis is in fact a statistical technique. In fact, it's possibly the most widely used technique in scientific literature, particularly in advanced statistical modeling where it's used for model verification.
MarsAttacks
Profile Joined August 2010
20 Posts
August 17 2010 08:27 GMT
#133
sorry folks,

as a statistical consultant (yeah, kill me please), this statistical discussion is quite nonsensical (at least the OP, and many comments on the first pages)

it doesn't even compare the race-X vs race-Y W/L distribution
my overall body temperature is fine, while my head is roasting in an oven and my feet are in cold water..
can't discuss it any longer, it is raining now at my office, i need to go buy some sunglasses to get the sunshine back

hint : it would have been more interesting to dig into random players' games, even if the conclusions may prove nothing in the end

if you want an undisputedly balanced game : Go is for you (or chess, but nowadays even PC software beats masters). Of course it is black & white 2D... no flashy battles :/
Apolo
Profile Joined May 2010
Portugal1259 Posts
Last Edited: 2010-08-17 08:51:22
August 17 2010 08:42 GMT
#134
I think you didn't mention the simple fact that Battle.net actively messes with the win rate by trying to make everyone 50-50, i.e., Battle.net is manipulating (not a bad thing) the win rates to go to 50-50. Have you considered that this could alter your conclusions and make the analysis less valuable? If that didn't happen this would be great, but all this proves is that Battle.net efficiently matches people with others of their skill.

Since that's the case, it's better to look at tournaments. There's no win rate messing there, only raw data. You should do some recent tournament race analysis if you have the free time, since you seem to have a taste for it, and that would be more valuable, I believe.
RonNation
Profile Blog Joined April 2010
United States385 Posts
August 17 2010 08:45 GMT
#135
balance is only relevant for the top 1% of players, any results of players below that is inconsequential
threehundred
Profile Joined July 2009
Canada911 Posts
August 17 2010 08:51 GMT
#136
tbh if you are earning as much money as blizzard im pretty sure they hire enough mathematicians/statisticians to do GIANT SPREADSHEETS for balancing and tuning of their different mechanics.

for god's sake, they have GIANT SPREADSHEETS for world of warcraft, a game which i'd like to say has a HUGE SET OF COMPLICATIONS/BALANCING ISSUES
KimTaeyeon MEDIC MU fighting! ^^;;
Baarn
Profile Joined April 2010
United States2702 Posts
Last Edited: 2010-08-17 08:58:46
August 17 2010 08:54 GMT
#137
There are imbalances in every game made. Some games have bigger imbalances than others but that doesn't make the game completely terrible. They balanced this enough to resemble maybe wow but kept the dumb above their parents blockbuster MW2. The idea was to get as many people as possible to buy it. It's still #1 seller 3 weeks in a row. I don't see the point in another one of these list statistics threads to show the matchmaking system is working like it was intended to.
There's no S in KT. :P
stochastic
Profile Joined April 2010
United States16 Posts
August 17 2010 08:59 GMT
#138
On August 17 2010 15:08 Milkis wrote:
On August 17 2010 14:21 GagnarTheUnruly wrote:
I was going to leave this alone, but the post you're quoting definitely isn't true. A Chi-square analysis is a foundational test in statistics. It has many uses for testing distributional assumptions and it's presented here in its simplest form. It's absolutely the most appropriate test to use given the data that were available. The reason there's no variance in the statistics I used is because the Chi-square test doesn't use variance to generate its test statistic. As far as certainty goes, I have it, I just didn't report it because I thought the editorial standards of a video game website would permit me to publish a data analysis without reporting p-values, test statistics, and sample sizes. If you're curious, the Chi-squared statistics (with p-values) using the full 48.6 million game sample sizes are 58.35 for diamond (p < 0.001), 76.44 for plat (p < 0.001), 40.79 for gold (p < 0.001), 5.15 for silver (p = 0.161), and 6582.10 for bronze (p < 0.001).


I will admit that I missed the part about you using chi square test to compare distributions (specifically because you didn't report it and your entire analysis was based off your ridiculous chart). However, there's still too many assumptions going in there, not even going into the matchmaking issue.

Run the Chi Square again, and rather than not weighting the "random" distribution you're testing against, weight that according to the distribution of players. The site you have listed as your method has that available. This is because the base player distribution is not random. I'm guessing what you did was assume they should all get the same number of wins given the number of games played at that level.

If you used percentages for your chi squared then there really isn't much I can say since there's too many issues arising with that because that's probably not even normalized properly -____-

Also never do a power test after you run a statistical test again. Because that is absolutely and utterly a meaningless number. You run Power tests only when you're designing the test, not after you run the tests. You seem to have used power to see how many games you need to run to detect imbalances... well... what kind of imbalances? 0.1% differences? Or the differences you ran and found on the chi squared? If it's the latter it's an utterly worthless figure. The former? Then you need to be arguing about what causes that instead of just pointing at some numbers and saying "oh look it's imbalanced" which is what you did and what caused most of the anger in my post. Just posting numbers and saying "look at what the numbers say" doesn't mean jack if you don't have a theory you're actually testing.

Next time you run a test, decide before hand what you're actually testing. "Okay, I think 1% is imbalanced, let's test if there's a 1 % difference". All you did was literally just provide some summary statistics since you did a complete after the fact analysis rather than actually testing something.

pretty reprehensible post. i applaud the OP for being so levelheaded in dealing with responses of this kind

no, the original analysis isn't flawless. but i don't think that makes it meaningless. take from it what you will
AcOrP
Profile Joined November 2009
Bulgaria148 Posts
August 17 2010 09:00 GMT
#139
Everything in this statistic is wrong because these stats are not enough for such an analysis...
The fake balance comes from the smaller number of zerg players and the fact that there are a lot fewer zergs in diamond than terrans. If only 20 out of 100 players in a division are zerg, it takes a lot more skill to be a diamond zerg than a diamond terran. So in diamond there are both very skilled and not so skilled terran players, while diamond zergs should be a lot better than some of the terrans. The ladder system will therefore advance zergs to diamond at a higher skill level than terrans. If you are an average-skill player and abusing terran gets you to diamond, where you face a lot more skilled zerg players, you bring down the whole diamond terran win%. So from this statistic you see balance, but in fact it comes from skill, not from in-game balance. How many good players are in diamond and how many casual?

User was warned for this post
GagnarTheUnruly
Profile Joined July 2010
United States655 Posts
August 17 2010 09:28 GMT
#140
On August 17 2010 17:59 stochastic wrote:
i applaud the OP for being so levelheaded in dealing with responses of this kind

no, the original analysis isn't flawless. but i don't think that makes it meaningless. take from it what you will


Thanks, I'm trying but I did lose my head a little bit. I've edited the OP in a way that hopefully will cause people to react less aggressively towards it and take it in the spirit in which it was originally intended.

I also don't want to spend much space in the post discussing the stats, because they're a little wonky and I had to fudge the numbers slightly due to the nature of the data set that I had access to. Choosing what data to use and how to set up the analysis was a little tricky given the nature of the data. Nothing I did should impact the overall findings, however. If people are genuinely curious then I can elaborate in the reply thread.
Flyingdutchman
Profile Joined March 2009
Netherlands858 Posts
August 17 2010 09:41 GMT
#141
as mentioned earlier, you can't draw conclusions from this due to the AMM. Balance discussion or conclusions should be drawn from personal experience and with as little bias as possible. Good effort though
GagnarTheUnruly
Profile Joined July 2010
United States655 Posts
August 17 2010 09:45 GMT
#142
Thanks. As I've said, if the races are imbalanced, AMM will have the effect of changing the distribution of the races among the leagues. Because this doesn't happen, my results still provide information suggesting that overall balance is good.
x7i
Profile Joined July 2010
United Kingdom122 Posts
August 17 2010 10:22 GMT
#143
if the races are imbalanced, AMM will have the effect of changing the distribution of the races among the leagues
now, will it? logically it should, but considering the whole system is designed to create an illusion of balance and equality there is no reason to believe it does. it might as well be maintaining equal race distribution on a visible level while analysis of hidden matchmaking ratios would paint a very different picture

and of course youre right, but we believe what we choose to believe

anyhow good academic work... and i do find the win ratio graph interesting and meaningful actually ;-)

Keitzer
Profile Blog Joined May 2010
United States2509 Posts
August 17 2010 10:54 GMT
#144
DUDE! You do realize that the reason terran is lower in Bronze is b/c of all the SP terrans who came over to MP are getting placed into Bronze and bringing the W/L down....
I'm like badass squared | KeitZer.489
Ganondorf
Profile Joined April 2010
Italy600 Posts
August 17 2010 11:06 GMT
#145
The race distribution is not balanced indeed. But it is not consistent with the popular belief that terran dominates everything; you have a higher % of protoss players across all leagues except bronze. So I'm not sure how that can be used to determine if a game is balanced, rather than critically analyzing matchups and determining why a matchup is imbalanced and how to fix it without messing up the rest.
edahl
Profile Joined February 2008
Norway483 Posts
August 17 2010 11:12 GMT
#146
On August 17 2010 07:35 neobowman wrote:
Isn't this math and not science?

o_O

User was warned for this post
hdkhang
Profile Joined August 2010
Australia183 Posts
August 17 2010 12:41 GMT
#147
On August 17 2010 18:45 GagnarTheUnruly wrote:
Thanks. As I've said, if the races are imbalanced, AMM will have the effect of changing the distribution of the races among the leagues. Because this doesn't happen, my results still provide information suggesting that overall balance is good.


The problem is not in your methodology as much as it is in the assumptions you are making about the system.

What do we know about the system?

Anyone who wishes to play on the ladder is accepted

* You can have zero knowledge of the game and yet you will be placed in Bronze (in sports, that makes you 3rd place!); the only prerequisite for acceptance into a league is to show up 5 times (you can even disconnect 5 times once the game starts and still get in).
* This is an automatic invalidation of any results emerging from the Bronze league as the range of skill present is astronomical!

You only play matches against people in your region

* Comparisons across different battle.net servers are worthless.

The hidden MatchMakingRating number is based only on wins/losses with respect to the current MMR of yourself and your opponent

* What was required to score that win is not considered at all, nor should it be. Unfortunately, this makes it all the more important that the game be as balanced as possible, or else the MMR is worthless.
* This point also explains why the data is "practically worthless".

The AMM will attempt to pair you up with a person with a similar MMR

* Note that I say similar MMR and not similar skill/ability.
* If a race imbalance existed, the MMR would not reveal it since it would simply consider the person using the weaker race as a "poorer" player hence a lower MMR and the person using an OP race as a better player rewarding them with a higher MMR.
* Thus if you were to compare even between players of similar MMR but across the three races, it would reveal nothing of significance since the reason they were given that MMR is due to their win/loss performance against the same people they are being compared against.
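
The points above can be explored with a toy simulation. This is only a sketch under loud assumptions: it uses a generic Elo-style rating update, not Blizzard's actual (hidden) matchmaking, and the race bonus, K-factor, player count, and skill spread are all invented for illustration. One race gets a hidden strength bonus, ratings are updated from wins and losses only, and players are paired with opponents of similar rating.

```python
import random


def expected_score(strength_a, strength_b):
    """Elo-style probability that A beats B, given two strengths or ratings."""
    return 1.0 / (1.0 + 10 ** ((strength_b - strength_a) / 400.0))


def simulate(n_pairs=100, rounds=300, race_bonus=100.0, k=16.0, seed=1):
    """Toy ladder: one race has a hidden bonus; ratings see only results."""
    rng = random.Random(seed)
    players = []
    for _ in range(n_pairs):
        skill = rng.gauss(0.0, 200.0)
        # One player of each race per skill value, so both races share one skill pool.
        for strong in (True, False):
            players.append({"skill": skill, "strong": strong, "mmr": 0.0,
                            "wins": 0, "games": 0})

    for rnd in range(rounds):
        count_stats = rnd >= rounds // 2  # only count games from the second half
        # Matchmaking: sort by rating (with a little jitter) and pair neighbours.
        order = sorted(players, key=lambda p: p["mmr"] + rng.uniform(-25.0, 25.0))
        for a, b in zip(order[0::2], order[1::2]):
            strength_a = a["skill"] + (race_bonus if a["strong"] else 0.0)
            strength_b = b["skill"] + (race_bonus if b["strong"] else 0.0)
            a_won = rng.random() < expected_score(strength_a, strength_b)
            # The rating update uses only the result and the two current ratings.
            delta = k * ((1.0 if a_won else 0.0) - expected_score(a["mmr"], b["mmr"]))
            a["mmr"] += delta
            b["mmr"] -= delta
            if count_stats:
                a["games"] += 1
                b["games"] += 1
                (a if a_won else b)["wins"] += 1

    summary = {}
    for label, strong in (("strong", True), ("weak", False)):
        group = [p for p in players if p["strong"] == strong]
        summary[label] = {
            "win_rate": sum(p["wins"] for p in group) / sum(p["games"] for p in group),
            "mean_mmr": sum(p["mmr"] for p in group) / len(group),
        }
    return summary


result = simulate()
for label, stats in result.items():
    print(label, round(stats["win_rate"], 3), round(stats["mean_mmr"], 1))
```

In a model like this, one would expect both races to end up close to a 50% win rate while the favoured race settles at a higher average rating. In other words, win percentages hide the imbalance, but it can still show up in how the races are distributed across ratings.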

Not everyone will play the same number of games

* You may say "well duh" but it bears repeating.
* I think in the original top 200 list one of the players that made it had only played a handful of games, I think it was 7 all up, yet 7 games was enough for the system to determine their MMR to be one of the top 200 on the server.
* I honestly believe that a more stringent pre-requisite for diamond league is needed, e.g. 100 games played.

There are many more other points to make, but let's just start with the above for now.
icezar
Profile Joined June 2010
Germany240 Posts
August 17 2010 12:54 GMT
#148
Very nice post!!!
Even better are your comments and responses to everyone else.

I am curious what you make of the race distribution.
There you can see a clear difference.
For example, in Diamond League across all regions only 24% play Zerg as opposed to 35% playing Protoss.
Dagon
Profile Joined August 2010
Romania264 Posts
August 17 2010 15:04 GMT
#149
Umm.. I am not sure I understand all of this very well, but wouldn't this analysis also prove that rock-paper-scissors is balanced? And that is true. If you take 1,000,000 random games of rock-paper-scissors and analyze them, the win-loss of each side would be close to 50%, but that definitely doesn't prove that Rock has a 50% win rate against Paper.

Because of the matchmaking system it may be true that random players from all leagues experience no imbalances, but the question should be whether all the match-ups are balanced at pro level, not whether the game is balanced all around.. I think..

Please correct me if I am wrong about the rock-paper-scissors thing, because I really want to understand how this works.
escapeArtist
Profile Joined August 2010
Norway2 Posts
Last Edited: 2010-08-18 16:59:48
August 18 2010 16:53 GMT
#150
Hi there

After reading many of the comments on this great post I have set up some graphs in Excel to see how the races are represented across the leagues, and how that changes as skill level increases.

I work with analysis for a living (I'm not claiming to be the best at it), but I'm not a native English speaker, so please excuse me for not using the correct terms.

Source: http://www.sc2ranks.com/stats/race/all/1
Date: 18.08.2010

The source of the data may be skewed, but I feel that it's enough players to give a reasonable representation of the general trends.

Total number of players / 1v1 teams: 574 314

First off I had a look at the general representation of all the races:

[Figure: Race Distribution]

As we can clearly see from this pie chart, Protoss is the most represented race if you look at all the brackets and players. And to no one's surprise, Zerg is the least represented.

For a look at the general representation of players I also sorted the players by league.

[Figure: Player Distribution]
No big surprises here either I think.

So if you filter by popularity of each race by points / leagues then the following line diagram comes up:

[Figure: Race development by points and leagues]

First of all I must comment on the numbers and criteria used in this graph.
I removed the best of the best in each league to prevent huge leaps and irregularities in the graph, since there were so few players. (You can't compare 30 people to 100 000.)
Second, I added a moving average(5) to even out the peaks and give a better visual representation of the development for each race.
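
For anyone unfamiliar with the smoothing step, a moving average(5) replaces each point with the mean of a window of 5 points. Here is a minimal sketch, assuming a trailing window (the post does not say which alignment was used); the input values are hypothetical race shares in percent, not the sc2ranks numbers.

```python
def moving_average(values, window=5):
    """Trailing simple moving average: one output per full window."""
    if window < 1:
        raise ValueError("window must be at least 1")
    averages = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]  # drop the point that left the window
        if i >= window - 1:
            averages.append(running / window)
    return averages


# HYPOTHETICAL share of one race (in percent) across successive point brackets.
share_by_bracket = [38.0, 41.0, 35.0, 37.0, 33.0, 34.0, 30.0]
print(moving_average(share_by_bracket, window=5))
```

The smoothed series is shorter than the input by window - 1 points, and sharp peaks in sparsely populated brackets are flattened, which is the effect described above.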

Not surprisingly, Terran starts out very strong, but as you can clearly see Terran loses popularity as the leagues move up and picks up slightly again in upper diamond. This is most likely players moving over from Random, but then again every race gets immigration from Random.

Protoss is going relatively strong from lower bronze to upper diamond, and Zerg starts out weak while gaining more and more players as it moves up the ladder.

As you can also see from this, all the races get players from Random in the diamond league (probably because players have to focus their time on one race to stay competitive).

The strongest development from bronze to diamond is without a doubt Zerg's, while Terran loses the most players.
With this I guess you can say that Terran is the typical "noob" race and Zerg is the typical "Pro" race, if you judge only by development.

Now there are many things to consider when judging imbalance and i feel that the OP made a VERY string point in pointing out that the Matchmaking system works VERY well. If you keep in mind that every player above bronze wins about 55% of his matches, meaning that comparing percentage development of each race is a reasonable estimation of balance at the players current place at the ladder. (Sorry if that got complicated as I'm not a native english).

Moving on the my last graph for the evening, the total percentage represented from each race in each league:

Racial representation for each league
[image loading]

This means that 9.1% of Zerg players are in the diamond league, compared to 6.7% of Terran players.

This shows what percentage of each race is in each league. Keep in mind that every player wins roughly 55% of their matches.
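
A rough sketch of how that per-race percentage is computed. The player counts here are hypothetical, not the sc2ranks numbers.

```python
def league_share(counts_by_league):
    """Percentage of one race's players that sit in each league."""
    total = sum(counts_by_league.values())
    return {league: 100.0 * n / total for league, n in counts_by_league.items()}

# Hypothetical number of Zerg players per league.
zerg_counts = {"bronze": 300, "silver": 250, "gold": 200, "platinum": 150, "diamond": 100}
zerg_share = league_share(zerg_counts)
```

Because each race is normalized by its own total, a small race and a large race can be compared directly league by league.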

Here we can see that Zerg and Random players are way ahead of the puny Terran and the lowly Protoss.

You could say that Zerg and Random do better the higher up the ladder you go, while Terran and Protoss do worse.

Does this mean that Terran is underpowered while Zerg is overpowered at the higher tiers of play?
Most certainly not!
But it does show that every race can stay competitive at all levels of play.
I generally don't like to speculate, but I will anyway.
Zerg takes more APM to do well, and high APM is a typical trademark of a good player. Terran doesn't take as much APM, so many newer players pick Terran or Protoss. This may or may not be correct, as I have no way of proving it, but it's my personal experience from playing Zerg. (I switched to Terran because I am not good enough to micro everything I have to when Zerging.)

I'm not going to conclude that there is perfect balance here, BUT I am going to conclude that, judging by the numbers, I certainly can't find any proof of huge balance issues.

Even so, the fact of the matter is that Zerg and Random have the highest representation at the higher levels of play relative to their playerbase. This does not make Random overpowered lol, so we can't really conclude that it makes Zerg overpowered either.

The only curious point here is that the higher up the ladder the more Zerg there is...

Of course a valid argument is that the top 0.06% of players (diamond 1001+) may feel that there are balance issues. However, this affects such a small part of the playerbase that if you balanced the game on the personal opinions of these players, you would most definitely screw up the general game balance.

Of course, if the differences at these levels were so big that tournaments became Terran & Protoss only, then measures would have to be taken, but so far it looks reasonable.

Time will tell, though, and even if Protoss keeps winning tournaments it doesn't necessarily mean that Protoss is overpowered.

All in all, if there is imbalance then 99.94% of us wouldn't notice the difference anyway.

travy
Profile Joined April 2010
United States14 Posts
August 18 2010 17:30 GMT
#151
Interesting discussion. I did something similar that I just posted:

http://www.teamliquid.net/forum/viewmessage.php?topic_id=145325
MamiyaOtaru
Profile Blog Joined September 2008
United States1687 Posts
August 18 2010 17:34 GMT
#152
On August 19 2010 01:53 escapeArtist wrote:
The only curious point here is that the higher up the ladder the more Zerg there is...
Because noobs don't play Zerg (because you don't play them in the campaign, or because they are harder?) Or because they are harder, so only higher-level players stick with them? No clue, but I have a hard time drawing any conclusions from those results.
silencesc
Profile Joined July 2010
United States464 Posts
August 18 2010 17:35 GMT
#153
I was wondering if you had the p-value for the chi-square test. I didn't see a null hypothesis or whether or not the results were statistically significant.
Real Men Proxy Gate | TEAM LIQUID HWITINGGGG!! PROUD MEMBER OF UC DAVIS CSL TEAM | "If you don't give a shit about what gum you eat, buy Stride" - Liquid`Tyler on SotG 4/19/2011
eivind
Profile Joined July 2010
111 Posts
Last Edited: 2010-08-18 17:37:22
August 18 2010 17:36 GMT
#154
On August 19 2010 01:53 escapeArtist wrote:
The only curious point here is that the higher up the ladder the more Zerg there is...

All in all, if there is imbalance then 99.94% of us wouldn't notice the difference anyway.


People experienced with SC play Zerg, and the race is harder to get started with. So I don't think it is that curious that there are more Zerg players there.

Your only proof is that skill can compensate somewhat for balance issues. I can beat 2-3 players at the same time with any race; this doesn't mean that my race is overpowered. If I gave one of them a really overpowered race which compensated for their lack of skill, then the matchmaking system would match us up as if we were even players!

If we assume that the races are imbalanced, then the matchmaking system matches Zerg players with lower-skilled Terran players. The only way of noticing imbalance (by looking only at the ladder) is to look at the top players, where there would be fewer Zerg players. This data can only be used if the top players are somewhat evenly skilled.
kataa
Profile Blog Joined August 2010
United Kingdom384 Posts
August 18 2010 17:53 GMT
#155
You can't entirely reduce a dynamic game like SC2 to a bunch of stats. Let's look, for example, at the 'imbalance' in Bronze. Why might this be? Well, most Bronze Terrans don't know how to wall off, and tend to tech without building any units, so a simple zergling runby will win you almost any game at bronze level, making Zerg incredibly powerful. Does this mean Zerg is imbalanced? Clearly not.

Imbalance only really comes into play in the nosebleeds of any reasonably crafted game. Social issues surrounding playstyle, attitude, etc. can always explain other small variations in win rate. If balancing a game was as simple as getting the win rates to be similar, then creating RTS games would be easy as hell.

We're not really going to know how bad or good the balance is for some time. Though I'll certainly say that players like Masterasia and Sheth have made some very good points that go much deeper into SC2's design than simply 'OMFG nerf reaperz plz', and it will be interesting to see how Blizzard responds to this.

All these statistics prove is that the matchmaking system is pretty effective.
vesicular
Profile Blog Joined March 2010
United States1310 Posts
August 18 2010 18:02 GMT
#156
I'd like to see the same stats done with only Diamond level players with ELO of 700+. This is really where the game should be most balanced.
STX Fighting!
silencesc
Profile Joined July 2010
United States464 Posts
August 18 2010 18:41 GMT
#157
Stats minor here. I redid (I hope) your results with the numbers you put in here, using a chi-square GOF test for homogeneity. In my findings, the only league that is statistically imbalanced is bronze, based solely on the W/L you reported (the p-value at the end was about .12, higher than the .05 I set as the requirement for the null hypothesis to be overruled). From Silver to Diamond it didn't go above .03, which essentially means that after bronze, no race has a W/L record that is significantly different from any other race.
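
A minimal sketch of a chi-square test of homogeneity on win/loss counts, for anyone who wants to rerun this kind of check. The counts are invented, and the p-value shortcut `exp(-chi2 / 2)` is only valid for 2 degrees of freedom (three races by two outcomes); with Random included as a fourth row you would need a statistics library instead.

```python
import math

def chi_square_homogeneity(table):
    """table: one [wins, losses] row per race. Returns (chi2, degrees_of_freedom)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

def p_value_df2(chi2):
    """Upper-tail p-value of the chi-square distribution, exact for df == 2 only."""
    return math.exp(-chi2 / 2.0)

# Hypothetical [wins, losses] for Terran, Protoss, Zerg in one league.
games = [[520, 480], [500, 500], [480, 520]]
chi2, df = chi_square_homogeneity(games)
p = p_value_df2(chi2)
```

Under the usual convention, a p-value below the chosen threshold (.05 here) is what leads you to reject the null hypothesis of equal win rates, while a p-value above it means the data are consistent with no difference between the races.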

I must stress that this is based on the data provided, which means that if the matchmaking system really does try to hold people at 50% (which I'm starting to doubt), then it is doing its job and keeping the game fair. This isn't to say that there isn't inherent imbalance in the top 1% (pros), but that evidence is anecdotal, based on a small sample of games played by pro players. I can run tests on the pros based on race and W/L, using the total games reported for them on sc2ranks, but without a sample of at least 50 games from each player it won't really matter.

I can tell you, however, that people like IdrA or HuK who play on ladder have huge ratios. IdrA is at like 80% right now, obviously significantly higher than everyone else, but with only about 100 games played by him, I can't say for certain whether that says something about the caliber of random diamond players or reflects a real trend.
Real Men Proxy Gate | TEAM LIQUID HWITINGGGG!! PROUD MEMBER OF UC DAVIS CSL TEAM | "If you don't give a shit about what gum you eat, buy Stride" - Liquid`Tyler on SotG 4/19/2011
chuninexam
Profile Joined April 2010
Canada56 Posts
August 18 2010 18:42 GMT
#158
Of course it's imbalanced. Do you really need "scientific" proof to know that?

Do you realise how much time and trial and error is involved in balancing a game like this that has so many variables?
Lalgee
Profile Joined August 2010
United Kingdom65 Posts
August 18 2010 19:11 GMT
#159
Win percentage is not a suitable test for whether or not the game is balanced.

The Problem
Suppose a new 4th race is added, which is slightly stronger than the current 3 races. What happens? Well, the players who play this race beat people of equal skill. However, the ranking system is unable to determine a user's skill level; it can only identify how often they are winning. Therefore, the players who play the new 4th race will be higher in the rankings than their skill alone would put them. Being higher in the rankings, they get matched with players who are better than them but are not using the imbalanced race. They will still win ~50% of their games, but they sit higher in the rankings than people of equal skill: the rankings have had to place them at a higher skill level in order to offset the imbalance of the race, so to speak.
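
A toy sketch of this argument, assuming an Elo-style rating model. Blizzard's actual system is not public, so every name and number here is hypothetical: the player has a true skill of 1500, plays a race worth 100 rating points, and is always matched against a neutral-race opponent whose skill equals the player's current rating.

```python
def win_probability(strength_a, strength_b):
    """Elo-style expected score of A against B."""
    return 1.0 / (1.0 + 10 ** ((strength_b - strength_a) / 400.0))

def settle_rating(true_skill, race_bonus, start_rating, k=32.0, games=2000):
    """Apply the expected Elo update game after game and return where the rating settles."""
    rating = start_rating
    for _ in range(games):
        opponent_skill = rating  # matchmaker picks an equally rated opponent
        p_win = win_probability(true_skill + race_bonus, opponent_skill)
        # The system predicts 50% for an even-rated match, so the expected
        # rating change is proportional to the surplus over 50%.
        rating += k * (p_win - 0.5)
    final_win_rate = win_probability(true_skill + race_bonus, rating)
    return rating, final_win_rate

final_rating, final_win_rate = settle_rating(1500.0, 100.0, 1500.0)
```

In this toy model the rating drifts up until the win rate is back at 50%, so the imbalance shows up as ladder position rather than as win percentage.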

Further thoughts
The fact that this is how the rankings work makes it nearly impossible to determine whether or not a race is imbalanced, other than by listening to the people in the very upper ranks. The fact that people like IdrA, White-Ra and MorroW are all able to win tournaments with their respective races means that the races are certainly balanced enough for people of equal skill lower down the rankings to have fair and even matches, where the difference in skill, or the frequency/severity of errors, is more likely to decide a game than the race they have selected.

An Alternate Approach
Suppose then that there was a way to have players of a certain skill level facing each of the three races, while playing as each of the three races and maintaining their current skill level on the rankings. Fortunately, random players fill this exact mould. Therefore I would suggest that the only way to find out whether or not a race is balanced/imbalanced is to look at the win % for random players while playing as each of the three races. If one race is significantly higher than the others, or one lower than the others, then it could be suggested that that race is imbalanced, one way or another.
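
A minimal sketch of that comparison, with invented records and a plain normal-approximation interval (an assumption on my part, not something from the ladder data). It splits a random player pool's games by the race they rolled and reports the win rate with a 95% interval for each.

```python
import math

def win_rate_with_interval(wins, games, z=1.96):
    """Win rate plus a normal-approximation 95% interval (low, high)."""
    rate = wins / games
    margin = z * math.sqrt(rate * (1.0 - rate) / games)
    return rate, rate - margin, rate + margin

# Hypothetical (wins, games) of random players, split by the race they were assigned.
random_records = {"Terran": (540, 1000), "Protoss": (500, 1000), "Zerg": (460, 1000)}
summary = {
    race: win_rate_with_interval(wins, games)
    for race, (wins, games) in random_records.items()
}
```

If the intervals for two races do not overlap, as with Terran and Zerg in this made-up data, that would be the kind of difference worth looking into; overlapping intervals would not support any claim either way.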
"That's Lal-Genius"
Blabla13
Profile Joined February 2010
Comoros5 Posts
Last Edited: 2010-08-18 19:31:20
August 18 2010 19:30 GMT
#160
I'd like to say that I enjoyed the analysis in your post, Gagnar! I hope more data somehow becomes available. It would be interesting to see the data for specific match ups.
Thank you
Bob.
texmix
Profile Joined May 2010
United States106 Posts
Last Edited: 2010-08-18 20:03:35
August 18 2010 19:59 GMT
#161
As others have stated, the OP is based on a flawed methodology. If you are trying to use stats to figure out which race is overpowered, 4 items need to be controlled:
1. Homogeneous skill in race choice (maybe old BW semi-pros just gravitate towards Terran in sc2)
2. The matchmaking system instead of random opponents
3. Player MU difference (one player may, in the long run, win 60% pvt, another lose 60% pvt)
4. Player MU skill changes over time (maybe a day9 video will change pvt win stats by several bps in a single week)

To control all 4 of these I suggest mining for at least 1,000 players that:
1. Have played over 200 games
2. Played at least 30 games in the last 72 hours
3. Are in the diamond league

From this list, throw out all games involving a random player (less consistent MU performance) and everything older than the most recent 30 games, and calculate the group's median win ratio using the most recent 30 games. Keep the 100 players of each race with win ratios closest to the median win ratio and throw out the other 700 players' games. For instance, if the 1,000 players have win ratios ranging from 35% to 90% (in their most recent 30 games), with a median of 55%, then pick the 100 zerg, protoss, and terran players who are closest to 55%. From the remaining 300*30 games, a simple win/loss record for each MU will be about the best possible indication of imbalance I believe data mining can come up with (short of using the same methodology with more games or tweaked ratios).
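
The selection procedure above could be sketched roughly as follows. The record layout (`race`, `league`, `total_games`, `games_last_72h`, `results`) is hypothetical, and as a simplification the sketch drops Random players themselves rather than every individual game that involved one.

```python
from statistics import median

def recent_ratio(player, recent_n):
    """Win ratio over the player's most recent games (1 = win, 0 = loss)."""
    recent = player["results"][-recent_n:]
    return sum(recent) / len(recent)

def select_sample(players, per_race=100, recent_n=30, min_total=200):
    """Filter players, then keep those of each race closest to the group median."""
    eligible = [
        p for p in players
        if p["league"] == "diamond"
        and p["total_games"] > min_total
        and p["games_last_72h"] >= recent_n
        and p["race"] != "Random"
    ]
    group_median = median(recent_ratio(p, recent_n) for p in eligible)
    chosen = {}
    for race in ("Terran", "Protoss", "Zerg"):
        pool = [p for p in eligible if p["race"] == race]
        pool.sort(key=lambda p: abs(recent_ratio(p, recent_n) - group_median))
        chosen[race] = pool[:per_race]
    return group_median, chosen

def make_player(name, race, league, results):
    # Tiny made-up records; a real run would use 30 recent games per player.
    return {"name": name, "race": race, "league": league,
            "total_games": 300, "games_last_72h": len(results), "results": results}

demo_players = [
    make_player("a", "Terran", "diamond", [1, 1, 1, 1]),
    make_player("b", "Terran", "diamond", [1, 0, 1, 0]),
    make_player("c", "Zerg", "diamond", [1, 1, 0, 0]),
    make_player("d", "Protoss", "diamond", [1, 1, 1, 0]),
    make_player("e", "Zerg", "gold", [1, 1, 1, 1]),
    make_player("f", "Random", "diamond", [1, 0, 0, 0]),
]
group_median, chosen = select_sample(demo_players, per_race=1, recent_n=4)
```

The per-matchup win/loss tally would then be taken only over the recent games of the players left in `chosen`.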
PsychedelicMonk
Profile Joined July 2010
27 Posts
August 18 2010 20:45 GMT
#162
I think a race being OP isn't necessarily about how many games they win or their win percentage, but more importantly HOW they win those games. Winning in itself is only the starting point in determining the balance of the races. You can't prove imba based solely on win percentages: you don't have all the facts.

We need to take a deeper look into the game itself. If a Terran player beats a Protoss player 5 out of 5 times, does that mean that Terran is imba? No, it does not. But if in all of those games the Terran player produces a build that the Protoss player CANNOT beat, no matter how good the micro or macro is, then yes, cry imba. It comes down to how you lose the game, in my opinion.

I'm a Protoss player, have been since BW. 600+ Diamond player, but I struggle with Terran (as do most non-Terrans, it seems). I don't lose games because I can't macro or I can't micro. No, I tend to lose most of my games to Terran because of cheap harass, EMPs, or tank AI. How come Terran gets to have 2.5 units specifically made for harassing (reaper, banshee, hellion, which counts as .5 because it can be used normally as well) when Protoss only gets 1, and it's Tier 3! Or how is it fair that a couple of EMPs can cut the overall HP of my army in half in the first 2 seconds of a battle? Or how can siege tanks be so damn smart that they can strategically position their shots to maximize splash damage in my army? My Stalkers don't target-fire on their own.

Imbalances come down to how the game is played, not the end result. I don't cry imba when an MMM ball beats me fair and square, or when a nicely timed Thor/Tank/Hellion push catches me off-guard. It's the little things that make Terran an imbalanced race. PvZ is fine just the way it is.

If EMP didn't have aoe, it would come down to microing Temps vs. Ghosts, feedback vs EMP, and that's a fight in which the better player (regardless of imba) would win. If tanks were as dumb as every other unit, it would come down to skill for them to do massive amounts of damage, not imba.

Hopefully this doesn't seem like a whiny post, because I've put a lot of time and thought into how I feel the game is going so far. I think the most rational of us can agree there is a problem with Terran, because there hasn't been this much outcry against any other race. Something needs to be done, whatever that may be.
JulianSidewind
Profile Joined May 2010
Canada52 Posts
August 18 2010 21:02 GMT
#163
Scientific proof that SC2 is imbalanced:

It was released less than a month ago.

Give it time people.
kGold
Profile Joined July 2010
Canada66 Posts
August 18 2010 21:04 GMT
#164
The game is imbalanced because of terran opening options, not because of a bundle of useless statistics that only prove Blizzard's matchmaking system works to keep all players around the same W/L ratio... I am sure I am the hundredth person to mention that.

I feel bad the OP tried to be helpful with his stats and now realizes he wasted his time.
If I lose to a noob, then what am I?
TanGeng
Profile Blog Joined January 2009
Sanya12364 Posts
August 18 2010 21:12 GMT
#165
Argh.... understanding of statistics is really really bad. Really really bad. Even academicians are terrible at understanding statistics.

You can't prove anything with statistics. A proof requires logic and proceeds by induction or deduction. All statistics can do is provide evidence that imbalance is highly likely or unlikely. It's highly circumstantial. Even then you need to clearly state ALL your assumptions, and if any assumption goes untested, the evidence will be flawed.
Moderator我们是个踏实的赞助商模式俱乐部
TaKemE
Profile Joined April 2010
Denmark1045 Posts
August 18 2010 21:12 GMT
#166
The overall win rate is useless; you need the win rate for each matchup.
Parodoxx
Profile Joined May 2010
United States549 Posts
August 18 2010 21:13 GMT
#167
On August 17 2010 07:18 nam nam wrote:
Get back to me when you have calculated the win percentage against respective race.

User was warned for this post

He did. An average win rate in any division would assume playing each of the other races an equal amount. What that means is either zerg wins against protoss way more than they should or terran is losing equally against all races.

Hidden_MotiveS
Profile Blog Joined February 2010
Canada2562 Posts
Last Edited: 2010-08-18 21:21:55
August 18 2010 21:18 GMT
#168
On August 19 2010 04:59 texmix wrote:
As others have stated, the OP is based on a flawed methodology. If you are trying to use stats to figure out which race is overpowered, 4 items need to be controlled:
1. Homogeneous skill in race choice (maybe old BW semi-pros just gravitate towards Terran in sc2)
2. The matchmaking system instead of random opponents
3. Player MU difference (one player may, in the long run, win 60% pvt, another lose 60% pvt)
4. Player MU skill changes over time (maybe a day9 video will change pvt win stats by several bps in a single week)

To control all 4 of these I suggest mining for at least 1,000 players that:
1. Have played over 200 games
2. Played at least 30 games in the last 72 hours
3. Are in the diamond league

From this list, throw out all games involving a random player (less consistent MU performance) and everything older than the most recent 30 games, and calculate the group's median win ratio using the most recent 30 games. Keep the 100 players of each race with win ratios closest to the median win ratio and throw out the other 700 players' games. For instance, if the 1,000 players have win ratios ranging from 35% to 90% (in their most recent 30 games), with a median of 55%, then pick the 100 zerg, protoss, and terran players who are closest to 55%. From the remaining 300*30 games, a simple win/loss record for each MU will be about the best possible indication of imbalance I believe data mining can come up with (short of using the same methodology with more games or tweaked ratios).

I wanted to say this, but feared the backlash of "NO We has psience we is wright". The methodology of the observational study is flawed in a few ways. For one, I don't think you are considering any confounding variables such as how the ranking system comes into effect. If one race is overpowered then it's simple to assume that it will be overrepresented in relation to its total population within the top of diamond rank only. But this could also be confounded by how people think Terran is the strongest race, so the more serious players switch over to that race thinking this is true. In addition, the sample sizes here are very small.

I would like to hear what a statistician, or Blizzard statistician has to say about the data.

edit: Oh I see, the OP understands that the matchmaking system kind of voids his analysis. I'm sorry if I sounded harsh. Great effort put into this.
TanGeng
Profile Blog Joined January 2009
Sanya12364 Posts
Last Edited: 2010-08-18 21:26:22
August 18 2010 21:19 GMT
#169
On August 19 2010 06:13 Parodoxx wrote:
He did. An average win rate in any division would assume playing each of the other races an equal amount. What that means is either zerg wins against protoss way more than they should or terran is losing equally against all races.



This is the first of many untested assumptions. There are about 4 or 5 other assumptions being made.

On August 19 2010 06:18 Hidden_MotiveS wrote:
In addition, the sample sizes here are very small.


Sample size is large enough - provided we are looking for something that is really imba (+/-5%) rather than just a couple percentage points.
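
A back-of-the-envelope sketch of that claim, assuming independent games and the usual normal approximation for a proportion (both are simplifications). It shows the 95% margin of error on a win rate for a given number of games, and the number of games needed to reach a given margin.

```python
import math

def margin_of_error(n_games, win_rate=0.5, z=1.96):
    """Approximate 95% margin of error on a win rate estimated from n_games."""
    return z * math.sqrt(win_rate * (1.0 - win_rate) / n_games)

def games_needed(margin, win_rate=0.5, z=1.96):
    """Smallest number of games whose margin of error is at most `margin`."""
    return math.ceil((z / margin) ** 2 * win_rate * (1.0 - win_rate))

games_for_five_points = games_needed(0.05)
games_for_one_point = games_needed(0.01)
```

Under these assumptions a few hundred games are enough to see a 5-point gap, while resolving a single percentage point takes closer to ten thousand.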
Moderator我们是个踏实的赞助商模式俱乐部
Easy772
Profile Joined May 2010
374 Posts
August 18 2010 21:26 GMT
#170
Dude watch the IEM tournament replays.. yeesh!
"The best way to improve is to play one matchup on one map doing one strategy.. if you are good at one strategy you are a good player, if you are okay at many strategies you are an okay player at best" -Day[9] 181
betaben
Profile Blog Joined September 2007
681 Posts
August 18 2010 21:37 GMT
#171
hi, given the number of games you've studied, are the differences in % actually significant? are they more than any statistical errors, or other errors you might be able to think of? If you split the sample in two (let's say by date) do you see any difference in results? do you build that difference into the final error?

ffs, if you quote differences in percentage, it's time you say whether it actually means anything, or whether it's chance, or a static or dynamic situation.

damn lies and statistics.
Zegu
Profile Joined August 2010
Canada52 Posts
August 18 2010 21:37 GMT
#172
So you pounded the win/loss data for different levels. Good job, but I really don't think you can call anything imbalanced based on who has a lower win percentage (Zerg appears to have the lowest by a slight margin). What matters is race mechanics and diversity, and Zerg leads on both counts: with the lowest diversity and the most micro-intensive mechanics of all the races, a Zerg's errors build up much faster than those of the other two races.
Grimjim
Profile Joined May 2010
United States395 Posts
August 18 2010 21:39 GMT
#173
OP, I appreciate the work and thought that went into this, but it was all disproven within the first couple of posts, which stated that all you have shown is that the matchmaking system is working. You cannot take balance information from a system designed to always give you a 50/50 win percentage.

And with that, can we close this thread?
I am serious. And my name is Shirley.
TanGeng
Profile Blog Joined January 2009
Sanya12364 Posts
August 18 2010 21:45 GMT
#174
Best evidence to look at is the performance of random players and watch out for differences in the performance with the three races.

1. No selection biases (assumes blizzard's randomizer is good - check distribution of races when selecting random).
2. The same body of players playing all three races.
3. No biases in opponent selection (assumes blizzard's matchmaker doesn't skew - should check this.)

Analysis should look at the 6 possible match ups separately rather than aggregate into ZvAll, etc.

All this analysis will only capture the current state of the SCII metagame, since the current body of players hasn't fleshed out all strategies and tactics. There is also a significant amount of copycat strategy going on.

All in all, it's easier to logically deduce imbalances based on build order advantages and flexibility than to observe them in the statistical data. It'd be an interesting exercise to try though.
Moderator我们是个踏实的赞助商模式俱乐部
See.Blue
Profile Blog Joined October 2008
United States2673 Posts
Last Edited: 2010-08-18 21:49:30
August 18 2010 21:48 GMT
#175
The problem with your 'science' is that you have a foregone conclusion in your head about what reality is and you're trying to twist facts to suit it. As others have said, you're drawing conclusions about a causal relationship from an imperfect data set with tons of degrees of freedom. Furthermore, you have no basis to claim that win percentages indicate any sort of racial imbalance; there is a one hundred percent correlation between being born and dying, but that doesn't mean one directly causes the other. You have little to no evidence to support any claim of causality. I'm all for trying to bring quantitative reasoning to Starcraft, but this is pretty wildly off the mark. Sorry to sound so harsh, but isn't that what peer review is about?
Techno
Profile Joined June 2010
1900 Posts
August 18 2010 21:50 GMT
#176
The win rate does not prove imbalance.
Would a "perfectly balanced" Starcraft 2 have perfect 50% win rates? No.
Hell, it's awesome to LOSE to nukes!
rS.Sinatra
Profile Joined May 2010
Canada785 Posts
August 18 2010 22:01 GMT
#177
It's hard to take numerical data to prove statistical imbalances in this game... mostly because there are too many variables.. a simplified version of "too many variables..."

IntoTheRainbow > Idra > Tester > IntoTheRainbow in a tournament... now multiply this concept by 1 billion since there are infinite possibilities and relationships between the several thousand diamonds world wide...
www.rsgaming.com
senti
Profile Joined August 2010
4 Posts
August 18 2010 22:03 GMT
#178
The data the OP used doesn't accurately reflect what it's trying to show. Imbalances are seen through data between the different combinations of match ups possible (PvT, TvZ, etc.) at different time points in the game. This is obviously difficult data to gather, and so far all we got are win/lose data of races (in different leagues) without variables like time and all the different match ups. This complex problem could be better shown with data on different match ups with games organized by length (i.e. group 1 is <17 mins of game time, group 2 is >17 mins of game time). This isn't a polished idea, but with more and more tests, you could definitely see at what intervals of time in the particular match up there seems to be a certain win/loss ratio and try to figure out the reason.
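
A rough sketch of that grouping, using a hypothetical list of `(winner_race, loser_race, minutes)` records. The 17-minute cutoff comes from the example above; a game of exactly 17 minutes is put in the "long" group here, which is an arbitrary choice.

```python
from collections import defaultdict

def tally_by_matchup_and_length(games, cutoff_minutes=17):
    """Count games and wins per (matchup, length bucket), skipping mirror matchups."""
    tally = defaultdict(lambda: {"games": 0, "first_race_wins": 0})
    for winner, loser, minutes in games:
        if winner == loser:
            continue  # mirror matchups say nothing about racial balance
        first, second = sorted((winner, loser))  # stable name for the matchup
        bucket = "short" if minutes < cutoff_minutes else "long"
        cell = tally[(first + "v" + second, bucket)]
        cell["games"] += 1
        if winner == first:
            cell["first_race_wins"] += 1
    return dict(tally)

# Made-up results: (winner race, loser race, game length in minutes).
demo_games = [
    ("Terran", "Protoss", 9),
    ("Terran", "Protoss", 12),
    ("Protoss", "Terran", 25),
    ("Zerg", "Zerg", 10),
]
by_length = tally_by_matchup_and_length(demo_games)
```

With real data, each cell would give a win ratio for one matchup in one time window, which is the breakdown being asked for.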

Even if you're a good player and you can tell that you have a certain disadvantage at some time interval, you still need to persuade the masses. With that point, good job OP for your work to support your idea even though it's a bit rough.
aseq
Profile Joined January 2003
Netherlands3973 Posts
August 18 2010 22:08 GMT
#179
If you could just show how many players there are in each group, that would be great. If there are similar numbers of players for each race in each league, I seriously don't see the problem, unless you can argue diamond league is too big and should be split into diamond and S+ or something.

And as mentioned before, T > Z > P > T does make matches unbalanced, but not these stats. But I'd first concentrate on the general strength of a race.
Roggay
Profile Joined April 2010
Switzerland6320 Posts
August 18 2010 22:27 GMT
#180
Statistics are cool and interesting, BUT if you want to "prove" that the game is imbalanced, look at the game itself, not at some shiny statistics. Imo what really matters in the end is how the game is actually played at top level, and if there are imbalances you should spot them by watching/playing games (at top level).

Plus if you spot an imbalance by actually playing the game, you can fix the matchup in a correct way. Imo statistics are a bonus, not a "proof" for imbalances.
GagnarTheUnruly
Profile Joined July 2010
United States655 Posts
Last Edited: 2010-08-18 23:20:55
August 18 2010 23:12 GMT
#181
I was wondering if you had the p-value for the chi-square test. I didn't see a null hypothesis or whether or not the results were statistically significant.


p-values are nearly zero for all leagues except silver, where it is above 0.2 or 0.3 if memory serves. That's with an n of ~50 million.

Stats minor here. I redid (I hope) your results with the numbers you put in here, using a chi-square GOF test for homogeneity. In my findings, the only league that is statistically imbalanced is bronze, based solely on the W/L you reported (the p-value at the end was about .12, higher than the .05 I set as the requirement for the null hypothesis to be overruled). From Silver to Diamond it didn't go above .03, which essentially means that after bronze, no race has a W/L record that is significantly different from any other race.


Unless I'm misunderstanding something, that's the test I've done, but with a different sample size. If you use percentages as your counts then your sample size is 100, unless you've adjusted them for the actual sample size (as I have). For a chi-square test it's important that the correct sample size be used.
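
To illustrate why the sample size matters, here is a small sketch with made-up numbers: the same 52/48 win/loss split tested against an even split, once entered as percentages (n = 100) and once as actual counts (n = 100,000).

```python
def chi_square_gof(observed, expected_fractions):
    """Chi-square goodness-of-fit statistic against the expected fractions."""
    n = sum(observed)
    return sum(
        (o - n * f) ** 2 / (n * f)
        for o, f in zip(observed, expected_fractions)
    )

# Hypothetical wins/losses for one race: as percentages, then as real counts.
percentages_as_counts = [52, 48]
actual_counts = [52000, 48000]

stat_from_percentages = chi_square_gof(percentages_as_counts, [0.5, 0.5])
stat_from_counts = chi_square_gof(actual_counts, [0.5, 0.5])
```

The statistic scales linearly with n, so the identical split looks like noise at n = 100 and like an overwhelming effect at n = 100,000.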

@ all the people who think my results are meaningless due to matchmaking:

I've said this a lot by now, but I'll state it again for the last time here:

I have reasoned that if the game is imbalanced, that imbalance must manifest as either 1) a difference in win ratios for the different races or 2) a difference in race prevalence as you increase player skill level, except under one of several unlikely scenarios and one likely one.

The degree to which it shifts from condition 1) to condition 2) depends on the strength of the matchmaking system. Since we don't see 1) (as my data show), and we don't see 2) (as I have said and escapeartist has shown), we can conclude that the game is balanced, at least for regular league play.

The unlikely scenarios:

a. Blizzard's matchmaking system is wise to racial imbalances, and chooses lower-level opponents for a player of a given ranking if they play as a weak race vs. a strong race. The only reason they would do this would be to 'hide' racial imbalances from the player and/or the community.

b. Blizzard's matchmaking system does nothing, and each league is a random sample of the regional player population.

c. People have no race loyalty, and randomly pick their race before each match.

The likely scenario:

d. The races are balanced overall but matchups are imbalanced, in a rock-paper-scissors fashion. I favor protoss, and I really feel that I struggle against Terran.

@ all the people who think the test is inappropriate because I haven't modeled enough variables that affect win rate

I don't have access to data that will allow me to do that. I'd like to, but I can't. In science, when that's the case, you have to look for other ways to test a question. In my case, I initially reasoned that an imbalance would lead to a difference in win rates among races. People immediately pointed out that that wasn't the case, due to matchmaking. However, I then realized that the matchmaking system would force weak races into the lower leagues.

I checked to see if that happened and amended my analysis with a graph showing that it doesn't. Escapeartist has since analyzed this in more detail and come to the same conclusion, although nobody to my knowledge has done an analysis for lower league play. People have shown, however, that it's not true for the top hundred or so players in each region.

@ the people who think stats are useless

It's been shown before that stats are a much better way of assessing the truth than anecdotal knowledge. Even experts often have misperceptions, and misperceptions often produce feedback loops. Stats are at least partially resistant to this.

That said, I think opinions and impressions of top-level players (IdrA's thoughts on high level ZvT matchups, e.g.) still warrant attention, and consistently held beliefs warrant scientific investigation. In fact, that's what I did with respect to win rates for league play!

Finally, thanks everyone for your interest! I'll keep trying to answer questions but I know I'll miss some and for that I apologize.
TanGeng
Profile Blog Joined January 2009
Sanya 12364 Posts
Last Edited: 2010-08-19 00:02:02
August 18 2010 23:48 GMT
#182
Ugh... stats are good for evidence of phenomena. They aren't proof of phenomena. Stats are better at disproving BS theories than they are at proving theories. They are also better at capturing the empirical outcome of an unknown phenomenon without explaining why.

All your stats have shown is that the populations of players favoring zerg, protoss, and terran have similar win rates at various levels, and that there is an interesting distribution of races across all the leagues. These populations of players aren't the same, so it doesn't show anything about the underlying skill level of players in those leagues, nor does it eliminate selection biases for picking races or the possibility of varying levels of imbalance at different skill levels.

For example:
It might take a certain type of mind to play zerg well and not all players are suited to it, or the average zerg might have to be better, or the learning curve for zerg is easier but current skill ceiling is lower.

Everything in statistics is retrospective and reflects only a past state of affairs. It only captures up to the current state of the SC2 metagame. In fact, the statistics could be hopelessly outdated if there is a sharp change in the metagame out there. Moreover, they don't say anything about the imbalances that would exist should players figure out how to play optimally.
Moderator. We are a down-to-earth club built on a sponsorship model.
febreze
Profile Joined April 2010
167 Posts
August 18 2010 23:52 GMT
#183
On August 17 2010 07:18 StarcraftGuy4U wrote:
None of these stats are worthwhile because the matchmaking system does not assign people like they would in a blind study, instead it is actively adjusting the matches so that every player reaches 50%. The numbers you are pulling are worthless for this reason.


It's been said before, but due to faulty assumptions made by the author, this study must be redone to be relevant.
Beauty in truth, deception with dogma, meaning through life.
cocosoft
Profile Joined May 2010
Sweden 1068 Posts
August 19 2010 00:10 GMT
#184
For those who say that the matchmaking system pushing players toward 50-50 makes the data he is using incorrect, READ what he wrote:
On August 17 2010 07:13 GagnarTheUnruly wrote:
People have pointed out that matchmaking would cause this to happen, because it strives to set each player's win rate at 50%. That in turn would cause the win rate of each race to trend towards 50%. That being the case, poor balance would tend to result in 'weak' races getting pushed into the lower tiers of play. Because we don't see that happening either within or among leagues (data not shown), my data suggest both that the matchmaking system works well and that SC2 is inherently pretty well balanced.
Agreeing with this.
¯\_(ツ)_/¯
GagnarTheUnruly
Profile Joined July 2010
United States 655 Posts
Last Edited: 2010-08-19 00:57:55
August 19 2010 00:44 GMT
#185
On August 19 2010 08:48 TanGeng wrote:
Ugh... stats are good for evidence of phenomena. They aren't proof of phenomena. Stats are better at disproving BS theories than they are at proving theories. They are also better at capturing the empirical outcome of an unknown phenomenon without explaining why.


Phenomena are proof of themselves. Mechanisms require validation. The only thing you need to do to prove that different races have different win rates is to show that they do (and they do... 56.06% is different from 55.56%). The question the statistics answer is: is the difference due to chance? Specifically, they estimate the chance that a test statistic as large as the observed one (here, a chi-square value representing the standardized difference between observed and expected win rates) could occur by chance alone if there were no real difference. In this case, we take a p-value on the order of 1e-12 to be sufficient evidence that the difference is not due to chance, enough that we can feel comfortable saying so with certainty. The difficulty comes with interpretation and generalization. These require reasoning and logic, and careful consideration of all possible explanations. Such analysis often leads to additional questions, as we've witnessed in this thread. It's the scientific process in a nutshell.
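
As a rough illustration of the test described here, below is a minimal sketch of a chi-square test of homogeneity on a 3x2 table of wins and losses per race. The counts are hypothetical, picked only so the win rates resemble the 56.06% and 55.56% figures above; they are not the data from this thread. With three races the test has 2 degrees of freedom, and for that case the chi-square tail probability is exactly exp(-x/2), so no statistics library is needed.

```python
import math

def chi_square_win_rates(wins, losses):
    """Chi-square test of homogeneity for win/loss counts of exactly three races.

    wins, losses: dicts mapping race name -> number of games (hypothetical inputs).
    Returns (chi2, p_value).
    """
    races = list(wins)
    if len(races) != 3:
        raise ValueError("this sketch only handles three races (2 degrees of freedom)")
    total_wins = sum(wins.values())
    total_losses = sum(losses.values())
    total = total_wins + total_losses
    chi2 = 0.0
    for race in races:
        games = wins[race] + losses[race]
        for observed, column_total in ((wins[race], total_wins),
                                       (losses[race], total_losses)):
            # Expected count if every race shared the pooled win rate.
            expected = games * column_total / total
            chi2 += (observed - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 2 degrees of freedom.
    p_value = math.exp(-chi2 / 2.0)
    return chi2, p_value

# Hypothetical counts: one million games per race.
wins = {"terran": 560_600, "protoss": 555_600, "zerg": 558_100}
losses = {race: 1_000_000 - w for race, w in wins.items()}
chi2, p = chi_square_win_rates(wins, losses)
print(f"chi2 = {chi2:.2f}, p = {p:.2e}")
```

A tiny p-value here only says the difference in rates is unlikely to be sampling noise; it says nothing about how large or how important the difference is.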

It's a common misconception that science and statistics can't prove anything. I like a statement I just read in an intro plant ecology textbook:

"The popular image of the scientific method portrays it as a process of falsifying hypotheses. This approach was codified by... Karl Popper (1959). In this framework we are taught that we can never prove a scientific hypothesis or theory. Rather, we propose a hypothesis and test it; the outcome of this test either falsifies or fails to falsify the hypothesis. While hypothesis testing and falsification is an important part of theory testing, it is not the whole story, for two reasons."

First, the approach "fails to recognize knowledge accumulation." The author goes on to say that although in a strictly philosophical sense scientific knowledge can never be known with absolute certainty, "we also recognize that some knowledge is so firmly established and bolstered by so many facts that the chance that we are wrong is very much less than the chance of winning the lottery several times in a row." It's important to note that we can estimate with some accuracy what our chance is of being correct.

I would also add that even when we use statistics to 'disprove' a hypothesis, that falsification is associated with its own probability level. In actual practice, it's much easier to show that patterns exist and processes happen than it is to show the opposite, because if a pattern is not found it's generally impossible to know if it's because it shouldn't be found or because the research approach was inadequate.

Second, science often isn't concerned with falsification, and instead asks questions about the relative importance of processes, and this doesn't fit the Popperian framework. This is generally better science anyways.

- from Gurevitch et al. 2006. The Ecology of Plants
GameTime
Profile Joined May 2010
United States 222 Posts
August 19 2010 01:01 GMT
#186
I just don't think 0.5% is enough for me to even consider that the game is imbalanced, if it's only .5% it seems pretty balanced to me.

It would be cool if you showed the winning percentages with all the matchups.
Only the winner deserves to win.
youngminii
Profile Blog Joined May 2010
Australia 7514 Posts
August 19 2010 01:14 GMT
#187
I can't believe how many people are still using the 'matchmaking explains imbalance' argument when the OP fucking explained it (eventually).
lalala
Miros
Profile Joined August 2010
Australia 10 Posts
Last Edited: 2010-08-19 01:24:48
August 19 2010 01:21 GMT
#188
What exactly is the point of analyzing the overall win percentage of the races? It would be much more interesting to see win percentages for each matchup (TvZ, TvP, PvZ). Maybe Terrans win 60% of their matches against Zerg, but less than 50% against Protoss.

Or am I missing something?
eiswand
Profile Joined July 2010
Germany 44 Posts
August 19 2010 01:34 GMT
#189
Considering the game was released only 2 weeks ago, the game is damn well balanced. But some pro gamers think Terran is a little bit too strong, and what happens? Hordes of fanboys and noobs think this is the ultimate truth and start to go on an imbalance crusade.

Just today I met a Protoss player who said "Terran is so fucking imba". I asked him why. His answer: "Dunno, they all say it in the forums"........

Again: No RTS will ever be 100% balanced. NEVER. But considering the game is only 2 weeks in stores the balancing is DAMN GOOD. Now stop whining and enjoy the game.
TanGeng
Profile Blog Joined January 2009
Sanya 12364 Posts
August 19 2010 01:37 GMT
#190
Ugh...

Science can prove things with observation and experiment. With science you can even observe the processes and phenomena at work. With statistics, you have none of this. It's merely input states and observed states. All phenomena and mechanisms are ignored.

With statistics, you can reach near certainty on certain specific statements, and those statements are very very specific. In general, science has massively abused statistics by jumping to conclusions that do not match the very specific statements being shown to be near certain.

When there is a p score of 1e-12, then it's with near certainty that it's not by pure chance. By assuming the negative then showing the negative to have a low p score, the conclusion is merely that a false positive is unlikely given your assumption. But it doesn't say anything about the probability of the false positives given a positive test result. Nor does it factor in all false positives - some of which have plausible alternate explanations.

Based on your data, you haven't even shown that there is imbalance. It is merely that the populations of players favoring zerg, protoss, and terran have observably different winning percentages across the various leagues, and that if it were a given that the game was perfectly balanced and the populations at each level possessed the exact same skill sets, it'd be hard to produce the observed winning percentages purely by chance. That is all the statistics tell you.

This is an extremely minimalistic conclusion and of no real value at all. There are some conclusions you can deduce from that by looking at it logically, but there isn't much there.
Moderator. We are a down-to-earth club built on a sponsorship model.
mierin
Profile Joined August 2010
United States 4943 Posts
Last Edited: 2010-08-19 02:19:35
August 19 2010 02:17 GMT
#191
How many of you have watched G2 of TLO vs MadFrog in the IEM tournament?
Madfrog played perfectly and still got steamrolled. TLO was even behind by a significant margin economically due to MF's counter, yet by the grace of MULEs managed to completely own MF.

EDIT: I'm even somewhat of a TLO fanboy, and have a problem with this. Both games of the series are really telling IMO.
JD, Stork, Calm, Hyuk Fighting!
GagnarTheUnruly
Profile Joined July 2010
United States 655 Posts
August 19 2010 02:47 GMT
#192
I'm sorry, but I'm having a lot of trouble understanding what you're trying to say. I'll give it my best shot, though. Also, please stop prefacing all your posts with 'ugh...' If you're frustrated by what I'm saying, it's possible that you are the one who's missing something, and not me.

I think you're creating a little bit of a false dichotomy between science and statistics. Science is a method of obtaining understanding, and statistics is an important mathematical tool that scientists use. You're right that statistical methods ask and answer very specific questions, and that assumptions of certain tests can limit inference, but statistics isn't the only means of scientific inference. We also use logic and theoretical understanding to interpret statistical results. In fact, doing this is necessary in order to achieve scientific progress. I also think it's very unfair to state that science in general has jumped to conclusions and abused statistics. If you're going to make such sweeping statements you should present examples.

When there is a p score of 1e-12, then it's with near certainty that it's not by pure chance. By assuming the negative then showing the negative to have a low p score, the conclusion is merely that a false positive is unlikely given your assumption. But it doesn't say anything about the probability of the false positives given a positive test result. Nor does it factor in all false positives - some of which have plausible alternate explanations.


Here I'm not sure what you're saying. In this case the null hypothesis was no difference in win rates. My data suggest a significant but very small departure from the null hypothesis (a positive result). The chance of a false positive is the type 1 error, and is equivalent to the p-value. The 'negative' doesn't have a low p-score... a negative result would have a high p-score. I have not calculated the chance of a false negative, and I'm not interested in that question because I haven't seen a negative result. Also, I'm not sure what you mean by there being other false positives. There's only one test statistic and the chance of a false positive is almost zero. Are you talking about other explanations?

Also, the results don't involve the population of zerg, protoss and terran players -- it's the results of zerg, terran, and protoss games that I measured. Players aren't accounted for and aren't important given my reasoning described in other posts. You're right about the results, they show with near certainty that there's a small departure from the null hypothesis (which assumes the first condition you cite but not the second one). And you're right that there's no more that the statistics tells us. Which is why we move from statistical inference to logical inference. Then we learn that the results mean that the game is well balanced.
GagnarTheUnruly
Profile Joined July 2010
United States 655 Posts
August 19 2010 02:52 GMT
#193
On August 19 2010 11:17 mierin wrote:
How many of you have watched G2 of TLO vs MadFrog in the IEM tournament?
Madfrog played perfectly and still got steamrolled. TLO was even behind by a significant margin economically due to MF's counter, yet by the grace of MULEs managed to completely own MF.

EDIT: I'm even somewhat of a TLO fanboy, and have a problem with this. Both games of the series are really telling IMO.


You totally freaked me out! I had a different TLO v MadFrog game going in a diff. window, and I thought you were sending some freaky evil post letting me know you were hacking me. Then I realized you were talking about a different game and why you brought it up...

It does seem that at high level play a lot of players feel zerg is too weak. But why is it doing so well in Asia?
andyrichdale
Profile Joined April 2010
New Zealand 90 Posts
August 19 2010 03:03 GMT
#194
On August 19 2010 09:44 GagnarTheUnruly wrote:
On August 19 2010 08:48 TanGeng wrote:
Ugh... stats are good for evidence of phenomena. They aren't proof of phenomena. Stats are better at disproving BS theories than they are at proving theories. They are also better at capturing the empirical outcome of an unknown phenomenon without explaining why.


Phenomena are proof of themselves. Mechanisms require validation ... In actual practice, it's much easier to show that patterns exist and processes happen than it is to show the opposite, because if a pattern is not found it's generally impossible to know if it's because it shouldn't be found or because the research approach was inadequate.


TanGeng got owned

Nice work OP
TanGeng
Profile Blog Joined January 2009
Sanya 12364 Posts
Last Edited: 2010-08-19 04:06:56
August 19 2010 03:14 GMT
#195
On August 19 2010 11:47 GagnarTheUnruly wrote:
Also, the results don't involve the population of zerg, protoss and terran players -- it's the results of zerg, terran, and protoss games that I measured. Players aren't accounted for and aren't important given my reasoning described in other posts. You're right about the results, they show with near certainty that there's a small departure from the null hypothesis (which assumes the first condition you cite but not the second one). And you're right that there's no more that the statistics tells us. Which is why we move from statistical inference to logical inference. Then we learn that the results mean that the game is well balanced.


It's about the games? I don't see game statistics by race at sc2ranks. It's grouped by player and only their favorite race is shown. Anyone with a favorite race can play a smaller number of games as any of the other races (including random). There is no insight into the exact number of games won or lost with any of the race selections. You would have to assume that they played their favorite race exclusively.

You would also have to assume a perfectly even skill distribution among player populations if you want perfectly matching win rates - unless you want to assume that players didn't actually pick their favorite races and were assigned one randomly by battlenet.

When there is a p score of 1e-12, then it's with near certainty that it's not by pure chance. By assuming the negative then showing the negative to have a low p score, the conclusion is merely that a false positive is unlikely given your assumption. But it doesn't say anything about the probability of the false positives given a positive test result. Nor does it factor in all false positives - some of which have plausible alternate explanations.


This is basic Bayesian logic. Let's call perfect balance with equally skilled players your negative condition, and an imbalanced game your positive condition. Your positive test is a significant difference in win rates among the populations of players grouped by favorite race. The scenarios covered by your assumption of perfect balance with equally skilled players overlap a bit with the scenarios where you get differences in win rates.

How likely do you think you will have a game with perfect balance with equally skilled players anyways?
If your estimate is close to 1, then the overall chance that you got a false positive increases while your chance of a true positive decreases. Your probability of a false positive given a positive test result is high. (In the case of rare diseases, doctors often ask for a confirmation test since the false positive rate is nearly the same as the true positive rate.)

If it's close to 0, then your false positive chances arising from a perfectly balanced, equally skilled scenario were nearly nil anyway, so why would you care about it at all to begin with? You want to eliminate other reasons why you might register a false positive for imbalance instead.
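
The two cases above can be worked through with Bayes' rule. This is only a sketch: the function name and the numbers fed to it (the prior, the false positive rate `alpha`, and the detection rate `power`) are illustrative assumptions, not values estimated from the ladder data.

```python
def prob_false_positive_given_positive(prior_null, alpha, power):
    """P(null is true | test came back positive), by Bayes' rule.

    prior_null: prior probability that the null (perfect balance, equal skill) holds.
    alpha:      P(positive test | null true), the false positive rate.
    power:      P(positive test | null false), the true positive rate.
    """
    false_positives = alpha * prior_null
    true_positives = power * (1.0 - prior_null)
    return false_positives / (false_positives + true_positives)

# Prior close to 1: most positives are false positives (the rare-disease case).
print(prob_false_positive_given_positive(prior_null=0.999, alpha=0.05, power=0.95))

# Prior close to 0: a positive result is almost never a false positive.
print(prob_false_positive_given_positive(prior_null=0.001, alpha=0.05, power=0.95))
```

The same test, with the same error rates, gives very different answers depending on how plausible the null was to begin with, which is the point being made about why the prior matters.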
Moderator. We are a down-to-earth club built on a sponsorship model.
hdkhang
Profile Joined August 2010
Australia 183 Posts
August 19 2010 05:54 GMT
#196
On August 19 2010 09:10 cocosoft wrote:
For those who say that the matchmaking system pushing players toward 50-50 makes the data he is using incorrect, READ what he wrote:
On August 17 2010 07:13 GagnarTheUnruly wrote:
People have pointed out that matchmaking would cause this to happen, because it strives to set each player's win rate at 50%. That in turn would cause the win rate of each race to trend towards 50%. That being the case, poor balance would tend to result in 'weak' races getting pushed into the lower tiers of play. Because we don't see that happening either within or among leagues (data not shown), my data suggest both that the matchmaking system works well and that SC2 is inherently pretty well balanced.
Agreeing with this.



Really? The OP is wrong, there are no two ways about it.

I'll repost here what I wrote back on page 8... which, by the way, the OP conveniently ignores. Pay careful attention to the points regarding MMR and how they are used by the AMM to produce the exact result that Blizzard want you to see, which is the exact result that the data shows. It is a SELF-FULFILLING PROPHECY! All the math in the world, no matter how fancy you try to be, won't be of any use since you are using faulty data/metrics to try and prove a point. It would be like me trying to show that the data is clean by using only "cleaned" data and filtering out the "dirty" data. Blizzard have "cleaned" the data, it's plain and simple to see.


The problem is not in your methodology as much as it is in the assumptions you are making about the system.

What do we know about the system?

Anyone who wishes to play on the ladder is accepted

* You can have zero knowledge of the game and yet you will be placed in Bronze (in sports, that makes you 3rd place!); the only prerequisite for acceptance into a league is to just show up 5 times (you can even disconnect 5 times once the game starts and still get in).
* This is an automatic invalidation of any results emerging from the Bronze league as the range of skill present is astronomical!

You only play matches against people in your region

* Comparisons across different battle.net servers are worthless.

The hidden MatchMakingRating number is based only on wins/losses with respect to the current MMR of yourself and your opponent

* What was required to score that win is not considered at all, nor should it be. Unfortunately, this makes it all the more important that the game be as balanced as possible, or else the MMR is worthless.
* This point also explains why the data is "practically worthless".

The AMM will attempt to pair you up with a person with a similar MMR

* Note that I say similar MMR and not similar skill/ability.
* If a race imbalance existed, the MMR would not reveal it since it would simply consider the person using the weaker race as a "poorer" player hence a lower MMR and the person using an OP race as a better player rewarding them with a higher MMR.
* Thus if you were to compare even between players of similar MMR but across the three races, it would reveal nothing of significance since the reason they were given that MMR is due to their win/loss performance against the same people they are being compared against.
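
The claim that a rating system absorbs a race imbalance can be explored with a toy simulation. Everything below is an assumption made for illustration: Blizzard's actual MMR is hidden, so a plain Elo update stands in for it, matchmaking simply pairs neighbours in rating order, and the "strong" race gets a fixed hypothetical bonus on top of player skill. Each strong-race player has a weak-race twin of identical skill, so any rating gap between the groups comes from the race bonus alone.

```python
import random

def expected_score(rating_a, rating_b):
    """Standard Elo expected score for A against B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def simulate(n_pairs=100, rounds=400, race_bonus=100.0, k=16.0, seed=1):
    rng = random.Random(seed)
    players = []
    for _ in range(n_pairs):
        skill = rng.gauss(1500.0, 200.0)  # hidden skill, shared by both twins
        for race in ("strong", "weak"):
            players.append({"race": race, "skill": skill,
                            "rating": 1500.0, "wins": 0, "games": 0})

    for rnd in range(rounds):
        rng.shuffle(players)                     # random tie-breaking
        players.sort(key=lambda p: p["rating"])  # pair neighbours by rating
        for a, b in zip(players[0::2], players[1::2]):
            strength_a = a["skill"] + (race_bonus if a["race"] == "strong" else 0.0)
            strength_b = b["skill"] + (race_bonus if b["race"] == "strong" else 0.0)
            a_wins = rng.random() < expected_score(strength_a, strength_b)
            # The rating update only sees ratings and the result, never the race.
            delta = k * ((1.0 if a_wins else 0.0) - expected_score(a["rating"], b["rating"]))
            a["rating"] += delta
            b["rating"] -= delta
            if rnd >= rounds // 2:  # record results only after ratings have settled
                a["games"] += 1
                b["games"] += 1
                (a if a_wins else b)["wins"] += 1

    summary = {}
    for race in ("strong", "weak"):
        group = [p for p in players if p["race"] == race]
        summary[race] = {
            "win_rate": sum(p["wins"] for p in group) / sum(p["games"] for p in group),
            "mean_rating": sum(p["rating"] for p in group) / len(group),
        }
    return summary

result = simulate()
for race, stats in result.items():
    print(race, round(stats["win_rate"], 3), round(stats["mean_rating"], 1))
```

In this toy model both races end up winning about half their games, while the imbalance shows up as a gap in average rating between equally skilled players. Whether a comparable gap would be visible in real league data depends on how skill is distributed across races, which the simulation fixes by construction.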

Not everyone will play the same number of games

* You may say "well duh" but it bears repeating.
* I think in the original top 200 list one of the players that made it had only played a handful of games, I think it was 7 all up, yet 7 games was enough for the system to determine their MMR to be one of the top 200 on the server.
* I honestly believe that a more stringent pre-requisite for diamond league is needed, e.g. 100 games played.

There are many other points to make, but let's just start with the above for now.
texmix
Profile Joined May 2010
United States 106 Posts
August 19 2010 12:51 GMT
#197
On August 19 2010 06:18 Hidden_MotiveS wrote:
On August 19 2010 04:59 texmix wrote:
As others have stated, the OP is based on a flawed methodology. If trying to use stats to figure out which race is overpowered, 4 items need to be controlled:
1. Homogeneous skill in race choice (maybe old BW semi-pro's just gravitate towards Terran in sc2)
2. The matchmaking system instead of random opponents
3. Player MU difference (one player may, in the long run, win 60% pvt, another lose 60% pvt)
4. Player MU skill changes over time (maybe a day9 video will change pvt win stats by several bps in a single week)

To control all 4 of these I suggest mining for at least 1,000 players that:
1. Have played over 200 games
2. Played at least 30 games in the last 72 hours
3. Are in the diamond league

From this list, throw out all games involving a random player (less consistent MU performance) and everything older than the most recent 30 games, and calculate the group's median win ratio using the most recent 30 games. Keep the 100 players of each race with win ratios closest to the median win ratio and throw out the other 700 players' games. For instance, if the 1,000 players have win ratios ranging from 35% to 90% (in the most recent 30 games), with a median of 55%, then pick the 100 zerg, protoss, and terran players who are closest to 55%. From the remaining 300*30 games, a simple win/loss record for each MU will be about the best possible indication of imbalance I believe data mining can come up with (short of using the same methodology with more games or tweaked ratios).

I wanted to say this, but feared the backlash of "NO We has psience we is wright". The methodology of the observational study is flawed in a few ways. For one, I don't think you are considering any confounding variables such as how the ranking system comes into effect. If one race is overpowered then it's simple to assume that it will be overrepresented in relation to its total population within the top of diamond rank only. But this could also be confounded by how people think Terran is the strongest race, so the more serious players switch over to that race thinking this is true. In addition, the sample sizes here are very small.

I would like to hear what a statistician, or Blizzard statistician has to say about the data.

edit: Oh I see, the OP understands that the matchmaking systems kind of voids his analysis. I'm sorry if I sounded harsh. Great effort put into this.


I am a statistician and stand by that methodology as a reasonable indicator of racial balance.

The win rate does not prove imbalance. Would a "perfectly balanced" Starcraft 2 have perfect 50% win rates? No.

It absolutely would assuming a control for skill and the matchmaking system which can be approximated in the study.
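
The sampling procedure quoted above can be sketched roughly as follows. The record layout (`race`, `league`, `total_games`, `games_last_72h`, `results`) is a hypothetical one invented for this example; sc2ranks does not necessarily expose data in this form. The sketch also simplifies one step: it drops players whose race is random rather than individual games against random opponents, and it stops before the final per-matchup win/loss tally.

```python
from statistics import median

RACES = ("terran", "protoss", "zerg")

def recent_win_ratio(player, window=30):
    """Win ratio over the player's most recent `window` games (True = win)."""
    results = player["results"][-window:]
    return sum(results) / len(results)

def select_sample(players, per_race=100, window=30):
    """Pick, for each race, the eligible players closest to the median win ratio."""
    eligible = [
        p for p in players
        if p["league"] == "diamond"
        and p["total_games"] > 200
        and p["games_last_72h"] >= window
        and p["race"] in RACES              # drops random players
        and len(p["results"]) >= window
    ]
    if not eligible:
        return {race: [] for race in RACES}
    target = median(recent_win_ratio(p, window) for p in eligible)
    sample = {}
    for race in RACES:
        candidates = [p for p in eligible if p["race"] == race]
        candidates.sort(key=lambda p: abs(recent_win_ratio(p, window) - target))
        sample[race] = candidates[:per_race]
    return sample
```

Matching players on recent win ratio is what stands in for a skill control here; the per-matchup records would then be tallied only from the games of the selected players.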
Anomandaris
Profile Joined July 2010
Afghanistan 440 Posts
August 19 2010 13:01 GMT
#198
statistics != science
TanGeng
Profile Blog Joined January 2009
Sanya 12364 Posts
August 19 2010 15:02 GMT
#199
One question that I always wanted to ask, what is your definition of imbalance?

On August 19 2010 08:12 GagnarTheUnruly wrote:
I have reasoned that if the game is imbalanced, that imbalance must manifest as either 1) a difference in win ratios for the different races or 2) a difference in race prevalence as you increase player skill level, except under one of several unlikely scenarios and one likely one.

The degree to which it shifts from condition 1) to condition 2) depends on the strength of the matchmaking system. Since we don't see 1) (as my data show), and we don't see 2) (as I have said and escapeartist has shown), we can conclude that the game is balanced, at least for regular league play.

I'm not sure how you can be confident of 2.

Win ratios should be controlled by point levels in a nice random matchmaking system. In your study, you see evidence of higher win rates for players of higher skill levels. Skill level is continuous (not all players in the same league are of the same level). Average win ratios will not match in a league unless the skill distribution of the races in that league allows for that.

First element in figuring out the skill distribution is selection biases. You have to figure out who chooses certain races and why. The population that picks zerg and the population that picks terran as their favorite are different unless you can prove otherwise.

The second element in figuring out the skill distribution is the learning curve. A difference in racial prevalence at any particular skill level is more a function of how steep the learning curve is relative to normal at that particular skill level.

A skill ceiling is any point where skill curve is really steep.
Moderator. We are a down-to-earth club built on a sponsorship model.
escapeArtist
Profile Joined August 2010
Norway 2 Posts
Last Edited: 2010-08-19 16:05:20
August 19 2010 16:03 GMT
#200
On August 20 2010 00:02 TanGeng wrote:

First element in figuring out the skill distribution is selection biases. You have to figure out who chooses certain races and why. The population that picks zerg and the population that picks terran as their favorite are different unless you can prove otherwise.

A skill ceiling is any point where skill curve is really steep.


It may be true that different people pick different races, but this is not necessarily relevant to the imbalance question. Unless you prove that personality equals skill or that skill equals race, I don't see why this is relevant.

I also don't see how skill curve equals imbalance. I do agree that zerg needs higher APM than, say, terran, but if they perform at the same level then I still don't see any imbalance issues, since unless proven otherwise we must assume that they have hit the relative skill cap when playing in the upper diamond level. As I have shown, in the upper diamond level Zerg is gaining in population. And without any intelligent discussion we can safely assume that random takes more skill than ALL the other races. Even random players are gaining in popularity in the lower diamond league. This alone strongly supports my statement that difficulty is not equal to balance.

My point here is that there are soooo many people crying imbalance, but still I have seen no evidence of this. OP has tried to find evidence of this, and I have tried to find evidence of this, but both of us came up with nothing. As a result, both of us seem to be leaning towards thinking that the game is balanced.

I personally think that you are going about it the wrong way if you are trying to tell us that statistics is not the right way to do it. Let's just say it's your turn to try and prove the imbalance. Or at least give us some new data sources. After all, you seem to be convinced that there is imbalance, but all you seem to base it on is personal opinions.

If I understand you correctly, then the race in question (Zerg) is picked by the best players, since they seem to perform at all levels of play, and poor players (Terran) are performing at an equal level to them because of the imbalance.

I find this very hard to swallow considering we are dealing with over 500 000 players. Also, why would the best players do this? I have seen no reasoning as to why all the "best" players would pick the "worst" race.

To my experience I must say that if it smells like shit, looks like shit and tastes like shit. It's probably shit....
silencesc
Profile Joined July 2010
United States 464 Posts
August 19 2010 16:25 GMT
#201
On August 19 2010 08:12 GagnarTheUnruly wrote:
I was wondering if you had the p-value for the chi-square test, I didn't see a null hypothesis or whether or not the results were statistically significant..


p-values are nearly zero for all leagues except silver, where it is above 0.2 or 0.3 if memory serves. That's with a n = ~50 million.

Stats minor here. I redid (I hope) your results with the numbers you put in here, with a chi-square GOF test for homogeneity. In my findings, the only league that is statistically imbalanced is bronze, based solely on the W/L you reported (the P-value at the end was about .12, higher than the .05 I set as the requirement for the null hypothesis to be overruled). Silver to Diamond, it didn't go above .03, which essentially means that after bronze, no race has a W/L record that is significantly different from any other race.


Unless I'm misunderstanding something, that's the test I've done, but with a different sample size. If you use percentages as your counts than your sample size is 100, unless you've adjusted them for the actual sample size (as I have). For a chi-square test it's important that the correct sample size be used.
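
The sample-size point can be checked with a small sketch. The win rates and game counts below are hypothetical, chosen to resemble the figures discussed in this thread; the function assumes three races with an equal number of games each, so the test has 2 degrees of freedom and the chi-square tail probability is exactly exp(-x/2).

```python
import math

def chi_square_from_rates(win_rates, games_per_race):
    """Chi-square test of homogeneity from win rates and a common sample size.

    win_rates:      three win rates as fractions (hypothetical values).
    games_per_race: number of games behind each rate, assumed equal across races.
    Returns (chi2, p_value).
    """
    if len(win_rates) != 3:
        raise ValueError("this sketch only handles three races (2 degrees of freedom)")
    pooled = sum(win_rates) / len(win_rates)
    chi2 = 0.0
    for rate in win_rates:
        squared_gap = games_per_race * (rate - pooled) ** 2
        chi2 += squared_gap / pooled          # contribution of the win cell
        chi2 += squared_gap / (1.0 - pooled)  # contribution of the loss cell
    return chi2, math.exp(-chi2 / 2.0)

rates = (0.5606, 0.5556, 0.5581)

# Treating percentages as counts is the same as assuming 100 games per race.
print(chi_square_from_rates(rates, games_per_race=100))

# The same rates backed by a million games per race.
print(chi_square_from_rates(rates, games_per_race=1_000_000))
```

The chi-square statistic grows in proportion to the number of games, so identical percentages can be nowhere near significant at n = 100 and overwhelmingly significant at n in the millions.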

@ all the people who think my results are meaningless due to matchmaking:

I've said this a lot by now, but I'll state it again for the last time here:

I have reasoned that if the game is imbalanced, that imbalance must manifest as either 1) a difference in win ratios for the different races or 2) a difference in race prevalence as you increase player skill level, except under one of several unlikely scenarios and one likely one.

The degree to which it shifts from condition 1) to condition 2) depends on the strength of the matchmaking system. Since we don't see 1) (as my data show), and we don't see 2) (as I have said and escapeartist has shown), we can conclude that the game is balanced, at least for regular league play.

The unlikely scenarios:

a. Blizzard's matchmaking system is wise to racial imbalances, and chooses lower level opponents for a player of a given ranking if they play as a weak race vs. a strong race. The only reason they would do this would be to 'hide' racial imbalances from the player and/or the community.

b. Blizzard's matchmaking system does nothing, and each league is a random sample of the regional player population.

c. People have no race loyalty, and randomly pick their race before each match.

The likely scenario:

d. The races are balanced overall but matchups are imbalanced, in a rock-paper-scissors fashion. I favor protoss, and I really feel that I struggle against Terran.

@ all the people who think the test is inappropriate because I haven't modeled enough variables that affect win rate

I don't have access to data that will allow me to do that. I'd like to, but I can't. In science, when that's the case, you have to look for other ways to test a question. In my case, I initially reasoned that an imbalance would lead to a difference in win rates among races. People immediately pointed out that that wasn't the case, due to matchmaking. However, I then realized that the matchmaking system would force weak races into the lower leagues.

I checked to see if that happened and amended my analysis with a graph showing that it doesn't. Escapeartist has since analyzed this in more detail and come to the same conclusion, although nobody to my knowledge has done an analysis for lower league play. People have shown, however, that it's not true for the top hundred or so players in each region.

@ the people who think stats are useless

It's been shown before that stats are a much better way of assessing the truth than anecdotal knowledge. Even experts often have misperceptions, and misperceptions often produce feedback loops. Stats are at least partially resistant to this.

That said, I think opinions and impressions of top-level players (IdrA's thoughts on high level ZvT matchups, e.g.) still warrant attention, and consistently held beliefs warrant scientific investigation. In fact, that's what I did with respect to win rates for league play!

Finally, thanks everyone for your interest! I'll keep trying to answer questions but I know I'll miss some and for that I apologize.



If the p-value is nearly zero, you have to say it's balanced based on your results. Unless it's .05 about zero, you cannot say that you have voided your null hypothesis, so how did you come to the conclusion that the game is imbalanced?
Real Men Proxy Gate | TEAM LIQUID HWITINGGGG!! PROUD MEMBER OF UC DAVIS CSL TEAM | "If you don't give a shit about what gum you eat, buy Stride" - Liquid`Tyler on SotG 4/19/2011
humansherdog
Profile Joined April 2010
Canada85 Posts
August 19 2010 16:31 GMT
#202
On August 17 2010 07:16 GagnarTheUnruly wrote:
The sample size is so vast that random chance can't explain the differences, but at the same time they are so small as to be meaningless.
Oh god this.
TanGeng
Profile Blog Joined January 2009
Sanya12364 Posts
Last Edited: 2010-08-19 17:02:34
August 19 2010 16:59 GMT
#203
On August 20 2010 01:03 escapeArtist wrote:
I also don't see how skill curve equals imbalance. I do agree that zerg needs higher apm than, say, terrans, but if they perform at the same level then I still don't see any imbalance issues, since unless proven otherwise we must assume that they have hit the relative skill cap when playing in the upper diamond level. As I have shown, in the upper diamond level Zerg is gaining in population. And without any intelligent discussion we can safely assume that random takes more skill than ALL the other races. Even they are gaining in popularity in the lower diamond league. This alone strongly supports my statement that difficulty is not equal to balance.


If imbalance is performing at different level when at the same rating level on the ladder then you have proven that the ladder works. This definition of imbalance is trivial.

If imbalance is sharply different learning curves where one race is harder at one point or all points, then you haven't proven anything about that. If imbalance is having to apply a different body of skill sets, then I would think that would be true by definition since the races play differently.

If imbalance is the difference in win rates that you get when all races are being played optimally - at the very very end of the learning curve, you will never get it with statistics because it's retrospective and only captures the history of the metagame.

If imbalance is the difference in win rates between the best that the metagame has to offer, then you shouldn't look at anything below upper diamond, because those players aren't at the level of what the metagame has to offer.

My thesis is you haven't shown anything substantive to be true on the issue of imbalance. This tool you are using isn't convincing and shouldn't be convincing.


On August 17 2010 07:16 GagnarTheUnruly wrote:
The sample size is so vast that random chance can't explain the differences, but at the same time they are so small as to be meaningless.

You CANNOT interpret it this way. This is a huge leap in logic unsubstantiated by your statistical tool.
Moderator | We are a down-to-earth club run on a sponsorship model
TanGeng
Profile Blog Joined January 2009
Sanya12364 Posts
August 19 2010 17:31 GMT
#204
I'm going to call pure BS on this "scientific" process. It's a simple failure in logic and you're trying to pawn it off on us. When you prove P implies Q, what you have also proven is the contrapositive, ~Q implies ~P.

Now you attach some statistical mumbo-jumbo to your P implies Q and expect us to believe the converse, Q implies P? Fuck that.
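The distinction between contrapositive and converse can be checked mechanically with a truth table. This is only a sketch of the logic itself; it takes no position on which form the original post actually used, and the helper name `implies` is made up for the example.

```python
from itertools import product

def implies(p, q):
    """Material implication: 'p implies q' is false only when p is true and q is false."""
    return (not p) or q

rows = list(product([False, True], repeat=2))

# The contrapositive (~Q implies ~P) agrees with (P implies Q) on every row.
contrapositive_equivalent = all(
    implies(p, q) == implies(not q, not p) for p, q in rows
)

# The converse (Q implies P) does not: collect rows where P implies Q holds
# but Q implies P fails.
converse_counterexamples = [
    (p, q) for p, q in rows if implies(p, q) and not implies(q, p)
]

print(contrapositive_equivalent)
print(converse_counterexamples)
```

The single counterexample row is P false, Q true: the consequence is observed even though the premise does not hold.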
Moderator | We are a down-to-earth club run on a sponsorship model
texmix
Profile Joined May 2010
United States106 Posts
Last Edited: 2010-08-19 19:19:37
August 19 2010 19:10 GMT
#205
If every terran unit suddenly had +25% hit points (making them obviously overpowered), the original post methodology would still conclude the races are equal and have the same 2 pages of statistical garbage backing it up.

A simple way to get a better result would be taking the top couple hundred players who choose Random, downloading their match histories, and looking at how often each of them wins as P, T, and Z. The results would at least be closer to something meaningful (though that doesn't say much).
Yuka
Profile Joined June 2010
United States133 Posts
Last Edited: 2010-08-19 19:34:59
August 19 2010 19:34 GMT
#206
On August 20 2010 04:10 texmix wrote:
If every terran unit suddenly had +25% hit points (making them obviously overpowered), the original post methodology would still conclude the races are equal and have the same 2 pages of statistical garbage backing it up.

A simple way to get a better result would be taking the top couple hundred players who choose Random, downloading their match histories, and looking at how often each of them wins as P, T, and Z. The results would at least be closer to something meaningful (though that doesn't say much).


As a Random player myself (and looking at all the requests for Random player data throughout the thread) I at first considered that to be a better metric. However after further thought, I don't think that data would be as useful because:

a) true, purely Random players seem to be something of an outlier in the overall data
b) most Random players are not equally skilled with all three races, a key assumption that would have to be valid for the data to be useful

So as you say, it'd be meaningful, but not by much more.

Kudos to OP by the way for trying this undertaking. At the bare minimum, it has generated some interesting discussion and serious thought.
Race? No, I'm equally bad with all of them.
TanGeng
Profile Blog Joined January 2009
Sanya12364 Posts
August 19 2010 20:06 GMT
#207
Random players can have selection bias. Basically, the possibility is that it may take a special kind of player to play and want to play as random, and by using randoms, you're taking an unrepresentative sample.

As a case study it's better than looking at the different players playing different races, but it falls far from a controlled experiment. The experiment would be choosing players at the same skill point (e.g. platinum 500) and forcing those players to play random for a certain number of games, and looking at the results. This is still flawed since it will also measure how well skill or lack thereof transfers between the races for those players dedicated to playing a single race.

As constructed, I doubt these statistics will answer any questions about balance in sc2 when a non-trivial definition is used, at least not without a lot more work.


Moderator | We are a down-to-earth club run on a sponsorship model
socal50
Profile Joined July 2010
United States93 Posts
August 19 2010 20:08 GMT
#208
i think the graph shows its balanced more than anything, despite terrans having a slight edge
the little imbalance could be due to random variation
nihlon
Profile Joined April 2010
Sweden5581 Posts
Last Edited: 2010-08-19 20:19:38
August 19 2010 20:14 GMT
#209
On August 19 2010 08:12 GagnarTheUnruly wrote:
d. The races are balanced overall but matchups are imbalanced, in a rock-paper-scissors fashion. I favor protoss, and I really feel that I struggle against Terran.


That's retarded logic. How can you consider a game functioning in the rock paper and scissor fashion balanced overall? If one matchup is imbalanced, so is the overall balance. Which is what most people are arguing.

Does this sound like a reasonable argument?

Blizzard: "The game is balanced perfectly overall"
Players: "How the hell can you say that when I can't win against the X race?"
Blizzard: "Yeah, but we are talking about overall balance here..."
Banelings are too cute to blow up
Gentlebite
Profile Joined May 2010
United States132 Posts
August 19 2010 20:18 GMT
#210
Factors including player skill level, the amount of the Race in population, this shows matchmaking is balanced but doesn't signify any gameplay thingies
andyrichdale
Profile Joined April 2010
New Zealand90 Posts
Last Edited: 2010-08-19 21:19:16
August 19 2010 21:18 GMT
#211
On August 20 2010 04:10 texmix wrote:
If every terran unit suddenly had +25% hit points (making them obviously overpowered), the original post methodology would still conclude the races are equal and have the same 2 pages of statistical garbage backing it up.


What?

If Terran units had 25% extra hit points then Terrans would win considerably more of their matches than they currently do. This would reflect in a win% increase to the point where it's in the "considerably higher than expected" region which would lead to the conclusion that Terrans are over powered.
andyrichdale
Profile Joined April 2010
New Zealand90 Posts
August 19 2010 21:20 GMT
#212
On August 20 2010 05:14 nihlon wrote:
On August 19 2010 08:12 GagnarTheUnruly wrote:
d. The races are balanced overall but matchups are imbalanced, in a rock-paper-scissors fashion. I favor protoss, and I really feel that I struggle against Terran.


That's retarded logic. How can you consider a game functioning in the rock paper and scissor fashion balanced overall? If one matchup is imbalanced, so is the overall balance. Which is what most people are arguing.



He's just saying that the average win% of each race against every other race is pretty even. Whether or not a game is acceptable given imbalances in certain matchups is another question altogether really.
ParasitJonte
Profile Joined September 2004
Sweden1768 Posts
Last Edited: 2010-08-19 21:23:47
August 19 2010 21:23 GMT
#213
On August 20 2010 05:14 nihlon wrote:
On August 19 2010 08:12 GagnarTheUnruly wrote:
d. The races are balanced overall but matchups are imbalanced, in a rock-paper-scissors fashion. I favor protoss, and I really feel that I struggle against Terran.


That's retarded logic. How can you consider a game functioning in the rock paper and scissor fashion balanced overall? If one matchup is imbalanced, so is the overall balance. Which is what most people are arguing.

Does this sound like a reasonable argument?

Blizzard: "The game is balanced perfectly overall"
Players: "How the hell can you say that when I can't win against the X race?"
Blizzard: "Yeah, but we are talking about overall balance here..."


You're arguing over semantics. He didn't pass any judgement on whether it was a good thing or not. He simply stated that it was a likely scenario. What you then call it, really doesn't matter.
Hello=)
nihlon
Profile Joined April 2010
Sweden5581 Posts
August 19 2010 21:25 GMT
#214
On August 20 2010 06:20 andyrichdale wrote:
On August 20 2010 05:14 nihlon wrote:
On August 19 2010 08:12 GagnarTheUnruly wrote:
d. The races are balanced overall but matchups are imbalanced, in a rock-paper-scissors fashion. I favor protoss, and I really feel that I struggle against Terran.


That's retarded logic. How can you consider a game functioning in the rock paper and scissor fashion balanced overall? If one matchup is imbalanced, so is the overall balance. Which is what most people are arguing.



He's just saying that the average win% of each race against every other race is pretty even. Whether or not a game is acceptable given imbalances in certain matchups is another question altogether really.


I know what he is saying, it just makes very little sense in using that kind of logic when discussing balance. Saying the game is balanced overall, just because of that fact is just pointless. Who wants to play a rock paper and scissor game?
Banelings are too cute to blow up
nihlon
Profile Joined April 2010
Sweden5581 Posts
Last Edited: 2010-08-19 21:33:43
August 19 2010 21:27 GMT
#215
On August 20 2010 06:23 ParasitJonte wrote:
On August 20 2010 05:14 nihlon wrote:
On August 19 2010 08:12 GagnarTheUnruly wrote:
d. The races are balanced overall but matchups are imbalanced, in a rock-paper-scissors fashion. I favor protoss, and I really feel that I struggle against Terran.


That's retarded logic. How can you consider a game functioning in the rock paper and scissor fashion balanced overall? If one matchup is imbalanced, so is the overall balance. Which is what most people are arguing.

Does this sound like a reasonable argument?

Blizzard: "The game is balanced perfectly overall"
Players: "How the hell can you say that when I can't win against the X race?"
Blizzard: "Yeah, but we are talking about overall balance here..."


You're arguing over semantics. He didn't pass any judgement on whether it was a good thing or not. He simply stated that it was a likely scenario. What you then call it, really doesn't matter.


No, it's not just arguing semantics in this case. He has been using that point to argue the game is balanced earlier in the thread, when it's clearly not in such a hypothetical situation. He is the one playing with semantics to prove his own points when he uses the overall win % to prove that the game is balanced. (So yes, he is passing judgement.)
Banelings are too cute to blow up
Stargazer
Profile Joined May 2010
United States10 Posts
Last Edited: 2010-08-19 21:42:28
August 19 2010 21:32 GMT
#216
I did some analysis yesterday of the top 200 players currently on the sc2 ladder and got some interesting results. I agree with the many other posters who have said that analysis should be done in the top levels of play for accurate assessment of any sort on game balance for two reasons.

First, only at high levels of play do balance issues become relevant. Why bother arguing imba in silver if you can just learn how to macro properly to get into gold or platinum?

Secondly, and I think more importantly because people tend to overlook this, it is only at the lowest and highest end of the matchmaking system that the game will show any signs of imbalance among races based on performance. Think of each race's population in sc2 as a ladder--wordplay definitely intended :D--with the highest players at the highest rungs in top diamond and the lowest at the bottom of bronze. Ideally, each of the three ladders (or four if you want to include random) will stand equally tall and have about the same distribution in skill level. If the game is favored toward one race, however, we get a translational effect, where the best of race A are better than the best of race B. Then the good of race A become better than the good of race B and so get matched up with the very good players of race B, and so on. Matchmaking doesn't recognize any 'inherent' skill, only performance. Thus, the players of race A won't have a significantly better performance overall, since the middle population occupies over 99% of the ladder. They will, however, have a stronger performance at the top and bottom (theoretically, but not in reality) of the ladder. Since the lowest end, the worst of the bronze league, won't provide anything useful for us, we need to analyze the high end of the ladder.

As a disclaimer, this is a preliminary analysis. It is by no means exhaustive and it makes no conclusive claims. It's more of a thumbnail look at the trends, as there are no chi-square tests or other tools used to check for statistical relevance except for common sense. If someone wants to run more involved statistics on this data set, by all means please do, but don't criticize it for being too weak to support its conclusions because I am only using it to hint and show correlation. I only speculate about causative factors and I leave evidence of that to further healthy discussion and deeper analysis.

Data taken from http://sc2ranks.com/stats/
From the top 200 players as of yesterday evening, I looked at race frequency, % diamond players of each race in top 200, the mean and median points of each race in the top 200, and the mean points and frequency for each quartile of each race.

Section 1: Race frequencies in the top 200

Top 200 Race
Population-----random--protoss--terran-----zerg
200--------------3------------61--------90----------46

Clearly there are a lot of terran players in the top 200, but let's not jump to any conclusions.

Race breakdown by top 200
Qrtl-----Random--Protoss--Terran--Zerg--Cutoff (points)
1Q-------1-------17-------26------7-----1127
2Q-------1-------15-------22------10----1066
3Q-------0-------13-------21------16----1023
4Q-------1-------16-------21------13----991
overall--3-------61-------90------46----991

Note: random is not considered for later analysis because of its extremely low representation in the top 200.
Also Note: the quartiles didn't break evenly because of point ties at cutoff locations. So we have 51, 48, 50, 51 for the quartile breakdowns.

We see an increasing proportion of the terran players in the top of the top 200, while protoss remains fairly level and zerg shows the opposite trend of terran.


Section 2: Individual race analysis of the top 200

The following charts breakdown performance by each quartile of the individual race's top 200.

Protoss average points of top 200
Qrtle avg pts---size
1Q--1180.25---16
2Q--1093.87---15
3Q--1044.20---15
4Q--1005.14---15
mean------1082.49
median---1074.00

Terran average points of top 200
Qrtle avg pts---size
1Q--1177.55---22
2Q--1100.86---22
3Q--1052.09---23
4Q--1007.61---23
mean------1083.31
median---1073.00

Zerg average points of top 200
Qrtle avg pts---size
1Q--1147.75---12
2Q--1066.09---11
3Q--1037.64---11
4Q--1005.50---12
mean------1064.78
median---1049.00

Race Performance Differentials
Qrtle Pro-Zer__Ter-Zer__Ter-Pro
1Q__32.50___29.80___-2.70
2Q__27.78___34.77___7.00
3Q__6.56____14.45___7.89
4Q__-0.36____2.11____2.47
Mn__17.71___18.53___0.82
Md__25.00___24.00___-1.00
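The differentials table can be re-derived from the per-race quartile averages listed above. This is just a sketch of the subtraction; the dictionary layout and function name are assumptions for the example, and because the published averages are already rounded, a recomputed value can differ from the table by about 0.01.

```python
# Average points per quartile, copied from the three per-race tables above.
avg_points = {
    "Protoss": {"1Q": 1180.25, "2Q": 1093.87, "3Q": 1044.20, "4Q": 1005.14,
                "mean": 1082.49, "median": 1074.00},
    "Terran":  {"1Q": 1177.55, "2Q": 1100.86, "3Q": 1052.09, "4Q": 1007.61,
                "mean": 1083.31, "median": 1073.00},
    "Zerg":    {"1Q": 1147.75, "2Q": 1066.09, "3Q": 1037.64, "4Q": 1005.50,
                "mean": 1064.78, "median": 1049.00},
}

def differential(race_a, race_b):
    """Points of race_a minus points of race_b for each row of the tables."""
    return {
        row: round(avg_points[race_a][row] - avg_points[race_b][row], 2)
        for row in avg_points[race_a]
    }

pro_zer = differential("Protoss", "Zerg")
ter_zer = differential("Terran", "Zerg")
ter_pro = differential("Terran", "Protoss")

for row in avg_points["Zerg"]:
    print(row, pro_zer[row], ter_zer[row], ter_pro[row])
```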

These charts show a bit more in-depth the trends we already noticed in the first section. Terran has a much larger population in the top 200, but they don't dominate in performance compared to protoss. Protoss has a better performance from its top quartile than terran, but terran outperforms protoss in each subsequent quartile. Also, protoss has a slightly higher median while terran has a slight edge on mean.

The most alarming trend is the underperformance of Zerg at this level. Zerg has a deficit of roughly 30 points in average points in each of the top two quartiles, and a much lower mean and median than the other two races.


Section 3: Race representation in top 200 from diamond

Race distribution by league
league----random--protoss--terran--zerg
diamond 4525------15725---13621--10771

Race representation of top 200 from diamond
Race-----------% in top 200 from diamond
Random-------0.066%
Protoss--------0.388%
Terran----------0.661%
Zerg------------0.427%

The only thing to point out here is that a lot more terrans perform at the top 200 level proportionally compared to the other races.
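For anyone who wants to take the representation figures one step further, the following sketch recomputes the percentages above and then runs a chi-square goodness-of-fit test of the top 200 race counts against each race's share of diamond. Two assumptions are made here that are not in the analysis above: random players are left out, and the top 200 is treated as if it were a random draw from diamond. The test says nothing about why the counts differ.

```python
import math

# Counts copied from the tables above (random excluded).
top200 = {"Protoss": 61, "Terran": 90, "Zerg": 46}
diamond = {"Protoss": 15725, "Terran": 13621, "Zerg": 10771}

# Percentage of each race's diamond players that appear in the top 200.
share_in_top200 = {race: 100.0 * top200[race] / diamond[race] for race in top200}

# Expected top 200 counts if each race appeared in proportion to its diamond population.
n_top = sum(top200.values())
n_diamond = sum(diamond.values())
expected = {race: n_top * diamond[race] / n_diamond for race in top200}

chi2 = sum((top200[r] - expected[r]) ** 2 / expected[r] for r in top200)

# Three categories give 2 degrees of freedom, where the upper-tail
# p-value has the closed form exp(-chi2 / 2).
p_value = math.exp(-chi2 / 2.0)

for race in top200:
    print(race, round(share_in_top200[race], 3), round(expected[race], 1))
print(round(chi2, 2), p_value)
```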



Conclusions
Terran has the lion's share of the top 200 compared to the other two races and has strong performance in each quartile of its top 200 players. Protoss, while underrepresented in the top 200 relative to its diamond population, also has strong performance, while Zerg is weak in both number and performance among the top 200. Additionally, Terran has a much larger proportion of its diamond players in the top 200, which leads one to believe that, consistent with its tendency to perform well at the top of the top 200, terran also probably performs very well across diamond as a whole.

There are many factors to consider here. But it seems reasonable from this data set to at least think that terran needs a nerf and zerg a buff. However, there could be many plausible reasons for this, from better players playing terran to the metagames being undeveloped and so on, but I think the most reasonable explanation for this correlation is that terran is imba right now and also that zerg needs a buff. We can expect to have a high amount of randomness associated with a small sample size (n=200). This analysis is very preliminary and I did not test for statistical relevance, although I hope you will see its relevance even without those helpful tools.

Other factors to consider:
win/loss ratio, games played

I hope this provides some helpful food for thought on the current balance issues.
LlamaNamedOsama
Profile Blog Joined July 2010
United States1900 Posts
August 19 2010 21:52 GMT
#217
On August 17 2010 17:07 GagnarTheUnruly wrote:
On August 17 2010 16:59 LlamaNamedOsama wrote:
On August 17 2010 07:38 dcberkeley wrote:
On August 17 2010 07:35 neobowman wrote:
Isn't this math and not science?

Scientific != science


No, but scientific refers to a specific methodology that is not followed in the OP, which seems more aimed at being a statistical study (although many have already pointed out that it still doesn't really use statistics but rather a layman's look at numbers).


Science is a process by which you formulate a hypothesis based on a theory, a prior hypothesis, or an observation, and then devise a method to objectively test that hypothesis.

I heard a hypothesis that the races were imbalanced, observed a dataset that suggested that win rates were race independent, and formulated a hypothesis that win rates were race independent. I then found support for my hypothesis by analyzing a data set of 58 million cases using a time honored statistical technique. Where does that deviate from the definition of science?

Also, I'm not a layman, and despite what you may have heard, a Chi-square analysis is in fact a statistical technique. In fact, it's possibly the most widely used technique in scientific literature, particularly in advanced statistical modeling where it's used for model verification.



The definition of the scientific process includes both experimentation and observation in the acquisition of data. If you claim to know your statistics, then you should easily know that there's a clear distinction between an experiment and an observational survey, and you clearly didn't alter any of the variables.

Also, as far as I recall, none of the statistics were present in the original post: the post appears to have been edited a couple of hours after my post, and as far as I know you were updating it with the actual statistical substance while I was posting. Part of your discussion reflects this and my recollection of your original statements.

For ex:
" Within those leagues, Terran has a slight advantage (see, Terran is IMBA!), meaning you’ll win about 2 games in a thousand more often than you should"

or

"A Diamond Solo Zen Master would have to play 1801 games to win as random and 1800 to win as zerg, but only 1794 to win as Protoss and 1784 to win as Terran. So if you want Terran mastery, you’ll get it in 17 fewer games than a random player!"

or

"you’d have to play about a million games before you started to notice that the races were imbalanced in the diamond league"

These are incorrect interpretations of statistics/data. For example, in the very last one, the identification of a p-value less than the alpha for statistical significance only indicates that it's probable that the initial results were not by chance, not an actual quantification of bias, just a determination that there may be some.
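The gap between "statistically significant" and "large" can be made concrete by reporting an effect size next to the p-value. The sketch below uses illustrative numbers only: a 50.2% win share (a stand-in for "2 games in a thousand") and a sample of 58 million games (the data set size mentioned earlier in the thread). Neither is a reproduction of the original analysis, and Cohen's w is swapped in here as one common effect-size measure.

```python
import math

n_games = 58_000_000
wins = round(0.502 * n_games)      # hypothetical observed wins
losses = n_games - wins
expected = n_games / 2             # null hypothesis: 50% win rate

# Chi-square goodness-of-fit statistic with 1 degree of freedom.
chi2 = (wins - expected) ** 2 / expected + (losses - expected) ** 2 / expected

# Upper-tail p-value for 1 degree of freedom.
p_value = math.erfc(math.sqrt(chi2 / 2.0))

# Cohen's w: effect size that does not grow with the sample size.
cohens_w = math.sqrt(chi2 / n_games)

print(chi2, p_value, cohens_w)
```

The p-value answers "could this be chance?", while w describes how far the observed shares are from the expected ones; with a sample this large the first can be tiny even when the second is negligible.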
Dario Wünsch: I guess...Creator...met his maker *sunglasses*
figq
Profile Blog Joined May 2010
12519 Posts
August 19 2010 22:03 GMT
#218
Even if we had real random match making to draw conclusions from, I fail to see why balance is properly measured by winning %. Let me try to explain. One race could still be significantly easier and lower skill capped, but get even winning % with the other races - in that case this race is just more "cheesy", is designed around risky all or nothing plays, which are not difficult, but also don't get imbalanced amount of wins. Meanwhile, another race could be really hard to play, but still get high enough winning % to be even with the rest - enough people are able to put enough effort to get wins. In other words, some races could be ridiculous, and other could be serious, and still the results of their win/lose ratios could be even, in a truly random match-making. Such state is officially regarded as balanced, but that is misleading.
If you stand next to my head, you can hear the ocean. - Day[9]
PlagueRat
Profile Joined July 2010
United States39 Posts
August 19 2010 22:06 GMT
#219
Cool cool but the MMS screws with your numbers pretty bad, I'd like to see percentages for match-ups that would be interesting
And its true, the clouds just hung around, like black cadillacs, outside a funeral.
Hunch
Profile Blog Joined August 2010
Canada336 Posts
August 19 2010 22:09 GMT
#220
of course sc2 is imbalanced, there is no question whatsoever, the point is that 10 years from now nothing that we talk, discuss and rage about balance will matter because the game will change, im sure that when blizz puts out their latest patch ppl are going to change their minds on what is balanced and what isn't balanced.

its just funny to me how much people stress about how balanced or unbalanced the game is, instead why dont we try and talk about what could be improved or just stfu and enjoy the game as it is right now, which some ppl wont do im sure but its just a thought.

i mean it looks like you wrote a lot of interesting stuff but after the first paragraph i kinda just skimmed the rest and looked at the nice little pictures there which im sure 80% of the posters did.

well gl with w/e you're trying to do here
I have a Hunch.770
texmix
Profile Joined May 2010
United States106 Posts
August 20 2010 02:33 GMT
#221
On August 20 2010 06:18 andyrichdale wrote:
On August 20 2010 04:10 texmix wrote:
If every terran unit suddenly had +25% hit points (making them obviously overpowered), the original post methodology would still conclude the races are equal and have the same 2 pages of statistical garbage backing it up.


What?

If Terran units had 25% extra hit points then Terrans would win considerably more of their matches than they currently do. This would reflect in a win% increase to the point where it's in the "considerably higher than expected" region which would lead to the conclusion that Terrans are over powered.


No, there would just be more terran in the top 200/diamond league, but win % would not differ due to the match making process. The original post is 100.000% useless.
hdkhang
Profile Joined August 2010
Australia183 Posts
August 20 2010 02:44 GMT
#222
On August 20 2010 06:18 andyrichdale wrote:
On August 20 2010 04:10 texmix wrote:
If every terran unit suddenly had +25% hit points (making them obviously overpowered), the original post methodology would still conclude the races are equal and have the same 2 pages of statistical garbage backing it up.


What?

If Terran units had 25% extra hit points then Terrans would win considerably more of their matches than they currently do. This would reflect in a win% increase to the point where it's in the "considerably higher than expected" region which would lead to the conclusion that Terrans are over powered.


Not at all.

Let's assume that the current system is "perfectly balanced", that the MMR assigned to each player is accurate, that a 500MMR Terran is as skilled as a 500MMR Protoss is as skilled as a 500MMR Zerg. Let's also assume that there is a perfect distribution of players using each race and that random did not exist. Let's also assume that skills remain exactly the same in the observation period immediately following this change.

Now if every terran unit in this perfectly balanced game had 25% more hit points, the 500MMR Terran player will have a greater chance at winning against his 500MMR Zerg/Protoss cohorts. So obviously the 500MMR Terran will have to be readjusted upwards, and the Zerg/Protoss now having trouble against 1/3 of the matchups will be adjusted down. So to make it easier to follow, we make up some numbers and come up with the Terran who used to be 500MMR now having an easier time winning 2/3 of his matchups is bumped up to 600MMR, the Zerg/Protoss having trouble with 1/3 of his matchups is now down to 450MMR.

The former 500MMR Terran player no longer is considered the same "skill level" as the former 500MMR Zerg/Protoss. So who is considered his "peer" in the eyes of the AMM? Answer is the newly 600MMR Zerg/Protoss who would have been 660MMR Zerg/Protoss under the "old, perfectly balanced" game. Now that the MMRs have been adjusted in accordance with the game, the former 500MMR Terran players with their shiny new 25% HP buff now have a 50% win ratio against former 660MMR Zerg/Protoss players. This 50% win ratio is exactly what the data will show you, and no amount of fancy analysis will reveal an imbalance even if it is obvious there is one.

BTW, 25% is just a number, don't get hung up over it.. there of course will come a point where that number will utterly break the game and result in nobody being able to win against an OP race, but that would be so painfully obvious you would not need maths to show it to be so.

e.g. 200HP Marines with 5 armour, everything else stays the same. There is no way that a Terran would lose if he went 7 RAX Marine.

I keep saying this but it keeps getting ignored and people still go about wasting their time: using clean/smoothed data will result in no imbalance issues revealed! Seriously guys, stop wasting your time!
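The argument above can be tried out in a toy simulation. This is a sketch under loud assumptions, not a model of Blizzard's actual matchmaker: ratings follow a plain Elo update with K = 16, players are paired with their nearest neighbour in rating, every race has the same hidden skill spread, and Terran gets a made-up bonus worth 100 rating points. All names and numbers are hypothetical.

```python
import random

def win_prob(strength_a, strength_b):
    """Elo-style probability that A beats B."""
    return 1.0 / (1.0 + 10 ** ((strength_b - strength_a) / 400.0))

def simulate(matchmaking, terran_bonus=100.0, rounds=800, warmup=400, seed=1):
    """Return (Terran win rate in cross-race games, Terran rating gap)."""
    rng = random.Random(seed)
    players = []
    for race in ("T", "P", "Z"):
        for i in range(100):
            skill = -300.0 + 6.0 * i  # identical skill spread for every race
            bonus = terran_bonus if race == "T" else 0.0
            players.append({"race": race, "strength": skill + bonus, "rating": 0.0})

    terran_wins = 0
    terran_games = 0
    for rnd in range(rounds):
        if matchmaking:
            # Pair neighbours in rating; the small jitter breaks ties.
            order = sorted(players, key=lambda p: p["rating"] + rng.uniform(-1.0, 1.0))
        else:
            order = players[:]
            rng.shuffle(order)
        for a, b in zip(order[0::2], order[1::2]):
            # The outcome depends on hidden strength (skill plus race bonus).
            a_won = rng.random() < win_prob(a["strength"], b["strength"])
            # The rating update only sees ratings and results.
            expected_a = win_prob(a["rating"], b["rating"])
            delta = 16.0 * ((1.0 if a_won else 0.0) - expected_a)
            a["rating"] += delta
            b["rating"] -= delta
            # After warm-up, record games with exactly one Terran player.
            if rnd >= warmup and (a["race"] == "T") != (b["race"] == "T"):
                terran_games += 1
                terran_won = a_won if a["race"] == "T" else not a_won
                terran_wins += int(terran_won)

    terran = [p["rating"] for p in players if p["race"] == "T"]
    others = [p["rating"] for p in players if p["race"] != "T"]
    gap = sum(terran) / len(terran) - sum(others) / len(others)
    return terran_wins / terran_games, gap

rate_matched, gap_matched = simulate(matchmaking=True)
rate_random, gap_random = simulate(matchmaking=False)
print("rating-matched:", round(rate_matched, 3), "rating gap:", round(gap_matched, 1))
print("random pairing:", round(rate_random, 3), "rating gap:", round(gap_random, 1))
```

In this toy setup the buffed race should win close to half of its rating-matched games while sitting higher on the ladder on average, whereas random pairing exposes the advantage directly in the win rate.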
GagnarTheUnruly
Profile Joined July 2010
United States655 Posts
August 20 2010 03:04 GMT
#223
On August 20 2010 11:44 hdkhang wrote:
On August 20 2010 06:18 andyrichdale wrote:
On August 20 2010 04:10 texmix wrote:
If every terran unit suddenly had +25% hit points (making them obviously overpowered), the original post methodology would still conclude the races are equal and have the same 2 pages of statistical garbage backing it up.


What?

If Terran units had 25% extra hit points then Terrans would win considerably more of their matches than they currently do. This would reflect in a win% increase to the point where it's in the "considerably higher than expected" region which would lead to the conclusion that Terrans are over powered.


Not at all.

Let's assume that the current system is "perfectly balanced", that the MMR assigned to each player is accurate, that a 500MMR Terran is as skilled as a 500MMR Protoss is as skilled as a 500MMR Zerg. Let's also assume that there is a perfect distribution of players using each race and that random did not exist. Let's also assume that skills remain exactly the same in the observation period immediately following this change.

Now if every terran unit in this perfectly balanced game had 25% more hit points, the 500MMR Terran player will have a greater chance at winning against his 500MMR Zerg/Protoss cohorts. So obviously the 500MMR Terran will have to be readjusted upwards, and the Zerg/Protoss now having trouble against 1/3 of the matchups will be adjusted down. So to make it easier to follow, we make up some numbers and come up with the Terran who used to be 500MMR now having an easier time winning 2/3 of his matchups is bumped up to 600MMR, the Zerg/Protoss having trouble with 1/3 of his matchups is now down to 450MMR.

The former 500MMR Teran player no longer is considered the same "skill level" as the former 500MMR Zerg/Protoss. So who is considered his "peer" in the eyes of the AMM? Answer is the newly 600MMR Zerg/Protoss who would have been 660MMR Zerg/Protoss under the "old, perfectly balanced" game. Now that the MMRs have been adjusted in accordance with the game, the former 500MMR Terran players with their shiny new 25% HP buff now has a 50% win ratio against former 660MMR Zerg/Protoss players. This 50% win ratio is exactly what the data will show you, and no amount of fancy analysis will reveal an imbalance even if it is obvious there is one.

BTW, 25% is just a number, don't get hung up over it.. there of course will come a point where that number will utterly break the game and result in nobody being able to win against an OP race, but that would be so painfully obvious you would not need maths to show it to be so.

e.g. 200HP Marines with 5 armour, everything else stays the same. There is no way that a Terran would lose if he went 7 RAX Marine.

I keep saying this but it keeps getting ignored and people still go about wasting their time: using clean/smoothed data will result in no imbalance issues revealed! Seriously guys, stop wasting your time!


Exactly. And the effect of that is a shift in race frequencies, where there are more terrans and fewer zerg and protoss as you move up the leagues. Since we don't see this happening for league play, we infer that the game is balanced.
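
The thought experiment quoted above can be tried out as a toy simulation. This is only a sketch under made-up assumptions, not Blizzard's actual AMM: ratings follow a plain Elo update, every race gets the same list of base skills, and the "buff" is a hypothetical flat 100-point strength bonus for Terran (`BUFF`, `K`, the pairing jitter and all helper names are invented for illustration). It compares Terran's cross-race win rate under random pairing with the win rate once matchmaking pairs players of similar rating, and reports how far Terran ratings drift above the rest.

```python
import random

K = 32          # Elo update step (assumed value)
SCALE = 400.0   # Elo logistic scale
BUFF = 100.0    # hypothetical strength bonus for the buffed race, in rating points


def win_prob(strength_a, strength_b):
    """Probability that A beats B under the Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((strength_b - strength_a) / SCALE))


def make_players(rng, per_race=100):
    # Same base skill list for every race, so any gap comes from the buff alone.
    skills = [rng.gauss(1500, 200) for _ in range(per_race)]
    players = []
    for race in ("T", "P", "Z"):
        for s in skills:
            strength = s + (BUFF if race == "T" else 0.0)
            players.append({"race": race, "strength": strength, "rating": 1500.0})
    return players


def play(rng, a, b, update=True):
    """Play one game; return True if A wins. Optionally apply the Elo update."""
    a_wins = rng.random() < win_prob(a["strength"], b["strength"])
    if update:
        expected = win_prob(a["rating"], b["rating"])
        delta = K * ((1.0 if a_wins else 0.0) - expected)
        a["rating"] += delta
        b["rating"] -= delta
    return a_wins


def terran_result(a, b, a_wins):
    """1 or 0 for a Terran win or loss in a T-vs-non-T game, None otherwise."""
    if (a["race"] == "T") == (b["race"] == "T"):
        return None
    return int(a_wins if a["race"] == "T" else not a_wins)


def simulate(seed=0, warmup=600, measured=300):
    rng = random.Random(seed)
    players = make_players(rng)

    # Baseline: opponents drawn at random, ratings ignored and left untouched.
    results = []
    for _ in range(20000):
        a, b = rng.sample(players, 2)
        r = terran_result(a, b, play(rng, a, b, update=False))
        if r is not None:
            results.append(r)
    unmatched = sum(results) / len(results)

    # Matchmaking: each round, pair players with (noisily) adjacent ratings.
    results = []
    for rnd in range(warmup + measured):
        order = sorted(players, key=lambda p: p["rating"] + rng.gauss(0, 25))
        for a, b in zip(order[0::2], order[1::2]):
            r = terran_result(a, b, play(rng, a, b))
            if r is not None and rnd >= warmup:
                results.append(r)
    matched = sum(results) / len(results)

    def mean_rating(is_terran):
        rs = [p["rating"] for p in players if (p["race"] == "T") == is_terran]
        return sum(rs) / len(rs)

    return unmatched, matched, mean_rating(True) - mean_rating(False)


unmatched, matched, rating_gap = simulate()
print(f"Terran win rate, random pairing:   {unmatched:.3f}")
print(f"Terran win rate, with matchmaking: {matched:.3f}")
print(f"Mean Terran rating minus others:   {rating_gap:.1f}")
```

In this toy model the matched win rate should sit near 50% even though Terran is stronger by construction, while the buff shows up as a rating gap, which is the quantity the league-distribution argument relies on.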
Mamojo
Profile Joined August 2010
Canada38 Posts
August 20 2010 03:19 GMT
#224
I actually agree with your statistics, and your method to show that SC2 isn't balanced is very impressive: I did a calculation of my own win-loss ratio and compared it to the one you have, and they seem pretty close.
Half
Profile Joined March 2010
United States2554 Posts
Last Edited: 2010-08-20 03:50:47
August 20 2010 03:28 GMT
#225
I think this thread is kind of funny, because I distinctly remember the forum collectively yelling at Blizz's balance team, back when they had this information and we didn't, that all they did was "balance using stats" and that "stats don't mean anything". And here we are, in their shoes, overanalyzing stats that "don't mean anything".

This stat doesn't show much. However, we can conclusively draw these two points from it.

#1: Something in the game's design is causing fewer players to play Zerg, which may or may not be a problem.
#2: There is no game-breaking imbalance that greatly detracts from the game at non-professional levels.

What it doesn't show is

Terran is balanced
Zerg is underpowered.

At all.
Too Busy to Troll!
explicit
Profile Joined August 2010
52 Posts
August 20 2010 03:36 GMT
#226
On August 20 2010 11:44 hdkhang wrote:
On August 20 2010 06:18 andyrichdale wrote:
On August 20 2010 04:10 texmix wrote:
If every terran unit suddenly had +25% hit points (making them obviously overpowered), the original post methodology would still conclude the races are equal and have the same 2 pages of statistical garbage backing it up.


What?

If Terran units had 25% extra hit points then Terrans would win considerably more of their matches than they currently do. This would reflect in a win% increase to the point where it's in the "considerably higher than expected" region which would lead to the conclusion that Terrans are over powered.


Not at all.

Let's assume that the current system is "perfectly balanced", that the MMR assigned to each player is accurate, that a 500MMR Terran is as skilled as a 500MMR Protoss is as skilled as a 500MMR Zerg. Let's also assume that there is a perfect distribution of players using each race and that random did not exist. Let's also assume that skills remain exactly the same in the observation period immediately following this change.

Now if every terran unit in this perfectly balanced game had 25% more hit points, the 500MMR Terran player will have a greater chance at winning against his 500MMR Zerg/Protoss cohorts. So obviously the 500MMR Terran will have to be readjusted upwards, and the Zerg/Protoss now having trouble against 1/3 of the matchups will be adjusted down. So to make it easier to follow, we make up some numbers and come up with the Terran who used to be 500MMR now having an easier time winning 2/3 of his matchups is bumped up to 600MMR, the Zerg/Protoss having trouble with 1/3 of his matchups is now down to 450MMR.

The former 500MMR Teran player no longer is considered the same "skill level" as the former 500MMR Zerg/Protoss. So who is considered his "peer" in the eyes of the AMM? Answer is the newly 600MMR Zerg/Protoss who would have been 660MMR Zerg/Protoss under the "old, perfectly balanced" game. Now that the MMRs have been adjusted in accordance with the game, the former 500MMR Terran players with their shiny new 25% HP buff now has a 50% win ratio against former 660MMR Zerg/Protoss players. This 50% win ratio is exactly what the data will show you, and no amount of fancy analysis will reveal an imbalance even if it is obvious there is one.

BTW, 25% is just a number, don't get hung up over it.. there of course will come a point where that number will utterly break the game and result in nobody being able to win against an OP race, but that would be so painfully obvious you would not need maths to show it to be so.

e.g. 200HP Marines with 5 armour, everything else stays the same. There is no way that a Terran would lose if he went 7 RAX Marine.

I keep saying this but it keeps getting ignored and people still go about wasting their time: using clean/smoothed data will result in no imbalance issues revealed! Seriously guys, stop wasting your time!


This deserves another quote so nobody misses it. Excellent Post!
nam nam
Profile Joined June 2010
Sweden4672 Posts
August 20 2010 03:43 GMT
#227
On August 20 2010 12:04 GagnarTheUnruly wrote:
On August 20 2010 11:44 hdkhang wrote:
On August 20 2010 06:18 andyrichdale wrote:
On August 20 2010 04:10 texmix wrote:
If every terran unit suddenly had +25% hit points (making them obviously overpowered), the original post methodology would still conclude the races are equal and have the same 2 pages of statistical garbage backing it up.


What?

If Terran units had 25% extra hit points then Terrans would win considerably more of their matches than they currently do. This would reflect in a win% increase to the point where it's in the "considerably higher than expected" region which would lead to the conclusion that Terrans are over powered.


Not at all.

Let's assume that the current system is "perfectly balanced", that the MMR assigned to each player is accurate, that a 500MMR Terran is as skilled as a 500MMR Protoss is as skilled as a 500MMR Zerg. Let's also assume that there is a perfect distribution of players using each race and that random did not exist. Let's also assume that skills remain exactly the same in the observation period immediately following this change.

Now if every terran unit in this perfectly balanced game had 25% more hit points, the 500MMR Terran player will have a greater chance at winning against his 500MMR Zerg/Protoss cohorts. So obviously the 500MMR Terran will have to be readjusted upwards, and the Zerg/Protoss now having trouble against 1/3 of the matchups will be adjusted down. So to make it easier to follow, we make up some numbers and come up with the Terran who used to be 500MMR now having an easier time winning 2/3 of his matchups is bumped up to 600MMR, the Zerg/Protoss having trouble with 1/3 of his matchups is now down to 450MMR.

The former 500MMR Teran player no longer is considered the same "skill level" as the former 500MMR Zerg/Protoss. So who is considered his "peer" in the eyes of the AMM? Answer is the newly 600MMR Zerg/Protoss who would have been 660MMR Zerg/Protoss under the "old, perfectly balanced" game. Now that the MMRs have been adjusted in accordance with the game, the former 500MMR Terran players with their shiny new 25% HP buff now has a 50% win ratio against former 660MMR Zerg/Protoss players. This 50% win ratio is exactly what the data will show you, and no amount of fancy analysis will reveal an imbalance even if it is obvious there is one.

BTW, 25% is just a number, don't get hung up over it.. there of course will come a point where that number will utterly break the game and result in nobody being able to win against an OP race, but that would be so painfully obvious you would not need maths to show it to be so.

e.g. 200HP Marines with 5 armour, everything else stays the same. There is no way that a Terran would lose if he went 7 RAX Marine.

I keep saying this but it keeps getting ignored and people still go about wasting their time: using clean/smoothed data will result in no imbalance issues revealed! Seriously guys, stop wasting your time!


Exactly. And the effect of that is a shift in race frequencies, where there are more terrans and fewer zerg and protoss as you move up the leagues. Since we don't see this happening for league play, we infer that the game is balanced.


Wrongly.
hdkhang
Profile Joined August 2010
Australia183 Posts
August 20 2010 04:11 GMT
#228
On August 20 2010 12:04 GagnarTheUnruly wrote:
On August 20 2010 11:44 hdkhang wrote:
On August 20 2010 06:18 andyrichdale wrote:
On August 20 2010 04:10 texmix wrote:
If every terran unit suddenly had +25% hit points (making them obviously overpowered), the original post methodology would still conclude the races are equal and have the same 2 pages of statistical garbage backing it up.


What?

If Terran units had 25% extra hit points then Terrans would win considerably more of their matches than they currently do. This would reflect in a win% increase to the point where it's in the "considerably higher than expected" region which would lead to the conclusion that Terrans are over powered.


Not at all.

Let's assume that the current system is "perfectly balanced", that the MMR assigned to each player is accurate, that a 500MMR Terran is as skilled as a 500MMR Protoss is as skilled as a 500MMR Zerg. Let's also assume that there is a perfect distribution of players using each race and that random did not exist. Let's also assume that skills remain exactly the same in the observation period immediately following this change.

Now if every terran unit in this perfectly balanced game had 25% more hit points, the 500MMR Terran player will have a greater chance at winning against his 500MMR Zerg/Protoss cohorts. So obviously the 500MMR Terran will have to be readjusted upwards, and the Zerg/Protoss now having trouble against 1/3 of the matchups will be adjusted down. So to make it easier to follow, we make up some numbers and come up with the Terran who used to be 500MMR now having an easier time winning 2/3 of his matchups is bumped up to 600MMR, the Zerg/Protoss having trouble with 1/3 of his matchups is now down to 450MMR.

The former 500MMR Teran player no longer is considered the same "skill level" as the former 500MMR Zerg/Protoss. So who is considered his "peer" in the eyes of the AMM? Answer is the newly 600MMR Zerg/Protoss who would have been 660MMR Zerg/Protoss under the "old, perfectly balanced" game. Now that the MMRs have been adjusted in accordance with the game, the former 500MMR Terran players with their shiny new 25% HP buff now has a 50% win ratio against former 660MMR Zerg/Protoss players. This 50% win ratio is exactly what the data will show you, and no amount of fancy analysis will reveal an imbalance even if it is obvious there is one.

BTW, 25% is just a number, don't get hung up over it.. there of course will come a point where that number will utterly break the game and result in nobody being able to win against an OP race, but that would be so painfully obvious you would not need maths to show it to be so.

e.g. 200HP Marines with 5 armour, everything else stays the same. There is no way that a Terran would lose if he went 7 RAX Marine.

I keep saying this but it keeps getting ignored and people still go about wasting their time: using clean/smoothed data will result in no imbalance issues revealed! Seriously guys, stop wasting your time!


Exactly. And the effect of that is a shift in race frequencies, where there are more terrans and fewer zerg and protoss as you move up the leagues. Since we don't see this happening for league play, we infer that the game is balanced.


Only you have no starting point and no turning point to compare against. So it is simply incorrect to suggest that racial "preference" being reasonably evenly distributed across all leagues demonstrates balance; it clearly cannot. It also doesn't point to imbalance, btw. How could it, if it can't definitively account for anything other than the AMM system doing its job?
hdkhang
Profile Joined August 2010
Australia183 Posts
August 20 2010 04:24 GMT
#229
On August 20 2010 12:28 Half wrote:
I think this thread is kind of funny, because I distinctly remember the forum collectively yelling at Blizz's balance team, back when they had this information and we didn't, that all they did was "balance using stats" and that "stats don't mean anything". And here we are, in their shoes, overanalyzing stats that "don't mean anything".

This stat doesn't show much. However, we can conclusively draw these two points from it.

#1: Something in the game's design is causing fewer players to play Zerg, which may or may not be a problem.
#2: There is no game-breaking imbalance that greatly detracts from the game at non-professional levels.

What it doesn't show is

Terran is balanced
Zerg is underpowered.

At all.


Completely agree.

People can choose their race for any number of reasons.

Also, say I devote 100 hours of training to one race and then spend another 100 hours training another race. That does not necessarily make me equally proficient at both. For all we know, one race may have features that suit my skills/preferences/playstyle much better than another.
Mikilatov
Profile Blog Joined May 2008
United States3897 Posts
August 20 2010 04:37 GMT
#230
This is a really great post, but your thread title is just begging for controversy, haha.

Excellent information though, thanks.
♥ I used to lasso the shit out of your tournaments =( ♥ | Much is my hero. | zizi yO~ | Be Nice, TL.
GagnarTheUnruly
Profile Joined July 2010
United States655 Posts
August 20 2010 05:12 GMT
#231
On August 20 2010 12:43 nam nam wrote:
On August 20 2010 12:04 GagnarTheUnruly wrote:
On August 20 2010 11:44 hdkhang wrote:
On August 20 2010 06:18 andyrichdale wrote:
On August 20 2010 04:10 texmix wrote:
If every terran unit suddenly had +25% hit points (making them obviously overpowered), the original post methodology would still conclude the races are equal and have the same 2 pages of statistical garbage backing it up.


What?

If Terran units had 25% extra hit points then Terrans would win considerably more of their matches than they currently do. This would reflect in a win% increase to the point where it's in the "considerably higher than expected" region which would lead to the conclusion that Terrans are over powered.


Not at all.

Let's assume that the current system is "perfectly balanced", that the MMR assigned to each player is accurate, that a 500MMR Terran is as skilled as a 500MMR Protoss is as skilled as a 500MMR Zerg. Let's also assume that there is a perfect distribution of players using each race and that random did not exist. Let's also assume that skills remain exactly the same in the observation period immediately following this change.

Now if every terran unit in this perfectly balanced game had 25% more hit points, the 500MMR Terran player will have a greater chance at winning against his 500MMR Zerg/Protoss cohorts. So obviously the 500MMR Terran will have to be readjusted upwards, and the Zerg/Protoss now having trouble against 1/3 of the matchups will be adjusted down. So to make it easier to follow, we make up some numbers and come up with the Terran who used to be 500MMR now having an easier time winning 2/3 of his matchups is bumped up to 600MMR, the Zerg/Protoss having trouble with 1/3 of his matchups is now down to 450MMR.

The former 500MMR Teran player no longer is considered the same "skill level" as the former 500MMR Zerg/Protoss. So who is considered his "peer" in the eyes of the AMM? Answer is the newly 600MMR Zerg/Protoss who would have been 660MMR Zerg/Protoss under the "old, perfectly balanced" game. Now that the MMRs have been adjusted in accordance with the game, the former 500MMR Terran players with their shiny new 25% HP buff now has a 50% win ratio against former 660MMR Zerg/Protoss players. This 50% win ratio is exactly what the data will show you, and no amount of fancy analysis will reveal an imbalance even if it is obvious there is one.

BTW, 25% is just a number, don't get hung up over it.. there of course will come a point where that number will utterly break the game and result in nobody being able to win against an OP race, but that would be so painfully obvious you would not need maths to show it to be so.

e.g. 200HP Marines with 5 armour, everything else stays the same. There is no way that a Terran would lose if he went 7 RAX Marine.

I keep saying this but it keeps getting ignored and people still go about wasting their time: using clean/smoothed data will result in no imbalance issues revealed! Seriously guys, stop wasting your time!


Exactly. And the effect of that is a shift in race frequencies, where there are more terrans and fewer zerg and protoss as you move up the leagues. Since we don't see this happening for league play, we infer that the game is balanced.


Wrongly.


Please elaborate. I can see one possible reason that just occurred to me. I apologize if others have made this point and I missed it.

If we make a few assumptions, outlined elsewhere, I think it's clear that racial imbalance + AMM will cause weak races to be pushed lower in the leagues. However, I realize that it's not necessarily intuitive to me how the races will get pushed back.

In the figure below, I've assumed that there are two races. If the races are balanced, and we assume that race choice is influenced by factors other than player skill, then a graph of race use vs. placement should look like the one below (use of the two races may or may not be equal at 50%, but the proportion of players using each race won't change as a function of league placement).

As discussed above, we might predict that imbalance would lead to weak races being pushed down the ladder, but how would that manifest? Let's say the red race is stronger. Would the graph look like fig. A or fig. B? If it results in the pattern shown in A), that will be easy to detect. If it looks like the pattern shown in B), that will be hard to detect.

[Figure not loaded: race use vs. league placement for the balanced case, plus panels A and B]
Half
Profile Joined March 2010
United States2554 Posts
Last Edited: 2010-08-20 06:20:09
August 20 2010 06:14 GMT
#232

Please elaborate. I can see one possible reason that just occurred to me. I apologize if others have made this point and I missed it.


I don't think he was saying that the basic premise of the post was wrong

-that a race's relative performance directly correlates with more players of that race being placed in a relatively higher league-

which is only logical.

Instead, he was saying


Exactly. And the effect of that is a shift in race frequencies, where there are more terrans and fewer zerg and protoss as you move up the leagues. Since we don't see this happening for league play, we infer
Wrongly
that the game is balanced.


That is, while the premise may be true, that alone cannot establish whether the game/race is balanced or imbalanced.

Statistics only go so far, as we've repeatedly told Dustin Browder. Let's just try to keep that in mind ^_^.
Too Busy to Troll!
Rabiator
Profile Joined March 2010
Germany3948 Posts
August 20 2010 06:30 GMT
#233
Useless thread IMO, and an abuse of the word "scientific". Every science should still involve the common sense to know what affects its samples, and differences of less than 1% are meaningless in a game as complex as Starcraft 2, where the daily form of the player has a MUCH greater impact than the actual abilities of the units (= the balance of the races).

Many times several effects act on a "test" at different scales, and one big effect completely dominates the other, much weaker ones. Let's take a hot cup of tea, for example, where you have just put in the tea bag ...
- The smallest effect present is diffusion, where an atom (molecule) randomly changes place with a neighboring atom (molecule).
- If you stay on Earth you will also get convection, you know, the "hotter water is lighter, goes to the top and thereby stirs the tea" stuff. This is sooo much bigger that you can hardly measure the diffusion effect on Earth.
- If you take a spoon and stir the cup, that in turn makes convection immeasurable.

For any competitive game the daily form of the player is really, really important and has a much greater effect than 1-2%. And since Zerg players have pressured themselves with the "oh I can't win against Terrans because they are IMBA" propaganda, they are at a psychological disadvantage to begin with IMO.
If you cant say what you're meaning, you can never mean what you're saying.
Gedrah
Profile Joined February 2010
465 Posts
August 20 2010 06:48 GMT
#234
Balance is a question of available technology and strategies and their timing, and how these relate to ground- and air-travel distances on a given map. Balance is NOT a question of win percentage on the ladders. Your well-put-together study seems to ignore the simple fact that those games aren't all being played by the same two people. The statistical analysis of who wins most often in league games is really meaningless for determining balance--even if Zerg were winning 60% of their games, this could be explained by the behavior of individual zergs kicking ass rather than "imbalance." Okay, okay, it may not be MEANINGLESS, but it's definitely not a scientific basis for claiming imbalances exist.
What is a dickfour?
Drowsy
Profile Blog Joined November 2005
United States4876 Posts
August 20 2010 06:51 GMT
#235
On August 17 2010 07:18 StarcraftGuy4U wrote:
None of these stats are worthwhile because the matchmaking system does not assign people like they would in a blind study, instead it is actively adjusting the matches so that every player reaches 50%. The numbers you are pulling are worthless for this reason.

Exactly. You just can't extrapolate anything from the win ratios by race due to this.
Our Protoss, Who art in Aiur HongUn be Thy name; Thy stalker come, Thy will be blunk, on ladder as it is in Micro Tourny. Give us this win in our daily ladder, and forgive us our cheeses, As we forgive those who play zerg against us.
harmony.piano
Profile Joined August 2010
2 Posts
August 20 2010 07:22 GMT
#236
I don't think the balance issue can simply be solved by looking at the stats.
Let me give you a simple example:
Suppose there is a very difficult manoeuvre (say some kind of reaper or hellion harass) that only 0.1% of Terran players can pull off consistently. If they manage to pull it off, they are almost guaranteed a win against any Zerg.
The other 99.9% of Terran players are not skilled enough to use this manoeuvre, and they have a 49.9% chance of winning against Zerg by playing normally.
So in total, TvZ is about 50:50. But does this mean TvZ is balanced? Of course not.
TvZ is already flawed, because the 0.1% of pros have a way to win against any Zerg.
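
To put numbers on the example above, here is the weighted average written out as a quick sketch. The percentages are the ones from the example; treating "almost guaranteed" as a flat 100% win rate is a simplifying assumption, and the variable names are made up.

```python
# Shares and win rates taken from the example; "almost guaranteed" is rounded to 100%.
elite_share = 0.001    # 0.1% of Terran players can execute the manoeuvre
elite_win = 1.0        # assumed win rate of those players vs Zerg
normal_win = 0.499     # win rate of the remaining 99.9% vs Zerg

overall = elite_share * elite_win + (1 - elite_share) * normal_win
print(f"Aggregate TvZ win rate for Terran: {overall:.4%}")
```

The aggregate lands within a tenth of a percent of 50%, so a population-wide win rate on its own would not reveal the manoeuvre.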
Pking
Profile Joined May 2010
Sweden142 Posts
August 20 2010 12:09 GMT
#237
On August 20 2010 15:51 Drowsy wrote:
On August 17 2010 07:18 StarcraftGuy4U wrote:
None of these stats are worthwhile because the matchmaking system does not assign people like they would in a blind study, instead it is actively adjusting the matches so that every player reaches 50%. The numbers you are pulling are worthless for this reason.

Exactly. You just can't extrapolate anything from the win ratios by race due to this.


To the people saying the statistics is useless because of the matchmaking system, read the OP again.


Discussion: My data show that, within a league, each of the races has a rougly equal chance of winning a randomly selected game. This indicates that the balance of SC2 is probably pretty good. People have pointed out that matchmaking would cause this to happen, because it strives to set each player's win rate at 50%. That in turn would cause the win rate of each race to trend towards 50%. That being the case, poor balance would tend to result in 'weak' races getting pushed into the lower tiers of play. Because we don't see that happening either within or among leagues (data not shown), my data suggest both that the matchmaking system works well and that SC2 is inherrently pretty well balanced.


Zarahtra
Profile Joined May 2010
Iceland4053 Posts
August 20 2010 12:20 GMT
#238
I'm a bit interested in this. On sc2ranks.com I saw that Zerg makes up 24.07% of Diamond globally but only 19.78% of the playerbase. Wouldn't this indicate that Zerg is doing pretty decently overall? They have just over 4 percentage points more players in Diamond than you'd expect (if we assume players of race x aren't simply better on average than players of race y), even if you'd have to look a bit closer to see whether they are all just sitting at the bottom of Diamond.
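
For what it's worth, the comparison can be written out as below. The two percentages are the sc2ranks.com figures quoted in this post; the variable names and the "representation ratio" are just one illustrative way to express the gap, not something the site reports.

```python
zerg_share_diamond = 24.07   # % of global Diamond players who are Zerg (quoted figure)
zerg_share_overall = 19.78   # % of the whole playerbase who are Zerg (quoted figure)

point_gap = zerg_share_diamond - zerg_share_overall
representation_ratio = zerg_share_diamond / zerg_share_overall

print(f"Gap: {point_gap:.2f} percentage points")
print(f"Zerg is represented in Diamond at {representation_ratio:.2f}x its overall share")
```

A ratio above 1 only says Zerg is over-represented in Diamond as a whole; it says nothing about where inside Diamond those players sit.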
TanGeng
Profile Blog Joined January 2009
Sanya12364 Posts
August 20 2010 12:28 GMT
#239
Fuck, there's even more facepalm than I originally suspected. This deserves a big ugh.

On August 20 2010 14:12 GagnarTheUnruly wrote:
As discussed above, we might predict that imbalance would lead to weak races being pushed down the ladder, but how would that manifest? Let's say the red race is stronger. Would the graph look like fig. A or fig. B? If it results in the pattern shown in A), that will be easy to detect. If it looks like the pattern shown in B), that will be hard to detect.


Your data has only FIVE DISCRETE SKILL-LEVEL BUCKETS!
You also have no clue what the skill distribution looks like.

You can't assume a distribution for imbalance, show that the distribution doesn't exist, and claim that you proved imbalance doesn't exist or is small. FUCK THAT.

Please spare us all from more of that bullshit.
Moderator我们是个踏实的赞助商模式俱乐部
Thrombozyt
Profile Blog Joined June 2010
Germany1269 Posts
August 20 2010 12:43 GMT
#240
So let's do a little math game... Idra rages at least 3 times a week about how imba terran is and that he loses because of it.
Statistics show that zerg lose 1.5 games in 1000 due to imbalance. That means Idra would have to play 2k games a week for that to add up. Now Idra is a tough progamer and he plays 18 hours a day (even he needs 5 hours of sleep and another hour for food/hygiene), 7 days a week. That's 7560 minutes. That means he is actually so pro that he ends his average match in 3.78 minutes, minus the time it takes to load up the game.

Now I know what I am doing wrong... moar 6pool ftw!
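
The arithmetic, spelled out as a quick sketch using only the figures from this post (variable names are made up); by this calculation the average match comes out at about 3.78 minutes.

```python
rages_per_week = 3        # imbalance losses claimed per week
losses_per_1000 = 1.5     # games lost to imbalance per 1000 games played
hours_per_day = 18
days_per_week = 7

games_per_week = rages_per_week * 1000 / losses_per_1000   # 2000 games
minutes_per_week = hours_per_day * 60 * days_per_week      # 7560 minutes
minutes_per_game = minutes_per_week / games_per_week       # 3.78 minutes

print(f"{games_per_week:.0f} games, {minutes_per_week} minutes, "
      f"{minutes_per_game:.2f} minutes per game")
```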

MGHova
Profile Joined April 2010
Canada274 Posts
August 20 2010 12:54 GMT
#241
I didn't bother reading the 12 pages of replies, but I just wanted to offer a possible reason why Terran in bronze league might seem extremely imbalanced. Could it be that new players who have only played the campaign (or nothing) play their first games as Terran, because it's the race that's selected first when you load the game? These players would play maybe 10-20 games at most and then quit. Wouldn't the statistical analysis work best if each player had a larger number of games played? This is why I'm guessing platinum and diamond are the most balanced, since generally (not in my case) you have to play a fair number of games to get to diamond/plat league.
Meldrath
Profile Joined June 2010
United States620 Posts
August 20 2010 13:13 GMT
#242
This is splendid work, thank you for posting it. It should calm down all the Zerg v Terran whining. It's the first thing I hear when I beat a Zerg with mech: OP race this, OP race that, get skill loser, etc. Their taunts are universally the same.
slap me I must be dreaming another "imba" arugment! fffffffffuuuuuuuuuuuuu!!!!!
Izzachar
Profile Joined February 2010
Sweden285 Posts
August 20 2010 13:16 GMT
#243
How weird that you got a significant difference using 50million entries....
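
The point about sample size can be illustrated with a one-sample z-test against a 50% win rate. This is only a sketch: the 49.85% win rate is a hypothetical figure (50% minus the "1.5 games in 1000" mentioned earlier in the thread), the 50 million sample size is the one quoted here, and `z_score` is a made-up helper using the normal approximation.

```python
import math


def z_score(win_rate, games, null_rate=0.5):
    """z statistic for an observed win rate against a null win rate."""
    standard_error = math.sqrt(null_rate * (1 - null_rate) / games)
    return (win_rate - null_rate) / standard_error


observed = 0.4985  # hypothetical win rate, 0.15 percentage points below 50%

z_small = z_score(observed, 1_000)
z_big = z_score(observed, 50_000_000)

print(f"n = 1,000:      z = {z_small:.2f}")
print(f"n = 50,000,000: z = {z_big:.2f}")
```

With a sample that large, a deviation this small clears the usual |z| > 1.96 threshold easily, so "statistically significant" here mostly reflects the sample size, not the size of the effect.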
AT_Tack
Profile Joined February 2010
Germany435 Posts
August 20 2010 13:17 GMT
#244
Well the OP fell for the "awesome" Blizzard Matchmaking System.
Imagine you are a Zerg Player in the 500 Diamond Rankings and you start dropping games one after another. The matchmaking system will then match you with Gold Level noobs just to ensure you will stay at your ~55% win ratio.

Bronze League is a completely different thing. The ppl there are like "Why can't I switch to Hyperion for some research in multiplayer?" or "I think 8 workers is enough, isn't it?"
Bronze league might even be a place to mitigate the terran overall win ratio...
seems like a Blizzard imbalance cover up plan.

someone play the X-Files theme pls...
evilm0nkey
Profile Joined October 2009
53 Posts
August 20 2010 13:26 GMT
#245
Guys, these statistics only show that bnet's matchmaking system works well; they say nothing about balance! Of course the win:loss ratio of each race is close to 50%. In fact, every player's win:loss ratio should be around 50% if he plays long enough and is not better than everyone else.

The %-difference from the average rather shows current trends, this means that the situation for diamond / plat zergs is getting worse.

I don't see the need for discussion anyway; everyone knows that in the ZvT early game, Terran has every option and Zerg needs to react. Of course there is always a way for Zerg to counter Terran's action, but it's quite hard to react perfectly to a perfectly executed build when there are 20 different potentially deadly openings for T and scouting is so difficult in this period.
The current imbalance lies in the difference in skill requirement, not in the potential strength of the races. I think this is even worse for lower-skilled players than for pros, and that explains the low Zerg player percentage overall.

I notice it every day: so many Terran players in diamond league cannot macro, and when their initial 1-base build fails they ragequit and scream imba. Sure, I have 50% vs Terran, but that's because I play vs ridiculously low-skilled people, which is rather sad.
Cyber_Cheese
Profile Blog Joined July 2010
Australia3615 Posts
August 20 2010 13:30 GMT
#246
hmmmm something about the OP having a terran picture seems to make me mistrust his post

after thinking about it, he might be right, but he's proven the wrong things with the wrong evidence
for the game to be balanced, you need a good mix of wins purely at the pro level. noobs can derp around all they like and cry imbalance due to lack of skill, but in the pro scene everybody has enough skill to properly use all the game's mechanics, push when they're winning instead of waiting to be countered, etc

not only that but the claims of imbalance are more related to how many effective harassment options a terran has, while being virtually unharassable @ early game. example: the zerg and protoss generally have no real answer to reapers if they get rushed by them before ling speed or stalkers, and have no early game options outside of those to beat reaper harass
The moment you lose confidence in yourself, is the moment the world loses it's confidence in you.
hdkhang
Profile Joined August 2010
Australia183 Posts
August 20 2010 16:27 GMT
#247
I've already said a lot on this topic so I won't bother to rehash. Instead, I'll post something for you to consider.

If you watch House (MD), you will know that they tend to say stuff along the lines of, show an oncologist a patient and they see cancer, show a neurologist that same patient and they see neurological problems...

It's the same thing happening here, show a math geek some numbers and he sees number crunching opportunities.

Back when BETA started, Roaches were 1 supply + 2 armour, yet Terran were still able to win vs Zerg. People called it imbalanced... it was changed. Now people find the opposite to be true, that Terran is imbalanced and yet somehow just because the game has been released to retail it simply cannot be true?

What are the chances that a game with 3 races without 1:1 counters, a different number of units per race, different game mechanics per race and different play requirements per race is exactly balanced? It is practically impossible... so does this mean the game will never be balanced? Semantically speaking it isn't possible for it to be true, but it can be balanced enough, and that is all anyone is realistically asking for. But most importantly, it needs to be balanced in such a way that it is still fun.

Imagine back in WC2 days where humans had a direct counter to Orcs: we have an archer, well they throw axes- same crap different smell. It was practically a mirror matchup with cosmetic changes and some spells that sort of offset each other (e.g. temporary increase in attack damage offset by ability to heal). We don't want that for SC2, SC:BW was great because having 3 distinct races made for entertaining play and a great deal of variety. It took some time for balance changes to be rolled out and the main reason for those changes was not statistics. Whether SC:BW is balanced today is debatable, but it's balanced enough that people don't get bent out of shape over it.

Just check out the streams from the IEM challenge (day 1 and 2 in particular). Do you think this whole Terran OP thing is really just everyone's imagination? Even Morrow mentioned that Terran is very strong right now, and he plays Terran. HasuObs was unhappy about how strong marauders were; when asked whether things would have been any different if he had changed his strategy, he said "no", they can just stim and do tank-like DPS while having enough HP to tank psi storm and colossi damage. Dimaga played a match against Lucifron, killed all his SCVs and still lost. He was not happy that, after such a devastating blow, Terran, the race able to both turtle and harass, is also such a beast at economic recovery, which is the last saving grace of the Zerg. People were surprised when TLO managed to come back after a severe economic setback vs WhiteRa and still win convincingly.

We really should take into consideration what the pros say, since they are the ones trying to explore the deepest depths of this game. The thing is, even the pros will have differing opinions; each one will have their own point of view, so you will see Zergs saying the matchup is fine, and you will hear Terrans saying the matchup is not fine. You'll also have a lot of people on forums countering this by saying that pros are the most suspect since their livelihood is at stake, and so of course they will want to blame their lack of skill on racial imbalance. They will then go on to say that Blizzard is the one with all the "hard facts", the one with years and years of experience making games balanced.

Using a car analogy since that is the norm... a race driver tells his crew that the car whilst driveable does not handle very well in certain conditions... the mechanic tells him that isn't possible since he has been building/servicing race cars for years and that since it worked in the last car, this new car which despite being different in many ways should also behave the same. It would never happen.

Blizzard have a financial incentive to make the game easier to play for Terrans, more so than pros have a financial incentive to whine about their race being UP. You don't even need to dig that deep to realise this is likely the case. The level of polish that Terran have received is far and away greater than that of the other two races. People are asking for the other two races to get the attention to detail that they deserve because, after all, mirror matches are not generally something everyone wants to see. Blizzard knows that people who are new to the game will likely choose to play as Terran online, and as such they don't want them to have a hard time getting some degree of proficiency at the game.
hmsrenown
Profile Joined July 2010
Canada1263 Posts
August 20 2010 16:49 GMT
#248
I think the methodology is sound for what a StarCraft 2 player can fathom. Criticizing the methodology is fine, because he has not taken the matchmaking system into account. Plus, to the people who just come in and say "terran is imba": the game is not perfectly balanced, so freaking what? Look at the streaky SC:BW win rates, plz.
Thenas
Profile Joined May 2010
Sweden107 Posts
August 20 2010 18:58 GMT
#249
If this includes data from the beta period, you have done a whole lot of work for nothing, since those games contributed to "balancing" the scales, as in "that race is OP so we nerf it", which alters the win/loss %.

Is it retail only, or beta data as well?

Moosie
Profile Joined August 2010
United States44 Posts
August 20 2010 19:11 GMT
#250
interesting way to see it..
Orange Goblin
Profile Joined May 2010
218 Posts
August 20 2010 20:51 GMT
#251
On August 21 2010 01:49 hmsrenown wrote:
I think the methodology is sound for what a StarCraft 2 player can fathom. Criticizing the methodology is fine, because he has not taken the matchmaking system into account. Plus, to the people who just come in and say "terran is imba": the game is not perfectly balanced, so freaking what? Look at the streaky SC:BW win rates, plz.


He has taken the MMS into account, that's not really the issue.

The problem is that he completely ignores the players' subjectivity and level of skill. Not to mention that "balance" is completely irrelevant except at the very highest level of skill (which isn't even that high yet, which is why we see someone like Idra being consistently better than most other players).

People need to realise that even the very best players aren't even close to maxing out their potential in pure mechanics. A handful are getting quite good, but still, it's not like you can use some limited dataset based off completely random matches and think you'll get a result that matters. It's like telling a bunch of random people to complete a few laps in Monte Carlo with three different F1 cars, then using the data to determine which car is the best, or if there are any balance problems. It's ridiculous.

Looking at Diamond league as one entity in itself is dubious, at best, and even looking at the ladder in the first place is questionable.

To get any kind of significant data you would need to get players at the very top that have a similar skill-level in terms of mechanics. Them being similarly skilled in micro would also help. Then these would have to play a significant number of games (ideally, thousands), so that you could investigate individual match-ups.
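To put a rough number on "a significant number of games", here is a minimal sketch. It assumes every game is an independent coin flip with a fixed win probability, which is a simplification (real ladder games are neither independent nor evenly matched), and the function names are made up for illustration. It uses the textbook normal-approximation interval for a proportion.

```python
import math

def win_rate_interval(wins, games, z=1.96):
    """Approximate 95% confidence interval for a win rate (normal approximation)."""
    p = wins / games
    half_width = z * math.sqrt(p * (1 - p) / games)
    return p - half_width, p + half_width

def games_needed(margin, z=1.96, p=0.5):
    """Games needed so the interval half-width is at most `margin` around rate `p`."""
    return math.ceil(z * z * p * (1 - p) / (margin * margin))

# 55 wins out of 100 games: the interval still contains 50%, so it proves nothing.
print(win_rate_interval(55, 100))
# 5500 wins out of 10000 games: the interval now sits clearly above 50%.
print(win_rate_interval(5500, 10000))
# Games needed to pin a win rate down to within +/- 2 percentage points.
print(games_needed(0.02))
```

Under these assumptions, a 55% win rate over 100 games is indistinguishable from an even matchup, and narrowing the uncertainty to about two percentage points takes roughly 2,400 games per match-up, which is in line with the "thousands" suggested above.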

Also, there is the issue of understanding that a lot of the complaining stems from the fact that at a higher level, Zerg is a purely reactive race in a game of advanced RPS (which is extremely macro-based in comparison to any other RTS). Zerg hasn't got any viable aggressive options early game, which leads to a bunch of different issues. Balance-wise, looking at statistics, the effect is probably still somewhat subtle in comparison to what high-level Zerg players feel when they know that they have to constantly be on their toes to even stand a chance of winning, while Terrans can basically relax and lean on the opponent to win. And here is the very important bit: they haven't really got any incentive to do anything more than lean. In other words, the nature of the match-ups vs Zerg is that the opponent always has more options and can always play safer. Will this regularly show up in statistics? With time in high-level play, sure. Across the board with all kinds of idiots playing? Not a chance.
Arkons.pbc
Profile Joined May 2010
United States9 Posts
Last Edited: 2010-08-20 21:05:46
August 20 2010 20:57 GMT
#252
I'd prefer to see a study of diamond only. I think we'd see the Zerg win rate decreasing as you got higher and higher in diamond league.

I never see sub-30 apm toss or zerg players in diamond ladder. Plenty of Terrans in that category though ... and they can still beat me =(
Zhobes
Profile Joined September 2009
United States6 Posts
August 20 2010 21:54 GMT
#253
I'm getting a little puzzled by a lot of these responses. The Diamond-only results are on the chart, and in fact the first thing you read about. The MMS was taken into account (sort of) by saying that an imbalance would stratify the races, and you'd see Zerg sinking lower in the leagues. He doesn't give the data showing the lack of stratification, admittedly, but these responses are acting like he hasn't addressed these questions at all.

Additionally, I think the arguments Zerg players are making in the ZvT matchup are valid, and I've followed these discussions with interest, but here he's taking the numbers, which are in fact the only measurable data in this debate, and concluding what they show.

I don't think I'm good enough at StarCraft to really comment on the balancing issues, but please stop attacking the OP for neglecting details that he covered adequately. Even if you don't read all the responses, at least finish reading the first post before beginning your assault.
nam nam
Profile Joined June 2010
Sweden4672 Posts
Last Edited: 2010-08-20 22:04:16
August 20 2010 22:03 GMT
#254
On August 21 2010 06:54 Zhobes wrote:
I'm getting a little puzzled by a lot of these responses. The Diamond-only results are on the chart, and in fact the first thing you read about. The MMS was taken into account (sort of) by saying that an imbalance would stratify the races, and you'd see Zerg sinking lower in the leagues. He doesn't give the data showing the lack of stratification, admittedly, but these responses are acting like he hasn't addressed these questions at all.

Additionally, I think the arguments Zerg players are making in the ZvT matchup are valid, and I've followed these discussions with interest, but here he's taking the numbers, which are in fact the only measurable data in this debate, and concluding what they show.

I don't think I'm good enough at StarCraft to really comment on the balancing issues, but please stop attacking the OP for neglecting details that he covered adequately. Even if you don't read all the responses, at least finish reading the first post before beginning your assault.


Exactly, and there is the problem. The conclusions are wrong imo. I give him credit for doing the math, but I don't agree with a lot of his conclusions. You simply can't use this data to prove or disprove imbalance unless you have more information.
Original banner artwork: Jim Warren
The contents of this webpage are copyright © 2025 TLnet. All Rights Reserved.