race v race statistics based on 551 "top" replays

bingobango

26 Posts

September 21 2010 12:49 GMT

TLDR: NOT SCIENTIFIC. That said.... happy patch day zerg.

So I've been working on a replay aggregation site focusing on top players in all gateways. It's been running for a few weeks now and I have 551 "top" player replays (in non-mirror matches). I've chosen to aggregate only sites that have top replays, so top is fairly loosely defined here (but suffice to say it's all very high diamond).

This is seriously not scientific but fun nonetheless.

Here's the breakdown:

zvt
Zerg • 72 wins • 41.86%
Terran • 100 wins • 58.14%
based on: 172 replays

zvp
Zerg • 42 wins • 35.29%
Protoss • 77 wins • 64.71%
based on: 119 replays

pvt
Protoss • 114 wins • 43.85%
Terran • 146 wins • 56.15%
based on: 260 replays

Here's my current stats page:

http://replayspider.com/stats/

I want to flesh this out a bit more (esp. break it down by gateway), but if you have any other ideas or feedback, I'd love to hear it.

ChickenLips

2912 Posts

September 21 2010 12:51 GMT

AMIGAWD PvZ is imbalanced

Zerg • 42 wins • 35.29%
Protoss • 77 wins • 64.71%

bleh
a bigger sample and this might actually be interesting :/

bingobango

26 Posts

September 21 2010 12:52 GMT

The sample is too small, I agree, which is why I didn't release this earlier. v1.0 ends today so it won't be getting any bigger. It'll be fully automated though, so hopefully throughout v1.1 the sample size will get much larger.

comis

United States333 Posts

September 21 2010 12:54 GMT

First off, don't let the ensuing statistics argument deter you, I like this idea. However, most replay sites include game length and map which I think would be pretty nifty info to have as well. I think it's a cool idea and would like to see your report fleshed out and maybe prettied a bit for public consumption.

EDIT: Can't forget version # as well - it'd be nice to see a history of stats through patches rather than just stats off the current patch pool of replays / clumping them all together

Black Gun

Germany4482 Posts

September 21 2010 12:56 GMT

the sample size itself would be fine under certain conditions. the problem i see here is that many of the players created more than one replay in this sample size, so the influence of some few outstanding players is very big. and mindgames play a big role when the same 2 players play again and again, often times in a row in some tournament boX series.

but im not really surprised to see pvz even more imba than tvz. didnt idra once say that pvz is currently considered the most imbalanced matchup in korea?
im also not surprised to see a tvp imba not too far away from the tvz imba. basically, all the Z whining about tvz has overshadowed the fact that pvz and tvp are slightly imbalanced as well.

Krohm

Canada1857 Posts

September 21 2010 13:00 GMT

zvp
Zerg • 42 wins • 35.29%
Protoss • 77 wins • 64.71%
based on: 119 replays

WAT

I find that hard to believe, I have a much easier time in ZvP than any other MU when I am zerg.

The rest doesn't surprise me at all though.

GreEny K

Germany7312 Posts

September 21 2010 13:00 GMT

I'd say that those are surprisingly close considering the game just came out. Just a little tweaking required, either that or some races have more developed strategies than others.

GreEny K

Germany7312 Posts

September 21 2010 13:01 GMT

On September 21 2010 22:00 Krohm wrote:
zvp
Zerg • 42 wins • 35.29%
Protoss • 77 wins • 64.71%
based on: 119 replays

WAT

I find that hard to believe, I have a much easier time in ZvP than any other MU when I am zerg.

The rest doesn't surprise me at all though.

Maybe you just have a knack for ZvP?

x7i

United Kingdom122 Posts

September 21 2010 13:08 GMT

and who is uploading those replays? do you have access to every game those players play, or are those: look how badly i got beaten/awesome i am random replays ?
if anything try analysing replays from tourneys

Zarahtra

Iceland4053 Posts

September 21 2010 13:11 GMT

#10

That's interesting and will deffo be more interesting with a larger sample size. Like someone above me said, having further analyses such as win% after x time length would make this even better (if it was possible)

Edit: This cannot be taken too seriously though, at least not until very very big sample size.

kickinhead

Switzerland2069 Posts

September 21 2010 13:19 GMT

#11

On September 21 2010 21:51 ChickenLips wrote:
AMIGAWD PvZ is imbalanced

Zerg • 42 wins • 35.29%
Protoss • 77 wins • 64.71%

bleh
a bigger sample and this might actually be interesting :/

pvz is just as bad as tvz, just cuz Terran is the topic Nr.1, noone talks about how bad pvz is.

tvp is messed up as well: T is imba early to midgame and toss steamroll them in a macrogame...

Too_MuchZerg

Finland2818 Posts

September 21 2010 13:24 GMT

#12

had some extra games added, removed because data is wrong.

iPlaY.NettleS

Australia4383 Posts

September 21 2010 13:29 GMT

#13

On September 21 2010 22:24 Too_MuchZerg wrote:
All TLPD games from international version (9 maps used)

TvZ: 163-107 (60.4%)
ZvP: 66-90 (42.3%)
PvT: 212-216 (49.5%)

Mirrors:
TvT: 423
ZvZ: 12
PvP: 118

Total games 2261 :

1121 (49.6%)

438 (19.4%)

702 (31.0%)

are these stats legit?
423 TvT and only 12 ZvZ and 118 PvP?
that alone signifies imbalance i think , so many people wanting to play terran

TheFinalWord

Australia790 Posts

September 21 2010 13:32 GMT

#14

Maybe it's a psychological effect of everyone complaining about zvt spiraling into a neverending loop of more complaining and perceived imbalance... or maybe it's just a small sample size.
edit:

On September 21 2010 22:29 iPlaY.NettleS wrote:
that alone signifies imbalance i think , so many people wanting to play terran

lol, no. People uploading lots of tvt's to liquipedia does not signify imbalance.

Black Gun

Germany4482 Posts

September 21 2010 13:35 GMT

#15

On September 21 2010 22:29 iPlaY.NettleS wrote:

are these stats legit?
423 TvT and only 12 ZvZ and 118 PvP?
that alone signifies imbalance i think , so many people wanting to play terran

i vouch this. the high amount of tvts can be explained for example (not necessarily) by terran players being more successful in tournaments, and thus meeting in tvts in the semis and finals of these tournaments.

just like the most obvious indicator of battle royal´s zerg imbalance was not the 10-2 zvt stats but that there were 30-something zvzs compared to some 15 non-zvzs combined.

Kinky

United States4126 Posts

September 21 2010 13:35 GMT

#16

Seems like the whining about ZvT overshadows the real problems in ZvP

kickinhead

Switzerland2069 Posts

September 21 2010 13:38 GMT

#17

On September 21 2010 22:35 Kinky wrote:
Seems like the whining about ZvT overshadows the real problems in ZvP

nah, the problems for Z in both MU's are legit, I rly don't understand why no1 rly whines about toss...

Too_MuchZerg

Finland2818 Posts

September 21 2010 13:40 GMT

#18

On September 21 2010 22:29 iPlaY.NettleS wrote:

are these stats legit?
423 TvT and only 12 ZvZ and 118 PvP?
that alone signifies imbalance i think , so many people wanting to play terran

data was wrong

hmunkey

United Kingdom1973 Posts

September 21 2010 13:41 GMT

#19

Random replays don't mean anything. That said, I do feel z is underpowered based on total ladder results and tournament placements.

bbulzibar

United States80 Posts

September 21 2010 13:44 GMT

#20

Love the site, I would like to see a breakdown by map too! Also, is the same replay (uploaded to multiple sites) counted as unique games? If so, maybe there is a way to filter down to unique games based on map/players/game length.

CruelZeratul

Germany4588 Posts

September 21 2010 13:46 GMT

#21

zvp
Zerg • 42 wins • 35.29%
Protoss • 77 wins • 64.71%
based on: 119 replays

Didn't expect that.

LaLuSh

Sweden2358 Posts

September 21 2010 13:49 GMT

#22

On September 21 2010 22:38 kickinhead wrote:

nah, the problems for Z in both MU's are legit, I rly don't understand why no1 rly whines about toss...

There's a pretty clear problem in ZvP and it's that zealots build too fast and don't allow zerg to power drones, ever (vs a good player). Even at high diamond this imbalance isn't noticeable. Only at the very top do you find Protoss players who abuse this properly, with well timed and thought out transitions.

Of course, stalker/colossi are a huge problem too. But hopefully the extra 20 hydras you'll have due to not losing 10+ drones to early zealot pressure will help out in the first big battle. Or the extra 7-8 corruptors.

In ZvT there's nothing specific you can pinpoint the imbalance on.

Adeny

Norway1233 Posts

September 21 2010 13:53 GMT

#23

On September 21 2010 22:19 kickinhead wrote:

pvz is just as bad as tvz, just cuz Terran is the topic Nr.1, noone talks about how bad pvz is.

tvp is messed up as well: T is imba early to midgame and toss steamroll them in a macrogame...

YESSSSSS THANK YOU THANK YOU SO MUCH. I thought I was the only one who had an easier time in ZvT than ZvP. Protoss is sooo much stronger early and mid game, only if they don't attack (read: stupid) until you have 3/3/3 upgrades and 200/200 pure ultras does zerg stand a chance. Oh and sorry for derailing.

Regarding the stats though, I'm not really surprised however it doesn't really say much, more of an advert for your website which is fine I guess. Either way, happy patch day zergs, hope we get something nice. :3

bingobango

26 Posts

September 21 2010 13:54 GMT

#24

On September 21 2010 22:41 hmunkey wrote:
Random replays don't mean anything.

I disagree. In fact, my replays are decidedly non-random, which is the problem. If they actually were properly random, it would be much more significant.

On September 21 2010 22:44 bbulzibar wrote:
Love the site, I would like to see a breakdown by map too! Also, is the same replay (uploaded to multiple sites) counted as unique games? If so, maybe there is a way to filter down to unique games based on map/players/game length.

My aggregator has dupe detection (and successfully handles modified replays with chat-ads), so on whichever site I find the replay first gets the "credit", and the newer one goes into the bin. These 551 replays are unique.
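The map/players/game-length idea suggested above can be sketched in a few lines. This is only an illustration of that suggestion, not how the aggregator actually detects duplicates; the `Replay` fields, the sample records, and the site names are all hypothetical.

```python
from typing import NamedTuple


class Replay(NamedTuple):
    # Hypothetical fields chosen for this sketch, not the real site's schema.
    site: str
    map_name: str
    players: tuple
    length_seconds: int


def fingerprint(replay):
    """Key that stays the same when one game is uploaded to several sites."""
    names = tuple(sorted(name.lower() for name in replay.players))
    return (replay.map_name.lower(), names, replay.length_seconds)


def deduplicate(replays):
    """Keep the first replay seen for each fingerprint, in input order."""
    seen = set()
    unique = []
    for replay in replays:
        key = fingerprint(replay)
        if key not in seen:
            seen.add(key)
            unique.append(replay)
    return unique


# Made-up records: the first two describe the same game on different sites.
sample = [
    Replay("site-a", "Metalopolis", ("PlayerOne", "PlayerTwo"), 845),
    Replay("site-b", "metalopolis", ("playertwo", "playerone"), 845),
    Replay("site-a", "Xel'Naga Caverns", ("PlayerOne", "PlayerThree"), 1210),
]
unique = deduplicate(sample)
print(len(sample), "replays,", len(unique), "unique")
```

Because the first occurrence wins, the site that was crawled first keeps the "credit", which matches the behaviour described above. A key like this would miss re-uploads whose reported length differs, so it is a starting point rather than a robust check.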

As for the actual site, yes, I have filters for seeing games by certain players, races, and matchups. I'm going to add game-length and map soon.

lastmotion

368 Posts

September 21 2010 13:54 GMT

#25

On September 21 2010 22:41 hmunkey wrote:
Random replays don't mean anything. That said, I do feel z is underpowered based on total ladder results and tournament placements.

This. I am sure the OP was biased in picking out replays to make it seem like ZvP more skewed than ZvT and TvP.

There is no way ZvP data is that bad, it's the most balanced SC2 matchup.

Define the top players. From where? By top 550, do you mean consecutively without skipping?

I have high suspicion about the way these replays were picked out / data was made.

Rea

Germany88 Posts

September 21 2010 13:56 GMT

#26

On September 21 2010 22:29 iPlaY.NettleS wrote:

423 TvT and only 12 ZvZ and 118 PvP?
that alone signifies imbalance i think , so many people wanting to play terran

there are more P than T, not in top 551 but overall

and yes, that makes this statistic even worse in terms of balance

bingobango

26 Posts

September 21 2010 14:00 GMT

#27

This. I am sure the OP was biased in picking out replays to make it seem like ZvP more skewed than ZvT and TvP.

I didn't pick replays by hand. These are replays aggregated from 5 sites over the past several weeks that I chose because they had 1) good geographic coverage 2) frequent updates 3) top players.

You can see the sites I used here: www.replayspider.com/about/

Being a bit of a replay junky, I'd say the selection of replays from these 5 sites has really good coverage of the entire "top player" replay scene. If there's replays missing or a site that has replays that I am missing, I'd love to know about it.

There is no way ZvP data is that bad, it's the most balanced SC2 matchup.

I agree with the first part, not so sure about the second part. The selection bias + small sample size makes it a bit squirrely, but it's better than nothing.

Define the top players. From where? By top 550, do you mean consecutively without skipping?

"top" player in this case means whatever the maintainers of the site in question mean by "top" when they upload their replays. You can look through the replays yourself and see what qualifies. I've put the rankings (and sc2rank regional ranks) by each replay.

Sleight

2471 Posts

September 21 2010 14:01 GMT

#28

Hey y'all,

Before this debate turns into some kind of statistical pissing match, I thought I'd link a useful post I made so we can discuss this properly: http://www.teamliquid.net/forum/viewmessage.php?topic_id=153500

I would appreciate seeing actual statistical tests for significance on any of these values. My intuition is that most of these are statistically significant, but I can't be sure without someone actually doing the math.

How does this data hold up to Chi-squared analysis? I suspect that it shows almost perfect balance of the 3 race's overall win percentages.
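For anyone who wants to run the numbers, here is a minimal sketch of a chi-squared goodness-of-fit test on each matchup from the opening post, against the null hypothesis of a 50/50 split. It assumes every replay is an independent game, which this thread has good reasons to doubt, so treat the p-values as a lower bound on the uncertainty rather than a verdict. The function name is made up for the example; only the standard library is used.

```python
import math

# Win counts from the opening post (non-mirror replays only).
matchups = {
    "ZvT": (72, 100),   # Zerg wins, Terran wins
    "ZvP": (42, 77),    # Zerg wins, Protoss wins
    "PvT": (114, 146),  # Protoss wins, Terran wins
}


def chi_squared_vs_even(wins_a, wins_b):
    """Goodness-of-fit test against a 50/50 split (1 degree of freedom)."""
    expected = (wins_a + wins_b) / 2
    chi2 = ((wins_a - expected) ** 2 + (wins_b - expected) ** 2) / expected
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


results = {name: chi_squared_vs_even(a, b) for name, (a, b) in matchups.items()}
for name, (chi2, p_value) in results.items():
    print(f"{name}: chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

A small p-value here only says the split in this particular pile of replays is unlikely to be a coin flip. It says nothing about whether the pile was collected in a biased way.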

Wihl

Sweden472 Posts

September 21 2010 14:04 GMT

#29

Zerg in TLOpen:
Round of 512: 144
Round of 256: 90
Round of 128: 43
Round of 64: 17
Round of 32: 5
Round of 16: 1
Round of 8: 1
Round of 4: 0
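The raw counts are easier to read as a share of each round. A quick sketch, assuming every slot in every round was filled (byes or no-shows would shift the shares slightly):

```python
# Zerg players remaining in each TLOpen round, as listed above.
zerg_by_round = {512: 144, 256: 90, 128: 43, 64: 17, 32: 5, 16: 1, 8: 1, 4: 0}

# Assumption: the round size is the number of players in that round.
zerg_share = {size: count / size for size, count in zerg_by_round.items()}

for size, share in zerg_share.items():
    print(f"Round of {size}: {share:.1%} Zerg")
```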

bingobango

26 Posts

September 21 2010 14:04 GMT

#30

You can't blame me! I tried to put that fire out with my first 3 words

I would appreciate seeing actual statistical tests for significance on any of these values. My intuition is that most of these are statistically significant, but I can't be sure without someone actually doing the math.

How does this data hold up to Chi-squared analysis? I suspect that it shows almost perfect balance of the 3 race's overall win percentages.

Doesn't this conversation start and end by saying the sampling isn't random? There's several really strong biases in this data, because the definition of "top" is inconsistent, humans are uploading these replays and considering them "good enough for upload" and so on.

I've removed some of the noise by only including "top" replay sections of popular sites, but still, drawing serious statistical conclusions from this data is inadvisable.

lastmotion

368 Posts

September 21 2010 14:10 GMT

#31

On September 21 2010 23:00 bingobango wrote:

But you picked out the sites by hand. List out the 5 sites you used and why we should take the data from those sites seriously.

On September 21 2010 23:00 bingobango wrote:
"top" player in this case means whatever the maintainers of the site in question mean by "top" when they upload their replays. You can look through the replays yourself and see what qualifies. I've put the rankings (and sc2rank regional ranks) by each replay.

You need to look at tournament wins and professional gaming rather than wins by casual gamers. For example, you can collect thousands of ICCUP D+ games in the PvT matchup and notice that Protoss has more wins than Terran but that doesn't say anything about balance. It just means that Protoss at D+ level is easier to play than Terran.

One last important note:
I noticed that your sample size for each matchup was different. This is a huge flaw. When sample size gets smaller and smaller, it is easier for the percentage to be heavily swung to one side.

For example, compare a record of 4 wins in 10 games with a record of 745 wins in 1500 games. The latter is 49.7% while the former is 40%. The lack of a measly 1 win in the first record made a whopping 10% difference.

So the higher the sample size is, the more it tends to equalize. And your website shows that you used the smallest sample size for ZvP and different sample sizes for all matchups. This data is bad.

Drakmore

United States9 Posts

September 21 2010 14:10 GMT

#32

I thought the same thing, but im not a diamond player so these "top" statistics dont really apply to me so.

bingobango

26 Posts

September 21 2010 14:28 GMT

#33

But you picked out the sites by hand. List out the 5 sites you used and why we should take the data from those sites seriously.

I showed you the sites already. Unless you are ready to accuse them of only posting replays where zerg lose, it's probably safe to say there's no conspiracy here to "prove" that zerg are underpowered.

You need to look at tournament wins and professional gaming rather than wins by casual gamers.

You can also thumb through the replay list yourself, and see the types of players in it. I'd describe the players, collectively, as many things, but I'm not sure "casual" would make the list.

One last important note:
I noticed that your sample size for each matchup was different. This is a huge flaw. When sample size gets smaller and smaller, it is easier for the percentage to be heavily swung to one side.

The first statement is pretty much false. Different sample size is not a flaw, at all. Small sample size is, however. This was stated from the outset and this was posted now because v1.0 goes away. The sample size ain't getting any bigger. Starting today I'll be doing v1.1 replays and starting over.
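The difference between "different" and "small" sample sizes shows up if you put a rough margin of error on each win rate. The sketch below uses the normal approximation to a binomial proportion and assumes independent games; the function name is made up for the example, and the approximation is poor for something as small as 10 games, so read that row as "very wide" rather than as an exact figure. The "4 of 10" and "745 of 1500" rows take the example from earlier in the thread to mean wins out of total games.

```python
import math


def margin_of_error(wins, games, z=1.96):
    """Half-width of the normal-approximation 95% interval for a win rate."""
    p = wins / games
    return z * math.sqrt(p * (1 - p) / games)


# (label, wins, games): the three matchups from the opening post, plus the
# two example records mentioned earlier in the thread.
records = [
    ("ZvP, Zerg wins", 42, 119),
    ("ZvT, Zerg wins", 72, 172),
    ("PvT, Protoss wins", 114, 260),
    ("4 of 10", 4, 10),
    ("745 of 1500", 745, 1500),
]

margins = {label: margin_of_error(wins, games) for label, wins, games in records}
for label, wins, games in records:
    print(f"{label}: {wins / games:.1%} +/- {margins[label]:.1%}")
```

The three matchups have different sample sizes and therefore different margins, but each margin can be computed and compared on its own. What hurts is the width of the interval at small n, not the fact that n varies between matchups.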

This data is bad.

I don't want to get all theoretical on you, but data cannot be bad. It just is. Only bad conclusions can be drawn from data, and given the opening two words of my post, you can't say you weren't warned. I'd say you might be taking it a bit too seriously.

Santi

Colombia466 Posts

September 21 2010 15:02 GMT

#34

game is balanced imo, we just need better maps.

refraxion

Canada88 Posts

September 21 2010 15:06 GMT

#35

On September 21 2010 22:38 kickinhead wrote:

nah, the problems for Z in both MU's are legit, I rly don't understand why no1 rly whines about toss...

I agree, seems like everyone is happy to jump on the T bandwagon, yet there is still toss in the corner who is arguably just as OP in some respects.

Rokk

United States425 Posts

September 21 2010 15:09 GMT

#36

On September 22 2010 00:02 Santi wrote:
game is balanced imo, we just need better maps.

I think the fact that patch 1.1 is coming out today proves you wrong.

Triscuit

United States722 Posts

September 21 2010 15:11 GMT

#37

According to what Tasteless and Artosis are saying during the GSL, Koreans find ZvP to be much more difficult than ZvT.

ooni

Australia1498 Posts

September 21 2010 15:14 GMT

#38

On September 22 2010 00:09 Rokk wrote:

I think the fact that patch 1.1 is coming out today proves you wrong.

maybe or maybe not or maybe so... or maybe Blizzard wants us to keep playing on steppes of war, and wants to balance the races to suit the current ladder pool. Yeah my blood just started boiling... then I started palming my face

Black Gun

Germany4482 Posts

September 21 2010 15:20 GMT

#39

On September 21 2010 23:28 bingobango wrote:

I don't want to get all theoretical on you, but data cannot be bad. It just is. Only bad conclusions can be drawn from data, and given the opening two words of my post, you can't say you weren't warned. I'd say you might be taking it a bit too seriously.

i wouldnt say data cannot be bad. data either is appropriate for the purpose it is intended for or it is not. if a certain dataset is generally inappropriate for a certain statistical analysis, then u can say the data is bad (for this purpose).

the biggest issue i got with ur analysis is that u excluded mirror matches. if one race is dominant, then this race is more likely to advance far into the tournaments, which results in a higher amount of mirror matches of this race compared to mirrors of the other races. i once again refer to the example of the bw map "battle royal": http://www.teamliquid.net/tlpd/korean/maps/201_Battle_Royal

on the other hand, the amount of players of a race also plays a role. for example: lets assume there were only 2 races, toss and terran. 2/3rd of all players play protoss, but terran has a 75% chance to win a tvp. then a tourney with 512 players will usually see terran-dominated final rounds, with lots of tvt going on there. but because there are initially more protoss players, there are overall more pvps in this tournament.

so basically, a statistically sound analysis of replays would have to account for the difference between the amount of mirror matches and the amount of players of each respective race. but including mirror matches somehow would be a good start anyway.
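The two-race thought experiment above can be checked with a quick simulation. This is a sketch under the stated assumptions only: 512 players, two thirds of them Protoss, Terran winning 75% of PvT games, single elimination, and pairings reshuffled at random every round (a real bracket is fixed in advance). The function names and the choice of "last four rounds" as the late stage are made up for the example.

```python
import random


def simulate_tournament(rng, n_players=512, protoss_share=2 / 3, terran_win=0.75):
    """Single-elimination bracket with random pairings each round.

    Returns one dict of match counts (PvP, TvT, PvT) per round.
    """
    n_protoss = round(n_players * protoss_share)
    players = ["P"] * n_protoss + ["T"] * (n_players - n_protoss)
    rounds = []
    while len(players) > 1:
        rng.shuffle(players)
        counts = {"PvP": 0, "TvT": 0, "PvT": 0}
        winners = []
        for a, b in zip(players[0::2], players[1::2]):
            if a == b:
                counts[f"{a}v{b}"] += 1  # mirror: that race always advances
                winners.append(a)
            else:
                counts["PvT"] += 1
                winners.append("T" if rng.random() < terran_win else "P")
        rounds.append(counts)
        players = winners
    return rounds


def average_mirrors(n_tournaments=200, seed=0, late_rounds=4):
    """Average mirror counts per tournament, overall and in the last rounds."""
    rng = random.Random(seed)
    total = {"PvP": 0, "TvT": 0}
    late = {"PvP": 0, "TvT": 0}
    for _ in range(n_tournaments):
        rounds = simulate_tournament(rng)
        for i, counts in enumerate(rounds):
            for key in total:
                total[key] += counts[key]
                if i >= len(rounds) - late_rounds:
                    late[key] += counts[key]
    overall = {key: value / n_tournaments for key, value in total.items()}
    final_rounds = {key: value / n_tournaments for key, value in late.items()}
    return overall, final_rounds


overall, final_rounds = average_mirrors()
print("mirrors per tournament, all rounds:", overall)
print("mirrors per tournament, last 4 rounds:", final_rounds)
```

Averaging over many simulated tournaments smooths out the luck of any single bracket. Under these assumptions the late rounds come out dominated by TvT while PvP still leads in the overall mirror count, which is the point being made: raw mirror counts mix race strength with race popularity.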

dudeman001

United States2412 Posts

September 21 2010 15:25 GMT

#40

The Zerg numbers aren't as skewed as I thought they might be. Still, this would be much easier if Blizzard gave us some of this information which I'd expect them to have.
Or maybe not, since a 49% to 51% MU could be enough to set off the imba trigger in whiner nerds.

Uranium

United States1077 Posts

September 21 2010 16:16 GMT

#41

I am looking at the list of player names in the replay aggregator, and these are definitely all top players. I'd say the sample is a pretty accurate representation of the "pro scene" right now. Of course it is small, but it's lonely at the top as they say.

If there are the least number of ZvP replays, it's because ZvP represents the least portion of matches played at the pro level (or any level, probably). This is natural due to the lower popularity of Z and P vs T. In fact the numbers of replays correspond exactly with the popularity of the races:
TvP > TvZ > PvZ
T > P > Z

I think this data is pretty good, assuming these replay sites aggregate all matches from tournaments equally (which they probably do). Seems like a pretty damning statement about how the progamers feel about Zerg. Hopefully the patch today will change some things?

ionlyplayPROtoss

Canada573 Posts

September 21 2010 16:17 GMT

#42

Basically Terran>zerg and protoss and protoss>zerg

NOt really surprised but i think zerg needs buff.

Deleted User 3420

24492 Posts

September 21 2010 16:18 GMT

#43

wow, people are actually looking at this data like it means something

seriously?

Chairman Ray

United States11903 Posts

September 21 2010 16:19 GMT

#44

sample size is a bit small, but great idea. I would love to see this extend to 10,000 replays.

Ketara

United States15065 Posts

September 21 2010 16:22 GMT

#45

This kind of a crawler needs to be in place for the upcoming patch, so two months from now we have several thousand high diamond games to look at, and can be looking at map specific data in addition to matchup specific data.

MrBitter

United States2940 Posts

September 21 2010 16:23 GMT

#46

Very cool concept. I'm looking forward to seeing your data after you've logged a few thousand replays.

Shikyo

Finland33997 Posts

September 21 2010 16:28 GMT

#47

On September 22 2010 01:18 travis wrote:
wow, people are actually looking at this data like it means something

seriously?

This data alone doesn't prove much. However, since it reinforces the data of about 500 different sources of some kind, it all adds up, now doesn't it?

Lefnui

United States753 Posts

September 21 2010 16:28 GMT

#48

On September 22 2010 01:18 travis wrote:
wow, people are actually looking at this data like it means something

seriously?

Well it's very consistent with the state of balance, so yeah, seriously.

EliteAzn

United States661 Posts

September 21 2010 16:38 GMT

#49

I like the idea and effort put into this research. Have fun w/ it after the patch and looking forward to the data. Be careful about duplicate matches (mentioned above). Other than that, nice work and it's nice to see I'm not the only one having zvp troubles...

Gigaudas

Sweden1213 Posts

September 21 2010 16:41 GMT

#50

You can't analyze balance on the ladder as people will always have a fairly balanced win ratio. You have to look at where people are placed on the ladder - the race ratio in the top of the ladder and in the last rounds of tournaments are what proves imbalance.

This also means that a race SHOULD be doing better than the other races for a short while after the buff as the players will win more until they're on the same rank as players who play the races that used to be more powerful. If they don't, then they weren't buffed enough.

Sideburn

United States442 Posts

September 21 2010 16:52 GMT

#51

I retch every time I hear "Sample too small".

Less than 30 might be considered a small sample size.

Nightfall.589

Canada766 Posts

September 21 2010 17:13 GMT

#52

The problem is not that it's too small.

The problem is that it's not a random sample.

STATS 101, people.

IPS.Mardow.

Germany713 Posts

September 21 2010 17:18 GMT

#53

ZvP slowly seems to become harder than ZvT. At least for me and people I talked to (Darkforce for example)

Deleted User 3420

24492 Posts

September 21 2010 17:19 GMT

#54

On September 22 2010 02:13 Nightfall.589 wrote:
The problem is not that it's too small.

The problem is that it's not a random sample.

STATS 101, people.

THIS

(and honestly the sample is small, too. but that's not the primary problem)

TehForce

1072 Posts

September 21 2010 17:27 GMT

#55

That's not a very clever way to determine balance. Normally people will post replays where they win, not ones where they lose horribly. Even the players who post replays of their losses normally post many more where they are winning.

So its clear that T>P>Z seems to be the case because the number of players are also T>P>Z....

Cheerio

Ukraine3178 Posts

September 21 2010 17:31 GMT

#56

When I was random my races were T>P>>Z. So the result is just what my experience has been.

On September 22 2010 02:13 Nightfall.589 wrote:
The problem is not that it's too small.

The problem is that it's not a random sample.

lol. The only thing that is not random is that they are taken from the highest level play and not from bronze to diamond altogether. Yeah, let's value the balance by absolutely random replays.

dRaW

Canada5744 Posts

September 21 2010 17:31 GMT

#57

TvZ 60%
PvZ 58%
PvT 49.5%

I thought TvP was imba with all those marine rush wins in GSL, but I guess it was much more balanced. I'm actually surprised tho that more people aren't changing to zerg, you would think more people want to play a race less people play

[Or take on the challenge]

FabledIntegral

United States9232 Posts

September 21 2010 17:34 GMT

#58

On September 21 2010 22:24 Too_MuchZerg wrote:
NOT RELATED to bingobangos replay data

All TLPD games from international version (9 maps used)

TvZ: 163-107 (60.4%)
ZvP: 66-90 (42.3%)
PvT: 212-216 (49.5%)

Mirrors:
TvT: 423
ZvZ: 12
PvP: 118

Total games 2261 :

1121 (49.6%)

438 (19.4%)

702 (31.0%)

Remember TLPD counts only from RO16/RO8 and forward if online cup format. LAN games TLPD tries to add all games

omgwut

zeru

8156 Posts

September 21 2010 17:47 GMT

#59

--- Nuked ---

FabledIntegral

United States9232 Posts

September 21 2010 17:53 GMT

#60

On September 22 2010 02:47 zeru wrote:
I'd say data like the one here:
http://sc2ranks.com/stats/region/all/all/all
is more trustworthy than non random small number of replays.

Why in the world would you want a random sample of players? And the sample size he used isn't by any means small.

QueueQueue

Canada1000 Posts

September 21 2010 17:58 GMT

#61

Need a bigger sample size for any definitive analysis.

Deadlyfish

Denmark1980 Posts

September 21 2010 18:00 GMT

#62

I find it kinda silly that whenever we see statistics that TvZ is imbalanced, a lot of people say that it is correct and that it proves that there is an imbalance.

But whenever we see data like this, which actually says that ZvP is imbalanced, the same people call this data bad, and useless. And it's the exact same people who said that the "no zergs in top 20" statistic was super useful and showed a clear imbalance.

It's like the roles have switched places.

Not saying that this information is or isn't useful, it's just a funny observation

QueueQueue

Canada1000 Posts

September 21 2010 18:05 GMT

#63

On September 22 2010 03:00 Deadlyfish wrote:
I find it kinda silly that whenever we see statistics that TvZ is imbalanced, a lot of people say that it is correct and that it proves that there is an imbalance.

But whenever we see data like this, which actually says that ZvP is imbalanced, the same people call this data bad, and useless. And it's the exact same people who said that the "no zergs in top 20" statistic was super useful and showed a clear imbalance.

It's like the roles have switched places.

Not saying that this information is or isn't useful, it's just a funny observation

Yeah, honestly a lot of Z players are more afraid of the ZvP MU than the ZvT as of late. People are told to complain about Terran because it's "the cool thing to do" that they miss other fundamental issues.

Sleight

2471 Posts

September 21 2010 18:10 GMT

#64

On September 21 2010 23:01 Sleight wrote:
Hey y'all,

Before this debate turns into some kind of statistical pissing match, I thought I'd link a useful post I made so we can discuss this properly: http://www.teamliquid.net/forum/viewmessage.php?topic_id=153500

I would appreciate seeing actual statistical tests for significance on any of these values. My intuition is that most of these are statistically significant, but I can't be sure without someone actually doing the math.

How does this data hold up to Chi-squared analysis? I suspect that it shows almost perfect balance of the 3 race's overall win percentages.

Read My Damn Statistics Thread. It's linked above. Stop bickering about useless things.

I am quoting myself so we can move along. The sample size MAY be too small. How can we find out? Run a series of parametric statistical tests and the Chi-squared analysis on average win rates. That will give us a great idea if the sample is big enough. In any case, here are the facts:

A) This data obviously cannot be used to definitively generate a conclusion to a different population. It is not a random sampling.

B) This data, while non random and thus not directly applicable to other groups, still needs to be tested for significance because you need to prove that it still isn't due to random chance given the population size.

C) If it isn't due to random chance, we can discuss whether or not this result may warrant further examinations in other paradigms. This can be evidence to try and examine a different population, like all of Diamond by random sampling, and see if this trend continues.

Stop bitching about statistics when most of you are saying irrelevant things. Look at the data for itself and conclude something about this sample, then redo the study under different conditions and see if it holds.

Serendipicus

United States90 Posts

September 21 2010 18:14 GMT

#65

Prepatch stats for all diamond players, showing all races are within 1% win ratio. http://www.sc2ranks.com/stats/league/all/1/all/

FabledIntegral

United States9232 Posts

September 21 2010 18:16 GMT

#66

I haven't taken statistics in a while, but can someone please explain why the sample size is too small? From what I recall these sample sizes are quite large for any test needed to be run, far bigger than what's necessary. And these are NOT random, agreed, but we don't want a random sample, we want only top players. If you somehow allocated every top replay ever and then picked a random sample, it'd be fine, but then why wouldn't we just do tests for the population after compiling all that, haha.

Serendipicus

United States90 Posts

September 21 2010 18:26 GMT

#67

Also if zerg only won about 40% of their matches, they wouldn't even be in diamond league.

FabledIntegral

United States9232 Posts

September 21 2010 18:27 GMT

#68

On September 22 2010 03:26 Serendipicus wrote:
Also if zerg only won about 40% of their matches, they wouldn't even be in diamond league.

You obviously don't pay attention to the forums and don't know how the matchmaking system is intended to work.

Black Gun

Germany4482 Posts

September 21 2010 18:33 GMT

#69

On September 22 2010 03:16 FabledIntegral wrote:
And these are NOT random, agreed, but we don't want a random sample, we want only top players.

with "they are not random samples" ppl usually point to the fact that players tend to only upload outstanding games or their own wins. additionally, many observations (read: replays) in this sample come from the same 2 players playing against each other, so the outcomes of these observations depend on each other through mindgames and psychological effects in a BoX series. also sometimes the style of a particular guy just doesnt fit the style of some other guy. if dimaga is at (for example, made up) 11-3 against demuslim and we only have 100 observations for tvz, then the fact that dimaga seems to dominate demuslim might have an impact on our impression of the general tvz matchup. therefore its not only about the number of single observations in our sample, its also about the variety of features that underlie these observations. (for example 40 replays from the ro8 and higher of the iem, but these 40 replays came from games between only 8 different players. then there is dependency and less variation in our sample than the nominal sample size of 40 would suggest...)
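
A minimal sketch of how much dependency can shrink a sample, using the Kish design effect for clustered observations (this is a standard survey-statistics formula, not something anyone in the thread computed). It assumes equal-sized clusters, here games per pairing, and an intra-cluster correlation rho. The rho values and the 5-games-per-pairing figure are made up for illustration.

```python
def effective_sample_size(n_games, games_per_pairing, rho):
    """Kish design effect: n_eff = n / (1 + (m - 1) * rho).

    n_games           -- nominal number of replays
    games_per_pairing -- m, games played by the same two players (cluster size)
    rho               -- intra-cluster correlation, 0 = independent, 1 = identical
    """
    design_effect = 1 + (games_per_pairing - 1) * rho
    return n_games / design_effect


# Hypothetical numbers: 40 replays in series of 5 games each, rho guessed.
for rho in (0.0, 0.1, 0.3, 1.0):
    n_eff = effective_sample_size(40, 5, rho)
    print(f"rho = {rho}: effective sample size = {n_eff:.1f}")
```

With rho at 0 the 40 replays count as 40 observations. With rho at 1 each series carries the information of a single game.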

Sideburn

United States442 Posts

September 21 2010 19:07 GMT

#70

On September 22 2010 02:19 travis wrote:

THIS

(and honestly the sample is small, too. but that's not the primary problem)

Really, can you explain why it is too small? Too small for what tests, presuming it was random data?

Deleted User 3420

24492 Posts

September 21 2010 19:24 GMT

#71

too small because in any game where luck is a contributing factor, the smaller the sample size the greater the chances are you will experience variance induced by that luck factor

with a sample size of only say, 200 replays of a matchup
all it would take is 10 games that skew from the norm (very easily accomplished through variance), to take odds from being 55-45 in one race's favor, to being 45-55 now in the other race's favor.

but in reality, with a sample of only 200 games, the variance could be WAY BIGGER than that.

could be. of course. maybe it's spot on though. but who knows... that's the point of having bigger samples.
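
To put a rough number on that variance, here is a sketch of a normal-approximation (Wald) confidence interval for a win rate. It assumes independent games, and the 110-out-of-200 figure is just the 55-45 example above turned into counts. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def win_rate_interval(wins, games, confidence=0.95):
    """Normal-approximation (Wald) confidence interval for a win rate."""
    p_hat = wins / games
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / games)
    return p_hat - margin, p_hat + margin


# A 55-45 result over 200 games, then the same rate over 2000 games.
for wins, games in ((110, 200), (1100, 2000)):
    low, high = win_rate_interval(wins, games)
    print(f"{wins}/{games}: {low:.3f} to {high:.3f}")
```

At 200 games the interval around 55% still includes 50%. At 2000 games with the same win rate it no longer does.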

Mastermind

Canada7096 Posts

September 21 2010 19:28 GMT

#72

How the fuck are the mods leaving this garbage of a thread open. Disgusting.

QuanticHawk

United States32119 Posts

September 21 2010 19:29 GMT

#73

Most people can't be bothered to change their views, no matter how many facts you throw at them.

Per the site: "The definition of a "top" player is up to each individual site to decide. Use at your own risk. "

That doesn't say anything. What's the benchmark YOU used, op??

obviously this means dick with the current sample size and he knows that, but depending on the criteria used, this could be interesting to look at next month.

duh, skimmed it in the shuffle:

On September 21 2010 23:00 bingobango wrote:

I agree with the first part, not so sure about the second part. The selection bias + small sample size makes it a bit squirrely, but it's better than nothing.

Kind of skews the numbers.

Sairon

47 Posts

September 21 2010 19:46 GMT

#74

On September 22 2010 03:14 Serendipicus wrote:
Prepatch stats for all diamond players, showing all races are within 1% win ratio. http://www.sc2ranks.com/stats/league/all/1/all/

This is not the way to interpret that data. The win ratio will stay roughly constant because that's the whole point of the ladder system: it matches players by skill and doesn't rank depending on race. One has to look at the race distribution across tiers, but interpreting that data is very hard because you must make certain assumptions, for example that the distribution of good players is equal for every race.

Serendipicus

United States90 Posts

September 21 2010 19:52 GMT

#75

On September 22 2010 04:46 Sairon wrote:

The page on the site does all that you suggested.

Black Gun

Germany4482 Posts

September 21 2010 21:09 GMT

#76

this is exactly what significance tests are testing. oO folks, plz keep in mind that first of all, statistical significance doesnt equal relevance, and secondly that significance does depend on the sample size. for example rolling a dice: even a rigged dice that gives 6 every single time cant be detected as non-regular by statistical tests if all u have is 3 rolls (which ofc turned to 3 sixes...)

as a general guideline, the smaller the true statistical anomaly, the higher the sample size required to detect this anomaly. obviously its gonna be hard to reliably detect deviations in the 1-5% range if the sample size is barely above 100....
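
a rough sketch of that guideline: the textbook sample-size approximation for a two-sided one-sample z-test of a proportion against 50%. the 5% significance level and 80% power are conventional defaults picked for illustration, the function name is made up, and the games are assumed to be independent.

```python
import math
from statistics import NormalDist


def games_needed(true_win_rate, alpha=0.05, power=0.8):
    """Approximate games needed to tell true_win_rate apart from 50%."""
    p0, p1 = 0.5, true_win_rate
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    numerator = (z_alpha * math.sqrt(p0 * (1 - p0))
                 + z_beta * math.sqrt(p1 * (1 - p1)))
    return math.ceil((numerator / abs(p1 - p0)) ** 2)


# Smaller true deviations from 50% need far more games to detect.
for rate in (0.51, 0.55, 0.65):
    print(f"true win rate {rate:.0%}: about {games_needed(rate)} games")
```

the required number of games grows roughly with the inverse square of the deviation, so halving the anomaly needs about four times the replays.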

FabledIntegral

United States9232 Posts

September 22 2010 20:00 GMT

#77

On September 22 2010 04:24 travis wrote:
too small because in any game where luck is a contributing factor, the smaller the sample size the greater the chances are you will experience variance induced by that luck factor

with a sample size of only say, 200 replays of a matchup
all it would take is 10 games that skew from the norm (very easily accomplished through variance), to take odds from being 55-45 in one race's favor, to being 45-55 now in the other race's favor.

but in reality, with a sample of only 200 games, the variance could be WAY BIGGER than that.

could be. of course. maybe it's spot on though. but who knows... that's the point of having bigger samples.

Uh that's exactly what tests do. They see if the data is simply too far skewed for it to be random chance, or luck. How are you using that as an argument....

And if you were just talking about a normal, random sample of 200 replays, it'd be a rather large sample size, wouldn't it?

Fitzhunt1

United States169 Posts

September 22 2010 20:49 GMT

#78

Also we don't know what level they are at.
