Ladder-Balance-Data - Page 10

skeldark

Germany2223 Posts

July 11 2012 05:40 GMT

#181

On July 11 2012 14:29 redruMBunny wrote:
When people can choose from any color for their car, silver generally "wins" (i.e. silver is the most popular car color). Therefore the color silver is imba, because definitionally things that are popular and have a high "win" rate are imba.

When asked how it could be that any color could be imba, we say - people pick silver, so it is by definition imba! (Weren't you following the argument?)
More or less this whole thread is about redefining imbalance. Go through the OP, see how many assumptions are made.

The difference is I don't look at which car is popular, I look at which car drives faster.
So silver cars drive faster than black ones, and I come to the conclusion that the color affects the speed.
And now you say: that's because more people drive silver.
I would now ask: why do you think that more people choosing a car makes it faster?
Can you back this assumption up, like I did with mine?

Cascade

Australia5405 Posts

July 11 2012 05:51 GMT

#182

On July 11 2012 14:18 skeldark wrote:

On July 11 2012 13:57 Cascade wrote:

On July 11 2012 13:30 skeldark wrote:

On July 11 2012 13:06 Cascade wrote:
First: nice work on putting these together! Must have been a lot of work.

I'm fine with everything you do, up to the point where you go from average MMR to balance. Like many others. You have, very neatly, shown that the average MMR is lower for terran than for zerg. No more, no less.

Why are there more terrans at lower MMR? I don't know.
Because they are UP? Maybe.
Because casual (bad) players are more likely to pick terran due to single player? Maybe.
Because the good players switch away from terran as they perceive them as UP? Maybe.
Because people switch race from terran as they get better? Maybe.
Something else? Could be

True that. But this is balance! It's a question of how you define balance. Even if the problem is not in the unit design, it disrupts the balance of the races = imbalance. Perhaps I use the word too mathematically.

Yes, I think you confuse a lot of people if you let the word imbalance include effects such as single player leading casual players to pick terran. To define balance, I would use something like having an infinite number of equally talented players (whatever that means) train a certain (large) amount of time with one race each, and then let them play an infinite number of games.

And I think most people would use similar definitions.

If you use the word "balance" in a very different way, I suggest you be very clear about what you mean in the OP, or better, use a different word.

Some comments:

1) Your result is essentially the same as in the sc2ranks link you provide.
I know that MMR is not exactly identical to league, but I think everyone here can agree that if there are more of a race at lower MMR, then that will very likely reflect in more of that race also being in lower leagues. And this is in fact what we see.

I even did a short calculation:
Look at the number of players for the three races, in gold and above (to compare to your second calculation). Assign a player in gold 0 points, platinum 1 point, diamond 2 points, masters 3 points and GM 4 points.

This is some sort of toy rating, where each point corresponds to a league. I don't know exactly how the MMR is divided into leagues, is one league roughly 1000 MMR? If so, then each point would correspond to around 1000 MMR. GM works differently ofc, but with so few people in GM (in the sc2ranks sample), it shouldn't matter much.

Take the average number of points for each race:

Toss: 1.026
terran: 1.023
zerg: 1.047

Again, this shows that zerg is a bit above terran, and toss somewhere in between. If indeed a league corresponds to 1000 MMR (does it?), then the difference zerg-terran is 0.024 leagues = 24 MMR, which is consistent with your 30 +- 10. If a league corresponds to much more or less than 1000MMR, enough to bring the 0.024 much outside the 30 +- 10 you have, there is a discrepancy. This could potentially be a matter of the different samples, as your sample is more weighted towards higher levels as I understand. So here the agreement in the value is not important, but rather the general trend that zerg is stronger than terran, and toss a bit undecided in between.
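The toy rating can be sketched in a few lines of Python. Only the point scheme (gold 0 up to GM 4) comes from the post; the league counts below are made-up placeholders, not the sc2ranks numbers, and the function name is just for illustration.

```python
# Toy league rating: gold = 0 points ... GM = 4 points, averaged per race.
LEAGUE_POINTS = {"gold": 0, "platinum": 1, "diamond": 2, "master": 3, "grandmaster": 4}


def average_points(league_counts):
    """Return the mean league points for one race.

    league_counts maps a league name to the number of players of that race in it.
    """
    total_players = sum(league_counts.values())
    total_points = sum(LEAGUE_POINTS[league] * n for league, n in league_counts.items())
    return total_points / total_players


# Hypothetical counts for illustration only (not the sc2ranks data).
example_counts = {"gold": 400, "platinum": 300, "diamond": 200, "master": 95, "grandmaster": 5}
print(round(average_points(example_counts), 3))
```

Running this once per race with the real league counts would give the three averages to compare.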

a league is not 1000 MMR
Not 100% (promotion offset != league offset ) but close :
+ Show Spoiler +

The main point is valid. You can do it with leagues in general, but someone could come with the argument
(all of race x are high in the league, all of race y are low), so that's why this way is more accurate. But overall it's the same, I agree.

ok, so if a league is roughly 1000 MMR (as the error is about a third of the signal, we don't need to be more accurate than between 800 and 1200 I think), it means that the distribution of players in your calculation and the sc2ranks distribution both give the same result. And that, as you say, the distributions within the leagues don't do funky stuff. As expected I guess, but nice to get confirmation from your more accurate method.
edit: oops, now I understand your plot. the lines are the leagues? So it is more like 500 points on average? And I shouldn't have used linearly increasing steps of points for the different leagues. Anyways, close enough I guess. Same ballpark.

2) Random shows a huge signal. You are fine with going from terran has lower average MMR to terran being UP. By the same argument you would conclude that random is horribly underpowered. And you see on sc2ranks that there are a lot more randoms in the lower leagues (again, consistent with your results). This again is presented with the list of possible explanations above.

I think most agree that random is indeed a bit UP, in the sense that a player with a given time put into training would do the worst with random. However, I would guess that the strongest factor would be that high level players tend to switch away from random because it is UP. If 25% of the strong players would play random, I think the MMR signal would be much smaller. But this is my personal thought only, so nevermind.

Point being, this very strong signal maybe would open you to the possibility of other important factors than balance that can influence the average MMR. But well, nothing conclusive, just a little case study, don't take this point too seriously.

It's, like the first point, more a question of the definition of balance.

haha, yes, not really sure where I wanted to go with the randoms.

3) Then I'm also a bit curious about the way you estimate the error. Why 4 groups? With more groups, you would get larger error, with fewer groups you would get a smaller one. Seems a bit arbitrary. Why not just calculate the standard deviation and calculate the error from that? You should have enough statistics to use the central limit theorem. Anyway, I think you would get similar values, I just got a bit curious.

4 because 4 races = near to the size of the race groups = nearly the same amount of data before I take the average.
This way is not optimal. I know that, and this is a valid criticism. Here are the reasons why I did not use the standard approach, normalise and calculate it: I was lazy ... and the random test data is calculated by my computer with me drinking coffee meanwhile ...
My point is, I think the random test data shows the error %. It's not such an exact way, but in the end I do the same.
I will publish a better data file with more accounts. This whole thing is a side project of my MMR calculator.

ok, I'd find it much easier to calculate the standard deviation than to program the split runs. Just take the average of the squared MMR as well, and the rest is a few lines of plus and minus. I guess you are a faster programmer than I am though. ^^

I agree that it is "good enough" despite maybe not being perfect.

4) A better measure is what Blizzard does. Namely, look at win rates in different matchups, compensating for MMR difference. I don't think you have the information to do that in your program? This method ofc has its problems as well, and no matter what Blizzard says, I don't believe that they can tell if a race is OP, or if the better players just happen to play that race. And your very small difference in average MMR (consistent with the very small signal in sc2ranks) would probably only give a very small difference in win percentage. Well within the 45% to 55% range Blizzard is aiming for. But that is a different story.

I have this data: MMR of both players, the matchup and the result.
And I agree that allowing +- 5% leaves room for great imbalance.

Maybe that would be a better analysis, because then you could see if terrans at a certain MMR struggle the most in TvP or TvZ. TvT should be 50%, and TvZ + TvP (weighted by player frequency) should average to 50% as well (or they would not be at that rank). But it should be possible to see which of the other two races each race has the most problems with.

Let's see if you can reproduce Blizzard's result first. After that the sky is the limit!

Unless, ofc, you are lazy.
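A rough sketch of what such a per-matchup tally could look like in Python, given the MMR of both players, the matchup and the result. This is not Blizzard's method: restricting to games with a small MMR gap is only a crude stand-in for "compensating for MMR difference", and the tuple layout, the `max_mmr_gap` cutoff and the sample games are all assumptions for illustration.

```python
from collections import defaultdict


def matchup_win_rates(games, max_mmr_gap=100):
    """Win rate of the first-listed race per matchup, using only close games.

    games: iterable of (race_a, mmr_a, race_b, mmr_b, a_won) tuples (assumed layout).
    max_mmr_gap: games with a larger MMR difference are skipped (assumed cutoff).
    """
    wins = defaultdict(int)
    totals = defaultdict(int)
    for race_a, mmr_a, race_b, mmr_b, a_won in games:
        if race_a == race_b or abs(mmr_a - mmr_b) > max_mmr_gap:
            continue  # mirrors are 50% by construction; lopsided games are skipped
        # Order the races alphabetically so TvZ and ZvT land under one key.
        if race_a > race_b:
            race_a, race_b, a_won = race_b, race_a, not a_won
        key = race_a + "v" + race_b
        totals[key] += 1
        wins[key] += int(a_won)
    return {key: wins[key] / totals[key] for key in totals}


# Made-up games, purely to show the call.
sample_games = [
    ("T", 1600, "Z", 1650, True),
    ("Z", 1700, "T", 1690, True),
    ("T", 1500, "P", 1520, True),
    ("P", 1500, "T", 2000, True),   # skipped: MMR gap too large
    ("T", 1500, "T", 1500, True),   # skipped: mirror
]
print(matchup_win_rates(sample_games))
```

Running it separately for different MMR ranges would show where, if anywhere, a race struggles in a particular matchup.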

5) No offence meant. The original MMR calculation is a great program (gj!), and it's really cool that you find more uses for it! I just think that you got a bit carried away in the interpretation at a certain point.

Also, I should mention that I don't want to claim anything about balance.

Except colossus ofc, they are imba.

I don't want to say that any race is or is not UP or OP.

Cheers.

No offence taken ^^. I appreciate your post. It's a nice break from explaining what an average does, or what the difference between a dependent and an independent error is, while asking myself what they teach at school in some countries...

Mmm, I hear you.
I mean, I'm fine with people not knowing statistics. It's hard, and not everyone should be required to be an expert to post. I just wish sometimes that people were a bit more aware of what they do and don't know. Then again, I think I myself also sometimes post a bit too confidently in areas that I'm not an expert on, so I can't blame anyone really. But it does discourage these kinds of posts, no doubt.

- a league is nowhere near 1000 MMR, look at the picture. Gold to platinum is only 250 MMR.
- yes, perhaps I should use a different word. But which one?
- so you can calculate this very fast? In that case,
latest data file: skeletor.jimmeh.com/mmr/balance.csv

New results are (after removing everyone under 1k):
Maxerror : 38.7191574666374
ERRORCOUNT :
41.54333333333333% in 5
72.81111111111112% in 10
89.8111111111111% in 15
97.02666666666667% in 20
99.36666666666667% in 25
99.88666666666667% in 30
Race...
T: -28.938886080105476 P 23.43063954261379 Z 0.36671387478577344
Analyse DONE
Zerg and Protoss switch roles! That happened in 2 runs as well, but every time terran stays way under.

How about "uneven player distribution"?

Or just write out "different MMR averages" each time, you don't use the B-word that many times.

Hmm, to calculate the errors I just need one run, but I need both averages: <MMR> and <MMR^2>, where <X> means average value of X for all players.
Then you calculate variance
V = <MMR^2> - <MMR>^2
and standard deviation the square root of that:
S = sqrt(<MMR^2> - <MMR>^2)
then you get the standard error by dividing by the square root of number of samples (for that variable, ie race)
error = S/sqrt(N)

And here it is important that the samples are really independent. No duplicates etc. As there are some systematic effects (due to same player sending in a lot of games for example), N doesn't really represent the number of independent samples. So this will be an underestimate of the errors. Add an (arbitrary) factor 2 to the error (ie, use N/4) and you should be pretty safe.

Should be fast, but maybe you need to rerun to get the <MMR^2> for the different races. No need to run divided in groups or anything though.
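The recipe above, written out as a minimal Python sketch. The function name and the `correlation_factor` parameter are made up for illustration; the factor is just the arbitrary safety factor mentioned above (2 corresponds to using N/4), and the variance is the plain `<MMR^2> - <MMR>^2` form, not the N-1 version.

```python
import math


def mean_and_standard_error(mmrs, correlation_factor=1.0):
    """Return (<MMR>, standard deviation, standard error) for one race."""
    n = len(mmrs)
    mean = sum(mmrs) / n                       # <MMR>
    mean_sq = sum(x * x for x in mmrs) / n     # <MMR^2>
    variance = mean_sq - mean ** 2             # V = <MMR^2> - <MMR>^2
    std_dev = math.sqrt(max(variance, 0.0))    # guard against tiny negative rounding
    error = correlation_factor * std_dev / math.sqrt(n)  # S / sqrt(N), optionally inflated
    return mean, std_dev, error


# Made-up MMR values, just to show the call.
mean, std_dev, error = mean_and_standard_error([1400, 1600, 1800, 2000])
print(mean, round(std_dev, 1), round(error, 1))
```

One pass over each race's MMR list is enough; no split runs are needed.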

Cascade

Australia5405 Posts

July 11 2012 05:53 GMT

#183

On July 11 2012 14:39 Not_That wrote:
MMR distribution by races.
Click for full version.

Amount of players:
2014 Zerg
1784 Protoss
1516 Terran

The server does matter as MMR is non comparable cross servers. I've decided to remove KR and SEA and keep EU and NA as they are closest to each other in terms of MMRs, and that's where most of our data comes from.

Cool! Can you do 100 or even 200 granularity to make it easier to read? :o)
We are not trying to see any structure smaller than 200 MMR anyway.

Cascade

Australia5405 Posts

July 11 2012 05:55 GMT

#184

On July 11 2012 14:53 Cascade wrote:

Cool! Can you do 100 or even 200 granularity to make it easier to read? :o)
We are not trying to see any structure smaller than 200 MMR anyway.

Oh, and can you normalize the plots as well? So that the bins read "% of players" or something instead. Makes it a lot easier to compare. Sorry. :o)

Now all you can see is that there are more zerg players.

skeldark

Germany2223 Posts

July 11 2012 05:55 GMT

#185

On July 11 2012 14:51 Cascade wrote:


How about "uneven player distribution"?

Should be fast, but maybe you need to rerun to get the <MMR^2> for the different races. No need to run divided in groups or anything though.

Like I said, I linked you the data. When will you post the result?

I understand you cannot just post something about balance without backing it up. But I think I backed it up reasonably.
Not perfectly, but reasonably. And to be honest I'm running out of time for today.

Cascade

Australia5405 Posts

July 11 2012 06:03 GMT

#186

On July 11 2012 14:55 skeldark wrote:


Like I said, I linked you the data. When will you post the result?

I understand you cannot just post something about balance without backing it up. But I think I backed it up reasonably.
Not perfectly, but reasonably. And to be honest I'm running out of time for today.

ahaha, ok. Let me dust off my MS office analysis skills, brb.

I need to leave soon as well, just quick analysis!!

lazyitachi

1043 Posts

July 11 2012 06:08 GMT

#187

T is underrepresented (28.5%) and skewed towards lower leagues, while Z is overrepresented (37.8%) in the sample.
Max and min MMR can deviate by more than 1500 for a single submitter.
The data is right-skewed towards higher leagues, i.e. not normal.

It seems the data or methodology is not producing any consistent point for analysis.
Remember the first rule of modelling: garbage in, garbage out.

Not_That

287 Posts

July 11 2012 06:13 GMT

#188

On July 11 2012 14:53 Cascade wrote:

Cool! Can you do 100 or even 200 granularity to make it easier to read? :o)
We are not trying to see any structure smaller than 200 MMR anyway.

Here you go:

We tried having % of total players on the y axis. The problem with that is that it doesn't have information regarding the amount of players. The dots at the edges of the graph look very strange, for example 100% of players above 3200 are Protoss. Obviously it's not very useful. We could snip the edges of the graph, but where? How many players are enough? Are 21 players between 2700 and 2750 enough? etc.

Cascade

Australia5405 Posts

July 11 2012 06:35 GMT

#189

On July 11 2012 15:13 Not_That wrote:

Here you go:

Thanks!

I mean % of the zerg players in that bin. That is, (number of zergs in that bin)/(number of zergs total). Just like you have plotted now, only divide all zerg entries by the number of zerg players, etc. Now the zerg plot is higher in mid-range, but it is not clear if that is because a larger fraction of zergs have mid-range MMR, or if there are just more zergs.
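In code, the normalization could look something like this Python sketch. The function name, the bin width default and the sample values are made up for illustration; the only idea taken from the post is dividing each bin by that race's own player count.

```python
from collections import Counter


def normalized_histogram(mmrs, bin_width=100):
    """Map each bin's lower edge to the fraction of this race's players in that bin."""
    counts = Counter((int(mmr) // bin_width) * bin_width for mmr in mmrs)
    total = len(mmrs)
    return {edge: counts[edge] / total for edge in sorted(counts)}


# Made-up zerg MMRs, just to show the shape of the output.
zerg_mmrs = [1510, 1590, 1620, 1750]
print(normalized_histogram(zerg_mmrs))
```

Doing this separately per race makes the three curves comparable even when one race has more players in the sample.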

Cascade

Australia5405 Posts

July 11 2012 06:47 GMT

#190

ok, fixed the errors for you: http://www.megafileupload.com/en/file/360098/balance-ods.html

results:
toss:
<MMR> = 1662
samples: 1881
standard deviation: 541
standard error: 12

terran:
<MMR> = 1619
samples: 1598
standard deviation: 504
standard error: 13

zerg:
<MMR> = 1655
samples: 2113
standard deviation: 419
standard error: 9

Note that the actual error probably is larger than that though, as there are correlations in the sample.
Ignoring that, and setting the terran MMR as zero:
terran: 0 +- 13
toss: 43 +- 12
zerg: 36 +- 9

The difference in MMR between
zerg and terran: 36 +- 16
toss and terran: 43 +- 18
zerg and toss: 7 +- 15

Taking into account that the errors are underestimates, the signal is barely significant. Maybe 90% or so. Need more data.

edit: and with that I'm gone. I'll be back tomorrow. Hope I didn't do any stupid mistakes in the hurry. it should all be in the file.
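For anyone who wants to check the difference errors without opening the file: a small Python sketch that adds the standard errors in quadrature. It assumes the three race samples are independent, which (as noted above) is only approximately true; the dictionary and function name are made up for illustration, the numbers are the ones posted.

```python
import math

# (mean MMR, standard error) per race, as posted above.
results = {"terran": (1619, 13), "toss": (1662, 12), "zerg": (1655, 9)}


def difference(race_a, race_b):
    """Return (mean_a - mean_b, combined error), assuming independent samples."""
    mean_a, err_a = results[race_a]
    mean_b, err_b = results[race_b]
    return mean_a - mean_b, math.sqrt(err_a ** 2 + err_b ** 2)


for pair in [("zerg", "terran"), ("toss", "terran"), ("toss", "zerg")]:
    diff, err = difference(*pair)
    print(pair, diff, "+-", round(err))
```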

Not_That

287 Posts

July 11 2012 07:05 GMT

#191

On July 11 2012 15:35 Cascade wrote:

Good thinking.

Same graph normalized, each bar representing the percentage of players of each race in the bin:

Edit: Corrected colors

Cascade

Australia5405 Posts

July 11 2012 07:15 GMT

#192

On July 11 2012 16:05 Not_That wrote:

Good thinking.

Same graph normalized, each bar representing the percentage of players of each race in the bin:

Nice!

Now just put the error bars back on that plot, and it's perfect!

*leaving*

lazyitachi

1043 Posts

July 11 2012 07:20 GMT

#193

On July 11 2012 15:47 Cascade wrote:
ok, fixed the errors for you: http://www.megafileupload.com/en/file/360098/balance-ods.html

results:
toss:
<MMR> = 1662
samples: 1881
standard deviation: 541
standard error: 12

terran:
<MMR> = 1619
samples: 1598
standard deviation: 504
standard error: 13

zerg:
<MMR> = 1655
samples: 2113
standard deviation: 419
standard error: 9

Note that the actual error probably is larger than that though, as there are correlations in the sample.
Ignoring that, and setting the terran MMR as zero:
terran: 0 +- 13
toss: 43 +- 12
zerg: 36 +- 9

The difference in MMR between
zerg and terran: 36 +- 16
toss and terran: 43 +- 18
zerg and toss: 7 +- 15

Taking into account that the errors are underestimates, the signal is barely significant. Maybe 90% or so. Need more data.

edit: and with that I'm gone. I'll be back tomorrow. Hope I didn't do any stupid mistakes in the hurry. it should all be in the file.

This is a two-tail test. You did not account for P-value.
Assuming high df, 1.036 SE is equivalent to only being 70% certain, 1.96 SE is 95% certain and 3.08 SE is 99.8% certain.

Assuming the data is correct, one might argue that terran is 95% certain to be underpowered BUT as I mentioned the methodology and data seems to be suspect.
If you show me that each person has less than 100 - 200 MMR movement then I would be more assured, but it seems that the calculation of the MMR is suspect enough to generate MMR movement of up to 1000+. Lol.. Math. GIGO
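The SE-to-certainty conversion quoted above can be reproduced with the normal approximation (reasonable for high df). A minimal Python sketch; the function name is made up, and using the normal distribution instead of the exact t distribution is an assumption.

```python
import math


def two_sided_confidence(z):
    """Probability that a normal variable falls within +-z standard errors."""
    return math.erf(z / math.sqrt(2))


for z in (1.036, 1.96, 3.08):
    print(z, round(100 * two_sided_confidence(z), 1), "%")
```

The same function can be applied to any of the differences above by dividing the difference by its error first.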

graNite

Germany4434 Posts

July 11 2012 07:23 GMT

#194

Good job, nice statistics.

Is it possible to determine whether a matchup is random or not (especially mirrors) by looking at winrates by MMR?
What I mean: in ZvZ, how often does a player with (let's say 200 points) smaller MMR win?
Would that be a good way to detect randomness?

Cascade

Australia5405 Posts

July 11 2012 07:23 GMT

#195

On July 11 2012 16:20 lazyitachi wrote:

the standard deviation is over the distribution of all players, not for a single player. each sample is one player.

Not_That

287 Posts

July 11 2012 07:27 GMT

#196

On July 11 2012 16:15 Cascade wrote:

Nice!

Now just put the error bars back on that plot, and it's perfect!

*leaving*

How do I figure out error margins for a graph with granularity?
Fixed colors btw.

NoobCrunch

79 Posts

July 11 2012 07:36 GMT

#197

edit: and with that I'm gone. I'll be back tomorrow. Hope I didn't do any stupid mistakes in the hurry. it should all be in the file.

When I calculated the T-test statistic for comparing the sample means of Zerg and Terran MMR I got 2.32258065, which is roughly P = 0.01 for the one-sided test. For the two-sided test it is roughly P = 0.02, so there is significant evidence that Zerg players have different (higher) MMR than terran players.
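One way to get a statistic of this size from the summary numbers in post #190 is a Welch-style t statistic. The Python sketch below is an assumption about the calculation, not necessarily the exact one used here, and it approximates the t distribution by a normal, which is fine for samples this large. It lands very close to, but not exactly on, the value quoted above.

```python
import math


def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """t statistic for the difference of two sample means (unequal variances)."""
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error


def two_sided_p(t):
    """Two-sided p-value under the normal approximation (large samples)."""
    return math.erfc(abs(t) / math.sqrt(2))


# Zerg vs terran, using the means, standard deviations and sample counts from post #190.
t = welch_t(1655, 419, 2113, 1619, 504, 1598)
print(round(t, 2), round(two_sided_p(t), 3), round(two_sided_p(t) / 2, 3))
```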

However, the original data wasn't collected through an SRS of battle.net players (although I don't really think it's going to matter in this case). Additionally, terran mean MMR isn't really independent of zerg mean MMR. I'd have to think more about this to see if this affected the validity of the results. Including bronze league MMR in the calculation of mean MMR is dangerous because bronze league mostly consists of terran, lowering mean terran MMR.

The real question is if that 36 MMR difference between Zerg and Terran is indicative of imbalance. The OP mentioned that this difference was about 3 games in favor of the zerg. I think the MMR difference is pretty negligible even though Zerg do have statistically significant higher MMR than terran.

What we should do is look at the mean MMR for zerg and terran at higher levels of MMR and see if Zerg has statistically significantly higher MMR. Which, according to this graph below, could be possible. However, the assumption that all races have equally skilled players might not hold at smaller sample sizes of higher MMR players. Click on this picture
http://i50.tinypic.com/213m4pl.jpg.

lazyitachi

1043 Posts

July 11 2012 07:42 GMT

#198

On July 11 2012 16:23 Cascade wrote:

the standard deviation is over the distribution of all players, not for a single player. each sample is one player.

So if I take 1000000000 faulty data then my data is now correct? Logic?

skeldark

Germany2223 Posts

July 11 2012 08:17 GMT

#199

On July 11 2012 16:36 NoobCrunch wrote:

When I calculated the T-test statistic for comparing the sample means of Zerg and Terran MMR I got 2.32258065, which is roughly P = 0.01 for the one-sided test. Even for a two-sided test it is roughly P = 0.02. There is significant evidence that Zerg players have different (higher) MMR than terran players.

However, the original data wasn't collected through an SRS of battle.net players (although I don't really think it's going to matter in this case). Additionally, terran mean MMR isn't really independent of zerg mean MMR. I'd have to think more about this to see if this affected the validity of the results. Including bronze league MMR in the calculation of mean MMR is dangerous because bronze league mostly consists of terran, lowering mean terran MMR.

The real question is if that 36 MMR difference between Zerg and Terran is indicative of imbalance. I think the MMR difference is pretty negligible even though Zerg do have statistically significant higher MMR than terran.

What we should do is look at the mean MMR for zerg and terran at higher levels of MMR and see if Zerg has statistically significantly higher MMR. Which, according to this graph below, could be possible. However, the assumption that all races have equally skilled players might not hold at smaller sample sizes of higher MMR players. Click on this picture
http://i50.tinypic.com/213m4pl.jpg.

welcome on board

On July 11 2012 16:23 graNite wrote:
Good job, nice statistics.

Is it possible to determine whether a matchup is random or not (especially mirrors) by looking at winrates by MMR?
What I mean: in ZvZ, how often does a player with (let's say 200 points) smaller MMR win?
Would that be a good way to detect randomness?

I don't know if I understand the question.

I could (not right now, but in theory) tell you how big the chance is just by looking at the MMR of the 2 players,
because the MMR includes the win % the skill system gives the players.
I don't understand what you mean by randomness here.

Goetzinho ftw

Germany115 Posts

July 11 2012 08:24 GMT

#200

This gives us the facts we all knew before. We all know that terran is by far the hardest race to play, especially if you don't have the multitasking and micro that korean terrans have.
