|
On September 12 2020 22:41 Dangermousecatdog wrote: Ok, you tell me where the mean of anything is from that graph. Go on.
Hint: it's not at 2610, 2680, 2980.
what if it were? would that bring you to a stunning realization that would enable you to solve sc2? what does this graph have anything to do with the game? anything drawn from the graph is whatever you want it to be. why is this a discussion
|
Hey you asked right? Don't act offended. Ok, so you don't understand the difference between a mode and mean, but you like to talk about mean. It's ok. It just means that you shouldn't interpret the graph.
By the way those numbers aren't the mode either. That can only be shown from the actual data table. Like I said, the more I look at the graph, the more nonsensical it appears. I'll have to see the data myself, but if it's from that same link then what we have is just a bullshit graph. There are only 7 datapoints per race so how could he make such a graph to present the data?
|
On September 12 2020 22:47 Dangermousecatdog wrote: Hey you asked right? Don't act offended. Ok, so you don't understand the difference between a mode and mean. It's ok. It just means that you shouldn't interpret the graph.
By the way those numbers aren't the mode either. That can only be shown from the actual data table. Like I said, the more I look at the graph, the more nonsensical it appears. I'll have to see the data myself, but if it's from that same link then what we have is just a bullshit graph. There are only 7 datapoints per race so how could he make such a graph to present the data?
did i say they were? why are you agreeing with me aggressively?
|
On September 12 2020 22:47 Dangermousecatdog wrote: By the way those numbers aren't the mode either. That can only be shown from the actual data table. Like I said, the more I look at the graph, the more nonsensical it appears. I'll have to see the data myself, but if it's from that same link then what we have is just a bullshit graph. There are only 7 datapoints per race so how could he make such a graph to present the data?
Wait, why does rankedftw only give 7 datapoints per race? https://www.rankedftw.com/ladder/lotv/1v1/mmr/?f=eu If you go here, there are 160239 datapoints for EU alone, so it seems plausible that the OP generated the plot from rankedftw.com
Edit: your argument seems to be that the graph doesn't represent the 7 leagues correctly. Yes, indeed, if you want the % of gold players, you should integrate the curve from min(gold MMR) to max(gold MMR), and you'd get the correct percentage. But that's just b/c the transformation between leagues and MMR isn't linear, which makes sense since they're based on percentages and you have long tails. In other words, the statement "most protoss players are gold" is wrong, but "most protoss players are in the range 2300-3100 MMR" is perfectly correct, one is based on a league PDF and the other is based on an MMR PDF.
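To make the "integrate the curve" point concrete, here is a minimal stdlib-only sketch. The MMR values are synthetic (drawn from a Gaussian purely for illustration, not ladder data), and the helper names `density_histogram` and `fraction_between` are made up for this example. It builds a normalized histogram and integrates it over the 2300-3100 range mentioned above, then checks that the integral matches a direct head count.

```python
import random

def density_histogram(values, lo, hi, bin_width):
    """Return (left_edges, densities) so that sum(density * bin_width) == 1 over [lo, hi)."""
    n_bins = int((hi - lo) / bin_width)
    counts = [0] * n_bins
    for v in values:
        if lo <= v < hi:
            counts[int((v - lo) // bin_width)] += 1
    total = sum(counts)
    edges = [lo + i * bin_width for i in range(n_bins)]
    return edges, [c / (total * bin_width) for c in counts]

def fraction_between(edges, densities, bin_width, a, b):
    """Integrate the density over [a, b); a and b are assumed to sit on bin edges."""
    return sum(d * bin_width for e, d in zip(edges, densities) if a <= e < b)

random.seed(0)
mmr = [random.gauss(2850, 700) for _ in range(20000)]  # synthetic, NOT ladder data

edges, dens = density_histogram(mmr, 0, 7000, 100)
share = fraction_between(edges, dens, 100, 2300, 3100)

# Same quantity by direct counting, for comparison.
in_plot = [v for v in mmr if 0 <= v < 7000]
direct = sum(1 for v in in_plot if 2300 <= v < 3100) / len(in_plot)
print(f"integrated share: {share:.4f}, counted share: {direct:.4f}")
```

The same integral over a league's MMR bounds would give that league's share, which is why the height of the curve at one MMR value says nothing by itself about league percentages.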
|
heres some raw data and a quick graph for each region from the battle.net api. only counting players with at least 10 games. docs.google.com
|
On September 12 2020 21:00 Alejandrisha wrote: you guys are missing the important part of the graph. the right side of the graph.
The extreme right side of the graph shows Zerg at the top. (They also win most major tournaments.)
|
On September 12 2020 20:19 Slydie wrote: I also have to say that it is easy to forget what kind of games are played at 2,6 and 2,9k MMR.
At 2,6, players tend to have their very own and generally very ineffective style, based on playing the campaign or fooling around with units. I doubt many of them have ever seen a pro-game or looked up a build to see what 1v1s are supposed to look like.
Even at 3k+, terrans make planetaries in their mains and go for proxy ghost rushes, while zergs are still one-trick-ponies relying on a single attack to win with no clue about transitioning.
It takes one to know one. I just legitimately brought my offrace Zerg to Plat 3, which would make me contribute to the peak of the curve. Even with just very basic understanding about expanding and making drones, upgrades and roaches, I feel I had a very easy time in most games, even ZvZs.
As someone who peaks at 4k and drops to 3.5k after not playing for a while, the planetary in the main thing sounds shocking to me. I have seen a planetary in the natural like 1 in 50 games, never in the main.
|
Not surprised to see Terran having the lowest average MMR as Terran seems to be the go-to race for new players entering ladder.
|
On September 13 2020 00:59 greenturtle23 wrote: As someone who peaks at 4k and drops to 3.5k after not playing for a while, the planetary in the main thing sounds shocking to me. I have seen a planetary in the natural like 1 in 50 games, never in the main.
It stops way before 3,5k, and it is rare even at 3k. Every 100 MMR points in SC2 mark a significant skill difference. 2800-3500 are very highly populated ranks where the leagues are close.
The main point of the post was the sub 3k games, though.
|
I was bored so I built a script to scrape all the player data from the site in the OP and then did a bit of basic pandasing and seaborning so I could see what NA looks like.
Here's what the NA graph looks like:
![[image loading]](https://i.imgur.com/j7RyULI.png)
Here's a few basic tidbits from the data since people were wondering about means and stuff:
Total active accounts as of when I scraped the data: protoss: 51947 terran: 56176 zerg: 46133 random: 15610
mean MMR: protoss: 2833.9011492482723 terran: 2768.2159819139847 zerg: 2966.6612836797954 random: 2898.173478539398
median MMR: protoss: 2740.0 terran: 2659.0 zerg: 2901.0 random: 2845.0
overall mean MMR: 2854.140540190503 median MMR across population: 2779.0
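For anyone wanting to reproduce this kind of summary, here is a minimal sketch of the per-race mean/median step. It uses only the standard library rather than pandas, and the rows are made-up placeholders in an assumed (race, MMR, games played) layout, not the scraped data.

```python
from collections import defaultdict
from statistics import mean, median

# Made-up rows in (race, mmr, games_played) form; NOT the scraped ladder data.
rows = [
    ("zerg", 3100, 40), ("zerg", 2900, 12), ("zerg", 2700, 3),
    ("terran", 2500, 80), ("terran", 2800, 5),
    ("protoss", 2750, 25), ("protoss", 2650, 9), ("protoss", 3300, 150),
]

def summarize(rows):
    """Return {race: (count, mean MMR, median MMR)} plus an "all" entry for the population."""
    by_race = defaultdict(list)
    for race, mmr, _games in rows:
        by_race[race].append(mmr)
        by_race["all"].append(mmr)
    return {race: (len(v), mean(v), median(v)) for race, v in by_race.items()}

summary = summarize(rows)
for race, (n, avg, med) in sorted(summary.items()):
    print(f"{race}: n={n} mean={avg:.1f} median={med}")
```

With pandas the equivalent would presumably be a groupby on the race column, but the grouping logic is the same.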
|
On September 13 2020 08:51 Ben... wrote: I was bored so I built a script to scrape all the player data from the site in the OP and then did a bit of basic pandasing and seaborning so I could see what NA looks like. Here's what the NA graph looks like: ![[image loading]](https://i.imgur.com/j7RyULI.png) Here's a few basic tidbits from the data since people were wondering about means and stuff: Total active accounts as of when I scraped the data: protoss: 51947 terran: 56176 zerg: 46133 random: 15610
mean MMR: protoss: 2833.9011492482723 terran: 2768.2159819139847 zerg: 2966.6612836797954 random: 2898.173478539398
median MMR: protoss: 2740.0 terran: 2659.0 zerg: 2901.0 random: 2845.0
overall mean MMR: 2854.140540190503 median MMR across population: 2779.0
thanks for posting! i think medians are very important in a lot of cases. means can be skewed by outliers. however, the fact that anyone can sign up regardless of their skill in other races and be data points in this still makes me question whether or not this graph really tells any story at all. still fun to look at stats tho
median MMR: protoss: 2740.0 terran: 2659.0 zerg: 2901.0 random: 2845.0
seems like my satirical handicap model
zerg = x(jasbean) + 500
protoss = x(jasbeantwins) + 700
terran = x(janbeandoooods) + 800
holds up though
2900 + 500 = 3400
2740 + 700 = 3440
2660 + 800 = 3460
almost haunting
|
I saw OP filtered out people who played fewer than 10 games (I didn't in my previous post) so I did also (it's trivial since the website provides number of games played as a column) just to see the difference. It's not really all that different but for consistency's sake here's the data:
The graph looks basically identical:
![[image loading]](https://i.imgur.com/oeRCStA.png)
data is quite close too:
Number of accounts with 10+ games by race: protoss: 31247 terran: 33855 zerg: 27118 random: 8203
mean: protoss: 2848.8236310685825 terran: 2778.8631516762666 zerg: 2985.8191975809427 random: 2872.1861514080215
median: protoss: 2759.0 terran: 2674.0 zerg: 2920.0 random: 2818.0
overall mean MMR: 2864.1406052398356 population median MMR: 2793.0
Mean number of games played so far this season:
factoring in accounts with fewer than 10 games played: 55.53022382348439
ignoring accounts with fewer than 10 games played: 90.96791571651913
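As a quick sanity check on these figures, the overall mean should equal the per-race means weighted by account counts. The sketch below uses only the numbers posted above; the variable names are just for illustration.

```python
# Counts and per-race means copied from the post above (accounts with 10+ games).
counts = {"protoss": 31247, "terran": 33855, "zerg": 27118, "random": 8203}
means = {
    "protoss": 2848.8236310685825,
    "terran": 2778.8631516762666,
    "zerg": 2985.8191975809427,
    "random": 2872.1861514080215,
}
posted_overall_mean = 2864.1406052398356

# Overall mean = sum(count * mean) / sum(count).
total_accounts = sum(counts.values())
weighted_mean = sum(counts[r] * means[r] for r in counts) / total_accounts

print(total_accounts)
print(abs(weighted_mean - posted_overall_mean) < 1e-3)
```

The same trick does not work for medians: the population median cannot be recovered from the per-race medians and counts alone, so that one has to be computed from the raw rows.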
|
well, simply the fact that random has a higher median than 2 races tells you the study is flawed. these random players have played all the races and are just doing a victory lap. doesn't give me much faith in the rest of the data
|
On September 13 2020 09:18 Ben... wrote:I saw OP filtered out people who played fewer than 10 games (I didn't in my previous post) so I did also (it's trivial since the website provides number of games played as a column) just to see the difference. It's not really all that different but for consistency's sake here's the data: The graph looks basically identical: ![[image loading]](https://i.imgur.com/oeRCStA.png) data is quite close too: Number of accounts with 10+ games by race: protoss:31247 terran: 33855 zerg: 27118 random:8203
mean: protoss:2848.8236310685825 terran: 2778.8631516762666 zerg: 2985.8191975809427 random:2872.1861514080215
median: protoss:2759.0 terran: 2674.0 zerg: 2920.0 random:2818.0
overall mean MMR: 2864.1406052398356 population median MMR: 2793.0
Mean number of games played so far this season: factoring in accounts with fewer than 10 games played: 55.53022382348439 ignoring accounts with fewer than 10 games played: 90.96791571651913
Hi, I asked for a log y axis plot but OP doesn't seem to have responded; since you seem to be checking this topic, would you be able to post one with a log y axis as well? Would be able to show the long tail at higher/lower MMR much better
I definitely agree that using mean/median/mode to draw balance conclusions is extremely dubious, but it should be fun to theorize nonetheless
|
On September 13 2020 09:31 yubo56 wrote: Hi, I asked for a log y axis plot but OP doesn't seem to have responded; since you seem to be checking this topic, would you be able to post one with a log y axis as well? Would be able to show the long tail at higher/lower MMR much better. I definitely agree that using mean/median/mode to draw balance conclusions is extremely dubious, but it should be fun to theorize nonetheless
yes i think that any conclusion drawn are bunk but i do like lookin at numbers xD
|
On September 13 2020 09:34 Alejandrisha wrote: yes i think that any conclusion drawn are bunk but i do like lookin at numbers xD
Ohhh, one fun plot to make would be a scatter plot of (# games played, MMR) with the three races color coded. It might give a slightly more faithful signal.
How are y'all scraping the data, did you just download all of the pages and parse the HTML?
|
On September 13 2020 09:31 yubo56 wrote: Hi, I asked for a log y axis plot but OP doesn't seem to have responded; since you seem to be checking this topic, would you be able to post one with a log y axis as well? Would be able to show the long tail at higher/lower MMR much better. I definitely agree that using mean/median/mode to draw balance conclusions is extremely dubious, but it should be fun to theorize nonetheless
Does this work? It's for accounts with more than 10 games.
![[image loading]](https://i.imgur.com/wpVd6Rm.png)
All I did was add ".set(yscale='log')". Nothing fancy.
Yeah I don't think anything meaningful can actually be drawn from this data. I just like tinkering.
edit:
On September 13 2020 09:31 yubo56 wrote: How are y'all scraping the data, did you just download all of the pages and parse the HTML?
Pretty much. I just used the requests library to download each page (you can use the offset GET parameter they use for pagination to hop between pages), then used BeautifulSoup to parse the HTML and then cleaned stuff up a bit before chucking it in CSVs. Nothing too advanced.
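A rough sketch of that page-by-page approach, for anyone curious. This is not the actual script: it swaps requests and BeautifulSoup for the standard library (`urllib` and `html.parser`) so it runs without extra installs, and it makes several assumptions that should be checked against the real site: that the data sits in a plain HTML table, that `offset` combines with the `f` region parameter seen in the link earlier in the thread, and the hypothetical names `page_url`, `TableRows`, `parse_rows`, and `fetch_page`.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://www.rankedftw.com/ladder/lotv/1v1/mmr/"  # from the link earlier in the thread

def page_url(offset, region="eu"):
    # "offset" is the pagination parameter mentioned above; combining it with "f" is an assumption.
    return BASE_URL + "?" + urlencode({"f": region, "offset": offset})

class TableRows(HTMLParser):
    """Collect the text of every <td>, grouped by <tr>. Assumes a plain HTML table."""
    def __init__(self):
        super().__init__()
        self.rows, self._row, self._in_cell = [], None, False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr":
            if self._row:  # skip header rows that contain only <th> cells
                self.rows.append(self._row)
            self._row, self._in_cell = None, False

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data.strip()

def parse_rows(html):
    parser = TableRows()
    parser.feed(html)
    return parser.rows

def fetch_page(offset):
    # Network call, not exercised in this sketch. Rate-limit and respect the site's terms.
    with urlopen(page_url(offset)) as resp:
        return parse_rows(resp.read().decode("utf-8"))

# Offline demonstration on a hand-written snippet (placeholder player and MMR).
sample = "<table><tr><th>Player</th><th>MMR</th></tr><tr><td>PlayerA</td><td>4321</td></tr></table>"
rows = parse_rows(sample)
print(page_url(100), rows)
```

From there, writing the rows out with the `csv` module and looping `fetch_page` over increasing offsets until a page comes back empty would complete the pipeline.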
|
On September 13 2020 09:42 Ben... wrote: Does this work? It's for accounts with more than 10 games. ![[image loading]](https://i.imgur.com/wpVd6Rm.png) All I did was add ".set(yscale='log')". Nothing fancy. Yeah I don't think anything meaningful can actually be drawn from this data. I just like tinkering. edit: On September 13 2020 09:31 yubo56 wrote: How are y'all scraping the data, did you just download all of the pages and parse the HTML? Pretty much. I just used the requests library to download each page (you can use the offset GET parameter they use for pagination to hop between pages), then used BeautifulSoup to parse the HTML and then cleaned stuff up a bit before chucking it in CSVs. Nothing too advanced.
Both of your answers are exactly what I was asking about, thanks so much!
Interesting tail on the log scale, I think the extreme high end is basically just Parting, Neeb, and Scarlett holding the distribution up (you said this was NA right). There's a bit of small deviation around 6k, which is kinda interesting, but you're probably already up to like top 20 at that point. That the curves actually track each other so well from 3k-5.5k is really cool; this is the group of people that know how the game works but don't play anywhere near perfectly, and all their balance complaints notwithstanding, balance is incredibly good at their level!
Also, that the tail is linear in semi-log space means the distribution ~ exp(-MMR) right, so not Gaussian? That's also kinda cool to see 
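The semi-log argument can be checked numerically. On a log y axis, an exponential density exp(-x/s) is a straight line (zero curvature), while a Gaussian is a downward parabola (constant negative curvature). The sketch below compares second differences of the two log-densities; the scale, mean, and sigma values are arbitrary illustrative choices, not fitted to the ladder data.

```python
import math

def log_density_exponential(x, scale):
    # log of exp(-x / scale) / scale: linear in x
    return -x / scale - math.log(scale)

def log_density_gaussian(x, mu, sigma):
    # log of the normal pdf: quadratic in x
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))

def second_differences(ys):
    """Discrete curvature of equally spaced samples."""
    return [ys[i - 1] - 2 * ys[i] + ys[i + 1] for i in range(1, len(ys) - 1)]

xs = [4000 + 250 * i for i in range(9)]  # 4000..6000 MMR in steps of 250

# Illustrative parameters only (assumed, not estimated from the data).
exp_curv = second_differences([log_density_exponential(x, 400.0) for x in xs])
gauss_curv = second_differences([log_density_gaussian(x, 2850.0, 700.0) for x in xs])

print("exponential:", [round(c, 6) for c in exp_curv])
print("gaussian:   ", [round(c, 6) for c in gauss_curv])
```

So a tail that looks straight on the log plot is consistent with roughly exponential decay, whereas a Gaussian tail would visibly bend downward over the same range.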
Really appreciate your responses and plots!
|
On September 13 2020 09:42 Ben... wrote: Does this work? It's for accounts with more than 10 games. ![[image loading]](https://i.imgur.com/wpVd6Rm.png) All I did was add ".set(yscale='log')". Nothing fancy. Yeah I don't think anything meaningful can actually be drawn from this data. I just like tinkering. edit: On September 13 2020 09:31 yubo56 wrote: How are y'all scraping the data, did you just download all of the pages and parse the HTML? Pretty much. I just used the requests library to download each page (you can use the offset GET parameter they use for pagination to hop between pages), then used BeautifulSoup to parse the HTML and then cleaned stuff up a bit before chucking it in CSVs. Nothing too advanced.
thank you, this is a much better graph. still don't think we can draw real conclusion from it. but this is better in that we can actually see the differences
|
I did the scatterplots for fun also (y being games played, x is MMR). They're kinda neat.
For this first one, I did log scale for y axis since otherwise it's incredibly cramped. I set it ridiculously wide for this because I noticed something (spoilered because this one's kinda biggish download-wise): + Show Spoiler + Those vertical lines are interesting.
A smaller scatterplot reveals the three most prominent lines on the right are around 4000, 4400, and approximately 4700-4800 MMR, which I'm guessing are cutoffs for various tiers of leagues, probably Master or Diamond 1. A bunch of accounts seem to cluster around the MMR cutoff for league tiers: + Show Spoiler +
Cool stuff.
edit: Yes, that giant space on the left is populated. There's someone with ~200 MMR.
double edit: Cleaned up the plots a smidge and made the y axis more reasonable. I had to do it manually since 10e2 was too small and 10e3 left a bunch of empty space so I went with a range of 10 to 3500 games. When I went from 3000 to 3500 I saw one data point added so somebody has to have played a lot of SC2 this season.
|