With a new OSL coming up, I decided to see whether I could do some statistical analysis of who is likely to win (Bisu?), and perhaps make other predictions, based on simulation. (I got the idea from Lightwip's recent thread.)
First off, let me describe the process, which is in fact rather simple.
The Process
The general idea and most basic assumption is that if you pit any two progamers A and B against each other, the chance that A wins depends on A's winrate vs the race of B relative to B's winrate vs the race of A. The algorithm is as follows:
1. Denote by p1 player 1's winrate vs player 2's race, and by p2 player 2's winrate vs player 1's race.
2. Generate two uniformly distributed random numbers u1 and u2.
3. If p1*u1 > p2*u2, then player 1 wins. Otherwise player 2 wins.
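The three steps above can be sketched like this (the author's implementation is in Matlab; this is a Python equivalent with an assumed function name):

```python
import random

def simulate_match(p1, p2, rng=random):
    """Simulate one match between two players.

    p1: player 1's winrate vs player 2's race
    p2: player 2's winrate vs player 1's race
    Returns 1 if player 1 wins, 2 otherwise.
    """
    u1 = rng.random()  # uniform on [0, 1)
    u2 = rng.random()
    return 1 if p1 * u1 > p2 * u2 else 2
```

A small aside on what this scheme implies: for p1 >= p2, the probability that player 1 wins works out to 1 - p2/(2*p1), so e.g. p1 = 0.7 vs p2 = 0.5 gives player 1 roughly a 64% chance.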
There are naturally many other plausible ways of simulating the outcome of a match. Which one is the most realistic is hard to say, but it is certainly possible to think of schemes that put more or less emphasis on the winrate differences.
Now I can take individual winrates for a large number of progamers and simulate the outcomes of matches between them. I have specifically tried to replicate the proceedings of the (rather complicated) OSL format in order to simulate who will win the OSL. By using a modular approach to the implementation (which is done in Matlab for my own convenience), I can now simulate a number of interesting scenarios, including how the rules for seeding affect the likelihood of certain players to win.
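To give a feel for the structure of such a simulation, here is a much-simplified stand-in (single elimination, invented winrates, and a pair-indexed winrate table instead of the real race-indexed one; the actual OSL format is considerably more complex):

```python
import random
from collections import Counter

def simulate_match(p1, p2, rng):
    # the p1*u1 > p2*u2 rule described above
    return 1 if p1 * rng.random() > p2 * rng.random() else 2

def run_bracket(players, winrate, rng):
    """Single-elimination stand-in for the (more complicated) OSL format.

    winrate[(a, b)] is a's winrate vs b's race; indexing by opponent
    rather than opponent race is a simplification for this sketch.
    Returns the champion.
    """
    field = list(players)
    rng.shuffle(field)  # random bracket placement
    while len(field) > 1:
        field = [a if simulate_match(winrate[(a, b)], winrate[(b, a)], rng) == 1 else b
                 for a, b in zip(field[::2], field[1::2])]
    return field[0]

# Tally "golds" over many simulated tournaments
players = ["Flash", "Bisu", "Stork", "Jangbi"]
winrate = {(a, b): 0.5 for a in players for b in players if a != b}
for b in players:
    if b != "Flash":
        winrate[("Flash", b)] = 0.8  # hypothetical dominant winrate
rng = random.Random(0)
golds = Counter(run_bracket(players, winrate, rng) for _ in range(10000))
```

Repeating `run_bracket` many times and counting champions is exactly the kind of tally plotted in the results below, just with the real format and real winrates.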
The group selection process for OSL round 2 is perhaps the most dubious here, because it's hard to know how the players will choose. I went for simplicity though, letting the seeds always choose the lowest ranked player available from the pool (current ELO rank). Other plausible (but slightly harder to implement) strategies would be to choose players based on winrates.
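The seed-picks-weakest rule is a one-liner; `elo_rank` here is an assumed mapping from player to current ELO rank position (1 = best, so the "lowest ranked" player has the largest rank number):

```python
def pick_opponent(pool, elo_rank):
    """A seed picks the lowest-ranked (weakest by current ELO) player in the pool."""
    return max(pool, key=lambda p: elo_rank[p])
```

For example, a seed choosing from a pool containing the players ranked 6, 40 and 74 would take the player ranked 74.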
For simplicity, the players I included were the 74 highest currently ELO-ranked players according to TLPD. I also added HiyA, who is actually a seeded player for the upcoming OSL but didn't appear in the list because he no longer has a team.

I could add more players, but the problem with adding for example all the ones who play in the offline preliminaries is that many of them don't have any statistics that can be used to guess how well they will perform.
There are a couple of obvious limitations/assumptions/simplifications in this model, but I'll save the boring part for later and jump straight to the results.
Results
If we run 10,000 OSL simulations, each using the same seeded players as the upcoming OSL (Jangbi, Fantasy, Hydra and n.Die_soO seeded into the Ro16, etc.), we arrive at the following distribution of OSL golds:
![[image loading]](http://i.imgur.com/oqPZd.png)
On the y-axis is the number of golds, and on the x-axis the player IDs, sorted as follows:
1 Flash
2 Bisu
3 Fantasy
4 Jaedong
5 Neo.G_Soulkey
6 Stork
7 Leta
8 Stats
9 n.Die_soO
10 Zero
11 Movie
12 Horang2
13 Effort
14 Baby
15 Jangbi
16 Best
17 Calm
18 Bogus
19 Hydra
20 Killer
21 Action
22 Light
23 Brave
24 Crazy-Hydra
25 HoeJJa
26 TurN
27 firebathero
28 Hyuk
29 Shine
30 Dear
31 Jaehoon
32 Modesty
33 Mind
34 RorO
35 Last
36 Reality
37 Shuttle
38 Grape
39 Wooki
40 Sea
41 Perfective
42 Snow
43 Iris
44 sHy
45 By.Sun
46 Tyson
47 s2
48 M18M
49 Classic
50 hyvaa
51 sKyHigh
52 BarrackS
53 Canata
54 Kal
55 Flying
56 ggaemo
57 free
58 hero
59 PianO
60 Ssak
61 Peace
62 great
63 herO[jOin]
64 Chavi
65 PerfectMan
66 Orion
67 Bbyong
68 Where
69 Juni
70 Sacsri
71 hOn_sin
72 Alone
73 Sharp
74 Trap
75 HiyA
The top players:
Flash 16.71%
Fantasy 14.77%
Jaedong 10.40%
Hydra 7.44%
Jangbi 6.16%
Sea 5.97%
Stork 4.40%
Baby 3.00%
HiyA 2.93%
n.Die_soO 2.86%
Calm 2.63%
Bisu 2.07%
Looks like a fairly good boost for the seeded players. Running the same simulation with a different set of seeded players gives:
![[image loading]](http://i.imgur.com/qtBV9.png)
Now the top players are:
Flash 15.42%
Jaedong 9.27%
Bisu 6.43%
Fantasy 5.31%
Sea 4.09%
Effort 3.86%
Neo.G_SoulKey 3.31%
Stork 3.29%
Leta 3.04%
Best 2.85%
Stats 2.53%
Hydra 2.41%
Pretty much regardless of which players you start with as seeds, the above distribution manifests itself. Removing the effect of seeds entirely, by selecting the seeded players at random each time, gives us the following (for 10,000 simulations):
![[image loading]](http://i.imgur.com/6Vvsu.png)
The top players:
Flash 8.81%
Jaedong 6.68%
Bisu 5.17%
Fantasy 4.29%
Effort 3.68%
Sea 3.63%
Leta 3.32%
Neo.G_SoulKey 3.09%
Stork 2.90%
Hydra 2.35%
Stats 2.26%
Best 2.24%
As we can see, the seeds amplify the probability that the strong players win, with a lower chance of upsets. But even without the help of seeds (as in the last simulation above), the overall pattern doesn't change - Flash is still God, Jaedong a few leaps behind, then Bisu and Fantasy, and after that the individual skills quickly start to drown in the statistical noise. In fact, the likelihood of winning an OSL when the seeds are randomized (a measure of individual skill as good as any) declines in an almost exponential fashion:
![[image loading]](http://i.imgur.com/rymKQ.png)
Probability of winning for the top 75 players, on a semilog scale. One interpretation of this is that the skill of individual players is fairly evenly distributed, with a few statistical abominations like Flash and Jaedong.
Limitations
1. The player statistics are a somewhat flawed measure of how well an individual player will perform in a tournament. For veteran players, a past slump may drag down stats that no longer reflect current form, and vice versa - a player with historically impressive stats may be in bad form at the moment. New players, on the other hand, may have played too few games to give a reasonable idea of their abilities - a player may have played only one game in a matchup, and regardless of whether it was a win or a loss, that is a poor statistic on which to base guesses about future games.
I can't really think of any better measure that can be used in practice, however.
2. I haven't added stats for all currently active progamers, partly because most of the ones that I didn't include don't have any recorded games at all. In any case, this means that some of the people who are actually in the offline preliminaries right now are not included in my analysis.
3. I haven't simulated who will advance from the PSL, but rather chosen them at random from the remaining (non-seeded) players. Although it shouldn't be too hard to include, I doubt it would make a very big difference to the general outcome. I guess it's on the to-do list for this project, though.
4. Things like player experience, coaching, ambition, ability to handle stress, current form and so on are of course important factors in any individual league, but they are also hard to quantify. They are things to keep in mind that add to the uncertainty of this type of simulation, but there's not much else to be done about them in terms of extensions to the model.
5. It's not so hard to add things like (approximate) confidence intervals for the winrates. The only reason I haven't is that it's getting late and I wanted to finish this post before going to bed. Edit: In the following simulations I have added approximate 95% confidence intervals based on the central limit theorem.
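Assuming the intervals are the standard normal-approximation (CLT) intervals for a proportion, the half-width can be sketched as follows (the function name is mine):

```python
import math

def ci95_halfwidth(wins, n):
    """Approximate 95% CI half-width for an estimated win probability,
    via the normal approximation: p_hat +/- 1.96 * sqrt(p_hat*(1-p_hat)/n)."""
    p = wins / n
    return 1.96 * math.sqrt(p * (1 - p) / n)
```

For example, 19,760 wins in 100,000 simulations gives a half-width of about 0.25 percentage points, consistent with the ± values quoted in the update below.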
Update
After some painstaking copy-and-paste work from TLPD, I have collected the players' individual match results. This allows me to weight the wins and losses by how long ago the game was played. I chose the following weight function:
w(t) = 0.5 - (1/pi)*arctan((t-100)/100), where t is the number of days since the game was played (the 1/pi factor keeps the weights between 0 and 1).
Basically, this puts a lot of emphasis on recent games and much less on games played long ago. For gamers like Flash and Jaedong, it puts over half of the weight on games played within the last year, and games from the beginning of their careers count for virtually nothing.
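A sketch of the weighting; the scale factor here is 1/pi so that w(t) stays in (0, 1), giving w(0) ≈ 0.75, w(100) = 0.5, and w(t) → 0 for very old games. The `weighted_winrate` helper (my own name) shows how such weights would be applied to a matchup's game history:

```python
import math

def weight(t):
    """Weight for a game played t days ago; 1/pi keeps w(t) in (0, 1)."""
    return 0.5 - math.atan((t - 100) / 100.0) / math.pi

def weighted_winrate(games):
    """games: list of (days_ago, won) pairs for one player in one matchup."""
    total = sum(weight(t) for t, _ in games)
    return sum(weight(t) for t, won in games if won) / total
```

The effect is that a recent win outweighs an old loss: weighted_winrate([(10, True), (1000, False)]) comes out around 0.95, versus 0.5 unweighted.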
Naturally this changes the winrates for all the players, most for the better, some for the worse. Here's a list of the players who had an overall benefit from the weights:
Flash
Bisu
Fantasy
Neo.G_Soulkey
Stork
Leta
Stats
n.Die_soO
Movie
Horang2
Baby
Best
Calm
Killer
Action
Brave
Crazy-Hydra
HoeJJa
TurN
firebathero
Hyuk
Shine
Jaehoon
Modesty
Last
Reality
Grape
Perfective
sHy
By.Sun
s2
hyvaa
sKyHigh
BarrackS
Canata
Flying
hero
PianO
Ssak
Peace
herO[jOin]
Chavi
PerfectMan
Orion
BByong
Where
Juni
Sacsri
hOn_sin
Alone
Sharp
HiyA
Of course, there is a virtually unlimited number of possible weight functions. I think the one I chose captures a player's current likelihood of winning better than the unweighted winrates do, however. For example, Flash went from 75% vT, 71% vZ, 70% vP to an incredible 81% vT, 79% vZ and 68% vP. Jaedong, on the other hand, had his winrates reduced by about 2 percentage points, except for vP, which increased by 1.
Anyway, for some more results!
This time I increased the number of simulations to 100,000. For the static seed (current OSL seed):
![[image loading]](http://i.imgur.com/nmlou.png)
Fantasy 19.76% ±.25%
Flash 18.67% ±.24%
Jaedong 8.07% ±.17%
n.Die_soO 5.49% ±.14%
Hydra 4.38% ±.13%
Jangbi 4.19% ±.12%
Stork 4.05% ±.12%
Baby 3.70% ±.12%
HiyA 3.24% ±.11%
Sea 2.95% ±.10%
Calm 1.98% ±.09%
Hyuk 1.55% ±.08%
Bisu 1.49% ±.08%
(The confidence intervals are approximate 95% intervals, based on the central limit theorem.)
I'm sure you're as surprised as I am by this. Fantasy, wtf? Maybe I did something wrong, but I don't think so - I triple-checked the code. Fantasy's stats are of course worse than Flash's, even though the weights improved them a lot. Somehow the configuration of winrates swung greatly in favour of Fantasy, who in this model has a statistically significant edge over God himself! And who would have guessed n.Die_soO would be so far ahead of Hydra, Stork and Jangbi?
Let's see what happens if we randomize the seeds. Once more for 100,000 simulations:
![[image loading]](http://i.imgur.com/lQnXR.png)
Flash 9.40% ±.18%
Fantasy 5.74% ±.14%
Jaedong 5.07% ±.14%
Bisu 4.33% ±.13%
Neo.G_Soulkey 3.31% ±.11%
Leta 2.99% ±.11%
Stork 2.92% ±.10%
Last 2.51% ±.10%
Effort 2.24% ±.09%
Where 2.18% ±.09%
Best 2.05% ±.09%
BarrackS 2.00% ±.09%
Order seems to be partially restored. The distribution now actually resembles the ELO ranking quite well.
That's it for now, I hope you enjoyed it.
Any comments and criticism are welcome.