Tiebreakers, pt. 3

motbob

United States12546 Posts

June 23 2013 11:39 GMT

Another blog about tiebreakers! You're excited, right? Well, you should be.

In my last blog on this subject, I said that game score was a better tiebreaker than head-to-head because it used a larger sample size to decide who the better player was. Last night, I decided to test that statement.

In Excel, I set up a five-man group. Round robin, best of three, top two players advance -- in other words, a group that you might see in any tournament. The five players were assigned different ELOs. In this case, I used the ELOs of Bogus, yCh, ForGG, Ryung, and Symbol. I simulated the ten matchups in the group using Excel's RAND function. When I pressed F9, a new group would be played out and two players from the five would advance.
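For readers without Excel, the match model described above can be sketched in a few lines of Python. This is a rough sketch under assumptions, not the workbook's actual formulas: it assumes each game is won with the standard Elo expected-score probability, and the ratings in the usage lines are placeholders rather than the players' real TLPD values.

```python
import random

def elo_win_prob(rating_a, rating_b):
    """Standard Elo expected score for A against B, used as a per-game win chance."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))

def play_bo3(rating_a, rating_b, rng):
    """Play a best-of-three that stops as soon as one side has two wins.

    Returns (games_won_by_a, games_won_by_b).
    """
    p = elo_win_prob(rating_a, rating_b)
    a = b = 0
    while a < 2 and b < 2:
        if rng.random() < p:
            a += 1
        else:
            b += 1
    return a, b

# Placeholder ratings, chosen only for illustration.
rng = random.Random(0)
print(elo_win_prob(2100, 2000))
print(play_bo3(2100, 2000, rng))
```

Stopping at two wins matters for the game-score tiebreaker, since a 2-0 and a 2-1 contribute different game differentials.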

It was a pain to set up the tiebreaking rules in Excel. If you want to vet my methodology, my Excel notebook is here. If you know anything about Excel, please look at it and tell me if anything I did could have been done more efficiently.

+ Show Spoiler [Sample] +

=IF(Q15=1,1,IF(Q15=2,2,IF(Q15=3,$F$12,IF(Q15=4,4,IF(Q15=5,$F$17,IF(Q15=6,$F$32,IF(Q15=7,$W$10,IF(Q15=8,8,IF(Q15=9,$F$22,IF(Q15=10,$F$37,IF(Q15=11,$W$11,IF(Q15=12,$F$47,IF(Q15=13,$W$13,IF(Q15=14,$W$16,IF(Q15=16,16,IF(Q15=17,$F$27,IF(Q15=18,$F$42,IF(Q15=19,$W$12,IF(Q15=20,$F$52,IF(Q15=21,$W$14,IF(Q15=22,$W$17,IF(Q15=24,$F$57,IF(Q15=25,$W$15,IF(Q15=26,$W$18,IF(Q15=28,$W$19,-1)))))))))))))))))))))))))

There has got to be a better way to do this...
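One possibly better way is a lookup table instead of a chain of nested IFs. The sketch below is in Python rather than Excel and only mirrors the shape of the sample formula: the codes and cell names are copied from it, while `sheet` is a hypothetical stand-in for the workbook and what each cell actually holds is not assumed.

```python
# Codes that the formula returns unchanged.
SELF = {1, 2, 4, 8, 16}

# Codes that the formula resolves by reading another cell.
DEFER = {
    3: "F12", 5: "F17", 6: "F32", 7: "W10", 9: "F22", 10: "F37",
    11: "W11", 12: "F47", 13: "W13", 14: "W16", 17: "F27", 18: "F42",
    19: "W12", 20: "F52", 21: "W14", 22: "W17", 24: "F57", 25: "W15",
    26: "W18", 28: "W19",
}

def resolve(code, sheet):
    """Return what the nested IF would return for the value in Q15."""
    if code in SELF:
        return code
    if code in DEFER:
        return sheet[DEFER[code]]  # sheet: hypothetical {cell name: value} mapping
    return -1  # same fallback as the final branch of the formula

# Placeholder cell value, for illustration only.
print(resolve(3, {"F12": 2}))
print(resolve(15, {}))
```

The mapping lives in data rather than in nesting depth, so adding or correcting a case is a one-line change.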

ANYWAY. I ran a Monte Carlo simulation, which basically means I got a new set of random numbers and wrote down the results... one million times. In other words, my simulated group played itself out many times over, and I was able to see the results of that.

The results confirmed that I am correct about game score being the best first tiebreaker.

With game score as the first tiebreaker, the first and second most skilled players in the group advanced 77.8% and 44.4% of the time, respectively.

With head-to-head as the first tiebreaker, the first and second most skilled players in the group advanced 76.9% and 44.1% of the time, respectively.

Is this a big or a small difference? On the one hand, it makes a difference less than 1/100 of the time. On the other hand, the change we are talking about is correspondingly small. I mean, we're just talking about the order of when tiebreakers are used! Yet, 0.9% of the time, it's what allows the most skilled player to advance from the group instead of being kicked out of the tournament.
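A rough way to check that a gap this small is not just simulation noise is to compare it with the standard error of each percentage. The sketch below assumes one million independent trials per tiebreaker setting and that the two settings were run separately; if both were scored on the same random draws, the runs are correlated and this calculation does not apply directly.

```python
import math

def standard_error(p, n):
    """Standard error of a proportion p estimated from n independent trials."""
    return math.sqrt(p * (1.0 - p) / n)

def z_score(p1, p2, n1, n2):
    """z statistic for the difference of two independently estimated proportions."""
    se_diff = math.sqrt(standard_error(p1, n1) ** 2 + standard_error(p2, n2) ** 2)
    return (p1 - p2) / se_diff

N = 1_000_000  # assumed number of trials per tiebreaker setting

print(standard_error(0.778, N))     # noise on the top player's advance rate
print(z_score(0.778, 0.769, N, N))  # most skilled player, game score vs H2H
print(z_score(0.444, 0.441, N, N))  # second most skilled player
```

Under those assumptions the noise on each percentage is a few hundredths of a point, so the 0.9-point gap sits far outside it. That only says the gap is real for this one set of ELOs, not that it is large.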

I think this is fairly conclusive evidence that tournaments should stop using H2H as the first tiebreaker in groups of five or more, if their goal is to see the best players advance to the knockout rounds.

Next blog: proving that extended series matches are good for non-finals matches.

Otolia

France5805 Posts

June 23 2013 12:06 GMT

Could we see the results in graphs?

Paljas

Germany6926 Posts

June 23 2013 12:17 GMT

nice research.
but i wonder, aren't there easier/more efficient ways than using excel?

motbob

United States12546 Posts

June 23 2013 12:44 GMT

Whoops, I messed up my math, since in some cases I treated all 5-way ties as hopeless instead of trying to name a winner... fixed now.

By the way, if you want to try to replicate my results, you can look at my excel doc, linked above, and use this thing to run the Monte Carlo simulation.

On June 23 2013 21:17 Paljas wrote:
nice research.
but i wonder, aren't there easier/more efficient ways than using excel?

Yes, but I don't know how to use anything else.

AKnopf

Germany259 Posts

June 23 2013 13:26 GMT

Your research is based on the premise that the better player is the player with the better ELO.

That is not bad at all. But there may be other definitions of who the better player is.

One could argue that a tournament is there to find out who the best player is on that given day (or weekend). In other words: a tournament is a way to redistribute ELO points. Now, what are the criteria by which ELO points should be transferred among players? Results, of course. That's why the winner gets the most ELO points (and the most money). It's obviously a good mechanism in the grand scheme of things. Why should it not be used in small instances of the same problem?

In the big scheme of things: ELO points are your overall statistic, a measure of skill vs all opponents of all time.
In a tournament group, the win-loss statistic is an overall statistic, a measure of skill vs all opponents of that day.

So if you say that in a group, a tiebreaker should be decided by the measure of skill vs all opponents of a day rather than vs a certain opponent, I feel like it's saying: The player with the most ELO points should win the tournament. *

My belief is that a tournament is there to find out who the best player is, not to confirm who the best player is. Therefore you should not orient yourself towards ELO points (which tell who was the best player so far) but towards results on the given day against the given players (which tell who is the best player right now).

* Is that actually a statement you would agree with?

I guess it's a matter of philosophy and preference. So there is nothing wrong with your research. I just wanted to say there may be other standards to measure the correctness of tiebreakers.

For me the following solutions to tiebreakers are the best (from best to worst):
Rematch > Head-to-Head statistic > Group statistic > Overall statistic (like ELO, win-rate...)

motbob

United States12546 Posts

June 23 2013 13:47 GMT

What is a tiebreaker? Why is it used? A tiebreaker is a method used to determine which of two or three players is the most deserving to advance in a tournament. Which player is the most deserving? The one who is the most skilled, probably. You would probably prefer that I say the most deserving player is the one who is playing the best that day, and I can go along with that. How can we tell which player is playing the best? We can't, obviously... we just had a tie in the group, so there's some conflict as to who the best player is. So we play an extra match to break the tie, or use a tiebreaker if we don't have time.

What my post shows is that one commonly used method of breaking ties is slightly better at advancing the better player than another.

Your focus on ELO points is misguided, I think. Let's imagine that Joe Wannabe is mid-masters, but for whatever reason he is GM level vs zerg. He enters Dreamhack and is placed into a group with three other progamer zergs. Let's say those zergs all have an ELO of 2050 or thereabouts. Now, Joe's ELO in TLPD might be something like 1900. But if I were to try to simulate Joe's group, I'd plug in ~2050 for his ELO value, because he's projected to be competitive in this group on this day. In other words, the ELO used in my simulation isn't about all-time performance, it's about the predicted performance of players in a specific group on a specific day. So the simulation doesn't disagree with your idea of what a tournament is trying to measure.

Liquid`Nazgul

22427 Posts

June 23 2013 14:27 GMT

I'll support any effort that proves game scores are better than head to head. If you understand statistics and variance there's no way head to head points out who played better in the group.

Plexa

Aotearoa39261 Posts

June 23 2013 15:01 GMT

Does your model assume all three games are played out when the score becomes 2-0?

AKnopf

Germany259 Posts

June 23 2013 15:05 GMT

OK, the word ELO appears too often in my post. I did not want to imply that you suggest using ELO to solve tiebreakers. I wanted to use the ELO example to speak of scope. I mean, you wouldn't decide who the winner of a tournament is only by the players' ELO - you would rather let them fight it out. So why do you want to decide who the winner of a group is only by the players' overall statistics in this group? Of course it is a larger sample, but does that mean it's more accurate in the sense of competition?

The point is, I have a different understanding of a tiebreaker situation. The question is not who deserves to advance but who earns the right to advance. We are in a situation where two or three players claim the right to advance. Usually winning is the way to obtain that right, so why not battle it out? For me that is the most natural answer to that problem.
If two guys each claim to be a faster runner than the other, why don't you let them run a mile to find out?

The problem is, of course, that tournaments want to finish before 3 AM, and that is a good goal to have. So you need a substitute "algorithm" for when you cannot let the players fight it out, which would be the best case. The best way to "simulate" a fight over the right to advance is to simulate a fight among the parties involved. In this case it's only natural to use the latest results among those players from that very day.
Only if there is also a tie among those players should you use the group statistics as a secondary statistic.

Let me put it this way: Flash and Life end up in a tiebreak. Is it fair to let Flash advance because he won against ActionJesus and SirEatALot (much lesser players) while Life lost to just one of them? In my opinion it would be much more fair if Life advances because he beat Flash only 2 hours ago.

Again, I don't believe your point is wrong. Statistically speaking, I'm sure your approach is much cleaner than mine. But I believe competition is not so much statistics, but rather... well, competition. And competition in SC2 is always head-to-head.

May I ask you one theoretical question? If an unknown player uses a 5-rax SCV pull to rush down Life in a Bo1 situation, does he deserve the win? In my eyes, he does.

motbob

United States12546 Posts

June 23 2013 15:35 GMT

#10

On June 24 2013 00:01 Plexa wrote:
Does your model assume all three games are played out when the score becomes 2-0?

It would be a pretty bad model if it did! Feel free to look at the Excel spreadsheet.

On June 24 2013 00:05 AKnopf wrote:
May I ask you one theoretical question? If an unknown player uses a 5-rax SCV pull to rush down Life in a Bo1 situation, does he deserve the win? In my eyes, he does.

To me, there's no such thing as someone "deserving" a win. That unknown player rolled the dice and his bet paid off. That unknown player, if he advances to the Ro8 instead of Life, is probably going to get destroyed, which is not something I want to see. If that unknown player beat Life head to head but Life did much better against the rest of his group (since he is the better player), I want a tournament system that will recognize that Life made up for the loss and advance him to the Ro8.

What I am saying is that I am primarily concerned with the quality of competition later on in the tournament. Actually, a lot of my opinions on tournament design exist because of that concern. I don't like seeing groups of death because they make the knockout stage much worse. I don't like Bo1 groups because they're so volatile (in SC2 especially).

y0su

Finland7871 Posts

June 23 2013 16:42 GMT

#11

On June 23 2013 23:27 Liquid`Nazgul wrote:
I'll support any effort that proves game scores are better than head to head. If you understand statistics and variance there's no way head to head points out who played better in the group.

I think overall we're just seeing why Swiss-style group play is a far superior choice to round robin (when matches don't need to be scheduled far in advance - like sports).

e: also curious how this would play out in larger groups (6 or 8 player groups).
e2: I also just realized you did groups of 5... when most groups are 4, why?

AKnopf

Germany259 Posts

June 23 2013 18:06 GMT

#12

On June 24 2013 00:35 motbob wrote:
On June 24 2013 00:05 AKnopf wrote:
May I ask you one theoretical question? If an unknown player uses a 5-rax SCV pull to rush down Life in a Bo1 situation, does he deserve the win? In my eyes, he does.

For the players, it is really hard.

But think of the Dreamhack events, where anyone who comes out of the open bracket is suddenly one of the favorites to take the whole title, just because it is so insanely hard to make it through. Don't get me wrong, I like the standard macro games and Bo5s - especially when they are commentated well.* But big open brackets like Dreamhack or Code B are also exciting and bring new blood to the scene.

+ Show Spoiler [Offtopic] +

* Did you see Thorzain vs HasuObs at HSC? That's a great example of that.

SKC

Brazil18828 Posts

June 23 2013 18:26 GMT

#13

About your next blog, "proving that extended series matches are good for non-finals matches", are you trying to prove extended series are fair or that they would ensure the better player has a higher chance of winning?

Because you could easily make a rule that, for example, said that every time a high ranked player lost a series to a low ranked opponent, they would then have to play another series to make sure the player with a lower rank actually deserves to move on. If you model that out you will probably get better results, since it arbitrarily protects the better player, but it's obviously completely unfair.

There's a difference between ensuring the better player has a better chance of moving on and being as fair as possible, and the main issue with extended series is how it completely changes the penalty a player receives from dropping down to the lower bracket depending on who they are playing against.

About the current blog, I personally don't have a problem with either way. Besides the fact that the difference is really small, both are flawed: head to head because it overestimates the importance of a single match when overall performance is more important, and game count because not all 3 games are played, so it doesn't account for the difference between W-L-W, W-W-L and W-W-W, which in theory should matter.

Darkwhite

Norway352 Posts

June 23 2013 19:40 GMT

#14

On June 23 2013 20:39 motbob wrote:
The results confirmed that I am correct about game score being the best first tiebreaker.

With game score as the first tiebreaker, the first and second most skilled players in the group advanced 77.8% and 44.4% of the time, respectively.

With head-to-head as the first tiebreaker, the first and second most skilled players in the group advanced 76.9% and 44.1% of the time, respectively.

I disagree. You have shown a minuscule effect for a group composed of one specific set of ELOs. At the very least, you would have to make sure that your result is stable as you vary the ELOs of the players in the group. It is fairly conceivable that H2H is better when there are large differences in the players' skills, while mapscore is better when they are all closely matched.

I could run some simulations in Python, which is a lot less painful than Excel, but I'm not convinced that the methodology even makes sense.
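A minimal sketch of what such a check might look like follows. It is not motbob's workbook: the tiebreak rules are assumptions (match wins first, then the two tiebreakers in the chosen order, then a coin flip, with head-to-head counted as match wins among the players still tied), the per-game model is the standard Elo expected score, and both rating sets are placeholders.

```python
import itertools
import random

def elo_win_prob(ra, rb):
    """Standard Elo expected score of A vs B, used here as per-game win chance."""
    return 1.0 / (1.0 + 10.0 ** ((rb - ra) / 400.0))

def play_bo3(p, rng):
    """Best-of-three that stops at two wins; returns (a_games, b_games)."""
    a = b = 0
    while a < 2 and b < 2:
        if rng.random() < p:
            a += 1
        else:
            b += 1
    return a, b

def simulate_group(ratings, rng):
    """Round robin; returns match wins, game differential, (winner, loser) pairs."""
    n = len(ratings)
    wins, diff, beat = [0] * n, [0] * n, set()
    for i, j in itertools.combinations(range(n), 2):
        a, b = play_bo3(elo_win_prob(ratings[i], ratings[j]), rng)
        diff[i] += a - b
        diff[j] += b - a
        winner, loser = (i, j) if a > b else (j, i)
        wins[winner] += 1
        beat.add((winner, loser))
    return wins, diff, beat

def rank(stats, order, rng):
    """Rank by match wins, then the tiebreakers in `order`, then a coin flip."""
    wins, diff, beat = stats
    scorers = {
        "wins": lambda i, tied: wins[i],
        "games": lambda i, tied: diff[i],
        # head-to-head: match wins against the players still tied with i
        "h2h": lambda i, tied: sum((i, j) in beat for j in tied),
    }
    groups = [list(range(len(wins)))]
    for name in ("wins",) + tuple(order):
        score, split = scorers[name], []
        for tied in groups:
            ordered = sorted(tied, key=lambda i: score(i, tied), reverse=True)
            for _, same in itertools.groupby(ordered, key=lambda i: score(i, tied)):
                split.append(list(same))
        groups = split
    ranking = []
    for tied in groups:
        rng.shuffle(tied)  # anything still tied is settled at random
        ranking.extend(tied)
    return ranking

def compare(ratings, trials, seed=0):
    """Advance rate per player (top two advance) under each tiebreaker order."""
    rng = random.Random(seed)
    orders = {"games_first": ("games", "h2h"), "h2h_first": ("h2h", "games")}
    counts = {label: [0] * len(ratings) for label in orders}
    for _ in range(trials):
        stats = simulate_group(ratings, rng)  # same results scored both ways
        for label, order in orders.items():
            for player in rank(stats, order, rng)[:2]:
                counts[label][player] += 1
    return {label: [c / trials for c in row] for label, row in counts.items()}

# Placeholder rating sets: one closely matched group, one widely spread.
for name, ratings in {"close": [2050, 2040, 2030, 2020, 2010],
                      "spread": [2300, 2150, 2000, 1850, 1700]}.items():
    print(name, compare(ratings, trials=20000))
```

Scoring the same simulated results under both orders keeps the comparison from being swamped by noise between runs. Whether the gap holds, shrinks, or flips across rating sets is exactly what the loop at the bottom is meant to probe.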

On June 24 2013 03:26 SKC wrote:
Because you could easily make a rule that, for example, said that every time a high ranked player lost a series to a low ranked opponent, they would then have to play another series to make sure the player with a lower rank actually deserves to move on. If you model that out you will probably get better results, since it arbitrarily protects the better player, but it's obviously completely unfair.

This is not comparable. The setup motbob discusses treats all players equally and is blind to whatever their rankings might be. No decision is ever made in tournament play based on the player's ELO. ELOs are only used to simulate results, and the simulations are used to gauge the likelihood of the most skilled player advancing with the two different tiebreakers.

Waxangel

United States33495 Posts

June 23 2013 20:08 GMT

#15

BUT WHAT DOES ALIGULAC.COM SAY?

Complete

United States1864 Posts

June 23 2013 22:00 GMT

#16

I mean, this 'math' doesn't address any of the points that were posted in your previous thread, I don't really understand what this proves...

packrat386

United States5077 Posts

June 23 2013 22:24 GMT

#17

On June 24 2013 05:08 Waxangel wrote:
BUT WHAT DOES ALIGULAC.COM SAY?

All too common recently... -_-

Seriously though I generally feel game score is better... except when it eliminates my favorite players

Solarsail

United Kingdom538 Posts

June 23 2013 22:42 GMT

#18

You... only tested one set of ELOs? "one million" tests doesn't prove anything if you only did that.

You need to test many other distributions of ELOs that might be found in a group to say anything as strong as you claim.

You should also give the significance value. You can't just call it big or small, there's a numerical way to say how strong the effect is.

Finally, your claim is "better", but your data shows (or doesn't show) only the probability of the highest skill player advancing. This is not the only consideration in a tournament; audience satisfaction is important as well. Please only state the conclusion you actually drew and not the one you'd like.

bludragen88

United States527 Posts

June 24 2013 07:41 GMT

#19

On June 23 2013 20:39 motbob wrote:
Next blog: proving that extended series matches are good for non-finals matches.

When it comes to extended series, it seems much fairer to have a 2-1 match lead to a bo3, while a 2-0 match leads to a 1 game lead in a bo5 for the previous winner. This still requires fewer games than the MLG style extended series might (max of 6 rather than 7). It prevents the annoying situation of someone winning 2-0 the first time and losing 1-2 the next (and thus having a winning record over the player who ends up knocking them out) - at worst the players end in a 3-3 tie. I'd be curious to see how this format stacks up against extended series and normal double elimination.
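The game counts above can be checked by brute force. The sketch below enumerates every way the first series and the rematch can finish, assuming the first series is a bo3 won by player A, that "MLG style" means a bo7 continuing from the first score, and that "normal" means a fresh bo3; the function and format names are made up for illustration.

```python
def finishes(need_a, need_b):
    """All final (a_wins, b_wins) tallies of a series that ends as soon as
    A has won need_a games or B has won need_b games."""
    out = set()
    def walk(a, b):
        if a == need_a or b == need_b:
            out.add((a, b))
            return
        walk(a + 1, b)
        walk(a, b + 1)
    walk(0, 0)
    return out

def rematch_targets(first, fmt):
    """Games that A (the first-series winner) and B still need in the rematch."""
    a1, b1 = first
    if fmt == "mlg":                            # bo7 continuing from the first score
        return 4 - a1, 4 - b1
    if fmt == "proposal" and first == (2, 0):   # bo5 with a one-game lead for A
        return 2, 3
    return 2, 2                                 # plain bo3

def summarize(fmt):
    """Return (max total games, best cumulative margin A can hold when knocked out)."""
    max_games, worst_margin = 0, None
    for first in [(2, 0), (2, 1)]:
        need_a, need_b = rematch_targets(first, fmt)
        for a2, b2 in finishes(need_a, need_b):
            max_games = max(max_games, sum(first) + a2 + b2)
            if b2 == need_b:                    # B wins the rematch, A is eliminated
                margin = (first[0] + a2) - (first[1] + b2)
                if worst_margin is None or margin > worst_margin:
                    worst_margin = margin
    return max_games, worst_margin

for fmt in ("bo3", "proposal", "mlg"):
    print(fmt, summarize(fmt))
```

A positive margin in the second value means a player can be eliminated while holding a winning game record over the opponent who knocked them out, which is the situation the proposal is meant to rule out.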

snively

United States1159 Posts

June 24 2013 22:22 GMT

#20

excel... the tool of the gods
lol
