|
motbob
United States12546 Posts
tl;dr: In double elimination brackets, a 1-0 advantage in Grand Finals for the team coming from Winner's increases the chance of the better team winning the tournament.
In Dota, double-elimination brackets are almost always used, and the grand finals are almost always Bo5. Tournaments have not agreed, however, on whether to give teams from the Winner's Bracket an advantage in Grand Finals. For example, when Alliance and Na`Vi played in the TI3 finals, the series started 0-0, but when those two teams played in Starladder Season 8, Alliance started up 1-0 because they came from the Winner's Bracket.
I'm under the impression that spectators don't like the 1-0 start, but some tournaments (D2CL and Starladder most notably) employ it nonetheless.
Being a massive nerd, I have these various brackets simulated in Excel, so I decided to run some tests to see how, in theory, a winner's bracket advantage affects the tournament outcome.
The best team doesn't always win a tournament. Dota is a game with a lot of variance involved, and it only takes a glance at Dota2lounge bet odds to see that. There is a 100% chance that Secret is a better team than M5, but the odds of Secret winning against M5 are not 100%. Nor is the chance of Secret winning a tournament against 7 other scrub teams 100%.
I think that an implicit goal of tournament organizers is to create a format where the best team has a good chance to win. Spectators generally want this. An uproar would surely result if a tournament advanced the second place team to bracket, rather than the first place team, or made the Grand Finals a Bo1. A caveat: spectators want to see good teams earn the win, which is probably why 1-0 advantages leave a bad taste in their mouths.
So if tournament organizers want to create a tournament format where the best team wins most often, spectators be damned, they should create a simulated bracket with teams assigned Elo values (representing "true" skill), run the simulation 10,000 times, and see how many times the best team won with (1) a 1-0 advantage in Grand Finals and (2) no advantage! Or let me do it.
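To make the setup concrete, here is a minimal sketch of the two building blocks such a simulation needs: a per-game win probability derived from two ratings, and the chance of winning a series from a given starting score. It assumes the standard logistic Elo formula on a 400-point scale and independent games; the Excel sheet may well do this differently, and the function names are made up for illustration.

```python
def elo_win_prob(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard Elo formula
    (assumption: 400-point logistic scale)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def series_win_prob(p: float, need_a: int, need_b: int) -> float:
    """Probability that A collects need_a game wins before B collects need_b,
    when A wins each game independently with probability p."""
    if need_a == 0:
        return 1.0
    if need_b == 0:
        return 0.0
    return (p * series_win_prob(p, need_a - 1, need_b)
            + (1.0 - p) * series_win_prob(p, need_a, need_b - 1))


if __name__ == "__main__":
    p = elo_win_prob(1500, 1480)
    # Bo5 from 0-0: both teams need three wins.
    print("0-0 start:", series_win_prob(p, 3, 3))
    # Bo5 with a 1-0 head start: the winner's bracket team needs only two.
    print("1-0 start:", series_win_prob(p, 2, 3))
```

For two evenly matched teams (p = 0.5), the 1-0 start moves the winner's bracket team from 50% to 68.75% in the grand finals taken in isolation.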
First, I simulated a bracket with two good teams and a bunch of scrubs (1500 Elo, 1480, and a bunch of 1300s). The best team won 51.6% of the time without a Grand Finals advantage, and 52.2% with a 1-0 advantage. That's a 0.6% increase. (Note that the only number we really care about is the increase.)
Second, I simulated an Elo distribution that resembled TI4, meaning that there were a few teams clustered near the top and some semi-competitive teams just afterwards. Here we saw an increase of 1.7% in the best team's win chance from no advantage to 1-0 advantage.
Third, I simulated a very steady drop in Elo (1500, 1490, 1480, 1470...). With this distribution, the best team saw a 1.4% chance increase in winning.
To clarify: one thing to note about the above simulations is that I'm simulating the whole tournament, not the grand finals. In some runs of the simulation where the best team ended up winning, the team lost in Winner's and won GF coming from Loser's. In other runs, the team won Winner's and then won GF.
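For anyone who wants to try this outside Excel, a rough Monte Carlo sketch of a whole-tournament run might look like the following. Everything structural here is an assumption rather than a description of my spreadsheet: 8 teams, standard 1v8 / 4v5 / 2v7 / 3v6 seeding, every series before grand finals a Bo3, and the standard 400-point Elo formula. The names `run_tournament` and `best_team_win_rate` are hypothetical.

```python
import random


def elo_win_prob(a: float, b: float) -> float:
    # Assumption: standard Elo expected score on a 400-point scale.
    return 1.0 / (1.0 + 10 ** ((b - a) / 400.0))


def play_series(rng, ratings, a, b, need_a, need_b):
    """Play games until a has need_a wins or b has need_b; return (winner, loser)."""
    p = elo_win_prob(ratings[a], ratings[b])
    wins_a = wins_b = 0
    while wins_a < need_a and wins_b < need_b:
        if rng.random() < p:
            wins_a += 1
        else:
            wins_b += 1
    return (a, b) if wins_a == need_a else (b, a)


def run_tournament(rng, ratings, gf_advantage):
    """Simulate one 8-team double-elimination bracket; return the champion's index.
    ratings must be sorted best-first, so index 0 is the best team."""
    def bo3(a, b):
        return play_series(rng, ratings, a, b, 2, 2)

    # Winner's bracket (assumed seeding: 1v8, 4v5, 2v7, 3v6).
    w1 = [bo3(0, 7), bo3(3, 4), bo3(1, 6), bo3(2, 5)]
    w2 = [bo3(w1[0][0], w1[1][0]), bo3(w1[2][0], w1[3][0])]
    wf = bo3(w2[0][0], w2[1][0])

    # Loser's bracket: round-2 losers cross over to the other half.
    l1 = [bo3(w1[0][1], w1[1][1]), bo3(w1[2][1], w1[3][1])]
    l2 = [bo3(l1[0][0], w2[1][1]), bo3(l1[1][0], w2[0][1])]
    l3 = bo3(l2[0][0], l2[1][0])
    lf = bo3(l3[0], wf[1])

    # Grand finals Bo5: with the advantage, the winner's bracket team needs 2 wins, not 3.
    need_wb = 2 if gf_advantage else 3
    champion, _ = play_series(rng, ratings, wf[0], lf[0], need_wb, 3)
    return champion


def best_team_win_rate(ratings, gf_advantage, runs=10_000, seed=0):
    """Fraction of simulated tournaments won by the top-rated team (index 0)."""
    rng = random.Random(seed)
    wins = sum(run_tournament(rng, ratings, gf_advantage) == 0 for _ in range(runs))
    return wins / runs


if __name__ == "__main__":
    ratings = [1500, 1480, 1300, 1300, 1300, 1300, 1300, 1300]
    print("no advantage: ", best_team_win_rate(ratings, gf_advantage=False))
    print("1-0 advantage:", best_team_win_rate(ratings, gf_advantage=True))
```

Because the bracket details are guesses, the output should not be expected to match the percentages quoted above; the point is only that the champion is read off the full bracket, so a run counts for the best team whether it arrives in grand finals from Winner's or from Loser's.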
So with these different distributions of Elo, creating a 1-0 advantage increased the chance of the best team winning the whole tournament. I can't say for sure that that would be true for any combination of teams, but I think that's what these results imply. If y'all want me to test unusual Elo distributions or weird tournament formats (e.g. Bo5 WF instead of Bo3), ask in the comments.
The conclusion I derive from these results is this: if tournament organizers are concerned solely with creating a format where the best team wins, they should have GF with a 1-0 advantage. But the difference between formats seems small enough that, if I were an organizer, I would just keep doing what spectators want (no advantage).
|
you're just defining "best team" to describe a team that's won more consistently over a tournament, but what if i want to define "best team" as "the team that can win a best of 5 from 0-0 in this grand finals situation that's about to come up, regardless of recent performance"? you can't just beg the question of what it means to be the "best team" like this, using an elo definition that isn't going to be universally agreed upon as a good metric of "best team at the time of the grand finals"
in other words, you can't claim this without defining "best," and i doubt there will be any universally agreed-upon definitions of "best" here
|
motbob
On March 14 2015 18:24 SpiritoftheTunA wrote: you're just defining "best team" to describe a team that's won more consistently over a tournament, but what if i want to define "best team" as "the team that can win a best of 5 from 0-0 in this grand finals situation that's about to come up, regardless of recent performance"? you can't just beg the question of what it means to be the "best team" like this, using an elo definition that isn't going to be universally agreed upon as a good metric of "best team at the time of the grand finals"
I am definitely not describing "best team" as the team that "won more consistently over a tournament," and I'm confused as to why you think that.
|
the point stands that elo (a measure of their relative position of being able to win over lots of games) can't capture intangibles like grand-finals choke factor, grand-finals pocket strat or strat countering, etc
this approach makes more sense for games like chess, but dota is a game where pick/ban metagaming and particular style clashes can result in consistent "elo upsets": that is, elo isn't a complete description of relative standing, since individual teams can have individual interactions that don't necessarily accord with their elo standing when it comes down to head-to-head matchups
|
motbob
On March 14 2015 18:35 SpiritoftheTunA wrote: the point stands that elo (a measure of their relative position of being able to win over lots of games) can't capture intangibles like grand-finals choke factor, grand-finals pocket strat or strat countering, etc
Maybe my use of the term "Elo" is confusing. When I talk about Elo, I am talking about actual, "true" skill; that is, the unknowable actual ability of a team to win matches. We try to approximate teams' (or, in SC2, players') skill by using metrics like the Gosugamers ranking or Aligulac, but the true skill is unknowable. But just because it is unknowable in real life doesn't mean we can't use it in a simulation.
As to all those other factors, those are sort of like random chance. They don't really matter in the simulation. Since the team in the loser's bracket could just as easily be the more choke-inclined team as the team in the winner's bracket, the simulation isn't biased by the presence of those factors. Over the course of 10,000 runs of the simulation, those sorts of things would cancel out. If you disagree, I'd love to hear why.
|
extreme example: there's 100 teams, the two best are at 1500 and 1470, but 1500 got there by running a strat that's particularly good at beating 60 of the other teams (say, 70% of the time), 1470 got there by running a strat that's good at beating 38 of the other teams (70% of the time) and the 1500 team (55% of the time): in the head to head, the 1470 tends to win against the 1500, but that isn't reflected in the number alone. can this situation not exist in your model?
the numbers are fudged and i'm sure the elo would turn out different with those winrates entered, but the point remains. i realize the 1470 rating might climb above the 1500 if they faced head to head over and over and got adjusted to reflect their 55% winrate, but with lots of games against the other teams, they tend to fall back into the relative position of 1470<1500
elo, from my understanding, doesn't actually refer to an unknowable true skill of a team, it refers to a number that's generated based on win/loss history against other teams with elo ratings in the same league
|
motbob
On March 14 2015 18:48 SpiritoftheTunA wrote: extreme example: there's 100 teams, the two best are at 1500 and 1470, but 1500 got there by running a strat that's particularly good at beating 60 of the other teams (say, 70% of the time), 1470 got there by running a strat that's good at beating 38 of the other teams (70% of the time) and the 1500 team (55% of the time): in the head to head, the 1470 tends to win against the 1500, but that isn't reflected in the number alone. can this situation not exist in your model?
the numbers are fudged and i'm sure the elo would turn out different with those winrates entered, but the point remains
Yeah, that happens some of the time in real life. But also in real life you may have a 1500 and 1470 team and the 1500 matches up super well against the 1470 team, more so than the difference in overall skill suggests. Both sides of that coin exist and act to cancel each other out over the course of 10,000 runs.
Anyway, I'm not really seeing the connection between any of these criticisms and the final conclusion of the post. How do these alleged flaws in the methodology change the result?
|
the result is fine if you define "best team" to accord with elo as how you've set it up in the premise, i just disagree with that definition of "best team." i think the problem that top teams can have idiosyncratic modifiers against each other, like tending to choke against one particular team or style, isn't one that can be smoothed out in this kind of simulation that assumes the elo rating differential, with each team only having one elo rating regardless of opponent, is the singular measure of probable outcome
|
motbob
On March 14 2015 18:54 SpiritoftheTunA wrote: the result is fine if you define "best team" to accord with elo as how you've set it up in the premise, i just disagree with that definition of "best team." i think the problem that top teams can have idiosyncratic modifiers against each other, like tending to choke against one particular team or style, isn't one that can be smoothed out in this kind of simulation that assumes the elo rating differential, with each team only having one elo rating regardless of opponent, is the singular measure of probable outcome
OK, but again, why? Give me an alternate definition of "best team," and explain why the post's result (that a 1-0 advantage leads to the "best team" winning the tournament more often) changes.
|
as a fan of seeing a full BO5, i'll define "best team" to be the "team that tends to win a BO5 in the grand finals more in a head to head matchup." this definition of "best team" is dynamic and depends on which 2 teams are in the grand finals, since the only thing that goes into this is their ability to win against the grand finals opponents.
this "best team" can be the team with the lower Elo, so long as they tend to win against their fated opponent in the grand finals
given this, a 1-0 advantage for the team that tends to get into the winners bracket spot more would decrease that team's chance of winning, were they the "best team" per my definition
|
motbob
On March 14 2015 19:14 SpiritoftheTunA wrote: as a fan of seeing a full BO5, i'll define "best team" to be the "team that tends to win a BO5 in the grand finals more in a head to head matchup." this definition of "best team" is dynamic and depends on which 2 teams are in the grand finals, since the only thing that goes into this is their ability to win against the grand finals opponents.
this "best team" can be the team with the lower Elo, so long as they tend to win against their fated opponent in the grand finals
given this, a 1-0 advantage for the team that tends to get into the winners bracket spot more would decrease that team's chance of winning, were they the "best team" per my definition
OK, why would the 1-0 decrease that team's chance of winning? If this definition is so dynamic, half of the time the team with the 1-0 advantage would be the "best team." Since half of the time, your dynamic "best team" is on the winner's side, and half of the time it's on the loser's side, things would seem to be a wash. Unless there is some reason why your "best team" would usually be on loser's side?
|
no you're right. my bad. i guess my issue is that most people don't necessarily want the format to be such that the best team wins more, they just want the grand finals bo5 to seem more fair in a vacuum, a sentiment i generally agree with.
|
motbob
On March 14 2015 19:25 SpiritoftheTunA wrote: no you're right. my bad. i guess my issue is that most people don't necessarily want the format to be such that the best team wins more, they just want the grand finals bo5 to seem more fair in a vacuum, a sentiment i generally agree with.
Yeah, it's something I agree with too. I want to see more games in the GF so the small % chance of a "better" result doesn't really matter to me.
|
So in your simulation did the 1-0 advantage lead to closer finals than 0-0? By which I mean, how often did series end 3-2 with and without the winner's bracket advantage? My initial assumption would be 0-0 but I may be missing something
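For the grand finals taken on its own, this part can be worked out exactly rather than simulated. The sketch below assumes independent games and a fixed per-game win probability `p_wb` for the winner's bracket team (a hypothetical parameter, not something taken from the simulation), and counts a "3-2" finish as any series that reaches a deciding game at 2-2.

```python
def prob_deciding_game(p_wb: float, wb_advantage: bool) -> float:
    """Chance that a Bo5 grand final reaches 2-2 and so ends 3-2.

    p_wb is the winner's bracket team's per-game win probability
    (assumed constant and independent from game to game).
    """
    q = 1.0 - p_wb
    if wb_advantage:
        # Starting 1-0, the score reaches 2-2 when the winner's bracket team
        # wins exactly 1 of the next 3 games: C(3, 1) * p * q^2.
        return 3 * p_wb * q ** 2
    # Starting 0-0, the score reaches 2-2 when each team wins exactly 2 of
    # the first 4 games: C(4, 2) * p^2 * q^2.
    return 6 * p_wb ** 2 * q ** 2


if __name__ == "__main__":
    for p in (0.5, 0.55, 0.6):
        print(p, prob_deciding_game(p, False), prob_deciding_game(p, True))
```

The 0-0 figure is 2 * p_wb times the 1-0 figure, so the two formats produce deciding games equally often when the teams are even (37.5% each), and the 0-0 start produces more of them whenever the winner's bracket team is the favorite.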
|
simply put, this only works if the algorithm places the same value upon a team whose chances to win are steady throughout a boX and through a tournament - two great counter examples would be Navi at TI2 and VG at TI4, both of those teams were significantly better at Bo3s than Bo5s due to the volatile nature of their strategy. Truly great teams tend to change and learn during the process of a big tournament, and assigning some inherent elo couldn't be further from a representation of the realities of dota.
If all you're concerned about is the "team with best elo" then it becomes rather reductive - why even have an elimination bracket at all, just do a quadruple round robin - that increases the chances of the "best team" under your definition winning significantly more
|
motbob
On March 14 2015 22:18 Kupon3ss wrote: simply put, this only works if the algorithm places the same value upon a team whose chances to win are steady throughout a boX and through a tournament - two great counter examples would be Navi at TI2 and VG at TI4, both of those teams were significantly better at Bo3s than Bo5s due to the volatile nature of their strategy. Truly great teams tend to change and learn during the process of a big tournament, and assigning some inherent elo couldn't be further from a representation of the realities of dota.
If all you're concerned about is the "team with best elo" then it becomes rather reductive - why even have an elimination bracket at all, just do a quadruple round robin - that increases the chances of the "best team" under your definition winning significantly more
As discussed above, the "truly great" adaptive team may be on the winner's side or the loser's side of grand finals, so that particular factor is a wash.
The purpose of this simulation was not to test radically different tournament formats, but rather to explore a minor split between tournament organizers on how to run things.
|
the adaptive team is significantly more likely to be on the loser's side than winners, you can imitate this by running the same simulation with a "growth" stat on your team's elos
|
motbob
On March 14 2015 22:28 Kupon3ss wrote: the adaptive team is significantly more likely to be on the loser's side than winners, you can imitate this by running the same simulation with a "growth" stat on your team's elos
I don't understand why "adaptive" teams are more likely to be in losers than winners.
|
simulate the bracket with some "stagnant" teams (say 1500 + 1.0 per game) and some "adaptive" teams (say 1470 + 2.5 per game)
|
motbob
I created a variable for adapting well and adapting poorly. I made the variable ridiculously powerful: 20 rating per round for well-adapting teams and 5 per round for teams that didn't adapt well. 8 team, double elim, teams close in terms of skill (1500, 1490...).
I assumed that teams would adapt equally well whether they were watching games or playing games. That is, if Secret were sitting on the sidelines watching Na`Vi and EG play in Loser's, Secret would not be disadvantaged just because they were taking notes instead of playing. This assumption was made because to assume otherwise would be to say that the lower bracket team has an inherent advantage because they played more games, which I don't think Kupon was arguing. Also, I don't think I know how to code a "per game" adaptation in Excel.
In grand finals, both teams had adapted "five times," for an increase of 100 points for adapters and 20 points for non-adapters. The result remained the same: the "better team" as defined by a static Elo value won 0.5% more often in a bracket with a 1-0 advantage. (The bracket is different from the ones in earlier tests, so the 0.5% figure isn't comparable to any of the previous results.)
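As a rough illustration of the adaptation variable, the sketch below adds a per-round bonus to a base rating and feeds the result into the standard 400-point Elo formula (an assumption; the names are hypothetical). Which team adapts well in the example is picked arbitrarily. Taken literally, 5 per round over five rounds comes to 25 rather than the 20 quoted above, so the sketch simply uses the per-round figures.

```python
ADAPTS_WELL = 20.0    # rating gained per round by a well-adapting team
ADAPTS_POORLY = 5.0   # rating gained per round by a poorly adapting team


def effective_rating(base: float, gain_per_round: float, rounds_elapsed: int) -> float:
    """Base rating plus a linear per-round adaptation bonus.
    Assumption: the bonus accrues per round whether the team plays or watches."""
    return base + gain_per_round * rounds_elapsed


def elo_win_prob(a: float, b: float) -> float:
    # Assumption: standard Elo expected score on a 400-point scale.
    return 1.0 / (1.0 + 10 ** ((b - a) / 400.0))


if __name__ == "__main__":
    # Hypothetical grand finals after five rounds of adaptation:
    # a 1500 team that adapts poorly against a 1490 team that adapts well.
    static_best = effective_rating(1500, ADAPTS_POORLY, 5)
    fast_learner = effective_rating(1490, ADAPTS_WELL, 5)
    print(static_best, fast_learner, elo_win_prob(static_best, fast_learner))
```

In this particular pairing the team with the higher static rating enters grand finals as the per-game underdog, which shows how strong the variable is.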