I didn't want to reduce the outcome of a match purely to statistics, but that is what we can calculate. That is why I pointed out all the assumptions made for this evaluation.
But we cannot deny that statistics plays a role. Why is the BO3 format preferred to a single match? Because the better team has a higher success rate.
Edit: We also know from experience that the winners bracket team has a big advantage. Many pointed out that this destroys the pleasure of watching a grand final.
On March 16 2015 07:05 micronesia wrote: While it's true that chance of winning is variable with time and depends on a variety of factors, it is unrealistic to try to model those variations. An example is calculating the odds of getting a 300 if you know your odds of getting a strike in bowling. When you get to frames 8, 9, 10, you most likely will get nervous (which can be exacerbated depending how the people around you react), affecting how you bowl. Of course, you are also getting more physically tired as the game progresses, and the conditions of the lane (oil) are slowly changing. The surface of your bowling ball(s) is also changing over time. On a given throw, any of those effects can have a positive or negative effect on your likelihood of throwing a strike.
You can use a simplified model and say the odds of getting a 300 are 1% if you throw strikes with a consistent success rate of about 68 percent. If you argue that the model does not fully account for the other variables described above, you are correct, but then the only alternative is to say there's no point in doing any calculation at all. Instead, we perform the calculation anyway and just acknowledge what was and was not modeled. It is still interesting to determine that you need a 68% chance of getting a strike to roll a 300 one game in 100.
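As a rough check of the bowling numbers, here is a minimal Python sketch. It assumes a perfect game needs 12 consecutive strikes and that every throw is independent with the same strike probability, which is exactly the simplification described above. The function names are made up for illustration.

```python
def perfect_game_probability(strike_rate: float, strikes_needed: int = 12) -> float:
    """Chance of a 300 if every throw is an independent strike with the same rate."""
    return strike_rate ** strikes_needed


def strike_rate_for(target: float, strikes_needed: int = 12) -> float:
    """Strike rate needed so that a perfect game happens with probability `target`."""
    return target ** (1.0 / strikes_needed)


if __name__ == "__main__":
    print(f"P(300 | 68% strikes) = {perfect_game_probability(0.68):.4f}")
    print(f"strike rate for a 1% chance of 300 = {strike_rate_for(0.01):.3f}")
```

Under those assumptions a 68% strike rate comes out a bit under 1% for a perfect game, in line with the figures quoted above. Nerves, fatigue and lane conditions are deliberately not modeled.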
edit: tank, the edit you made to your post while I was typing seems to already address what I was getting at
Yes, while I lean towards saying no one should predict, prediction is fine as long as you are being honest with people about what you are actually doing and what its shortcomings are.
EDIT: The obvious counterargument is that the layman cannot understand the methodology and therefore cannot make a reasoned judgment as to whether to accept the outcome. However, they can read the thread, in which you have called me an incompetent liar. So laymen do have the opportunity to make a reasoned judgment on something like this since they can observe your arguments.
The burden is on you when you present statistics as part of advocating a position to also present fully your methodology and the limitations of your design and the constraints in the applicability of your results.
On March 16 2015 10:38 itsjustatank wrote: The burden is on you when you present statistics as part of advocating a position to also present fully your methodology and the limitations of your design and the constraints in the applicability of your results.
I agree in principle, but you have to consider the medium he publishes in, and how important the factors he leaves out are.
As this is a gamers' forum, I feel we can forgive him for not going into detail about, for example, the possible inaccuracies of the ELO system. His main point is likely not affected by that.
However, presenting very small differences between numbers without mentioning the uncertainty of the numbers is a big deal, as it can significantly change his point (for example to "I've been measuring noise").
This OP seems to forget that if you start at 0-0 in the finals, your tournament is "double elimination" for all but the best team, which is unfair. Way worse to have an unfair format rather than hurt the viewer's feelings a little. I understand that you may want to sacrifice fairness for quality of show, especially when it doesn't disadvantage anyone but the best teams, and who gives a fuck about fairness for the best, they're the best anyway...
On March 16 2015 16:47 ZenithM wrote: This OP seems to forget that if you start at 0-0 in the finals, your tournament is "double elimination" for all but the best team, which is unfair. Way worse to have an unfair format rather than hurt the viewer's feelings a little. I understand that you may want to sacrifice fairness for quality of show, especially when it doesn't disadvantage anyone but the best teams, and who gives a fuck about fairness for the best, they're the best anyway...
something you and the OP did was not differentiate between "best team" (going in) and "winner's bracket finalist". Granted, that "should" be the best team, but for the sake of statistics the "best team" doesn't always make the winner's side...
I'm still curious to see how often the highest elo team ends up winning if the tournament is single elimination or true double elimination (compared to the bo5 at 0-0 and 1-0).
On March 16 2015 18:46 ZenithM wrote: Yeah, "best" in my post meant "winner's bracket finalist", the only "best" that matters in respect to fairness of the competition.
If you think that the only meaningful use of the term "best player" is the winner's bracket finalist, then imho you missed the entire point of the OP. The point is that the best team, as in the team having a larger than 50% probability to beat any other team (and THAT'S a useful definition of best), can end up in the losers bracket, and should then be given a chance to prove that they are indeed better than the winner's bracket finalist.
If you claim that the winner's bracket finalist is always the best team, then the optimal way to have the best team win is to just give the tournament to the winner's bracket finalist, i.e. single elimination.
I am a bit confused I have to say, maybe I just misunderstand you...
On March 15 2015 09:27 motbob wrote: So in this context, an appropriate objection isn't "you didn't do a proper statistical test" because we don't care about inferences and P-values here. We can get the true value, or approach it, just by cranking up the number of simulation runs.
Umm, yeah, you kinda have to do some kind of statistical test, or at least convince us in some way that your numbers are accurate enough that we can feel confident the differences you quote are more than random noise. We can never get the true value by simulation (infinite-accuracy computer simulations with infinite computing time have some practical issues, unfortunately, especially in Excel), but we can often get close enough with enough computing time. It is incredibly important that you make sure you are actually putting in enough computing time to get sufficiently accurate numbers out. Did you?
For example, take your first example of 51.6% vs 52.2% from 10k runs. This is close enough to flipping a coin, which has a relative error of around 1/sqrt(N). For 10k runs that is a 1% relative uncertainty, or roughly 0.5 percentage points on a win rate near 50%, which is about the size of the difference you are seeing. So I think I need some convincing that the differences you are quoting are more than just numerical noise. Let me know if you need help.
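To put a number on that worry, here is a minimal sketch of the check in Python. It assumes each simulation run is an independent win/lose trial (so the win rate is binomial) and that the two formats were simulated with independent random numbers. The helper names are hypothetical, and this is not the OP's spreadsheet method.

```python
import math


def proportion_std_error(p_hat: float, n_runs: int) -> float:
    """Binomial standard error of a win rate estimated from n_runs independent runs."""
    return math.sqrt(p_hat * (1.0 - p_hat) / n_runs)


def difference_z_score(p1: float, p2: float, n1: int, n2: int) -> float:
    """How many combined standard errors separate two independently estimated rates."""
    combined = math.hypot(proportion_std_error(p1, n1), proportion_std_error(p2, n2))
    return (p2 - p1) / combined


if __name__ == "__main__":
    se = proportion_std_error(0.516, 10_000)
    z = difference_z_score(0.516, 0.522, 10_000, 10_000)
    print(f"standard error per estimate: {se:.4f}")
    print(f"z-score of the difference:   {z:.2f}")
```

With the quoted numbers the two estimates are less than one combined standard error apart, so 10k runs cannot really distinguish them. Since the error shrinks like 1/sqrt(N), getting it ten times smaller takes about a hundred times more runs.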
Nonetheless, the idea of the simulation is great! I love the approach.
I actually made exactly the same remark on the LiquidDota version of this blog. Errors and standard deviations are important, regardless of how many toys you run, at the very least so we can see how significant the result is.
I'd also be interested in seeing the correlation between, say, the ELO difference between the top two teams and the top team's win rate. You'd definitely expect some correlation, but if it's too strongly correlated (or the reverse, I guess), then I'd say there's a bias there that you'd have to take into account when dealing with the significance of the results. Or do some reweighting in your Monte Carlo. I mean, maybe it's a small thing, but it'd be nice to see.
Edit: my knowledge of statistics comes from particle physics, where we do some weird stuff that isn't necessarily rigorously mathematically correct. Our Monte Carlo samples are often >500k events, and we still worry about statistical uncertainties (not to mention systematics, which might come into play here as part of your ELO definitions). Still want to see the errors, though.
Ahaha, I'm an (ex) particle physicist myself. :D Wrote a minimum bias event generator. QCD phenomenology, essentially.
Good to see the particle physics kind of thinking around. What exactly are you doing? (Did you do?) Your location is Switzerland, so I guess LHC?
Oh cool! My masters project was writing an event generator for black hole events at the LHC, which was fun. Now I do experimental stuff, which is far less fun. Yep, working on ATLAS, a little over halfway through my PhD. Looking for SUSY - I don't hold much hope for getting a positive result. xD
On March 16 2015 18:46 ZenithM wrote: Yeah, "best" in my post meant "winner's bracket finalist", the only "best" that matters in respect to fairness of the competition.
If you think that the only meaningful use of the term "best player" is the winner's bracket finalist, then imho you missed the entire point of the OP. The point is that the best team, as in the team having a larger than 50% probability to beat any other team (and THAT'S a useful definition of best), can end up in the losers bracket, and should then be given a chance to prove that they are indeed better than the winner's bracket finalist.
If you claim that the winner's bracket finalist is always the best team, then the optimal way to have the best team win is to just give the tournament to the winner's bracket finalist, i.e. single elimination.
I am a bit confused I have to say, maybe I just misunderstand you...
I think I understand what the OP wants to say, I just say that if you remove the 1-0 advantage, the "best team" (in the sense of "the one who didn't lose") is not rewarded at all for being the best that day, because its opponent has had the opportunity to lose once already, and this same opportunity is denied to the "best". As I understand it, the OP claims that the statistical difference in the chances that the best team wins (the best gameplay wise this time) is negligible compared to how badly the 1-0 advantage is perceived by the viewers. I'm just saying that if you remove that, the tournament becomes unfair, and certainly doesn't deserve to be called "double elimination". And back in the day, it wasn't even a 1-0 advantage, it was a full Bo5 advantage (back in early MLGs). Now that shit was sad to watch ;D
Interesting idea. I wasn't sure the conclusion would hold so I did a few calculations myself. I calculated the chances of reaching the final for a team assuming they have a chance P to win a single game vs anyone (and thus 1-P to lose). To determine if the 1-0 advantage is good or not for the best team, it matters what the relative probabilities of reaching the final by the winner's or loser's bracket are. So I calculated it for tournaments of size 8, 16 and 32. The first graph shows it: the red lines are for tournament size 8, the bumpy one being reaching the final by loser's bracket, the other one by winner's bracket. Likewise blue for 16 teams and green for 32. (All matches being bo3.) Basically, the more rounds you have or the worse the team is, the bigger the relative chance of reaching the final by loser's bracket. This means that for a bigger tournament, even if you are a dominant team (60% to win a single game against any other team), you are still more likely to enter the final by loser's bracket than winner's bracket.
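For anyone who wants to play with this kind of calculation, here is a minimal Python sketch of one way to set it up. Everything about the bracket is an assumption on my part: a standard double elimination layout for a power-of-two field, every match a bo3, and a fixed per-game win chance against every opponent. The function names are hypothetical, and because the exact model behind the graphs isn't spelled out, the numbers need not match them.

```python
def bo3_win_probability(p_game: float) -> float:
    """Chance to win a best-of-3 given a fixed per-game win chance."""
    return p_game ** 2 * (3.0 - 2.0 * p_game)


def reach_final_probabilities(p_game: float, n_teams: int) -> tuple[float, float]:
    """Return (via winner's bracket, via loser's bracket) chances to reach the grand final.

    Assumed bracket: n_teams is a power of two, the winner's bracket has
    `rounds` rounds, and the loser's bracket has 2 * (rounds - 1) rounds.
    """
    rounds = n_teams.bit_length() - 1
    if n_teams < 4 or 2 ** rounds != n_teams:
        raise ValueError("n_teams must be a power of two, at least 4")
    q = bo3_win_probability(p_game)

    # Winner's bracket route: win every winner's bracket match.
    via_wb = q ** rounds

    # Loser's bracket route: lose once in winner's bracket round k, then never again.
    via_lb = 0.0
    for k in range(1, rounds + 1):
        # A round-1 loser plays every loser's bracket round; later losers drop in further along.
        lb_wins_needed = 2 * (rounds - 1) if k == 1 else 2 * (rounds - k) + 1
        via_lb += q ** (k - 1) * (1.0 - q) * q ** lb_wins_needed
    return via_wb, via_lb


if __name__ == "__main__":
    for size in (8, 16, 32):
        wb, lb = reach_final_probabilities(0.6, size)
        print(f"{size:>2} teams: via WB {wb:.3f}, via LB {lb:.3f}")
```

A quick sanity check on the layout: with a 50% per-game chance every team is identical, so each route should give exactly 1/n_teams, and this sketch does.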
This also has the result that unless you are very dominant as team the format for the final doesn't matter much. Only for small tournaments (where double elimination is somewhat silly anyway) or if you are very dominant is it really a disadvantage to have the final start 1-0 up. The reason simply is that even the best team often enters the final by loser's bracket. The second graph shows the chance to win the whole tournament for a 8 team double elimination if your team has P chance (shown on X-axis) to win a single game. The red line is with a 0-0 starting final, the blue one with a 1-0 for WB team final.
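How much the format can matter at all depends on what the head start is worth inside the final itself. Here is a minimal Python sketch of that piece only, assuming the grand final is a bo5 (the thread compares bo5 at 0-0 and 1-0) and a fixed per-game win chance. The function name is made up for illustration.

```python
from math import comb


def race_win_probability(p_game: float, wins_needed: int, opp_wins_needed: int) -> float:
    """Chance to collect wins_needed game wins before the opponent collects opp_wins_needed."""
    return sum(
        comb(wins_needed - 1 + losses, losses)
        * p_game ** wins_needed
        * (1.0 - p_game) ** losses
        for losses in range(opp_wins_needed)
    )


if __name__ == "__main__":
    for p in (0.5, 0.55, 0.6, 0.7):
        even = race_win_probability(p, 3, 3)    # bo5 starting at 0-0
        ahead = race_win_probability(p, 2, 3)   # starting 1-0 up (WB side)
        behind = race_win_probability(p, 3, 2)  # starting 0-1 down (LB side)
        print(f"p={p:.2f}  0-0: {even:.3f}  1-0 up: {ahead:.3f}  0-1 down: {behind:.3f}")
```

To get a full tournament win chance you would weight the "1-0 up" and "0-1 down" values by how often the team reaches the final through each bracket, which is why a team that often arrives via the loser's bracket gains little from the 1-0 rule.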
Basically for determining the 'fair' winner it hardly matters, only a little if the team is very dominant to begin with.
As for discussion about the format, I think double elimination with 1-0 up in the final is fine for most tournaments. It's important for the tournament to be interesting and be a bit fair. Too much luck, as in single elimination, can cause random or weak teams to reach too far too often, but a fairer system like round robin can last too long. Double elimination gives good chances for the best teams to come out on top, while still having the thrill of elimination. Round robin has useless matches, match throwing and all other sorts of problems; double elimination has exciting matches while still having good chances for a great final. The 1-0 up in the final is a decent method to give the WB team a bit of an advantage without it being too big. Double bo3 gives a slightly bigger advantage but feels a bit sillier to me. For determining the fairest winner it doesn't matter much how the final is done, and it's not even in a tournament's interest per se to do that. They want viewership and excitement; having the favourite roll over people stinks. You could argue tennis is doing much poorer and soccer is so popular because they are respectively too predictable and excitingly unpredictable.
You can always tell who has never taken a stats course past the high school level by who talks the loudest with the most strongly held opinions. It's a notoriously difficult field with conclusions drawn from studies usually being quite nuanced and qualified, but conclusions that are significant nonetheless.
Otherwise you get drawn into arguments with people who don't think N=25 is a significant sample size because it "feels low".
On March 16 2015 16:47 ZenithM wrote: This OP seems to forget that if you start at 0-0 in the finals, your tournament is "double elimination" for all but the best team, which is unfair. Way worse to have an unfair format rather than hurt the viewer's feelings a little. I understand that you may want to sacrifice fairness for quality of show, especially when it doesn't disadvantage anyone but the best teams, and who gives a fuck about fairness for the best, they're the best anyway...
something you and the OP did was not differentiate between "best team" (going in) and "winner's bracket finalist". Granted, that "should" be the best team, but for the sake of statistics the "best team" doesn't always make the winner's side...
I'm still curious to see how often the highest elo team ends up winning if the tournament is single elimination or true double elimination (compared to the bo5 at 0-0 and 1-0).
If the model only looked at who was the best team going in (statistically the highest WR) and who actually wins, the team that ends up in WF's is irrelevant. It gets into conditional probability that has nothing to do with the original hypothesis. It would be like flipping a coin 100 times to see if it's a fair coin and looking at if heads was ever flipped 10 times in a row as that would be evidence against the hypothesis even though the only relevant metric is the end number of heads and tails.
On March 17 2015 04:33 hariooo wrote: You can always tell who has never taken a stats course past the high school level by who talks the loudest with the most strongly held opinions. It's a notoriously difficult field with conclusions drawn from studies usually being quite nuanced and qualified, but conclusions that are significant nonetheless.
Otherwise you get drawn into arguments with people who don't think N=25 is a significant sample size because it "feels low".
On March 16 2015 16:47 ZenithM wrote: This OP seems to forget that if you start at 0-0 in the finals, your tournament is "double elimination" for all but the best team, which is unfair. Way worse to have an unfair format rather than hurt the viewer's feelings a little. I understand that you may want to sacrifice fairness for quality of show, especially when it doesn't disadvantage anyone but the best teams, and who gives a fuck about fairness for the best, they're the best anyway...
something you and the OP did was not differentiate between "best team" (going in) and "winner's bracket finalist". Granted, that "should" be the best team, but for the sake of statistics the "best team" doesn't always make the winner's side...
I'm still curious to see how often the highest elo team ends up winning if the tournament is single elimination or true double elimination (compared to the bo5 at 0-0 and 1-0).
If the model only looked at who was the best team going in (statistically the highest WR) and who actually wins, the team that ends up in WF's is irrelevant. It gets into conditional probability that has nothing to do with the original hypothesis. It would be like flipping a coin 100 times to see if it's a fair coin and looking at if heads was ever flipped 10 times in a row as that would be evidence against the hypothesis even though the only relevant metric is the end number of heads and tails.
Correct, the team that was the WF is irrelevant in the OP's calculations. That's my point. The entire issue with double elimination revolves around how the WF faces elimination in the finals, not "does it give the best team the highest chance to win" (although I am curious about THAT statistic in various formats).
People could easily misunderstand the OP and think that "best team" is synonymous with WF and incorrectly conclude that 0-0 vs 1-0 starts are statistically fair.
Btw, if you're interested in the topic itself, try googling for interviews of Barry Hearn and the PTC Snooker series. He changed tons of professional billiards tournaments to shorter distances (iirc Bo9-Bo17 to Bo7 only). He tries to explain why that is - without any math - and just summarizes it as: "it's the only way to get all games done in a short time frame".
Hearn (who's ruining snooker, by the way) has an issue that doesn't exist in esports: they're limited by logistics. They only have so many tables available and so much time. With esports you can play as many games concurrently as you like, at least until you get to an offline stage. Of course, if you do an online stage correctly, you can allow the offline games to be of a decent length.
edit:
On March 15 2015 09:13 Cheren wrote: Every defense of double elimination I've read is tautological. "Double elimination works because teams that lose twice are eliminated." "Double elimination works because everyone gets a second chance."
It's a system with horrible flaws that isn't used in real sports and needs to get out of esports. It is to tournament formats what Instant Runoff is to voting.
again, it is rarely used in normal sports because of logistical reasons as detailed above
Note that double elimination is much more reasonable for games where fortunate/unfortunate matchups in terms of characters/races in the bracket are a bigger source of variation, especially fighters or games like SC. DOTA can quite reasonably start doing single elimination brackets because you're given tools to alleviate that through the draft system.