Statistical Analysis of Extended Series

nzb
Profile Joined September 2010
United States41 Posts
Last Edited: 2010-11-12 20:41:46
November 12 2010 00:45 GMT
#1
ABSTRACT

On the most recent State of the Game podcast, there was discussion of
MLG's extended series rule in their double elimination
tournament. This post explores the effects of the extended series rule
on tournament outcomes, using a simplified model of players and
tournaments. Several tournament formats are explored: round robin,
single elimination, double elimination, and double elimination with
extended series. Performance is measured by averaging over many
simulations, using several distance metrics from the 'ideal ranking'
of players. Results show a small but measurable improvement in
performance when using the extended series rule; with 64 players in a
best-of-three format, the 'best player' wins 1% more often (25%
compared to 24%) using the extended series rule than with simple
double elimination. However, the improvement from the extended series
rule is marginal compared to the overall tournament format; in
single-elimination, the best player wins 19% of the time, and in
round-robin the best player wins 47% of the time.

1. INTRODUCTION

Skip this section if you are familiar with the debate about MLG's extended series rule.

MLG is the largest Starcraft II tournament in North America, and
consequently its tournament format has a large impact on the
competitive scene. MLG employs a fairly standard double-elimination
tournament format, with each round determined by a best-of-three
series. However, MLG has an additional wrinkle called 'extended
series', which many people find counter-intuitive. To explain these
complexities, let's start with an overview of different tournament
formats.

A single elimination tournament is the simplest format that most
people are familiar with. Play proceeds in rounds, with all players
starting in the same round. Players are then paired in each round and
play a series. The winner proceeds to the next round, and the loser is
eliminated from the tournament. This format has the advantage of
determining a champion in very few rounds (O(log(# of players))), but
the disadvantage that bad luck can knock out good players at an early
stage.

To help with this problem, double elimination tournaments ensure that
any player must lose twice in order to be knocked out of the
tournament. This is done by having two brackets: 'winners' and
'losers'. All players begin in the winners' bracket, and after losing
once are sent to the losers'. Players in the losers' bracket play each
other, as well as all players who join the losers' bracket from the
winners' bracket. Therefore, players in the losers' bracket play
twice as many series as those in the winners'.

MLG has extended the double elimination format with an 'extended
series' rule that is invoked when players meet twice in a single
tournament. If players meet in the winners' bracket, and later again
in the losers' bracket, then instead of playing a new best-of-three
series, their series from the winners' bracket is resumed as a
best-of-seven series. Example: If Alice beats Bob 2-1 in the winners'
bracket, and they meet again in the losers' bracket, then they will
play a best-of-seven series to determine the winner with a starting
score 2-1 in favor of Alice. Alice has to win two games to proceed,
and Bob has to win three.
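
To make the head start concrete, here is a small Go sketch (not taken
from the simulation's source) that computes the chance of winning a
series from a given score. It assumes each game is won independently
with a fixed probability p, which is a simplification, and the function
names seriesWinProb and winsNeeded are made up for illustration.

```go
package main

import "fmt"

// seriesWinProb returns the probability that player A wins a series in which
// A still needs needA game wins and B still needs needB. It assumes A wins
// each game independently with probability p (an illustrative assumption).
func seriesWinProb(p float64, needA, needB int) float64 {
	if needA == 0 {
		return 1
	}
	if needB == 0 {
		return 0
	}
	return p*seriesWinProb(p, needA-1, needB) + (1-p)*seriesWinProb(p, needA, needB-1)
}

// winsNeeded returns how many more wins each player needs when a best-of-n
// series resumes from the score a-b.
func winsNeeded(bestOf, a, b int) (needA, needB int) {
	target := bestOf/2 + 1
	return target - a, target - b
}

func main() {
	// Alice leads 2-1 from the winners' bracket; the series resumes as a best-of-seven.
	needAlice, needBob := winsNeeded(7, 2, 1)
	fmt.Println("Alice needs", needAlice, "wins; Bob needs", needBob)
	for _, p := range []float64{0.4, 0.5, 0.6} {
		fmt.Printf("p=%.1f  extended: %.4f  fresh best-of-three: %.4f\n",
			p, seriesWinProb(p, needAlice, needBob), seriesWinProb(p, 2, 2))
	}
}
```

Under this assumption, with evenly matched players (p = 0.5) Alice's
2-1 lead gives her an 11/16 (about 69%) chance of advancing, compared
to 50% in a fresh best-of-three.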

This rule is intended to avoid some paradoxical outcomes, as well as
statistically increase the likelihood that the 'better player'
continues in the tournament. It is possible in standard double
elimination for Alice to defeat Bob 2-0 in the winners', and Bob to
defeat Alice in the losers' 2-1. The "overall series" between Alice
and Bob is 3-2 in Alice's favor, but Bob continues and Alice does
not.

Similarly, another argument is that double elimination exists in order
to give better players a 'second chance' to continue in the tournament
when defeated by inferior players, but this logic does not apply when
the same players meet again. In this case, it makes more sense (so the
argument goes) to extend the series to determine the 'better player'.

Despite these arguments, the extended series has generated
controversy because in many instances the tournament setting is very
different when the series resumes, and many people find it
unentertaining and counter-intuitive.

In particular, the extended series between Liquid`Tyler and PainUser
at MLG Dallas demonstrates some of the problems. In their series in
the winners' bracket, Liquid`Tyler fell victim to a mistake of the
tournament organizers, and was forced to restart a game in which he had a
clear advantage. Liquid`Tyler subsequently lost the series 2-0, which
some have argued was due to the psychological effect of the game
restart. When they later met in the losers' bracket, Liquid`Tyler was
at a significant disadvantage, and lost the extended series 2-4, but
would have won a best-of-three.

This post is organized into several sections. Section 2 describes how
these results were gathered, and the various models used. Section 3
describes the experimental setup. Section 4 presents the
results. Section 5 concludes, and Section 6 shows where to follow up
on this if you are interested.

1.1 SCOPE

This post is an in-depth analysis of the statistical performance of
different tournament formats. It is not concerned with many other
important questions, for example:

* What is the purpose of tournaments, beyond determining skill of
players?

* Is the extended series rule entertaining?

* Is the extended series rule morally justified?

* Players aren't strictly 'better' or 'worse' than each other -- or,
at least, this relationship isn't transitive between players.

* The tournament setting can change when an extended series resumes.

These questions have been and will be addressed elsewhere.

2. DESCRIPTION

This post explores the accuracy of several tournament formats,
focusing on the impact of the extended series rule. This is done using
simulation, running through many thousands of tournaments and
comparing the average results. This section describes the player
model, tournament model, and accuracy metrics used in the results.

2.1 PLAYER MODEL

Players are modeled using a simple randomized model. The goal is to
have players of greater or lesser skill, but have each player vary
somewhat in their performance. Players therefore consist of two
numbers: mean performance and deviation. Performance for a single
player is randomly generated each game, and lies in the range
[mean - dev, mean + dev].

The mean performance lies between 0 and 2, and the deviation is always
1. This ensures that the worst player can always beat the best player,
however at the extremes this is unlikely.

A player's performance is calculated as follows:

performance = mean + dev * rand^2 * plusminus

Where rand is a uniformly-distributed number in [0,1] and plusminus is
selected from {-1,1} with even probability. This formula concentrates
the probability mass around the mean, making the better players win
more often.
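
As a rough illustration of this player model, here is a minimal Go
sketch. It is not the code from the repository; the names Player,
Performance, PlayGame and NewRNG are hypothetical, and deciding a game
by comparing the two drawn performances is my assumption about how a
game is resolved.

```go
package main

import (
	"fmt"
	"math/rand"
)

// Player follows the post's model: a mean performance and a deviation.
type Player struct {
	Mean, Dev float64
}

// NewRNG returns a seeded random source so runs are repeatable.
func NewRNG(seed int64) *rand.Rand {
	return rand.New(rand.NewSource(seed))
}

// Performance draws one game's performance: mean + dev * rand^2 * plusminus.
func (p Player) Performance(rng *rand.Rand) float64 {
	r := rng.Float64() // uniform in [0,1)
	sign := 1.0
	if rng.Intn(2) == 0 {
		sign = -1.0
	}
	return p.Mean + p.Dev*r*r*sign
}

// PlayGame reports whether a out-performs b in a single game.
// Treating the higher performance as the winner is an assumption.
func PlayGame(a, b Player, rng *rand.Rand) bool {
	return a.Performance(rng) > b.Performance(rng)
}

func main() {
	rng := NewRNG(1)
	strong := Player{Mean: 1.5, Dev: 1}
	weak := Player{Mean: 0.5, Dev: 1}
	const games = 100000
	wins := 0
	for i := 0; i < games; i++ {
		if PlayGame(strong, weak, rng) {
			wins++
		}
	}
	fmt.Printf("stronger player won %.1f%% of %d games\n", 100*float64(wins)/games, games)
}
```

Squaring rand pulls most draws toward the mean, so a gap of 1.0 in mean
performance should already make upsets fairly uncommon.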

To generate a set of players for a tournament, each player's mean is
selected uniformly from [0,2]. This is probably inaccurate -- players'
mean performance is likely distributed on a normal curve. The player
model is probably the biggest weakness in this study, but I still
believe the first-order effects are well captured in the analysis.

2.2 TOURNAMENT MODEL

The rules for each tournament are faithfully replicated in the
simulation, however there are some modelling choices here as well. The
most significant is the seeding of players in each tournament. I have
chosen to use the "ideal seeding", as determined by players' mean
performance, as the initial seeding for players. This removes a source
of inaccuracy from elimination tournaments, and so the results should
be taken as an upper bound for their performance.

Four tournament types are considered: single elimination, double
elimination, double elimination with extended series, and round
robin. The focus of this post is on the effect of extended series, but
single elimination and round robin are included in order to give some
context for these results.

A round robin tournament is one where every player plays every
other. Players are then ranked according to their number of wins. This
tournament produces a complete ranking, first through last, and
because everyone plays everyone, it is very accurate. The downside is
that it requires a lot of games (O((# of players)^2)) and is less exciting
than other tournament formats. However, because it is so accurate, it
can be used to calibrate the accuracy of elimination tournaments by
showing a "speed of light" for tournament efficacy.
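
For a sense of scale, here is a small Go sketch of how many series each
format needs. The function names are made up for illustration. The
formulas are the standard counts: n-1 series for single elimination, at
most 2n-1 for double elimination (when the final has to be replayed),
and n(n-1)/2 for round robin.

```go
package main

import "fmt"

// singleElimRounds returns the number of rounds for n players,
// assuming n is a power of two.
func singleElimRounds(n int) int {
	rounds := 0
	for size := n; size > 1; size /= 2 {
		rounds++
	}
	return rounds
}

// singleElimSeries: every series eliminates exactly one player.
func singleElimSeries(n int) int { return n - 1 }

// doubleElimMaxSeries: everyone but the champion loses twice, and the
// champion may lose once.
func doubleElimMaxSeries(n int) int { return 2*n - 1 }

// roundRobinSeries: every pair of players meets once.
func roundRobinSeries(n int) int { return n * (n - 1) / 2 }

func main() {
	for _, n := range []int{64, 128, 512} {
		fmt.Printf("%3d players: single %d series in %d rounds, double up to %d, round robin %d\n",
			n, singleElimSeries(n), singleElimRounds(n), doubleElimMaxSeries(n), roundRobinSeries(n))
	}
}
```

The quadratic growth of round robin is why it works as a reference
point for accuracy but not as a practical format for large fields.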

Similarly, single elimination tournaments show the other end of the
spectrum. They are very fickle in their results, and show relatively
how much of an improvement the extended series rule makes over
standard double elimination.

2.3 MEASURING ACCURACY

One of the principal challenges is determining how to measure
performance of a tournament -- how can we say that one tournament is
"better" than another? The approach taken is to have each tournament
produce a ranking of players, first through last, and compare this
ranking to the ideal ranking, as determined by players' mean
performance.

This produces its own challenges, as elimination tournaments do not
strictly produce a ranking. However, taking seeding into account, an
elimination tournament does sort players into categories based on how
far they made it through the tournament. The ranking of players is
determined as players are eliminated from the tournament -- first
eliminated places last, and so on.

Three metrics are used to measure performance: winner, depth, and
2^depth.

* The 'winner' metric determines performance based on a very simple,
intuitive rule: Did the best player win? This metric is simple,
but unfortunately not very useful, because for even
moderately-sized tournaments, the best player rarely wins.

* The 'depth' metric determines performance based on how deep each
player made it in the tournament. Specifically, the player ranking
is divided into groups according to a single-elimination bracket
(first, second, top four, top eight, top sixteen, etc.). Then
each player's expected placement is calculated based on which
group they fall into within the ideal ranking -- the
fifteenth-best player should place into the top sixteen. These
results are compared against the actual placement from simulation,
and the difference from depth for all players is added to produce
the final "distance from ideal".

* The '2^depth' metric is similar to the depth metric, but before
adding up the depth differences, we first calculate 2^(delta)-1
for each player's difference delta. This is done because, intuitively,
it is more significant if the first player is eliminated in the round
of 64 than if the 33rd, 34th, 35th, and 36th players make it to the
round of 32, but the 'depth' metric calculates these as being
equally bad. Essentially, this metric says that big differences in
depth are more important than many small differences.
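
The three metrics can be sketched in Go as follows. This is my reading
of the descriptions above rather than the repository's code: the names
depthOf and metrics are hypothetical, and taking the absolute depth
difference per finishing place is an assumption.

```go
package main

import "fmt"

// depthOf maps a 1-based rank to its single-elimination group:
// 1st -> 0, 2nd -> 1, top four -> 2, top eight -> 3, and so on.
func depthOf(rank int) int {
	d := 0
	for size := 1; size < rank; size *= 2 {
		d++
	}
	return d
}

// metrics compares one tournament's finishing order against the ideal one.
// actual[i] is the 1-based ideal rank of the player who finished in place i+1.
// winnerMiss is true when the best player did not win.
func metrics(actual []int) (winnerMiss bool, depth int, pow2Depth int) {
	for place, idealRank := range actual {
		delta := depthOf(place+1) - depthOf(idealRank)
		if delta < 0 {
			delta = -delta
		}
		depth += delta
		pow2Depth += (1 << uint(delta)) - 1
	}
	winnerMiss = len(actual) > 0 && actual[0] != 1
	return
}

func main() {
	// The best player and the fifth-best player swap places in an 8-player field.
	miss, depth, pow2 := metrics([]int{5, 2, 3, 4, 1, 6, 7, 8})
	fmt.Println("best player missed:", miss, "depth:", depth, "2^depth:", pow2)

	// First/second and third/fourth swap: small differences only.
	miss, depth, pow2 = metrics([]int{2, 1, 4, 3, 5, 6, 7, 8})
	fmt.Println("best player missed:", miss, "depth:", depth, "2^depth:", pow2)
}
```

In the first example the two displaced players are each three groups
away from where they belong, so 2^depth penalizes them much more
heavily than it would six separate one-group differences.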

3. METHODOLOGY

Results are gathered by running simulations of one million tournaments
and averaging the results for each tournament format. It is generally found
that the trends in each metric are reflected in the others, except for
the 'winner' metric, which is very sensitive to random factors and
sometimes fluctuates independently.
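
To show the shape of such a simulation, here is a minimal Go sketch for
the simplest case: single elimination scored by how often the best
player wins. It is not the repository's code. All names are
hypothetical, the field size is assumed to be a power of two, and
pairing the top remaining seed against the bottom remaining seed every
round is a simplification of a fixed seeded bracket.

```go
package main

import (
	"fmt"
	"math/rand"
	"sort"
)

type Player struct{ Mean, Dev float64 }

func newRNG(seed int64) *rand.Rand { return rand.New(rand.NewSource(seed)) }

// perf draws one game's performance: mean +/- dev * rand^2.
func (p Player) perf(rng *rand.Rand) float64 {
	r := rng.Float64()
	if rng.Intn(2) == 0 {
		return p.Mean - p.Dev*r*r
	}
	return p.Mean + p.Dev*r*r
}

// winsSeries plays a best-of-n series and reports whether a beats b.
func winsSeries(a, b Player, bestOf int, rng *rand.Rand) bool {
	need, winsA, winsB := bestOf/2+1, 0, 0
	for winsA < need && winsB < need {
		if a.perf(rng) > b.perf(rng) {
			winsA++
		} else {
			winsB++
		}
	}
	return winsA == need
}

// singleElimWinner runs one bracket over players sorted best-first and
// returns the index of the champion. len(players) must be a power of two.
func singleElimWinner(players []Player, bestOf int, rng *rand.Rand) int {
	alive := make([]int, len(players))
	for i := range alive {
		alive[i] = i
	}
	for len(alive) > 1 {
		next := make([]int, 0, len(alive)/2)
		for i, j := 0, len(alive)-1; i < j; i, j = i+1, j-1 {
			if winsSeries(players[alive[i]], players[alive[j]], bestOf, rng) {
				next = append(next, alive[i])
			} else {
				next = append(next, alive[j])
			}
		}
		sort.Ints(next) // re-seed the survivors, best first
		alive = next
	}
	return alive[0]
}

// bestPlayerWinRate averages over many tournaments, drawing fresh players
// (means uniform in [0,2], deviation 1) for each one.
func bestPlayerWinRate(n, bestOf, trials int, rng *rand.Rand) float64 {
	wins := 0
	for t := 0; t < trials; t++ {
		players := make([]Player, n)
		for i := range players {
			players[i] = Player{Mean: 2 * rng.Float64(), Dev: 1}
		}
		sort.Slice(players, func(i, j int) bool { return players[i].Mean > players[j].Mean })
		if singleElimWinner(players, bestOf, rng) == 0 {
			wins++
		}
	}
	return float64(wins) / float64(trials)
}

func main() {
	rng := newRNG(1)
	fmt.Printf("best player won %.3f of simulated 64-player best-of-three tournaments\n",
		bestPlayerWinRate(64, 3, 20000, rng))
}
```

Because of the simplified pairing and the smaller number of trials, the
output of this sketch should not be expected to match the figures in
this post exactly.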

4. RESULTS

4.1 OVERVIEW

Because this discussion was inspired by MLG Dallas, the first result
to consider is the overall performance of each tournament format in a
128-player, best-of-three tournament:

Format         | Winner | Depth | 2^Depth
---------------+--------+-------+--------
Single         |   0.91 | 52.09 |  110.07
Double         |   0.88 | 48.31 |   89.83
DoubleExtended |   0.88 | 46.01 |   87.42
RoundRobin     |   0.72 | 22.29 |   28.85

Note that these are distance metrics, so lower is always better. For
the 'winner' metric, this number indicates the fraction of the time
that the best player did not win. So, 1 - 'winner' is the chance of
the best player winning the entire tournament.

A slight improvement can be seen from using the extended series in the
depth metrics, however it is marginal compared to the large difference
between single elimination and round robin tournaments. These results
also indicate that double elimination does perform significantly
better than single elimination, however neither comes close to the
performance of a round robin tournament.

4.2 VARYING NUMBER OF GAMES

We can also explore the effect on tournament outcomes when the number
of games in each series is varied. (In this case, the extended series
is also varied.) These results are graphed below.

[Three graphs: images not available]

These results all show pretty much what one would expect -- using more
games in each series improves the accuracy of the tournament
format. However, the graphs also show that the elimination
tournaments all perform similarly, and none approach the accuracy of a
round robin tournament. The ordering of performance is very
consistent, however: round robin is best, followed by double
elimination with extended series, double elimination, and single
elimination.

The depth metric doesn't show much separation between the different
elimination tournament formats, but the winner and 2^depth metrics
both show significant separation between single and double elimination
formats. This indicates that the single elimination format produces
more big differences in outcome than the double elimination tournament.
That is, more often the best player does not win, and more often good
players don't make it as far as they should. In this respect, the
extended series seems to make very little difference.

4.3 VARYING NUMBER OF PLAYERS

In this section, we compare the effect on accuracy when changing the
number of players in the tournament. I have to break methodology here
a bit, because I don't have the time to wait for a million simulations
of a 512-player round robin tournament to finish. So instead, I
ran fifty thousand simulations. Consequently, there is a little
more noise in these results.

[Three graphs: images not available]

These graphs don't show anything particularly revealing compared with
the last section, but they do confirm that the trends hold over a
variety of tournament sizes. Single elimination does worse than double
elimination formats, and round robin is much better than the
elimination formats. This is particularly true with large numbers of
players -- but in this range, it is an unfair comparison, because
round robin plays many more games. Most relevant to this post,
extended series seems to have minimal effect on results for large
numbers of players, particularly when considering 2^depth.

4.4 EFFECT OF EXTENDED SERIES

We now consider the effect of the extended series in
isolation. Specifically, how often is the extended series used, and
how often does it "correct an injustice" from the winners' bracket?

In this case, we consider a 64-player tournament in double elimination
with extended series format. In a standard double-elimination format,
127 matches will be played.

Simulation shows that, on average, 18.8 extended series will be played
in a 64-player tournament. This means that 15% of matches, on average,
will be rematches of players.

Similarly, of these 18.8 matches, 3.03 of them will result in
"corrections". A correction is when the better player loses in the
winners' bracket and wins the extended series to continue in the
tournament. In 2.17 of the matches, the worse player won in the
winners' bracket and won the extended series, meaning the extended
series failed to "correct" the result from the winners'
bracket.

The worst possible outcome is when the better player wins in the
winners' bracket and loses the extended series. The extended series
does well here, only introducing 0.55 such results per tournament, or
4% of the extended series in which the better player won the first
meeting.

Considering the disadvantage that the better player has when entering
the extended series, it does surprisingly well at correcting these
results, succeeding 58% of the time. At the same time, it only
introduces bad results 4% of the time.
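
For anyone who wants to check the percentages, here is a short Go
sketch of the arithmetic using the averages quoted in this section. The
function names are made up. Taking the injustice rate over only those
extended series where the better player won the first meeting is my
inference from the numbers (it is the denominator that reproduces 4%).

```go
package main

import "fmt"

// rematchShare is the fraction of all series that are extended series.
func rematchShare(extended, totalSeries float64) float64 {
	return extended / totalSeries
}

// correctionRate is the share of first-meeting upsets that the extended
// series reverses.
func correctionRate(corrected, notCorrected float64) float64 {
	return corrected / (corrected + notCorrected)
}

// injusticeRate is the share of extended series, among those where the better
// player won the first meeting, that the better player then loses.
// The choice of denominator is an assumption.
func injusticeRate(injustices, extended, corrected, notCorrected float64) float64 {
	betterWonFirst := extended - corrected - notCorrected
	return injustices / betterWonFirst
}

func main() {
	const (
		totalSeries  = 127.0
		extended     = 18.8
		corrected    = 3.03
		notCorrected = 2.17
		injustices   = 0.55
	)
	fmt.Printf("rematches:   %.1f%%\n", 100*rematchShare(extended, totalSeries))
	fmt.Printf("corrections: %.1f%%\n", 100*correctionRate(corrected, notCorrected))
	fmt.Printf("injustices:  %.1f%%\n", 100*injusticeRate(injustices, extended, corrected, notCorrected))
}
```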

I am tempted to conclude that extended series is successful at letting
the better player continue in the tournament, however data is missing
to compare against a standard double elimination tournament. A good
area of extension for this study would be measuring the outcome if a
regular best-of-three were done, and comparing its
correction/injustice rate to the extended series. The ratio from the
extended series (58%/4%) seems pretty hard to beat -- I would expect a
best-of-three to allow the better player to proceed more often, but have
a much higher injustice rate.

5. CONCLUSION

When considering individual matches, the extended series appears to
perform well to make sure the better player continues in the
tournament. In this sense, it fulfills its purpose.

But when looking at the larger picture, it appears that the extended
series has little effect on the outcome. While the extended series
rule does slightly improve outcomes, these differences are not
particularly significant compared to the overall double elimination
format.

What is clear from these results is that both elimination formats
leave much to be desired when compared to a round robin
tournament. Although round-robin is impractical due to its large number
of games, other tournament formats such as swiss-style or those with
rounds play deserve further consideration.

Another future area of work is considering the performance of a
points-based system of several double elimination tournaments, like
MLG employs for its full Starcraft II season.

6. SEE ALSO

Wikipedia on tournament formats:
http://en.wikipedia.org/wiki/Single-elimination_tournament
http://en.wikipedia.org/wiki/Swiss_style_tournament

6.1 SOURCE CODE

The source code is available via git at:

git://github.com/nathanbeckmann/Tournament.git

It is written in Go. Have fun!

EDIT 1: Corrected problem with injustice rate. It is 4%, not 3%.

EDIT 2: Fix example in intro (corrected by Cyber_Cheese).
Durn
Profile Blog Joined July 2010
Canada360 Posts
November 12 2010 01:00 GMT
#2
I think IdrA summed it up quite well in the State of the Game. Statistics aside, it goes like this hypothetical they used:

IdrA makes a stupid mistake and gets knocked out by NoNy in an early round. 3 rounds later, NoNy makes a silly mistake that idrA wouldn't have made. They meet in the losers bracket, they've both made silly mistakes that the other one wouldn't have made. Why should IdrA be penalized?
"Even if I lose 100 games, that's 100 different arrows pointing me in the wrong direction." - Sean Day[9] Plott
nzb
Profile Joined September 2010
United States41 Posts
November 12 2010 01:03 GMT
#3
On November 12 2010 10:00 Durn wrote:
I think IdrA summed it up quite well in the State of the Game. Statistics aside, it goes like this hypothetical they used:

IdrA makes a stupid mistake and gets knocked out by NoNy in an early round. 3 rounds later, NoNy makes a silly mistake that idrA wouldn't have made. They meet in the losers bracket, they've both made silly mistakes that the other one wouldn't have made. Why should IdrA be penalized?


I agree. Even more interesting, let's say (hypothetically) that..

IdrA > Tyler
Tyler > SeleCT
SeleCT > IdrA

There is no "best player" in this group, and now their seeding basically determines who faces who first, and therefore which of them has an advantage in the extended series.

I'd call this one of those things that falls outside the scope of my post.
randplaty
Profile Joined September 2010
205 Posts
November 12 2010 01:03 GMT
#4
awesome awesome study. Thanks for the hard work. Good to know that extended series does have some value... although minimal.
Shakes
Profile Joined April 2010
Australia557 Posts
November 12 2010 01:07 GMT
#5
On November 12 2010 10:00 Durn wrote:
I think IdrA summed it up quite well in the State of the Game. Statistics aside, it goes like this hypothetical they used:

IdrA makes a stupid mistake and gets knocked out by NoNy in an early round. 3 rounds later, NoNy makes a silly mistake that idrA wouldn't have made. They meet in the losers bracket, they've both made silly mistakes that the other one wouldn't have made. Why should IdrA be penalized?


IdrA's argument is one that has been explicitly excluded from the scope of this analysis (that the "better" player might not be transitive).
Durn
Profile Blog Joined July 2010
Canada360 Posts
November 12 2010 01:08 GMT
#6
I just took a closer look at all your work, and that's actually really awesome. The statistics do make sense when put out in such an organized manner.

I appreciate your hard work, I hope this will get some eyes from MLG haters. I still disagree with it at the core of its concept, but in terms of your statistics, the math points in the right direction.
"Even if I lose 100 games, that's 100 different arrows pointing me in the wrong direction." - Sean Day[9] Plott
vohne
Profile Joined September 2010
Philippines197 Posts
November 12 2010 01:13 GMT
#7
In a higher level arena the better player isn't always transitive. That is because there are too many variables that must be taken into consideration such as race matchups, maps, player conditioning, etc.
Dragar
Profile Joined October 2010
United Kingdom971 Posts
November 12 2010 01:19 GMT
#8
Is it possible to rephrase the question to not assume that the better player is transitive? So that the goal is not to determine the 'best' player, but rather to minimise the effect of matchup ordering, etc?
Nayl
Profile Joined March 2010
Canada413 Posts
November 12 2010 01:22 GMT
#9
On November 12 2010 10:00 Durn wrote:
I think IdrA summed it up quite well in the State of the Game. Statistics aside, it goes like this hypothetical they used:

IdrA makes a stupid mistake and gets knocked out by NoNy in an early round. 3 rounds later, NoNy makes a silly mistake that idrA wouldn't have made. They meet in the losers bracket, they've both made silly mistakes that the other one wouldn't have made. Why should IdrA be penalized?


IdrA's argument is irrelevant to the actual statistics or logic in the argument.

Extended series exist to make the contest between two players fairer; how those players fare against a third player has no effect.

Also, in his argument, how does he know he wouldn't have made a stupid mistake if he were to advance over nony?
paralleluniverse
Profile Joined July 2010
4065 Posts
November 12 2010 01:23 GMT
#10
On November 12 2010 10:07 Shakes wrote:
On November 12 2010 10:00 Durn wrote:
I think IdrA summed it up quite well in the State of the Game. Statistics aside, it goes like this hypothetical they used:

IdrA makes a stupid mistake and gets knocked out by NoNy in an early round. 3 rounds later, NoNy makes a silly mistake that idrA wouldn't have made. They meet in the losers bracket, they've both made silly mistakes that the other one wouldn't have made. Why should IdrA be penalized?


IdrA's argument is one that has been explicitly excluded from the scope of this analysis (that the "better" player might not be transitive).

Not really.

The nontransitivity is taken into account since performance was measured using a mean +/- a random number. And that allows for the possibility that player A will beat player B, player B beats player C, and player C beats player A.
nzb
Profile Joined September 2010
United States41 Posts
November 12 2010 01:26 GMT
#11
On November 12 2010 10:19 Dragar wrote:
Is it possible to rephrase the question to not assume that the better player is transitive? So that the goal is not to determine the 'best' player, but rather to minimise the effect of matchup ordering, etc?


This is definitely possible, you would need some kind of relation for each player to every other. The problem with this is you would end up with a lot of choices in terms of modeling -- because the relationship, while not perfectly transitive, is pretty close. (That is, although the cream of the crop might be extremely intransitive, they are definitely better than most of the other players). Therefore the relation you come up with shouldn't be completely random. This kind of data would probably have to be pulled from actual player statistics, which would actually be a huge improvement to the study overall.

But until that happens, I think keeping it simple is better because you avoid a lot of complexities that don't necessarily improve the results.
rasnj
Profile Joined May 2010
United States1959 Posts
November 12 2010 01:26 GMT
#12
On November 12 2010 10:19 Dragar wrote:
Is it possible to rephrase the question to not assume that the better player is transitive? So that the goal is not to determine the 'best' player, but rather to minimise the effect of matchup ordering, etc?

What exactly would be the goal then? I thought about doing this kind of analysis myself, but decided that I couldn't formulate exactly what I wanted the tournament system to accomplish without imposing a total order on the skill levels of the players, and I considered this too far from reality to bother. If you can clearly express the goal of your tournament and a way to determine how far a given ranking is from that goal, then we can probably do some analysis.
zulu_nation8
Profile Blog Joined May 2005
China26351 Posts
November 12 2010 01:26 GMT
#13
I think your study would only be meaningful if people actually assumed a bo7 series does not determine the best player as well as a bo3 series.
nzb
Profile Joined September 2010
United States41 Posts
November 12 2010 01:27 GMT
#14
On November 12 2010 10:23 paralleluniverse wrote:
On November 12 2010 10:07 Shakes wrote:
On November 12 2010 10:00 Durn wrote:
I think IdrA summed it up quite well in the State of the Game. Statistics aside, it goes like this hypothetical they used:

IdrA makes a stupid mistake and gets knocked out by NoNy in an early round. 3 rounds later, NoNy makes a silly mistake that idrA wouldn't have made. They meet in the losers bracket, they've both made silly mistakes that the other one wouldn't have made. Why should IdrA be penalized?


IdrA's argument is one that has been explicitly excluded from the scope of this analysis (that the "better" player might not be transitive).

Not really.

The nontransitivity is taken into account since performance was measured using a mean +/- a random number. And that allows for the possibility that player A will beat player B, player B beats player C, and player C beats player A.


In this sense, the intransitivity is a random fluctuation, and if you played a long enough series you would expect it to go away.

But in reality, there probably are cases of "true intransitivity", where people's play styles match up in weird ways so that A > B, B > C, and C > A.
nzb
Profile Joined September 2010
United States41 Posts
November 12 2010 01:30 GMT
#15
On November 12 2010 10:26 rasnj wrote:
On November 12 2010 10:19 Dragar wrote:
Is it possible to rephrase the question to not assume that the better player is transitive? So that the goal is not to determine the 'best' player, but rather to minimise the effect of matchup ordering, etc?

What exactly would be the goal then? I thought about doing this kind of analysis myself, but decided that I couldn't formulate exactly what I wanted the tournament system to accomplish without imposing a total order on the skill levels of the players, and I considered this too far from reality to bother. If you can clearly express the goal of your tournament and a way to determine how far a given ranking is from that goal, then we can probably do some analysis.


Although reality isn't exactly transitive, it is pretty close.

That is, you can be pretty confident saying that IdrA > Gretorp > HDstarcraft (random names, don't take offense). So although there are players near each player's skill that confuse the issue slightly, the large-scale picture is still pretty clear because there is actually some order.
nzb
Profile Joined September 2010
United States41 Posts
November 12 2010 01:32 GMT
#16
On November 12 2010 10:26 zulu_nation8 wrote:
I think your study would only be meaningful if people actually assumed a bo7 series does not determine the best player as well as a bo3 series.


I'm not really sure what you are responding to ...

The point of this is to determine exactly how much of an effect extended series has, both for individual matches and for an entire tournament. I'm pretty sure I haven't seen anyone talk about this with real numbers to back up what they are saying.
paralleluniverse
Profile Joined July 2010
4065 Posts
Last Edited: 2010-11-12 01:33:22
November 12 2010 01:32 GMT
#17
On November 12 2010 10:27 nzb wrote:
On November 12 2010 10:23 paralleluniverse wrote:
On November 12 2010 10:07 Shakes wrote:
On November 12 2010 10:00 Durn wrote:
I think IdrA summed it up quite well in the State of the Game. Statistics aside, it goes like this hypothetical they used:

IdrA makes a stupid mistake and gets knocked out by NoNy in an early round. 3 rounds later, NoNy makes a silly mistake that IdrA wouldn't have made. They meet in the losers bracket, they've both made silly mistakes that the other one wouldn't have made. Why should IdrA be penalized?


IdrA's argument is one that has been explicitly excluded from the scope of this analysis (namely, that being the "better" player might not be a transitive relation).

Not really.

The nontransitivity is taken into account, since performance was measured using a mean +/- a random number. That allows for the possibility that player A beats player B, player B beats player C, and player C beats player A.


In this sense, the intransitivity is a random fluctuation, and if you played a long enough series you would expect it to go away.

But in reality, there probably are cases of "true intransitivity", where people's play styles match up in weird ways so that A > B, B > C, and C > A.

But these *are* random fluctuations in real life. If A > B > C, we would expect A to beat B and B to beat C most of the time, with this failing to hold on a few random occasions. I think your model captures this fact well.

Although I wonder why you used such an archaic setup to simulate player performance instead of just simulating from a normal distribution, which can be done in 1 line in any statistical package, and would probably be more correct.
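
For illustration, a short Python sketch of what drawing performances from a normal distribution could look like. The skill means, the shared standard deviation, and the function names are assumptions, not values or code from the original analysis. The sampling itself is the single call to rng.gauss; the closed-form check uses the fact that the difference of two independent normal performances is again normal.

```python
import math
import random
from statistics import NormalDist


def simulated_win_rate(mu_a, mu_b, sigma, games, seed=0):
    """Estimate P(A beats B) by sampling both performances from N(mu, sigma)."""
    rng = random.Random(seed)
    wins = sum(
        rng.gauss(mu_a, sigma) > rng.gauss(mu_b, sigma) for _ in range(games)
    )
    return wins / games


def exact_win_prob(mu_a, mu_b, sigma):
    """Closed form: perf_a - perf_b ~ N(mu_a - mu_b, sigma * sqrt(2))."""
    return NormalDist().cdf((mu_a - mu_b) / (sigma * math.sqrt(2)))


if __name__ == "__main__":
    # Hypothetical numbers: A is half a standard deviation better than B.
    print(simulated_win_rate(1.5, 1.0, 1.0, 20_000))
    print(exact_win_prob(1.5, 1.0, 1.0))
```

The two printed values should agree to within sampling error, which gives a quick sanity check on any hand-rolled performance model.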
Nayl
Profile Joined March 2010
Canada 413 Posts
November 12 2010 01:34 GMT
#18
On November 12 2010 10:30 nzb wrote:
On November 12 2010 10:26 rasnj wrote:
On November 12 2010 10:19 Dragar wrote:
Is it possible to rephrase the question to not assume that the better player is transitive? So that the goal is not to determine the 'best' player, but rather to minimise the effect of matchup ordering, etc?

What exactly would be the goal then? I thought about doing this kind of analysis myself, but decided that I couldn't formulate exactly what I wanted the tournament system to accomplish without imposing a total order on the skill levels of the players, and I considered this too far from reality to bother. If you can clearly express the goal of your tournament and a way to determine how far a given ranking is from that goal, then we can probably do some analysis.


Although reality isn't exactly transitive, it is pretty close.

That is, you can be pretty confident saying that IdrA > Gretorp > HDstarcraft (random names, don't take offense). So although there are players near each player's skill level who confuse the issue slightly, the large-scale picture is still pretty clear because there is actually some order.


Well, non-transitivity can occur, especially when you are comparing a non-teammate with two teammates.

Incontrol might be better than Machine because he knows his teammate well, and Machine might be better than Painuser, yet Painuser is better than Incontrol (random names).

So it's not necessarily clear in reality. =/
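
A rough Python sketch of how such a matchup-specific cycle could be written down and detected. The player labels and all win probabilities are hypothetical, and complete and find_cycles are illustrative helper names. Unlike a single skill number per player, a table of pairwise win probabilities can encode A being favoured over B, B over C, and C over A at the same time.

```python
from itertools import permutations


def complete(pairs):
    """Fill in the reverse direction of each matchup: P(y beats x) = 1 - P(x beats y)."""
    table = {}
    for (x, y), p in pairs.items():
        table[(x, y)] = p
        table[(y, x)] = 1.0 - p
    return table


def find_cycles(win_prob, players):
    """Return triples (a, b, c) where a is favoured over b, b over c, and c over a."""
    def favoured(x, y):
        return win_prob[(x, y)] > 0.5

    cycles = []
    for a, b, c in permutations(players, 3):
        # Report each cycle once, starting from its alphabetically first player.
        if a == min(a, b, c) and favoured(a, b) and favoured(b, c) and favoured(c, a):
            cycles.append((a, b, c))
    return cycles


if __name__ == "__main__":
    # Hypothetical matchup table containing a rock-paper-scissors style cycle.
    cyclic = complete({("A", "B"): 0.6, ("B", "C"): 0.6, ("C", "A"): 0.6})
    print(find_cycles(cyclic, ["A", "B", "C"]))
```

If the list comes back empty for every triple, the "favoured" relation is consistent with a single ranking of the players; any entry in the list is a case where no tournament format can name a single "best" player among those three.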
nzb
Profile Joined September 2010
United States 41 Posts
November 12 2010 01:34 GMT
#19
On November 12 2010 10:32 paralleluniverse wrote:
On November 12 2010 10:27 nzb wrote:
On November 12 2010 10:23 paralleluniverse wrote:
On November 12 2010 10:07 Shakes wrote:
On November 12 2010 10:00 Durn wrote:
I think IdrA summed it up quite well in the State of the Game. Statistics aside, it goes like this hypothetical they used:

IdrA makes a stupid mistake and gets knocked out by NoNy in an early round. 3 rounds later, NoNy makes a silly mistake that IdrA wouldn't have made. They meet in the losers bracket, they've both made silly mistakes that the other one wouldn't have made. Why should IdrA be penalized?


IdrA's argument is one that has been explicitly excluded from the scope of this analysis (namely, that being the "better" player might not be a transitive relation).

Not really.

The nontransitivity is taken into account, since performance was measured using a mean +/- a random number. That allows for the possibility that player A beats player B, player B beats player C, and player C beats player A.


In this sense, the intransitivity is a random fluctuation, and if you played a long enough series you would expect it to go away.

But in reality, there probably are cases of "true intransitivity", where people's play styles match up in weird ways so that A > B, B > C, and C > A.

But these *are* random fluctuations in real life. If A > B > C, we would expect A to beat B and B to beat C most of the time, with this failing to hold on a few random occasions. I think your model captures this fact well.

Although I wonder why you used such an archaic setup to simulate player performance instead of just simulating from a normal distribution, which can be done in 1 line in any statistical package, and would probably be more correct.


Haha, touché. The reason is that I did this in order to have something fun to code in Go, which I've wanted to learn for a while, so doing it in Mathematica or R or something would have defeated my purpose.
MannerMan
Profile Blog Joined July 2008
371 Posts
November 12 2010 01:34 GMT
#20
Here's a blog I wrote on the same subject the other day.
http://www.teamliquid.net/blogs/viewblog.php?id=168168

It is a bit shorter and less in depth, and the scope is only the difference between separate Bo3s vs an extended series Bo7.
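
As a rough illustration of that comparison, here is a Python sketch. It is not taken from the linked blog and rests on two assumptions: every game is independent with a fixed win probability p for player A, and "extended series" means the score of the earlier Bo3 carries over into a Bo7. The function name series_win_prob is hypothetical.

```python
from functools import lru_cache


def series_win_prob(p, wins_needed, a=0, b=0):
    """P(A reaches wins_needed game wins before B), starting from score a-b.

    Assumes independent games, each won by A with probability p.
    """
    @lru_cache(maxsize=None)
    def go(x, y):
        if x == wins_needed:
            return 1.0
        if y == wins_needed:
            return 0.0
        return p * go(x + 1, y) + (1.0 - p) * go(x, y + 1)

    return go(a, b)


if __name__ == "__main__":
    p = 0.6  # hypothetical per-game win probability for A
    # A lost the first Bo3 0-2 and meets the same opponent again.
    print("fresh Bo3:         ", series_win_prob(p, 2))
    print("extended, from 0-2:", series_win_prob(p, 4, 0, 2))
    print("extended, from 1-2:", series_win_prob(p, 4, 1, 2))
```

Comparing the first line with the other two shows how much the carried-over deficit costs the player who lost the first meeting under these assumptions.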
The contents of this webpage are copyright © 2023 TLnet. All Rights Reserved.