Statistical Analysis of Extended Series - Page 7

Jayrod

1820 Posts

November 12 2010 18:06 GMT

#121

The extended series makes the finals of the tournament feel a lot less epic. To be honest, I don't care what's fair to the players involved, because the ONLY reason there is money involved is the viewership. No fans = no esports = no sponsors = no money for tournaments = tournaments would only be based on entrance fees, etc.

If they want to make esports bigger, faster, they will understand that you want your finals to feel epic and last more than like 2 games. It's a really big letdown to watch people battle it out to get to the end of this big tournament and then have one guy playing with a handicap. I guess that's what happens with a double elimination style though... the extended series sucks.

Jerubaal

United States7684 Posts

November 12 2010 18:43 GMT

#122

On November 13 2010 02:32 Risen wrote:

You're an idiot. These statements don't contradict each other at all...

They contradict each other because idrA and INControl said they wanted each series to be treated as an isolated event and then brought in the overall tournament performance of the players as an argument. I'm not saying it kills their argument completely, but it's odd that they switched paradigms.

Another seeming contradiction is that they all agreed that the tournament doesn't do a particularly good job of ranking the players and then decided that it was better to let the system do the ranking. You can gauge relative skill between two players better than you can among 128, yet the argument against extended series seems to play to the deficiencies of the system.

cHaNg-sTa

United States1058 Posts

November 12 2010 18:46 GMT

#123

On November 13 2010 02:32 Risen wrote:

You're an idiot. These statements don't contradict each other at all...

Uh, how so? Idra asked why one player should be penalized over the other when they have both lost once and are in the losers bracket. But then incontrol states that each Bo3 is an isolated event and should be treated as a single entity: nothing that happened previously in the tournament should have any effect on the current Bo3. Yet idra is pulling the example that player B screwed up against someone else in the tournament, so he shouldn't have an extended series with player A because of an event outside of the "isolated Bo3".

So basically incontrol said that the isolated Bo3 is an entirely new event that has absolutely no reflection on any of the previous results in the tournament. But idra is declaring that one shouldn't be penalized over the other because of a previous result in the tournament outside of the matchup between player A and B. Sounds like a contradiction to me.

Fries

United States124 Posts

November 12 2010 18:56 GMT

#124

Was just re-listening to all the arguments made in the State of the Game podcast and thought of something that I think is actually a pretty good compromise.

Not sure if this idea has yet been brought up, but what do people think of this?

Instead of an extended series, a new series is played, but as a best of 7 (or 5) instead of a new best of 3.

The logic being that Tyler is correct in saying the purpose of double elimination really is to give a better player who slips a chance to get a higher place than he otherwise would have had. I don't really see how you can dispute this. So if two players who have already played have to face each other again, it's probably best (read: most fair) to be as accurate as possible in determining who moves on.

It's true that you could still have a case where one person could overall be 4-3 or 5-4 and be the "loser" but would it at least feel better to people to go home after losing a fresh best of 5 or 7?
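To put rough numbers on the "as accurate as possible" argument, here is a minimal sketch. It assumes every game is independent and that the better player wins any single game with a fixed probability p; the value 0.6 is an arbitrary illustration, not a measured figure, and `series_win_prob` is a name made up for this example.

```python
from math import comb

def series_win_prob(p: float, best_of: int) -> float:
    """Chance that a player with per-game win probability p wins a best-of-N series."""
    need = best_of // 2 + 1  # wins required to clinch the series
    # The series winner takes the clinching game after losing k games (k < need).
    return sum(comb(need - 1 + k, k) * p**need * (1 - p)**k for k in range(need))

# Assumed 60% per-game favorite, compared across series lengths.
for n in (3, 5, 7):
    print(f"Bo{n}: {series_win_prob(0.6, n):.3f}")
```

Under those assumptions a 60% favorite wins a Bo3 about 64.8% of the time, a Bo5 about 68.3%, and a Bo7 about 71.0%, so a longer fresh series does pick the better player somewhat more often, though the gain per extra game is small.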

Ketara

United States15065 Posts

November 12 2010 19:11 GMT

#125

On November 13 2010 02:06 nzb wrote:

I imagine you would just use the initial seeds to sort between people with equivalent scores, and having 3x the number of rounds would quickly break people into categories.

I know that Google used this exact format (swiss->single elim) for their internal tournament at the end of the beta.

The problem here is that sorting ties when using swiss pairing becomes astronomically more difficult the more ties you have.

Let's say we have 64 players, and we're using their sc2ranks scores to sort ties. The first round is sorted that way, which is fine. The nature of the system is such that the first round has to be somewhat arbitrary.

After the first round, you have 32 players with 1 win and 32 players with 0 wins. The 32 with 1 win haven't played each other yet, so their sc2ranks stats can be used to sort them, and the same for those with 0 points.

However, the problem then occurs in round 3, where we have 16 people with 2 points who haven't played each other, 16 with 0 points who haven't played each other, but 32 with 1 point who may have played the person with the closest rank in the first round. In fact, if the people with the higher ranks are consistently beating the people with the lower ranks, it is very likely that you will have a large portion of these 32 whose closest ranked opponent is the person they played in the first round.

Since by the rules of the system people cannot play each other twice (you could change that rule I suppose!) you are then looking for the second closest ranked person for every single one of your problem matchups. It doesn't take a math whiz to see that at this point, every matchup that is altered in turn alters every other matchup, and it becomes very ungainly to organize it if you have a large number of problem matchups. The pairing is supposed to be done by the system and the organizer is supposed to be unable to influence it, but at this point you have to be influencing it just in order to settle the pairings.

Doing accelerated pairings and having multiple games in a series, allowing for people to have a variance in their number of points per round, would mitigate this issue to a degree, but without seeing some math I'm not convinced that it would prevent it.

Further, the concept of using someone's ranked ladder stats to sort players is going to be full of problems, since not everybody ladders, the best players in tournaments are often not the best players on ladder, and Blizzard has stated that sc2ranks is not 100% accurate at sorting player skill level because we don't have the full equation on how the hidden MMR rating works.

It would probably be fine for the first round, because that has to be somewhat arbitrary, but by using it repeatedly round after round you're undermining the accuracy of your sorting method.

Obviously if you could get a tournament system going and seed players by earlier tournament scores this would stop being an issue, but there would have to be an initial tournament at some point, and it would be nice to create a system that could be used for one-off tournaments and not require them to be part of some sort of league.

f0rk

England172 Posts

November 12 2010 19:13 GMT

#126

On November 13 2010 03:06 Jayrod wrote:
The extended series makes the finals of the tournament feel a lot less epic. To be honest, I don't care what's fair to the players involved, because the ONLY reason there is money involved is the viewership. No fans = no esports = no sponsors = no money for tournaments = tournaments would only be based on entrance fees, etc.

If they want to make esports bigger, faster, they will understand that you want your finals to feel epic and last more than like 2 games. It's a really big letdown to watch people battle it out to get to the end of this big tournament and then have one guy playing with a handicap. I guess that's what happens with a double elimination style though... the extended series sucks.

The finals would have been 2 games if it was standard double elimination. The anticlimactic finals are more down to using bo3 the whole way through, when switching to bo5 near the end would be more exciting.

Cyber_Cheese

Australia3615 Posts

November 12 2010 19:20 GMT

#127

Early on in the article, when explaining the series types, there's a typo:

"This rule is intended to avoid some paradoxical outcomes, as well as
statistically increase the likelihood that the 'better player'
continues in the tournament. It is possible in standard double
elimination for Alice to defeat Bob 2-1 in the winners', and Bob to
defeat Alice in the losers' 2-0. The "overall series" between Alice
and Bob is 3-2 in Alice's favor, but Bob continues and Alice does
not."

those should be 2-0, 2-1 and 3-2 in order to make sense

nzb

United States41 Posts

November 12 2010 20:19 GMT

#128

On November 13 2010 04:20 Cyber_Cheese wrote:
Early on in the article, when explaining the series types, there's a typo:

"This rule is intended to avoid some paradoxical outcomes, as well as
statistically increase the likelihood that the 'better player'
continues in the tournament. It is possible in standard double
elimination for Alice to defeat Bob 2-1 in the winners', and Bob to
defeat Alice in the losers' 2-0. The "overall series" between Alice
and Bob is 3-2 in Alice's favor, but Bob continues and Alice does
not."

those should be 2-0, 2-1 and 3-2 in order to make sense

You are correct, sir.

nzb

United States41 Posts

November 12 2010 20:38 GMT

#129

On November 13 2010 04:11 Ketara wrote:

I haven't worked this all out myself, so bear with me... As the number of rounds proceeds, you end up with this spread:

Round | # of wins (starting from 0, increasing)
1 | 64
2 | 32 32
3 | 16 32 16
4 | 8 24 24 8
5 | 4 16 24 16 4
6 | 2 10 20 20 10 2
7 | 1 6 15 20 15 6 1
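The spread in this table is just the binomial coefficients scaled to 64 players, which a short sketch can confirm. It assumes an idealized Swiss with no draws where exactly half of each score group wins every round; `win_distribution` is a hypothetical helper name.

```python
from math import comb

def win_distribution(players: int, rounds_played: int) -> list[int]:
    """Players at each win count (0 wins first) after rounds_played rounds, no draws."""
    return [players * comb(rounds_played, wins) // 2**rounds_played
            for wins in range(rounds_played + 1)]

# "Round r" in the table is the state going into round r, i.e. after r - 1 rounds.
for rnd in range(1, 8):
    print(rnd, "|", *win_distribution(64, rnd - 1))
```

Going into round 7 this gives 1 6 15 20 15 6 1, matching the last row.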

I think you could adopt the algorithm of, starting from the top-ranked player, choosing the next-best player he hasn't played, and continuing down from the top selecting players until you have everyone paired.

Dammit, now you have me interested in writing this all up and simulating it again.

Further, the concept of using someone's ranked ladder stats to sort players is going to be full of problems, since not everybody ladders, the best players in tournaments are often not the best players on ladder, and Blizzard has stated that sc2ranks is not 100% accurate at sorting player skill level because we don't have the full equation on how the hidden MMR rating works.

It would probably be fine for the first round, because that has to be somewhat arbitrary, but by using it repeatedly round after round you're undermining the accuracy of your sorting method.

Obviously if you could get a tournament system going and seed players by earlier tournament scores this would stop being an issue, but there would have to be an initial tournament at some point, and it would be nice to create a system that could be used for one-off tournaments and not require them to be part of some sort of league.

I don't think this is much of an issue, because you can use season ranks (presuming that you are a league and have multiple tournaments in a season). And presumably these would be reasonably accurate.

Ketara

United States15065 Posts

November 12 2010 21:05 GMT

#130

For our round 3, where the 32 players with 1 point come up, let's see how sorting by the next best player works. I'm going to use an 8-person example because that way I don't have to type as much.

We have 8 players, all with 1 point in the tournament so far:

Player A: 225 MMR, played E in round 1
Player B: 230 MMR, played H in round 1
Player C: 250 MMR, played a player who now has 2 points in round 1
Player D: 200 MMR, played F in round 1
Player E: 220 MMR, played A in round 1
Player F: 180 MMR, played D in round 1
Player G: 240 MMR, played a player who now has 2 points in round 1
Player H: 235 MMR, played B in round 1

Let's sort them from the top down.

C (250 MMR) plays G (240), which is straightforward.
H (235) should play B (230) but can't, so he goes to the next player down, A (225).
B (230) plays the next available player, E (220).
This leaves D (200) and F (180), but they cannot play each other.

So now we have to sort it from the bottom up instead.

When you sort it that way it works out, but results in C and G not playing each other which was your only obvious matchup, and gives you F at 180 MMR playing E at 220 MMR, which is not entirely fair, because at that skill differential it is likely a free point for player E that was caused by ties in the system.
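The two passes can be reproduced with a small sketch. The MMR values and round 1 opponents are copied from the list above; the greedy rule (take the next unpaired player you have not already faced) is my reading of the procedure, and `greedy_pairs` is a hypothetical name.

```python
# MMR and round 1 pairings from the example; C and G played people outside this group.
MMR = {"A": 225, "B": 230, "C": 250, "D": 200, "E": 220, "F": 180, "G": 240, "H": 235}
PLAYED = {frozenset(p) for p in ("AE", "BH", "DF")}

def greedy_pairs(order):
    """Pair each player, in the given order, with the next unpaired player not yet faced.

    Returns (pairs, stuck), where stuck lists players left with only rematches available.
    """
    remaining = list(order)
    pairs = []
    while remaining:
        player = remaining.pop(0)
        opponent = next((o for o in remaining
                         if frozenset((player, o)) not in PLAYED), None)
        if opponent is None:
            return pairs, [player] + remaining
        remaining.remove(opponent)
        pairs.append((player, opponent))
    return pairs, []

top_down = sorted(MMR, key=MMR.get, reverse=True)
print(greedy_pairs(top_down))        # top-down pass
print(greedy_pairs(top_down[::-1]))  # bottom-up pass
```

Top-down pairs C-G, H-A and B-E and then gets stuck on D and F. Bottom-up completes with F-E, D-A, B-G and H-C, which is the 180 vs 220 mismatch described above.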

And that is only with 8 people. I dunno. It has the potential to work, it just seems difficult to me. I am betting that games that only count wins as either a win (1) or a loss (0) generally do not have 64 players in their swiss tournaments. The game that we use it for, Field of Glory, has a 25 point scoring system and a very reliable ELO ranking for players, and only one league for every tournament. Plus, our tournaments rarely break 20-25 people.

nzb

United States41 Posts

November 12 2010 22:41 GMT

#131

On November 13 2010 06:05 Ketara wrote:
For our round 3, where the 32 players with 1 point come up, let's see how sorting by the next best player works. I'm going to use an 8-person example because that way I don't have to type as much.

We have 8 players, all with 1 point in the tournament so far:

Player A: 225 MMR, played E in round 1
Player B: 230 MMR, played H in round 1
Player C: 250 MMR, played a player who now has 2 points in round 1
Player D: 200 MMR, played F in round 1
Player E: 220 MMR, played A in round 1
Player F: 180 MMR, played D in round 1
Player G: 240 MMR, played a player who now has 2 points in round 1
Player H: 235 MMR, played B in round 1

Let's sort them from the top down.

C (250 MMR) plays G (240), which is straightforward.
H (235) should play B (230) but can't, so he goes to the next player down, A (225).
B (230) plays the next available player, E (220).
This leaves D (200) and F (180), but they cannot play each other.

So now we have to sort it from the bottom up instead.

When you sort it that way it works out, but results in C and G not playing each other which was your only obvious matchup, and gives you F at 180 MMR playing E at 220 MMR, which is not entirely fair, because at that skill differential it is likely a free point for player E that was caused by ties in the system.

And that is only with 8 people. I dunno. It has the potential to work, it just seems difficult to me. I am betting that games that only count wins as either a win (1) or a loss (0) generally do not have 64 players in their swiss tournaments. The game that we use it for, Field of Glory, has a 25 point scoring system and a very reliable ELO ranking for players, and only one league for every tournament. Plus, our tournaments rarely break 20-25 people.

Well, I went ahead and implemented the swiss style I was talking about and I'm running a million iterations right now. Basically, if there isn't a "valid match", then you drop the requirement that players can't play each other twice, and there is a re-match. This doesn't happen very often, though -- about 2% of the time. Also, this only impacts people at the bottom of the ranking (because the best get priority), so it's probably not too much of a concern.
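Roughly, the fallback looks like the sketch below. This is not the simulation code itself, just an illustration on the 8-player example quoted above; the function name and the choice of rematch opponent (the best remaining player) are assumptions.

```python
def pair_with_rematch_fallback(ranked, played):
    """Greedy top-down pairing that allows a rematch only when no fresh opponent is left.

    ranked: player names, best first.
    played: set of frozensets, one per pairing that has already happened.
    Returns (pairs, rematches), where rematches counts the forced repeat pairings.
    """
    remaining = list(ranked)
    pairs, rematches = [], 0
    while len(remaining) >= 2:
        player = remaining.pop(0)
        fresh = [o for o in remaining if frozenset((player, o)) not in played]
        if fresh:
            opponent = fresh[0]
        else:
            opponent = remaining[0]  # assumed fallback: accept the rematch
            rematches += 1
        remaining.remove(opponent)
        pairs.append((player, opponent))
    return pairs, rematches

ranked = ["C", "G", "H", "B", "A", "E", "D", "F"]  # the quoted example, sorted by MMR
played = {frozenset(p) for p in ("AE", "BH", "DF")}
print(pair_with_rematch_fallback(ranked, played))
```

On that example the top three pairings are untouched and the single rematch (D vs F) lands at the bottom of the group.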

The bad news is that initial results from running 50k iterations didn't look very good. I think that after running the tournament with a lot of rounds, you end up having the top players play bad players in the final rounds because they have already played all the other good players ... I'll have to follow up on this to see exactly what's going on.

Ketara

United States15065 Posts

November 12 2010 22:49 GMT

#132

Swiss ranking has a system for how many rounds you are supposed to have based on how many people have entered the tournament in order to achieve the best sorting, which as I understand it is because of that issue.

I think this is better than the Wikipedia article: http://vtchess.info/Results/Swiss_Pairing_System.htm

"The rule of thumb is that it can handle 2^n players, where n is the number of rounds. Therefore, 8 players needs 3 rounds, 16 players needs 4 rounds, 32 players needs 5 rounds, and so forth. (These numbers are approximations - due to draws and other variables, sometimes it works with more players than expected.)"

Using accelerated pairings allows you to run a tournament with one fewer round than would otherwise be necessary, but requires you to have a skill approximation of your players, and requires that you sort your initial round by that approximation.
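As a quick sketch of that rule of thumb: n rounds handle up to 2^n players, so the round count is the base-2 logarithm of the field size, rounded up. The helper name `swiss_rounds` and the `accelerated` flag are made up for illustration, and the result is only the approximation the quote describes.

```python
def swiss_rounds(players: int, accelerated: bool = False) -> int:
    """Rule-of-thumb Swiss round count: the smallest n with 2**n >= players."""
    rounds = (players - 1).bit_length()  # integer ceil(log2(players)) for players >= 1
    # Accelerated pairings are said to save one round.
    return rounds - 1 if accelerated else rounds

for field in (8, 16, 32, 64):
    print(field, swiss_rounds(field), swiss_rounds(field, accelerated=True))
```

For a 64-player field that comes out to 6 rounds, or 5 with accelerated pairings.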

Another note about swiss ranking is that there are actual, literal ways to game the system. If you have access to the rankings, know everybody's score and who played whom, and can do some quick math, you can at times figure out that if you lose a game on purpose, your next two opponents will be ones that you know cannot defeat you.

We call it "submarine-ing" and it's not a very honorable thing to do in our competitions but people do do it.

It is also sometimes possible to cheat the system by arranging a draw on purpose, but I imagine in Starcraft that would not be possible since draws are so difficult to create.

nzb

United States41 Posts

November 12 2010 22:59 GMT

#133

On November 13 2010 07:49 Ketara wrote:
Swiss ranking has a system for how many rounds you are supposed to have based on how many people have entered the tournament in order to achieve the best sorting, which as I understand it is because of that issue.

I think this is better than the Wikipedia article: http://vtchess.info/Results/Swiss_Pairing_System.htm

"The rule of thumb is that it can handle 2^n players, where n is the number of rounds. Therefore, 8 players needs 3 rounds, 16 players needs 4 rounds, 32 players needs 5 rounds, and so forth. (These numbers are approximations - due to draws and other variables, sometimes it works with more players than expected.)"

Using accelerated pairings allows you to run a tournament with one fewer round than would otherwise be necessary, but requires you to have a skill approximation of your players, and requires that you sort your initial round by that approximation.

Another note about swiss ranking is that there are actual, literal ways to game the system. If you have access to the rankings, know everybody's score and who played whom, and can do some quick math, you can at times figure out that if you lose a game on purpose, your next two opponents will be ones that you know cannot defeat you.

We call it "submarine-ing" and it's not a very honorable thing to do in our competitions but people do do it.

It is also sometimes possible to cheat the system by arranging a draw on purpose, but I imagine in Starcraft that would not be possible since draws are so difficult to create.

Yeah, after hearing from all these people that have actually played in tournaments that use swiss style, I'm not sure it is actually preferable to simple double elimination. I still think it is a cool idea though. I'll try running my simulation again using normal best of three once it finishes, instead of increasing the # of games.

It seems to me like tournaments like the GSL, who have literally thousands of people enter the qualifier, need a better system than single elimination in order to determine who qualifies. Since they take the top 64 anyway, it seems like swiss might be useful there ... But it really isn't acceptable that Tester, OGSTop, July, even Jinro can't qualify because they hit good players in the randomly-seeded qualifiers.

Ketara

United States15065 Posts

November 12 2010 23:10 GMT

#134

I think Swiss ranking is an excellent system if you A - Have a small number of people competing, B - Have a scoring system that makes identical scores rare, and C - Have to finish the tournament in a timely fashion such that round robin is impossible.

I too am shocked at the way GSL does its qualifiers, but I am under the impression that they are only doing these 3 with this system in order to create seeds, and the system they use next year is going to be markedly different and (presumably) better. Any time you're creating a league the first event is bumpy.

I do think that for a team vs. team competition swiss pairing would work great though. It'd be fun to know how the SC BW team leagues work. I'm sure there's a Liquipedia article on it and I just don't care enough to read it.

Say you've got 8 teams with 4 players each. The pairings for the first round could be random, with, say, 4 Liquid players playing 4 EG players, and then you count the number of wins as the score, over however many rounds. This would require the team to do well as a whole in order to win the tournament, because the team's best player's win is only worth as many points as its worst player's win.

Round robin would probably work well for a team league too, however, since the number of participating teams would not be huge, and it would likely be more accurate.

nzb

United States41 Posts

November 13 2010 00:20 GMT

#135

So curious results for the swiss implementation -- with 64 players it does:

Winner - 0.67
Depth - 33.38
2^Depth - 43.73

To put that in perspective, both depth metrics are slightly worse than the single-elimination tournament format (which does 24.13/40.13), but the winner metric is better than double with extended series and almost as good as round robin (which get .79 and .55, respectively). So that's certainly unexpected -- so far the trends in every metric had been pretty consistent.

I bet it's doing poorly because there are too many games; I'll try running it again.
