|
Hey. First, congrats to you guys: not only are you doing a pretty neat job, but you also listen to feedback, which is really nice.
All right, I apologize in advance because I have a non-Aligulac-related question. But since you guys are into statistics and coding, I propose a little problem to solve in case you have some free time during the summer :D
Given the already announced non-WCS events that give points, the current WCS rankings, and the upcoming WCS seasons, and considering only the best 144 players to compete (or whatever model suits you), what is the minimal amount of points required to be in the top 16 players in 96 percent of the possible configurations?
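One way to attack a question like this is a Monte Carlo simulation. The sketch below is only an illustration under made-up assumptions: the player names, current point totals, payout table and the "every player is equally likely to place anywhere" model are all hypothetical, not real WCS data or rules.

```python
import math
import random


def simulate_cutoffs(current_points, event_payouts, n_sims=2000, top_n=16, seed=0):
    """Return the simulated top_n-th place point total, one value per simulation.

    current_points: dict of player -> points so far (hypothetical numbers).
    event_payouts: list of events, each a list of points for 1st, 2nd, ... place.
    Placements are drawn uniformly at random (a deliberately naive model).
    """
    rng = random.Random(seed)
    players = list(current_points)
    cutoffs = []
    for _ in range(n_sims):
        totals = dict(current_points)
        for payouts in event_payouts:
            order = rng.sample(players, len(players))  # random finishing order
            for player, points in zip(order, payouts):
                totals[player] += points
        ranked = sorted(totals.values(), reverse=True)
        cutoffs.append(ranked[top_n - 1])
    return cutoffs


def points_needed(cutoffs, confidence=0.96):
    """Smallest simulated cutoff that is not exceeded in `confidence` of the runs."""
    ordered = sorted(cutoffs)
    index = math.ceil(confidence * len(ordered)) - 1
    return ordered[index]


# Hypothetical example: 40 fake players and three identical fake events.
current = {f"player{i}": 100 * (40 - i) for i in range(40)}
events = [[750, 500, 300, 300, 150, 150, 150, 150]] * 3
print(points_needed(simulate_cutoffs(current, events)))
```

A more realistic version would replace the uniform placement model with win probabilities taken from a rating system, but the percentile step at the end would stay the same.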
|
5003 Posts
Am I missing something, or does Aligulac not contain map data?
|
On July 11 2013 05:06 Milkis wrote: Am I missing something, or does Aligulac not contain map data?
Correct. It probably never will, for various reasons (incomplete data, not worth the incredible time/effort at this point, contributes nothing to the overall ranking directly, etc), I figure.
|
On July 11 2013 05:13 MCXD wrote: Correct. It probably never will, for various reasons (incomplete data, not worth the incredible time/effort at this point, contributes nothing to the overall ranking directly, etc), I figure.
Even if it's incomplete, it's probably worth putting in the data, as it will be the next step in predicting results. That, and we can see how maps affect balance. It's difficult, but probably will be required at one point or another.
|
On July 11 2013 05:33 Milkis wrote: Even if it's incomplete, it's probably worth putting in the data, as it will be the next step in predicting results. That, and we can see how maps affect balance. It's difficult, but probably will be required at one point or another.
It would be great to have, absolutely. But it would also require updating 150,000 matches, which is a ton of work. We have awesome volunteers, but we don't have that many awesome volunteers. Yet.
|
On July 11 2013 05:33 Milkis wrote: Even if it's incomplete, it's probably worth putting in the data, as it will be the next step in predicting results. That, and we can see how maps affect balance. It's difficult, but probably will be required at one point or another.
I've spoken on this matter before. To the best of my understanding (based upon internal discussions with BB and everyone else), the conclusion we've come to is that:
1) there simply aren't enough games played per map per player for this to increase the accuracy of predictions and/or player skill ratings. On the contrary, a much smaller and more fragmented sample size would lead to more uncertainty, etc.
2) it's really REALLY hard (and in many cases impossible) to find map information for many of the online tournaments/qualifiers and small LANs that we add to the database
I'm not sure what to say to your claim that "it will be required at one point or another", but if we're going to see the scene go for a Proleague approach to maps, with more frequent map pool changes, then implementing such a thing would be even less accurate or meaningful (refer to #1 above).
So yeah... it's been considered.
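To put a rough number on the sample-size concern in point 1, here is a minimal sketch using the textbook binomial standard error. The player, the 120-game total, the 60% win rate and the even spread of games over maps are all hypothetical; this says nothing about how Aligulac's actual rating model treats uncertainty.

```python
import math


def win_rate_standard_error(p, n_games):
    """Standard error of an observed win rate under a simple binomial model."""
    return math.sqrt(p * (1 - p) / n_games)


# Hypothetical player: 120 recorded games, true win rate 60%.
total_games = 120
p = 0.6
for n_maps in (1, 4, 8):
    games_per_map = total_games // n_maps  # assume games are spread evenly
    se = win_rate_standard_error(p, games_per_map)
    print(f"{n_maps} map(s): {games_per_map} games each, standard error ~ {se:.3f}")
```

Splitting the same games over k maps multiplies the per-map standard error by roughly the square root of k, which is the fragmentation effect being described.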
|
Not to mention that we would need a proper definition of the term "map", as silly as it sounds. Would different versions of a map count towards the same map, or be separate? What about different spawn positions, missing watch towers, etc.? The more accurate we'd be, the less useful the data would be for any kind of prediction or statistics.
|
On July 11 2013 05:43 Conti wrote: It would be great to have, absolutely. But it would also require updating 150,000 matches, which is a ton of work. We have awesome volunteers, but we don't have that many awesome volunteers. Yet.
Technically you can just rip a lot of them off TLPD, which does have map data, which should aid with the process. Maybe even request a rip of it so you don't have to farm it.
MasterOfPuppets wrote: 1) there simply aren't enough games played per map per player for this to increase the accuracy of predictions and/or player skill ratings. On the contrary, a much smaller and more fragmented sample size would lead to more uncertainty, etc.
Even if it is something as simple as map balance (i.e., races sent out in Proleague, race matchups in tournaments), it'd help quite considerably, and that statistic should have enough sample points. Being afraid of "uncertainty" is a silly thing when you're working with statistics, because I'm sure you'll gain more in accuracy than you lose to variance.
MasterOfPuppets wrote: 2) it's really REALLY hard (and in many cases impossible) to find map information for many of the online tournaments/qualifiers and small LANs that we add to the database
That's probably true, but you can just use a dummy value for those and still have some sort of an idea. If we're going to go with the sample size argument, most of the players you add from the small LANs and qualifiers also have a small sample size, so it's hard to get a "true rating" of a player, but we believe in the LLN (law of large numbers), so we go through with it anyway.
MasterOfPuppets wrote: I'm not sure what to say to your claim that "it will be required at one point or another", but if we're going to see the scene go for a Proleague approach to maps, with more frequent map pool changes, then implementing such a thing would be even less accurate or meaningful (refer to #1 above).
On the contrary, the Proleague format is the most map-dependent one because of how they select players to go in a match. There's more to analyze and look at, and it's easier to predict the outcome (a Protoss comes out on battle royale? You bet it's going to get slaughtered. Terran on Central Plains? Nom nom nom)
I mean, it comes down to "we don't have it, and it requires work" and that's fine because getting volunteers and working on something like this is hard, but I don't really agree with the reasoning being put out, and don't see why there shouldn't be at least an option to add maps to it rather than taking it out completely.
On July 11 2013 05:58 Conti wrote: Not to mention that we would need a proper definition of the term "map", as silly as it sounds. Would different versions of a map count towards the same map, or be separate? What about different spawn positions, missing watch towers, etc.? The more accurate we'd be, the less useful the data would be for any kind of prediction or statistics.
Is your argument really "the more accurate our data, the less useful it is for statistics?" Maybe that's the case for Aligulac's rating system but please think about what you just said... If it's useless data, you don't need to use it, but having more data can't really hurt. It would be useful for map makers to refer to also in the future.
Sorry if I sound like I'm attacking it; I'm really not attacking you guys in any way. I just think it would be useful, and perhaps worth at least allowing people to submit map data.
|
On July 11 2013 06:11 Milkis wrote: Being afraid of "uncertainty" is a silly thing when you're working with statistics, because I'm sure you'll gain more in accuracy than you lose to variance.
[...]
On the contrary Proleague format is the most map dependent one because of how they select players to go in a match. There's more to analyze and look at.
I don't think you understand what I was getting at. The moment that player ratings and the prediction system become dependent on maps as well, both will become significantly less accurate simply because there are fewer games to work with. Therein lies the issue with the uncertainty.
If what you're asking for is simple stuff like map winrate statistics and whatnot, that would be much simpler to do, but at the same time redundant because TLPD already does it and it would be very hard for us to add a significant amount to that (again, lack of information for many tournaments that aren't on TLPD).
On July 11 2013 06:11 Milkis wrote: Is your argument really "the more accurate our data, the less useful it is for statistics?" Maybe that's the case for Aligulac's rating system but please think about what you just said...
Please don't be condescending when you didn't even understand my point...
|
On July 11 2013 06:21 MasterOfPuppets wrote: I don't think you understand what I was getting at. The moment that player ratings and the prediction system become dependent on maps as well, both will become significantly less accurate simply because there are fewer games to work with. Therein lies the issue with the uncertainty. If what you're asking for is simple stuff like map winrate statistics and whatnot, that would be much simpler to do, but at the same time redundant because TLPD already does it and it would be very hard for us to add a significant amount to that (again, lack of information for many tournaments that aren't on TLPD).
Let's put it this way. I'm not asking to add a dummy for each map, but having the map in the database would be useful so you can easily calculate win rates to feed the racial win rates on each map for your rating system. I'm just saying maps play a huge role in prediction (even if that effect is averaged out in simple racial win rates) and I'm surprised it isn't being considered atm.
Does that make sense?
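The kind of aggregate being asked for here, race-vs-race win rates per map, could be computed along these lines if per-game map data existed. This is a sketch only: the tuple layout and the sample records are hypothetical and do not reflect Aligulac's actual schema or any real results.

```python
from collections import defaultdict


def matchup_win_rates(games):
    """Aggregate race-vs-race results per map.

    games: iterable of (map_name, winner_race, loser_race) tuples.
    Returns {(map_name, matchup): (wins_for_first_race, total_games)}, where
    matchup is the two race letters in alphabetical order, e.g. "PT".
    Mirror matchups are skipped because they say nothing about racial balance.
    """
    tally = defaultdict(lambda: [0, 0])
    for map_name, winner, loser in games:
        if winner == loser:
            continue
        first, second = sorted((winner, loser))
        entry = tally[(map_name, first + second)]
        entry[1] += 1
        if winner == first:
            entry[0] += 1
    return {key: (wins, total) for key, (wins, total) in tally.items()}


# Made-up records, purely to show the shape of the input and output.
sample = [
    ("Daybreak", "P", "T"),
    ("Daybreak", "T", "P"),
    ("Daybreak", "P", "T"),
    ("Cloud Kingdom", "Z", "T"),
    ("Cloud Kingdom", "Z", "Z"),
]
for (map_name, matchup), (wins, total) in sorted(matchup_win_rates(sample).items()):
    print(f"{map_name} {matchup}: {wins}/{total} for {matchup[0]}")
```

Because this pools all players by race, each map needs far fewer games than a per-player, per-map rating would.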
On July 11 2013 06:21 MasterOfPuppets wrote: Please don't be condescending when you didn't even understand my point...
That wasn't aimed at you -- please read what I said.
|
Don't get me wrong. I'd love to have maps in aligulac. But you can see that it would be a lot of work, both in implementation and in maintenance.
|
On July 11 2013 06:27 Conti wrote: Don't get me wrong. I'd love to have maps in aligulac. But you can see that it would be a lot of work, both in implementation and in maintenance.
I understand that's what it comes down to. It's just very odd when you guys appeal to statistics to say "it's useless" when it's not, and the real issue is cost. I agree that inputting the data is a lot of work, but at the same time, I don't see why there shouldn't at least be an option to allow people to add it in.
|
On July 11 2013 06:24 Milkis wrote: Let's put it this way. I'm not asking to add a dummy for each map, but having the map in the database would be useful so you can easily calculate win rates to feed the racial win rates on each map for your rating system. I'm just saying maps play a huge role in prediction (even if that effect is averaged out in simple racial win rates) and I'm surprised it isn't being considered atm. Does that make sense?
Yes. Maps do play a huge role in prediction. But with the exception of Antiga, Daybreak and Cloud Kingdom (all of which are obviously no longer relevant), there simply aren't enough games played on singular maps that implementing this would benefit the system in any significant, noticeable way.
In an ideal world, there would be far more tournaments for players to compete in, and for us this would solve most if not all of our problems.
|
While I won't discuss the statistical/predictive value of maps versus no maps, I can discuss some of the simpler issues:
1: Getting the info on maps played can be tough unless you are talking about the big leagues. Let's not even get into the problems with different versions of a map, forced spawns, etc.
2: Backtracking the existing DB would take months, since we can't do it in bulk like we did with the event NSM.
3: Number of capable volunteers = 5-10.
4: The entire entry/parsing system would need an overhaul.
5: As far as I recall, we store the rows of matches as sets, not games, meaning that they would need to be split, with an extra attribute to keep track of which games belong to which set.
I really wish we could have maps, but if you check aligulac.com/db you see that outside kiekaboe, conti, BB, MoP and shellshock, we really don't have nearly enough manpower to consistently keep it all updated, nor do we have the prestige of being affiliated with TL to attract them.
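To make point 5 concrete, here is a rough sketch of the difference between storing sets and storing games. Every class and field name below is hypothetical, invented for illustration; it is not Aligulac's real database schema.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Match:
    """One stored row as described in point 5: a whole set, final score only."""
    match_id: int
    player_a: str
    player_b: str
    score_a: int
    score_b: int


@dataclass
class Game:
    """Hypothetical per-game row that map support would need."""
    match_id: int                   # the extra attribute tying a game to its set
    game_number: int
    winner: str
    map_name: Optional[str] = None  # None when the map is unknown


def split_into_games(match: Match, winners: List[str],
                     maps: Optional[List[Optional[str]]] = None) -> List[Game]:
    """Expand a Match into Game rows, given the per-game winners in order."""
    if (len(winners) != match.score_a + match.score_b
            or winners.count(match.player_a) != match.score_a
            or winners.count(match.player_b) != match.score_b):
        raise ValueError("per-game winners do not add up to the recorded score")
    if maps is None:
        maps = [None] * len(winners)
    if len(maps) != len(winners):
        raise ValueError("need exactly one map entry per game")
    return [Game(match.match_id, number, winner, map_name)
            for number, (winner, map_name) in enumerate(zip(winners, maps), start=1)]


# Hypothetical 2-1 set between two placeholder players.
example = Match(1, "PlayerA", "PlayerB", 2, 1)
for game in split_into_games(example, ["PlayerA", "PlayerB", "PlayerA"],
                             ["Daybreak", None, "Cloud Kingdom"]):
    print(game)
```

Note that a stored 2-1 score alone does not say who won which game, so the per-game winners (and maps) would have to be entered by hand, which is where the backtracking cost in point 2 comes from.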
|
PS: map stats would probably be useful, although not when compared to the massive workload behind implementing and maintaining them.
|
On July 11 2013 06:29 Milkis wrote: I understand that's what it comes down to. It's just very odd when you guys appeal to statistics to say "it's useless" when it's not, and the real issue is cost. I agree that inputting the data is a lot of work, but at the same time, I don't see why there shouldn't at least be an option to allow people to add it in.
Mostly because that requires a lot of programming work to implement. We'd need to change our database structure, as Grovbolle says. We'd need to update the parser to accept map syntax. We'd need to create new interfaces for adding maps. We'd need to figure out how much information on maps we want to have (as I mentioned above, map version, spawns, etc.). We'd need to make sure that we don't break anything that's already working. And I'm sure that we'll find tons of other stuff that needs to be done first as soon as we start working on all that.
Oh, yeah. And reworking our database structure of course means that we would need to change practically everything else so it would still work with the new database structure.
|
On July 11 2013 06:42 Grovbolle wrote: PS: map stats would probably be useful, although not when compared to the massive workload behind implementing and maintaining them.
That's the thing.
Yeah, it's an improvement, but it's like having an upgrade on the Fusion Core for 1000 minerals / 1000 gas / 300 seconds that gives Marauders an added 0.5 damage. If you catch my drift with this terrible analogy. :/
|
I believe that map stats would make aligulac strictly better. In fact, no one can argue the contrary.
However, by how much? Well, I guess "not very much". As mentioned, small sample sizes make the variance jump. Also, the community has the good taste to eliminate any map with a W/L ratio bigger than 60/40. So maps already have only a very small variance.
The cost of updating the DB is on the other hand monstrously large.
I believe that there are better ways to improve Aligulac than implementing map stats. Perhaps not better predictive power, but just a better website, for far less work.
|
On July 11 2013 06:32 MasterOfPuppets wrote: Yes. Maps do play a huge role in prediction. But with the exception of Antiga, Daybreak and Cloud Kingdom (all of which are obviously no longer relevant), there simply aren't enough games played on singular maps that implementing this would benefit the system in any significant, noticeable way. In an ideal world, there would be far more tournaments for players to compete in, and for us this would solve most if not all of our problems.
Most big maps have ~150 games of each matchup played on them. That should be "enough".
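Whether ~150 games is "enough" depends on how large a map effect one hopes to detect. As a rough yardstick, the sketch below computes a 95% Wilson score interval for a matchup win rate; the 83-67 record is a hypothetical example, and the binomial model ignores differences in player skill.

```python
import math


def wilson_interval(wins, n, z=1.96):
    """Approximate 95% Wilson score interval for a binomial win rate."""
    p_hat = wins / n
    denom = 1 + z * z / n
    centre = (p_hat + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Hypothetical record of 83 wins in 150 games for one race on one map.
low, high = wilson_interval(wins=83, n=150)
print(f"observed {83 / 150:.1%}, 95% interval roughly {low:.1%} to {high:.1%}")
```

At 150 games the interval spans roughly eight percentage points either side of a 50% win rate, so a large imbalance would show up while a small one would be hard to separate from noise.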
On July 11 2013 06:42 Grovbolle wrote: PS: map stats would probably be useful, although not when compared to the massive workload behind implementing and maintaining them.
That's what it would come down to, yeah. It's unfortunate I suppose, but what I'm trying to point out is that it's very limiting in the future not being able to add in maps. The problems you point out are only going to get worse, but at that point you've already completely ruled out the possibility of adding something important in.
Sorry, I think I went a bit overboard because I don't agree with the statistical reasoning being thrown out. Sorry about wasting your time D:
|
On July 11 2013 06:54 Milkis wrote: Most big maps have ~150 games of each matchup played on them. That should be "enough". That's what it would come down to, yeah. It's unfortunate I suppose, but what I'm trying to point out is that it's very limiting in the future not being able to add in maps. The problems you point out are only going to get worse, but at that point you've already completely ruled out the possibility of adding something important in. Sorry, I think I went a bit overboard because I don't agree with the statistical reasoning being thrown out. Sorry about wasting your time D:
Well-formulated feedback and discussion is never a waste of time. :-) It has been discussed to death internally though, which makes it a bit funny when people point it out, since we are already horribly aware of our lack of this feature. Ultimately, the speed and ease with which we can add matches was/is our main advantage over TLPD.
|