|
On December 06 2010 08:47 Skrag wrote: You simply *cannot* rely on the results of build order calculators or optimizers for comparisons. It is absolutely necessary to do the comparisons in-game, in the best way you possibly can.
Build order calculators have a simplified version of reality that simply does not reflect actual results to the letter, and you *will* get the incorrect answers from them.
Use them as a guide, not a deciding factor.
As I said before, the in-game comparisons are just as unreliable. The best you can do is use the replays as a general guide and not a deciding factor. Unless you have a lot of replays, you can't really be sure of any comparison between two builds. Most builds that were compared before only had a couple of replays. Personally, I would say the best replay analysis would be one where you had a lot of replays for both builds (I'm just guessing at how many you need to reduce the error enough, but maybe 15), all from the same person on the same map with the same position at the same game speed. It's important to note that replay results seem to vary a lot depending on who plays and how many times they have played the particular build before. Avoiding those differences would put a lot of work on one person, who would have to test each and every build over and over again. A build order calculator doesn't have to mimic how any one person plays in reality. It just has to reflect the results if the computer were to play the build perfectly. That's the true potential of a build.
|
FWIW, the most effective way I've currently found to truly compare builds for economy advantages is to perform the build on Slowest (which gives you plenty of time to not mess up important stuff), and then rather than recording mineral counts, which can vary a lot depending on exactly what instant you take the measurement at, record drone finish times, and compute the difference in available worker-seconds for the various time frames.
For example, worker 9 comes out at 46 seconds in all builds.
11 pool finishes worker 10 at 53 seconds, worker 11 at 67 seconds, pulls a worker to build the pool at 87 seconds, finishes worker 11 again at 114 seconds, worker 12 at 121 seconds, and worker 13 at 129 seconds.
13 pool finishes worker 10 at 73 seconds, workers 11 & 12 at 89 seconds, and worker 13 at 96 seconds.
From this, you can say that from 53-67 seconds, 11pool is 1 worker ahead, from 67-73, it's two ahead, from 73-87 it's 1 ahead, from 87-89 it's even, from 89-96 it's 2 behind, and from 96-114 it's 1 behind.
53-67 (14s) +1 +14ws
67-73 (6s) +2 +12ws
73-87 (14s) +1 +14ws
87-89 (2s) 0
89-96 (7s) -2 -14ws
96-114 (18s) -1 -18ws
So at the 1:54 mark, 11pool is ahead a total of 8 workerseconds, or 5.6 minerals. (workerseconds*0.7)
It's a huge pain in the ass to do this, especially if you want to go up through the first 20-24 workers, and your timings have to be as clean as possible (which is why I do it on slow) but it's by far the most accurate comparison method I've found, because it uses actual in-game timings, but doesn't have time-of-measurement issues.
It also gives you a much clearer picture of who is ahead during which timeframes, and by how much.
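The bookkeeping in this post can be sketched in a few lines of Python. This is only an illustration: the intervals and worker leads are copied from the post, 0.7 minerals per worker-second is the average rate quoted in the thread, and the names `INTERVALS`, `MINERALS_PER_WORKER_SECOND`, and `worker_second_lead` are made up for the example.

```python
# Intervals from the 11 pool vs 13 pool example:
# (start_s, end_s, worker lead of 11 pool over 13 pool).
INTERVALS = [
    (53, 67, +1),
    (67, 73, +2),
    (73, 87, +1),
    (87, 89, 0),
    (89, 96, -2),
    (96, 114, -1),
]

# Average mining rate used in the thread; real rates vary with patch distance.
MINERALS_PER_WORKER_SECOND = 0.7


def worker_second_lead(intervals):
    """Sum (duration * worker lead) over all intervals."""
    return sum((end - start) * lead for start, end, lead in intervals)


lead_ws = worker_second_lead(INTERVALS)
lead_minerals = lead_ws * MINERALS_PER_WORKER_SECOND
print(f"lead at 114s: {lead_ws} worker-seconds, about {lead_minerals:.1f} minerals")
```

Run on the numbers from the post, this reproduces the 8 worker-seconds and 5.6 minerals quoted for the 1:54 mark. Extending it to the first 20-24 workers only means adding more rows.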
|
On December 06 2010 08:58 jacobman wrote: As I said before, the in-game comparisons are just as unreliable. The best you can do is use the replays as a general guide and not a deciding factor.
The build-order calculators are even MORE unreliable, especially when the results are close, because their results simply do not match reality.
It doesn't match what the computer could do if it played perfectly. It only matches what the simulation says would happen, which is not the same thing at all.
|
On December 06 2010 09:05 Skrag wrote: FWIW, the most effective way I've currently found to truly compare builds for economy advantages is to perform the build on Slowest (which gives you plenty of time to not mess up important stuff), and then rather than recording mineral counts, which can vary a lot depending on exactly what instant you take the measurement at, record drone finish times, and compute the difference in available worker-seconds for the various time frames.
For example, worker 9 comes out at 46 seconds in all builds.
11 pool finishes worker 10 at 53 seconds, worker 11 at 67 seconds, pulls a worker to build the pool at 87 seconds, finishes worker 11 again at 114 seconds, worker 12 at 121 seconds, and worker 13 at 129 seconds.
13 pool finishes worker 10 at 73 seconds, workers 11 & 12 at 89 seconds, and worker 13 at 96 seconds.
From this, you can say that from 53-67 seconds, 11pool is 1 worker ahead, from 67-73, it's two ahead, from 73-87 it's 1 ahead, from 87-89 it's even, from 89-96 it's 2 behind, and from 96-114 it's 1 behind.
53-67 (14s) +1 +14ws
67-73 (6s) +2 +12ws
73-87 (14s) +1 +14ws
87-89 (2s) 0
89-96 (7s) -2 -14ws
96-114 (18s) -1 -18ws
So at the 1:54 mark, 11pool is ahead a total of 8 workerseconds, or 5.6 minerals. (workerseconds*0.7)
It's a huge pain in the ass to do this, especially if you want to go up through the first 20-24 workers, and your timings have to be as clean as possible (which is why I do it on slow) but it's by far the most accurate comparison method I've found, because it uses actual in-game timings, but doesn't have time-of-measurement issues.
It also gives you a much clearer picture of who is ahead during which timeframes, and by how much.
This is a possible solution. I haven't tried it myself yet. Have you tried doing two replays of the same build and checking whether they both give the same results? I'm just curious because I found that other people and I would get different mineral results between different tries. I'll check how it works with the first two replays I have of a build. If there's practically no difference, I'll try it again to double check. I don't know when I'll do this, since I still have some other things to post first.
|
On December 06 2010 09:06 Skrag wrote: On December 06 2010 08:58 jacobman wrote: As I said before, the in-game comparisons are just as unreliable. The best you can do is use the replays as a general guide and not a deciding factor.
The build-order calculators are even MORE unreliable, especially when the results are close, because their results simply do not match reality. It doesn't match what the computer could do if it played perfectly. It only matches what the simulation says would happen, which is not the same thing at all.
In game results are simply a bunch of math too. You can claim that a particular build simulator is flawed, but you can't say that simulators can't simulate perfect play. Unless you have a mistake in the math of the build simulator though, there's no reason to think that the build order calculator does not simulate perfect play.
|
On December 06 2010 09:11 jacobman wrote: This is possible solution. I haven't tried it myself yet. Have you tried doing two replays of the same build and checking to see if they both end up giving the same results? I'm just curious because I found me and other people would get different mineral results between different tries. I'll check how it seems to work with my first two replays of a build I have. If it doesn't have practically any difference. I'll try it again to double check. I don't when I'll do this since I still have some other things to post first.
Simply recording minerals is problematic for a couple reasons.
#1: The time at which you measure can make a *big* difference. For example, if you were to measure at a specific time when there just happened to be 8 drones about to return their minerals, and then measured one second later, there would be a 40 mineral difference.
#2: Small differences can have a significant impact on the mineral count. Missing a perfect inject by a second or two can make a difference, so can overlord timings. Even the way that the larvae randomly move, causing a particular drone to have to move further to get to the minerals, can make as much as a 5-10 mineral difference over time.
So yeah, obviously replays are not perfect. But if you play on slower, so you don't miss any important timings, and save often and reload every time you mess something up, they are still a lot better than the build order calculators, whose results don't even match reality, and can give you completely incorrect answers.
The nice thing about recording worker finish times is that as long as you're playing as perfectly as possible, and doing the same thing each game, they will always line up. Back when I was doing the 9OL vs extractor trick comparisons, I had a custom map that would spit out true start and finish times, down to the nearest 1/256th of a second (the smallest time increment measurable in the game), and it was actually pretty damn impressive how consistent the timings were.
But, like I said, it's a huge pain in the ass to calculate things that way, and you still have to have the build down as near-perfect as possible, including overlord timings, optimal (and consistent!) times to pull drones off to place buildings, when and how to inject vs using larvae from the previous inject, etc.
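One way to check whether two attempts at the same build "line up" is to compare their worker finish times directly. The sketch below is an assumption-laden illustration, not anything used in the thread: the finish times are invented, and `max_timing_drift` and `TOLERANCE_S` are hypothetical names.

```python
def max_timing_drift(run_a, run_b):
    """Largest absolute difference between matching worker finish times (seconds)."""
    if len(run_a) != len(run_b):
        raise ValueError("both runs must record the same number of workers")
    return max(abs(a - b) for a, b in zip(run_a, run_b))


# Hypothetical finish times for two attempts at the same build (not measured data).
attempt_1 = [46.0, 53.0, 67.0, 114.0]
attempt_2 = [46.0, 53.5, 67.25, 116.0]

# Hypothetical cutoff; pick whatever precision the comparison needs.
TOLERANCE_S = 1.0

drift = max_timing_drift(attempt_1, attempt_2)
print("usable pair" if drift <= TOLERANCE_S else f"redo: drift of {drift}s")
```

If the drift between two runs of one build is as large as the gap between the two builds being compared, the comparison says nothing, which is why consistent execution matters more than the exact tolerance chosen.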
|
On December 06 2010 09:06 Skrag wrote:
The build-order calculators are even MORE unreliable, especially when the results are close, because their results simply do not match reality.
It doesn't match what the computer could do if it played perfectly. It only matches what the simulation says would happen, which is not the same thing at all.
Yes, but we can estimate its "potential" relative to another build, if hypothetically we had a computer which executes perfectly.
Alas we do not have such a computer, but we can, like I said before, minimize the "human factor" to make results more reliable.
|
On December 06 2010 09:14 jacobman wrote: On December 06 2010 09:06 Skrag wrote: On December 06 2010 08:58 jacobman wrote: As I said before, the in-game comparisons are just as unreliable. The best you can do is use the replays as a general guide and not a deciding factor.
The build-order calculators are even MORE unreliable, especially when the results are close, because their results simply do not match reality. It doesn't match what the computer could do if it played perfectly. It only matches what the simulation says would happen, which is not the same thing at all.
In game results are simply a bunch of math too. You can claim that a particular build simulator is flawed, but you can't say that simulators can't simulate perfect play. Unless you have a mistake in the math of the build simulator though, there's no reason to think that the build order calculator does not simulate perfect play.
None of the simulators that currently exist make any attempt whatsoever to simulate reality. They're not trying to simulate a game as it is played, they simplify the simulation drastically. For example, instead of trying to simulate trips to and from specific mineral patches, they just assume a .7 minerals/second mining rate for every active worker, when the fact of the matter is that having minerals come in 5 at a time rather than continuously at a constant rate *is important*.
It makes a difference, and sometimes a *big* difference. I'm not speaking theorycraft here. I have personally seen cases where the calculator flat out gave incorrect results, and cases where the build order optimizers would spit out builds that simply weren't possible to execute in game because the timings were all off.
So yes, they *could* simulate a perfect and realistic game. But they don't, because doing that would be an absolutely insane amount of work, and would involve basically reproducing the game's simulation loop. But they're not simulating perfect play. They're simulating perfect SIMPLIFIED play, which is something very different.
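To make the batching point concrete, here is a rough sketch comparing when a purchase becomes affordable under the two income models. It is a toy, not a model of the game: the 0.7 rate and the 5-mineral loads come from the thread, but the fixed trip time (`LOAD / RATE`), the workers moving in lockstep, and the example numbers (6 workers, 50 banked, 100 cost) are assumptions chosen for illustration.

```python
import math

RATE = 0.7          # average minerals per worker-second quoted in the thread
LOAD = 5            # minerals delivered per trip, as stated in the thread
TRIP = LOAD / RATE  # assumed seconds per round trip, so the average matches RATE


def continuous_time(workers, bank, cost):
    """Seconds until `cost` is affordable if minerals trickle in continuously."""
    needed = max(0, cost - bank)
    return needed / (workers * RATE)


def batched_time(workers, bank, cost):
    """Seconds until `cost` is affordable if minerals arrive LOAD at a time.

    Assumes every worker starts a fresh trip at t=0 and all stay in lockstep.
    """
    needed = max(0, cost - bank)
    trips = math.ceil(needed / (workers * LOAD))
    return trips * TRIP


# Hypothetical situation: 6 workers, 50 minerals banked, saving for a 100-mineral purchase.
print(f"continuous model: {continuous_time(6, 50, 100):.1f}s")
print(f"batched model:    {batched_time(6, 50, 100):.1f}s")
```

Under these assumptions the batched model can never be earlier than the continuous one, which is the direction of error described in the post: the calculator lets a build spend minerals that are still in transit.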
|
I think what all the people saying that "build order simulators are unreliable" are trying to say is that because the simulators don't take into account the nuances such as transfer time, they may skew the end result in favor of one opener over another--not just a constant change over all openers. I mean, I don't know how these build order simulators calculate the numbers, but being able to transfer more drones from the first base to the natural (like you would with the 11/18, for example) instantly would definitely give that opener the edge.
At the same time, though, I understand why using replays where the build is used in context of a ladder match, for example, is equally unreliable for reasons you already stated.
I agree with Skrag on how to test all these, though, with the repeated tests on slowest. It's the most tedious, but also the most accurate. However, instead of worker-mining time, I just looked at the income graph in the post-game screen (which I let run a few minutes past full saturation).
|
On December 06 2010 09:22 scAre wrote: On December 06 2010 09:06 Skrag wrote:
The build-order calculators are even MORE unreliable, especially when the results are close, because their results simply do not match reality.
It doesn't match what the computer could do if it played perfectly. It only matches what the simulation says would happen, which is not the same thing at all.
Yes, but we can estimate its "potential" relative to another build, if hypothetically we had a computer which executes perfectly.
Alas we do not have such a computer, but we can, like I said before, minimize the "human factor" to make results more reliable.
Again, the simulators do not emulate a computer that executes perfectly. They simplify the simulation drastically so that the results can be calculated quickly and without a ridiculous amount of work trying to reproduce the game's simulator.
You absolutely can use them to *estimate* the potential. But that is purely an estimation, and you cannot draw conclusions from it without showing valid in-game results.
The fact that it's extremely difficult to get consistent in-game results doesn't remove the requirement.
|
On December 06 2010 09:27 Skrag wrote: On December 06 2010 09:14 jacobman wrote: On December 06 2010 09:06 Skrag wrote: On December 06 2010 08:58 jacobman wrote: As I said before, the in-game comparisons are just as unreliable. The best you can do is use the replays as a general guide and not a deciding factor.
The build-order calculators are even MORE unreliable, especially when the results are close, because their results simply do not match reality. It doesn't match what the computer could do if it played perfectly. It only matches what the simulation says would happen, which is not the same thing at all.
In game results are simply a bunch of math too. You can claim that a particular build simulator is flawed, but you can't say that simulators can't simulate perfect play. Unless you have a mistake in the math of the build simulator though, there's no reason to think that the build order calculator does not simulate perfect play.
None of the simulators that currently exist make any attempt whatsoever to simulate reality. They're not trying to simulate a game as it is played, they simplify the simulation drastically. For example, instead of trying to simulate trips to and from specific mineral patches, they just assume a .7 minerals/second mining rate for every active worker, when the fact of the matter is that having minerals come in 5 at a time rather than continuously at a constant rate *is important*.
It makes a difference, and sometimes a *big* difference. I'm not speaking theorycraft here. I have personally seen cases where the calculator flat out gave incorrect results, and cases where the build order optimizers would spit out builds that simply weren't possible to execute in game because the timings were all off.
So yes, they *could* simulate a perfect and realistic game. But they don't, because doing that would be an absolutely insane amount of work, and would involve basically reproducing the game's simulation loop. But they're not simulating perfect play. They're simulating perfect SIMPLIFIED play, which is something very different.
I tried looking at when drones popped in replays. Even playing on slower, I sometimes accumulated 2-second differences in timings. Even your analysis doesn't account for those fluctuations. 2 seconds is really easy to accumulate too. I couldn't even find where it was coming from when I watched the replay.
|
On December 06 2010 09:32 jacobman wrote: I tried looking at when drones popped in replays. Even playing on slower, I sometimes accumulated 2-second differences in timings. Even your analysis doesn't account for those fluctuations. 2 seconds is really easy to accumulate too. I couldn't even find where it was coming from when I watched the replay.
Then you personally are simply not quick enough or consistent enough to generate replays that can be accurately measured. As I said, when I was doing 9OL vs extractor trick comparisons, my timings ended up being very consistent, almost ridiculously so, since I was having the game tell me times down to 1/256th of a second. Then again, I never tried to go any higher than 15 supply in any of those tests, because all I needed to see was where the early advantages were.
Don't get me wrong. I'm not saying this is easy by any means. It's actually *really fucking hard*.
But that doesn't change the fact that it needs to be done, and that build order calculators/simulators can, and will, give you incorrect results that will lead you to flawed conclusions.
|
On December 06 2010 09:27 Skrag wrote: On December 06 2010 09:14 jacobman wrote: On December 06 2010 09:06 Skrag wrote: On December 06 2010 08:58 jacobman wrote: As I said before, the in-game comparisons are just as unreliable. The best you can do is use the replays as a general guide and not a deciding factor.
The build-order calculators are even MORE unreliable, especially when the results are close, because their results simply do not match reality. It doesn't match what the computer could do if it played perfectly. It only matches what the simulation says would happen, which is not the same thing at all.
In game results are simply a bunch of math too. You can claim that a particular build simulator is flawed, but you can't say that simulators can't simulate perfect play. Unless you have a mistake in the math of the build simulator though, there's no reason to think that the build order calculator does not simulate perfect play.
None of the simulators that currently exist make any attempt whatsoever to simulate reality. They're not trying to simulate a game as it is played, they simplify the simulation drastically. For example, instead of trying to simulate trips to and from specific mineral patches, they just assume a .7 minerals/second mining rate for every active worker, when the fact of the matter is that having minerals come in 5 at a time rather than continuously at a constant rate *is important*.
It makes a difference, and sometimes a *big* difference. I'm not speaking theorycraft here. I have personally seen cases where the calculator flat out gave incorrect results, and cases where the build order optimizers would spit out builds that simply weren't possible to execute in game because the timings were all off.
So yes, they *could* simulate a perfect and realistic game. But they don't, because doing that would be an absolutely insane amount of work, and would involve basically reproducing the game's simulation loop. But they're not simulating perfect play. They're simulating perfect SIMPLIFIED play, which is something very different.
That is the only real simplification that can't be changed in a build order tester. Except at the very beginning, when you're low on minerals, it is largely irrelevant: by treating mining as continuous you actually get a better idea of the worker-minutes produced than by looking at minerals in a replay since, as you said, a worker that hasn't dropped off its minerals yet has still been doing work since the last time it dropped off minerals.
|
On December 06 2010 09:37 Skrag wrote: On December 06 2010 09:32 jacobman wrote: I tried looking at when drones popped in replays. Even playing on slower, I sometimes accumulated 2-second differences in timings. Even your analysis doesn't account for those fluctuations. 2 seconds is really easy to accumulate too. I couldn't even find where it was coming from when I watched the replay.
Then you personally are simply not quick enough or consistent enough to generate replays that can be accurately measured. As I said, when I was doing 9OL vs extractor trick comparisons, my timings ended up being very consistent, almost ridiculously so, since I was having the game tell me times down to 1/256th of a second. Then again, I never tried to go any higher than 15 supply in any of those tests, because all I needed to see was where the early advantages were.
Don't get me wrong. I'm not saying this is easy by any means. It's actually *really fucking hard*.
But that doesn't change the fact that it needs to be done, and that build order calculators/simulators can, and will, give you incorrect results that will lead you to flawed conclusions.
To be honest, it would probably be just as much work, AND be more reliable, to simply come up with a build order simulator that doesn't make the continuous mineral production assumption. As far as I can tell, that is the only thing that is not realistic about the simulators, since at the beginning of the game the moment each worker arrives at the hatchery affects when the next unit or building can be started. I suspect that it doesn't make as much of a difference as you think, not to mention that there is a similar effect just from changing maps. Mineral collection rates are different on each map, yet most people don't seem to have an issue with taking results on what the most economic build is and applying them to other maps, where the timing of when you get a certain amount of minerals is absolutely different.
|
On December 06 2010 09:39 jacobman wrote: That is the only real simplification that can't be changed in a build order tester. Except at the very beginning, when you're low on minerals, it is largely irrelevant: by treating mining as continuous you actually get a better idea of the worker-minutes produced than by looking at minerals in a replay since, as you said, a worker that hasn't dropped off its minerals yet has still been doing work since the last time it dropped off minerals.
It's very far from irrelevant.
.7 is only an average. The actual mining rate depends on how close the specific patch is, and varies quite a bit.
In cases where you're waiting for minerals to do something, the constant mining rate has you doing that something sooner than you should be able to, because minerals don't come in constantly, they come in batches.
The "settle in" effect, where you have workers bouncing from patch to patch looking for a spot to mine, is not taken into account, and I've seen cases where the 22nd drone didn't actually even start actively contributing to the economy for TWO FULL MINUTES after it was built, because it took that long for the workers to settle into a consistent routine.
The calculators don't take travel time into account, ever, nor do they take into account the diminishing returns you get after the 16th worker, where 2 workers mining a single patch get exactly double the mining rate of a single worker on the same patch, but 3 workers don't get triple.
Look, we can argue this all day, but as I said, I have seen specific cases where the results of a build order simulator flat out gave the wrong answer, saying one path was superior to another, when it was fairly easy to demonstrate that the reverse was actually true, and have seen output from build order optimizers (which use the same simplified reality as the simulators) that spit out impossible builds that couldn't actually be executed in-game no matter how hard you tried.
Simulation can give you a head-start, and it can give you an approximation of the best thing to do, but at the end of the day it is a simulation that is flawed in a number of ways, and you simply cannot say "build x is better than build y" based on the results of that simplified simulation.
|
Btw, when I was doing 9OL vs extractor trick and spawning pool timing testing, I tried *really* hard to get the AI to perform the builds perfectly and consistently.
Unfortunately, it turns out that you just don't have enough flexibility in the map editor to do the sorts of things that would need to be done. It doesn't give you the option of giving commands at the level that would be required to get an AI player to execute perfectly. I could get it to execute a single extractor trick reliably and *almost* perfectly, but there wasn't any way to get it to execute a double-extractor trick at anything even remotely approaching maximal efficiency.
Which is unfortunate, because if you could, that would be an extremely consistent way of producing replays.
|
Eh, I'll respect that. I still tend to think the results are going to be better than through replays. I've had bad replay comparison experiences with so many differences between different replays that I pretty much gave up on it.
One last point though. The effect of drones searching for mineral patches is very minimal. It doesn't happen a ton in the beginning of the game, and later on in the game it simply does not matter, since you have a surplus of minerals at all times and no build timings are affected. Yes, you may end up with fewer minerals, but all builds will suffer the exact same mineral deficit due to this. This effect only matters for replays in the later game (after 20ish supply), because each replay is usually affected differently by it.
|
On December 06 2010 10:07 Skrag wrote: Btw, when I was doing 9OL vs extractor trick and spawning pool timing testing, I tried *really* hard to get the AI to perform the builds perfectly and consistently.
Unfortunately, it turns out that you just don't have enough flexibility in the map editor to do the sorts of things that would need to be done. It doesn't give you the option of giving commands at the level that would be required to get an AI player to execute perfectly. I could get it to execute a single extractor trick reliably and *almost* perfectly, but there wasn't any way to get it to execute a double-extractor trick at anything even remotely approaching maximal efficiency.
Which is unfortunate, because if you could, that would be an extremely consistent way of producing replays.
That is actually really unfortunate, because that would pretty much be the end-all, be-all for build order testing.
Perhaps I made a mistake in the topic of this thread. Maybe I should have made it a thread to discuss what the most precise and accurate method of testing build orders relative to one another is.
|
On December 06 2010 10:07 jacobman wrote: Eh, I'll respect that. I still tend to think the results are going to be better than through replays. I've had bad replay comparison experiences with so many differences between different replays that I pretty much gave up on it.
One last point though. The effect of drones searching for mineral patches is very minimal. It doesn't happen a ton in the beginning of the game, and later on in the game it simply does not matter, since you have a surplus of minerals at all times and no build timings are affected. Yes, you may end up with fewer minerals, but all builds will suffer the exact same mineral deficit due to this. This effect only matters for replays in the later game (after 20ish supply), because each replay is usually affected differently by it.
The effect is not "very minimal". Two minutes is worth about 80 minerals. Granted, that was the most extreme case I saw, adding the last possible worker on a base where 22 was the max, but 20-30 second times were not unheard of, even for earlier workers.
It will typically affect different builds in similar ways, but the simulator doesn't take it into account *at all*, and will again have the build doing things before it could actually do them. That's exactly the sort of thing that can have the simulator saying "build X is better than build Y" when it's not actually true, because it executes things in a way that is actually impossible in-game.
It's true that most builds have the minerals they need most of the time, but there are very key points in all of them where that's not true, and the build is in fact waiting for minerals. Typically this will happen around big spending points, such as the spawning pool, possibly the first queen, and *always* the hatch.
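For scale, the "about 80 minerals" figure follows from the 0.7 average rate: two minutes at 0.7 minerals per second is roughly 84. A minimal sketch of that arithmetic, assuming the unsettled worker contributes nothing at all during the delay (a simplification) and using the hypothetical helper name `minerals_lost`:

```python
RATE = 0.7  # average minerals per worker-second quoted earlier in the thread


def minerals_lost(idle_seconds, rate=RATE):
    """Minerals a worker would have mined at the average rate while unsettled."""
    return idle_seconds * rate


# The delays mentioned in the post: 20-30 seconds, and the two-minute worst case.
for idle in (20, 30, 120):
    print(f"{idle:>3}s unsettled ~ {minerals_lost(idle):.0f} minerals")
```

Whether a loss of that size changes a comparison depends on whether it lands on one of the points where the build is waiting for minerals.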
|
On December 06 2010 10:07 jacobman wrote: Eh, I'll respect that. I still tend to think the results are going to be better than through replays. I've had bad replay comparison experiences with so many differences between different replays that I pretty much gave up on it.
My experience is exactly the opposite of yours. I've been able to pretty consistently reproduce results in replays, especially when playing on slower (for example, I've *never* seen a full two-second difference in timings when trying to accurately measure them), but have seen cases where the simulators and optimizers just flat out gave the wrong answer because the results aren't based on reality.
|