|
These past two days I had a chance to attend the leading test automation conference in the world, GTAC, and got to meet a bunch of brilliant people in the industry. This discussion is aimed at people who are unfamiliar with the field, not at technical experts. I haven't designed anything, nor do I have anything cool to show. The technical aspects are daunting, but I just wanted to give you guys some food for thought.
For those of you unfamiliar with test automation, the general idea is that products of all kinds need some sort of testing to make sure they work properly. A lot of this testing is done manually, whereby an individual is hired or contracted by a company to break that company's products. At some point people started to ask, "wouldn't it be great if there was a way we could test our products automatically every time we make a change?" Test automation is the concept that came out of that: let's make sure things work, faster and with less effort.
In my field, test automation means making sure websites operate as they should. Let's take TL as an example. There's a bunch of functionality that needs to be checked every time a developer changes how the site works: users should be able to log in, search fields should work as intended, the navigation bars should update properly, and you should only see live streams of those who are online.
One question you may ask is: why would changing something related to login affect search fields? Well, perhaps they share some of the same tools. As any product grows more complex, it becomes harder to track the effects of one change. Imagine changing a busy two-way city street into a one-way street with fewer lanes. Some people might be delayed by the increased traffic. Others might avoid the street entirely and change what roads they take. People might need to reschedule appointments, which would then affect surrounding businesses and not just people on the road.
Wouldn't it be nice if, every time someone made a change, various programs ran to check that users could still log in, navigate around the product, and use its features as intended? Almost all technological products can leverage this in some way.
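To make that concrete, here's a minimal sketch of what an automated login check can look like, written with Selenium (a real browser-automation library). The URL, element IDs, and credentials are hypothetical placeholders, not TL's actual markup:

```python
# Minimal automated login check using Selenium.
# The URL, element IDs, and credentials below are hypothetical
# placeholders, not any real site's actual markup.
from selenium import webdriver
from selenium.webdriver.common.by import By

def test_login():
    driver = webdriver.Chrome()
    try:
        driver.get("https://example.com/login")  # hypothetical URL
        driver.find_element(By.ID, "username").send_keys("test_user")
        driver.find_element(By.ID, "password").send_keys("test_pass")
        driver.find_element(By.ID, "login-button").click()
        # If login worked, the page should mention the user somewhere.
        assert "test_user" in driver.page_source
    finally:
        driver.quit()
```

Once a check like this is scripted, it can run automatically on every change, which is the whole point.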
What if we did this for starcraft?
What if every time someone removed a mineral patch from a map, that map ran test games from that base location for every race and every possible build order and gave you organized results about changes in timing?
What if every time someone came up with a new popular build people could test that build on every single map ever used in competitive play?
Obviously it would be nearly impossible to simulate even a fraction of the possibilities if you consider an opponent who is adapting to what you do, but given how far some of the AI work in starcraft has come, we can test simple cases. We already have competitions focused on micro battles between certain subsets of units, where bots fight it out to determine a victor.
If we had a framework to run these types of automated tests for starcraft, perhaps we could change and evolve the game we know and love at a rate unheard of.
|
The problem here isn't the test automation portion. For this case, that's trivial: you just run simulations enough times to get statistical significance for your suite of tests. But like you said, these kinds of tests require sophisticated AIs that can play at a human level for the results to be meaningful, and believe me, we are extremely far from that level. The AIs made for bw/sc2 aren't really true AIs; they are in large part hand-coded and incapable of adapting to novel situations.
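For the "run it enough times" part, a rough sketch: repeat a simulated matchup N times and report the win rate with a confidence interval. `run_simulation()` is a hypothetical stand-in for whatever simulator you have, and the 0.55 outcome is a placeholder, not real data:

```python
# Repeat a simulated matchup N times and report the win rate with a
# simple normal-approximation 95% confidence interval.
import math
import random

def run_simulation() -> bool:
    """Hypothetical simulator stub: True if build A beats build B."""
    return random.random() < 0.55  # placeholder outcome, not real data

def win_rate_with_ci(n=1000, z=1.96):
    wins = sum(run_simulation() for _ in range(n))
    p = wins / n
    margin = z * math.sqrt(p * (1 - p) / n)
    return p, margin

p, margin = win_rate_with_ci()
print(f"win rate: {p:.3f} +/- {margin:.3f}")
```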
|
Will never happen, and will never be attempted. You can't even begin to fathom testing every possible build, because the number of things you can do in a game of sc explodes exponentially with every second. Almost literally.
Maybe your examples could be better, or maybe I just don't fully understand the concept, but what's the point of removing a mineral patch? If you could collect data on the difference in income, does that even do anything for you? Applying it to balance is something so technical that it isn't needed, and I can't see why you would want that type of information otherwise. In terms of running builds on every map, what does that change? Every map has a main, natural, and third available with the exact same number of mineral fields and gas geysers (maybe with the exception of Terminus and Crevasse, but that's beside the point), and build viability is based on income. Sure, we've seen that some builds work "better" on some maps, but that's because of micro. And computers can't simulate that, because no human thinks alike in their decision making, which goes back to the first two thoughts on practicality and possibility.
I can't see why you would ever want to use results from an automated test anyway; a computer won't be able to tell the difference between what's "good" and what's not.
|
We could analyze replays from a specific person and learn their exact tendencies, using averages in macro analysis as well as micro. Using these specific averages, you could hash out a few build orders with in-game data collection while the CPU is playing, essentially creating perfectly recreated progamers. You could then battle these AIs against each other and see who is the best, lol, like setting both of your favorite soccer teams in FIFA to AI and seeing who wins.
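The first half of that (averaging tendencies out of replays) is the tractable part. A rough sketch under stated assumptions: `parse_replay` is a hypothetical stand-in for a real replay parser such as sc2reader, and the file names in the example are made up:

```python
# Aggregate a player's average build-step timings across replays.
# parse_replay() is a hypothetical stand-in for a real replay parser
# (e.g. sc2reader); it is assumed to yield (build_step, game_seconds).
from collections import defaultdict
from statistics import mean

def parse_replay(path):
    """Hypothetical parser: yields (build_step, game_time_seconds)."""
    raise NotImplementedError  # plug in a real parser here

def average_timings(replay_paths):
    timings = defaultdict(list)
    for path in replay_paths:
        for step, seconds in parse_replay(path):
            timings[step].append(seconds)
    return {step: mean(times) for step, times in timings.items()}

# e.g. average_timings(["game1.SC2Replay", "game2.SC2Replay"])
# might return {"Barracks": 95.2, "Factory": 211.4, ...}
```

The second half, actually making a bot play like that player, is exactly the part the post above says we're extremely far from.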
|
The idea isn't to test the intricate cases; that's more along the lines of exploratory testing and would need human interaction. Unfortunately, the examples I chose were insufficient, so let me try to present it a bit differently.
Let's revisit the case of a map maker utilizing tests of this nature. The computer isn't able to tell what is "good" or not, and that's not the role of the test. An automated test is a scenario laid out by a human to check whether a desired result was reached. Once scripted, it can be run trivially and reused.
Complicated AI isn't what I'm after; that would encompass something a lot grander, like end-to-end testing. The idea is a lot of small tests that probe very specific components of the game, and if automated properly, the cost of running them should be trivial. Automating a single player to execute a specific build can be useful for determining timings: you can gather the raw data for the build and collect metrics on how it performs from each start location.
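As a sketch of what such a small test could look like: script a fixed build with no opponent and record when a chosen milestone is hit, once per start location. `run_scripted_build` is a hypothetical harness over a bot framework (BWAPI, python-sc2, or similar); none of the names below are real APIs:

```python
# Small, specific test: play a scripted build with no opponent and
# record when each milestone is reached, repeated per start location.
# run_scripted_build() is a hypothetical harness over a bot framework
# (e.g. BWAPI or python-sc2); it is NOT a real API.
def run_scripted_build(map_name, start_location, build_order):
    """Hypothetical: returns {milestone: game_time_seconds}."""
    raise NotImplementedError  # plug in a real bot framework here

def push_timing(map_name, start_location):
    build = ["supply_depot", "barracks", "factory", "plus1_and_shields"]
    times = run_scripted_build(map_name, start_location, build)
    # The test only gathers the raw number; a human judges it.
    return times["plus1_and_shields"]

# Example (once a real harness is plugged in):
# for loc in ["top_left", "bottom_right"]:
#     print(loc, push_timing("some_map", loc))  # hypothetical map name
```

The test itself doesn't decide anything; it just produces the timing numbers cheaply, every time the map or the build changes.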
Does this help clear it up a bit?
|
I still don't really understand how you use it. As a map maker, what is it that you would want to test? What kind of desired results are we looking for?
As far as benefits from determining timings: I (Terran) base most of my builds off an upgrade timing. Let's say I want to make a 9:00 push with +1 and shields, with marines and hellions. I know that I want my +1 and shields to be done at exactly 9:00, so I start them at 6:30 and 7:10. I also add additional rax as I can afford them, while always building units out of my rax, fact, and CC, and not getting supply blocked. As I push I'll expand behind it. What exactly is the automator supposed to be able to do that I can't do myself? For me personally, when I make builds, the hard part is more about surviving various rushes and finding ways to scout/react than it is about getting the timing right, which requires another player, not a computer.
|
In the first case, let's consider one of the simplest examples: how long it takes different types of units to get from one end of the map to the other. I can see that being useful in objectively determining how defensible a position is.
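Even that can be scripted rather than hand-timed. A minimal sketch, assuming you can export the map's walkable grid; the toy grid and the unit speeds below are illustrative, not real map data or exact game values:

```python
# Traversal-time check: BFS over a walkable grid for the shortest
# path length, then divide by unit speed. The grid and speeds are
# illustrative stand-ins, not real map data or exact game values.
from collections import deque

def shortest_path_length(grid, start, goal):
    """BFS over a grid of 0 (walkable) / 1 (blocked), 4-directional."""
    rows, cols = len(grid), len(grid[0])
    seen, queue = {start}, deque([(start, 0)])
    while queue:
        (r, c), dist = queue.popleft()
        if (r, c) == goal:
            return dist
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and (nr, nc) not in seen):
                seen.add((nr, nc))
                queue.append(((nr, nc), dist + 1))
    return None  # unreachable

UNIT_SPEEDS = {"zergling": 2.95, "marine": 2.25}  # illustrative values

def traversal_time(grid, start, goal, unit):
    dist = shortest_path_length(grid, start, goal)
    return None if dist is None else dist / UNIT_SPEEDS[unit]

# Tiny toy map: 0 = walkable, 1 = blocked.
toy_map = [[0, 0, 0],
           [1, 1, 0],
           [0, 0, 0]]
print(traversal_time(toy_map, (0, 0), (2, 0), "zergling"))
```

Run it once per unit type and per pair of bases and you have the whole rush-distance table regenerated automatically whenever the map changes.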
Timing data also doesn't strictly benefit players alone. Perhaps as events mature, organizers want to change the metagame and promote significantly longer or shorter matches. While simple tests can't tell you everything, they can give you a good portion of a baseline. You want +1 and shields with a specific number of marines and hellions: in what ways can the map be changed to make that 9-minute timing an 8-minute timing, or a 10-minute timing?
Again, the idea with a lot of these smaller tests is to check very small, very specific factors that you may want to consider. I'm not trying to make a list of all the different things you can test, just pointing out that if small tests with some value can be run trivially, why not do it?
Also, there's thought to be put into what constitutes a small test, what makes it valuable, and what reasonable cases should be considered. All of these are good questions to ask.
|
Measuring the rush distances on a map takes about 3 minutes to do in real life anyway: open up the map, send a worker straight out to each base, and watch the time. I can't really see a map maker creating a map in which a main goal is to change a 9:00 push into a 10:00 push for purposes of forcing the metagame to shift or promoting different game lengths, but I could be wrong. From the player's perspective, changing your initial upgrade time by 1 minute isn't too hard of a change anyway.
Maybe there are "very small, very specific factors" I'd want to test, but for the life of me I can't think of any that an automaton without a complicated AI would be able to do that I can't.
Cases aren't valuable and don't need to be looked into if they don't improve gameplay and can't be implemented in a useful fashion. Maybe an automaton would be able to tell me that if my marine is at full health at a watch tower and two zerglings without speed come, I can make it back home, but I can't imagine that making a difference in my gameplay, and it's nowhere near game-breaking. If the situation ever comes up I'll pick one or the other, and I'll find out that way.
Maybe the automaton idea isn't so impossible after all, but now I'm starting to think it's not a very useful one.
|