|
Towards a good StarCraft bot - Part 3 - "Scouting"
Summary: A guide to general scouting in six steps: four simple ones, one dealing with basic machine learning, and a final one using some very basic game theory to make the scouting pattern less predictable.
Scouting in 6 steps
An elegant algorithm that solves scouting pretty well, in 6 simple steps (OK, maybe steps 1 through 4 are simple, while steps 5 and 6 are a bit more advanced). It is assumed that you have already decided when to scout and which unit(s) to scout with.
Step 1: Heatmap “explored” We create a heatmap that stores, for every tile (32x32 pixels) on the map, whether the tile has been explored. The scouting unit is assigned the closest unexplored tile, provided either the tile is walkable and a path exists from the scout to it, or the scout is flying. Once any unit reaches a previously unexplored tile, the heatmap is updated to mark that tile as explored. This simple algorithm alone will make a scout explore the whole map.
The image below shows explored and unexplored regions.
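Step 1 can be sketched in a few lines. This is a minimal illustration, not the bot's actual code: the grid size and the `reachable` callback (standing in for "walkable with a path, or the scout is flying") are assumptions.

```python
# Step 1 sketch: boolean "explored" heatmap and closest-unexplored-tile lookup.
from dataclasses import dataclass

@dataclass
class Tile:
    x: int
    y: int

class ExploredMap:
    def __init__(self, width: int, height: int):
        self.width, self.height = width, height
        # False = unexplored, True = explored
        self.explored = [[False] * width for _ in range(height)]

    def mark_explored(self, x: int, y: int) -> None:
        self.explored[y][x] = True

    def next_target(self, scout_x, scout_y, reachable):
        """Return the closest unexplored tile the scout can reach.

        `reachable(x, y)` is an assumed hook encapsulating "walkable and
        a path exists from the scout, or the scout is flying".
        """
        best, best_dist = None, None
        for y in range(self.height):
            for x in range(self.width):
                if self.explored[y][x] or not reachable(x, y):
                    continue
                d = abs(x - scout_x) + abs(y - scout_y)  # Manhattan distance
                if best_dist is None or d < best_dist:
                    best, best_dist = Tile(x, y), d
        return best
```

Repeatedly calling `next_target` and marking reached tiles explored is already enough to sweep the whole map.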
Step 2: Heatmap “Key regions” Instead of just assigning “true” for explored tiles and “false” for unexplored ones, we can identify some key regions of special interest and assign higher values to those regions. In a first step, let us assign: 0: explored, 2: normal, 3: base location, 5: starting location. The scouting unit is assigned the closest tile among all tiles sharing the current highest value. This is all it takes to make the scout explore possible starting locations first and then check expansion locations, before checking out everything else.
Key regions such as potential enemy starting locations have higher priority:
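A sketch of the Step 2 target selection, using the values from the post (0 explored, 2 normal, 3 base location, 5 starting location); the example grid and scout position are made up for illustration:

```python
# Step 2 sketch: a value heatmap instead of a boolean one.
def next_target(heat, scout_x, scout_y):
    """Closest tile among all tiles sharing the current maximum value."""
    top = max(v for row in heat for v in row)
    if top == 0:
        return None  # everything explored
    best, best_dist = None, None
    for y, row in enumerate(heat):
        for x, v in enumerate(row):
            if v != top:
                continue
            d = abs(x - scout_x) + abs(y - scout_y)
            if best_dist is None or d < best_dist:
                best, best_dist = (x, y), d
    return best

heat = [
    [2, 2, 2, 2],
    [2, 3, 2, 2],
    [2, 2, 2, 5],  # the starting location wins despite being farther away
]
print(next_target(heat, 0, 0))  # -> (3, 2)
```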
Step 3: aging heatmap Instead of using static values, every value of the heatmap is regularly updated according to the newest information and then increased by a small amount. This way, explored tiles (whose values have been reset to 0) slowly grow in importance, up to the point where the scout decides to reexamine them. We have now created an algorithm that lets the scout explore the map endlessly, always going back to reexamine the spot that has been hidden in the fog of war the longest.
The image below shows unexplored tiles growing in importance over time.
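The aging step is a one-liner per tile. In this sketch, `AGE_RATE` is an assumed tuning constant and `visible(x, y)` is a placeholder for the bot's visibility check:

```python
# Step 3 sketch: each update, reset visible tiles to 0 and age the rest.
AGE_RATE = 0.01  # assumed per-update growth; tune to taste

def update(heat, visible):
    """`visible(x, y)` tells whether the tile is currently being seen."""
    for y, row in enumerate(heat):
        for x, v in enumerate(row):
            if visible(x, y):
                row[x] = 0.0           # freshly explored
            else:
                row[x] = v + AGE_RATE  # grows until worth revisiting
```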
Step 4: weighted aging heatmap A small improvement: instead of letting all tiles age at the same speed, we let tiles with high initial values (like the ones in key regions) age faster. The rate of growth is simply proportional to the initial value. This way, key regions are reexamined more frequently.
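Extending the Step 3 sketch to weighted aging is a one-line change: the growth term is scaled by the tile's initial value from Step 2 (again, `AGE_RATE` and the hooks are illustrative assumptions):

```python
# Step 4 sketch: growth rate proportional to the tile's initial value.
AGE_RATE = 0.01  # assumed base growth rate

def update(heat, initial, visible):
    """`initial` holds the key-region values from Step 2."""
    for y, row in enumerate(heat):
        for x, v in enumerate(row):
            if visible(x, y):
                row[x] = 0.0
            else:
                # a starting location (5) ages 2.5x faster than normal (2)
                row[x] = v + AGE_RATE * initial[y][x]
```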
Step 5: machine-learning key regions Instead of manually identifying key regions, we examine a set of replays and measure how long each tile was occupied by something interesting (usually an enemy building). This way, we are able to catch typical spots for proxies or hidden tech.
Note that the tile-occupation values should be normalized after each analyzed replay as well as after completion of the learning session. Normalizing simply means evenly scaling all values at the end of the game into a fixed range, e.g. 0 to 5; this way we can handle longer and shorter games alike. Of course, different sets can be learned for different matchups. An advanced version slowly forgets what it has learned, to account for a shifting meta.
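The normalization and the forgetting can be sketched like this; the 0-5 range matches the post, while the decay factor is an assumed parameter:

```python
# Step 5 sketch: scale each replay's raw occupancy counts into a fixed
# range so long and short games contribute comparably, plus an optional
# decay that slowly forgets old data to track a shifting meta.
def normalize(counts, top=5.0):
    peak = max((v for row in counts for v in row), default=0.0)
    if peak == 0:
        return counts
    return [[v / peak * top for v in row] for row in counts]

def forget(learned, decay=0.99):
    """Multiply everything by a decay factor after each learning pass."""
    return [[v * decay for v in row] for row in learned]
```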
Step 6: playing with probabilities (Poker) Brood War is, at least to some extent, a game of poker. If our opponent knows what algorithm we use to scout, he will build proxies and hide tech at places he knows we won’t scout for a long time. Depending on our machine learning, we might or might not adjust the scouting pattern after game 1 in a best of 5. Either way, our strategy is easy to counter: if we don’t adjust, abuse the same spot again; if we do adjust, counter-adjust. So we have to become unpredictable. In other words: we have to sometimes scout unlikely places first. This makes scouting less optimal, so we shouldn’t do it too often. How often we should do it is a game theory problem equivalent to the question of how often to bluff in poker. Applying the solution from poker, we scout randomly, but according to the probabilities of finding something interesting. That is, if machine learning establishes that region A is three times more likely than region B to contain valuable information, we will still scout region B first with a probability of 25%, i.e. one out of four games.
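This randomized policy is just probability matching over the learned heat values, which the standard library handles directly. A sketch, with made-up region names and values:

```python
# Step 6 sketch: pick the region to scout first with probability
# proportional to its learned heat value.
import random

def pick_first_region(regions):
    """`regions` maps region name -> learned heat value."""
    names = list(regions)
    weights = [regions[n] for n in names]
    return random.choices(names, weights=weights, k=1)[0]

# With A three times hotter than B, B is still picked ~25% of the time:
random.seed(0)
picks = [pick_first_region({"A": 3.0, "B": 1.0}) for _ in range(10000)]
print(picks.count("B") / len(picks))  # roughly 0.25
```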
Final comments: In an actual implementation, the exploring has to be implemented robustly. The scout could get stuck, get killed, or be unable to decide between two equally interesting locations. It could attempt to reach a location that is scoutable but not walkable, fail to walk up a ramp, and so on. The above algorithm should be viewed as a background algorithm to execute while nothing else happens. For example, once we find the enemy base it might make more sense to stick around as long as possible, circling the base.
By the way: here is an example of the initial path of the scout, recreated manually. As can be seen, the scout checks out the enemy base first, then goes back to check for proxies in its own base, checks some other places on the map, and then decides it's time to return to the enemy base for a second look.
Credit: The algorithm is inspired by how ants create paths via pheromones. Thanks to nepeta for valuable input and discussion on ant path finding!
|
Wow, I think your algorithm is simple yet powerful. Thinking about your scouting algorithm makes me rethink my own scouting strategy. Although I do scout the enemy base first, then proxies (identical to your algorithm), I always forget to recheck some areas.
I think you should give main attack paths a faster aging tile.
|
On October 01 2016 11:06 Jett.Jack.Alvir wrote: I think you should give main attack paths a faster aging tile. That's good input. Actually I was thinking of including mobile units in the machine-learned heat map as well, but later dropped the idea. Including mobile units would accomplish the goal of establishing what the main attack paths are. They would automatically be interpreted as key tiles and hence age faster.
The whole thing then works much like ant path finding, which inspired me in the first place.
Going further down the road, giving a higher priority to shuttle/dropship/loaded overlord will establish common drop harass patterns.
|
completely unrelated but it's cool how old games can actually be deconstructed in a way comprehensible to the casual audience.
|
Speaking of scouting unlikely places, I recall a game of SC2 where herO proxied a stargate right beside the natural of his opponent (I don't recall the opponent, but I think he was Terran). It wasn't game ending, but his opponent only found it after the Stargate was finished.
Sometimes a ballsy proxy goes undetected because our opponent would never think we are brazen enough to try something so outrageous.
|
yes, protoss is the real deal. the race is true to its namesake, especially in SC2. winning and losing is a question of intuition.
it would be interesting to see a human player adapt in an AI series. we have seen previously the human will not magic box like the AI. other elements of control will also be missing. surely the human will play the designer, but it seems to be a considerable disadvantage. the designer will know gamer patterns, but the human will only be discovering the designer's thoughts.
|
On October 02 2016 13:34 YokoKano wrote: it would be interesting to see a human player adapt in an AI series. we have seen previously the human will not magic box like the AI. other elements of control will also be missing. surely the human will play the designer, but it seems to be a considerable disadvantage. the designer will know gamer patterns, but the human will only be discovering the designer's thoughts. Human brains are incredibly good at inferring patterns (too good, actually, in many cases inferring patterns where there aren't any...). At the current state of the art, humans are able to detect AI patterns after game 1 and derive winning counter-strategies for the rest of the series.
|
Does it make sense to also consider the speed of the scouting unit and the game state?
speed: if the bot has a quick unit, it can check for proxies and complete the whole scouting route faster, so lower priority values would suffice. The other way round, if you have a really slow unit, the choice of what to prioritize is more meaningful and has a bigger impact.
game state: scouting for proxies has a bigger impact on the game in the early game. the longer the game goes on, base locations become more important, especially when you consider what areas to recheck.
What about scouting by holding a position? (I don't know about BW, but in SC2) you can place overlords on some spots to permanently check whether the opponent has taken a base or is taking a certain route. Is this part of the algorithm? Do you think it should be?
Is it possible to adjust scouting to the "heat" of the game state? What I mean by heat is the amount of action happening and the number of fights per unit of time. If there is a permanent battle going on, scouting for additional bases is not as important as when both players macro up and no one attacks. In a game of permanent action, the scouting unit might make the difference in the fight.
|
On October 02 2016 15:32 graNite wrote: Does it make sense to also consider the speed of the scouting unit and the game state?
this absolutely makes sense. In my own implementation I adjusted the configurations for the worker scout. A slower unit would never be able to catch up on anything while a faster unit would re-examine the enemy base too early. It's a point I actually had not thought of, even though I do allow for any unit to potentially scout.
game state: scouting for proxies has a bigger impact on the game in the early game. the longer the game goes on, base locations become more important, especially when you consider what areas to recheck.
This is actually quite important, yes. The naive way of doing this is to create a heat map for every in-game time unit. For example, every 100 frames. In a 10-minute game this requires approximately 10*60*24 / 100 = 144 states, which can be stored in approx. 2304 KB of memory (128*128 tiles * 144 frames * 1 Byte). A more sophisticated approach would detect "significant changes" during the learning phase and update the heat map only at those specific times in the game.
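The arithmetic above can be double-checked in a few lines, assuming BW's 24 frames per second and one 128x128 byte-valued heat map per 100-frame slice:

```python
# Back-of-the-envelope check of the memory figure for time-sliced heat maps.
FPS, SLICE, TILES = 24, 100, 128 * 128
frames = 10 * 60 * FPS            # frames in a 10-minute game
slices = frames // SLICE          # number of time slices
kb = slices * TILES / 1024        # 1 byte per tile
print(slices, kb)                 # -> 144 2304.0
```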
What about scouting by holding a position? (I don't know about BW, but in SC2) you can place overlords on some spots to permanently check whether the opponent has taken a base or is taking a certain route. Is this part of the algorithm? Do you think it should be?
While it certainly is very important, I do not think it should be part of the algorithm. I would devise a separate algorithm for permanent scouting, balancing potential information gained against the likelihood of getting killed. In real games this mostly applies to overlords, observers, zerglings, and the occasional marine. It sounds like a good topic for a follow-up post on scouting.
Is it possible to adjust scouting to the "heat" of the game state? What I mean by heat is the amount of action happening and the number of fights per unit of time. If there is a permanent battle going on, scouting for additional bases is not as important as when both players macro up and no one attacks. In a game of permanent action, the scouting unit might make the difference in the fight.
It is possible, but I advise against it. Whether or not to scout should not be decided by the scouting algorithm, but by a higher-level decision maker. In my case, the overall game strategy decides.
|
Impressive, most impressive ^^
|
hahahahahahaha "the scouting unit might make the difference in the fight" that's great
|
Wow that is some higher level of understanding. So if a higher level decision maker tells your bot to scout, what kind of cues are you going to use to initiate the scouting algorithm? I would assume part of it would be time based (e.g. @ 6:30 scout opponent base for tech path). That would mean you might need several different scouting algorithms depending on cues (e.g. scout for proxy vs. scout for hidden base vs. scout main attack path). Or will your aging heat map take care of the various cues to scout?
This sc bot is quite intriguing.
|
On October 03 2016 12:27 Jett.Jack.Alvir wrote: Wow that is some higher level of understanding. So if a higher level decision maker tells your bot to scout, what kind of cues are you going to use to initiate the scouting algorithm? I would assume part of it would be time based (e.g. @ 6:30 scout opponent base for tech path). This sc bot is quite intriguing. Speaking in theory, the decision maker evaluates its need for information vs. the likelihood of obtaining such information by scouting. Early game this does not need to be calculated in-game, since as long as the opponent is still covered in fog of war, there are only three variables: the map, your own build, and the enemy race. So we can create a lookup table offline. In practice this just resolves to e.g. a @9-scout.
That would mean you might need several different scouting algorithms depending on cues (e.g. scout for proxy vs. scout for hidden base vs. scout main attack path). Or will your aging heat map take care of the various cues to scout?
following the line that the higher-level decision maker determines the "when" and the scouting algorithm decides on the "how", this will be taken care of by the heat map.
Now imagine the following scenario: the algorithm has been initialized with a machine-learned heat map indicating a very low probability of proxies. The scout arrives at the enemy base and finds... nothing (a clear indication of a proxy). Now it is up to the higher-level decision maker to decide on the appropriate reaction: e.g. abort the FE, build defense, find the proxy. The "find the proxy" part is delegated back to the scouting algorithm, which could implement it by adjusting the values of the heat map. This reactive update is definitely material for a follow-up post on scouting, though.
Finding hidden bases is done automatically by the algorithm. Imagine the algorithm as a background task. Whenever you have an idle unit, you can assign it to do some scouting.
|
This is getting more and more intriguing. And more complex the longer I think about the possibilities.
So a cue is triggered by the decision maker to scout. Scouting routine is initiated and finds information that triggers more decisions by the learning machine. It will become a continuous cycle of cue/routine/feedback all with separate algorithms that work in tandem.
Fascinating! The interesting part is the AI learning process, and once it recognizes patterns certain routines can be eliminated for efficiency.
|
On October 05 2016 03:45 Jett.Jack.Alvir wrote: So a cue is triggered by the decision maker to scout. I would phrase it like this: "A cue triggers the decision maker to initiate the scouting routine"
Scouting routine is initiated and finds information that triggers more decisions by the learning machine. It will become a continuous cycle of cue/routine/feedback all with separate algorithms that work in tandem
Just to clarify: I do all the learning offline from replays. While in-game, the loop is as follows:
loop:
1. check all cues
2. react on cue
2.1. if the cue requires scouting and we're currently not scouting: initiate scouting
3. scouting potentially creates more cues
4. repeat: back to step 1
So yes, there is a cue/routine/feedback cycle, but the feedback does only affect the decision maker and not the scouting routine while in-game. Once the game is over the scouting routine can be altered by updating the heat map with the new information from the replay.
Edit: you may have been referring to the potential in-game alteration of the heat map. In that case it is correct: the feedback can also affect the scouting routine itself. But I don't consider this "learning", since it is only an ad-hoc adjustment. I haven't implemented this yet, by the way.
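The in-game loop above can be sketched as follows. The cue objects and the scouting flag are made-up stand-ins for the bot's real components:

```python
# Minimal sketch of the in-game cue loop. Cues are plain dicts here;
# a real bot would use proper cue objects and a scouting manager.
class Bot:
    def __init__(self):
        self.scouting = False
        self.handled = []

    def react(self, cue):
        self.handled.append(cue["name"])      # step 2: react on cue

    def tick(self, cues):
        """One pass of the loop; called again next frame (step 4)."""
        for cue in cues:                      # step 1: check all cues
            self.react(cue)
            if cue.get("needs_scouting") and not self.scouting:
                self.scouting = True          # step 2.1: initiate scouting
        # step 3: scouting may raise new cues, picked up on the next pass

bot = Bot()
bot.tick([{"name": "enemy_unseen", "needs_scouting": True}])
print(bot.scouting)  # -> True
```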
|
A cue triggers the decision maker to initiate the scouting routine Very subtle but I see the difference.
This is really fascinating. Makes me think of playing SC2 again.
|