INTREPID v1.0

StRyKeR

United States1739 Posts

July 06 2009 21:55 GMT

Ever wonder Who is Who in ICCup Season x?

Ladies and gentlemen, please allow me to present this to Starcraft enthusiasts all over the world. I present it with the wish that the original Starcraft lives far beyond the inception of Starcraft 2 and that we will have many more chances in the future to utilize this tool.

I've mentioned in my previous blog posts that I've been working on an automated replay identifier. I've used Support Vector Machines and sophisticated machine learning tools to tackle this problem. However, I've re-evaluated the problem and I've come up with something simpler. I gave up trying to program a machine that is able to identify a player with 100% accuracy. It now presents a list of results that the human must interpret.

Throughout the process, romad has always been a source of inspiration, though he is probably not aware of it. He is the proof that a replay identifier is possible. Perhaps his ability to identify who's who in a replay is nothing short of genius, something akin to humans recognizing patterns in a game of Go. However, with the hopes that his ability can be turned into an algorithm, I kept myself on track.

I present to you INTREPID.

intrepid

characterized by resolute fearlessness, fortitude, and endurance

It is the INTeractive REPlay IDentifier.

Now, I do not claim that its accuracy comes anywhere near the identifying prowess of romad. In fact, it currently does pretty damn terribly.

Where it's at
=====================================================
=====================================================
>>>> INTREPID <<<<
=====================================================
=====================================================

It's in its infant stages and it works somewhat decently. Because of its inadequacies, it is currently only very good at identifying certain players.

It's quite peculiar. It's good at identifying players with very unique signatures, such as IdrA. He has an incredibly unique signature.

I am completely confident that it can detect IdrA replays. You can download the recent IdrA vs. ret replay and upload it to see the result: http://files.getdropbox.com/u/4422/replays/sleepwalker.rep.

I'll go through the process in case you're confused at the results. As mentioned earlier, the results require human analysis.

1. Upload the replay. You'll see the following.

2. Click Analyze to the right of SleepWalker. You'll see the following.

3. There are four categories: building hotkeys, hotkey actions, hotkey spam, and hotkey assignments. Roughly speaking, each list is a list of players whose hotkeys come close to SleepWalker's hotkeys.

Building Hotkeys = assigning 1 to Command Center, 2 to Barracks, etc.

Hotkey Actions = does the player 1a2a3a or 4a5a6a?

Hotkey Spam = what sequence of hotkeys does the player love to repeat? 515251525125? or 232323232323?

Hotkey Assignments = what hotkeys are used at all by the player?
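
To make the categories a bit more concrete, here is a minimal sketch of how two of them (hotkey spam and hotkey assignments) might be turned into comparable features. This is not INTREPID's actual code: the function names, the choice of fixed-length runs, and the input format (a plain list of hotkey presses) are all assumptions for illustration.

```python
from collections import Counter


def spam_ngrams(hotkeys, n=4):
    """Count every run of n consecutive hotkey presses.

    Hypothetical stand-in for the "Hotkey Spam" category: a player who
    loves 5152... will show a few runs with very high counts.
    """
    return Counter(tuple(hotkeys[i:i + n]) for i in range(len(hotkeys) - n + 1))


def assigned_hotkeys(hotkeys):
    """Hypothetical stand-in for "Hotkey Assignments": which keys appear at all."""
    return set(hotkeys)


# Example using the spam sequence quoted above.
presses = list("515251525125")
spam = spam_ngrams(presses)
used = assigned_hotkeys(presses)
print(spam.most_common(3))
print(sorted(used))
```

Counting short runs rather than the whole sequence is only one possible choice; it keeps the feature independent of how long the replay is.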

The "Total" combines all the categories into one easy-to-read list. As you can see, idra trumps all other players in similarity to SleepWalker.

Also, each player in the list is accompanied by a similarity rating.

Where does the player list come from?
I have a few thousand replays gathered mostly from TSL. I have a few dozen progamer replays that I've picked up from TL and other places. Whenever you upload a replay, it is compared to the database I have.
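
As a rough illustration of what "compared to the database" could mean, the sketch below sums a per-category distance into one total and sorts the database by it, closest first. The distance measure (sum of absolute frequency differences), the data layout, and the player names are assumptions made up for this example, not a description of how INTREPID really scores replays.

```python
def category_distance(a, b):
    """Sum of absolute differences between two frequency dicts (0.0 = identical)."""
    keys = set(a) | set(b)
    return sum(abs(a.get(k, 0.0) - b.get(k, 0.0)) for k in keys)


def rank_players(query, database):
    """Return (player, total) pairs sorted from closest to farthest.

    query:    {category: {feature: frequency}} for the uploaded replay
    database: list of (player, signature) with the same shape as query
    """
    results = []
    for player, signature in database:
        total = sum(
            category_distance(query[c], signature.get(c, {})) for c in query
        )
        results.append((player, total))
    return sorted(results, key=lambda r: r[1])


# Made-up signatures, purely for illustration.
query = {"actions": {"1a": 0.5, "2a": 0.5}, "spam": {"5152": 1.0}}
database = [
    ("player_b", {"actions": {"4a": 1.0}, "spam": {"2323": 1.0}}),
    ("player_a", {"actions": {"1a": 0.5, "2a": 0.5},
                  "spam": {"5152": 0.75, "2323": 0.25}}),
]
ranked = rank_players(query, database)
print(ranked)
```

With this kind of scoring a lower total means a closer match, which fits the way the ratings are read in the "How to use" notes.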

How to use
1) I cannot emphasize enough that in its current state, it is probably about 25-romad (in the sense that it is 25% close to romad-level detection). I don't think we can ever replace romad for accurate replay detection, but with this, a lot of unique key signatures CAN be detected with a fairly high accuracy (e.g. IdrA, TT1, and some others I found to be unique).

2) Lower ratings mean closer matches. I'd say that a "Total similarity" rating of less than 0.5 is pretty much guaranteed equivalence. In the SleepWalker replay, the player Idra shows a < 0.5 total similarity to SleepWalker, strongly suggesting that SleepWalker = Idra. Furthermore, the vast majority of the top 50 results in Total are Idra.

3) I must emphasize that just because someone is at the top does not mean that the player is that someone. You must take the similarity into account. In fact, someone might take up the majority of the top 50 results in Total similarity. Even then, you cannot be certain. If the closest player's similarity rating is a 5, that's a pretty terrible similarity rating. The conclusion is that INTREPID has no clue as to who it is.
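
The reading rules above could be written down roughly as follows. This is only a sketch: the 0.5 cutoff and the top 50 window come from the notes above, but requiring both a low rating and a majority at once is a deliberately cautious assumption, and the function name and input format are hypothetical.

```python
from collections import Counter


def interpret(ranked, threshold=0.5, top_n=50):
    """Guess the player, or return None when the list is inconclusive.

    ranked: list of (player, total) pairs sorted ascending, one entry per
            database replay; a lower total means a closer match.
    """
    if not ranked:
        return None
    best_player, best_total = ranked[0]
    top = ranked[:top_n]
    majority_player, hits = Counter(p for p, _ in top).most_common(1)[0]
    # Being at the top is not enough: the rating itself must be good too.
    if (best_total < threshold
            and majority_player == best_player
            and hits > len(top) / 2):
        return best_player
    return None


# Made-up result lists, purely for illustration.
confident = [("idra", 0.3), ("idra", 0.4), ("someone_else", 0.9)]
clueless = [("player_x", 5.0), ("player_x", 5.1), ("player_y", 6.0)]
print(interpret(confident))
print(interpret(clueless))
```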

Current problems
1) My server runs on a 64-bit machine. RepASM, which I am using to analyze replays, was written for 32-bit. Therefore, some replays simply will not produce any results. You can tell because the page will finish loading after pressing Analyze and you won't see anything.

2) RepASM has a few deficiencies. I cannot tell which unit is which. I have to do detective work (did this unit just build an SCV? then it must be the Command Center). Sometimes, it's impossible to determine the identity of a unit. For example, I can't tell whether a unit is a ComSat station or not because no special ability uses are recorded by RepASM. Thus, one major deficiency is that ComSat hotkeys are completely ignored.

3) I don't have enough progamer replays. In general, it improves the identifier to have more progamer replays. One reason is that the more there is for a player, the more likely it is that I have samples of different build orders. The other reason is that every player has small deviations. To have a greater sample set means that we can estimate the true "signature" of a player better.

4) I'm currently working on a much better (in theory) version of the algorithm. Stay tuned!
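
The detective work from problem 2 can be sketched like this. The event format, a list of (unit id, trained unit) pairs, is an assumption for illustration and not RepASM's real output, and the lookup table only covers a few Terran examples.

```python
# Which building trains which unit (a small, incomplete Terran table).
PRODUCER_OF = {
    "SCV": "Command Center",
    "Marine": "Barracks",
    "Vulture": "Factory",
}


def infer_buildings(events):
    """Guess building identities from what each unit id was seen training.

    events: iterable of (unit_id, trained_unit) pairs (hypothetical format).
    Units that never train anything, such as a ComSat station, stay unknown.
    """
    identities = {}
    for unit_id, trained in events:
        building = PRODUCER_OF.get(trained)
        if building is not None:
            identities[unit_id] = building
    return identities


# Made-up events: unit 7 builds SCVs, unit 12 builds a Marine.
events = [(7, "SCV"), (12, "Marine"), (7, "SCV"), (30, "Scanner Sweep")]
identities = infer_buildings(events)
print(identities)
```

Unit 30 stays unidentified here because nothing in the table ties it to a building, which is the same reason ComSat hotkeys get ignored.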

It is my hope that these problems can be fixed by programmers out there. I am hoping that INTREPID is a good enough reason to motivate some coders.

anderoo

Canada1876 Posts

July 06 2009 21:59 GMT

wow
cool

AoN.DimSum

United States2983 Posts

July 06 2009 22:03 GMT

wow good job!

roMAD

Russia2355 Posts

July 06 2009 22:03 GMT

haha nice! btw, what's the address of the website?

StRyKeR

United States1739 Posts

July 06 2009 22:06 GMT

oops, forgot to include address

tec27

United States3696 Posts

July 06 2009 22:06 GMT

This sounds really cool, nice job!

OneOther

United States10774 Posts

July 06 2009 22:16 GMT

haha this is really cool. been playing around with it for a while now

Sadistx

Zimbabwe5568 Posts

July 06 2009 22:21 GMT

Apparently my TvT hotkeys are closest to Advokate's. Although I doubt the thing's ability to predict it accurately.

Xeofreestyler

Belgium6768 Posts

July 06 2009 22:24 GMT

whoa sick :o

littlechava

United States7218 Posts

July 06 2009 22:26 GMT

#10

On July 07 2009 07:21 Sadistx wrote:
Apparently my TvT hotkeys are closest to Advokate's. Although I doubt the thing's ability to predict it accurately.

You're lucky, my hotkeys/spam were just a mixture of like every z player in the database :[

Shauni

4077 Posts

July 06 2009 22:39 GMT

#11

On July 07 2009 07:26 littlechava wrote:

You're lucky, my hotkeys/spam were just a mixture of like every z player in the database :[

HA HA I EVEN FOUND MYSELF IN THE DATABASE, AWESOME TOOL

G0dly

United States450 Posts

July 06 2009 23:38 GMT

#12

hmm it looks like my hotkeys/spam are pretty generic. Interesting tool, I hope you can improve it for greater accuracy, although I don't think romad is replaceable in terms of accuracy.

jimminy_kriket

Canada5499 Posts

July 06 2009 23:39 GMT

#13

sweeeeet, would you like people to share replays with you? I have 7600 replays of koreans and im sure romad has a crapload more.

meathook

1289 Posts

July 07 2009 00:07 GMT

#14

Hmm.. rekrul is apparently Lzgamer. Hehe. Regardless, good work, Stryker!

Oystein

Norway1602 Posts

July 07 2009 00:08 GMT

#15

Cool, I found myself easily in both ZvP and PvZ so I guess that means I have a unique signature? :D
For ZvT I found myself at the top, but lots of others in between further down.

Yeah this seems cool.