Some applications of the classifier

StRyKeR

United States, 1739 Posts

November 25 2008 03:31 GMT

Brief introduction to what's below
I'm working on an algorithm that would take a replay and determine who played in it. I always thought it would be possible because humans can pretty easily spot patterns in hotkey signatures and conclude who played. For more reference, refer to my first blog entry.

Basically, the machine needs training data (previous replays) and after it trains on that data, we can begin to ask it questions.
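
For readers new to this, the train-then-ask workflow can be sketched in a few lines of Python. This is only an illustration, assuming a nearest-centroid rule as a stand-in for the real learning algorithm; the names `train`, `classify`, `PlayerA`, and `PlayerB` and the tiny 3-number vectors are all made up for the example.

```python
from math import dist

def train(samples):
    """Average each player's feature vectors into one centroid per player.

    samples: list of (player_name, feature_vector) pairs (hypothetical format).
    """
    grouped = {}
    for player, vec in samples:
        grouped.setdefault(player, []).append(vec)
    return {
        player: [sum(column) / len(vecs) for column in zip(*vecs)]
        for player, vecs in grouped.items()
    }

def classify(centroids, vec):
    """Answer the question 'who played this?' with the closest centroid."""
    return min(centroids, key=lambda player: dist(centroids[player], vec))

# Toy training data: two invented players, three invented features each.
training = [
    ("PlayerA", [0.9, 0.1, 0.0]),
    ("PlayerA", [0.8, 0.2, 0.1]),
    ("PlayerB", [0.1, 0.7, 0.6]),
    ("PlayerB", [0.0, 0.9, 0.5]),
]
model = train(training)
print(classify(model, [0.85, 0.15, 0.05]))
```

The point is only the shape of the process: previous replays go in once, and afterwards any new replay can be queried against what was learned.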

As a test, I've applied the current machine to Nony's 6 replays from the Replays section. It's a good sign that all replays have been classified successfully as Nony's.

The confidence is basically the number of samples I had available to train the machine. The more samples the machine has seen, the better it should be at classifying new examples. Of course, it's not a one-to-one correlation: a machine trained on only a few samples might still end up being amazingly accurate.

The similarity measure is how closely the machine thinks a replay matches a given player. A value of 1 means a pretty damn close match (as shown below, with Nony). A value of -1 means a very poor match.
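
The post does not say how the similarity score is computed. As a rough illustration of a score that runs from -1 to 1, here is cosine similarity in Python; this is an assumed example measure, not necessarily the one used for the graphs, and `cosine_similarity` is a name chosen for the sketch.

```python
from math import sqrt

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors.

    1 means the vectors point the same way, -1 means opposite directions,
    and 0 means they are orthogonal.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [2.0, 0.0]))   # same direction
print(cosine_similarity([1.0, 0.0], [-3.0, 0.0]))  # opposite direction
```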

I'm still working on making the machine logistics easier to deal with (organizing replays and compiling data). Basically, that means I'm trying to write scripts that automate a lot of things. That's been my focus lately more than tweaking and improving the algorithm itself.

That said, here are some new results.

I put the six new Nony replays (found in the Replays section) through the machine and got the following results.

An overwhelming vote for Nony. Whew, it didn't get it wrong.

The classifier used for these graphs was trained on 502-dimensional feature vectors extracted from replays of 9 well-known StarCraft Protoss players. In this example, the machine clearly points to Nony. In harder examples, other players may show close resemblances, which makes the call less clear-cut.

ilovejonn

Canada, 2548 Posts

November 25 2008 03:49 GMT

can you tell us what the graphs show us and what they mean?
ie: what does similarity represent and what does confidence represent?

StRyKeR

United States, 1739 Posts

November 25 2008 03:58 GMT

On November 25 2008 12:49 ilovejonn wrote:
can you tell us what the graphs show us and what they mean?
ie: what does similarity represent and what does confidence represent?

Oh, right. I updated my post for first-timers to my blog.

SonuvBob

Aiur, 21549 Posts

November 25 2008 04:20 GMT

Nice, I've been hoping for something like this for a while.

ilovejonn

Canada, 2548 Posts

November 25 2008 04:34 GMT

Ahh, I see. Thanks. Pretty nifty app. xD

overpool

United States, 191 Posts

November 25 2008 04:46 GMT

That's an amazing idea. Are you planning on releasing any source code?

StRyKeR

United States, 1739 Posts

November 25 2008 06:45 GMT

On November 25 2008 13:46 overpool wrote:
That's an amazing idea. Are you planning on releasing any source code?

Basically, I have some MATLAB code (you can think of MATLAB as a sophisticated calculator) that takes input data, trains a machine using it, and can classify new examples.
I also have some PHP code that converts replay files into 502-dimensional feature vectors, but the actual code that reads a replay file was borrowed from Taiche's RepASM library.

In my opinion, the core components are the PHP feature selection and the MATLAB learning algorithm, both of which are more idea- and concept-driven than code-related. In other words, I could release the code, but it wouldn't tell you that much because the building blocks are already out there. The RepASM library, as I mentioned, is on Taiche's website, and machine learning code (Support Vector Machine code) is all over the web.
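
To give a feel for what Support Vector Machine code does, here is a minimal Python sketch of a two-class linear SVM trained by subgradient descent on the hinge loss. It is not the MATLAB code described above; the function names, learning rate, regularisation strength, and the 2-number toy vectors are all assumptions made for the example, and a real package would typically add kernels and a proper solver.

```python
def train_linear_svm(samples, labels, epochs=200, lr=0.1, lam=0.01):
    """Fit weights w and bias b for a linear SVM.

    Minimises the L2-regularised hinge loss by subgradient descent.
    labels must be +1 or -1. All hyperparameters are illustrative.
    """
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:
                # Sample is misclassified or inside the margin:
                # move the boundary toward classifying it correctly.
                w = [wi + lr * (y * xi - lam * wi) for wi, xi in zip(w, x)]
                b += lr * y
            else:
                # Sample is safely classified: only apply regularisation.
                w = [wi - lr * lam * wi for wi in w]
    return w, b

def decision_value(w, b, x):
    """Signed score; its sign gives the predicted class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict(w, b, x):
    return 1 if decision_value(w, b, x) >= 0 else -1

# Toy data: +1 stands for "this player", -1 for "someone else".
samples = [[2, 1], [1, 2], [2, 2], [-1, -2], [-2, -1], [-2, -2]]
labels = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(samples, labels)
print(predict(w, b, [3, 3]), predict(w, b, [-3, -3]))
```

With more than two players, one common approach is to train one such classifier per player (that player versus everyone else) and compare the decision values.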

What would be interesting is how I came up with 502 features, I suppose.

BottleAbuser

Korea (South), 1888 Posts

November 25 2008 09:01 GMT

Indeed, tell us! I wasn't aware that there are so many attributes of Starcraft playstyle that could be easily quantified.
