|
On April 22 2012 12:00 zylog wrote: This is awesome. I wonder if this can somehow be integrated into sc2gears to add additional information for replay analyses.
That would be really sick! Awesome work OP
|
Now you got me curious, but I'm still not sure about the actual use of the information.
|
On April 22 2012 09:23 RobiTL wrote: Ok I'm 100% sure there is all the data that we want :D
Values are stored doubled!
0x28 => 40 => 20
0x50 => 80 => 40
0xF001 => 240 + 0 => 120
0x9802 => 152 + 1*128 = 280 => 140
0xc002 => 192 + 1*128 = 320 => 160
0x9003 => 144 + 2*128 = 400 => 200
0xe003 => 224 + 2*128 = 480 => 240
General rule: YY1 YY2 gives (YY1 + (YY2 - 1) * 128) / 2
Graph data is laid out like this:
02 04 00 04 09 [XX] 05 06 00 09 [YY1 YY2 YY3 ...]
XX is the X coordinate: 00, 02, 04, ..., 3C
YY is the Y coordinate, encoded as stated above.
Average unspent resources is preceded by 05040009ce0f02090802001e00
Resource collection rate is preceded by 05040009ce0f02090a02001e00
05040009ce0f02090a02001e00 ==> resource collection rate
0205060009 f018 02040004099a1300 => player 1's = f018
0205060009 ba1e 02040004099a1300 => player 2's = ba1e
0000 x 12 => players 3 through 14, one 0000 each
00 => end
The format of the data looks like the serialized data format used elsewhere in replay files. See https://github.com/GraylinKim/sc2reader/wiki/Serialized-Data
0x05 is a Key-Value object, so average unspent resources is a Key-Value object with 2 entries. The first key (key 0x0) is a variable-length integer (0x09). The second key (key 0x02) is also a variable-length integer (0x09). sc2reader has a parser for this format written in Python, and I also have a parser of my own written in JavaScript: https://github.com/tec27/comsat/blob/master/lib/parsers/blizzerial.js
Hopefully that helps you see what the prefixes are for and stuff
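To make the prefixes concrete, here is a minimal sketch (not sc2reader's or comsat's actual code) that decodes the doubled variable-length integers from the quoted post and the two type bytes named above, 0x05 and 0x09. The function names are made up for illustration. Two things are assumptions: that integers longer than two bytes keep using 7 bits per byte, and that the entry count and keys of a Key-Value object use the same doubled encoding. The second assumption matches the `{ 0: 999, 1: ...}` keys posted later in the thread.

```python
def read_vint(data, pos=0):
    """Decode one doubled variable-length integer at data[pos].

    Each byte carries 7 bits; the high bit means "another byte follows".
    The assembled number is halved, as in the rule from the quoted post.
    Returns (value, next_pos).
    """
    raw = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        raw |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            break
    return raw >> 1, pos


def read_struct(data, pos=0):
    """Decode a value made only of 0x05 (Key-Value) and 0x09 (integer) types."""
    tag = data[pos]
    pos += 1
    if tag == 0x09:
        return read_vint(data, pos)
    if tag == 0x05:
        count, pos = read_vint(data, pos)  # assumption: count is doubled too
        result = {}
        for _ in range(count):
            key, pos = read_vint(data, pos)  # assumption: keys are doubled too
            result[key], pos = read_struct(data, pos)
        return result, pos
    raise ValueError("type byte 0x%02x is not handled in this sketch" % tag)


# Values from the quoted post
print(read_vint(bytes.fromhex("28"))[0])    # 20
print(read_vint(bytes.fromhex("f001"))[0])  # 120

# First 9 bytes of the "average unspent resources" prefix
print(read_struct(bytes.fromhex("05040009ce0f020908"))[0])  # {0: 999, 1: 4}
```

This is Python 3 (it indexes into `bytes`). The other type bytes in the stream are left out on purpose, since the thread doesn't spell them out.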
|
Hi, so I added a couple fixes to the code mentioned above. The wiki hasn't been updated but the code is very simple and can be found on the master branch of the repo (here).
The uncompressed data appears to be a list of structures. I had 9 in the example I was working from, the number may vary based on the number of players, not sure yet. An example python script I am using can be found below:
import zlib, sys, pprint
from sc2reader.utils import ReplayBuffer

with open(sys.argv[1], "rb") as s2gs:
    # Skip the first 16 bytes, then inflate the rest
    raw_data = zlib.decompress(s2gs.read()[16:])

buffer = ReplayBuffer(raw_data)
# The example file held 9 top-level structures
data = [buffer.read_data_struct() for _ in range(9)]
pprint.PrettyPrinter(indent=2).pprint(data)
|
OK, I installed your Python thingy; here is what I got.
On the 4th structure:
{ 0: 999, 1: 0} resources [{ 0: resources, 1: 0, 2: game time}]
{ 0: 999, 1: 1} units
{ 0: 999, 1: 2} structures
{ 0: 999, 1: ...} ...
On the 5th structure (weird that it is in another structure, especially the razed count):
{ 0: 999, 1: 10} is structure razed count? { 0: structure razed count, 1: 0, 2: game time}
{ 0: 999, 1: 11} is resource graph [{ 0: resource value, 1: 0, 2: game time}, ...]
{ 0: 999, 1: 12} is army graph [{ 0: army value, 1: 0, 2: game time}, ...]
{ 0: 999, 1: 16777328} is unknown??
So we also have the graph data now, only need to find the build order information now...
|
On both replays, the AI did:
00:03 drone 7/10
00:22 drone 8/10
00:44 drone 8/10
On the first I did:
00:05 drone 7/10
00:19 drone 8/10
00:30 drone 9/10
00:49 over 9/10
00:57 drone 10/10
On the second I did:
00:02 drone 7/10
00:18 drone 8/10
00:28 drone 9/10
00:39 drone 10/10
00:56 over 10/10
I think the build order data (the only changes I could find across all structures) is in the 5th structure, after the graphs!
Data of the first replay :
{ 0: { 0: 999, 1: 0x1000084},
  1: [ [ { 0: 7, 1: 0x1000A, 2: 0x4900},
         { 0: 8, 1: 0x2000A, 2: 0x12F00},
         { 0: 9, 1: 0x4000A, 2: 0x1DE00},
         { 0: 10, 1: 0x7000A, 2: 0x38C00}],
       [ { 0: 7, 1: 0xA, 2: 0x3200},
         { 0: 8, 1: 0x3000A, 2: 0x16700}],
...
{ 0: { 0: 999, 1: 0x1000086},
  1: [ [ { 0: 9, 1: 0x6000A, 2: 0x31600}],
       [ { 0: 8, 1: 0x5000A, 2: 0x2C400}],
Data of the second replay :
{ 0: { 0: 999, 1: 0x1000084},
  1: [ [ { 0: 7, 1: 0xA, 2: 0x1C00},
         { 0: 8, 1: 0x2000A, 2: 0x11C00},
         { 0: 9, 1: 0x4000A, 2: 0x1C400},
         { 0: 10, 1: 0x5000A, 2: 0x27600}],
       [ { 0: 7, 1: 0x1000A, 2: 0x3200},
         { 0: 8, 1: 0x3000A, 2: 0x16700}],
...
{ 0: { 0: 999, 1: 0x1000086},
  1: [ [ { 0: 10, 1: 0x7000A, 2: 0x37800}],
       [ { 0: 8, 1: 0x6000A, 2: 0x2C400}],
Edit: I put the values of 1: and 2: in hexadecimal; it's easier to read them that way.
I think 0x84 is for drones and 0x86 for overlords.
0: is the supply (7, 8, 9, 10).
1: not sure... the trailing A might be the total supply (10).
2: I think is the time, where:
0x1C = 00:02
0x32 = 00:03
0x49 = 00:05
0x11C = 00:18
0x12F = 00:19
0x167 = 00:22
0x1C4 = 00:28
0x1DE = 00:30
0x276 = 00:39
0x2C4 = 00:44
0x316 = 00:49
0x378 = 00:56
0x38C = 00:57
Can anyone work out the general formula?
Edit: Found! It's in 16ths of a second! 0x1DE = 478 and 478/16 = 29.875, so 00:30.
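The conversion in the post above can be sketched in a few lines. This is only an illustration under the post's own guesses: the low byte of field 2 is dropped (as in the table of times), the low byte of field 1 is treated as the supply cap, and 0x84 / 0x86 are taken to mean drone / overlord. The names `frames_to_clock`, `describe` and `UNIT_GUESSES` are made up here.

```python
FRAMES_PER_SECOND = 16

# Guesses from the post above, not confirmed unit IDs.
UNIT_GUESSES = {0x84: "drone", 0x86: "overlord"}


def frames_to_clock(field2):
    """Turn field 2 of a build order entry into mm:ss (rounded to nearest)."""
    frames = field2 >> 8  # drop the low byte, e.g. 0x1DE00 -> 0x1DE
    seconds = int(frames / float(FRAMES_PER_SECOND) + 0.5)
    return "%02d:%02d" % divmod(seconds, 60)


def describe(unit_key, entry):
    """Format one entry like the hand-written build orders in the post."""
    unit = UNIT_GUESSES.get(unit_key & 0xFF, "unknown")
    supply_cap = entry[1] & 0xFF  # guess: the trailing 0x0A is total supply
    return "%s %s %d/%d" % (frames_to_clock(entry[2]), unit, entry[0], supply_cap)


print(describe(0x1000084, {0: 9, 1: 0x4000A, 2: 0x1DE00}))  # 00:30 drone 9/10
```

Rounding to the nearest second reproduces every time in the list above, which is a decent sanity check on the 1/16 s unit.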
|
On April 22 2012 12:21 Marti wrote: Now you got me curious, but I'm still not sure about the actual use of the information.
It's reliable stats. Think if a tournament like MLG or DH gets all the s2gs files. They can then, for each player, calculate the most used build order or stuff like that. It would be possible to see trends in the metagame with build orders. It's just more stats which you can't get out of the replay files, although they are "harder" to retrieve for a big tournament because s2gs files aren't made to be shared the way replay files are.
|
We should create a collection of example files so we can test them out. I'll start: stats0.s2gs stats1.s2gs stats2.s2gs stats3.s2gs
| File | Date       | Time     | Team | Race | Player       | BNet ID       | Resources | Units | Structures | Overview |
|------+------------+----------+------+------+--------------+---------------+-----------+-------+------------+----------|
| 0    | 2012-04-18 | 21:17:06 | 1    | Z    | PetaleDeRose | /EU/1/2012869 | 23900     | 40725 | 6125       | 71950    |
| 0    |            |          | 2    | P    | Hashmush     | /EU/1/2158213 | 22025     | 25200 | 4350       | 53025    |
| 1    | 2012-04-15 | 18:51:06 | 1    | P    | Hashmush     | /EU/1/2158213 | 5075      | 7450  | 1475       | 14100    |
| 1    |            |          | 2    | P    | LemoN        | /EU/2/146521  | 4300      | 4200  | 1400       | 10000    |
| 2    | 2012-04-15 | 15:44:51 | 1    | P    | imbaL        | /EU/1/2407329 | 24400     | 40250 | 4625       | 71025    |
| 2    |            |          | 2    | P    | Hashmush     | /EU/1/2158213 | 26875     | 52150 | 5900       | 87075    |
| 3    | 2012-19-22 | 22:24:38 | 1    | P    | overgame     | /EU/1/535265  | 17350     | 38525 | 4875       | 62650    |
| 3    |            |          | 1    | Z    | Sun          | /EU/1/756106  | 18331     | 38300 | 4100       | 62131    |
| 3    |            |          | 2    | P    | Hashmush     | /EU/1/2158213 | 16950     | 28550 | 2950       | 49150    |
| 3    |            |          | 2    | Z    | Freemind     | /EU/1/838056  | 18550     | 20150 | 2050       | 42450    |
EDIT: Meh, formatting
I'm only able to parse stats3.s2gs with Graylin's script; the others fail with:

Traceback (most recent call last):
  File "extract.py", line 89, in <module>
    main()
  File "extract.py", line 80, in main
    buffer.read_data_struct()
  File "build\bdist.win-amd64\egg\sc2reader\utils.py", line 270, in read_data_struct
    datatype = self.read_byte()
  File "build\bdist.win-amd64\egg\sc2reader\utils.py", line 150, in read_byte
    return ord(self.read_basic(1))
TypeError: ord() expected a character, but string of length 0 found
My modified script:

import zlib, sys, pprint
from sc2reader.utils import ReplayBuffer

races = {'Prot': 'Protoss', 'Zerg': 'Zerg', 'Terr': 'Terran', 'RAND': 'Random'}

# Short stat names, in the order the stats are read below
data_names = ['R', 'U', 'S', 'O', 'AUR', 'RCR', 'WC', 'UT', 'KUC', 'SB', 'SRC']
data_names_pretty = [
    'Resources', 'Units', 'Structures', 'Overview',
    'Average Unspent Resources', 'Resource Collection Rate',
    'Workers Created', 'Units Trained', 'Killed Unit Count',
    'Structures Built', 'Structures Razed Count',
]

def getRealm(marker):
    # Only the EU marker is known so far
    if marker == '\x00\x00S2':
        return "EU"
    return "?"

def getPlayers(data):
    players = []
    parr = data[0][3]
    for i in range(16):
        # Skip slots whose first field is 0 (these appear to be empty)
        if not (parr[i][0][1] == 0):
            players.append(getPlayer(data, i))
    return players

def getPlayer(data, index):
    pinfo = data[0][3][index]
    pdata = data[3][0]

    player = {
        'id': "{}/{}/{}".format(getRealm(pinfo[0][1][0][1]),
                                pinfo[0][1][0][2],
                                pinfo[0][1][0][3]),
        'race': races[pinfo[2]],
    }

    stats = {}
    # Stats stored in the 4th structure
    for i in range(len(pdata)):
        stats[data_names[i]] = pdata[i][1][index][0][0]
    # The razed count lives in the 5th structure
    stats[data_names[len(pdata)]] = data[4][0][0][1][index][0][0]
    player['stats'] = stats

    return player

def main():
    # Pass "players" as the second argument for the per-player summary
    p = len(sys.argv) > 2 and sys.argv[2] == "players"
    with open(sys.argv[1], "rb") as s2gs:
        raw_data = zlib.decompress(s2gs.read()[16:])
    buffer = ReplayBuffer(raw_data)
    data = [buffer.read_data_struct() for _ in range(9)]
    if p:
        pprint.PrettyPrinter(indent=2).pprint(getPlayers(data))
    else:
        pprint.PrettyPrinter(indent=2).pprint(data)

main()
It outputs the following for stats3.s2gs:

[ { 'id': 'EU/1/535265',
    'race': 'Protoss',
    'stats': { 'AUR': 958, 'KUC': 82, 'O': 62650, 'R': 17350, 'RCR': 1384, 'S': 4875, 'SB': 34, 'SRC': 5, 'U': 38525, 'UT': 96, 'WC': 61}},
  { 'id': 'EU/1/756106',
    'race': 'Zerg',
    'stats': { 'AUR': 794, 'KUC': 105, 'O': 62131, 'R': 18331, 'RCR': 1360, 'S': 4100, 'SB': 17, 'SRC': 7, 'U': 38300, 'UT': 183, 'WC': 70}},
  { 'id': 'EU/1/838056',
    'race': 'Zerg',
    'stats': { 'AUR': 1134, 'KUC': 20, 'O': 42450, 'R': 18550, 'RCR': 1354, 'S': 2050, 'SB': 17, 'SRC': 0, 'U': 20150, 'UT': 169, 'WC': 72}},
  { 'id': 'EU/1/2158213',
    'race': 'Protoss',
    'stats': { 'AUR': 587, 'KUC': 49, 'O': 49150, 'R': 16950, 'RCR': 1130, 'S': 2950, 'SB': 28, 'SRC': 0, 'U': 28550, 'UT': 90, 'WC': 45}}]
|
I don't have time to migrate documentation to the teamliquid wiki right now. Anyone who wants to compile a team liquid wiki can feel free to steal from the sc2reader wiki. Just leave a back reference to the project if you can.
Also, how can I get quoted text to preserve link markup?
On April 22 2012 21:29 RobiTL wrote: Edit: Found! It's in 16ths of a second! 0x1DE = 478 and 478/16 = 29.875, so 00:30.
Yes, SC2 runs at 16 frames per second. You can use the game speed conversion to get real time if you want.
I should also mention that unit types on the build order are almost certainly integers. They should match up to the hex values in this data file. The types are also accessible programmatically, see the TargetAbilityEvent for an example.
There are a bunch of other oddities that might apply here. The game date is a Windows timestamp, which has a weird conversion function.
At this point I wish I had maintained my documentation better. Given that we're sitting here parsing a compressed byte stream I am hoping that reading the code isn't too much of an issue.
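For the date conversion, here is a minimal sketch that assumes "Windows timestamp" means a Windows FILETIME, i.e. a count of 100-nanosecond ticks since 1601-01-01 UTC. Whether the s2gs game date really uses that layout, and where it sits in the structures, is not established in this thread. The function name is made up for illustration.

```python
from datetime import datetime, timedelta

# FILETIME counts 100 ns ticks from this instant (UTC).
FILETIME_EPOCH = datetime(1601, 1, 1)


def filetime_to_datetime(filetime):
    """Convert a FILETIME tick count to a naive UTC datetime."""
    # 10 ticks of 100 ns make one microsecond
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


# The Unix epoch expressed as a FILETIME, as a sanity check
print(filetime_to_datetime(116444736000000000))  # 1970-01-01 00:00:00
```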
|
@Prillian You only need to read 5 data structures (the interesting stuff is in the 4th and 5th). On some files, trying to read more raises the TypeError from ord().
|
On April 22 2012 22:46 RobiTL wrote: @Prillian You only need to read 5 data structures (the interesting stuff is in the 4th and 5th). On some files, trying to read more raises the TypeError from ord().
Ok, thanks. Works a lot better now ^^
|
On April 22 2012 22:46 RobiTL wrote: @Prillian You only need to read 5 data structures (the interesting stuff is in the 4th and 5th). On some files, trying to read more raises the TypeError from ord().
The ord error is an unchecked way of letting you know you hit the end of the buffer. The class could benefit from a more defensive programming style...
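Until the class checks for this itself, one workaround is to stop reading at the first TypeError instead of hard-coding a count. This is a rough sketch, not part of sc2reader: `read_all_structs` and its `limit` parameter are made up here, and it relies only on `read_data_struct()` raising TypeError at the end of the buffer, as seen in the traceback earlier in the thread.

```python
def read_all_structs(buffer, limit=64):
    """Read structures until the buffer runs dry or `limit` is reached."""
    structs = []
    for _ in range(limit):
        try:
            structs.append(buffer.read_data_struct())
        except TypeError:
            # Assumption: running off the end of the buffer shows up as
            # the TypeError from ord(), as reported in this thread.
            break
    return structs
```

Usage would be `data = read_all_structs(ReplayBuffer(raw_data))`. The catch is that a TypeError from a real bug gets swallowed too, and a structure that is cut off halfway is silently dropped, so this is a stopgap for exploring files and not a fix.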
|
ahhhh nerds and the internet, never ceases to amaze. Good work!
|
Two things.
1. This is an awesome project that will most likely yield great results.
2. Holy shit the TL community can work fast when it wants to.
♥
|
Amazing stuff guys, keep it up. I have a lay (idiot) question. If I wanted to see my changing spending quotient over my last hundred games, do I have to go back through my games played and open the stats page one by one to generate the files, and then run a program of some sort at the end of the process? Or would I need to get the program to hit the files as they were spat out? I guess I'm basically asking when the cache folder gets cleaned...
|
My player struct looks like this:
{ 0: { 0: 0,
       1: { 0: { 0: 2, 1: '\x00\x00S2', 2: 1, 3: 535265},
            1: { 0: 3405691582L, 1: 11617005556283211776L}}},
  1: { 0: 0},
  2: 'Prot'}
The Bnet id can be constructed from
bnet = { 0: 2, 1: '\x00\x00S2', 2: 1, 3: 535265}
like this (pseudo-code)
bnet[1]/bnet[2]/bnet[3] = EU/1/535265
I believe that '\x00\x00S2' has to do with which server it's on. I'd like to see what it is for the other realms.
It coincides with the bnet part of the player entry in replays: replay.details
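The pseudo-code above, written out as a small runnable sketch. `REALM_MARKERS` and `bnet_id` are made-up names, and the table holds only the one marker seen so far; other realms fall back to "?" until someone posts their values.

```python
# Hypothetical lookup: only the EU marker has shown up in this thread.
REALM_MARKERS = {'\x00\x00S2': 'EU'}


def bnet_id(bnet):
    """Build the realm/region/id string from the bnet part of a player struct."""
    realm = REALM_MARKERS.get(bnet[1], '?')
    return "{}/{}/{}".format(realm, bnet[2], bnet[3])


bnet = {0: 2, 1: '\x00\x00S2', 2: 1, 3: 535265}
print(bnet_id(bnet))  # EU/1/535265
```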
|
On April 22 2012 23:30 Dapper_Cad wrote: Amazing stuff guys, keep it up. I have a lay (idiot) question. If I wanted to see my changing spending quotient over my last hundred games, do I have to go back through my games played and open the stats page one by one to generate the files, and then run a program of some sort at the end of the process? Or would I need to get the program to hit the files as they were spat out? I guess I'm basically asking when the cache folder gets cleaned...
Test it out. Grab all the .s2gs files and see how old they are.
Note that we're not done yet, so not all of the desired information is available.
|
On April 22 2012 23:30 Dapper_Cad wrote: Amazing stuff guys, keep it up. I have a lay (idiot) question. If I wanted to see my changing spending quotient over my last hundred games, do I have to go back through my games played and open the stats page one by one to generate the files, and then run a program of some sort at the end of the process? Or would I need to get the program to hit the files as they were spat out? I guess I'm basically asking when the cache folder gets cleaned...
I think it can be implemented in sc2gears or some other program, to create that file for every auto-saved replay after the update.
|
I'll have a look when I get home. Regardless, great, GREAT things will come of this, as long as it keeps going. As an aside, it's both odd and depressing that one of the colossal advantages esports should have over regular sports is automated gathering of statistics with unprecedented breadth and depth, and that's exactly what Acti-Bliz's constipated attitude toward IP gets in the way of. Whatever, can't emphasize enough: fantastic work.