Serialization Sucks - Rant Blog

CecilSunkure

United States2829 Posts

December 27 2013 07:42 GMT

Serialization is a computer science term, so this is a computer sciency rant blog.

So I've spent a lot of time studying serialization and the impact it has on development cycles. A dev cycle is the time it takes to tweak something in a project and actually test the change in a meaningful way. For games this usually involves a designer modifying some gameplay aspects and then trying them out in-game.

The worst-case dev cycle for a game designer would involve recompiling the entire game, reloading the game's executable, and then running around in the game to reach a specific point in order to test the modification.

Many improvements can be made here and one of the more important ones is to make sure that the dev cycles of your software engineers are really short. Software engineers are fairly expensive, so you want them to be spending their time wisely.

Whenever you save in a game and reload, you are writing information to a file and reading it back. This is serialization to and from a file. You can also serialize data to/from things like strings or networks.
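As a minimal sketch of the string case: the `Vec2` type and the `toString`/`fromString` helpers below are made-up names for illustration, not code from my engine. The idea is just that the object's state goes out as text and can be rebuilt from that text later.

```cpp
#include <sstream>
#include <string>

// Hypothetical data type used only for illustration.
struct Vec2 {
  float x = 0.0f;
  float y = 0.0f;
};

// Serialize: turn the object's state into text.
std::string toString(const Vec2& v) {
  std::ostringstream out;
  out << v.x << ' ' << v.y;
  return out.str();
}

// Deserialize: rebuild the object from that text.
// Returns false if the text does not start with two floats.
bool fromString(const std::string& text, Vec2& v) {
  std::istringstream in(text);
  return static_cast<bool>(in >> v.x >> v.y);
}
```

Swapping the string streams for file streams gives the save/load case; the calling code would look the same.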

From Wikipedia:

In computer science, in the context of data storage and transmission, serialization is the process of translating data structures or object state into a format that can be stored (for example, in a file or memory buffer, or transmitted across a network connection link) and reconstructed later in the same or another computer environment.

Serialization is pretty non-trivial (in C and C++) and a lot of work can be put into systems to serialize things. Oftentimes these systems are extremely annoying to use. When a new data type is introduced, a software engineer may be required to write custom code to serialize it. This results in extremely repetitive code, which wastes a whole lot of money.

Here's an example of a game object I just serialized (my own custom format):

GameObject
{
  Sprite
  {
    m_tx Transform
    {
      p Vec2
      {
        x float 0.000000
        y float 0.000000
      }
      o float -3.132142
      s Vec2
      {
        x float 5.000000
        y float 5.000000
      }
      zOrder int 0
    }
    m_draw bool true
    m_texture Texture* "test1.png"
  }

  LuaComponent
  {
    isCoroutine bool true
    name std::string "testcomponent"
  }
}

One way to fix this inefficiency is with the use of an introspection library. This allows code to understand its own types of data during run-time, and either operate on data in a generic way, or generate type-specific serialization code. Either way the result is often that programmers are required to do very minimal work in order to serialize a new data type.

I'm ranting because I've spent a very long time studying and creating my own serialization routines. Even so I still have to spend a little bit of time registering new data types. This is mostly due to the fact that my university owns all of my class-related code, which resulted in me rewriting a shitload of code.

Now I'm at a point where I'm duplicating code that is already fairly redundant; that is, I'm rewriting some serialization registration stuff for like the 4th time.

It's not even hard or time consuming, I'm just FUCKING TIRED of writing the same code.

But all is well, soon I'll be finished and never have to do this again. That is of course unless I'm hired to write serialization code/tools somewhere... That would be hilarious.

WARNING

If any of you actually want to get into computer science as a profession (this pertains especially to those who are interested in working in the games industry) please realize that passion is something that can be taken advantage of. I can now understand the value in hiring young programmers with lots of passion; they'll be willing to do the grunt work while the higher ups kick back and relax with their pick of the best fruit. Just be careful not to get yourself taken advantage of.

edit: TRIVIA

Where did the term serialization come from!??!? I was once told that it meant "in serial", as in: in order from one byte to another. iirc this originally referred to bytes coming from a network wire in serial format. Pretty lame, I know.

MysteryMeat1

United States3288 Posts

December 27 2013 08:40 GMT

will you have my babies?

CecilSunkure

United States2829 Posts

December 27 2013 08:41 GMT

MysteryMeat1

United States3288 Posts

December 27 2013 08:44 GMT

1/5

BLinD-RawR

ALLEYCAT BLUES49479 Posts

December 27 2013 09:10 GMT

can we be friends?

Talin

Montenegro10532 Posts

December 27 2013 09:46 GMT

So are you still friends with that Brood War progamer?

CecilSunkure

United States2829 Posts

December 27 2013 10:14 GMT

On December 27 2013 18:10 BLinD-RawR wrote:
can we be friends?

yep

On December 27 2013 18:46 Talin wrote:
So are you still friends with that Brood War progamer?

he went back to korea TT, so no

Noobity

United States871 Posts

December 27 2013 15:19 GMT

So I tend to skim these blogs, because I'm fascinated by them, but my degree was in the art side of gaming and not the coding. I zone out and eventually end up comatose reading about code and stuff like that. However one thing resonated with me as something that everyone going into these smaller fields needs to know.

If any of you actually want to get into computer science as a profession (this pertains especially to those who are interested in working in the games industry) please realize that passion is something that can be taken advantage of. I can now understand the value in hiring young programmers with lots of passion; they'll be willing to do the grunt work while the higher ups kick back and relax with their pick of the best fruit. Just be careful not to get yourself taken advantage of.

The bolded part is correct to an extent, but I'd go so far as to state: "please realize that passion is something that is necessary." There are going to be those who get into any field that don't need passion to do the job, they'll get through school on their talents alone and see it as just another thing to do, but that is not you. By "you" I mean anyone going into these smaller fields. If you don't have the passion to continuously work on something to improve yourself, but want to go into making video games because it "seems cool" you'd better have some insane talent.

Please, learn from someone who made the mistake. I wanted to go into art for games and without going into the specifics I coasted through college, got through, but found that because I didn't work as hard as my peers I was at the very bottom of a long list of people looking for one of the few jobs available in the field. CecilSunkure is a great example of a guy who has passion and is doing it the right way. If you don't think you could go on an internet forum and blog about your experiences in learning, much less doing, then I highly suggest rethinking your current career path. That isn't to say that you need to blog to be successful, but if you don't think you could do it then passion may be a problem.

EDIT: It actually occurs to me that Cecil was using that term in the other way. This is what happens when you skim. I think he has a point, but I also think that if you have a passion for something you should be able to deal with being taken advantage of for a while. As long as you learn from it though, and know when to cut it off, that's when you put the power in your own hands.

Merany

France890 Posts

December 27 2013 15:54 GMT

This is the kind of situation where I'm pretty happy to work in C#!
On the project I'm working on, we just serialize data to XML files: I just have to update an XSD file and let the call serializer.Serialize() do all the work for me. Just have 2+ years experience as dev though and haven't done C / C++ for quite a while, can't comment too much on your specific rant

fonger

United Kingdom1218 Posts

December 27 2013 16:44 GMT

#10

If you're working a project solo or have the lead you can just push as much as possible into POD types, then write a few basic procedures for serialising dynamic types (strings etc) to close the gaps. This can involve making calls like storing references to other world objects as UIDs/hard offsets into an array instead of holding the pointer (or have every serialisable object hold and expose its own UID for reconstruction).
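A minimal sketch of that idea, with assumptions stated up front: `Unit`, `kNoTarget`, `saveWorld`, `loadWorld` and `resolve` are hypothetical names, and the raw-byte format is only valid on the same compiler and architecture that wrote it.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical world object kept as plain old data: it refers to its target
// by index into the world array instead of holding a pointer.
struct Unit {
  std::uint32_t uid;        // this object's own index in the world array
  std::uint32_t targetUid;  // index of another Unit, or kNoTarget
  float x;
  float y;
};
constexpr std::uint32_t kNoTarget = 0xFFFFFFFFu;

static_assert(std::is_trivially_copyable<Unit>::value,
              "Unit must stay trivially copyable to be written as raw bytes");

// Write the whole array as raw bytes: a count followed by the elements.
std::vector<unsigned char> saveWorld(const std::vector<Unit>& world) {
  const std::uint32_t count = static_cast<std::uint32_t>(world.size());
  std::vector<unsigned char> bytes(sizeof count + count * sizeof(Unit));
  std::memcpy(bytes.data(), &count, sizeof count);
  if (count != 0) {
    std::memcpy(bytes.data() + sizeof count, world.data(),
                count * sizeof(Unit));
  }
  return bytes;
}

// Read the array back. Returns false if the buffer size does not match
// the element count stored at its start.
bool loadWorld(const std::vector<unsigned char>& bytes,
               std::vector<Unit>& world) {
  std::uint32_t count = 0;
  if (bytes.size() < sizeof count) return false;
  std::memcpy(&count, bytes.data(), sizeof count);
  if (bytes.size() - sizeof count !=
      static_cast<std::size_t>(count) * sizeof(Unit)) {
    return false;
  }
  world.resize(count);
  if (count != 0) {
    std::memcpy(world.data(), bytes.data() + sizeof count,
                count * sizeof(Unit));
  }
  return true;
}

// Turn a stored UID back into a pointer after loading.
Unit* resolve(std::vector<Unit>& world, std::uint32_t uid) {
  return uid < world.size() ? &world[uid] : nullptr;
}
```

Because `targetUid` is an index rather than an address, the link between two units survives the round trip even though the reloaded array lives somewhere else in memory.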

This still leaves you with hardcoding the list of serialise this and that either way, but I have a terrible idea to solve that too. Use RAII for everything, then expose a GLOBAL serialiser which is TESTED from both your constructor and destructor. If it's found and found to be valid in context, the constructor can almost automatically serialise in and the destructor can serialise out.

This involves a reasonably smart serialiser which can hold a sort of "stack" of in/out objects (or expected objects) but after that assuming perfect RAII (except a struct with the POD info maybe which would have to be hardcoded per class) all serialisation would be automatic. This is terrible, probably breaks every rule in the book and most likely collapses horribly whenever an exception is thrown but I think it's cool o_o
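A toy sketch of the constructor/destructor trick, assuming a single hypothetical `Player` type and a hypothetical global `g_archive`. It has no "stack" of expected objects and no exception handling, so treat it as an illustration of the idea only.

```cpp
#include <sstream>

// Hypothetical global archive. When the pointer is null, objects behave
// like ordinary objects and nothing is serialized.
struct Archive {
  enum class Mode { Load, Save };
  Mode mode = Mode::Save;
  std::stringstream stream;
};
Archive* g_archive = nullptr;

// Hypothetical game type that tests for the archive in both its
// constructor and its destructor.
struct Player {
  int hp = 100;

  Player() {
    // Serialize in: only when an archive is present and in Load mode.
    if (g_archive && g_archive->mode == Archive::Mode::Load) {
      g_archive->stream >> hp;
    }
  }

  ~Player() {
    // Serialize out: only when an archive is present and in Save mode.
    if (g_archive && g_archive->mode == Archive::Mode::Save) {
      g_archive->stream << hp << ' ';
    }
  }
};
```

With several objects the write order is destruction order, which is the reverse of construction order for locals and members. That mismatch is one reason the serialiser would need the stack of expected objects.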

makmeatt

2024 Posts

December 27 2013 16:47 GMT

#11

On December 28 2013 00:54 Merany wrote:
This is the kind of situation where I'm pretty happy to work in C#!
On the project I'm working on, we just serialize data to XML files: I just have to update an XSD file and let the call serializer.Serialize() do all the work for me. Just have 2+ years experience as dev though and haven't done C / C++ for quite a while, can't comment too much on your specific rant

It's always important to remember that in many cases an easy-to-use interface hides a very complicated structure built for generic use and thus its performance might be found subpar compared to hand-made solutions (fml, java). Nevertheless, things like serialization are not the kind of thing you'll be performing thousands of times a second (I think?), so you might as well use an already provided solution. I heard Boost has some neat serialization module, doesn't it?

Steveling

Greece10806 Posts

December 27 2013 16:58 GMT

#12

Serialization is trying to publish a manga on shonen jump.
That's what bakuman taught me.

tarpman

Canada717 Posts

December 27 2013 19:45 GMT

#13

To me, serialization seems like a great example of an already-solved problem that isn't really worth the time solving again. Boost has it. GLib has it. ASN.1 BER has been around forever (and then some); among the libraries implementing it are GNU libtasn1 and OpenLDAP libldap.

Since you said you're studying serialization I'm sure you have particular goals or requirements that make writing your own code a necessity, but in a real-life project I would always use one of these solutions before rolling my own... I'm sorry you have to spend your valuable time on this!

Takkara

United States2503 Posts

December 27 2013 21:44 GMT

#14

I think you have the etymology of the word serialization correct. I've always understood it in the context of writing an object to stream. An object is a "3 dimensional object" and needs to be converted into a serial stream of bits to be saved to disk or for transport over a network, hence serialization of the data.

I don't know if you've had to deal with this quite yet, but I think it is far far worse to deal with backwards compatibility with respect to serialization than to deal with the boilerplate code itself. Certainly it's annoying to write serialization methods, but having to track and update old serialization code is so obnoxious and can lock pieces of the architecture in place longer than you'd like because it would badly break old serialization.

tec27

United States3690 Posts

December 28 2013 00:44 GMT

#15

Uh. https://code.google.com/p/protobuf/

tarpman

Canada717 Posts

December 28 2013 02:07 GMT

#16

On December 28 2013 09:44 tec27 wrote:
Uh. https://code.google.com/p/protobuf/

I knew I was forgetting an important one. *headdesk*

CecilSunkure

United States2829 Posts

December 28 2013 04:09 GMT

#17

It is true that there are tools for serialization already, but companies and studios still always seem to have their own. Different projects have different needs and requirements. Sometimes it's speed, sometimes it's a case of "not made here". Either way a fear of reinventing the wheel is silly for anyone in a research environment (like a student). It's important to learn how to accomplish very difficult tasks.

tec27

United States3690 Posts

December 28 2013 06:37 GMT

#18

You're in here whining that you had to rewrite a ton of code because you didn't own it, not because you were trying to learn something. There's hardly even a case for that, serialization is a mindless task and barely worth anyone's time. I can guarantee you that protobufs are fast enough and general enough for pretty much anyone's use case, don't even try to pull that shit. Hey look, Valve uses them because they're not morons!

I don't care if you want to waste your time reinventing the wheel, but don't come whining on TL because you made the voluntary choice to do it.

CecilSunkure

United States2829 Posts

December 28 2013 06:43 GMT

#19

Get out of my blog with your flabbergastardy. I have good reasons, and maybe if you asked for them I could have a nice discussion with you. BANNED
