XML, JSON, and Protobuf serialization performance compared

I’ve been using protobuf-net with good success for serializing objects where performance and compact data representation are primary goals, both for transmission over cellular data and for storage on mobile devices. However, I’m expanding my efforts cross-platform and am reevaluating my choice in this context to identify any pitfalls. App design considerations where serializer selection is concerned include the following (for me):

  • Minimizing storage footprint for cached data on small-form factor devices.
  • Minimizing read/write times to and from storage.
  • Minimizing data transfer times and costs for mobile broadband scenarios.
  • Serializer performance.
  • Interoperability with Windows Web Services (WCF).
  • Client cross-platform support.

The serializers I’ve considered are:

  • BinaryFormatter
  • JsonDataContract
  • JsonNewtonsoft
  • Protobuf-net
  • XmlDataContract
  • Xml

I’m not going to get into the relative pros and cons of each serializer because much has been written on the topic. The following table, posted to Stack Overflow by ahsteele, provides a nice summary:

[Image: serializertable]

As a cross-platform solution and for making web service calls the binary formatter isn’t an option because it lacks cross-platform support and produces relatively large output due to embedded type metadata. However, I included it anyway for the sake of comparison.

DataContract versus Protobuf-net

Even before running performance tests, serializer selection comes down to protobuf-net versus the data contract serializer for me. As mentioned, I’ve been using protobuf-net for quite some time now on some fairly deep object graphs. Although protobuf-net provides its own serialization attributes, such as ProtoContract and ProtoMember, the analogous DataContract and DataMember attributes can be used in their place (with an explicit Order on each DataMember), so either serializer can be used without needing to modify class implementations.
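As a rough sketch of what sharing attributes looks like, the following assumes the protobuf-net NuGet package is referenced. The Contact type and the DualSerializerDemo helper are hypothetical names made up for illustration; they are not part of the test app.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using ProtoBuf; // protobuf-net NuGet package (assumed to be referenced)

// Hypothetical type for illustration; not part of the test app.
[DataContract]
public class Contact
{
    // protobuf-net reads DataMember.Order as the field number, so each
    // member needs a unique, positive Order value.
    [DataMember(Order = 1)] public string Name { get; set; }
    [DataMember(Order = 2)] public int Age { get; set; }
}

public static class DualSerializerDemo
{
    // Round-trips through protobuf-net using only the DataContract attributes.
    public static Contact ProtoRoundTrip(Contact value)
    {
        using (var stream = new MemoryStream())
        {
            Serializer.Serialize(stream, value);
            stream.Position = 0;
            return Serializer.Deserialize<Contact>(stream);
        }
    }

    // Round-trips the same class through the data contract serializer.
    public static Contact DataContractRoundTrip(Contact value)
    {
        var serializer = new DataContractSerializer(typeof(Contact));
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, value);
            stream.Position = 0;
            return (Contact)serializer.ReadObject(stream);
        }
    }

    public static void Main()
    {
        var original = new Contact { Name = "Spock", Age = 35 };
        var viaProto = ProtoRoundTrip(original);
        var viaXml = DataContractRoundTrip(original);
        Console.WriteLine("{0} / {1}", viaProto.Name, viaXml.Name);
    }
}
```

The class definition is the only thing both serializers share; which one runs is decided purely at the call site.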

As will be shown, it is no surprise that protobuf produces smaller output and serializes faster than the data contract serializer, as these advantages are defining features of protobuf. Probably the biggest single disadvantage of protobuf-net, for me, is that it’s not standards-based and is community maintained (by Marc Gravell). Moreover, with protocol buffers you must express inheritance in your class definitions via ProtoInclude attributes and tag each class property with an explicit order. In my opinion, this makes your code more fragile when it comes to versioning. Finally, you don’t get proxy/stub generation for free by adding a service reference from Visual Studio. With some work, a custom service endpoint can be defined using ProtoEndpointBehavior to use the protobuf serializer in place of the data contract serializer. But this is simply a path that I do not want to go down, because deserializing manually is a small tax to pay to avoid potential complications stemming from cross-platform clients and changes in WCF.
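To make the inheritance point concrete, here is a minimal sketch of how a base class declares its subtypes to protobuf-net. It assumes the protobuf-net NuGet package is referenced; Person, Manager, and InheritanceDemo are hypothetical names for illustration and are not the classes used in the test app.

```csharp
using System;
using System.IO;
using ProtoBuf; // protobuf-net NuGet package (assumed to be referenced)

// Hypothetical types for illustration only.
[ProtoContract]
[ProtoInclude(100, typeof(Manager))] // each subtype is declared on the base with its own tag
public class Person
{
    [ProtoMember(1)] public string Name { get; set; }
}

[ProtoContract]
public class Manager : Person
{
    [ProtoMember(1)] public int DirectReports { get; set; }
}

public static class InheritanceDemo
{
    // Serializes via the base type and reads the value back as the base type.
    public static Person RoundTrip(Person value)
    {
        using (var stream = new MemoryStream())
        {
            Serializer.Serialize(stream, value);
            stream.Position = 0;
            return Serializer.Deserialize<Person>(stream);
        }
    }

    public static void Main()
    {
        Person result = RoundTrip(new Manager { Name = "Kirk", DirectReports = 5 });
        Console.WriteLine(result.GetType().Name);
    }
}
```

The versioning concern shows up here: adding a new subtype means editing the base class, and the tag numbers (100 and the member tags) become part of the wire format, so they cannot be renumbered later without breaking previously stored data.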

In fairness to protobuf-net, I should point out that some of the issues raised above can be addressed using the code generation features included with the distribution, which can also lead to even greater performance gains because custom code replaces reflection for serializing data types. For me it’s just not worth it. The performance gains from protobuf-net’s reflective serialization are large enough to be compelling without taking a dependency on the code generation tooling. There’s always the option to revisit this if I need to eke out that last bit of performance.

Measuring Performance

I made the following code available on GitHub for measuring and comparing the performance of the serialization methods listed above:

https://github.com/rolanday/blog/tree/master/SerializerXGames

The code measures the following:

  • Time (in milliseconds) to serialize 1000 objects of a specified type.
  • Time to deserialize the same.
  • Size (in bytes) of the serialized data.
  • Compressed (gzip) size of the serialized data.

I added compressed size because it’s a useful data point when a low storage footprint and minimal mobile broadband data usage are primary goals. Compression of course incurs a performance penalty, which is not factored into the times reported. The app also shows the serialized and deserialized output, which is useful for debugging.
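The four measurements can be sketched roughly as follows. This is not the code from the repository; it is a minimal illustration that uses only the data contract serializer from the framework, and the Sample type, the Measure class, and its method names are assumptions made up for the example.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.IO.Compression;
using System.Runtime.Serialization;

// Hypothetical type for illustration; not the Employee class from the app.
[DataContract]
public class Sample
{
    [DataMember(Order = 1)] public string Name { get; set; }
    [DataMember(Order = 2)] public int Rank { get; set; }
}

public static class Measure
{
    // Serializes a single instance to a byte array.
    public static byte[] Serialize(DataContractSerializer serializer, object item)
    {
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, item);
            return stream.ToArray();
        }
    }

    // Returns the gzip-compressed length of a buffer.
    public static long GzipSize(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            using (var gzip = new GZipStream(output, CompressionMode.Compress, true))
            {
                gzip.Write(data, 0, data.Length);
            } // disposing the GZipStream flushes the remaining compressed data
            return output.Length;
        }
    }

    public static void Main()
    {
        const int iterations = 1000;
        var serializer = new DataContractSerializer(typeof(Sample));
        var item = new Sample { Name = "Kirk", Rank = 1 };
        byte[] bytes = null;

        // Time to serialize the object `iterations` times.
        var timer = Stopwatch.StartNew();
        for (int n = 0; n < iterations; n++)
        {
            bytes = Serialize(serializer, item);
        }
        long serializeMs = timer.ElapsedMilliseconds;

        // Time to deserialize the same payload `iterations` times.
        timer.Restart();
        for (int n = 0; n < iterations; n++)
        {
            using (var stream = new MemoryStream(bytes))
            {
                serializer.ReadObject(stream);
            }
        }
        long deserializeMs = timer.ElapsedMilliseconds;

        Console.WriteLine("serialize: {0} ms, deserialize: {1} ms", serializeMs, deserializeMs);
        Console.WriteLine("size: {0} bytes, gzip: {1} bytes", bytes.Length, GzipSize(bytes));
    }
}
```

For very small payloads the gzip header and trailer can make the compressed size larger than the raw size, so the compressed figure is most meaningful on realistically sized objects.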

To facilitate ongoing development and testing, the serializer tests in the app can accept any type, as shown here:

// A different data type can be swapped for Employee to measure
// performance on it. The tests are generic because some of the
// serializers expose only generic deserialization methods.
// NOTE: SerializerTest<T> is an assumed name for the common base type
// of the tests; check the repository for the actual declaration.
var i = Employee.JamesTKirk;
var c = new Collection<SerializerTest<Employee>>
{
    new BinaryFormatterSerializerTest<Employee>(i, Iterations),
    new JsonDataContractSerializerTest<Employee>(i, Iterations),
    new JsonNewtonsoftSerializerTest<Employee>(i, Iterations),
    new ProtobufSerializerTest<Employee>(i, Iterations),
    new XmlDataContractSerializerTest<Employee>(i, Iterations),
    new XmlSerializerTest<Employee>(i, Iterations)
};

// Run each serializer test in turn.
foreach (var test in c)
{
    test.Execute();
}

Simply replace Employee with your own type to test its performance, or use the app to test deserialization errors during development by catching unhandled exceptions and inspecting the deserialized object property grid.

Let the Games Begin

For testing performance a simple Employee class is used, which is by no means intended for production code. However, it relies on inheritance, has a collection, and a sprinkling of different simple data types to try and reflect something useful and real.

[Image: classdiagram1]

The following screenshots illustrate the app in action and the performance results for each of the serializers used to test the Employee instance.

The results are displayed in the datagrid control, which is databound to the test collection making it easy to test and report on different data types.

The deserialized instance for the selected serializer is displayed in a property grid, making it easy to visually inspect for accuracy and completeness.

[Image: serializerxgames3]

The serialized instance for the selected test can also be visualized in the Serialized Data view.

[Image: serializerxgames4]