Friday, February 17, 2012

POF serialization, beware of arrays

As you may have experienced yourself, object marshaling/serialization often lies on the performance-critical path of a distributed system (especially in Java). Default Java serialization is almost always a poor choice, and countless alternatives have been developed to replace it. POF (Portable Object Format) is a binary object serialization format developed by the Oracle Coherence team with both cross-platform support and efficiency in mind (POF is comparable to Thrift or Protocol Buffers). While compact and efficient, POF also offers advanced features such as access to specific attributes of an object without deserializing the whole thing.

But sometimes avoiding deserialization may be more expensive than the deserialization itself. Below is a short story from my own experience that is worth sharing.

Recently, I was working on a Coherence-based solution. After another set of code changes, an automated test showed a sudden performance degradation. One aspect of the change set was replacing reflection-based filters with ones using PofExtractor. Surprisingly, the times of operations relying on the affected filters had increased about 20 times! While in my practice PofExtractors are not always faster than ReflectionExtractors, a 20x slowdown seemed totally unreasonable. Anyway, the switch to PofExtractor was made to avoid a classpath dependency, so switching back to ReflectionExtractor was not an option.
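For illustration, the shape of the change was roughly as follows (a minimal sketch; the class, attribute name, and POF property index are hypothetical, not from the actual code):

    import com.tangosol.util.Filter;
    import com.tangosol.util.extractor.PofExtractor;
    import com.tangosol.util.extractor.ReflectionExtractor;
    import com.tangosol.util.filter.EqualsFilter;

    public class FilterExamples {

        // hypothetical POF property index of the attribute being queried
        static final int SYMBOL_POF_INDEX = 1;

        // before: deserializes every candidate entry and requires
        // the value class on the server classpath
        static Filter reflectionFilter() {
            return new EqualsFilter(new ReflectionExtractor("getSymbol"), "IBM");
        }

        // after: extracts the attribute straight from the serialized
        // POF stream, no value class needed on the classpath
        static Filter pofFilter() {
            return new EqualsFilter(new PofExtractor(String.class, SYMBOL_POF_INDEX), "IBM");
        }
    }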
A profiling session showed interesting results:
  • objects were never completely deserialized (as expected with PofExtractor),
  • filter execution was indeed the bottleneck,
  • more targeted profiling identified the method PofHelper.skipUniformValue() as a hot spot.
While PofHelper.skipUniformValue() is expected to be a heavy-duty method (it is used to parse the POF binary stream while seeking the attribute to be extracted), it was eating a disproportionately large amount of CPU. Instrumentation profiling revealed one more fact: a few thousand calls to skipUniformValue() were being made for each call to PofExtractor.extractFromEntry().

Well, that explains HOW the code got slow, but the questions are WHY, and how to fix it.

The objects I was working with in the application were mostly wrappers around a chunk of data in a proprietary binary format. A few attributes were extracted and stored as fields of the Java object, but the largest part of the object was a single chunk of binary data. This chunk was stored as a byte[] in Java and written to the POF stream using writeByteArray(). There is another method suitable for this task, writeBinary(), but using writeByteArray() just felt more convenient. That was the key mistake.
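A minimal sketch of what such an object might have looked like (PayloadHolder, its fields, and the property indexes are hypothetical; the actual classes were different):

    import java.io.IOException;

    import com.tangosol.io.pof.PofReader;
    import com.tangosol.io.pof.PofWriter;
    import com.tangosol.io.pof.PortableObject;

    public class PayloadHolder implements PortableObject {

        private String id;        // small, searchable attribute
        private byte[] payload;   // large chunk of proprietary binary data

        @Override
        public void writeExternal(PofWriter out) throws IOException {
            out.writeString(0, id);
            // the convenient, but costly, choice: the chunk is encoded
            // as a POF array of individual byte values
            out.writeByteArray(1, payload);
        }

        @Override
        public void readExternal(PofReader in) throws IOException {
            id = in.readString(0);
            payload = in.readByteArray(1);
        }
    }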

POF array vs. binary

POF can encode numbers using a variable-length format, and writeByteArray() encodes a byte[] as an array of variable-length objects. Well, actually they are all exactly one byte in length, but the generic POF decoder still inspects each byte to calculate the array length (and for each byte it calls skipUniformValue(), which is a huge method with a switch case for every possible POF data type). By contrast, writeBinary() writes an opaque blob, so the parser just reads its size and skips over it in a single operation.
I replaced writeByteArray() with writeBinary() (you may also want to update the read code, but it is not strictly necessary: readByteArray() can parse the Binary from the stream and convert it to a byte[] for you). That solved the problem.
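Continuing the hypothetical PayloadHolder sketch above, the fix amounts to one changed line in writeExternal() (plus an import of com.tangosol.util.Binary); readExternal() can stay exactly as it was:

    @Override
    public void writeExternal(PofWriter out) throws IOException {
        out.writeString(0, id);
        // opaque blob: the parser reads its size and skips over it
        // in a single step instead of walking the chunk byte by byte
        out.writeBinary(1, new Binary(payload));
    }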

Below is a comparison of extractor times for various amounts of binary data inside the object, using the two versions of serialization.
Please mind the difference in scale between the diagrams: it is an order of magnitude!

Summary

OK, using writeBinary() will help you with byte arrays, but what if you need to store a float[] or some other array type?
If the arrays are not large, it is OK, but if they are, then you should avoid using PofExtractors; use normal extractors instead.
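Alternatively, if you control the serialization code, you can apply the same trick as above by hand: pack the array into an opaque Binary yourself. Here is a sketch for float[] (FloatBlob is a hypothetical helper of mine, not part of Coherence):

    import java.nio.ByteBuffer;

    import com.tangosol.util.Binary;

    public final class FloatBlob {

        // pack a float[] into an opaque blob (4 bytes per element),
        // suitable for PofWriter.writeBinary()
        public static Binary toBinary(float[] values) {
            ByteBuffer buf = ByteBuffer.allocate(values.length * 4);
            for (float v : values) {
                buf.putFloat(v);
            }
            return new Binary(buf.array());
        }

        // unpack a blob read via PofReader.readBinary() back into a float[]
        public static float[] fromBinary(Binary bin) {
            ByteBuffer buf = ByteBuffer.wrap(bin.toByteArray());
            float[] values = new float[bin.length() / 4];
            for (int i = 0; i < values.length; i++) {
                values[i] = buf.getFloat();
            }
            return values;
        }

        private FloatBlob() {
        }
    }

The downside is that the elements of such a blob are no longer individually addressable through POF, but for large arrays the whole point is to let the parser skip them as one block.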

I hope this performance issue will be fixed eventually,
but until it is fixed,
beware of arrays in your POF objects!