Showing posts with label POF. Show all posts

Wednesday, March 7, 2012

Coherence. How to get rid of domain classes in grid classpath?

Coherence data grid works with objects (storing, querying, aggregating, etc.). Java objects are native to Coherence, but .NET and C++ are also supported. Usually this is a good thing, but sometimes it can cause problems.

Typically Coherence is deployed as a dedicated storage cluster (a few JVMs across a few servers contributing memory resources), with application processes connecting either as storage-disabled members or as Coherence*Extend clients. It is also possible (and fairly common) for one storage cluster to be shared by multiple applications.

The idea of having separate release/deploy cycles for storage nodes and applications looks very attractive. But there is a catch: classes for objects stored in a Coherence distributed cache have to be present in the classpath of the storage nodes. Bummer.

Well, while this statement reflects the experience of many Coherence users, it is not technically true. Let me elaborate.
- Coherence storage nodes store keys and values in memory in binary form,
- queries (and indexes) may trigger deserialization of objects on the server side, but if you stick with POF extractors, objects won't be deserialized,
- entry processors and aggregators may force objects to be deserialized on the server side, but you can avoid that by using the BinaryEntry API.
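The points above can be illustrated without Coherence at all. The sketch below uses a made-up binary layout (it is not the real POF encoding): each field is stored as an index, a length and raw bytes, so a reader can pull out one field and skip the others without ever loading the class of the stored object. The class name `BinaryFieldExtractorSketch`, its methods and the layout are assumptions made purely for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Toy illustration only: a made-up indexed binary layout, NOT the real POF encoding. */
final class BinaryFieldExtractorSketch {

    /** Appends one field as [int index][int length][bytes]. */
    static void writeField(ByteBuffer out, int index, byte[] value) {
        out.putInt(index).putInt(value.length).put(value);
    }

    /** Builds a sample "stored value" with a timestamp (index 1) and a name (index 2). */
    static byte[] sampleBinary(long timestamp, String name) {
        byte[] ts = ByteBuffer.allocate(Long.BYTES).putLong(timestamp).array();
        byte[] nm = name.getBytes(StandardCharsets.UTF_8);
        ByteBuffer out = ByteBuffer.allocate(2 * 8 + ts.length + nm.length);
        writeField(out, 1, ts);
        writeField(out, 2, nm);
        return out.array();
    }

    /** Returns the raw bytes of the requested field, or null if absent; other fields are skipped unparsed. */
    static byte[] extract(byte[] binary, int wantedIndex) {
        ByteBuffer buf = ByteBuffer.wrap(binary);
        while (buf.remaining() >= 8) {
            int index = buf.getInt();
            int length = buf.getInt();
            if (index == wantedIndex) {
                byte[] value = new byte[length];
                buf.get(value);
                return value;
            }
            buf.position(buf.position() + length); // skip without decoding
        }
        return null;
    }

    public static void main(String[] args) {
        byte[] stored = sampleBinary(1331078400000L, "order-42");
        // The "server" reads only the timestamp; the name is never decoded.
        long timestamp = ByteBuffer.wrap(extract(stored, 1)).getLong();
        System.out.println("timestamp = " + timestamp);
    }
}
```

A real POF extractor navigates the POF stream in a similar spirit, by property index, which is why it does not need the domain class on the storage node.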

So if you are careful, you can get rid of domain classes in the classpath of storage nodes. This is huge for a complex Coherence-based application: you can now keep the grid online while deploying application releases. Of course, custom entry processors, aggregators, value extractors, etc. still have to be available in the classpath if you use them, but that kind of code tends to be much more stable.

OK, in theory this is achievable, but in practice it is very hard. Sticking with the binary API to manipulate Java objects is awkward (and not always efficient, due to gaps between the object and binary APIs).

Here is a middle-ground solution: a partially deserialized object


The key idea is to use different serializers on the client and server sides. Starting from the same binary representation, the object is fully deserialized on the client side, but on the server side only the outer layer and a few fields are, while most of the object data remains a binary blob. This way, we do not need domain classes on the server side.

The trick is possible thanks to PofReader.readRemainder() / PofWriter.writeRemainder(). These methods allow parsing just the start of a POF stream and keeping its remainder unparsed. At the same time, the POF stream stays intact, so POF extractors can access any attribute of the object.
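To make the "parse the start, keep the remainder" idea concrete, here is a minimal sketch in plain Java. It does not use Coherence: a byte buffer stands in for the POF stream, and the names `PartialEnvelopeSketch`, `ServerView`, `serverRead` and so on are hypothetical. The point it tries to show is that the server can read and even modify header fields, then write the untouched remainder back, and the client still gets its payload.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Toy illustration only: plain byte buffers stand in for the POF stream. */
final class PartialEnvelopeSketch {

    private static final int HEADER_SIZE = Long.BYTES + 1; // timestamp + deleted flag

    /** Server-side view: header fields are parsed, the rest stays an opaque blob. */
    static final class ServerView {
        final long timestamp;
        final boolean deleted;
        final byte[] remainder;

        ServerView(long timestamp, boolean deleted, byte[] remainder) {
            this.timestamp = timestamp;
            this.deleted = deleted;
            this.remainder = remainder;
        }
    }

    /** Client side: writes the header followed by the fully encoded payload. */
    static byte[] clientWrite(long timestamp, boolean deleted, String payload) {
        byte[] body = payload.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(HEADER_SIZE + body.length)
                .putLong(timestamp)
                .put((byte) (deleted ? 1 : 0))
                .put(body)
                .array();
    }

    /** Server side: parses only the header and keeps everything after it unparsed. */
    static ServerView serverRead(byte[] binary) {
        ByteBuffer in = ByteBuffer.wrap(binary);
        long timestamp = in.getLong();
        boolean deleted = in.get() != 0;
        byte[] remainder = new byte[in.remaining()];
        in.get(remainder);
        return new ServerView(timestamp, deleted, remainder);
    }

    /** Server side: writes the header and copies the remainder back verbatim. */
    static byte[] serverWrite(ServerView view) {
        return ByteBuffer.allocate(HEADER_SIZE + view.remainder.length)
                .putLong(view.timestamp)
                .put((byte) (view.deleted ? 1 : 0))
                .put(view.remainder)
                .array();
    }

    /** Client side: decodes the payload that the server never looked at. */
    static String clientReadPayload(byte[] binary) {
        return new String(binary, HEADER_SIZE, binary.length - HEADER_SIZE, StandardCharsets.UTF_8);
    }
}
```

In this sketch a server-side round trip through `serverRead` and `serverWrite` reproduces the original bytes exactly, which is the property that keeps the stored stream usable by clients.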

When can this technique be used?

When I design a Coherence-based solution, I do my best to keep it modular. Usually there is a layer around Coherence offering an application-specific, yet generic, service. One reason to do it this way is that the service can be mocked for testing. Domain objects are rarely stored directly in Coherence; instead they are wrapped in envelopes. The example in this post illustrates versioned storage, where the envelope is used to annotate data with a timestamp. The envelope is part of the service, while the payload of the envelope is not.

Code example

package example;

import java.io.IOException;

import com.tangosol.io.pof.PofReader;
import com.tangosol.io.pof.PofSerializer;
import com.tangosol.io.pof.PofWriter;
import com.tangosol.util.Binary;

public class Envelop {

    // POF property indexes
    public static final int TIMESTAMP_POF = 1;
    public static final int DELETED_POF   = 2;
    public static final int PAYLOAD_POF   = 20;

    protected long timestamp;
    protected boolean deleted;
    protected Object payload;        // deserialized payload, available on the client side only
    protected Binary binaryPayload;  // unparsed remainder of the POF stream
    transient boolean serverMode;    // true if the instance was created by ServerSerializer

    /** Used by the serializers only. */
    protected Envelop(long timestamp, boolean deleted, Object payload, Binary binaryPayload, boolean serverMode) {
        this.timestamp = timestamp;
        this.deleted = deleted;
        this.payload = payload;
        this.binaryPayload = binaryPayload;
        this.serverMode = serverMode;
    }

    /** Constructor used on the client side. */
    public Envelop(Object payload, long timestamp, boolean deleted) {
        this.payload = payload;
        this.timestamp = timestamp;
        this.deleted = deleted;
        this.serverMode = false;
    }

    public Object getPayload() {
        return payload;
    }

    public Binary getBinaryPayload() {
        return binaryPayload;
    }

    public long getTimestamp() {
        return timestamp;
    }

    public void setTimestamp(long timestamp) {
        this.timestamp = timestamp;
    }

    public boolean isDeleted() {
        return deleted;
    }

    public void setDeleted(boolean deleted) {
        this.deleted = deleted;
    }

    /** Server-side serializer: parses the envelope header and keeps the payload as a binary remainder. */
    public static class ServerSerializer implements PofSerializer {

        @Override
        public Object deserialize(PofReader in) throws IOException {
            long timestamp = in.readLong(TIMESTAMP_POF);
            boolean deleted = in.readBoolean(DELETED_POF);
            // Everything after the header, including the payload, stays unparsed
            Binary data = in.readRemainder();
            Envelop dv = new Envelop(timestamp, deleted, null, data, true);
            return dv;
        }

        @Override
        public void serialize(PofWriter out, Object o) throws IOException {
            Envelop dv = (Envelop) o;
            if (!dv.serverMode) {
                throw new IllegalArgumentException("Object is in client mode, but server serializer is used. Something wrong with POF config!");
            }
            out.writeLong(TIMESTAMP_POF, dv.getTimestamp());
            out.writeBoolean(DELETED_POF, dv.isDeleted());
            // Write the unparsed remainder back as is
            out.writeRemainder(dv.getBinaryPayload());
        }
    }

    /** Client-side serializer: fully deserializes the envelope, including the payload. */
    public static class ClientSerializer implements PofSerializer {

        @Override
        public Object deserialize(PofReader in) throws IOException {
            long timestamp = in.readLong(TIMESTAMP_POF);
            boolean deleted = in.readBoolean(DELETED_POF);
            Object payload = in.readObject(PAYLOAD_POF);
            Binary data = in.readRemainder();
            Envelop dv = new Envelop(timestamp, deleted, payload, data, false);
            return dv;
        }

        @Override
        public void serialize(PofWriter out, Object o) throws IOException {
            Envelop dv = (Envelop) o;
            if (dv.serverMode) {
                throw new IllegalArgumentException("Object is in server mode, but client serializer is used. Something wrong with POF config!");
            }
            out.writeLong(TIMESTAMP_POF, dv.getTimestamp());
            out.writeBoolean(DELETED_POF, dv.isDeleted());
            out.writeObject(PAYLOAD_POF, dv.getPayload());
            out.writeRemainder(dv.getBinaryPayload());
        }
    }
}

Summary

Using this technique, it is possible to exclude application-specific classes from the classpath of cluster member JVMs. If you are using .NET or C++, you can even avoid implementing domain objects in Java at all, yet still be able to do complex operations using POF extractors.

Saturday, August 13, 2011

Announce: ReflectionPofSerializer 1.2 is available


ReflectionPofSerializer has been lying around without significant changes for about a year and a half. Though I'm actively using it across projects, it was just working. But recently I decided to make a few important functional improvements and extend the capabilities of ReflectionPofSerializer.

Advanced support for collections

A few of the complaints I heard about ReflectionPofSerializer concerned poor support for collections. Actually these complaints should have been addressed to the POF protocol itself; ReflectionPofSerializer was just using it as is. POF has a bad habit of substituting collections and maps with its own implementations, which may not necessarily be compatible with application needs (e.g. there is a good chance of a HashSet being replaced by a list, and a TreeMap by a HashMap). If you are writing a POF serializer by hand you can easily fix this, but for automatic serialization it may be a serious issue.
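The sketch below simulates why such a substitution hurts. It does not call POF: the two `roundTrip` methods are hypothetical stand-ins for a serializer that preserves the elements but not the concrete collection type, which is the behaviour described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.TreeSet;

/** Illustration only: simulates a serializer that keeps elements but not the collection type. */
final class CollectionSubstitutionSketch {

    /** Stand-in for a round trip that restores any collection as a plain list. */
    static <T> Collection<T> roundTripCollection(Collection<T> original) {
        return new ArrayList<>(original);
    }

    /** Stand-in for a round trip that restores any map as a HashMap. */
    static <K, V> Map<K, V> roundTripMap(Map<K, V> original) {
        return new HashMap<>(original);
    }

    public static void main(String[] args) {
        Collection<String> tags = new TreeSet<>(Arrays.asList("b", "a"));
        Map<String, Integer> prices = new TreeMap<>();
        prices.put("x", 1);

        Collection<String> restoredTags = roundTripCollection(tags);
        Map<String, Integer> restoredPrices = roundTripMap(prices);

        // The data survives, but code relying on Set or SortedMap semantics no longer works.
        System.out.println("still a set: " + (restoredTags instanceof java.util.Set));
        System.out.println("still sorted: " + (restoredPrices instanceof SortedMap));
    }
}
```

A field declared as `TreeMap` or a call such as `firstKey()` would fail on the restored objects, which is the kind of breakage that registering the collection class explicitly is meant to avoid.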
How does the new version of ReflectionPofSerializer handle this problem?
If collection classes being converted during serialization is breaking your application, you can force Coherence to use ReflectionPofSerializer for that specific collection class. This requires adding the standard (or your custom) collection classes to the POF config.
<user-type>
    <type-id>2000</type-id>
    <class-name>java.util.ArrayList</class-name>
    <serializer>
        <class-name>org.gridkit.coherence.utils.pof.ReflectionPofSerializer</class-name>
    </serializer>
</user-type>

<user-type>
    <type-id>2001</type-id>
    <class-name>java.util.LinkedList</class-name>
    <serializer>
        <class-name>org.gridkit.coherence.utils.pof.ReflectionPofSerializer</class-name>
    </serializer>
</user-type>

<user-type>
    <type-id>2002</type-id>
    <class-name>java.util.TreeMap</class-name>
    <serializer>
        <class-name>org.gridkit.coherence.utils.pof.ReflectionPofSerializer</class-name>
    </serializer>
</user-type>

<user-type>
    <type-id>2003</type-id>
    <class-name>java.util.HashSet</class-name>
    <serializer>
        <class-name>org.gridkit.coherence.utils.pof.ReflectionPofSerializer</class-name>
    </serializer>
</user-type>

<user-type>
    <type-id>2004</type-id>
    <class-name>java.util.TreeSet</class-name>
    <serializer>
        <class-name>org.gridkit.coherence.utils.pof.ReflectionPofSerializer</class-name>
    </serializer>
</user-type>
That is it. ReflectionPofSerializer now knows how to properly serialize/deserialize collection classes.

Using POF without POF config

Using POF drastically improves Coherence performance for most applications. Unfortunately, you have to register all your classes in pof-config.xml and provide serializers. Even though you can avoid writing serialization/deserialization code, you still have to maintain pof-config.xml (and it may list hundreds of classes). Could we just generate pof-config.xml at runtime, on demand? Well, all nodes should use the same pof-config.xml (or at least the same class-to-ID mapping), so it is problematic. But wait, we already have Coherence, which can easily keep such a mapping in sync across all nodes! That is the idea behind AutoPofSerializer.
AutoPofSerializer manages a shared class-to-ID mapping (using a Coherence cache) and automatically adds new classes to that mapping. It also automatically chooses ReflectionPofSerializer if an object does not implement the PortableObject interface.
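The shared mapping idea can be sketched roughly as follows. This is not the AutoPofSerializer implementation: a local `ConcurrentHashMap` stands in for the Coherence cache, and the class `ClassIdRegistrySketch` with its `idFor` method is a hypothetical name used only to show on-demand ID allocation.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch only: a local map stands in for the cluster-wide Coherence cache. */
final class ClassIdRegistrySketch {

    private final ConcurrentMap<String, Integer> idsByClassName = new ConcurrentHashMap<>();
    private final AtomicInteger nextId;

    /** @param firstFreeId first type ID handed out to a previously unknown class (assumed value) */
    ClassIdRegistrySketch(int firstFreeId) {
        this.nextId = new AtomicInteger(firstFreeId);
    }

    /** Returns the ID already registered for the class, or atomically allocates the next free one. */
    int idFor(Class<?> type) {
        return idsByClassName.computeIfAbsent(type.getName(), name -> nextId.getAndIncrement());
    }

    /** Number of classes registered so far. */
    int size() {
        return idsByClassName.size();
    }
}
```

In a real cluster both the map and the ID counter would have to be shared and updated atomically across nodes, which is the part that a Coherence cache can take care of.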
So now you just need to configure AutoPofSerializer for a cache and it will just work, without the need for pof-config.xml (actually it may look at pof-config.xml, but if a class is not in there, it will be added automatically).
<distributed-scheme>
    <scheme-name>simple-distributed-scheme</scheme-name>
    <serializer>
        <class-name>org.gridkit.coherence.utils.pof.AutoPofSerializer</class-name>
    </serializer>
    ...
</distributed-scheme>
AutoPofSerializer is still experimental and I would probably not recommend using it in production. But it is very useful for development, prototyping and evaluating POF performance. You can make sure that your code solves the problem and only then do the boilerplate work of configuring POF.

See more information on the ReflectionPofSerializer page at GridKit.

Friday, July 16, 2010

Coherence, ReflectionPofSerializer now supports POF extractor

A technical article related to Oracle Coherence.

A year ago I implemented and open-sourced ReflectionPofSerializer. This class has saved me from implementing thousands of lines of boring serialization code for various Filters, EntryProcessors, Aggregators, Invocables and other mobile objects in Coherence.
Recently I was asked about POF extractor support in ReflectionPofSerializer. My answer was no, it doesn't support extraction from POF directly. But it turns out that ReflectionPofSerializer can easily be extended to support it. A bit of coding and voilà, let me introduce ReflectionPofExtractor.

The full text of the article is available at the GridDynamics blog - http://blog.griddynamics.com/2010/07/coherence-reflectionpofserializer-now.html