Wednesday, March 7, 2012

Coherence. How to get rid of domain classes in grid classpath?

The Coherence data grid works with objects (storing, querying, aggregating, etc.). Java objects are native to Coherence, but .NET and C++ are also supported. Usually this is a good thing, but sometimes it can cause problems.

Typically Coherence is deployed as a dedicated storage cluster (a few JVMs across a few servers contributing memory), with application processes connecting either as storage-disabled members or as Coherence*Extend clients. It is also possible (and fairly common) for one storage cluster to be used by multiple applications.

The idea of having separate release/deploy cycles for storage nodes and applications looks very attractive. But there is a catch: classes for objects stored in a Coherence distributed cache should be present on the classpath of the storage nodes. Bummer.

Well, while this statement reflects the experience of many Coherence users, it is not technically true. Let me elaborate.
- Coherence storage nodes store keys and values in memory in binary form,
- queries (and indexes) may trigger deserialization of objects on the server side, but if you stick with POF extractors, objects won't be deserialized,
- entry processors and aggregators may force objects to be deserialized on the server side, but you can avoid that by using the BinaryEntry API.

So if you are careful, you can get rid of domain classes on the classpath of storage nodes. This is huge for a complex Coherence-based application: you can now keep the grid online while deploying application releases. Of course, custom entry processors, aggregators, value extractors, etc. still have to be available on the classpath if you use them, but that kind of code tends to be much more stable.

OK, in theory this is achievable, but in practice it is very hard. Sticking with the binary API to manipulate Java objects is awkward (and not always efficient, due to gaps between the object and binary APIs).

Here is a middle-ground solution: the partially serialized object


The key idea is to use different serializers on the client and server sides. From the same binary representation, the object is fully deserialized on the client side, while on the server side only the outer layer and a few fields are; most of the object's data remains a binary blob. This way, we do not need domain classes on the server side.

The trick is possible thanks to PofReader.readRemainder() / PofWriter.writeRemainder(). These methods allow parsing just the start of a POF stream while keeping its remainder unparsed. At the same time, the POF stream stays intact, so POF extractors can access any attribute of the object.
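
The idea of "parse the head, keep the tail as bytes" can be shown without Coherence at all. The sketch below is a toy illustration only: it uses a made-up fixed layout (8-byte timestamp, 1-byte deleted flag, then payload bytes) built with java.nio, not the POF wire format, and every name in it is hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Toy illustration of partial deserialization. This is NOT the POF format. */
final class PartialDecodeSketch {

    /** Made-up header layout: 8-byte timestamp followed by a 1-byte deleted flag. */
    static final int HEADER_SIZE = Long.BYTES + 1;

    /** Writes the header, then appends the tail bytes untouched. */
    static byte[] encode(long timestamp, boolean deleted, byte[] tail) {
        return ByteBuffer.allocate(HEADER_SIZE + tail.length)
                .putLong(timestamp)
                .put((byte) (deleted ? 1 : 0))
                .put(tail)
                .array();
    }

    /** Client side: knows the payload type (a String stands in for a domain object). */
    static byte[] clientEncode(long timestamp, boolean deleted, String payload) {
        return encode(timestamp, deleted, payload.getBytes(StandardCharsets.UTF_8));
    }

    /** Client side: fully decodes the payload. */
    static String clientDecodePayload(byte[] data) {
        return new String(data, HEADER_SIZE, data.length - HEADER_SIZE, StandardCharsets.UTF_8);
    }

    /** Server-side view: parsed header fields plus the unparsed remainder. */
    static final class ServerView {
        long timestamp;
        boolean deleted;
        final byte[] remainder;

        ServerView(long timestamp, boolean deleted, byte[] remainder) {
            this.timestamp = timestamp;
            this.deleted = deleted;
            this.remainder = remainder;
        }
    }

    /** Server side: reads only the header and keeps everything else as raw bytes. */
    static ServerView serverDecode(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        long timestamp = buf.getLong();
        boolean deleted = buf.get() != 0;
        byte[] remainder = new byte[buf.remaining()];
        buf.get(remainder);
        return new ServerView(timestamp, deleted, remainder);
    }

    /** Server side: writes the (possibly changed) header and the remainder back as is. */
    static byte[] serverEncode(ServerView view) {
        return encode(view.timestamp, view.deleted, view.remainder);
    }
}
```

The server can change the header (for example, flip the deleted flag) and write the value back without ever knowing what the payload is; the client still decodes the payload from the rewritten bytes.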

When can this technique be used?

When I design a Coherence-based solution, I do my best to keep it modular. Usually there is a layer around Coherence offering an application-specific, yet generic, service. At least one reason to do it this way is that the service can be mocked for testing. Domain objects are rarely stored directly in Coherence; instead they are wrapped in envelopes. The example above illustrates versioned storage, where the envelope is used to annotate data with a timestamp. The envelope is part of the service, while the payload of the envelope is not.
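
To make the layering concrete, here is a rough sketch of what such a service boundary might look like. All names (VersionedStore, InMemoryStore, Versioned) are hypothetical, and the "newer timestamp wins" and tombstone semantics are my assumptions for illustration, not something Coherence prescribes. A real implementation would delegate to a cache; the in-memory one is the kind of stand-in that can be used as a mock in tests.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical service layer sketch; none of these names come from Coherence. */
final class VersionedStoreSketch {

    /** Service-level envelope: metadata owned by the service around an opaque payload. */
    static final class Versioned<V> {
        final V payload;
        final long timestamp;
        final boolean deleted;

        Versioned(V payload, long timestamp, boolean deleted) {
            this.payload = payload;
            this.timestamp = timestamp;
            this.deleted = deleted;
        }
    }

    /** What the application sees; a real implementation would delegate to a cache. */
    interface VersionedStore<K, V> {
        void put(K key, V value, long timestamp);
        void delete(K key, long timestamp);
        V get(K key);
    }

    /** In-memory stand-in, usable as a mock in unit tests. */
    static final class InMemoryStore<K, V> implements VersionedStore<K, V> {
        private final Map<K, Versioned<V>> data = new HashMap<K, Versioned<V>>();

        @Override
        public void put(K key, V value, long timestamp) {
            store(key, new Versioned<V>(value, timestamp, false));
        }

        @Override
        public void delete(K key, long timestamp) {
            // Assumed semantics: a delete is stored as a tombstone envelope.
            store(key, new Versioned<V>(null, timestamp, true));
        }

        @Override
        public V get(K key) {
            Versioned<V> v = data.get(key);
            return (v == null || v.deleted) ? null : v.payload;
        }

        // Assumed semantics: an update with an older timestamp is ignored.
        private void store(K key, Versioned<V> next) {
            Versioned<V> current = data.get(key);
            if (current == null || next.timestamp >= current.timestamp) {
                data.put(key, next);
            }
        }
    }
}
```

The point of the split is that only the envelope fields (timestamp, deleted flag) belong to the service; the payload type is whatever the application puts in.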

Code example

package example;

import java.io.IOException;

import com.tangosol.io.pof.PofReader;
import com.tangosol.io.pof.PofSerializer;
import com.tangosol.io.pof.PofWriter;
import com.tangosol.util.Binary;

public class Envelop {

    // POF property indexes
    public static final int TIMESTAMP_POF = 1;
    public static final int DELETED_POF   = 2;
    public static final int PAYLOAD_POF   = 20;

    protected long timestamp;
    protected boolean deleted;
    protected Object payload;        // deserialized payload, client side only
    protected Binary binaryPayload;  // unparsed remainder of the POF stream
    transient boolean serverMode;    // tells which serializer may write this instance

    /** TO BE USED WITH SERIALIZER */
    protected Envelop(long timestamp, boolean deleted, Object payload, Binary binaryPayload, boolean serverMode) {
        this.timestamp = timestamp;
        this.deleted = deleted;
        this.payload = payload;
        this.binaryPayload = binaryPayload;
        this.serverMode = serverMode;
    }

    /** Constructor used on client side */
    public Envelop(Object payload, long timestamp, boolean deleted) {
        this.payload = payload;
        this.timestamp = timestamp;
        this.deleted = deleted;
        this.serverMode = false;
    }

    public Object getPayload() {
        return payload;
    }

    public Binary getBinaryPayload() {
        return binaryPayload;
    }

    public long getTimestamp() {
        return timestamp;
    }

    public void setTimestamp(long timestamp) {
        this.timestamp = timestamp;
    }

    public boolean isDeleted() {
        return deleted;
    }

    public void setDeleted(boolean deleted) {
        this.deleted = deleted;
    }

    /** Server side: parses only the header fields, keeps the payload as binary. */
    public static class ServerSerializer implements PofSerializer {

        @Override
        public Object deserialize(PofReader in) throws IOException {
            long timestamp = in.readLong(TIMESTAMP_POF);
            boolean deleted = in.readBoolean(DELETED_POF);
            // Everything after the header, payload included, stays unparsed
            Binary data = in.readRemainder();
            Envelop dv = new Envelop(timestamp, deleted, null, data, true);
            return dv;
        }

        @Override
        public void serialize(PofWriter out, Object o) throws IOException {
            Envelop dv = (Envelop) o;
            if (!dv.serverMode) {
                throw new IllegalArgumentException("Object is in client mode, but server serializer is used. Something wrong with POF config!");
            }
            out.writeLong(TIMESTAMP_POF, dv.getTimestamp());
            out.writeBoolean(DELETED_POF, dv.isDeleted());
            // Write the unparsed part back exactly as it was read
            out.writeRemainder(dv.getBinaryPayload());
        }
    }

    /** Client side: fully deserializes the payload, so domain classes are required. */
    public static class ClientSerializer implements PofSerializer {

        @Override
        public Object deserialize(PofReader in) throws IOException {
            long timestamp = in.readLong(TIMESTAMP_POF);
            boolean deleted = in.readBoolean(DELETED_POF);
            Object payload = in.readObject(PAYLOAD_POF);
            Binary data = in.readRemainder();
            Envelop dv = new Envelop(timestamp, deleted, payload, data, false);
            return dv;
        }

        @Override
        public void serialize(PofWriter out, Object o) throws IOException {
            Envelop dv = (Envelop) o;
            if (dv.serverMode) {
                throw new IllegalArgumentException("Object is in server mode, but client serializer is used. Something wrong with POF config!");
            }
            out.writeLong(TIMESTAMP_POF, dv.getTimestamp());
            out.writeBoolean(DELETED_POF, dv.isDeleted());
            out.writeObject(PAYLOAD_POF, dv.getPayload());
            out.writeRemainder(dv.getBinaryPayload());
        }
    }
}
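
The two serializers are selected through POF configuration: storage nodes register Envelop with the server serializer, while processes that have domain classes on their classpath register the same type ID with the client serializer. A minimal sketch of the server-side configuration could look like the fragment below. The type ID 1001 is an arbitrary placeholder of mine; the only requirement is that both sides use the same ID for Envelop.

```xml
<pof-config>
  <user-type-list>
    <!-- standard Coherence types -->
    <include>coherence-pof-config.xml</include>

    <!-- 1001 is a placeholder; use the same type ID on clients and storage nodes -->
    <user-type>
      <type-id>1001</type-id>
      <class-name>example.Envelop</class-name>
      <serializer>
        <class-name>example.Envelop$ServerSerializer</class-name>
      </serializer>
    </user-type>
  </user-type-list>
</pof-config>
```

The client-side configuration would be the same except for the serializer class name (example.Envelop$ClientSerializer), plus the user-type entries for the domain classes carried in the payload.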

Summary

Using this technique, it is possible to exclude application-specific classes from the classpath of cluster member JVMs. If you are using .NET or C++, you can even avoid implementing domain objects in Java at all, yet still be able to perform complex operations using POF extractors.

4 comments:

  1. Hi Alexey,
    I'm just surprised that you didn't mention your ReflectionPofSerializer implementation in this article.
    Using the POF format should spare you from sharing class definitions with your cache nodes, am I wrong?

    1. ReflectionPofSerializer uses Java classes, so you need them on the classpath. What I was trying to achieve was to strip domain classes from the Coherence cluster classpath (and to avoid upgrading the cluster classpath with every application release).

  2. Hi,
    Could you provide a more complete example? I don't see how you obtain the BinaryPayload or how you actually use the Envelop.
    Do you need to declare both PofSerializers?
    Thx

    1. You cannot access the binary payload directly. The server side does not have the code to deserialize this part, so it remains an inaccessible binary.

      But POF extractors will work.

      On the client side the binary payload will be null, but the object payload will hold the value.

      The object inside the envelope is inaccessible on the server side, but the Envelop itself can be passed to an entry processor or other code running on the server side.

      Two serializers are required: the client one should be in the POF config of processes that have domain objects on the classpath, and the server one is for processes with a restricted classpath.
