Tuesday, December 4, 2012

Coherence 101: Beware of cache listeners

The cache events facility is quite a useful feature of Oracle Coherence. For example, continuous queries and near caches are built on top of the cache event system.
Unfortunately, it is also easy to abuse. In particular, cache listeners can behave noticeably badly at scale unless you are very careful.
Please note: this article covers only the partitioned cache topology (distributed cache scheme).

Client side map listeners

UPDATE: I was very wrong in my previous description of client-side synchronous map listeners. The section below has been rewritten to reflect a more accurate picture.

Client-side map listeners are usually added via the NamedCache API. They typically receive events from caches hosted on remote JVMs (storage nodes). But regardless of whether a cache event is produced on a remote or the local JVM, Coherence will deliver it to listeners using a dedicated event dispatch thread (or the service thread itself, for listeners marked as synchronous).

Each cache service has only one event dispatch thread, and it can easily become a bottleneck, limiting the speed of cache event processing on the client.

A few tips to mitigate this design aspect:

  • Do not do anything time consuming in the listener itself; offload processing to another thread instead.
  • Be careful with synchronization – avoid lock contention in listener code.
  • When an event hits your listener, its data is still in binary form. To avoid the deserialization cost, do not access the key or value in the event dispatch thread; instead, pass a reference to the map event object to your own processing thread (or thread pool).

The last tip may not be intuitive, but deserialization of map events in Coherence’s event dispatch thread often becomes the bottleneck limiting the event processing rate.
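The offloading pattern can be sketched as follows. This is a minimal sketch, not Coherence API code: `MapEvent` here is a simplified, hypothetical stand-in for `com.tangosol.util.MapEvent` so the snippet is self-contained; in real code you would implement `com.tangosol.util.MapListener` and receive the real event type.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class OffloadingListenerSketch {

    // Simplified stand-in for com.tangosol.util.MapEvent; in the real API,
    // getKey()/getValue() are where deserialization would happen.
    interface MapEvent {
        Object getKey();
        Object getValue();
    }

    // Pool that does the actual (deserializing) work,
    // keeping the event dispatch thread free.
    static final ExecutorService WORKERS = Executors.newFixedThreadPool(4);

    static final Queue<Object> PROCESSED = new ConcurrentLinkedQueue<>();

    // Invoked on the event dispatch thread: do NOT touch key/value here,
    // just hand the event object over as-is.
    static void entryUpdated(MapEvent evt) {
        WORKERS.execute(() -> {
            // Deserialization cost is paid here, on a worker thread.
            PROCESSED.add(evt.getKey() + "=" + evt.getValue());
        });
    }

    // Shutdown helper: wait for queued work to finish.
    static void drain() {
        WORKERS.shutdown();
        try {
            WORKERS.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

The key point is that `entryUpdated` only captures a reference to the event object; nothing forces deserialization until a worker thread calls the getters.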

Synchronous and normal map listeners

There is a marker interface, SynchronousListener, in the com.tangosol.util package.
You can implement it in your map listener, but this will not make map event delivery to your listener synchronous with the cache operation (as you might think); instead, it affects which thread your listener is invoked on.

Normal listeners are invoked in the event dispatch thread.

“Synchronous” listeners are invoked in the service thread.

What are the differences?

  • Imagine you have a near cache and you are using an entry processor to update an entry. If the cache event is processed in the event dispatch thread, the data in the near cache may remain stale for a short time after the entry processor call has returned but before the event has been processed.
    Using a synchronous listener solves this, because the event is guaranteed to be processed before the response message from the entry processor invocation is processed.
  • Time-consuming custom map listeners can slow down the event dispatch thread, increasing event delays. This does not affect Coherence’s built-in facilities such as near caches and CQC, because they use synchronous listeners internally - you can consider it an extra level of protection from a misbehaving developer :)

But let me stress it again: for any type of listener, events are delivered asynchronously relative to other cluster nodes.
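The thread-routing difference described above can be sketched with simplified stand-in types. The real marker is `com.tangosol.util.SynchronousListener`; the `MapListener` interface and the dispatcher below are hypothetical stand-ins showing how a single dispatch thread and an `instanceof` check on the marker could separate the two delivery paths.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ListenerDispatchSketch {

    // Hypothetical stand-in listener interface.
    interface MapListener { void onEvent(String event); }

    // Stand-in for the com.tangosol.util.SynchronousListener marker.
    interface SynchronousListener {}

    // Single event dispatch thread, mirroring one dispatch thread per service.
    static final ExecutorService DISPATCH = Executors.newSingleThreadExecutor();

    static final List<MapListener> LISTENERS = new CopyOnWriteArrayList<>();

    // Called from the service thread after a cache update:
    // "synchronous" listeners run inline on the service thread,
    // normal listeners are queued to the event dispatch thread.
    static void fire(String event) {
        for (MapListener l : LISTENERS) {
            if (l instanceof SynchronousListener) {
                l.onEvent(event);                         // service thread, inline
            } else {
                DISPATCH.execute(() -> l.onEvent(event)); // dispatch thread, async
            }
        }
    }
}
```

Because the synchronous path runs before `fire` returns, an inline listener is guaranteed to have seen the event before the service thread moves on to the next message, which matches the near cache consistency behavior described above.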

Backing map listeners

Backing map listeners are used less often (but abused more frequently). They are usually configured via the XML cache configuration and work on the storage side.
On the storage side, Coherence can use a pool of worker threads to perform operations in parallel. You might assume that your backing map listener would also be invoked in parallel …
… but that is wrong. A backing map listener can process only one map event at a time for a given cache, regardless of thread pool size.
I was also surprised by this behavior at first. It is not a fundamental limitation of Coherence, but all out-of-the-box variations of the backing map use a cache-global lock to dispatch map events. Even for a partitioned backing map, Coherence will use the ObservableSplittingBackingMap wrapper, which, again, uses a global lock.
So if you are using backing map listeners, be aware of this limitation. The live object pattern also relies on backing map listeners and is thus limited by this scalability constraint.
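The effect of the global lock can be demonstrated with a small simulation. This is not Coherence code; `GLOBAL_LOCK` and `dispatchEvent` are stand-ins modeling what each storage worker thread effectively does when notifying backing map listeners: no matter how large the worker pool is, listener invocations are fully serialized.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BackingMapLockSketch {

    // Stand-in for the cache-global lock used by the observable
    // backing map wrappers when dispatching map events.
    static final Object GLOBAL_LOCK = new Object();

    static final AtomicInteger IN_LISTENER = new AtomicInteger();
    static final AtomicInteger MAX_OBSERVED = new AtomicInteger();

    // What each storage worker thread effectively does when it needs
    // to notify backing map listeners about an update.
    static void dispatchEvent() {
        synchronized (GLOBAL_LOCK) {
            int now = IN_LISTENER.incrementAndGet();
            MAX_OBSERVED.accumulateAndGet(now, Math::max);
            try { Thread.sleep(5); } catch (InterruptedException e) { }
            IN_LISTENER.decrementAndGet();
        }
    }

    public static void main(String[] args) {
        // 8 worker threads update the cache in parallel ...
        ExecutorService pool = Executors.newFixedThreadPool(8);
        for (int i = 0; i < 32; i++) {
            pool.execute(BackingMapLockSketch::dispatchEvent);
        }
        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        // ... yet listener dispatch never overlaps.
        System.out.println("max concurrent listener invocations: "
                + MAX_OBSERVED.get()); // prints "max concurrent listener invocations: 1"
    }
}
```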

Map triggers

Fortunately, map triggers work as part of the cache update transaction at the cache service level. In other words, a map trigger does not harm performance any more than an entry processor does.
One possible workaround for the backing map listener concurrency issue is invoking your map listener logic from a map trigger.
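The workaround can be sketched as follows. The types here are simplified, hypothetical stand-ins for `com.tangosol.util.MapTrigger` and its `Entry`; the point is that the trigger's `process` method runs on the worker thread performing the update, so listener logic forwarded from it is not serialized by the backing map's global lock and scales with the thread pool.

```java
public class TriggerListenerSketch {

    // Simplified stand-ins for com.tangosol.util.MapTrigger and its Entry.
    interface Entry {
        Object getKey();
        Object getValue();
    }

    interface MapTrigger {
        // In the real API, process() is invoked on the worker thread
        // executing the cache update, inside the update "transaction".
        void process(Entry entry);
    }

    interface MapListener {
        void entryUpdated(Object key, Object value);
    }

    // Trigger that forwards each update to listener logic. Because it is
    // invoked per-update on the worker thread rather than under the
    // backing map's global lock, this path runs in parallel.
    static MapTrigger forwarding(MapListener listener) {
        return entry -> listener.entryUpdated(entry.getKey(), entry.getValue());
    }
}
```

Note one caveat of this approach: the forwarded logic runs inside the update transaction, before the change is committed, so keep it short for the same reasons you would keep an entry processor short.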