Thursday, May 31, 2012

Tech talk at London Coherence SIG: Database Backed Cache, Tips, Tricks and Patterns

Today I spoke at the London Coherence SIG. Below you can find the slides from my presentation.
London Coherence SIGs are never boring, but this one was especially interesting.

We had two presentations from Randy Stafford (Oracle), a Groovy presentation from Jonathan "Gridman" Knight, yet another Coherence transaction framework (and counting) from David Whitmarsh, "lore" about the read-write backing map from Phil Wheeler, and other interesting stuff.
 
I'm really glad I was able to get to this event.

Four flavours of "Out of Memory" in HotSpot JVM

If you are working with Java, I bet you are well aware of OutOfMemoryError. But did you know that there are 4 different conditions under which OutOfMemoryError is thrown in Oracle’s HotSpot JVM? These conditions can be distinguished by the message carried by the exception.
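Since the flavours differ only in the exception message, telling them apart comes down to string matching. Below is a minimal sketch of that idea; the class OomeFlavour and its Kind enum are names I made up for illustration, and the message fragments are the HotSpot wordings used in the headings of this post (other JVMs or versions may word them differently).

```java
/** Hypothetical helper: classifies an OutOfMemoryError by its HotSpot message. */
final class OomeFlavour {

    enum Kind { JAVA_HEAP, GC_OVERHEAD, PERM_GEN, DIRECT_BUFFER, OTHER }

    static Kind classify(OutOfMemoryError e) {
        String msg = e.getMessage();
        if (msg == null) {
            return Kind.OTHER; // errors created without a message
        }
        if (msg.contains("GC overhead limit exceeded")) {
            return Kind.GC_OVERHEAD;
        }
        if (msg.contains("PermGen space")) {
            return Kind.PERM_GEN;
        }
        if (msg.contains("Direct buffer memory")) {
            return Kind.DIRECT_BUFFER;
        }
        if (msg.contains("Java heap space")) {
            return Kind.JAVA_HEAP;
        }
        return Kind.OTHER; // any wording not covered in this post
    }

    public static void main(String[] args) {
        // Errors are constructed by hand here, so nothing actually runs out of memory.
        System.out.println(classify(new OutOfMemoryError("Java heap space")));
        System.out.println(classify(new OutOfMemoryError("GC overhead limit exceeded")));
    }
}
```

In a real application you would call classify() from a top-level catch block or an uncaught exception handler, mostly to decide which of the tuning options described below is worth looking at.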

OutOfMemoryError: Java heap space

This is probably the most common case of an out of memory error. It indicates that the heap is not large enough to hold the objects created by the application. To be precise, this exception is thrown when an allocation request still cannot be satisfied after a full GC cycle.
Normally you either have to increase the JVM heap size or fix a memory leak in your application to remedy this.

OutOfMemoryError: GC overhead limit exceeded

This problem is trickier. It means that GC is spending “too much” time cleaning memory while reclaiming very little of it. By default the error is raised when the process spends more than 98% of its time in GC (roughly 50 times more than it spends executing application code) and less than 2% of the heap is recovered. The thresholds can be tweaked via the -XX:GCTimeLimit=p (default 98) and -XX:GCHeapFreeLimit=p (default 2) JVM options. Usually, if you increase the heap size, the problem will go away. But this error may also indicate that the heap is badly configured, e.g. the young space is too small for the application.
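To get a feeling for how much time your process spends in GC, you can read the collector counters exposed through the standard java.lang.management API. The sketch below is only an approximation: it computes the share of GC time over the whole JVM uptime, while HotSpot applies its own internal measurement to recent collections. The class name GcTimeShare is hypothetical.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical helper: rough share of JVM uptime spent in garbage collection. */
final class GcTimeShare {

    /** Accumulated collection time of all collectors, in milliseconds. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 if the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** GC time divided by JVM uptime, clamped to the range [0, 1]. */
    static double gcShare() {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        if (uptime <= 0) {
            return 0.0;
        }
        return Math.min(1.0, (double) totalGcMillis() / uptime);
    }

    public static void main(String[] args) {
        System.out.printf("GC time: %d ms, share of uptime: %.2f%%%n",
                totalGcMillis(), gcShare() * 100);
    }
}
```

A share that stays high for a long time is a hint to look at heap sizing well before the JVM gives up with this error.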

OutOfMemoryError: PermGen space

As you may know, the HotSpot JVM uses a special memory space for certain internal structures: the permanent space. Despite its name, objects in the permanent space can be collected (the name is just a historic artifact). But you may run out of memory in the perm space the same way as you can in the normal heap. Things which may affect permanent space usage are:
  • loading or generating classes and creating class loaders,
  • reflection,
  • calling the String.intern() method.
Permanent space does not count toward the application heap size. If you need to resize it, use the -XX:MaxPermSize=s JVM option.
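The JVM reports the permanent space as one of its non-heap memory pools, so its usage can be inspected from inside the process. Here is a small sketch using the standard MemoryPoolMXBean API. The class name NonHeapPools is made up, and the actual pool names depend on the JVM version and collector, so the code lists every non-heap pool instead of looking for one specific name.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: lists usage of the JVM memory pools that live outside the heap. */
final class NonHeapPools {

    static List<String> describe() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.NON_HEAP) {
                continue; // skip eden, survivor and old spaces
            }
            MemoryUsage usage = pool.getUsage(); // null if the pool is no longer valid
            if (usage == null) {
                continue;
            }
            // getMax() returns -1 when no limit is defined for the pool
            lines.add(pool.getName() + ": used=" + usage.getUsed() + " max=" + usage.getMax());
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describe()) {
            System.out.println(line);
        }
    }
}
```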

OutOfMemoryError: Direct buffer memory

All previous types of OOME indicate that GC failed to free memory for a new Java object (i.e. they are thrown as a result of a GC cycle). This last type of OOME is different. Since Java 1.4, the JVM has had an API (direct NIO buffers) to manage memory outside of the heap. Off-heap memory is not subject to garbage collection, but it is also limited. The capacity of the off-heap memory pool is configured via the -XX:MaxDirectMemorySize=s JVM option.

My statement about off-heap memory not being garbage collected is only partially true. An off-heap memory block is deallocated when the direct ByteBuffer object that owns it is itself garbage collected, so garbage collection of the heap also drives reclamation of the associated off-heap memory.
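For illustration, here is a minimal sketch of allocating off-heap memory and checking how much of it is in use. ByteBuffer.allocateDirect() is the standard way to get a direct buffer; BufferPoolMXBean requires Java 7 or newer. The class name DirectMemoryDemo is hypothetical, and the pool name "direct" is what HotSpot reports, which I am assuming here rather than relying on any specification.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

/** Hypothetical helper: allocates a direct buffer and reports off-heap usage. */
final class DirectMemoryDemo {

    /** Bytes currently used by direct buffers, or -1 if the pool is not reported. */
    static long directMemoryUsed() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) { // assumed HotSpot pool name
                return pool.getMemoryUsed();
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        long before = directMemoryUsed();
        // 16 MiB outside the Java heap; it counts against -XX:MaxDirectMemorySize
        ByteBuffer buffer = ByteBuffer.allocateDirect(16 * 1024 * 1024);
        long after = directMemoryUsed();
        System.out.println("direct=" + buffer.isDirect()
                + " capacity=" + buffer.capacity()
                + " used before=" + before + " used after=" + after);
    }
}
```

If the buffers are allocated faster than their owning objects are collected, allocateDirect() is the call that fails with "Direct buffer memory".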

Friday, May 25, 2012

Thursday, May 17, 2012

Tech meet up, distributed caching and data grid, Moscow 17 May

I would like to announce a tech meet up devoted to caching in distributed systems and data grid technology. The event will be held in Moscow on May 17.

Slides from event:


Main talk by Max Alexejev
Bonus presentation by me

Monday, May 14, 2012

Asymmetric read/write path – a trend for scalable architecture

A few years ago I blogged about an architecture using a composition of different storage technologies for querying and persistence of data. The rationale behind this was further explained in another post of mine.

Below is a sketch of such composition:


Unlike the classical von Neumann memory hierarchy, this composition is asymmetric in terms of its read and write paths.

I’m glad to see similar ideas implemented in commercial products, marking a trend. In this post I want to draw your attention to two very interesting middleware products which have the principle of composing specialized storages at their core.

CloudTran

CloudTran is a very ambitious product promising performance and scalability for a wide class of applications without much extra effort.

CloudTran leverages Oracle Coherence and its integration with EclipseLink to build scalable applications using JPA for persistence. Coherence + EclipseLink (TopLink Grid) is already capable of executing JPQL queries in the cache instead of the database, using the rich querying capabilities of Coherence. The missing piece in this tandem was transaction support.
CloudTran fills this gap by adding a specialized component for managing transactions. Durability of CloudTran’s transactions is provided by a write-ahead disk log (many RDBMSes use the same technique). But unlike an RDBMS, CloudTran’s log is not limited to a single disk or server: it is distributed (the same way as data in a Coherence grid) and can benefit from the throughput of dozens of disks in the cluster.

The full picture of a CloudTran based solution is a triangle of technologies:
  • Coherence for fast data retrieval,
  • CloudTran transaction log for fast and durable transaction persistence,
  • backend database (relational or NoSQL) as a system of record and long term storage.

The backend database, being updated asynchronously, is on the critical path for neither reads nor writes, thus the database is not limiting application performance. On the other side, data in Coherence is updated synchronously, so application logic can enjoy strong consistency and ACID transactions (which is huge; eventual consistency is a lot of pain for a typical enterprise application with a sophisticated data model).
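The data flow through such a triangle can be sketched in a few lines of plain Java. This is a toy, single-process illustration of the asymmetric read/write path and not CloudTran's (or Coherence's) API: the class AsymmetricStore and all its methods are hypothetical, in-memory collections stand in for the transaction log and the database, and there is no real durability or transaction handling.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Toy model of an asymmetric read/write path (all names are hypothetical). */
final class AsymmetricStore {

    private final Map<String, String> cache = new ConcurrentHashMap<>();    // read path
    private final List<String> txLog = new ArrayList<>();                   // stand-in for a durable log
    private final Queue<String[]> pending = new ConcurrentLinkedQueue<>();  // write-behind queue
    private final Map<String, String> database = new ConcurrentHashMap<>(); // stand-in for the system of record

    /** Write path: log first, update the cache synchronously, defer the database. */
    synchronized void put(String key, String value) {
        txLog.add(key + "=" + value);            // 1. append to the log for durability
        cache.put(key, value);                   // 2. readers see the new value immediately
        pending.add(new String[] {key, value});  // 3. database update happens later
    }

    /** Read path: served from the cache only, the database is never touched. */
    String get(String key) {
        return cache.get(key);
    }

    /** Applies queued writes to the database; a background thread would call this. */
    int flushToDatabase() {
        int applied = 0;
        String[] write;
        while ((write = pending.poll()) != null) {
            database.put(write[0], write[1]);
            applied++;
        }
        return applied;
    }

    String readFromDatabase(String key) {
        return database.get(key);
    }

    synchronized int logSize() {
        return txLog.size();
    }
}
```

After put("a", "1"), get("a") returns the new value right away while readFromDatabase("a") still returns null until flushToDatabase() has run. This is exactly the asymmetry: writes go through the log and the cache, reads go to the cache, and the database catches up off the critical path.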

Datomic

Datomic is another young and interesting product promising a combination of ACID and scalability. Datomic also features a triangle of technologies, but it has its own implementation for both the in-memory database/cache and transaction persistence (that component is called the transactor in Datomic). For the system of record you can use either an RDBMS or NoSQL storage (Amazon DynamoDB). Datomic offers its own API (and its own unique approach) for working with data. Datomic is highly influenced by the functional paradigm, which would probably make porting existing applications to Datomic non-trivial, but for new projects the idea of a simple but scalable platform featuring ACID data manipulation may be attractive.

Cool toys for enterprise developers

I’m very glad to see such innovative products addressing scalability in the not-so-fancy class of enterprise applications (of course, both products are not limited to the enterprise). IMHO there are enough clones of Google’s BigTable and Amazon’s Dynamo in this world already. People working on inventories, reservation systems and other enterprisy stuff also need cool distributed toys.
I’m a little skeptical about the future of these products (the goals they have set for themselves are just too challenging), but I sincerely wish luck to both projects.

Please prove my skepticism wrong ;)