Monday, May 14, 2012

Asymmetric read/write path – a trend for scalable architecture

A few years ago I blogged about an architecture that composes different storage technologies for querying and persisting data. The rationale behind it was further explained in another post of mine.

Below is a sketch of such composition:


Unlike classical Von Neumann's memory hierarchy, this composition is asymmetric in terms of read and write paths.

I’m glad to see similar ideas implemented in commercial products, marking a trend. In this post I want to draw your attention to two very interesting middleware products that have the principle of composing specialized storages at their core.

CloudTran

CloudTran is a very ambitious product promising performance and scalability for a wide class of applications without much extra effort.

CloudTran leverages Oracle Coherence and its integration with EclipseLink to build scalable applications using JPA for persistence. Coherence + EclipseLink (TopLink Grid) is already capable of executing JPQL queries in the cache instead of the database, using the rich querying capabilities of Coherence. The missing piece in this tandem was transaction support.
CloudTran fills this gap by adding a specialized component for managing transactions. Durability of CloudTran’s transactions is provided by a write-ahead disk log (many RDBMSes use the same technique). But unlike an RDBMS, CloudTran’s log is not limited to a single disk or server: it is distributed (the same way as data in a Coherence grid) and can benefit from the throughput of dozens of disks in a cluster.
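
To make the idea of a distributed write-ahead log more concrete, here is a minimal sketch in Python. It is not CloudTran's implementation, and every name in it (`PartitionedLog`, `partition_for`, `demo`) is hypothetical. It assumes each partition is simply a local file standing in for a disk on a separate cluster node, and that keys are spread across partitions by a stable hash.

```python
import json
import os
import tempfile
import zlib


class PartitionedLog:
    """Toy write-ahead log split across several files (one per imaginary node)."""

    def __init__(self, directory, partitions):
        self.paths = [
            os.path.join(directory, f"txlog-{i}.jsonl") for i in range(partitions)
        ]

    def partition_for(self, key):
        # Stable hash: the same key always lands in the same partition,
        # so per-key ordering is preserved inside a single file.
        return zlib.crc32(key.encode("utf-8")) % len(self.paths)

    def append(self, key, value):
        record = json.dumps({"key": key, "value": value})
        with open(self.paths[self.partition_for(key)], "a", encoding="utf-8") as f:
            f.write(record + "\n")
            f.flush()
            os.fsync(f.fileno())  # force the record to disk before acknowledging

    def replay(self):
        # Rebuild the latest value per key by reading every partition in order.
        state = {}
        for path in self.paths:
            if not os.path.exists(path):
                continue
            with open(path, encoding="utf-8") as f:
                for line in f:
                    rec = json.loads(line)
                    state[rec["key"]] = rec["value"]
        return state


def demo():
    log = PartitionedLog(tempfile.mkdtemp(), partitions=4)
    log.append("order-1", "created")
    log.append("order-2", "created")
    log.append("order-1", "paid")
    return log.replay()
```

Because appends for different keys go to different files, write throughput in a real cluster could scale with the number of disks, which is the property the paragraph above describes. A real product also has to handle replication and transactions spanning several partitions, which this sketch ignores.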

The full picture of a CloudTran-based solution is a triangle of technologies:
  • Coherence for fast data retrieval,
  • the CloudTran transaction log for fast and durable transaction persistence,
  • a backend database (relational or NoSQL) as the system of record and long-term storage.

The backend database is updated asynchronously, so it is on the critical path for neither reads nor writes and does not limit application performance. On the other hand, data in Coherence is updated synchronously, so application logic can enjoy strong consistency and ACID transactions (which is huge: eventual consistency is a lot of pain for a typical enterprise application with a sophisticated data model).
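
The asymmetric read/write path can be illustrated with a small Python sketch. This is only an illustration of the principle, not code from CloudTran or Coherence. The class `AsymmetricStore` and its methods are hypothetical, an in-memory list stands in for the durable transaction log, and a plain dict stands in for the backend database.

```python
import queue
import threading


class AsymmetricStore:
    """Reads hit the cache; writes hit the log and cache; the backend lags behind."""

    def __init__(self, backend):
        self.cache = {}         # read path: in-memory data
        self.log = []           # stand-in for a durable transaction log
        self.backend = backend  # system of record, updated asynchronously
        self._pending = queue.Queue()
        self._lock = threading.Lock()
        threading.Thread(target=self._drain, daemon=True).start()

    def write(self, key, value):
        with self._lock:
            self.log.append((key, value))    # 1. record the change in the log
            self.cache[key] = value          # 2. update the cache synchronously
            self._pending.put((key, value))  # 3. schedule the backend update

    def read(self, key):
        # The backend is never consulted on the read path.
        with self._lock:
            return self.cache.get(key)

    def _drain(self):
        # Background writer: applies queued changes to the backend in order.
        while True:
            key, value = self._pending.get()
            self.backend[key] = value
            self._pending.task_done()

    def flush(self):
        # Block until the backend has caught up (useful for tests and shutdown).
        self._pending.join()
```

A caller sees its own write immediately through `read`, while the backend dict only reflects the change once the background thread has applied it. Calling `flush()` waits for that to happen.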

Datomic

Datomic is another young and interesting product promising a combination of ACID and scalability. Datomic also features a triangle of technologies, but it has its own implementation of both the in-memory database/cache and transaction persistence (the latter component is called the transactor in Datomic). For the system of record you can use either an RDBMS or NoSQL storage (Amazon DynamoDB). Datomic offers its own API (and its own unique approach) for working with data. Datomic is highly influenced by the functional paradigm, which would probably make porting existing applications to Datomic non-trivial, but for new projects the idea of a simple but scalable platform featuring ACID data manipulation may be attractive.

Cool toys for enterprise developers

I’m very glad to see such innovative products addressing scalability in the not-so-fancy class of enterprise applications (of course, neither product is limited to the enterprise). IMHO there are already enough clones of Google’s BigTable and Amazon’s Dynamo in this world. People working on inventories, reservation systems and other enterprisey stuff also need cool distributed toys.
I’m a little skeptical about the future of these products (the goals they have set for themselves are just too challenging), but I sincerely wish luck to both projects.

Please prove my skepticism wrong ;)

    2 comments:

    1. Alexey

      Thanks for the interesting post.

      I think we agree that there is a growing need for products that support enterprise applications, as they start to use grids,
      which isn't met by Dynamo, BigTable etc.

      We really didn't set out to be ambitious!
      There is a certain set of requirements for CloudTran: to manage the grid and the database to provide HA, scalability, bullet-proof reliability and speed.

      Lots of products do most of these. The problem is that without solving them all, the applications are too slow or unreliable, or they push a lot of complexity onto the application developer.

      You needn't worry about this product space being too challenging. 3 or 4 years ago, you'd have been right ;-), but now the big problems are solved.

      The user feedback we're getting is that CloudTran works and is easy to use.

      Matthew Fowler, CTO, CloudTran

      Replies
      1. Hi Matthew,

        "... provide HA, scalability, bullet-proof reliability and speed" sounds pretty ambitious to me :)

        I agree with you that a lot has been done during the last 3 - 4 years in the area of distributed applications. Database HA (not just backup) is the norm now. The end of the GHz race has made people take distributed computing (transactional applications included) seriously. But we are still in the middle; a lot more has to be done.

        But progress is huge, that is true.

        Regards,
        Alexey
