Monday, February 23, 2015

So, you have dumped 150 GiB of JVM heap, now what?

150 GiB worth of JVM heap dump is lying on my hard drive, and I need to analyze a specific problem detected in that process.

This is a dump of a proprietary hybrid of an in-memory RDBMS and a CEP system I'm responsible for. All data is stored in the Java heap, so heap sizes of some installations are huge (a 400 GiB heap is the largest to date).

The problem of analyzing huge heap dumps had been on my radar for some time, so I wasn't unprepared.

To be honest, I haven't tried to open this file in Eclipse Memory Analyzer, but I doubt it could handle it.

For some time, the most useful feature of heap analyzers for me has been JavaScript-based queries. Clicking through millions of objects is not fun. It is much better to walk the object graph with code, not with a mouse.

A heap dump is just a serialized graph of objects, and my goal is to extract specific information from this graph. I do not really need a fancy UI; an API to the heap graph would be even better.

How can I analyze a heap dump programmatically?

I started my research with the NetBeans profiler (that was a year ago). NetBeans is open source and has a visual heap dump analyzer (the same component is also used in JVisualVM). It turns out that the heap dump processing code is a separate module, and the API it provides is suitable for custom analysis logic.

The NetBeans heap analyzer has a critical limitation, though. It uses a temporary file to keep an internal index of the heap dump. This file is typically around 25% of the size of the heap dump itself. But most importantly, it takes time to build this file before any query against the heap graph is possible.

After taking a better look, I decided I could remove this temporary file. I have forked the library (my fork is available on GitHub). Some functionality was lost together with the temporary file (e.g. backward reference traversal), but it is not needed for my kind of tasks.
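
To give a sense of the kind of custom analysis this API allows, here is a minimal sketch that opens a dump and prints a jmap-style class histogram. The dump path is made up, and the factory and accessor names are those of the standard org.netbeans.lib.profiler.heap API (HeapFactory.createHeap, Heap.getAllClasses, JavaClass.getInstancesCount / getAllInstancesSize); the fork may expose a slightly different entry point.

    import java.io.File;

    import org.netbeans.lib.profiler.heap.Heap;
    import org.netbeans.lib.profiler.heap.HeapFactory;
    import org.netbeans.lib.profiler.heap.JavaClass;

    public class ClassHistogram {

        public static void main(String[] args) throws Exception {
            // Open the heap dump (path is illustrative)
            Heap heap = HeapFactory.createHeap(new File("app.hprof"));

            // jmap -histo style report: class name, instance count, total shallow size
            for (Object o : heap.getAllClasses()) {
                JavaClass jc = (JavaClass) o;   // cast keeps this compiling against the raw List API
                System.out.println(jc.getName()
                        + "\t" + jc.getInstancesCount()
                        + "\t" + jc.getAllInstancesSize());
            }
        }
    }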

Another important change to the original library was implementing HeapPath.
HeapPath is an expression language for the object graph. It is useful both as a generic predicate language in graph traversal algorithms and as a simple tool for extracting data from an object dump. HeapPath automatically converts strings, primitives and a few other simple types from heap dump structures into normal objects.
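
Roughly, extraction code looks like the sketch below. The class name com.example.CacheEntry, its fields and the path expressions are hypothetical, and the HeapWalker.valueOf helper shown is only a sketch of the extraction API; see the library sources on GitHub for the exact class names and path syntax.

    import java.io.File;

    import org.gridkit.jvmtool.heapdump.HeapWalker;
    import org.netbeans.lib.profiler.heap.Heap;
    import org.netbeans.lib.profiler.heap.HeapFactory;
    import org.netbeans.lib.profiler.heap.Instance;
    import org.netbeans.lib.profiler.heap.JavaClass;

    public class HeapPathExample {

        public static void main(String[] args) throws Exception {
            Heap heap = HeapFactory.createHeap(new File("app.hprof"));

            // Hypothetical application class, used only to illustrate extraction
            JavaClass entryClass = heap.getJavaClassByName("com.example.CacheEntry");
            for (Object o : entryClass.getInstances()) {
                Instance entry = (Instance) o;
                // HeapPath follows fields by name and converts strings/primitives
                // from dump structures into ordinary Java objects
                String key = (String) HeapWalker.valueOf(entry, "key");
                long hits = (Long) HeapWalker.valueOf(entry, "stats.hitCount");
                System.out.println(key + " -> " + hits);
            }
        }
    }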

This library has proved itself very useful in our daily job. One of its applications was a memory reporting tool for our database/CEP system, which automatically reports the actual memory consumption of every relational transformation node (there can be a few hundred nodes in a single instance).

For interactive exploration, API + Java is not the best set of tools, though. But it lets me do my job (and 150 GiB of dump leaves me no alternatives).

Should I be adding some JVM scripting language to the mix ...

BTW: a single pass through 150 GiB takes about 5 minutes. A full analysis usually employs multiple passes, but processing times are fairly reasonable even for that heap size.

19 comments:

  1. There is a project named "Calcite" that is a JDBC driver on top of anything - existing JDBC sources, CSV files, Spark and so on. The most valuable feature is that it is possible to plug in your own underlying data source.

    Volodya Sitnikov made a MAT Calcite plugin that provides full-featured SQL syntax (with aggregation etc.) on top of hprof files.

    I think it is possible to just use his hprof JDBC driver and run SQL queries from a simple Java app.

    1. Sounds interesting, but SQL is pretty bad for graph-oriented queries. I can hardly imagine how I would rewrite some of my report scripts in SQL.

  2. Another point ... as I understand it, JVisualVM is based on the NetBeans profiler, and it can take ages to open even not-so-huge dumps, while MAT is much, much faster.

    Analysis of Java application memory dumps (article in Russian), hope it will be useful

    1. 5 minutes for a single pass over a 150 GiB heap dump. 40 minutes to build a comprehensive memory usage breakdown using a complex rule set on the same heap dump.
      Do you have any numbers for MAT with a 100+ GiB dump?

    2. What heap analyzer tool can be used for a 22 GB heap dump file?

  3. Hey. How do I run your hprof-heap? How do I specify a dump file name/location?

    1. It is not a tool, it is a library. You can create an org.netbeans.lib.profiler.heap.FastHprofHeap instance pointing at your heap dump and navigate through the heap graph via the API.

  4. Oh. OK. Thanks! Can you give an example please?

    1. I have put up some examples I was using with JBoss 6.1 a while ago:

      https://github.com/aragozin/jvm-tools/blob/master/hprof-heap/src/test/java/org/gridkit/jvmtool/heapdump/example/JBossServerDumpExample.java

      https://github.com/aragozin/jvm-tools/blob/master/hprof-heap/src/test/java/org/gridkit/jvmtool/heapdump/example/JsfTreeExample.java

      They demonstrate various reports, from a simple jmap-like histogram to printing out the runtime structure of JSF component trees.
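
      For a very short inline starting point (a sketch only; check the sources for the exact FastHprofHeap constructor arguments, they are from memory):

          import java.io.File;

          import org.netbeans.lib.profiler.heap.FastHprofHeap;
          import org.netbeans.lib.profiler.heap.Heap;
          import org.netbeans.lib.profiler.heap.JavaClass;

          public class OpenDump {
              public static void main(String[] args) throws Exception {
                  // point the parser at your dump file (path is illustrative,
                  // second argument is the dump segment, normally 0)
                  Heap heap = new FastHprofHeap(new File("/path/to/heap.hprof"), 0);
                  JavaClass strings = heap.getJavaClassByName("java.lang.String");
                  System.out.println("String instances: " + strings.getInstancesCount());
              }
          }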

      BTW 22 GiB is not too big, Eclipse MAT should be able to handle it

  5. You can try building indexes with MAT in console mode on the server, and then open the dump on your desktop. Tried it with a heap of ~30 GB.

  6. So NetBeans' analyzer "could" handle the file, it just created a large temp file? Or did it run out of RAM "as expected", as it were?

    1. I have not tried it with the unmodified NetBeans version. I do not even have that much disk space on my desktop.

  7. Tried opening a 20 GB hprof using the original NB code (https://github.com/aragozin/heaplib/tree/master/hprof-heap/src/main/java/org/netbeans/lib/profiler/heap) (not the fast version you added), and good lord, it took 12 hours to parse it! It had created a ~20 GB map file within 2 minutes, but then took 12 hours just to print a summary of the heap.

    Am I missing anything, or is that the expected speed of the vanilla version?

    1. If summary means a class histogram, 12 hours is outrageous. I'm not running my code with the non-"fast" version of the heap parser though, so I may have missed some performance issue there. I suggest:
      1. Try the fast heap version.
      2. Use a profiler to pinpoint hot code during processing.
      I do not have any good large heap dump files at the moment to verify the issue, so your feedback is much appreciated.

  8. Does the fast version compute retained size? Can you share how it computes the dominator tree (and retained size)? It doesn't look like it is using the Lengauer-Tarjan sdom() approach, and doing it via brute force would take ages.

    1. The fast version neither computes retained size nor supports backward reference traversal. It uses a very compact in-memory index and doesn't create on-disk index files. Usually I use domain-specific traversal rules, so forward-only traversal is sufficient for my purposes.
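
      To illustrate what forward-only traversal means in code, a rough sketch (accessor names from the NetBeans heap API; the spot for domain specific rules is just a comment):

          // imports: java.util.Set, org.netbeans.lib.profiler.heap.Instance,
          //          org.netbeans.lib.profiler.heap.ObjectFieldValue
          static void walk(Instance node, Set<Long> visited) {
              if (node == null || !visited.add(node.getInstanceId())) {
                  return;                                   // null reference or already visited
              }
              // ... domain specific rule goes here, e.g. stop at known class names ...
              for (Object f : node.getFieldValues()) {      // outgoing references only
                  if (f instanceof ObjectFieldValue) {
                      walk(((ObjectFieldValue) f).getInstance(), visited);
                  }
              }
              // (object arrays would additionally need ObjectArrayInstance.getValues())
          }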
