Alexey Ragozin: So, you have dumped 150 GiB of JVM heap, now what?

Monday, February 23, 2015

So, you have dumped 150 GiB of JVM heap, now what?

A 150 GiB JVM heap dump is lying on my hard drive, and I need to analyze a specific problem detected in that process.

This is a dump of a proprietary hybrid of in-memory RDBMS and CEP system that I'm responsible for. All data is stored in the Java heap, so the heap size of some installations is huge (a 400 GiB heap is the largest to date).

The problem of analyzing huge heap dumps had been on my radar for some time, so I wasn't unprepared.

To be honest, I haven't tried to open this file in Eclipse Memory Analyzer, but I doubt it could handle it.

For some time, the most useful feature of heap analyzers for me has been JavaScript-based queries. Clicking through millions of objects is not fun. It is much better to walk the object graph with code, not with a mouse.

A heap dump is just a serialized graph of objects, and my goal is to extract specific information from this graph. I do not really need a fancy UI; an API to the heap graph would be even better.

How can I analyze a heap dump programmatically?

I started my research with the NetBeans profiler (that was a year ago). NetBeans is open source and has a visual heap dump analyzer (the same component is also used in JVisualVM). It turns out that the heap dump processing code is a separate module, and the API it provides is suitable for custom analysis logic.

The NetBeans heap analyzer has a critical limitation, though. It uses a temporary file to keep an internal index of the heap dump. This file is typically around 25% of the size of the heap dump itself. More importantly, it takes time to build this file before any query against the heap graph is possible.

After taking a closer look, I decided I could remove this temporary file. I forked the library (my fork is available on GitHub). Some functions were lost together with the temporary file (e.g. backward reference traversal), but they are not needed for my kind of tasks.

Another important change to the original library was implementing HeapPath.
HeapPath is an expression language for the object graph. It is useful both as a generic predicate language in graph traversal algorithms and as a simple tool to extract data from an object dump. HeapPath automatically converts strings, primitives, and a few other simple types from heap dump structures to normal objects.
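To make the idea of a path expression over an object graph concrete, here is a toy sketch. It is not HeapPath and does not use its syntax: the dotted path format, the `*` wildcard, the `GraphPath` class, and the use of plain `Map` and `List` objects as stand-ins for heap instances are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy path evaluator (hypothetical, not the HeapPath implementation).
// An "object" is a Map of field name -> value; an "array" is a Collection.
class GraphPath {

    /** Follows a dotted path from root; "*" fans out over all elements or field values. */
    static List<Object> walk(Object root, String path) {
        List<Object> current = Collections.singletonList(root);
        for (String step : path.split("\\.")) {
            List<Object> next = new ArrayList<>();
            for (Object node : current) {
                if (step.equals("*")) {
                    if (node instanceof Map) {
                        next.addAll(((Map<?, ?>) node).values());
                    } else if (node instanceof Collection) {
                        next.addAll((Collection<?>) node);
                    }
                } else if (node instanceof Map) {
                    Object child = ((Map<?, ?>) node).get(step);
                    if (child != null) {
                        next.add(child); // missing fields are silently dropped
                    }
                }
            }
            current = next;
        }
        return current;
    }

    public static void main(String[] args) {
        Map<String, Object> first = new HashMap<>();
        first.put("name", "filter");
        Map<String, Object> second = new HashMap<>();
        second.put("name", "join");
        Map<String, Object> root = new HashMap<>();
        root.put("nodes", Arrays.asList(first, second));
        // Selects the "name" field of every element of the "nodes" array
        System.out.println(walk(root, "nodes.*.name"));
    }
}
```

The same function serves both purposes mentioned above: a non-empty result can act as a predicate during traversal, and the returned values are the extracted data.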

This library has proved very useful in our daily work. One of its applications was a memory reporting tool for our database/CEP system, which automatically reports the actual memory consumption of every relational transformation node (there can be a few hundred nodes in a single instance).
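The post does not describe how the reporting tool attributes memory to a node. One plausible approach that needs only forward references is to sum the sizes of everything reachable from each node's root object. The sketch below assumes exactly that, using a hypothetical in-memory graph (object IDs, sizes, and outgoing references held in maps) instead of a real heap dump; `ReachableSize` is an illustrative name.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: forward-only reachability accounting on a toy graph.
class ReachableSize {

    /** Sums the sizes of all objects reachable from root, visiting each object once. */
    static long reachableBytes(long root, Map<Long, Long> sizeOf, Map<Long, List<Long>> refs) {
        Set<Long> seen = new HashSet<>();
        Deque<Long> queue = new ArrayDeque<>();
        seen.add(root);
        queue.add(root);
        long total = 0;
        while (!queue.isEmpty()) {
            long id = queue.poll();
            total += sizeOf.getOrDefault(id, 0L);
            List<Long> targets = refs.getOrDefault(id, Collections.<Long>emptyList());
            for (long target : targets) {
                if (seen.add(target)) { // the visited set also guards against cycles
                    queue.add(target);
                }
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Map<Long, Long> sizeOf = new HashMap<>();
        sizeOf.put(1L, 16L);
        sizeOf.put(2L, 24L);
        sizeOf.put(3L, 32L);
        Map<Long, List<Long>> refs = new HashMap<>();
        refs.put(1L, Arrays.asList(2L, 3L));
        refs.put(3L, Arrays.asList(1L)); // back edge forming a cycle
        System.out.println(reachableBytes(1L, sizeOf, refs));
    }
}
```

Note that an object shared between two roots is counted for both of them here; a real report would need a rule for shared state, which this sketch leaves out.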

For interactive exploration, API + Java is not the best set of tools, though. But it lets me do my job (and a 150 GiB dump leaves me no alternatives).

Should I add some JVM scripting language to the mix ...

BTW: A single pass through 150 GiB takes about 5 minutes. Meaningful analysis usually requires multiple iterations, but processing times are fairly reasonable even for that heap size.
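To give a feel for what a single sequential pass over a dump looks like at the file level, here is a minimal sketch that reads the HPROF binary header and counts top-level records by tag. It is not code from the library discussed above; `HprofRecordScan` is a hypothetical name, and the sketch skips record bodies instead of parsing the objects inside heap dump segments, which a real analysis would have to do.

```java
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: one sequential pass over the top-level HPROF records.
class HprofRecordScan {

    /** Returns the number of top-level records per tag (e.g. 0x01 UTF8, 0x1C HEAP_DUMP_SEGMENT). */
    static Map<Integer, Long> countRecords(InputStream raw) {
        try (DataInputStream in = new DataInputStream(new BufferedInputStream(raw, 1 << 20))) {
            while (in.readByte() != 0) {
                // skip the NUL-terminated version string, e.g. "JAVA PROFILE 1.0.2"
            }
            in.readInt();  // size of object identifiers in bytes
            in.readLong(); // dump timestamp
            Map<Integer, Long> counts = new TreeMap<>();
            while (true) {
                int tag = in.read(); // u1 record tag, -1 at end of stream
                if (tag < 0) {
                    return counts;
                }
                in.readInt(); // u4 time offset, unused here
                long length = in.readInt() & 0xFFFFFFFFL; // u4 body length, unsigned
                skipFully(in, length);
                counts.merge(tag, 1L, Long::sum);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static void skipFully(InputStream in, long n) throws IOException {
        while (n > 0) {
            long skipped = in.skip(n);
            if (skipped <= 0) {
                // skip() may make no progress; read one byte to detect end of stream
                if (in.read() < 0) {
                    throw new EOFException("truncated record");
                }
                skipped = 1;
            }
            n -= skipped;
        }
    }

    public static void main(String[] args) throws IOException {
        try (InputStream in = new FileInputStream(args[0])) {
            countRecords(in).forEach((tag, n) -> System.out.printf("tag 0x%02X: %d records%n", tag, n));
        }
    }
}
```

Because the stream is only read forward and nothing is indexed, the cost of a pass like this is dominated by sequential disk throughput.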

19 comments:

Vladimir Dolzhenko, February 23, 2015 at 1:40 PM
There is a project named "Calcite" that is a JDBC driver on top of anything: existing JDBC sources, CSV files, Spark and so on. Its most valuable feature is that you can plug in your own underlying data source.

Volodya Sitnikov made a MAT Calcite plugin that provides full-featured SQL syntax (with aggregation, etc.) on top of hprof files.

I think it is possible to just use his hprof JDBC driver and run SQL queries from a simple Java app.

Vladimir Dolzhenko, February 23, 2015 at 1:44 PM
Another point ... as I understand it, JVisualVM is based on the NetBeans profiler, and it can take ages to open even not-so-huge dumps, while MAT is much, much faster.

"Analyzing memory dumps of Java applications" (in Russian), hope it will be useful

Unknown, September 30, 2015 at 4:06 PM
Hey. How do I run your hprof-heap? How do I specify the dump file name/location?

Unknown, October 1, 2015 at 3:42 PM
Oh. OK. Thanks! Can you give an example please?

sherman, October 18, 2015 at 10:03 PM
You can try building indexes with MAT in console mode on the server, and then open them on your desktop. I tried it with a heap of ~30 GB.

Roger Pack, October 25, 2016 at 7:11 PM
So Netbeans' analyzer "could" do the file, it just created a large temp file, or did it run out of RAM "as expected" as it were?

Roger Pack, September 15, 2021 at 7:14 PM
Github link 404?

HV, June 20, 2023 at 12:31 PM
Tried opening a 20GB hprof using original nb code (https://github.com/aragozin/heaplib/tree/master/hprof-heap/src/main/java/org/netbeans/lib/profiler/heap) (not the fast version you had added), and good lord, it took 12 hours to parse it! It had created a ~20GB map file within 2 minutes though, but took 12 hours just to print summary of the heap.

Am I missing anything, or is that the expected speed of the vanilla version?

HV, July 14, 2023 at 1:12 PM
Does the fast version compute retainedSize? Can you share how it computes the dominator tree (and retained size)? It doesn't look like it is using Tarjan's and co's sdom() approach, and doing so via a brute force approach will take ages.