Wednesday, May 30, 2018

SJK is learning new tricks

SJK or (Swiss Java Knife) was my secret weapon for firefighting various types of performance problems for long time.

A new version of SJK was released not too long ago and it contains а bunch of new and powerful features I would like to highlight.

ttop contention monitoring

SJK is living it's name by bundling a number of tool into single executable jar. Though, ttop is a likely single most commonly used tool under SJK roof.

ttop is a kind top for threads of JVM process. Besides CPU usage counter (provided by OS) and allocation rate (tracked by JVM), a new thread contention metrics was introduced in recent SJK release.

Thread contention metrics are calculated by JVM, which counts and times when Java threads enters into BLOCKED or WAITING state.

If enabled, SJK is using these metrics to display rates and percentage of time spent in either state.

2018-05-29T14:20:03.382+0300 Process summary 
  process cpu=231.09%
  application cpu=212.78% (user=195.86% sys=16.92%)
  other: cpu=18.31% 
  thread count: 157
  GC time=4.72% (young=4.72%, old=0.00%)
  heap allocation rate 976mb/s
  safe point rate: 6.3 (events/s) avg. safe point pause: 8.24ms
  safe point sync time: 0.07% processing time: 5.09% (wallclock time)
[000180] user=19.40% sys= 0.31% wait=183.6/s(75.77%) block=    0/s( 0.00%) alloc=  110mb/s - hz._hzInstance_2_dev.cached.thread-8
[000094] user=16.92% sys= 0.16% wait=58.50/s(81.54%) block=    0/s( 0.00%) alloc=   94mb/s - hz._hzInstance_3_dev.generic-operation.thread-0
[000057] user=15.05% sys= 0.62% wait=56.91/s(82.35%) block= 0.20/s( 0.01%) alloc=   91mb/s - hz._hzInstance_2_dev.generic-operation.thread-0
[000095] user=15.21% sys= 0.00% wait=55.61/s(82.32%) block= 0.30/s( 0.04%) alloc=   87mb/s - hz._hzInstance_3_dev.generic-operation.thread-1
[000022] user=14.59% sys= 0.00% wait=56.01/s(83.42%) block= 0.30/s( 0.08%) alloc=   86mb/s - hz._hzInstance_1_dev.generic-operation.thread-1
[000058] user=13.97% sys= 0.16% wait=56.91/s(84.13%) block= 0.10/s( 0.02%) alloc=   81mb/s - hz._hzInstance_2_dev.generic-operation.thread-1

An important fact about these metrics is - CPU time + WAITING + BLOCKED should be 100% in ideal world.

In reality, you a likely to see a gap. A few reason why equation above is not holding:

  • GC pauses are freezing thread execution, but not accounted by thread contention monitoring,
  • thread may be waiting for IO operation, but it is not accounted as BLOCKED or WAITING state by JVM,
  • system may starve on CPU resource and thread is waiting for CPU core on OS level (which is also not accounted by JVM).

Contention monitoring is not enabled by default, use -c flag with ttop command to enabled it.

HTML5 based flame graph

SJK was able to produce flame graphs for sometime already. Though, old flame graphs were generated as svg with limited interactivity.

New version offers a new type of flame graphs based on HTML5 and interactive. Right in browser it allows:

  • filtering data by threads,
  • zoom into specific paths or by presence of specific frame,
  • filtering data by thread state (if state information is available).

HTML5 report is 100% self contained file with no dependencies, it can sent it by email and open on any machine. Here is an example of new flame graph you can play right now.

New flame command is used to generate HTML5 flame graphs.

`jstack` dump support

SJK is accepting a number of input data formats for thread sampling data, which is used for flame graphs and other types of performance analysis.

A new format added in 0.10 version is text thread dump formats produced by jstack. Full list of input formats now:

  • SJK native thread sampling format
  • JVisualVM sampling snapshots (.nps)
  • Java Flight Recorder recording (.jfr)
  • jstack produced text thread dumps