Concurrent Mark Sweep (CMS) is one of the garbage collection algorithms implemented in the HotSpot JVM. CMS is designed to be mostly concurrent, requiring just two quick stop-the-world pauses per old space garbage collection cycle. But if CMS cannot keep up with the application's memory allocation, it may fall back to a stop-the-world collection of the whole heap: the infamous Full GC pause. A Full GC is a very noticeable application pause; on large heaps it can take tens of seconds to collect and compact the whole heap.
There are two different failure scenarios for CMS collection (both of them lead to longer than usual stop-the-world pauses):
- concurrent mode failure,
- promotion failure.
Concurrent mode failure
At the beginning of each young GC, the collector should ensure that there is enough free memory in old space to promote aged objects from young space. The modern CMS collector estimates the size of the objects to be promoted using statistics from previous collections. If old space does not have enough free bytes to hold the estimated promotion amount, a concurrent mode failure will be raised. A concurrent mode failure doesn't necessarily lead to a Full GC; in certain cases the JVM will just wait for the concurrent collection cycle to finish, but the application will remain in an STW pause until the young collection is finished.
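To make the check described above more concrete, here is a minimal sketch of an estimate-then-compare step. It is not HotSpot's actual code: the class name PromotionEstimator, the use of a decaying average, and the 0.5 weight are all assumptions chosen for illustration.

```java
// Illustrative sketch only, not HotSpot's implementation.
// The class name and the decaying-average formula are assumptions.
class PromotionEstimator {
    private final double weight;   // weight given to the newest sample, 0..1
    private double avgPromoted;    // decaying average of promoted bytes
    private boolean hasSample;

    PromotionEstimator(double weight) {
        this.weight = weight;
    }

    // Record how many bytes the last young GC promoted into old space.
    void record(long promotedBytes) {
        if (!hasSample) {
            avgPromoted = promotedBytes;
            hasSample = true;
        } else {
            avgPromoted = weight * promotedBytes + (1.0 - weight) * avgPromoted;
        }
    }

    // Estimated promotion volume for the next young GC, in bytes.
    long estimate() {
        return (long) Math.ceil(avgPromoted);
    }

    // True if old space has fewer free bytes than the estimated promotion.
    boolean wouldFail(long oldFreeBytes) {
        return oldFreeBytes < estimate();
    }

    public static void main(String[] args) {
        PromotionEstimator e = new PromotionEstimator(0.5);
        e.record(100L << 20); // 100 MiB promoted by the first young GC
        e.record(200L << 20); // 200 MiB promoted by the second young GC
        System.out.println("estimate = " + (e.estimate() >> 20) + " MiB");
        System.out.println("fails with 120 MiB free: " + e.wouldFail(120L << 20));
    }
}
```

With the two made-up samples above, the estimate is 150 MiB, so 120 MiB of free old space would be reported as insufficient.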
The most frequent reason for a concurrent mode failure is late initiation of the CMS cycle. The JVM tries to estimate the amount of garbage in the heap and the duration of the CMS cycle, and starts the cycle as late as possible to avoid wasting CPU. Unfortunately, this estimate may be too optimistic. You can advise the JVM to start CMS earlier using the following flags:
-XX:CMSInitiatingOccupancyFraction=30
-XX:+UseCMSInitiatingOccupancyOnly
The settings above will force a CMS cycle to start once more than 30% of old space is in use. The second option disables the JVM heuristics; without it, the JVM may not obey the CMS initiating occupancy fraction setting.
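If you want to see how your own application sits relative to such a threshold, the standard java.lang.management API exposes old space usage. The sketch below is only an illustration: the class name OldGenOccupancy is made up, and matching pools by "Old" or "Tenured" in the pool name is an assumption about how your JVM names its old generation pool.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

// Illustrative sketch; class and method names are made up for this example.
class OldGenOccupancy {
    // Occupancy as a percentage of capacity; -1 if the capacity is unknown.
    static double occupancyPercent(long usedBytes, long capacityBytes) {
        if (capacityBytes <= 0) return -1.0;
        return 100.0 * usedBytes / capacityBytes;
    }

    // True if occupancy is strictly above the given percentage.
    static boolean aboveThreshold(long usedBytes, long capacityBytes, int thresholdPercent) {
        return occupancyPercent(usedBytes, capacityBytes) > thresholdPercent;
    }

    public static void main(String[] args) {
        int threshold = 30; // same value as the occupancy fraction in the text
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) continue;
            String name = pool.getName();
            // Assumption: the old generation pool has "Old" or "Tenured" in its name.
            if (!name.contains("Old") && !name.contains("Tenured")) continue;
            MemoryUsage u = pool.getUsage();
            if (u == null) continue; // pool is no longer valid
            // getMax() is -1 when undefined, so fall back to committed memory.
            long capacity = u.getMax() > 0 ? u.getMax() : u.getCommitted();
            System.out.printf("%s: %.1f%% of capacity used, above %d%%: %b%n",
                    name,
                    occupancyPercent(u.getUsed(), capacity),
                    threshold,
                    aboveThreshold(u.getUsed(), capacity, threshold));
        }
    }
}
```

This only observes occupancy from inside the application; it does not change when the JVM starts a CMS cycle.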
Normally, for server-type applications, you would like CMS to run continuously. If you are experiencing concurrent mode failures even though the next CMS cycle starts right after the previous one, it means that CMS throughput is simply not sufficient. In this case you should increase the size of the old generation and give the CMS collector more headroom to do its job. Alternatively, you may try to dedicate more CPU cores to the concurrent collector, but CPU is usually an even more limited resource on modern servers than memory.
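A rough way to reason about headroom is to compare the volume promoted during one concurrent cycle with the size of old space. The sketch below is a deliberately simplified model, not something the JVM computes: it assumes a constant promotion rate and a fixed cycle duration, and the class name CmsHeadroom and all numbers are made up.

```java
// Back-of-the-envelope model; names and numbers are assumptions for illustration.
class CmsHeadroom {
    // Bytes promoted into old space while one concurrent cycle is running.
    static long promotedDuringCycle(long promotionBytesPerSec, double cycleSeconds) {
        return (long) Math.ceil(promotionBytesPerSec * cycleSeconds);
    }

    // Highest old space occupancy (in percent) at which a cycle can start and
    // still finish before old space fills up. A negative result means old
    // space is too small for this promotion rate and cycle duration.
    static double latestSafeStartPercent(long oldCapacityBytes,
                                         long promotionBytesPerSec,
                                         double cycleSeconds) {
        long needed = promotedDuringCycle(promotionBytesPerSec, cycleSeconds);
        return 100.0 * (oldCapacityBytes - needed) / oldCapacityBytes;
    }

    public static void main(String[] args) {
        long oldCapacity = 4L << 30;    // 4 GiB old space (made-up value)
        long promotionRate = 50L << 20; // 50 MiB/s promoted (made-up value)
        double cycleSeconds = 20.0;     // CMS cycle duration (made-up value)
        System.out.printf("needed headroom: %d MiB%n",
                promotedDuringCycle(promotionRate, cycleSeconds) >> 20);
        System.out.printf("latest safe start: %.1f%% occupancy%n",
                latestSafeStartPercent(oldCapacity, promotionRate, cycleSeconds));
    }
}
```

The model ignores fragmentation and the memory freed by the cycle itself, so treat its result as an optimistic upper bound on the occupancy threshold.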
In summary, there are two reasons for the concurrent mode failure STW pause mentioned above (late initiation of the CMS cycle and insufficient CMS throughput), and both of them can be remedied fairly easily with JVM options.
Promotion failure
Promotion failure is a more complicated situation. The CMS collector does not compact free memory in old space; instead it has to deal with fragmented free space (a set of free memory chunks). It is possible that all free bytes are scattered across small chunks, so that it is impossible to find a large enough piece of contiguous memory to promote a particular large object, even though the total number of free bytes is large enough.
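The difference between "enough free bytes" and "enough contiguous free bytes" can be shown with a tiny sketch. The class name FragmentedSpace and the chunk sizes are made up for illustration; real old space bookkeeping is far more involved.

```java
import java.util.Arrays;

// Illustrative sketch; free space is modelled as an array of chunk sizes in bytes.
class FragmentedSpace {
    // Sum of all free chunk sizes.
    static long totalFree(long[] chunks) {
        return Arrays.stream(chunks).sum();
    }

    // Size of the largest contiguous free chunk, or 0 if there are none.
    static long largestChunk(long[] chunks) {
        return Arrays.stream(chunks).max().orElse(0L);
    }

    // An object fits only if a single chunk is large enough to hold it.
    static boolean canPromote(long[] chunks, long objectSize) {
        return largestChunk(chunks) >= objectSize;
    }

    public static void main(String[] args) {
        long[] chunks = new long[1000];
        Arrays.fill(chunks, 64L); // 1000 free chunks of 64 bytes each
        System.out.println("total free bytes: " + totalFree(chunks));
        System.out.println("can promote 1024-byte object: " + canPromote(chunks, 1024L));
    }
}
```

Here 64000 bytes are free in total, yet a 1024-byte object cannot be promoted because no single chunk is larger than 64 bytes.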
Heap fragmentation is a well-known problem, and there are a few effective techniques for reducing it.
The CMS memory manager uses separate free lists for different chunk sizes. Using these free lists, it can efficiently fill small holes in fragmented memory space with objects of exactly matching size. This technique is known to be fairly effective (and is widely used in C/C++ memory managers). But, surprisingly, it doesn't seem to work well for the JVM in many real-life situations.
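The idea of size-segregated free lists can be sketched in a few lines. This is not the data structure CMS actually uses; the class name SizeClassFreeLists and the exact-fit-only policy are simplifying assumptions that only illustrate how a hole is handed to an object of the same size.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of size-segregated free lists (exact fit only).
class SizeClassFreeLists {
    // Chunk size in heap words -> start addresses of free chunks of that size.
    private final Map<Integer, Deque<Long>> lists = new HashMap<>();

    // Return a chunk to the free list for its size.
    void free(long address, int sizeWords) {
        lists.computeIfAbsent(sizeWords, k -> new ArrayDeque<>()).push(address);
    }

    // Take a chunk of exactly the requested size; -1 if none is available.
    long allocateExact(int sizeWords) {
        Deque<Long> list = lists.get(sizeWords);
        if (list == null || list.isEmpty()) return -1L;
        long address = list.pop();
        if (list.isEmpty()) lists.remove(sizeWords);
        return address;
    }

    // Number of free chunks currently available for a given size.
    int freeChunks(int sizeWords) {
        Deque<Long> list = lists.get(sizeWords);
        return list == null ? 0 : list.size();
    }

    public static void main(String[] args) {
        SizeClassFreeLists fl = new SizeClassFreeLists();
        fl.free(100L, 4); // a 4-word hole left by a dead object
        fl.free(300L, 8); // an 8-word hole
        System.out.println("4-word object placed at " + fl.allocateExact(4));
        System.out.println("6-word object placed at " + fl.allocateExact(6));
    }
}
```

A 4-word object lands exactly in the 4-word hole, while the 6-word request finds no exact fit and gets -1; a real allocator would then fall back to splitting a larger chunk.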
What is wrong with the way the JVM allocates memory in old space?
Let me step aside and explain some details of JVM memory management. You may know that the JVM uses TLABs (thread local allocation buffers) to avoid concurrent allocation from a single memory source by multiple threads. The same idea is also used by the parallel young space collector. Collecting threads copy live objects (young collection is a copy collection), and they need to allocate memory for the objects being copied. To avoid CPU cache contention (due to synchronization of caches between CPUs), each thread has two PLABs (parallel or promotion local allocation buffers) of its own: one for young space and one for old space. Using a PLAB in young space makes total sense, since free memory is always contiguous there, but using a PLAB for old space in practice makes a huge contribution to fragmentation.
Let me illustrate this with an example.
Each collector thread allocates a chunk of contiguous memory in old space. Then, during collection, these chunks are filled with objects of different sizes (large objects are allocated directly in old space, but small objects are placed in the PLAB).
Once the young collection is finished, our PLABs are full of objects. Then, after some time, a CMS cycle takes place and the sweeper marks a few dead objects as free space, creating small free memory chunks.
But on the next young collection, instead of reusing the small chunks of memory, each GC thread will allocate a new large contiguous block and fill it with small objects.
Effectively, due to the use of PLABs, free lists are not used for small objects, and small chunks will never be reused (until the JVM can coalesce them into larger chunks). In practice, our applications have a lot of small objects that can live long enough to die in old space. Strings and hash map entries are just a few such cases.
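The effect can be reproduced with a toy model. The sketch below is a simplification of my own, not a description of HotSpot internals: free space is just a list of chunk sizes in heap words, each PLAB is carved from the first chunk large enough to hold a whole buffer, and the class name PlabSimulation and all sizes are made up.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of promotion through PLABs; all names and sizes are assumptions.
class PlabSimulation {
    // Fragmented old space: 100 holes of 4 words plus one 1024-word chunk.
    static List<Integer> fragmentedSpace() {
        List<Integer> free = new ArrayList<>();
        for (int i = 0; i < 100; i++) free.add(4);
        free.add(1024);
        return free;
    }

    // Promote `count` objects of `objWords` each, carving PLABs of `plabWords`
    // (plabWords must be >= objWords). Mutates `free` and returns the number
    // of objects that could not be placed.
    static int promote(List<Integer> free, int objWords, int count, int plabWords) {
        int remaining = count;
        while (remaining > 0) {
            int idx = -1;
            for (int i = 0; i < free.size(); i++) {
                if (free.get(i) >= plabWords) { idx = i; break; }
            }
            if (idx < 0) break; // no chunk can hold a PLAB: promotion failure
            int chunk = free.get(idx);
            if (chunk == plabWords) free.remove(idx);
            else free.set(idx, chunk - plabWords);
            remaining -= plabWords / objWords; // objects that fit in one PLAB
        }
        return Math.max(remaining, 0);
    }

    // Number of free chunks of exactly `sizeWords`.
    static int countChunks(List<Integer> free, int sizeWords) {
        int n = 0;
        for (int c : free) if (c == sizeWords) n++;
        return n;
    }

    public static void main(String[] args) {
        List<Integer> large = fragmentedSpace();
        promote(large, 4, 100, 256);
        System.out.println("PLAB of 256 words: small holes left = " + countChunks(large, 4));

        List<Integer> small = fragmentedSpace();
        promote(small, 4, 100, 4);
        System.out.println("PLAB of 4 words: small holes left = " + countChunks(small, 4));
    }
}
```

With 256-word PLABs all 100 small holes stay unused and the big chunk shrinks from 1024 to 512 words; with PLABs the size of a single object, every hole is reused and the big chunk stays intact.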
Ironically, while CMS has sophisticated free list machinery, it makes that machinery ineffective by using PLABs.
Can we work around this issue?
Yes, we can reduce the size of the PLAB, improving reuse of small chunks.
-XX:OldPLABSize=16
-XX:-ResizeOldPLAB
The options above will force the JVM to use PLABs as small as 16 heap words (a heap word is 4 bytes on a 32-bit JVM and 8 bytes on a 64-bit JVM).
My experiments have shown a drastic reduction of heap fragmentation in a few applications with known promotion failure problems. Though the options above will help you reduce fragmentation, I would recommend using them only if you HAVE problems with fragmentation. PLABs are important from a multi-core perspective, and such soft disabling of PLABs may have a serious impact on young collection pause time.
How to diagnose problems with fragmentation?
Monitoring heap fragmentation
Below are options that will help you diagnose and monitor problems with CMS heap fragmentation.
-XX:+PrintGCDetails
-XX:+PrintPromotionFailure
-XX:PrintFLSStatistics=1
PrintGCDetails is a MUST HAVE baseline GC diagnostic option. PrintPromotionFailure will print a few more details about promotion failures.
PrintFLSStatistics=1 will print free list statistics on each young or old collection.
The most important parameter of the FLS (free list space) statistics is the max chunk size. If the max chunk size is decreasing over time, that means fragmentation of the heap is increasing, and so is the risk of promotion failure.
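Once you have collected max chunk size values from successive collections, checking the trend is simple arithmetic. The sketch below fits a least-squares line through the samples and looks at the sign of the slope. The class name MaxChunkTrend is made up, the sample values are invented rather than taken from a real log, and parsing the GC log itself is left out.

```java
// Illustrative sketch; the samples are assumed to be max chunk sizes taken
// from successive collections, in the same unit, oldest first.
class MaxChunkTrend {
    // Least-squares slope of max chunk size per sample; 0 for fewer than 2 samples.
    static double slope(long[] maxChunkSizes) {
        int n = maxChunkSizes.length;
        if (n < 2) return 0.0;
        double meanX = (n - 1) / 2.0;
        double meanY = 0.0;
        for (long v : maxChunkSizes) meanY += v;
        meanY /= n;
        double num = 0.0;
        double den = 0.0;
        for (int i = 0; i < n; i++) {
            num += (i - meanX) * (maxChunkSizes[i] - meanY);
            den += (i - meanX) * (i - meanX);
        }
        return num / den;
    }

    // A negative slope means the largest free chunk keeps shrinking.
    static boolean fragmentationGrowing(long[] maxChunkSizes) {
        return slope(maxChunkSizes) < 0;
    }

    public static void main(String[] args) {
        long[] samples = {100, 90, 80, 70}; // invented values, not from a real log
        System.out.println("slope per collection: " + slope(samples));
        System.out.println("fragmentation growing: " + fragmentationGrowing(samples));
    }
}
```

A single noisy sample does not mean much, so it is worth feeding this a long series covering several CMS cycles.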
Let me summarize how to reduce the risk of concurrent mode and promotion failures when using the CMS collector.
1. Give CMS enough headroom: more free space means less risk of fragmentation. Remember that any garbage collector needs some extra space to work efficiently.
2. Let CMS run continuously using the initiating occupancy settings (you may want to reduce the number of concurrent threads, though).
3. If you are still having problems, reduce the old PLAB size.
4. Avoid sporadic allocation of very large objects in the application (resizing a large hash table is a good example). Regular allocation of large objects is OK, though.