Week 2

This document summarizes work done/being done in week 2.

1 Ideas

This section is used as sort of a sounding-board for any ideas that were discussed/picked up this week.

1.1 Profile-Guided Optimizations

With the cost of liveness based collection being rather high in terms of time taken, we should use it only when we need to save big on space. This suggests that we should look to our actual use cases to decide where to collect using liveness, and where to collect using reachability.

I thought of two ways to do this.

1.1.1 Compile-time

This was inspired from certain firefox builds.

This involves three stages. Building a (possibly) instrumented binary, running it (on a possibly instrumented JVM) against a set of typical use cases (or profiles), and noting down the allocation sites which have the largest objects reachable-but-not-live for long intervals of time.

The final step is compiling again, this time using information from the instrumentation phase to determine what areas will use liveness based collection and which areas can use reachability.

The actual implementation could done in several ways, here are some tools which might help. The actual profiling needs me to count the time for all objects reachable into their size – versus the same quantity for objects which are live. Time could be measured as total number of GC bytes allocated since startup, or wall clock time. See this paper for the idea behind profiling this way.

  1. JVMTI JVM Tool Interface allows me to write native clients which can connect to a JVM and help me take actions on the VM as well as attaching callbacks to events of interest. I could attach a callbacks to object accesses, to maintain last use information, and there's even a heap traversal function, which I could invoke to gather reachability info, and then compare the two.
  2. Java Agent This is similar to the above in that it attaches to a JVM, but this is actually runs in the VM too, and allows me to make runtime modifications to the bytecode. This could be used to add instrumentation instructions around opcodes of interest, however, I could not find any way to analyze the heap in this.
  3. Using JikesRVM (instrumenting instructions by changing instruction emit code). We have a complete control over GC in this method, but I've not been able to properly understand instrumentation, further, this doesn't offer easy callbacks like JVMTI.

The advantage to this method is no runtime costs, but at the same time a decent "representative" input set is needed to obtain correct information.

1.1.2 Runtime

This approach seeks to combine the three stages of the above system into the JVM itself. It is possible to know when the workload of the application changes. This is called adaptive optimization, and in JikesRVM, this is done by finding "hot" methods.

Whenever the workload changes, we can also begin instrumenting code for a while, especially in those areas which are "hot". After a few (slower than usual) executions, we will have a dynamically generated list of allocation sites with the sorted by size of dead, reachable objects into the time they exist for, and we can decide where we need to use liveGC.

The disadvantage is the cost of instrumentation. Also, if we mistakenly add liveGC to an allocation site where it is not needed, it may lead to a slow down of the code. The advantage is that we can deal with a large number of workloads without prior knowledge.

1.2 Generational GC

As discussed on last Tuesday, it might make sense to collect the "short lived" nursery using liveness during the "minor" GC cycle, since the number of objects is likely to be smaller over there, and then have occasional "major" GCs which still use reachability.

This works well for that code which assigns a large number of objects, and either uses them once or does not use them at all. I spent some time reading about Generational GCs, mainly this and this.

1.3 Last Use for Large Objects

For every use of an object, we update a "last used" field in the header, and in every mark phase of our GC, we check this field. If the object is large, and the last use is very long ago (as compared to other objects on the heap), then it might make sense to check liveness.

Here we can use a 2-way BFS to speed the search up. If the object is found to be live, we update its last-used and the last-used of all the objects in its path to avoid a re-traversal the next time.

2 Readings

There are some papers and a ton of technical documentation, since I worked a lot on familiarizing myself with tools to use.

Author: Milind Luthra

Emacs 25.3.1 (Org mode 8.2.10)