We’ve recently been performing some memory optimisations on OneMusicAPI to improve the performance of the service.

One aspect we explored was the memory use of cached code. We use a lot of Scala in the codebase and were concerned that many classes were being generated, cached, and using up memory space.

So we dived into some lesser-known Java tools to analyse how much memory is being consumed in the code cache.

What’s the code cache?

The Java Virtual Machine contains a just-in-time compiler which compiles frequently-run bytecode to native code for faster execution. The compiled native code is kept in memory, in a code cache.

The jcmd command, among other things, provides a listing of the compiled methods that are stored in the code cache.

We can generate the compiled code listing using the jcmd <pid> Compiler.codelist command. The output of the above command looks like this (truncated for brevity):

$ jcmd <pid> Compiler.codelist
<pid>:
3 0 0 jdk.internal.misc.Unsafe.getObjectVolatile(Ljava/lang/Object;J)Ljava/lang/Object; [0x00007f4f4c438010, 0x00007f4f4c4381c0 - 0x00007f4f4c438440]
14 0 0 jdk.internal.misc.Unsafe.compareAndSetLong(Ljava/lang/Object;JJJ)Z [0x00007f4f4c438490, 0x00007f4f4c438640 - 0x00007f4f4c438850]
16 0 0 jdk.internal.misc.Unsafe.compareAndSetObject(Ljava/lang/Object;JLjava/lang/Object;Ljava/lang/Object;)Z [0x00007f4f4c438890, 0x00007f4f4c438a40 - 0x00007f4f4c438c78]
23 4 0 java.lang.Object.<init>()V [0x00007f4f4c438c90, 0x00007f4f4c438e20 - 0x00007f4f4c438e98]
20 1 0 java.lang.module.ModuleReference.descriptor()Ljava/lang/module/ModuleDescriptor; [0x00007f4f4c438f10, 0x00007f4f4c4390c0 - 0x00007f4f4c439190]
21 1 0 java.lang.module.ModuleDescriptor.name()Ljava/lang/String; [0x00007f4f4c439210, 0x00007f4f4c4393c0 - 0x00007f4f4c439490]

The C++ source code that generates the list is available on GitHub, but we weren’t able to find any documentation or web pages describing what the output columns represent. However, the source code shows that each line has the following format:

compile_id comp_level get_state method_name [header_begin, code_begin - code_end]

The values for header_begin, code_begin, and code_end are memory addresses in hexadecimal (base 16).
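As an illustration, a line in this format can be split into its fields with a regular expression. The sketch below is our own reading of the column layout, not something taken from the JVM source; the names LINE_RE, CodeListEntry and parse_line are hypothetical, and it assumes the method name never contains whitespace.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed layout: compile_id comp_level get_state method_name [header_begin, code_begin - code_end]
LINE_RE = re.compile(
    r"^(\d+)\s+(\d+)\s+(-?\d+)\s+(\S+)\s+"
    r"\[(0x[0-9a-fA-F]+),\s*(0x[0-9a-fA-F]+)\s*-\s*(0x[0-9a-fA-F]+)\]$"
)


@dataclass
class CodeListEntry:
    compile_id: int
    comp_level: int
    state: int
    method_name: str
    header_begin: int  # addresses are stored as plain integers
    code_begin: int
    code_end: int


def parse_line(line: str) -> Optional[CodeListEntry]:
    """Return the parsed entry, or None for lines that do not match (e.g. the '<pid>:' header)."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    cid, level, state, name, header, begin, end = m.groups()
    return CodeListEntry(
        int(cid), int(level), int(state), name,
        int(header, 16), int(begin, 16), int(end, 16),  # int(..., 16) accepts the 0x prefix
    )


SAMPLE_LINE = (
    "3 0 0 jdk.internal.misc.Unsafe.getObjectVolatile(Ljava/lang/Object;J)Ljava/lang/Object; "
    "[0x00007f4f4c438010, 0x00007f4f4c4381c0 - 0x00007f4f4c438440]"
)

entry = parse_line(SAMPLE_LINE)
print(entry.compile_id, entry.comp_level, entry.state)
print(entry.method_name)
```

Lines that do not fit the pattern are returned as None rather than raising, so the header line and any truncated output can simply be skipped.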

Given that we aren’t C++ programmers, it was difficult to go through the code base looking for further clues about what each field means. For our use case, we only needed the total size of the compiled code for each base package, so the column names were enough.
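A per-package total of this kind could be scripted roughly as follows. This is a minimal sketch rather than the approach we actually used: it assumes the code size of an entry is code_end minus code_begin, and that "base package" means the first two dot-separated segments of the method name (e.g. java.lang). The names ENTRY_RE, base_package and size_by_package are hypothetical, and SAMPLE holds three lines from the listing above.

```python
import re
from collections import defaultdict

# Three lines copied from the listing above.
SAMPLE = """\
3 0 0 jdk.internal.misc.Unsafe.getObjectVolatile(Ljava/lang/Object;J)Ljava/lang/Object; [0x00007f4f4c438010, 0x00007f4f4c4381c0 - 0x00007f4f4c438440]
23 4 0 java.lang.Object.<init>()V [0x00007f4f4c438c90, 0x00007f4f4c438e20 - 0x00007f4f4c438e98]
21 1 0 java.lang.module.ModuleDescriptor.name()Ljava/lang/String; [0x00007f4f4c439210, 0x00007f4f4c4393c0 - 0x00007f4f4c439490]
"""

# Captures the method name, code_begin and code_end; the other columns are skipped.
ENTRY_RE = re.compile(
    r"^\d+\s+\d+\s+-?\d+\s+(\S+)\s+"
    r"\[0x[0-9a-fA-F]+,\s*(0x[0-9a-fA-F]+)\s*-\s*(0x[0-9a-fA-F]+)\]$"
)


def base_package(method_name: str, depth: int = 2) -> str:
    """Assumption: the base package is the first `depth` segments of the dotted name."""
    qualified = method_name.split("(", 1)[0]  # drop the JVM signature
    return ".".join(qualified.split(".")[:depth])


def size_by_package(listing: str, depth: int = 2) -> dict:
    """Sum code_end - code_begin (in bytes) per base package, ignoring non-matching lines."""
    totals = defaultdict(int)
    for line in listing.splitlines():
        m = ENTRY_RE.match(line.strip())
        if m is None:
            continue
        name, begin, end = m.groups()
        totals[base_package(name, depth)] += int(end, 16) - int(begin, 16)
    return dict(totals)


totals = size_by_package(SAMPLE)
# Largest consumers first.
for package, size in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(package, size)
```

The depth parameter is there because what counts as a "base package" depends on the code base; for a Scala or Java application a depth of 2 or 3 is usually enough to separate your own code from library code.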

Using the first line of the output above as an example, we can calculate the code size as follows:

With an advanced calculator program (e.g. SpeedCrunch):
Simply enter the hex values and subtract them. Calculating 0x00007f4f4c438440 - 0x00007f4f4c4381c0 gives 0x280, which is 640 in base 10.
With a spreadsheet (e.g. when calculating the code size for all the rows in the output):
  1. Truncate the first 8 characters of the memory addresses, keeping the last 10 characters. For example, 0x00007f4f4c4381c0 would be truncated to 4f4c4381c0. This is because most spreadsheets cannot handle arbitrary precision numbers. (Note: there is a small chance that a code_begin value might be something like 0x000012ffffffffff, making the final code size result negative, but we have not encountered that.)
  2. Convert the hex characters to decimal. Most spreadsheets have a built-in function for this, e.g. HEX2DEC. Continuing the above example, 4f4c4381c0 becomes 340581908928 and 4f4c438440 becomes 340581909568.
  3. Subtract the values. The result is 640 for the above example.
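The same arithmetic can be checked in a few lines of Python, shown here only as a sketch to confirm the numbers above. Python integers are arbitrary precision, so the full addresses can be subtracted directly; the second calculation mirrors the spreadsheet steps by keeping only the last 10 hex digits.

```python
code_begin = "0x00007f4f4c4381c0"
code_end = "0x00007f4f4c438440"

# Full-width subtraction: no truncation needed.
full_size = int(code_end, 16) - int(code_begin, 16)

# Spreadsheet-style: keep the last 10 hex digits, convert, then subtract.
begin_truncated = int(code_begin[-10:], 16)  # 4f4c4381c0 -> 340581908928
end_truncated = int(code_end[-10:], 16)      # 4f4c438440 -> 340581909568
truncated_size = end_truncated - begin_truncated

print(full_size, truncated_size)  # prints: 640 640
```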

Coming across a question that doesn’t have an answer on the web is a rare occurrence these days. While most people using jcmd Compiler.codelist will be able to figure out the answer themselves, we hope that this blog post ends up being helpful to someone like us.

By Waqqas Dadabhoy

Photo by Immo Wegmann on Unsplash
