Tenderlove Making

Reducing Memory Usage in Ruby

I’ve been working on building a compacting garbage collector in Ruby for a while now, and one of the biggest hurdles for implementing a compacting GC is updating references. For example, if Object A points to Object B, but the compacting GC moves Object B, how do we make sure that Object A points to the new location?

Solving this problem has been fairly straightforward for most objects. Ruby’s garbage collector knows about the internals of most Ruby objects, so after the compactor runs, it just walks through all objects and updates their internals to point at the new locations of any moved objects. If the GC doesn’t know about the internals of some object (for example, an object implemented in a C extension), it doesn’t allow things referred to by that object to move. For example, Object A points to Object B. If the GC doesn’t know how to update the internals of Object A, it won’t allow Object B to move (I call this “pinning” an object).
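
To make the pinning rule concrete, here is a minimal sketch of it as a toy model. ToyObject, gc_knows_layout, and pinned_objects are names made up for this illustration; none of this is how the real collector is written.

```ruby
# Toy model of a heap object. `gc_knows_layout` is a hypothetical flag that
# says whether the GC could rewrite this object's references after a move.
ToyObject = Struct.new(:name, :references, :gc_knows_layout)

# Anything referenced by an object the GC cannot rewrite must stay put.
def pinned_objects(heap)
  heap.reject(&:gc_knows_layout).flat_map(&:references).uniq
end

b = ToyObject.new("B", [], true)
c = ToyObject.new("C", [], true)
a = ToyObject.new("A", [b], false) # e.g. implemented in a C extension
d = ToyObject.new("D", [c], true)
heap = [a, b, c, d]

pinned  = pinned_objects(heap)
movable = heap - pinned

puts "pinned:  #{pinned.map(&:name).join(', ')}"
puts "movable: #{movable.map(&:name).join(', ')}"
```

In this model B is pinned because A refers to it and A can’t be updated, while C is free to move because D can be fixed up afterwards.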

Of course, the more objects we allow to move, the better.

Earlier I wrote that updating references for most objects is fairly straightforward. Unfortunately there has been one thorn in my side for a while, and that has been Instruction Sequences.

Instruction Sequences

When your Ruby code is compiled, it is turned into instruction sequence objects, and those objects are Ruby objects. Typically you don’t interact with these Ruby objects, but they are there. These objects store the bytecode for your Ruby program, any literals in your code, and some other miscellaneous information about the code that was compiled (source location, coverage info, etc).

Internally, these instruction sequence objects are referred to as “IMEMO” objects. There are multiple sub-types of IMEMO objects, and the instruction sequence sub-type is “iseq”. If you are using Ruby 2.5 and you dump the heap using ObjectSpace, you’ll see the dump now contains these IMEMO sub-types. Let’s look at an example.

I’ve been using the following code to dump the heap in a Rails application:

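require 'objspace'
# Boot the Rails app. This assumes the script lives in the application root;
# a bare require of a relative path is not resolved against the current directory.
require_relative 'config/environment'

# Write one JSON record per heap object to output.txt
File.open('output.txt', 'w') do |f|
  ObjectSpace.dump_all(output: f)
end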

The above code outputs all objects in memory to a file called “output.txt” in JSON lines format. Here are a couple of IMEMO records from a Rails heap dump (expanded across multiple lines for readability):

{
  "address": "0x7fc89d00c400",
  "type": "IMEMO",
  "class": "0x7fc89e95c130",
  "imemo_type": "ment",
  "memsize": 40,
  "flags": {
    "wb_protected": true,
    "old": true,
    "uncollectible": true,
    "marked": true
  }
}
{
  "address": "0x7fc89d00c2e8",
  "type": "IMEMO",
  "imemo_type": "iseq",
  "references": [
    "0x7fc89d00c270",
    "0x7fc89e989a68",
    "0x7fc89e989a68",
    "0x7fc89d00ef48"
  ],
  "memsize": 40,
  "flags": {
    "wb_protected": true,
    "old": true,
    "uncollectible": true,
    "marked": true
  }
}

This example came from Ruby 2.5, so both records contain an imemo_type field. The first example is a “ment” or “method entry”, and the second example is an “iseq” or an “instruction sequence”. Today we’ll look at instruction sequences.
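
If you want a quick feel for how many of each IMEMO sub-type a process holds, something like the following sketch can tally them. It dumps the heap of the current process to a temporary file rather than a Rails app, and imemo_counts is a helper name I made up for this example. It assumes a Ruby new enough (2.5 or later) to emit the imemo_type field.

```ruby
require 'json'
require 'objspace'
require 'tmpdir'

# Dump the current process's heap and count IMEMO records by sub-type.
def imemo_counts
  counts = Hash.new(0)
  Dir.mktmpdir do |dir|
    path = File.join(dir, 'heap.json')
    File.open(path, 'w') { |f| ObjectSpace.dump_all(output: f) }

    File.foreach(path) do |line|
      record = begin
        JSON.parse(line)
      rescue JSON::ParserError
        next # skip any line that is not valid JSON
      end
      next unless record['type'] == 'IMEMO'

      counts[record['imemo_type'] || 'unknown'] += 1
    end
  end
  counts
end

# Print the sub-types, most common first.
imemo_counts.sort_by { |_, count| -count }.each do |subtype, count|
  puts format('%-12s %d', subtype, count)
end
```

The exact sub-types and counts depend on your Ruby version and on how much code has been loaded.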

Format of Instruction Sequence

The instruction sequences are the result of compiling our Ruby code: they are a binary representation of that code. These instructions are stored on the instruction sequence object, specifically in the iseq_encoded field (iseq_size is the length of the iseq_encoded field).

If you examine iseq_encoded, you’ll find it’s just a list of numbers. The numbers are virtual machine instructions along with the parameters (operands) for those instructions.

If we examine the iseq_encoded list, it might look something like this:

Index  Address             Description
0      0x00000001001cddad  Instruction (0 operands)
1      0x00000001001cdeee  Instruction (2 operands)
2      0x00000001001cdf1e  Operand
3      0x000000010184c400  Operand
4      0x00000001001cdeee  Instruction (2 operands)
5      0x00000001001c8040  Operand
6      0x0000000100609e40  Operand
7      0x0000000100743d10  Instruction (1 operand)
8      0x00000001001c8040  Operand
9      0x0000000100609e50  Instruction (1 operand)
10     0x0000000100743d38  Operand

Each element of the list corresponds to either an instruction, or the operands for an instruction. All of the operands for an instruction follow that instruction in the list. The operands are anything required for executing the corresponding instruction, including Ruby objects. In other words, some of these addresses could be addresses for Ruby objects.
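
Here is a rough sketch of how a flat list like that can be walked. It is a toy: symbols stand in for instruction addresses, and OPERAND_COUNTS is a hypothetical table invented for this example (the real VM keeps its own table of instruction lengths).

```ruby
# Hypothetical operand counts for a few instruction names.
OPERAND_COUNTS = {
  putself: 0,
  putstring: 1,
  opt_send_without_block: 2,
  leave: 0
}.freeze

# Walk a flat list, yielding each instruction together with its operands.
def each_instruction(encoded)
  return enum_for(:each_instruction, encoded) unless block_given?

  index = 0
  while index < encoded.length
    name = encoded[index]
    operand_count = OPERAND_COUNTS.fetch(name)
    yield name, encoded[index + 1, operand_count]
    index += 1 + operand_count # skip past this instruction's operands
  end
end

encoded = [:putself, :putstring, "hello world",
           :opt_send_without_block, :callinfo, :callcache, :leave]

each_instruction(encoded) do |name, operands|
  puts "#{name} #{operands.inspect}"
end
```

The important part is that the walker has to know how many operands each instruction takes. Without that, there is no way to tell an instruction from an operand just by looking at the numbers.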

Since some of these addresses could be Ruby objects, it means that instruction sequences reference Ruby objects. But, if instruction sequences reference Ruby objects, how do the instruction sequences prevent those Ruby objects from getting garbage collected?

Liveness and Code Compilation

As I said, instruction sequences are the result of compiling your Ruby code. During compilation, some parts of your code are converted to Ruby objects and then the addresses for those objects are embedded in the bytecode. Let’s take a look at an example of when a Ruby object will be embedded in instruction sequences, then look at how those objects are kept alive.

Our sample code is just going to be puts "hello world". We can use RubyVM::InstructionSequence to compile the code, then disassemble it. Disassembly decodes iseq_encoded and prints out something more readable.

>> insns = RubyVM::InstructionSequence.compile 'puts "hello world"'
=> <RubyVM::InstructionSequence:<compiled>@<compiled>>
>> puts insns.disasm
== disasm: #<ISeq:<compiled>@<compiled>>================================
0000 trace            1                                               (   1)
0002 putself          
0003 putstring        "hello world"
0005 opt_send_without_block <callinfo!mid:puts, argc:1, FCALL|ARGS_SIMPLE>, <callcache>
0008 leave            
=> nil
>>

Instruction 0003 is the putstring instruction. Let’s look at the definition of the putstring instruction, which can be found in insns.def:

/* put string val. string will be copied. */
DEFINE_INSN
putstring
(VALUE str)
()
(VALUE val)
{
    val = rb_str_resurrect(str);
}

When the virtual machine executes, it will jump to the location of the putstring instruction, decode operands, and provide those operands to the instruction. In this case, the putstring instruction has one operand called str which is of type VALUE, and one return value called val which is also of type VALUE. The instruction body itself simply calls rb_str_resurrect, passing in str, and assigning the return value to val. rb_str_resurrect just duplicates a Ruby string. So this instruction takes a Ruby object (a string which has been stored in the instruction sequences), duplicates that string, then the virtual machine pushes that duplicated string onto the stack. For a fun exercise, try going through this process with puts "hello world".freeze and take a look at the difference.
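
You can observe the duplication from plain Ruby by comparing object identity. This is a small sketch, assuming CRuby with frozen string literals not enabled; plain_literal and frozen_literal are just names for this example.

```ruby
# Each call evaluates the literal again, so the VM hands back a fresh copy
# of the string stored in the instruction sequence.
def plain_literal
  "hello world"
end

# With .freeze on the literal, the compiler can hand back the stored string
# itself instead of copying it.
def frozen_literal
  "hello world".freeze
end

# equal? compares object identity, not contents.
puts "plain literal, same object each time?  #{plain_literal.equal?(plain_literal)}"
puts "frozen literal, same object each time? #{frozen_literal.equal?(frozen_literal)}"
```

The two results differ, which is a hint about what you will see if you disassemble both versions.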

Now, how does the string “hello world” stay alive until this instruction is executed? Something must mark the string object so the garbage collector knows that a reference is being held.

The way the instruction sequences keep these objects alive is through the use of what it calls a “mark array”. As the compiler converts your code into instruction sequences, it will allocate a string for “hello world”, then push that string onto an array. Here is an excerpt from compile.c that does this:

case TS_VALUE:    /* VALUE */
{
    VALUE v = operands[j];
    generated_iseq[code_index + 1 + j] = v;
    /* to mark ruby object */
    iseq_add_mark_object(iseq, v);
    break;
}

All iseq_add_mark_object does is push the VALUE onto an array which is stored on the instruction sequence object. iseq is the instruction sequence object, and v is the VALUE we want to keep alive (in this case the string “hello world”). If you look in vm_core.h, you can find the location of that mark array with a comment that says:

VALUE mark_ary;     /* Array: includes operands which should be GC marked */
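
The same idea can be sketched in Ruby as a toy model. ToyISeq, OPERAND_TYPES, and toy_compile are all invented for this illustration and only mimic the shape of the C code above: every operand goes into the encoded list, and operands that are Ruby objects are additionally pushed onto the mark array.

```ruby
# Toy stand-in for an instruction sequence. The field names echo the post,
# but this is not Ruby's real data structure.
ToyISeq = Struct.new(:encoded, :mark_ary)

# Hypothetical operand types for a handful of instructions. Only :value
# operands are Ruby objects that need to be kept alive.
OPERAND_TYPES = {
  putself: [],
  putstring: [:value],
  opt_send_without_block: [:callinfo, :callcache],
  leave: []
}.freeze

def toy_compile(instructions)
  iseq = ToyISeq.new([], [])
  instructions.each do |name, *operands|
    iseq.encoded << name
    OPERAND_TYPES.fetch(name).zip(operands) do |type, operand|
      iseq.encoded << operand
      # Mirrors the "to mark ruby object" step in the excerpt above.
      iseq.mark_ary << operand if type == :value
    end
  end
  iseq
end

literal = "hello world"
iseq = toy_compile([
  [:putself],
  [:putstring, literal],
  [:opt_send_without_block, :callinfo, :callcache],
  [:leave]
])

p iseq.encoded
p iseq.mark_ary
```

Notice that the literal ends up referenced from two places: once inside the encoded list and once in the mark array.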

Instruction Sequence References and Compaction

So, instruction sequences contain two references to a string literal: one in the instructions in iseq_encoded, and one via the mark array. If the string literal moves, then both locations will need to be updated. Updating array internals is fairly trivial: it’s just a list. Updating instruction sequences on the other hand is not so easy.

To update references in the instruction sequences, we have to disassemble the instructions, locate any VALUE operands, and update those locations. There wasn’t any code to walk these instructions, so I introduced a function that would disassemble instructions and call a function pointer with those objects. This allows us to find new locations of Ruby objects and update the instructions. But what if we could use this function for something more?
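
As a rough illustration of that walker (not the actual C function), here is a toy version in Ruby. Integers play the role of addresses, VALUE_SLOTS is a hypothetical table saying which operand slots hold object references, and rewrite_value_operands is a name made up for this sketch. The block plays the role of the function pointer.

```ruby
# Hypothetical table: for each instruction, which operand slots hold
# references to Ruby objects (true) and which hold something else (false).
VALUE_SLOTS = {
  putself: [],
  putstring: [true],
  opt_send_without_block: [false, false],
  leave: []
}.freeze

# Call the block with every VALUE operand and write the block's return
# value back in place.
def rewrite_value_operands(encoded)
  index = 0
  while index < encoded.length
    slots = VALUE_SLOTS.fetch(encoded[index])
    slots.each_with_index do |is_value, offset|
      position = index + 1 + offset
      encoded[position] = yield(encoded[position]) if is_value
    end
    index += 1 + slots.length
  end
  encoded
end

# Pretend the compactor moved the object at 0x1000 to 0x2000.
forwarding = { 0x1000 => 0x2000 }
encoded  = [:putself, :putstring, 0x1000,
            :opt_send_without_block, 0xAAAA, 0xBBBB, :leave]
mark_ary = [0x1000]

rewrite_value_operands(encoded) { |address| forwarding.fetch(address, address) }
mark_ary.map! { |address| forwarding.fetch(address, address) }

p encoded
p mark_ary
```

Updating the mark array is a one-liner, while updating the instructions requires knowing the layout of every instruction. That difference is the whole reason the walker had to be written.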

Reducing Memory

Now we’re finally on to the part about saving memory. The point of the mark arrays stored on the instruction sequence objects is to keep any objects referred to by instruction sequences alive:

[Figure: ISeq and Array marking paths]

We can reuse the “update reference” function to mark references contained directly in instruction sequences. This means we can reduce the size of the mark array:

[Figure: Mark Literals via disasm]

Completely eliminating the mark array is a different story as there are things stored in the mark array that aren’t just literals. However, if we directly mark objects from the instruction sequences, then we rarely have to grow the array. The amount of memory we save is the size of the array plus any unused extra capacity in the array.
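
To get a feel for what an array of references costs, you can ask ObjectSpace for size estimates. This sketch only measures ordinary Ruby arrays, not the real mark arrays, and the exact numbers vary by Ruby version and platform.

```ruby
require 'objspace'

literal = "hello world"

empty  = []
exact  = Array.new(1_000, literal) # built in one step
pushed = []
1_000.times { pushed << literal }  # grown one push at a time, like a mark array

{ empty: empty, exact: exact, pushed: pushed }.each do |label, array|
  puts format('%-6s length=%-5d memsize=%d bytes',
              label, array.length, ObjectSpace.memsize_of(array))
end
```

An array grown by pushing may report a larger size than one of the same length built in one step. That gap is the unused extra capacity mentioned above.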

I’ve made a patch that implements this strategy, and you can find it on the GitHub fork of Ruby.

I found that this saves approximately 3% memory on a basic Rails application set to production mode. Of course, the more code you load, the more memory you save. I expected the patch to impact GC performance because disassembling instructions and iterating through them should be harder than just iterating an array. However, since instruction sequences get old, and we have a generational garbage collector, the impact to real apps is very small.

I’m working to upstream this patch to Ruby, and you can follow along and read more information about the analysis here.

Anyway, I hope you found this blurgh post informative, and please have a good day!

<3 <3 <3

I want to give a huge thanks to Allison McMillan. Every week she’s been helping me figure out what is going on with this complex code. I definitely recommend that you follow her on Twitter.
