I want DTrace probes in Ruby

Posted by – June 29, 2011

Recently I was debugging a performance regression in Rails 3.1. The ticket reported for the regression just indicated a speed problem. Namely allocating new active record objects was very slow in Rails 3.1 compared to Rails 3.0.

The benchmark program

Here is the program I was using for benchmarking. It’s slightly modified from the original program that @paul so kindly submitted:

require 'active_record'
require 'benchmark'

p ActiveRecord::VERSION::STRING

ActiveRecord::Base.establish_connection(
  :adapter  => "sqlite3",
  :database => ":memory:"
)

ActiveRecord::Base.connection.execute("CREATE TABLE active_record_models (id INTEGER UNIQUE, title STRING, text STRING)")

class ActiveRecordModel < ActiveRecord::Base; end
ActiveRecordModel.new

N = 100_000
Benchmark.bm { |x| x.report('new') { N.times { ActiveRecordModel.new } } }

I could tell from perftools.rb output we were calling new methods, and those new methods took plenty of time.

Here is the call graph for this benchmark on Rails 3.0:

arnew

Compared to the call graph on Rails 3.1:

arnew31

After some work, I was able to remove the method calls to scope when allocating new AR objects, but the benchmark still showed 3.1 to be slower than 3.0. I noticed we were spending more time in the Garbage Collector on 3.1 than on 3.0. This lead me to believe that we were creating more objects in the newer version.

Counting Object Allocations

Counting object allocations with Ruby 1.9 is actually pretty easy. We can call ObjectSpace.count_objects to get a list of the live objects in the system. That method returns a hash where the keys are the object type and the values are the number of that object in the system (there are other keys, be we don’t care about them for now).

Knowing this information, we can write a function that will calculate a rough difference in object allocations:

def allocate_count
  GC.disable
  before = ObjectSpace.count_objects
  yield
  after = ObjectSpace.count_objects
  after.each { |k,v| after[k] = v - before[k] }
  GC.enable
  after
end

p allocate_count { 100.times { {} } }

The first thing we do in this function is disable the garbage collector. It’s possible that the code we’re investigating could trigger GC, and that will impact our allocated object count.

Then we grab our current list of objects, yield to the code we want to profile, grab the list again, calculate our delta, enable GC and return!

If you run this code, you should see output similar to this (I’ve removed keys I don’t care about):

{ ... :T_HASH=>101, ... }

You can see from the output that we’ve allocated 101 hashes. 100 came from our 100.times call, and one came from calling the method on ObjectSpace.

Since inspecting our code can impact object allocations, these counts can’t be 100% accurate. Another annoying thing is that we can’t determine where our code is allocating these hashes. It’s easy enough to see the hash literals in this code, but imagine a project like rails. You may be able to find hash literals in your source, but determining which ones are being called is difficult.

After running this against the profile, I found that ActiveRecord in 3.1 was allocating 2x the number of hashes that 3.0 allocated. Now the problem is figuring out where these hashes were being allocated.

DTrace

DTrace gives us hooks in to our processes. Many of the hooks provided are for system calls, or other various functions. DTrace requires that we write probes in our code, and unfortunately Ruby does not have DTrace probes built in.

Adding DTrace to Ruby

To find these Hash allocations, I decided to add DTrace support. I usually run Ruby built from Ruby trunk, so running a modified Ruby seemed acceptible. It turns out adding DTrace support is pretty easy.

First I added a probe definition file. This file describes the probes that I want to add to Ruby. It contains the probe names and the signatures of the probes:

provider ruby {
  probe hash__alloc(const char *, int);
};

This definition says we’re declaring a probe called hash-alloc and it will take a string (the file name) and an int (the line number).

Next I used this definition file to generate a header file that ruby would use:

$ dtrace -o probes.h -h -s probes.d

Finally, I modified hash.c to trigger the probe:

diff --git a/hash.c b/hash.c
index b49aff8..c40d94d 100644
--- a/hash.c
+++ b/hash.c
@@ -15,6 +15,7 @@
 #include "ruby/st.h"
 #include "ruby/util.h"
 #include "ruby/encoding.h"
+#include "probes.h"
 #include <errno.h>

 #ifdef __APPLE__
@@ -221,6 +222,9 @@ hash_alloc(VALUE klass)
     OBJSETUP(hash, klass, T_HASH);

     RHASH_IFNONE(hash) = Qnil;
+    if(RUBY_HASH_ALLOC_ENABLED()) {
+       RUBY_HASH_ALLOC(rb_sourcefile(), rb_sourceline());
+    }

     return (VALUE)hash;
 }

After this, I built and installed ruby. Then I wrote a dtrace script to display the filename and line number where hash allocations were happening:

ruby*:::hash-alloc
{
  printf("%s:%d", copyinstr(arg0), arg1);
}

Then I ran the benchmark with dtrace:

$ sudo dtrace -s x.d -c 'ruby -I lib test.rb'

Here is an excerpt of the output:

  1 126930            hash_alloc:hash-alloc /Users/aaron/git/rails/activerecord/lib/active_record/base.rb:1528
  1 126930            hash_alloc:hash-alloc /Users/aaron/git/rails/activerecord/lib/active_record/base.rb:1529
  1 126930            hash_alloc:hash-alloc /Users/aaron/git/rails/activerecord/lib/active_record/base.rb:1534
  1 126930            hash_alloc:hash-alloc /Users/aaron/git/rails/activerecord/lib/active_record/base.rb:1535
  1 126930            hash_alloc:hash-alloc /Users/aaron/git/rails/activerecord/lib/active_record/base.rb:1525
  1 126930            hash_alloc:hash-alloc /Users/aaron/git/rails/activerecord/lib/active_record/persistence.rb:322
  1 126930            hash_alloc:hash-alloc /Users/aaron/git/rails/activerecord/lib/active_record/base.rb:1527
  1 126930            hash_alloc:hash-alloc /Users/aaron/git/rails/activerecord/lib/active_record/base.rb:1528
  1 126930            hash_alloc:hash-alloc /Users/aaron/git/rails/activerecord/lib/active_record/base.rb:1529
  1 126930            hash_alloc:hash-alloc /Users/aaron/git/rails/activerecord/lib/active_record/base.rb:1534
  1 126930            hash_alloc:hash-alloc /Users/aaron/git/rails/activerecord/lib/active_record/base.rb:1535
  1 126930            hash_alloc:hash-alloc /Users/aaron/git/rails/activerecord/lib/active_record/base.rb:1525

I was able to compare the output from this script on Rails 3.0 vs Rails 3.1. From this output, I was able to determine:

  • The number of hash allocations
  • Where the hashes were allocated
  • Which allocations were new in Rails 3.1

Armed with this information, I was able to eliminate some allocations and return speed in Rails 3.1 to that of Rails 3.0.

DTrace vs gdb vs memprof

I probably could have accomplished this task with memprof, but it requires that I use 1.8.7 from rvm. Working with RVM on my machine is difficult (don’t ask, let’s just say I’m a “special needs” user), and I wanted to use 1.9. So memprof was out the window.

I was able to get the same information by scripting gdb. I set a breakpoint at the correct function, called rb_sourcefile() and rb_sourceline() and redirected to a file. The problem with gdb is that it seemed very slow, and scripting it was a pain (though I am not a gdb expert!).

DTrace was fast, relatively easy to use, and very scriptable. It made me happy!

OMG!!!

I would like to see dtrace probes officially added to ruby trunk. The ruby that ships with OS X has them built in, but they don’t give us information like hash literal allocations. It looks like they were in ruby trunk at one point, but were then reverted. I would like to see them added again.

If you want to play with them on ruby trunk, here is my full patch. Make sure to run make probes.h before make && make install.

HAPPY WEDNESDAY!!!! <3 <3 <3 <3

(it feels good to blog on a non “professional” blog)

13 Comments on I want DTrace probes in Ruby

  1. Avdi Grimm says:

    Very cool, a great read.

    I found it kind of fun how the probe was variously referred to as hash___alloc, HASH_ALLOC, and hash-alloc. Interesting decision by the DTrace developers, there…

  2. Thanks!

    Yes, the double underscore indicates that the probe name should contain a dash rather than an underscore. Strange hack indeed.

  3. postmodern says:

    This is a good step in the right direction. Profilers such as DTrace or SystemTap (http://sourceware.org/systemtap/) are incredibly powerful tools. While I hear SystemTap has a steeper learning curve, have you poked at it much?

  4. bryanl says:

    I thought I remembered that someone went down this path before. You don’t get dtrace with linux or windows do you? And is the OS X implementation as good as the Solaris one?

  5. postmodern says:

    @bryanl ironically, DTrace originated on Solaris. Linux got SystemTap, then DTrace. I wonder when some of those ZFS ideas will start crossing over.

  6. postmodern says:

    @bryanl Windows has something similar, called the Windows Performance Toolkit (http://msdn.microsoft.com/en-us/performance/cc825801.aspx).

  7. dotemacs says:

    Aaron thank you for doing this! If you can get DTrace back into Ruby, that would be great and much appreciated.

    @bryanl the Mac OS X dtrace has disabled probes for certain comercial products, so in a sense it is limited compared to the Solaris one

    Just for the record, FreeBSD also has dtrace.

    And there is a Ruby gem allowing you to write Ruby programs:
    http://ruby-dtrace.rubyforge.org/

  8. Andrew Grimm says:

    Would running ordinary perftools (the section under “Profiling the Ruby VM and C extensions” at https://github.com/tmm1/perftools.rb, not perftools.rb itself) be able to tell you which methods called for object creation?

    Also, are the main tools for Plain Old Ruby Objects currently:
    * Ruby-prof (nice, but mainly for 1.8, and doesn’t distinguish garbage collection from other stuff)
    * perftools.rb (1.8 and 1.9, only shows Ruby methods)
    * ordinary perftools (shows C level calls, but not Ruby level calls?)
    * platform specific tools (eg tools for Linux or Mac or Java)
    * memprof

  9. Anko Painting says:

    I never knew dtrace was on linux until I saw this article! thanks!! <3

  10. raichoo says:

    OpenSolaris came with dtrace-enabled Ruby 1.8. Dunno if OpenIndiana features them, but maybe you can have your patches integrated there.

    Regards,
    raichoo

  11. ProGNOMmers says:

    > Working with RVM on my machine is difficult (don’t ask,[...]

    What are your special needs???

  12. Blake Irvin says:

    IIRC, the one big commercial Solaris-flavored cloud vendor, Joyent, ships DTrace-enabled Ruby.

Leave a Reply

Your email address will not be published. Required fields are marked *

*

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>