Writing Ruby C extensions: Part 1
I like writing C extensions for Ruby. In this series of blog posts we're
going to explore writing C extensions. I will cover topics including setting
up the development environment, TDD, debugging techniques, dealing with
Ruby's garbage collector, cross compiling for Windows, and more.
By the end of this series, we should end up with a Ruby C extension that wraps
libstree. libstree is a suffix tree implementation written in C.
In this part, we're going to set up our development environment, examine the
layout of a typical C extension, and implement our first method in C. Of
course, we'll be doing this TDD-style, so we'll also get autotest running.
Prerequisite gems
First up, we need to install a few gems to make building our extension easier.
Install the following three gems, and while they install, you should read about
why we need them:
$ sudo gem install ZenTest hoe rake-compiler
ZenTest
ZenTest contains autotest, which we'll be using to automatically run the tests
while we're developing.
hoe
Hoe abstracts gem specifications for us. It knows how to properly build a
gemspec, and provides us with a few rake tasks that make development simple.
rake-compiler
This gem provides us with compilation tasks, and generally makes building
native gems easier. We'll be looking further into rake-compiler's
capabilities in later articles.
Create the project
We're going to call this gem "stree". The first thing we'll do is use the "sow"
command supplied by Hoe to create the initial project structure.
$ sow stree
You should now have an initial project tree set up under the "stree" directory.
Remove the "bin" directory, as we won't need that. I rename all of my
documentation files to end in "rdoc", but that is just my personal preference.
Writing our first test
First thing we need to do is write our first failing test. Open up
"test/test_stree.rb" and make it look like this:
require "test/unit"
require "stree"
class TestStree < Test::Unit::TestCase
  def test_hello_world
    assert_equal 'hello world', Stree.hello_world
  end
end
This test is very simple. The trick, though, is that the "hello_world" method
will be implemented in C. At this point, you should be able to run "rake" and
see a failing test.
Native extension project layout
Native extension layouts look very similar to normal pure ruby layouts. We just
add one more directory called "ext". Under the "ext" directory we'll add
another directory that is the same name as our gem, "stree". Under
"ext/stree" is where we'll keep all of our C code. Make those directories,
and you should have a file list that looks similar to this:
$ tree
.
|-- CHANGELOG.rdoc
|-- Manifest.txt
|-- README.rdoc
|-- Rakefile
|-- ext
|   `-- stree
|-- lib
|   `-- stree.rb
`-- test
    `-- test_stree.rb
The next step is to modify our Rakefile.
Modifying the Rakefile
We need to teach the Rakefile how to compile our extension. Once we're done
with this step, it will have a task called "compile".
Modify your Rakefile so that it looks similar to this:
require 'rubygems'
require 'hoe'
require 'rake/extensiontask' # from rake-compiler; defines Rake::ExtensionTask
Hoe.spec 'stree' do
  developer('Aaron Patterson', 'aaron@tenderlovemaking.com')
  self.readme_file = 'README.rdoc'
  self.history_file = 'CHANGELOG.rdoc'
  self.extra_rdoc_files = FileList['*.rdoc']
  self.extra_dev_deps << ['rake-compiler', '>= 0']
  self.spec_extras = { :extensions => ["ext/stree/extconf.rb"] }
  Rake::ExtensionTask.new('stree', spec) do |ext|
    ext.lib_dir = File.join('lib', 'stree')
  end
end
Rake::Task[:test].prerequisites << :compile
I've modified the readme and history file sections to use custom named files.
The important parts are the "spec_extras", the "Rake::ExtensionTask" line and
the "Rake::Task" line.
The "spec_extras" line modifies the gemspec. When someone installs our gem,
this line tells the gem command to execute the "ext/stree/extconf.rb" file.
We'll talk a little bit more about the extconf.rb file later.
The "Rake::ExtensionTask" is the line where we get our "compile" task. It comes
from the rake-compiler gem. The block also tells rake-compiler where to copy
the compiled extension when it's finished. We want our compiled extension to
end up in "lib/stree/". This is my convention, and I'll explain why this
convention is good in later posts.
The final line tells Rake to always compile our extension before the tests run.
Some people might not want to use this, but I like compiling my extension
before every test run.
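To see what that last line does in isolation, here is a minimal sketch that uses plain Rake with two stand-in tasks. The task bodies are placeholders of my own (the real "compile" and "test" tasks come from rake-compiler and Hoe), so treat it as an illustration rather than part of the project:

```ruby
require 'rake'

order = [] # records the order in which the tasks actually run

# Stand-in tasks; in the real Rakefile these are defined by
# rake-compiler (:compile) and Hoe (:test).
Rake::Task.define_task(:compile) { order << :compile }
Rake::Task.define_task(:test)    { order << :test }

# Same trick as the last line of the Rakefile: make :compile a
# prerequisite of :test after both tasks already exist.
Rake::Task[:test].prerequisites << :compile

Rake::Task[:test].invoke
p order # [:compile, :test]
```

Because the prerequisite is appended after the fact, this works no matter which library defined the "test" task.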
Configuring autotest
Autotest doesn't use the normal rake tasks when running your tests. That means
we need to teach autotest to compile our extension before running the tests.
We're going to hook into the autotest run command and have it build our
extension before running the tests.
While we're at it, we'll also teach autotest to run the tests after any .c
files get modified.
Open up ".autotest" and make it look like this:
require 'autotest/restart'
Autotest.add_hook :initialize do |at|
  at.add_mapping(/.*\.c/) do |f, _|
    at.files_matching(/test_.*rb$/)
  end
end
Autotest.add_hook :run_command do |at|
  system "rake clean compile"
end
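If the mapping looks opaque, the following sketch imitates the idea in plain Ruby without loading autotest at all. The file list and the "tests_for" helper are made up for illustration; only the two regular expressions come from the ".autotest" file above:

```ruby
# Hypothetical stand-in for the files autotest knows about.
KNOWN_FILES = %w[
  ext/stree/stree.c
  ext/stree/extconf.rb
  lib/stree.rb
  test/test_stree.rb
]

C_SOURCE  = /.*\.c/      # pattern given to add_mapping
TEST_FILE = /test_.*rb$/ # pattern given to files_matching

# Returns the test files to rerun when +changed+ is saved, or an
# empty list when the C-source mapping does not apply to it.
def tests_for(changed)
  return [] unless changed =~ C_SOURCE
  KNOWN_FILES.grep(TEST_FILE)
end

p tests_for("ext/stree/stree.c") # ["test/test_stree.rb"]
p tests_for("README.rdoc")       # []
```

In other words, saving any C file reruns every test file, which is a reasonable default while the extension is this small.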
Start up autotest and let it run in the background. By the end of this blog
post, autotest should show one passing test.
At this point, we should see Rake complaining with a message:
rake aborted!
Don't know how to build task 'ext/stree/extconf.rb'
Let's deal with that error now.
extconf.rb
The responsibility of the extconf.rb file is to generate a Makefile that will
be used to build your extension. Eventually, we will need to teach extconf.rb
how to examine the target system to make sure that the libstree library is
installed.
Right now, we don't need to do any inspection of the system. We simply want
to create a Makefile. To build our Makefile, we're going to use a library
that ships with ruby called "mkmf". Open up "ext/stree/extconf.rb" and
modify it to look like this:
require 'mkmf'
create_makefile('stree/stree')
That is the minimum code required to get our Makefile generated.
As I mentioned earlier, the extconf.rb file is executed by the RubyGems system
when installing our gem. While we're developing our gem, rake-compiler will
take care of executing that file for us.
Our first C code
Great! Our environment knows how to compile things, but we don't have anything
to compile! Let's write our first bit of C code.
We are going to write the file named "ext/stree/stree.c". The name of this
file is important. It corresponds to the "create_makefile" line from our
extconf.rb. After our extension is built, we'll end up with a file
"lib/stree/stree.bundle" (or .so, depending on your system). This convention is
important, but I'm going to talk about why it's important in a later post.
When Ruby loads the dynamic library we're building, it must supply us with
some way to define our native methods. The way it does this is with another
naming convention using the dynamic library's file name. When "stree.bundle"
is loaded, Ruby will automatically try to call a function called "Init_stree".
The part after "Init_" matches the name of the file it loaded. The Init_stree
function is where we'll do our native extension initialization.
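The naming rule can be sketched in a few lines of Ruby. The "init_function_for" helper is hypothetical (Ruby does this internally, in C), but RbConfig::CONFIG['DLEXT'] is the real setting that holds the extension suffix for your platform:

```ruby
require 'rbconfig'

# Suffix Ruby uses for compiled extensions on this platform,
# for example "bundle" on OS X or "so" on Linux.
dlext = RbConfig::CONFIG['DLEXT']

# Hypothetical helper mirroring the convention: drop the directory
# and the file extension, then prefix what is left with "Init_".
def init_function_for(path)
  "Init_" + File.basename(path, ".*")
end

library = "lib/stree/stree.#{dlext}"
puts library
puts init_function_for(library) # Init_stree
```

Whatever the suffix turns out to be on your machine, the function Ruby looks for is the same.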
In stree.c, define the Init_stree function to look like this:
#include <ruby.h>
void Init_stree(void)
{
  VALUE mStree = rb_define_module("Stree");
  rb_define_singleton_method(mStree, "hello_world", hello_world, 0);
}
This function does two things: it defines the "Stree" module, and it defines
the "hello_world" method on that module.
The first line actually defines the module. The second line tells Ruby to
define the singleton method "hello_world" and, when that method gets called,
to call the C function that the "hello_world" function pointer refers to. The
0 is the number of arguments the Ruby method takes.
Let's actually add the hello_world C function now:
static VALUE hello_world(VALUE mod)
{
  return rb_str_new2("hello world");
}
We declare this function as static because it's not needed outside this file.
It has to appear above Init_stree so the compiler has already seen it by the
time Init_stree refers to it. All Ruby methods must return a VALUE. The first
argument to the C function is always the receiver of the message; in this case
it will be the Stree module. We create a new Ruby string with rb_str_new2() and
return it.
The final stree.c file should look like this:
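#include <ruby.h>
/* Stree.hello_world: takes no arguments, returns a new Ruby string. */
static VALUE hello_world(VALUE mod)
{
  return rb_str_new2("hello world");
}
/* Called by Ruby when the compiled library is loaded. */
void Init_stree(void)
{
  VALUE mStree = rb_define_module("Stree");
  rb_define_singleton_method(mStree, "hello_world", hello_world, 0);
}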
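For comparison, here is roughly what the C file above amounts to if you wrote it in pure Ruby. This is only a mental model for reading the C code, not something to add to the project:

```ruby
module Stree
  # Counterpart of the C hello_world function: a singleton method
  # on the module that takes no arguments and returns a String.
  def self.hello_world
    "hello world"
  end
end

puts Stree.hello_world              # hello world
puts Stree.method(:hello_world).arity # 0
```

The "self" that Ruby supplies implicitly here is what arrives as the first VALUE argument on the C side, and the arity of 0 corresponds to the last argument of rb_define_singleton_method.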
A note on types
When we write ruby, everything is an object. When we write ruby in C,
everything is a VALUE. We'll learn more about the VALUE type in later posts.
Finishing up
We've written our C code, and everything should now compile and be copied into
the right place, but our tests are still failing. What gives?
We've got one more tiny modification to make. We need to actually require
the dynamic library that we built. Open up "lib/stree.rb" and modify it to
look like this:
require 'stree/stree'
module Stree
  VERSION = '1.0.0'
end
We've now told ruby to load the dynamic library we built, and changed the
definition of Stree to a module in our ruby code. At this point, our test
should be passing. Congratulations! You have now successfully mixed C and
Ruby code.
If you were successful, your project tree should look like this:
$ tree -I tmp
.
|-- CHANGELOG.rdoc
|-- Manifest.txt
|-- README.rdoc
|-- Rakefile
|-- ext
|   `-- stree
|       |-- extconf.rb
|       `-- stree.c
|-- lib
|   |-- stree
|   |   `-- stree.bundle
|   `-- stree.rb
`-- test
    `-- test_stree.rb
The "tmp" directory is where rake-compiler stashes your .o files when
compiling your extension. I've omitted that from the tree to keep it short.
Last notes
I've posted the code for stree here in case you're having trouble. I've
made tags for each post so you can follow along.
Next time, we'll tackle making sure libstree is installed, compiling and
linking against libstree, and making a few calls into libstree from Ruby.
In the meantime, your homework is to read through README.EXT.