Author:

Aaron Patterson

My name is Aaron Patterson.

Ruby Committers Design Contest Update!

Posted by – January 20, 2011

Update:

Hey folks, somehow I made a mistake and one entry was omitted from the index. Please take a look at the static mirror, and vote for it here if you like it! Apologies to the people who submitted this entry. I didn’t mean to miss this!

Non Update:

The deadline for entry was last night. I’ve compiled a static mirror of all of the entries here.

I am humbled, and incredibly impressed by the number and quality of all of the entries. I can’t thank everyone enough for doing this. The quality of the results is so good, that I would really appreciate feedback from the community.

Please check out the entries list. Carefully review all entries. Then, leave feedback by either voting up the entry on github, or leaving a comment on my blog. I will take all feedback into consideration tomorrow when I decide the winner.

Remember that when you judge these designs, please judge just the design. If links are broken, or encoding looks wrong, or something is a little bit off, it’s probably my fault. Pick the designs you like the most, and we’ll polish them up after declaring a winner.

Please tell me about all of the designs you like. If you like 5 of them, let me know. It’s also helpful to know why you like those designs.

As I said, I’ll declare the winner tomorrow before 23:59:59 PST.

Again, I really can’t thank everyone enough for doing this. The results are so much more than I could have hoped for or expected. You have all outdone yourselves! No amount of Love Bucks can repay the awesomeness I’ve witnessed; the best I can do is keep fixing bugs on Ruby, Rails, and any other project I get my hands on!

Thank you!

rubycommitters.org design contest!

Posted by – January 4, 2011

omg…. OMG…. ZOMG!!!!! HAPPY TUESDAY TO EVERYONE IN THE WORLD!!!!

Alright, now that the formalities are out of the way, LET’S GET DOWN TO BUSINESS! Some of you may or may not know, I joined the Ruby core team in October 2009. I am very proud to be a member of ruby-core. However, I have noticed that the Ruby core team does not have an awesome website like the Rails core team.

To rectify this situation, I registered a domain: rubycommitters.org. Unfortunately, my design skills are… sub-par. I would like to fix this by having a contest. I would like you to code the design for rubycommitters.org.

How do I enter?

Just fork the rubycommitters.org project on github, follow the instructions in the README, and make the site look good! When you’re done, send me a pull request.

How many times can I enter?

As many as you want. However, you can only win one place.

When is the due date?

You must send me the pull request by January 19th, 23:59:59 PST.

When will winners be announced?

I will announce winners by January 21st, 23:59:59 PST. Once I’ve decided, I’ll email the winners to get their PayPal information and transfer the Love Bucks.

What are the prizes?

My employer pays me in Tenderlove Cash (from here on referred to as “Love Bucks”), so the prize will be in Love Bucks. Fortunately for all of us, Love Bucks exchange with the US Dollar at a 1:1 ratio. So I will send you your prize via PayPal in the form of US Dollars.

First Place: 300 Love Bucks (in the form of US Dollars)
Second Place: 100 Love Bucks (in the form of US Dollars)

All entrants will win a hug from me that is redeemable next time we meet each other!

How will entries be judged?

By me, however I want. I’ll probably ask the intertubes for help, but I have the final say.

Why is the prize so low?

These Love Bucks are coming out of my own wallet! Give me a break!

Conclusion

Since I announced I needed help last weekend, already 9 people have started working, so you’d better get cracking!

I am proud to be a member of the Ruby core team, I’m proud to be involved in the Rails development team, and most of all I’m proud to be a member of the best development community in the world. Thanks to everyone that makes being a member of the Ruby community so awesome!

EDIT!!!!

@mariozig has graciously offered to up the ante. He has donated 100 Love Bucks to the first place winner! That’s a total of 300 Love Bucks for the person who wins first place! Awesome!!!

Writing Ruby C Extensions: Part 2

Posted by – December 11, 2010

OMG! It’s been a year since I posted Writing Ruby C Extensions: Part 1. The first post I did was for the Ruby Advent Calendar in 2009. I guess it’s fitting that I write a blog post for the Ruby Advent Calendar 2010. Anyway, if you haven’t read part 1, please go read it now.

In Part 2, we’ll modify our extconf.rb file to find important files in libstree, then we’ll create a Ruby class that is backed by a C structure.

The final code associated with this part of my Writing Ruby C Extensions series can be found here.

Using mkmf to find libraries

As I mentioned in the previous post, extconf.rb is used when installing a native gem to locate libraries, header files, and test various things about the target system before installing. We’re going to teach our extconf.rb file to locate the libstree dynamic library along with the header files. We’re also going to allow people to tell the gem where to find libstree, and set up our extconf.rb with some sensible defaults.

mkmf configuration with dir_config

The first thing we’ll do is tell mkmf where to look for libstree files by default. We do this using the dir_config method. dir_config takes three arguments:

  • An arbitrary string, but usually the library name (like “stree”)
  • A list of paths to search for header files
  • A list of paths to search for library files

The dir_config method also allows users installing our gem to configure where mkmf should look for various files. Let’s take a look at our call to dir_config and talk about what it does:

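require 'mkmf'

# RbConfig is the current name for the older Config constant,
# which newer Rubies no longer define
LIBDIR      = RbConfig::CONFIG['libdir']
INCLUDEDIR  = RbConfig::CONFIG['includedir']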

HEADER_DIRS = [
  # First search /opt/local for macports
  '/opt/local/include',

  # Then search /usr/local for people that installed from source
  '/usr/local/include',

  # Check the ruby install locations
  INCLUDEDIR,

  # Finally fall back to /usr
  '/usr/include',
]

LIB_DIRS = [
  # First search /opt/local for macports
  '/opt/local/lib',

  # Then search /usr/local for people that installed from source
  '/usr/local/lib',

  # Check the ruby install locations
  LIBDIR,

  # Finally fall back to /usr
  '/usr/lib',
]

dir_config('stree', HEADER_DIRS, LIB_DIRS)

First, this code builds two lists of sensible defaults for finding header files and library files. The HEADER_DIRS and LIB_DIRS constants contain lists of common places to find headers and libraries. These settings will be nice for our users because if they have libstree installed in /opt/local/ or /usr/local/, mkmf will find the library without any user intervention.

Finally, we call dir_config with the string “stree” and two lists. This call to dir_config only configures mkmf with directories to search. We actually haven’t done any searching at this point. The dir_config call also allows users to configure the gem on installation. The call sets up the following flags for our user to configure:

  • --with-stree-dir
  • --with-stree-include
  • --with-stree-lib
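The order of the entries in HEADER_DIRS and LIB_DIRS matters, because the lists are searched front to back. Here is a minimal pure-Ruby sketch of that "first match wins" idea. The helper first_dir_containing is a made-up name for illustration; it is not part of mkmf and is not how mkmf is implemented.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper (not part of mkmf): return the first directory in
# +dirs+ that contains +relative_path+, or nil when none of them does.
def first_dir_containing(dirs, relative_path)
  dirs.find { |dir| File.exist?(File.join(dir, relative_path)) }
end

Dir.mktmpdir do |root|
  # Stand-ins for /opt/local/include and /usr/local/include.
  macports = File.join(root, 'opt', 'local', 'include')
  source   = File.join(root, 'usr', 'local', 'include')

  # Only the "installed from source" location has the header.
  FileUtils.mkdir_p(macports)
  FileUtils.mkdir_p(File.join(source, 'stree'))
  FileUtils.touch(File.join(source, 'stree', 'lst_string.h'))

  found = first_dir_containing([macports, source], 'stree/lst_string.h')
  puts found == source # prints "true"
end
```

If the same header existed in both directories, the earlier entry would win, which is why the lists above put the most specific locations first and /usr last.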

Finding headers and libraries

Now that we’ve configured mkmf with where we can find libraries and headers, we need to search for required header files and libraries. We’ll do that with two functions: find_header and find_library.

We need to find the stree/lst_string.h header file, so we’ll just supply that to the find_header method like so:

unless find_header('stree/lst_string.h')
  abort "libstree is missing.  please install libstree"
end

This code will tell mkmf to find the header file we need. If the header file can’t be found, find_header will return false, and we can abort installation and provide some instructions. If the find_header method is a success, the directory where the header file was found will be added to the -I flags that get passed to your compiler.

Next, we need to find the libstree dynamic library. For this task, we’ll use the find_library function call:

unless find_library('stree', 'lst_stree_free')
  abort "libstree is missing.  please install libstree"
end

The find_library function takes two arguments. The first argument is the library that we need to link against. This string will be passed to the -l flags. The second argument is a symbol we need to find in the library.

In this code example, mkmf will create a test C program that tries to link against stree and find the function lst_stree_free. If linking is successful, the path will be added to the -L flags provided to your compiler. If it fails, we abort installation and provide an error message.

Creating the Makefile

Just like the last article, we still need the call to create_makefile in our extconf.rb:

create_makefile('stree/stree')

You can find the complete extconf.rb here.

Wrapping LST_String from libstree

libstree defines a String type structure. We’re going to define a class in Ruby to wrap up this string type structure. Eventually, we’ll have some Ruby code that looks like this:

string = STree::String.new 'foo'
assert_equal 3, string.length

In fact, since we’re doing TDD let’s start with a test for the length method. We’ll also add a test to ensure that objects other than String objects will raise a TypeError:

require 'stree'
require 'test/unit'

module STree
  class TestString < Test::Unit::TestCase
    def test_length
      string = STree::String.new 'foo'
      assert_equal 3, string.length
    end

    def test_type_error
      assert_raises(TypeError) do
        STree::String.new Object.new
      end
    end
  end
end

File structure

In my C projects, I like to make one C file per class. We have to make an entry point though, so we’ll keep stree.h and stree.c from our previous project. Then we’ll write stree_string.h and stree_string.c to keep our String class.

Library entry point

The entry point to our C code will be in stree.c. The stree.c file will initialize the String class. Here is the new stree.h file that includes libstree:

#ifndef RUBY_STREE
#define RUBY_STREE

#include <ruby.h>
#include <stree/lst_string.h>

#include <stree_string.h>

extern VALUE mSTree;

#endif

We include header files from libstree, we include the header file for the string class, then we declare a global variable which will hold a reference to our Ruby “STree” module.

The new stree.c file looks like this:

#include <stree.h>

VALUE mSTree;

void Init_stree()
{
  mSTree = rb_define_module("STree");

  Init_stree_string();
}

When our library is required, Init_stree is called, then we’ll define the STree module (assigning it to the global module variable) and initialize our String class. Now we need to define Init_stree_string in stree_string.h and stree_string.c.

Defining the String class

First we’ll create the header file for our string class. We’ll only have one public function called Init_stree_string, so our header file will look like this:

#ifndef RUBY_STREE_STRING
#define RUBY_STREE_STRING

#include <stree.h>

void Init_stree_string();

#endif

We include the main stree.h header file, then declare our public initialization function. Now we need to define the body of the Init_stree_string function in stree_string.c:

#include <stree_string.h>

void Init_stree_string()
{
  VALUE cSTreeString = rb_define_class_under(mSTree, "String", rb_cObject);
}

The rb_define_class_under function will define a class “String” in the module pointed to by mSTree with a parent class of Object. This C code is equivalent to the following Ruby code:

module STree
  class String
  end
end

At this point, you should be able to compile the project and run the tests. We haven’t defined any methods on the STree::String class in Ruby yet, but our project should compile, and the tests should execute. If you’re following along, you should see test output like this:

  1) Error:
test_length(STree::TestString):
ArgumentError: wrong number of arguments (1 for 0)
    ./test/test_stree_string.rb:7:in `initialize'
    ./test/test_stree_string.rb:7:in `new'
    ./test/test_stree_string.rb:7:in `test_length'

Allocating the String class

The first thing we’re going to do is teach Ruby how to allocate our String class. Ruby gives us a hook when the allocate method is called where we can allocate internal structures (we’re actually defining the allocate method on the STree::String class).

First, let’s modify the init function to tell ruby about our allocate function:

void Init_stree_string()
{
  VALUE cSTreeString = rb_define_class_under(mSTree, "String", rb_cObject);

  rb_define_alloc_func(cSTreeString, allocate);
}

rb_define_alloc_func tells Ruby to call the function pointer allocate when this class gets allocated. Now we need to define our allocate function:

static VALUE allocate(VALUE klass)
{
  LST_String * string = malloc(sizeof(LST_String));

  return Data_Wrap_Struct(klass, NULL, deallocate, string);
}

In our allocate function, we allocate enough memory to hold an LST_String struct. Then we call Data_Wrap_Struct to return our actual Ruby object. Data_Wrap_Struct takes four arguments:

  • The Ruby class we’re dealing with (in this case it’s cSTreeString)
  • A function pointer that is called when the object is marked
  • A function pointer that is called when the object is freed
  • A void pointer of the data we want to wrap

You’ll notice we’re referencing a function deallocate that isn’t defined yet. Let’s define that function now:

static void deallocate(void * string)
{
  lst_string_free((LST_String *)string);
}

The deallocate function is called with the pointer we passed to Data_Wrap_Struct, in this case an LST_String pointer. We’ll use the lst_string_free function from libstree to free our pointer.
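The split between allocating an object and initializing it is visible from plain Ruby too, which may help before we move on to initialize. A small sketch, where Greeting is a made-up class used only for illustration:

```ruby
# Hypothetical class, used only to show allocate and initialize separately.
class Greeting
  attr_reader :text

  def initialize(text)
    @text = text
  end
end

raw = Greeting.allocate          # the object exists, initialize has not run
puts raw.text.inspect            # prints "nil"

raw.send(:initialize, 'foo')     # initialize is private, so we use send
puts raw.text                    # prints "foo"

puts Greeting.new('bar').text    # new does both steps; prints "bar"
```

Our C allocate function plays the role of the first step: it reserves the LST_String, and the struct only gets real contents once initialize runs.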

Defining STree::String#initialize

Now we need to define the initialize method. This method will take one argument (a string), and we’ll populate the underlying LST_String struct with information from the Ruby string.

To define the initialize method, first we call rb_define_method:

void Init_stree_string()
{
  VALUE cSTreeString = rb_define_class_under(mSTree, "String", rb_cObject);

  rb_define_alloc_func(cSTreeString, allocate);
  rb_define_method(cSTreeString, "initialize", initialize, 1);
}

rb_define_method takes 4 arguments:

  • The class on which we want to define a method
  • The name of the method we’re defining
  • A function pointer that will be called when our method is called
  • The number of parameters passed to that function

Next we need to define our initialize C function:

static VALUE initialize(VALUE self, VALUE rb_string)
{
  LST_String * string;
  void * data;

  Check_Type(rb_string, T_STRING);

  Data_Get_Struct(self, LST_String, string);

  data = calloc(RSTRING_LEN(rb_string), sizeof(char));
  memcpy(data, StringValuePtr(rb_string), RSTRING_LEN(rb_string));

  lst_string_init(
      string,
      data,
      sizeof(char),
      RSTRING_LEN(rb_string));

  return self;
}

The initialize function has two parameters, the first is the instance of our STree::String object, the second is the single parameter for our method.

After declaring our variables we check the type of the required argument. Check_Type is a macro provided by Ruby to let us perform type checking on objects. We use this macro to ensure that the user passed us a Ruby string. If not, the Check_Type macro will automatically raise a type error.

Next we make a call to a macro provided by Ruby: Data_Get_Struct. Our LST_String pointer is stored inside the Ruby VALUE object, and Data_Get_Struct will extract our pointer. We give this macro the ruby object self, followed by the struct type we want to extract (LST_String), followed by the pointer where it will be assigned (string).

We need to copy the contents of the Ruby string to a buffer that our LST_String can keep. To do that, we use:

  • calloc to allocate the memory
  • RSTRING_LEN to get the number of bytes in our string
  • memcpy to copy the memory contents
  • StringValuePtr to get the underlying character pointer from Ruby

We give the data to libstree by calling lst_string_init, then finally return self.

At this point, we should have one passing test and one failing test:

  1) Error:
test_length(STree::TestString):
NoMethodError: undefined method `length' for #
    ./test/test_stree_string.rb:8:in `test_length'

Next we need to define the length method.

Defining STree::String#length

The hard part is over. Defining the length method should be much easier than the initialize method. Just like the initialize method, we need to call rb_define_method:

void Init_stree_string()
{
  VALUE cSTreeString = rb_define_class_under(mSTree, "String", rb_cObject);

  rb_define_alloc_func(cSTreeString, allocate);
  rb_define_method(cSTreeString, "initialize", initialize, 1);
  rb_define_method(cSTreeString, "length", length, 0);
}

This time, we’re defining a function length that takes 0 arguments. Now let’s define the length C function:

static VALUE length(VALUE self)
{
  LST_String * string;

  Data_Get_Struct(self, LST_String, string);

  return INT2NUM(lst_string_get_length(string));
}

Just like the initialize function, we declare our variables, then unwrap our struct. We use the lst_string_get_length function from libstree to get the string length as an integer. Then we use a macro provided by Ruby, INT2NUM, that converts the integer to a Ruby Numeric object and return that object.
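For comparison, here is a rough pure-Ruby stand-in for what the two C functions do. PureString is a made-up name, and the sketch assumes libstree reports back the same item count we handed to lst_string_init. Since RSTRING_LEN counts bytes, that would make length a byte count rather than a character count.

```ruby
# Hypothetical pure-Ruby stand-in for the C-backed STree::String.
class PureString
  def initialize(rb_string)
    # Plays the role of Check_Type(rb_string, T_STRING).
    unless rb_string.is_a?(::String)
      raise TypeError, "wrong argument type #{rb_string.class} (expected String)"
    end

    # Plays the role of calloc + memcpy: keep a private copy of the raw
    # bytes so later changes to the caller's string don't affect us.
    @data = rb_string.dup.force_encoding(Encoding::BINARY)
  end

  # Plays the role of lst_string_get_length + INT2NUM.
  def length
    @data.bytesize
  end
end

puts PureString.new('foo').length        # prints "3"
puts PureString.new("caf\u00e9").length  # prints "5" (UTF-8 bytes, not characters)
```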

After we’ve defined this method, all of our tests should pass:

Loaded suite -e
Started
..
Finished in 0.000873 seconds.

2 tests, 2 assertions, 0 failures, 0 errors

Yay!

Conclusion

OMG! C CODE WRAPPED WITH RUBY!

We’ve scratched the surface for writing C extensions in Ruby. In this part, we:

  • taught our system how to find the library we want to use
  • (briefly) dealt with memory management of our objects
  • defined modules and classes
  • defined methods on our classes

You can grab the code for part 2 here.

Happy holidays to EVERYONE! I hope you liked Part 2 of Writing Ruby C Extensions!

<3<3<3<3<3<3<3<3 –tenderlove

Ruby Advent Calendar

Posted by – December 9, 2010

Every year Japanese Rubyists participate in what is called the “Ruby Advent Calendar”.

What is the Ruby Advent Calendar?

Each day leading up to December 25th, one person posts an article to their blog and adds a link to their blog on the Advent Calendar. So, 25 blog posts total. The posts can be about anything related to Ruby.

In my opinion, the Ruby Advent Calendar is about encouraging people to blog about Ruby and help others participate in the Ruby community.

How can I participate?

In past years, the articles have mainly been in Japanese. This year, we’re trying to do an English language Ruby Advent Calendar.

@yhara_en was kind enough to add English instructions. I will add screen shots with notes here so that you can more easily participate.

Step 1: Get some Coffee

As I’ve said in previous blog posts, I love it when instructions tell you to get coffee because I always do. So here is your opportunity. Go get some coffee!

Step 2: Click a Button

First, go to this link. Then click the button that says “このエベントに参加登録する” (“Register for this event”):

[Screenshot: Ruby Advent Calendar jp-en: 2010 : ATND]

Once you’ve clicked that button, you need to register.

Step 3: Register

The easiest way to register is with your Twitter account. Click the link that says “Twitterでログイン” (“Log in with Twitter”):

[Screenshot: Ruby Advent Calendar jp-en: 2010 : ATND]

You’ll be taken through the normal Twitter authorization path. That part is in English, so I’m not going to cover it.

Step 4: Edit account details

Enter your name and website and click save (the button that says “保存する”):

[Screenshot: users/profile : ATND]

Step 5: Register for the Ruby Advent Calendar

Go back to the Ruby Advent Calendar and click the giant red button again:

[Screenshot: Ruby Advent Calendar jp-en: 2010 : ATND]

This time it will ask you to enter a comment:

[Screenshot: Ruby Advent Calendar jp-en: 2010 : ATND]

So that we can coordinate better, I just said which day I would like to participate.

Step 6: Publish a Blurgh Post

On your day, publish your blog post, then head back to the advent calendar, scroll to the bottom of the page, and link to your blog in the comments section:

[Screenshot: Ruby Advent Calendar jp-en: 2010 : ATND]

That’s it!

Happy Holidays, and Happy Blurghing!!!

<3<3<3<3<3<3 –tenderlove

Event based JSON and YAML parsing

Posted by – April 17, 2010

Let’s use Ruby 1.9.2 and Psych to build an event based twitter stream parser. Psych is a YAML parser that I wrote and is in the standard library in 1.9.2. Eventually, it will replace the current YAML parser, but we can still use it today!

But you said YAML and JSON! wtf?

I know! In the YAML 1.2 spec, JSON is a subset of YAML. Psych supports YAML 1.1 right now, so *much* (but not all) JSON is supported. Once libyaml is upgraded to YAML 1.2, it will have full JSON support!

Why do we want to do an event based parser?

Twitter streams are a never ending flow of user status updates, and if we want a process to live forever consuming these updates, it would be nice if that process kept a low memory profile. Psych is built in such a way that we can hand it an IO object, it will read from the IO object, then call callback methods as soon as possible. It buffers as little as possible, sending events as soon as possible. If you are familiar with SAX based XML parsing, this will be familiar to you. Plus it is a fun problem!

Let’s start by writing an event listener for some sample JSON.

Event Listener

Our event listener is only going to listen for scalar events, meaning that when Psych parses a string, it will send that string to our listener. There are many different events that can happen, so Psych ships with a handler from which you can inherit. If you check out the source for the base class handler, you can see what types of events your handler can intercept.

For now, let’s write our scalar handler, and try it out.

require 'psych'

class Listener < Psych::Handler
  def scalar(value, anchor, tag, plain, quoted, style)
    puts value
  end
end

listener = Listener.new
parser   = Psych::Parser.new listener
parser.parse DATA

__END__
{"foo":"bar"}

If you run this code, you should see the strings “foo” and “bar” printed.

In this example, our handler simply prints out every scalar value encountered. We create a new instance of the listener, pass that listener to a new instance of the parser, and tell the parser to parse DATA. We can hand the parser an IO object or a String object. This is important because we’d like to hand the parser our socket connection, so the parser can deal with reading from the socket for us.
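Scalars aren’t the only events a handler can listen for. As a small sketch, here is a handler that also records when a mapping starts and ends, fed from a String instead of DATA. EventRecorder is a made-up name; the method signatures are the ones defined by Psych::Handler.

```ruby
require 'psych'

# Hypothetical handler that records events instead of printing them.
class EventRecorder < Psych::Handler
  attr_reader :events

  def initialize
    @events = []
  end

  def start_mapping(anchor, tag, implicit, style)
    @events << [:start_mapping]
  end

  def scalar(value, anchor, tag, plain, quoted, style)
    @events << [:scalar, value]
  end

  def end_mapping
    @events << [:end_mapping]
  end
end

recorder = EventRecorder.new
Psych::Parser.new(recorder).parse('{"foo":"bar"}')
p recorder.events
```

The recorded events arrive in document order, so the two scalars show up between the start and end of the mapping that contains them.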

Hooking up to Twitter

It would be convenient for us if Twitter’s stream was one continuous JSON document. Why? If it was, we could feed the socket straight to our JSON parser and start consuming events immediately. Unfortunately, Twitter’s stream is not so kind for us event based consumers. We’ll need to trick our JSON parser to think the feed is one continuous document. We’ll get tricky with our data in a minute, but first let’s deal with authentication.

Authentication

Twitter requires us to authenticate before we can consume a feed. Stream authentication is done via Basic Auth. Let’s write a class that can authenticate and read from the stream. Once we do that, we’ll concentrate on parsing the stream.

require 'socket'

class StreamClient
  def initialize user, pass
    @ba = ["#{user}:#{pass}"].pack('m').chomp
  end

  def listen
    socket = TCPSocket.new 'stream.twitter.com', 80
    socket.write "GET /1/statuses/sample.json HTTP/1.1\r\n"
    socket.write "Host: stream.twitter.com\r\n"
    socket.write "Authorization: Basic #{@ba}\r\n"
    socket.write "\r\n"

    # Read the headers
    while((line = socket.readline) != "\r\n"); puts line if $DEBUG; end

    # Consume the feed
    while line = socket.readline
      puts line
    end
  end
end

StreamClient.new(ARGV[0], ARGV[1]).listen

This class takes a username and password and calculates the basic auth signature. When “listen” is called, it opens a connection, authorizes, reads the response headers, and starts consuming the feed.
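One thing to watch with pack('m'): it inserts a newline after every 60 encoded characters, so chomp only cleans up short credentials. Here is a sketch of a safer variant using pack('m0'), which emits no line feeds at all. The helper name basic_auth_token is made up for illustration.

```ruby
# Hypothetical helper: Base64-encode "user:pass" with no line feeds.
def basic_auth_token(user, pass)
  ["#{user}:#{pass}"].pack('m0')
end

puts "Authorization: Basic #{basic_auth_token('Aladdin', 'open sesame')}"
# prints "Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ=="
```

For short usernames and passwords the two approaches give the same result, so the StreamClient above behaves the same either way.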

Processing the Feed

If we look at the output from the previous script, we’ll see that the Twitter stream looks something like this:

512
{"in_reply_to_screen_name":null,...}

419
{"in_reply_to_screen_name":"tenderlove"...}

Which isn’t valid JSON. Instead, it’s a header (the number) indicating the length of the JSON chunk, the JSON chunk, then a trailing “\r\n”. We would like the stream to look something like this:

---
{"in_reply_to_screen_name":null,...}
...
---
{"in_reply_to_screen_name":"tenderlove"...}
...

This chunk is two valid YAML documents. If the stream looked like this, we could feed it straight to our YAML processor no problem. How can we modify the stream to be suitable for our parser?

Fun with Thread and IO.pipe

If we create a pipe, we can have one thread process input from Twitter and feed that into the pipe. We can then give the other end of the pipe to our JSON processor and let it read from our processed feed. Let’s modify the “listen” method in our client to munge the feed to a pipe, and hand that off to our YAML processor. I only care about the text of people’s tweets, so let’s modify our listener too.

Here is our completed program:

require 'socket'
require 'psych'

class StreamClient
  def initialize user, pass
    @ba = ["#{user}:#{pass}"].pack('m').chomp
  end

  def listen listener
    socket = TCPSocket.new 'stream.twitter.com', 80
    socket.write "GET /1/statuses/sample.json HTTP/1.1\r\n"
    socket.write "Host: stream.twitter.com\r\n"
    socket.write "Authorization: Basic #{@ba}\r\n"
    socket.write "\r\n"

    # Read the headers
    while((line = socket.readline) != "\r\n"); puts line if $DEBUG; end

    reader, writer = IO.pipe
    producer = Thread.new(socket, writer) do |s, io|
      loop do
        io.write "---\n"
        io.write s.read(s.readline.strip.to_i(16))
        io.write "...\n"
        s.read 2 # strip the blank line
      end
    end

    parser = Psych::Parser.new listener
    parser.parse reader

    producer.join
  end
end

class Listener < Psych::Handler
  def initialize
    @was_text = false
  end

  def scalar value, anchor, tag, plain, quoted, style
    puts value if @was_text
    @was_text = value == 'text'
  end
end

StreamClient.new(ARGV[0], ARGV[1]).listen Listener.new

Great! In 30 lines, we’ve been able to provide an event based API for consuming Twitter streams. Were it not for the feed munging, we could reduce that by 9 lines!
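If you want to try the munging step without a Twitter account, here is a sketch that runs the same pipe-and-thread idea against a canned stream. The helpers chunk and frame_chunks, the TextCollector class, and the sample payloads are all made up for this example; the chunk layout follows the description above (a hexadecimal length, the payload, then a trailing "\r\n").

```ruby
require 'stringio'
require 'psych'

# Hypothetical helper: wrap one payload as a hex length line, the payload
# itself, then a trailing "\r\n".
def chunk(payload)
  "#{payload.bytesize.to_s(16)}\r\n#{payload}\r\n"
end

# Hypothetical helper: copy each chunk from +input+ to +output+ as its
# own YAML document.
def frame_chunks(input, output)
  until input.eof?
    size = input.readline.strip.to_i(16)
    output.write "---\n"
    output.write input.read(size)
    output.write "...\n"
    input.read(2) # skip the "\r\n" that follows each chunk
  end
end

# Same idea as the Listener above, but collects values instead of printing.
class TextCollector < Psych::Handler
  attr_reader :texts

  def initialize
    @texts = []
    @was_text = false
  end

  def scalar(value, anchor, tag, plain, quoted, style)
    @texts << value if @was_text
    @was_text = value == 'text'
  end
end

fake_stream = StringIO.new(
  chunk(%({"text":"hello"}\r\n)) + chunk(%({"text":"world"}\r\n))
)

reader, writer = IO.pipe
producer = Thread.new do
  frame_chunks(fake_stream, writer)
  writer.close # lets the parser see end-of-file
end

collector = TextCollector.new
Psych::Parser.new(collector).parse(reader)
producer.join

p collector.texts
```

Unlike the real feed, the canned stream ends, so the producer closes its end of the pipe and the parser returns instead of running forever.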

Problems

So far, there have only been two problems for me with this script. The first is that we are forced to buffer the response from Twitter, but we cannot help that. The second is that sometimes the JSON emitted from Twitter is not parseable by Psych. I think this is just due to Psych only supporting YAML 1.1.

Conclusion

It’s true that we could have implemented this same interface without a pipe and a thread. Rather than munging the stream, we could create a new parser instance for each status update. But why create so many objects for parsing the stream when we only need one?

Anyway, have fun playing with this code, and I encourage you to try out Ruby 1.9.2. I think it’s really fun! PEW PEW PEW! HAPPY SATURDAY!

RDoc on your iPad

Posted by – April 12, 2010

Oh snap! I haven’t posted here in a long time. My day job and my night jobs have been keeping me too busy! Hopefully I’ll have more time to blog in the future. I have a bunch of ideas, I just need to find the time to write!

Anyway, let’s talk RDoc, iPad, and epub! I like documentation. I especially like consuming documentation. I thought it would be neat if I could read documentation on my iPad. As it turns out, getting RDoc documentation on your iPad isn’t that hard!

Nokogiri on iPad

According to Wikipedia, iBooks is an EPUB reader. EPUB is a standard format for making books. The EPUB format is basically a zip file that contains a bunch of XHTML and XML documents. The XHTML documents are the “meat” of your book, where the XML documents tell the reader where to find everything, and the order in which to put things. RDoc already emits HTML, so our job is to make sure it emits XHTML along with the special XML files. How do we do that?

RDoc supports a plugin system where we can hook in and emit anything we want. To hook into RDoc, we just add a special file to our gem (“lib/rdoc/discover.rb”), and register with the RDoc plugin system. So I wrote a gem called paddle that plugs into RDoc and emits the documentation as XHTML along with the supporting XML files. It even comes with a nice Ruby logo! I encourage you to take a look at the source. The code is quite short, but could be refactored even smaller!

Using the Paddle

Creating your own books with Paddle is really easy. First, install paddle:

  $ sudo gem install paddle

Then find a project for which you want to create a book. For this example, I’ll generate a book for one of my gems called texticle. From the project root, use the rdoc command. Make sure to tell rdoc to use the “paddle” formatter, and supply a title (very important to supply a title!):

  $ cd git/texticle
  $ rdoc -f paddle -t 'Texticle Documentation' -o epub lib

Now there should be an “epub” directory that contains your book. But we’re not quite done yet. There is one more step. The book must be in a zipfile, and the zipfile requires a particular format. Let’s create the zipfile now using the “zip” command:

  $ cd epub
  $ zip -Xr9D texticle.epub mimetype *

You should end up with a file named “texticle.epub”. Just drag that file to iTunes, sync up your iPad, and boom!

Problems

I hacked this out in an evening, so there are a few problems. I’ll mention them here, just so you’re not surprised, and to give you ideas for patches to submit! ;-)

  • Right now, the links don’t work: I haven’t figured out why, but they don’t. That will come soon.

  • The author field isn’t filled out in the book: I need to teach RDoc to take more command line options so we can tell Paddle what to use for the book’s author field.

  • Only classes, modules, and the things they contain are documented: Right now, your README file won’t show up in the book. That is just missing right now. It should be easy to add, I just haven’t done it.

A couple books to get you started

THE END!

Thanks for reading! Have fun making books for your iPad, and don’t forget to send patches back to me! :-D

Compiling with Clang

Posted by – January 3, 2010

HI EVERYONE AND HAPPY SUNDAY!

Lately I’ve been trying to compile my ruby extensions with Clang. One reason I like trying out my extensions with Clang is because it catches some errors that GCC doesn’t. If you know the right things to set, it’s pretty easy to get your extension to compile with Clang. Unfortunately finding the right thing isn’t always easy, but I found the right bits to flip and I want to share!

Here’s how to do it. Add this line to your extconf.rb right after you require mkmf:

require 'mkmf'

RbConfig::MAKEFILE_CONFIG['CC'] = ENV['CC'] if ENV['CC']

# ... rest of your extconf goes here

Then when you compile your extension, just set CC to point at clang:

$ CC=/Developer/usr/bin/clang rake compile

You can see it in action in the nokogiri extconf. You can even see where clang helped me shake out some bugs, and I think that’s pretty cool.

Writing Ruby C extensions: Part 1

Posted by – December 18, 2009

I like writing C extensions for Ruby. In this series of blog posts we’re
going to explore writing C extensions. I will cover topics including setting
up the development environment, TDD, debugging techniques, dealing with
Ruby’s garbage collector, cross compiling for windows, and more.

By the end of this series, we should end up with a Ruby C extension that wraps
libstree. libstree is a Suffix tree implementation written in C.

In this part, we’re going to set up our development environment, examine the
layout of a typical C extension, and implement our first method in C. Of
course we will be doing this TDD, so we’ll also get autotest running.

Prerequisite gems

First up, we need to install a few gems to make building our extension easier.
Install the following three gems, and while they install, you should read about
why we need them:

$ sudo gem install ZenTest hoe rake-compiler

ZenTest

ZenTest contains autotest, which we’ll be using to automatically run the tests
while we’re developing.

hoe

Hoe abstracts gem specifications for us. It knows how to properly build a
gemspec, and provides us with a few rake tasks that make development simple.

rake-compiler

This gem provides us with compilation tasks, and generally makes building
native gems easier. We’ll be looking further in to rake-compiler’s
capabilities in later articles.

Create the project

We’re going to call this gem “stree”. The first thing we’ll do is use the “sow”
command supplied by Hoe to create the initial project structure.

$ sow stree

You should now have an initial project tree set up under the “stree” directory.
Remove the “bin” directory, as we won’t need that. I rename all of my
documentation files to end in “rdoc”, but that is just my personal preference.

Writing our first test

First thing we need to do is write our first failing test. Open up
“test/test_stree.rb” and make it look like this:

require "test/unit"
require "stree"

class TestStree < Test::Unit::TestCase
  def test_hello_world
    assert_equal 'hello world', Stree.hello_world
  end
end

This test is very simple. The trick though, is that the “hello_world” method
will be implemented in C. At this point, you should be able to run “rake” and
see a failing test.

Native extension project layout

Native extension layouts look very similar to normal pure ruby layouts. We just
add one more directory called “ext”. Under the “ext” directory we’ll add
another directory that is the same name as our gem, “stree”. Under
“ext/stree” is where we’ll keep all of our C code. Make those directories,
and you should have a file list that looks similar to this:

$ tree
.
|-- CHANGELOG.rdoc
|-- Manifest.txt
|-- README.rdoc
|-- Rakefile
|-- ext
|   `-- stree
|-- lib
|   `-- stree.rb
`-- test
    `-- test_stree.rb

The next step is to modify our Rakefile.

Modifying the Rakefile

The next step is to modify the Rakefile to teach it how to compile our
extension. Once we get done with this step, our Rakefile will have a task
called “compile”.

Modify your Rakefile so that it looks similar to this:

require 'rubygems'
require 'hoe'
require 'rake/extensiontask' # from rake-compiler, defines Rake::ExtensionTask

Hoe.spec 'stree' do
  developer('Aaron Patterson', 'aaron@tenderlovemaking.com')
  self.readme_file   = 'README.rdoc'
  self.history_file  = 'CHANGELOG.rdoc'
  self.extra_rdoc_files  = FileList['*.rdoc']
  self.extra_dev_deps << ['rake-compiler', '>= 0']
  self.spec_extras = { :extensions => ["ext/stree/extconf.rb"] }

  Rake::ExtensionTask.new('stree', spec) do |ext|
    ext.lib_dir = File.join('lib', 'stree')
  end
end

Rake::Task[:test].prerequisites << :compile

I've modified the readme and history file sections to use custom named files.
The important parts are the "spec_extras", the "Rake::ExtensionTask" line and
the "Rake::Task" line.

The "spec_extras" line modifies the gemspec. When someone installs our gem,
this line tells the gem command to execute the "ext/stree/extconf.rb" file.
We'll talk a little bit more about the extconf.rb file later.

The "Rake::ExtensionTask" is the line where we get our "compile" task. It comes
from the rake-compiler gem. This block also configures rake-compiler to tell
it where to copy the compiled extension when it's finished. We want our
compiled extension to end up in "lib/stree/". This is my convention, and I'll
explain why this convention is good in later posts.

The final line tells Rake to always compile our extension before the tests run.
Some people might not want to use this, but I like compiling my extension
before every test run.
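
If the prerequisite trick looks like magic, here is a small sketch you can run outside of any project. The `demo_task_order` helper and the private Rake::Application are my additions for illustration, so the demo does not touch a real Rakefile; the only line that matters is the one that appends to "prerequisites".

```ruby
require 'rake'

# Builds two throwaway tasks in a private Rake::Application and returns
# the order in which they actually ran.
def demo_task_order
  order = []
  app = Rake::Application.new
  app.define_task(Rake::Task, :compile) { order << :compile }
  app.define_task(Rake::Task, :test)    { order << :test }

  # Same move as the last line of the Rakefile: make :compile a
  # prerequisite of :test after both tasks have been defined.
  app[:test].prerequisites << :compile
  app[:test].invoke
  order
end

p demo_task_order
```

Invoking the test task pulls in the compile task first, which is exactly what we want before every test run.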

Configuring autotest

Autotest doesn't use the normal rake tasks when running your tests. That means
we need to teach autotest to compile our extension before running the tests.
We're going to hook in to the autotest run command and have it build our
extension before running the tests.

While we're at it, we'll also teach autotest to run the tests after any .c
files get modified.

Open up ".autotest" and make it look like this:

require 'autotest/restart'

Autotest.add_hook :initialize do |at|
  at.add_mapping(/.*\.c/) do |f, _|
    at.files_matching(/test_.*rb$/)
  end
end

Autotest.add_hook :run_command do |at|
  system "rake clean compile"
end
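
One thing to know about the mapping: an unanchored pattern like /.*\.c/ matches ".c" anywhere in the path, not just at the end. That is harmless for this project, but here is a quick sketch of the difference. The two helper methods are hypothetical names for illustration; autotest does not define them.

```ruby
# Matches ".c" anywhere in the path (same shape as the mapping's pattern).
def matches_anywhere?(path)
  path.match?(/.*\.c/)
end

# Matches only when the path ends in ".c".
def matches_at_end?(path)
  path.match?(/\.c\z/)
end

%w[ext/stree/stree.c public/site.css lib/stree.rb].each do |path|
  puts "#{path}: anywhere=#{matches_anywhere?(path)} at_end=#{matches_at_end?(path)}"
end
```

If your project ever grows files like .css or .cpp, anchoring the pattern keeps autotest from rerunning the tests for them.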

Start up autotest and let it run in the background. By the end of this blog
post, autotest should show one passing test.

At this point, we should see Rake complaining with a message:

rake aborted!
Don't know how to build task 'ext/stree/extconf.rb'

Let's deal with that error now.

extconf.rb

The responsibility of the extconf.rb file is to generate a Makefile that will
be used to build your extension. Eventually, we will need to teach extconf.rb
how to examine the target system to make sure that the libstree library is
installed.

Right now, we don't need to do any inspection of the system. We simply want
to create a Makefile. To build our Makefile, we're going to use a library
that ships with ruby called "mkmf". Open up "ext/stree/extconf.rb" and
modify it to look like this:

require 'mkmf'
create_makefile('stree/stree')

That is the minimum code required to get our Makefile generated.

As I mentioned earlier, the extconf.rb file is executed by the RubyGems system
when installing our gem. While we're developing our gem, rake-compiler will
take care of executing that file for us.

Our first C code

Great! Our environment knows how to compile things, but we don't have anything
to compile! Let's write our first bit of C code.

We are going to write the file named "ext/stree/stree.c". The name of this
file is important. It corresponds to the "create_makefile" line from our
extconf.rb. After our extension is built, we'll end up with a file
"lib/stree/stree.bundle" (or .so depending on your system). This convention is
important, but I'm going to talk about why it's important in a later post.

When Ruby loads the dynamic library we're building, it must supply us with
some way to define our native methods. The way it does this is with another
naming convention using the dynamic library's file name. When "stree.bundle"
is loaded, ruby will automatically try to call a function called "Init_stree".
The part after "Init_" matches the name of the file it loaded. The Init_stree
function is where we'll do our native extension initialization.
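
To make the naming convention concrete, here is a tiny sketch. The `init_function_for` helper is hypothetical (it is not part of Ruby's API); it only shows how the function name falls out of the file name.

```ruby
# Hypothetical helper: derives the Init function name Ruby looks for
# from the path of a compiled extension.
def init_function_for(path)
  "Init_#{File.basename(path, '.*')}"
end

puts init_function_for('lib/stree/stree.bundle')  # Init_stree
puts init_function_for('lib/stree/stree.so')      # Init_stree
```

The file extension does not matter, only the base name does.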

In stree.c, define the Init_stree function to look like this:

#include <ruby.h>
void Init_stree()
{
  VALUE mStree = rb_define_module("Stree");
  rb_define_singleton_method(mStree, "hello_world", hello_world, 0);
}

This function does two things: it defines the "Stree" module, and it defines
the "hello_world" method on the Stree module.

The first line actually defines the module. The second line tells ruby to
define the singleton method "hello_world", and when that method gets called,
to call the "hello_world" C function. The 0 is the number of arguments the
Ruby method takes.

Let's actually add the hello_world C function now:

static VALUE hello_world(VALUE mod)
{
  return rb_str_new2("hello world");
}

We declare this function as static because it's not needed outside this file.
All ruby methods must return a VALUE. The first argument to the C function is
always the recipient of the message, in this case it will be the Stree module.

We create a new ruby string with rb_str_new2() and return it.

The final stree.c file should look like this:

#include <ruby.h>

static VALUE hello_world(VALUE mod)
{
  return rb_str_new2("hello world");
}

void Init_stree()
{
  VALUE mStree = rb_define_module("Stree");
  rb_define_singleton_method(mStree, "hello_world", hello_world, 0);
}
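
If the C is hard to picture, here is roughly the same thing written in pure Ruby. This is only a comparison sketch, not something to add to the project; the comments point back at the C calls each part stands in for.

```ruby
# Pure Ruby stand-in for stree.c, for comparison only.
module Stree                 # rb_define_module("Stree")
  # rb_define_singleton_method(mStree, "hello_world", hello_world, 0)
  def self.hello_world
    'hello world'            # rb_str_new2("hello world")
  end
end

puts Stree.hello_world
```

The receiver that Ruby hands us implicitly as "self" is the same object the C function receives as its first VALUE argument.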

A note on types.

When we write ruby, everything is an object. When we write ruby in C,
everything is a VALUE. We'll learn more about the VALUE type in later posts.

Finishing up

We've written our C code, everything should now compile and copy in to the
right place, but our tests are still failing. What gives?

We've got one more tiny modification to make. We need to actually require
the dynamic library that we built. Open up "lib/stree.rb" and modify it to
look like this:

require 'stree/stree'

module Stree
  VERSION = '1.0.0'
end

We've now told ruby to load the dynamic library we built, and changed the
definition of Stree to a module in our ruby code. At this point, our test
should be passing. Congratulations! You have now successfully mixed C and
Ruby code.

If you were successful, your project tree should look like this:

$ tree -I tmp
.
|-- CHANGELOG.rdoc
|-- Manifest.txt
|-- README.rdoc
|-- Rakefile
|-- ext
|   `-- stree
|       |-- extconf.rb
|       `-- stree.c
|-- lib
|   |-- stree
|   |   `-- stree.bundle
|   `-- stree.rb
`-- test
    `-- test_stree.rb

The "tmp" directory is where rake-compiler stashes your .o files when
compiling your extension. I've omitted that from the tree to keep it short.

Last notes

I've posted the code for stree here in case you're having troubles. I've
made tags for each post so you can follow along.

Next time, we'll tackle making sure libstree is installed, compiling and
linking against libstree, and making a few calls in to libstree from Ruby.

In the meantime, your homework is to read through README.EXT.

Rubyconf 2009 Slides

Posted by – December 7, 2009

I’ve posted the slides for the talk that Ryan and I gave at Rubyconf 2009.

You can grab the pdf here.

I’ve also put them on slideshare here.

I replaced the videos in the slides with links to youtube. If you want to jump straight to the videos though, you can find them here, here, and here.

Full Text Search on Heroku

Posted by – October 17, 2009

YA!! IT’S SATURDAY NIGHT! YOU ALL KNOW WHAT THAT MEANS! Time to get krunk and do some full text searching. OW! I’d like to share with my tens of loyal readers how I’m doing Full Text Search on Heroku.

Heroku’s documentation lists two ways to get full text indexing working with your Heroku application. They talk about using Ferret and Solr for full text indexes. The Ferret option looks OK, but it requires you to rebuild your indexes every time you push. Solr would work, but it requires an EC2 instance or some third party server. Since my budget is precisely $0, using Solr is out of the picture.

But there is a third option. A very secret option. A devious but fun option. You see, Heroku runs PostgreSQL for each rails application database. They’re running a version new enough (Version 8.3) to have full text index support built in. If we’re willing to throw out database agnosticism, we can take advantage of the database’s indexing capability. For this article, I’d like to hop on the Postgres train and show you how to get full text indexes working with Postgres in your rails application. I’ll also show you how to get those indexes on Heroku so we can use them “in the cloud” (Heroku is in the cloud, right?).

For the rest of this article, I’m going to assume you have PostgreSQL version 8.3 or higher installed already and can get your rails application working with Postgres. Installing postgres is outside the scope of this article, but I found these instructions to be very helpful.

Step 1: Go get some coffee

I love it when instructions tell me to go get some coffee because I always do. I have to follow the instructions right?

Step 2: Install Texticle

Texticle is a gem I wrote to help you define your text indexes on a per model basis. To install texticle, we just do the normal gem install:

  $ sudo gem install texticle

The gem is pure ruby and isn’t very long, so I encourage you to peek through the source.

While we’re at it, we should configure rails to load the texticle gem. We need to add it to our environment.rb file. Here’s what mine looks like:

RAILS_GEM_VERSION = '2.3.4' unless defined? RAILS_GEM_VERSION

require File.join(File.dirname(__FILE__), 'boot')

Rails::Initializer.run do |config|
  config.time_zone = 'UTC'

  config.gem 'texticle'
end

Texticle also comes with some handy rake tasks (which we’ll talk about later). In order to get those we’ll need to update the rails Rakefile:

require(File.join(File.dirname(__FILE__), 'config', 'boot'))

require 'rake'
require 'rake/testtask'
require 'rake/rdoctask'

require 'tasks/rails'

require 'rubygems'

## Our texticle rake tasks
require 'texticle/tasks'

Step 3: Configuring your index

Let’s pretend we have an Article model. The Article model has a “title” field and a “body” field:

class CreateArticles < ActiveRecord::Migration
  def self.up
    create_table :articles do |t|
      t.string :title
      t.text   :body

      t.timestamps
    end
  end

  def self.down
    drop_table :articles
  end
end

To index those two fields, we just create an index block in the model and list the fields we want to index:

class Article < ActiveRecord::Base
  index do
    title
    body
  end
end

Declaring this index automatically defines a “search” method on the model that we can use to search our articles:

>> Article.search('coffee instruction')
=> [#<Article id: 4, title: "coffee", body: "I like getting coffee to be in instructions", created_at: "2009-10-17 21:42:13", updated_at: "2009-10-17 21:42:13">]
>> Article.create(:title => 'kittens', :body => 'kitten poop smells bad, but I still like kittens.')
=> #<Article id: 5, title: "kittens", body: "kitten poop smells bad, but I still like kittens.", created_at: "2009-10-17 21:42:33", updated_at: "2009-10-17 21:42:33">
>> Article.search('kittens')
=> [#<Article id: 5, title: "kittens", body: "kitten poop smells bad, but I still like kittens.", created_at: "2009-10-17 21:42:33", updated_at: "2009-10-17 21:42:33">]
>>
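
A search like this boils down to a Postgres `@@` match between a tsvector and a tsquery. Here’s a simplified sketch of how a search string could be turned into that kind of SQL. The `tsquery_for` and `search_condition` helpers, and joining the words with `&`, are assumptions for illustration, not texticle’s exact output; peek at the source for the real thing.

```ruby
# Hypothetical helper: "coffee instruction" -> "coffee & instruction",
# meaning every word has to match.
def tsquery_for(terms)
  terms.split.join(' & ')
end

# Hypothetical helper: builds a WHERE condition over the given columns.
# Real code must escape or bind the query string instead of interpolating it.
def search_condition(columns, terms)
  document = columns.map { |c| "coalesce(#{c}, '')" }.join(" || ' ' || ")
  "to_tsvector('english', #{document}) @@ " \
    "to_tsquery('english', '#{tsquery_for(terms)}')"
end

puts search_condition(%w[title body], 'coffee instruction')
```

The coalesce calls keep a NULL title or body from turning the whole document into NULL.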

Great! We can search our records. There’s just one catch: we haven’t indexed our data. Doing these types of searches will be slow against large sets of data unless we add an index. Writing these indexes is a PITA, so texticle comes with a handy rake task for generating a migration to create your indexes:

  $ rake texticle:migration
  $ rake db:migrate

After running this, Postgres can use the prebuilt indexes when searching your data.
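
If you’re curious what a full text index looks like in SQL, here’s a rough sketch of the kind of statement such a migration could run. The `fts_index_sql` helper and the index name are hypothetical, made up for illustration; the generated migration is the source of truth for what texticle actually creates.

```ruby
# Hypothetical helper: builds a CREATE INDEX statement for a GIN index
# over the concatenated text of the given columns.
def fts_index_sql(table, columns, dictionary = 'english')
  document = columns.map { |c| "coalesce(#{c}, '')" }.join(" || ' ' || ")
  "CREATE INDEX #{table}_fts_idx ON #{table} " \
    "USING gin(to_tsvector('#{dictionary}', #{document}))"
end

puts fts_index_sql('articles', %w[title body])
```

Postgres can only use an expression index like this when the query uses the same expression, which is one reason to regenerate the migration whenever the indexed columns change.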

Just remember: every time you modify columns in your index block, or add new index blocks, you should create a new migration to update the indexes. If you don’t update the indexes, searches will still work as expected, they just might be kind of slow.

Step 4: Integrating With Heroku

This part is pretty easy. First we update our heroku gem manifest:

  $ echo "texticle" >> .gems
  $ git add .gems
  $ git commit -m'updating gem manifest'
  $ git push origin master

Once your code is up on heroku, just tell heroku to migrate the database:

  $ heroku rake db:migrate

It’s just that easy! Your indexes should be available on the Heroku database server and your application can use them.

Advanced Texticle Usage

Texticle has a few more features I’d like to briefly mention. The first one is search ranking. We can tell Postgres which field has a higher priority. For example, we can tell Postgres to weigh matches in the article’s title higher than matches in the body:

class Article < ActiveRecord::Base
  index do
    title 'A'
    body  'B'
  end
end

The ranks are ‘A’ through ‘D’, and multiple fields can have the same rank.
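
In Postgres terms, weights like these are applied with `setweight`. Here’s a small sketch of how a column-to-weight mapping could become a weighted document expression. The `weighted_document` helper is a hypothetical name for illustration, and the SQL is a simplification rather than texticle’s exact output.

```ruby
# Hypothetical helper: turns { column => weight } into a weighted
# tsvector expression, one setweight(...) per column joined with ||.
def weighted_document(weights, dictionary = 'english')
  weights.map { |column, weight|
    "setweight(to_tsvector('#{dictionary}', coalesce(#{column}, '')), '#{weight}')"
  }.join(' || ')
end

puts weighted_document({ 'title' => 'A', 'body' => 'B' })
```

Postgres then takes those labels into account when it ranks matches, so a hit in the title counts for more than a hit in the body.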

We can also group indexes. The index we’ve seen so far will search all columns listed. We can add another index so that we only search the “title” field:

class Article < ActiveRecord::Base
  index do
    title 'A'
    body  'B'
  end

  index('title') { title }
end

This gives us a “search_title” method in addition to the “search” method:

>> Article.search_title('kittens')
=> [#<Article id: 5, title: "kittens", body: "kitten poop smells bad, but I still like kittens.", created_at: "2009-10-17 21:42:33", updated_at: "2009-10-17 21:42:33">]
>>

The last thing I want to mention is “rank”. When you perform a search, texticle adds an extra field to your model called “rank”. The rank indicates how well your record matched the search criteria:

>> Article.search('like').map { |x| x.rank }
=> ["0.4", "0.4"]
>> Article.search('coffee').map { |x| x.rank }
=> ["1.4"]
>>

Search results are already returned sorted by rank in descending order, so no need to worry about sorting.

Conclusion

I hope you enjoy tickling text with texticle as much as I do. So far, I’ve been pretty happy with this solution.

Things I like:

  • It’s the right price for use with Heroku (namely $0)
  • Easy to configure and deploy
  • No need to rebuild indexes on pushes
  • Postgres can be configured to use different dictionaries, so you aren’t stuck with English

The only drawbacks I’ve found so far are:

  • INSERTs and UPDATEs are slower
  • It’s database specific

Inserts and updates will be slower, but that comes with the territory of adding database indexes. My data is mostly doing reads, so it doesn’t bother me. Texticle is database specific, but other databases are starting to have full text search support. I think texticle could be extended to support other databases, but I’m quite happy with postgres.

Anyway, thanks for reading. The final step is that you should go get another cup of coffee.