<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Tender Lovemaking &#187; computadora</title>
	<atom:link href="http://tenderlovemaking.com/category/computadora/feed/" rel="self" type="application/rss+xml" />
	<link>http://tenderlovemaking.com</link>
	<description>The act of making love, tenderly.</description>
	<lastBuildDate>Thu, 16 Feb 2012 17:10:43 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
		<item>
		<title>Connection Management in ActiveRecord</title>
		<link>http://tenderlovemaking.com/2011/10/20/connection-management-in-activerecord/</link>
		<comments>http://tenderlovemaking.com/2011/10/20/connection-management-in-activerecord/#comments</comments>
		<pubDate>Thu, 20 Oct 2011 19:14:53 +0000</pubDate>
		<dc:creator>Aaron Patterson</dc:creator>
				<category><![CDATA[computadora]]></category>
		<category><![CDATA[rails]]></category>
		<category><![CDATA[ruby]]></category>

		<guid isPermaLink="false">http://tenderlovemaking.com/?p=566</guid>
		<description><![CDATA[OMG! Happy Thursday! I am trying to be totally enthusiastic, but the truth is that I have a cold, so there will be fewer uppercase letters and exclamation points than usual. Anyway, I want to talk &#8230; <a class="more-link" href="http://tenderlovemaking.com/2011/10/20/connection-management-in-activerecord/">More<span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>OMG! Happy Thursday!  I am trying to be totally enthusiastic, but the truth is that I have a cold, so there will be fewer uppercase letters and exclamation points than usual.</p>
<p>Anyway, I want to talk about database connection management in ActiveRecord.  I am not too pleased with its current state of affairs.  I would like to describe how ActiveRecord connection management works today, how I think it <em>should</em> work, and steps towards fixing the current system.</p>
<p><strong>TL;DR: database connection API in ActiveRecord should be more similar to File API</strong></p>
<h2>Thinking in terms of files</h2>
<p>It&#8217;s convenient to think of our database connection as a file.  Dealing with files is very common.  When we work with files, the basic sequence goes something like this:</p>
<ul>
<li>Open the file</li>
<li>Do some work on the file handle</li>
<li>Close the file</li>
</ul>
<p>We&#8217;re very used to doing these steps when dealing with files.  Typically our code will look something like this:</p>
<pre class="brush: ruby; title: ; notranslate">
File.open('somefile.txt', 'wb') do |fh| # Open the file
  fh.write &quot;hello world&quot;                # Do some work with the file
end                                     # Close file when block returns
</pre>
<p>We don&#8217;t want to share open files among threads because dealing with synchronization around reading and writing to the file is too difficult (and time consuming).  So maybe we&#8217;ll store the handle in a thread local or something until we&#8217;re ready to close it.</p>
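<p>Here is a minimal sketch of that thread-local idea, assuming a made-up helper called <code>with_thread_local_handle</code> and a made-up thread-local key <code>:fh</code> (neither is part of Ruby or Rails).  Each thread opens its own handle, works with it, and closes it:</p>

```ruby
require "tmpdir"

# Hypothetical helper: keeps one open handle per thread in a thread local.
def with_thread_local_handle(path)
  Thread.current[:fh] = File.open(path, "ab")  # Open the file
  yield Thread.current[:fh]                    # Do some work with the handle
ensure
  fh = Thread.current[:fh]
  fh.close if fh                               # Close it when the block returns
  Thread.current[:fh] = nil
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "somefile.txt")

  threads = 3.times.map do |i|
    Thread.new do
      # Each thread gets its own handle; nothing is shared between threads.
      with_thread_local_handle(path) { |fh| fh.write "hello from #{i}\n" }
    end
  end
  threads.each(&:join)

  puts File.read(path)
end
```

<p>The <code>ensure</code> clause is the important part: the handle is closed even if the block raises, which is the same promise the block form of <code>File.open</code> makes.</p>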
<p>Our basic requirements for dealing with a database connection are essentially the same as when dealing with files.  We need to open our database connection, do some work with the connection (send and receive queries), and close the connection.  We have these similarities, yet the API for dealing with database connections in ActiveRecord is vastly different.  Let&#8217;s look at how each of these steps is performed in ActiveRecord today.</p>
<h2>Opening a connection</h2>
<p>Opening a connection to the database is very easy.  First we configure ActiveRecord with the database specification, then we call <code>connection</code> to actually get back a database handle:</p>
<pre class="brush: ruby; title: ; notranslate">
ActiveRecord::Base.establish_connection(
  :adapter  =&gt; &quot;sqlite3&quot;,
  :database =&gt; &quot;path/to/dbfile&quot;)

connection_handle = ActiveRecord::Base.connection
</pre>
<p>The main difference between this API and the File API is that we&#8217;ve separated the connection specification from actually opening the connection.  In the case of opening a file, we call <code>open</code> along with a &#8220;specification&#8221; which includes the file name and how we want to open it.  In this case, we&#8217;ve separated the two; essentially storing the specification in a global place, then opening the connection later.</p>
<p>This leads to two questions:</p>
<ol>
<li>Where is the specification stored?</li>
<li>When I call <code>connection</code>, what specification is used?</li>
</ol>
<p>The answer to the first question can be found by reading the <a href="https://github.com/rails/rails/blob/master/activerecord/lib/active_record/connection_adapters/abstract/connection_specification.rb#L57-91"><code>establish_connection</code> method</a>.  Specifically if we look at <a href="https://github.com/rails/rails/blob/master/activerecord/lib/active_record/connection_adapters/abstract/connection_specification.rb#L63">line 63</a> we&#8217;ll find a clue.  Since this method is a class method, the call to <code>name</code> returns the <em>class name of the recipient</em>.  This name (along with our actual spec) is passed in to the connection handler object.  If we jump through a few more layers of indirection, we&#8217;ll find that what we have is essentially a one to one mapping of <em>class name to connection specification</em>.</p>
<p>Armed with this information, we can tackle the second question.  If we look at the <a href="https://github.com/rails/rails/blob/master/activerecord/lib/active_record/connection_adapters/abstract/connection_specification.rb#L114-116">implementation of <code>connection</code></a>, it calls <code>retrieve_connection</code> on itself, which calls <code>retrieve_connection</code> on the connection handler with itself.  A few more method calls later, and we see that each ActiveRecord subclass <a href="https://github.com/rails/rails/blob/master/activerecord/lib/active_record/connection_adapters/abstract/connection_pool.rb#L432-437">walks up the inheritance tree looking for a connection</a>:</p>
<pre class="brush: ruby; title: ; notranslate">
def retrieve_connection_pool(klass)
  pool = @connection_pools[klass.name]
  return pool if pool
  return nil if ActiveRecord::Base == klass
  retrieve_connection_pool klass.superclass
end
</pre>
<p>If we read this code carefully, we&#8217;ll notice that not only are connection specifications mapped to classes, so are database connections (through their connection pools)!</p>
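<p>To see that lookup in isolation, here is a toy version.  <code>ToyHandler</code>, <code>ToyBase</code>, and the model classes are all made up for illustration, and the &#8220;pools&#8221; are just symbols; only the body of <code>retrieve_connection_pool</code> follows the code quoted above:</p>

```ruby
class ToyBase; end
class Animal < ToyBase; end
class Dog < Animal; end
class Invoice < ToyBase; end

# Hypothetical stand-in for the connection handler: class name => pool.
class ToyHandler
  def initialize
    @connection_pools = {}
  end

  def establish(klass, pool)
    @connection_pools[klass.name] = pool
  end

  # Same shape as the method quoted above, with ToyBase as the root.
  def retrieve_connection_pool(klass)
    pool = @connection_pools[klass.name]
    return pool if pool
    return nil if ToyBase == klass
    retrieve_connection_pool klass.superclass
  end
end

handler = ToyHandler.new
handler.establish(ToyBase, :default_pool)
handler.establish(Animal, :animal_pool)

p handler.retrieve_connection_pool(Dog)      # nothing registered for Dog, found on Animal
p handler.retrieve_connection_pool(Invoice)  # falls all the way back to ToyBase
```

<p>Nothing in the lookup is about databases at all.  The class hierarchy is doing the job of a lookup key, which is the coupling I am complaining about.</p>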
<h3>Why is this bad?</h3>
<p>This behavior smells bad to me.  We&#8217;re tightly coupling classes to database connections when this relationship doesn&#8217;t really need to exist.</p>
<h3>How can it be improved?</h3>
<p>If this tight coupling is removed, we can reduce the complexity of ActiveRecord and at the same time increase the features available!  We can reduce this coupling by passing the connection specification to the method that actually opens the connection.  Specifications can be stored on each class as a <em>convenience</em>, but nothing more.</p>
<p>What if opening a connection looked more like this?</p>
<pre class="brush: ruby; title: ; notranslate">
spec = ActiveRecord::Base.specification
ActiveRecord::ConnectionPool.open(spec) do |conn|
  ...
end
</pre>
<p>We could maintain the current behavior by storing specifications on each class, but eliminate the coupling between connection and class.  We would be able to delete all of the code that looks up connections by class hierarchy, and open the doors to having features like this:</p>
<pre class="brush: ruby; title: ; notranslate">
spec = database_a
ActiveRecord::ConnectionPool.open(spec) do |conn|
  User.find_all
end

spec = database_b
ActiveRecord::ConnectionPool.open(spec) do |conn|
  User.find_all
end
</pre>
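<p>None of this API exists yet, so here is a rough sketch of the shape I have in mind using toy classes.  <code>ToyConnection</code> and <code>ToyConnectionPool</code> are invented for this example and do not talk to a real database; the point is only that <code>open</code> takes the spec and behaves like <code>File.open</code>:</p>

```ruby
# Hypothetical connection object; it only remembers its spec and open state.
class ToyConnection
  attr_reader :spec

  def initialize(spec)
    @spec = spec
    @open = true
  end

  def close
    @open = false
  end

  def open?
    @open
  end
end

module ToyConnectionPool
  # Mirrors File.open: with a block, yield the connection and close it when
  # the block returns; without a block, hand back the open connection.
  def self.open(spec)
    conn = ToyConnection.new(spec)
    return conn unless block_given?

    begin
      yield conn
    ensure
      conn.close
    end
  end
end

database_a = { :adapter => "sqlite3", :database => "a.db" }

seen = nil
ToyConnectionPool.open(database_a) do |conn|
  seen = conn
  puts "open inside the block: #{conn.open?}"
end
puts "open after the block: #{seen.open?}"
```

<p>The spec is an argument rather than something looked up through a class, so switching databases is just a matter of passing a different spec.</p>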
<h2>Working with the connection</h2>
<p>Working with our connection should remain the same.  We have one place to retrieve our connection and work with it.  Woo!</p>
<h2>Dealing with thread safety</h2>
<p>Sharing open file handles among threads probably isn&#8217;t a good idea and the same can be said about open database connections.  So how does ActiveRecord keep connections localized to one thread?  If we jump through many, many method calls, we&#8217;ll find where the connection is actually checked out of the connection pool.  It is <a href="https://github.com/rails/rails/blob/master/activerecord/lib/active_record/connection_adapters/abstract/connection_pool.rb#L156-163">here we see how thread safety is handled</a>:</p>
<pre class="brush: ruby; title: ; notranslate">
# Retrieve the connection associated with the current thread, or call
# #checkout to obtain one if necessary.
#
# #connection can be called any number of times; the connection is
# held in a hash keyed by the thread id.
def connection
  @reserved_connections[current_connection_id] ||= checkout
end
</pre>
<p>A hash is kept where the key is the <code>current_connection_id</code>.  The implementation of <code>current_connection_id</code> looks up the current id.  If the id isn&#8217;t set, it sets it to the object id of the current thread:</p>
<pre class="brush: ruby; title: ; notranslate">
def current_connection_id #:nodoc:
  ActiveRecord::Base.connection_id ||= Thread.current.object_id
end
</pre>
<p>Next we look at the implementation of <code>connection_id</code> to find that it just <a href="https://github.com/rails/rails/blob/master/activerecord/lib/active_record/connection_adapters/abstract/connection_specification.rb#L118-124">gets and sets a thread local</a>:</p>
<pre class="brush: ruby; title: ; notranslate">
def connection_id
  Thread.current['ActiveRecord::Base.connection_id']
end

def connection_id=(connection_id)
  Thread.current['ActiveRecord::Base.connection_id'] = connection_id
end
</pre>
<p>These methods ensure that we have a one to one relationship of open connection and thread.</p>
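<p>Here is a small sketch of that one to one relationship.  <code>ThreadKeyedPool</code> is a made-up class, its &#8220;connections&#8221; are just strings, and the <code>Mutex</code> is my own addition to keep the toy safe; only the <code>||= checkout</code> idea comes from the code above:</p>

```ruby
# Hypothetical pool that reserves one "connection" per thread.
class ThreadKeyedPool
  def initialize
    @reserved_connections = {}
    @lock    = Mutex.new
    @next_id = 0
  end

  # Same idea as the real #connection: look up by thread id, check out if missing.
  def connection
    @lock.synchronize do
      @reserved_connections[Thread.current.object_id] ||= checkout
    end
  end

  def release_connection(with_id = Thread.current.object_id)
    @lock.synchronize { @reserved_connections.delete(with_id) }
  end

  def reserved_count
    @lock.synchronize { @reserved_connections.size }
  end

  private

  # Stand-in for a real checkout; just hands out a fresh label.
  def checkout
    @next_id += 1
    "connection-#{@next_id}"
  end
end

pool = ThreadKeyedPool.new
main_conn  = pool.connection
other_conn = Thread.new { pool.connection }.value

puts main_conn.equal?(pool.connection)  # same thread gets the same object back
puts main_conn == other_conn            # another thread gets a different one
```

<p>Notice that the second thread never released its connection.  That leftover entry is exactly the problem the next section runs into.</p>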
<h2>Closing the connection</h2>
<p>Finally we reach our last step: closing the connection.  How many of you have closed your connection to the database in ActiveRecord?  My guess is that it&#8217;s very few.  I think the reason people don&#8217;t typically close their connections with ActiveRecord is twofold.  One, you don&#8217;t have to because it just does it for you, and two, the API to close a particular connection is pretty convoluted.</p>
<p>So how is the connection closed today?  There are two ways, the easy way and the hard way.</p>
<h3>The easy way</h3>
<p>The easy way is good enough in a non-threaded application.  A rack middleware <a href="https://github.com/rails/rails/blob/master/activerecord/lib/active_record/connection_adapters/abstract/connection_pool.rb#L461-467">clears out all of the connections at the end of the request</a>.  The source for <code>clear_active_connections!</code> is pretty simple.  For each connection pool in the system (remember it&#8217;s one pool per AR class and connection spec), release that connection:</p>
<pre class="brush: ruby; title: ; notranslate">
# Returns any connections in use by the current thread back to the pool,
# and also returns connections to the pool cached by threads that are no
# longer alive.
def clear_active_connections!
  @connection_pools.each_value {|pool| pool.release_connection }
end
</pre>
<p>Each pool releases the connection it has using the <code>current_connection_id</code> (which happens to be the current thread id):</p>
<pre class="brush: ruby; title: ; notranslate">
# Signal that the thread is finished with the current connection.
# #release_connection releases the connection-thread association
# and returns the connection to the pool.
def release_connection(with_id = current_connection_id)
  conn = @reserved_connections.delete(with_id)
  checkin conn if conn
end
</pre>
<p>Not bad.  But what if our system has multiple threads?</p>
<h3>The hard way</h3>
<p>Believe it or not, the connection pool in ActiveRecord will check in connections in the checkout method.  Let me say that again: the checkout method checks in connections and checks out connections.  If you&#8217;re not facepalming yet, let&#8217;s look at a small part of the checkout method:</p>
<pre class="brush: ruby; title: ; notranslate">
@queue.wait(@timeout)

if(@checked_out.size &lt; @connections.size)
  next
else
  clear_stale_cached_connections!
  if @size == @checked_out.size
    raise ConnectionTimeoutError, &quot;could not obtain a database connection#{&quot; within #{@timeout} seconds&quot; if @timeout}. The max pool size is currently #{@size}; consider increasing it.&quot;
  end
end
</pre>
<p>This bit of the checkout method is not called unless our connection pool has become full.  First we wait for other threads to check in their connection.  While we&#8217;re waiting, if other threads checked in their connection, the first branch of the if statement executes, and a connection is returned.  If no threads have checked in their connection, we call <code>clear_stale_cached_connections!</code>:</p>
<pre class="brush: ruby; title: ; notranslate">
def clear_stale_cached_connections!
  keys = @reserved_connections.keys - Thread.list.find_all { |t|
    t.alive?
  }.map { |thread| thread.object_id }
  keys.each do |key|
    checkin @reserved_connections[key]
    @reserved_connections.delete(key)
  end
end
</pre>
<p>This method walks through every thread in your system, looking for connections that were allocated to threads that no longer exist.  Then it checks in connections associated with those dead threads.  Since there is really no easy way for users to check in their own connections, this is actually a common code path for systems that use threads.</p>
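<p>To make the reaping behavior concrete, here is a toy pool where a worker thread exits without checking in.  <code>ReapingPool</code>, <code>reserve</code>, and <code>checked_in</code> are names I made up for this sketch; the body of <code>clear_stale_cached_connections!</code> follows the same logic as the method above:</p>

```ruby
# Hypothetical pool that only tracks which thread id holds which connection.
class ReapingPool
  attr_reader :reserved_connections, :checked_in

  def initialize
    @reserved_connections = {}
    @checked_in = []
  end

  def reserve(conn)
    @reserved_connections[Thread.current.object_id] = conn
  end

  # Find reservations whose thread is gone and check those connections in.
  def clear_stale_cached_connections!
    alive = Thread.list.select(&:alive?).map(&:object_id)
    (@reserved_connections.keys - alive).each do |key|
      @checked_in << @reserved_connections.delete(key)
    end
  end
end

pool = ReapingPool.new
pool.reserve(:main_conn)
Thread.new { pool.reserve(:worker_conn) }.join  # worker exits without checking in

pool.clear_stale_cached_connections!
p pool.checked_in                   # the dead worker's connection
p pool.reserved_connections.values  # the main thread keeps its connection
```

<p>Every call has to build a list of all live threads, and the whole trick depends on being able to ask the runtime which threads are alive.</p>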
<h3>Why is this bad?</h3>
<p>It should be pretty clear why this behavior is bad.  Walking through every thread in the system, and asking if it&#8217;s alive isn&#8217;t very cheap.  Even worse is that we&#8217;re coupling ourselves to the threading system.  We cannot change the connection pool to work with other concurrency solutions (like Fibers) because those solutions may not give us the introspection we need to perform this operation!</p>
<p>But really, this is treating a symptom.  The real problem is that checking in connections is too difficult, so people don&#8217;t do it.</p>
<h3>How can we fix this?</h3>
<p>I think the best solution for this is to mimic the File API.  If we do this, it will become natural for people dealing with the database connection to actually close the connection.</p>
<p>We should make <code>ActiveRecord::Base.connection</code> consult a thread local.  That thread local is set in the rack middleware where the connection is opened.  If someone creates a new thread, they must populate that thread local, and close the connection at the end of the thread.</p>
<p>Simplified, our middleware would become something like this:</p>
<pre class="brush: ruby; title: ; notranslate">
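class ConnectionManagement
  def initialize app
    @app = app
  end

  def call env
    spec       = ActiveRecord::Base.spec
    connection = ActiveRecord::ConnectionPool.open spec
    ActiveRecord::Base.connection = connection

    @app.call env                   # return the Rack response
  ensure
    connection.close if connection  # close even if the app raises
  end
end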
</pre>
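<p>Since <code>ActiveRecord::ConnectionPool.open</code> is hypothetical, here is a version you can actually run with stand-ins.  <code>FakeConnection</code>, <code>ConnectionManagementSketch</code>, the <code>opener</code> argument, and the <code>:toy_connection</code> thread-local key are all assumptions made for this sketch, not Rails API:</p>

```ruby
# Hypothetical connection that only remembers whether it was closed.
class FakeConnection
  def initialize
    @closed = false
  end

  def close
    @closed = true
  end

  def closed?
    @closed
  end
end

class ConnectionManagementSketch
  # opener is any object whose #call returns a fresh connection.
  def initialize(app, opener)
    @app    = app
    @opener = opener
  end

  def call(env)
    connection = @opener.call
    Thread.current[:toy_connection] = connection  # what Base.connection would consult
    @app.call(env)                                # the Rack response is the return value
  ensure
    connection.close if connection                # runs even when the app raises
    Thread.current[:toy_connection] = nil
  end
end

opened = []
opener = lambda { conn = FakeConnection.new; opened << conn; conn }
app    = lambda { |env| [200, { "Content-Type" => "text/plain" }, ["ok"]] }

status, _headers, _body = ConnectionManagementSketch.new(app, opener).call({})
puts status
puts opened.first.closed?
```

<p>One open, one close, per request, and no thread introspection anywhere.</p>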
<p>When people create a new thread, it would look something like this:</p>
<pre class="brush: ruby; title: ; notranslate">
Thread.new do
  spec = ActiveRecord::Base.spec
  ActiveRecord::ConnectionPool.open(spec) do |connection|
    ActiveRecord::Base.connection = connection

    # do some stuff
  end
end
</pre>
<h3>What does this buy us?</h3>
<p>This buys us two important things: simple connection pool management, and freedom of choice on our concurrency model.</p>
<h2>omg the end.</h2>
<p>I hope I&#8217;ve convinced you that by simply learning to treat our database connection like a file, we can reduce code complexity and at the same time increase the features available.  I think I can add this feature to Rails 3.2 and mostly maintain backwards compatibility.  I think we can keep 100% backwards compatibility if we add some sort of flag like <code>config.i_suck_and_will_not_close_my_database_connections = true</code> or <code>config.my_app_is_awesome = true</code>.</p>
<p>Anyway, I&#8217;m totally sick and I&#8217;ll stop blllluuurrrrggghhhing now.</p>
<p>&lt;3 &lt;3 &lt;3 &lt;3 &lt;3</p>
]]></content:encoded>
			<wfw:commentRss>http://tenderlovemaking.com/2011/10/20/connection-management-in-activerecord/feed/</wfw:commentRss>
		<slash:comments>24</slash:comments>
		</item>
		<item>
		<title>I want DTrace probes in Ruby</title>
		<link>http://tenderlovemaking.com/2011/06/29/i-want-dtrace-probes-in-ruby/</link>
		<comments>http://tenderlovemaking.com/2011/06/29/i-want-dtrace-probes-in-ruby/#comments</comments>
		<pubDate>Wed, 29 Jun 2011 22:59:16 +0000</pubDate>
		<dc:creator>Aaron Patterson</dc:creator>
				<category><![CDATA[computadora]]></category>
		<category><![CDATA[ruby]]></category>

		<guid isPermaLink="false">http://tenderlovemaking.com/?p=559</guid>
		<description><![CDATA[Recently I was debugging a performance regression in Rails 3.1. The ticket reported for the regression just indicated a speed problem. Namely allocating new active record objects was very slow in Rails 3.1 compared to Rails &#8230; <a class="more-link" href="http://tenderlovemaking.com/2011/06/29/i-want-dtrace-probes-in-ruby/">More<span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Recently I was debugging a <a href="https://github.com/rails/rails/issues/1717">performance regression in Rails 3.1</a>.  The ticket reported for the regression just indicated a speed problem.  Namely allocating new active record objects was <em>very</em> slow in Rails 3.1 compared to Rails 3.0.</p>
<h3>The benchmark program</h3>
<p>Here is the program I was using for benchmarking.  It&#8217;s slightly modified from the original program that @paul so kindly submitted:</p>
<pre class="brush: ruby; title: ; notranslate">
require 'active_record'
require 'benchmark'

p ActiveRecord::VERSION::STRING

ActiveRecord::Base.establish_connection(
  :adapter  =&gt; &quot;sqlite3&quot;,
  :database =&gt; &quot;:memory:&quot;
)

ActiveRecord::Base.connection.execute(&quot;CREATE TABLE active_record_models (id INTEGER UNIQUE, title STRING, text STRING)&quot;)

class ActiveRecordModel &lt; ActiveRecord::Base; end
ActiveRecordModel.new

N = 100_000
Benchmark.bm { |x| x.report('new') { N.times { ActiveRecordModel.new } } }
</pre>
<p>I could tell from <a href="https://github.com/tmm1/perftools.rb">perftools.rb</a> output we were calling new methods, and those new methods took plenty of time.</p>
<p>Here is the call graph for this benchmark on Rails 3.0:</p>
<div class="thumbnail"><a href="http://skitch.com/aaron.patterson/fghfi/arnew"><img src="http://img.skitch.com/20110627-c2kr8xwqa4x2kiecpdgwyspecg.preview.jpg" alt="arnew" /></a></div>
<p>Compared to the call graph on Rails 3.1:</p>
<div class="thumbnail"><a href="http://skitch.com/aaron.patterson/fghgk/arnew31"><img src="http://img.skitch.com/20110627-1dcj3fcsdqtk4xp3ihan9canyw.preview.jpg" alt="arnew31" /></a></div>
<p>After some work, I was able to remove the method calls to <code>scope</code> when allocating new AR objects, but the benchmark still showed 3.1 to be slower than 3.0.  I noticed we were spending more time in the Garbage Collector on 3.1 than on 3.0.  This led me to believe that we were creating more objects in the newer version.</p>
<h3>Counting Object Allocations</h3>
<p>Counting object allocations with Ruby 1.9 is actually pretty easy.  We can call <code>ObjectSpace.count_objects</code> to get a list of the live objects in the system.  That method returns a hash where the keys are the object type and the values are the number of that object in the system (there are other keys, but we don&#8217;t care about them for now).</p>
<p>Knowing this information, we can write a function that will calculate a rough difference in object allocations:</p>
<pre class="brush: ruby; title: ; notranslate">
def allocate_count
  GC.disable
  before = ObjectSpace.count_objects
  yield
  after = ObjectSpace.count_objects
  after.each { |k,v| after[k] = v - before[k] }
  GC.enable
  after
end

p allocate_count { 100.times { {} } }
</pre>
<p>The first thing we do in this function is disable the garbage collector.  It&#8217;s possible that the code we&#8217;re investigating could trigger GC, and that will impact our allocated object count.</p>
<p>Then we grab our current list of objects, yield to the code we want to profile, grab the list again, calculate our delta, enable GC and return!</p>
<p>If you run this code, you should see output similar to this (I&#8217;ve removed keys I don&#8217;t care about):</p>
<pre><code>{ ... :T_HASH=&gt;101, ... }
</code></pre>
<p>You can see from the output that we&#8217;ve allocated 101 hashes.  100 came from our <code>100.times</code> call, and one came from calling the method on ObjectSpace.</p>
<p>Since inspecting our code can impact object allocations, these counts can&#8217;t be 100% accurate.  Another annoying thing is that we can&#8217;t determine <em>where</em> our code is allocating these hashes.  It&#8217;s easy enough to see the hash literals in this code, but imagine a project like rails.  You may be able to find hash literals in your source, but determining which ones are being called is difficult.</p>
<p>After running this against the benchmark, I found that ActiveRecord in 3.1 was allocating 2x the number of hashes that 3.0 allocated.  Now the problem is figuring out <em>where</em> these hashes were being allocated.</p>
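<p>The comparison itself looks roughly like this.  This is a sketch, not the code I ran against Rails: <code>hash_allocations</code>, <code>build_lean</code>, and <code>build_wasteful</code> are made-up names standing in for &#8220;the same operation on two versions&#8221;, and the counts are rough for the reasons given above:</p>

```ruby
# Returns roughly how many Hash slots appeared while the block ran.
def hash_allocations
  GC.disable
  before = ObjectSpace.count_objects
  yield
  after = ObjectSpace.count_objects
  after.fetch(:T_HASH, 0) - before.fetch(:T_HASH, 0)
ensure
  GC.enable  # turn the collector back on even if the block raises
end

# Two hypothetical versions of the same operation.
def build_lean(attrs)
  attrs.dup            # one new Hash per call
end

def build_wasteful(attrs)
  attrs.merge({})      # a throwaway Hash literal plus the merged result
end

N     = 1_000
attrs = { :title => "hello" }

lean     = hash_allocations { N.times { build_lean(attrs) } }
wasteful = hash_allocations { N.times { build_wasteful(attrs) } }

puts "lean:     #{lean}"
puts "wasteful: #{wasteful}"
```

<p>This tells you <em>that</em> one version allocates more hashes, but still nothing about which line is responsible.</p>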
<h3>DTrace</h3>
<p>DTrace gives us hooks in to our processes.  Many of the hooks provided are for system calls, or other various functions.  DTrace requires that we write probes in our code, and unfortunately Ruby does not have DTrace probes built in.</p>
<h3>Adding DTrace to Ruby</h3>
<p>To find these Hash allocations, I decided to add DTrace support.  I usually run Ruby built from Ruby trunk, so running a modified Ruby seemed acceptable.  It turns out adding DTrace support is pretty easy.</p>
<p>First I added a probe definition file.  This file describes the probes that I want to add to Ruby.  It contains the probe names and the signatures of the probes:</p>
<pre><code>provider ruby {
  probe hash__alloc(const char *, int);
};
</code></pre>
<p>This definition says we&#8217;re declaring a probe called <code>hash-alloc</code> and it will take a string (the file name) and an int (the line number).</p>
<p>Next I used this definition file to generate a header file that ruby would use:</p>
<pre><code>$ dtrace -o probes.h -h -s probes.d
</code></pre>
<p>Finally, I modified hash.c to trigger the probe:</p>
<pre class="brush: diff; title: ; notranslate">
diff --git a/hash.c b/hash.c
index b49aff8..c40d94d 100644
--- a/hash.c
+++ b/hash.c
@@ -15,6 +15,7 @@
 #include &quot;ruby/st.h&quot;
 #include &quot;ruby/util.h&quot;
 #include &quot;ruby/encoding.h&quot;
+#include &quot;probes.h&quot;
 #include &lt;errno.h&gt;

 #ifdef __APPLE__
@@ -221,6 +222,9 @@ hash_alloc(VALUE klass)
     OBJSETUP(hash, klass, T_HASH);

     RHASH_IFNONE(hash) = Qnil;
+    if(RUBY_HASH_ALLOC_ENABLED()) {
+       RUBY_HASH_ALLOC(rb_sourcefile(), rb_sourceline());
+    }

     return (VALUE)hash;
 }
</pre>
<p>After this, I built and installed ruby.  Then I wrote a dtrace script to display the filename and line number where hash allocations were happening:</p>
<pre><code>ruby*:::hash-alloc
{
  printf("%s:%d", copyinstr(arg0), arg1);
}
</code></pre>
<p>Then I ran the benchmark with dtrace:</p>
<pre><code>$ sudo dtrace -s x.d -c 'ruby -I lib test.rb'
</code></pre>
<p>Here is an excerpt of the output:</p>
<pre><code>  1 126930            hash_alloc:hash-alloc /Users/aaron/git/rails/activerecord/lib/active_record/base.rb:1528
  1 126930            hash_alloc:hash-alloc /Users/aaron/git/rails/activerecord/lib/active_record/base.rb:1529
  1 126930            hash_alloc:hash-alloc /Users/aaron/git/rails/activerecord/lib/active_record/base.rb:1534
  1 126930            hash_alloc:hash-alloc /Users/aaron/git/rails/activerecord/lib/active_record/base.rb:1535
  1 126930            hash_alloc:hash-alloc /Users/aaron/git/rails/activerecord/lib/active_record/base.rb:1525
  1 126930            hash_alloc:hash-alloc /Users/aaron/git/rails/activerecord/lib/active_record/persistence.rb:322
  1 126930            hash_alloc:hash-alloc /Users/aaron/git/rails/activerecord/lib/active_record/base.rb:1527
  1 126930            hash_alloc:hash-alloc /Users/aaron/git/rails/activerecord/lib/active_record/base.rb:1528
  1 126930            hash_alloc:hash-alloc /Users/aaron/git/rails/activerecord/lib/active_record/base.rb:1529
  1 126930            hash_alloc:hash-alloc /Users/aaron/git/rails/activerecord/lib/active_record/base.rb:1534
  1 126930            hash_alloc:hash-alloc /Users/aaron/git/rails/activerecord/lib/active_record/base.rb:1535
  1 126930            hash_alloc:hash-alloc /Users/aaron/git/rails/activerecord/lib/active_record/base.rb:1525</code></pre>
<p>I was able to compare the output from this script on Rails 3.0 vs Rails 3.1.  From this output, I was able to determine:</p>
<ul>
<li>The number of hash allocations</li>
<li>Where the hashes were allocated</li>
<li>Which allocations were new in Rails 3.1</li>
</ul>
<p>Armed with this information, I was able to eliminate some allocations and return speed in Rails 3.1 to that of Rails 3.0.</p>
<h3>DTrace vs gdb vs memprof</h3>
<p>I probably could have accomplished this task with <a href="https://github.com/ice799/memprof">memprof</a>, but it requires that I use 1.8.7 from rvm.  Working with RVM on my machine is difficult (don&#8217;t ask, let&#8217;s just say I&#8217;m a &#8220;special needs&#8221; user), and I wanted to use 1.9.  So memprof was out the window.</p>
<p>I was able to get the same information by scripting gdb.  I set a breakpoint at the correct function, called <code>rb_sourcefile()</code> and <code>rb_sourceline()</code> and redirected to a file.  The problem with gdb is that it seemed very slow, and scripting it was a pain (though I am not a gdb expert!).</p>
<p>DTrace was fast, relatively easy to use, and very scriptable.  It made me happy!</p>
<h3>OMG!!!</h3>
<p>I would like to see dtrace probes officially added to ruby trunk.  The ruby that ships with OS X has them built in, but they don&#8217;t give us information like hash literal allocations.  It looks like <a href="http://redmine.ruby-lang.org/issues/2565">they were in ruby trunk at one point, but were then reverted</a>.  I would like to see them added again.</p>
<p>If you want to play with them on ruby trunk, <a href="https://gist.github.com/1055159">here is my full patch</a>.  Make sure to run <code>make probes.h</code> before <code>make &amp;&amp; make install</code>.</p>
<p>HAPPY WEDNESDAY!!!! &lt;3 &lt;3 &lt;3 &lt;3</p>
<p><small>(it feels good to blog on a non &#8220;professional&#8221; blog)</small></p>
]]></content:encoded>
			<wfw:commentRss>http://tenderlovemaking.com/2011/06/29/i-want-dtrace-probes-in-ruby/feed/</wfw:commentRss>
		<slash:comments>13</slash:comments>
		</item>
		<item>
		<title>TIL: It&#8217;s OK to return nil from to_ary</title>
		<link>http://tenderlovemaking.com/2011/06/28/til-its-ok-to-return-nil-from-to_ary/</link>
		<comments>http://tenderlovemaking.com/2011/06/28/til-its-ok-to-return-nil-from-to_ary/#comments</comments>
		<pubDate>Tue, 28 Jun 2011 23:08:24 +0000</pubDate>
		<dc:creator>Aaron Patterson</dc:creator>
				<category><![CDATA[computadora]]></category>
		<category><![CDATA[ruby]]></category>

		<guid isPermaLink="false">http://tenderlovemaking.com/?p=545</guid>
		<description><![CDATA[tl;dr: You can return nil from to_ary and to_a. Today I discovered that it&#8217;s OK for to_ary to return nil. Ruby spec tests this behavior, and the ruby implementation supports it. But why would you want &#8230; <a class="more-link" href="http://tenderlovemaking.com/2011/06/28/til-its-ok-to-return-nil-from-to_ary/">More<span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>tl;dr: You can return nil from <code>to_ary</code> and <code>to_a</code>.</p>
<p>Today I discovered that it&#8217;s OK for <code>to_ary</code> to return nil.  <a href="https://github.com/rubyspec/rubyspec/blob/master/core/array/flatten_spec.rb#L105">Ruby spec tests this behavior</a>, and the ruby implementation <a href="https://github.com/ruby/ruby/blob/trunk/array.c#L3687-3693">supports it</a>.  But why would you want to implement this?</p>
<h3>Array#flatten and <code>to_ary</code></h3>
<p>Consider the following code:</p>
<pre class="brush: ruby; title: ; notranslate">
class Item
  def respond_to?(name, visibility = false)
    p &quot;respond to: #{name}&quot;
    false # we don't respond to anything!!
  end

  def method_missing(name, *args)
    p &quot;method missing: #{name}&quot;
    super # if something?
    # do something else
  end
end

[[Item.new]].flatten
</pre>
<p>In Ruby 1.9, <code>flatten</code> will actually call <code>to_ary</code> on each of the items in the collection.  If the method table for this object contains <code>to_ary</code>, Ruby will call <code>to_ary</code>, and if <code>method_missing</code> is implemented, it will call <code>method_missing</code> regardless of what your <code>respond_to?</code> returns.  The reason ruby will call <code>method_missing</code> is that it <em>could</em> implement <code>to_ary</code>. (As for why it ignores the return value of <code>respond_to?</code>, I don&#8217;t know.)</p>
<p>Naturally, if we were to call <code>to_ary</code> directly on this object it would raise a <code>NoMethodError</code> exception.  When calling flatten, ruby will swallow the exception.  If we enable <code>$DEBUG</code> by running ruby with <code>-d</code>, we can see the exception:</p>
<pre><code>[aaron@higgins ~]$ ruby -d test.rb
Exception `LoadError' at /Users/aaron/.local/lib/ruby/site_ruby/1.9.1/rubygems.rb:1215 - cannot load such file -- rubygems/defaults/operating_system
Exception `LoadError' at /Users/aaron/.local/lib/ruby/site_ruby/1.9.1/rubygems.rb:1224 - cannot load such file -- rubygems/defaults/ruby
"method missing: to_ary"
Exception `NoMethodError' at test.rb:9 - undefined method `to_ary' for #&lt;Item:0x00000101079e80&gt;
"respond to: to_ary"
[aaron@higgins ~]$
</code></pre>
<p>Dispatching to <code>method_missing</code> can be expensive, and using exceptions for flow control is even more problematic.  The way we can get around this issue is by implementing <code>to_ary</code> and having it return nil:</p>
<pre class="brush: ruby; title: ; notranslate">
class Item
  def respond_to?(name, visibility = false)
    p &quot;respond to: #{name}&quot;
    false # we don't respond to anything!!
  end

  def method_missing(name, *args)
    p &quot;method missing: #{name}&quot;
    super # if something?
    # do something else
  end

  private
  def to_ary
    nil
  end
end

[[Item.new]].flatten
</pre>
<p>Run the code again with <code>-d</code>, and you&#8217;ll see no more calls to <code>respond_to?</code>, no more calls to <code>method_missing</code>, and no more exceptions raised:</p>
<pre><code>[aaron@higgins ~]$ ruby -d test.rb
Exception `LoadError' at /Users/aaron/.local/lib/ruby/site_ruby/1.9.1/rubygems.rb:1215 - cannot load such file -- rubygems/defaults/operating_system
Exception `LoadError' at /Users/aaron/.local/lib/ruby/site_ruby/1.9.1/rubygems.rb:1224 - cannot load such file -- rubygems/defaults/ruby
[aaron@higgins ~]$
</code></pre>
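<p>If you would rather check this from inside Ruby than read <code>-d</code> output, here is a sketch that records what <code>flatten</code> sends to <code>method_missing</code>.  <code>Loud</code> and <code>Quiet</code> are made-up classes, and I have left <code>respond_to?</code> alone here because, as far as I know, an overridden <code>respond_to?</code> is treated differently from one Ruby version to the next:</p>

```ruby
# Hypothetical class that remembers every message sent to method_missing.
class Loud
  attr_reader :calls

  def initialize
    @calls = []
  end

  def method_missing(name, *args)
    @calls << name
    super  # raises NoMethodError, which flatten swallows
  end
end

# Same class, but with to_ary defined so method_missing is never consulted.
class Quiet < Loud
  def to_ary
    nil
  end
end

loud  = Loud.new
quiet = Quiet.new

[[loud]].flatten
[[quiet]].flatten

p loud.calls   # flatten asked about to_ary through method_missing
p quiet.calls  # nothing reached method_missing
```

<p>In both cases <code>flatten</code> leaves the object in the result untouched; the only difference is how much work happened along the way.</p>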
<h3>Array() and <code>to_a</code></h3>
<p>The Array() function exhibits a similar behavior, but with the <code>to_a</code> method.  Try this same code, but rather than using <code>Array#flatten</code>, do this:</p>
<pre class="brush: ruby; title: ; notranslate">
Array(Item.new)
</pre>
<p>We can fix the error produced by this code with a slight change to our class:</p>
<pre class="brush: ruby; title: ; notranslate">
class Item
  def respond_to?(name, visibility = false)
    p &quot;respond to: #{name}&quot;
    false # we don't respond to anything!!
  end

  def method_missing(name, *args)
    p &quot;method missing: #{name}&quot;
    super # if something?
    # do something else
  end

  private
  def to_ary
    nil
  end
  alias :to_a :to_ary
end

[[Item.new]].flatten
Array(Item.new)
</pre>
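<p>A quick way to see what <code>Array()</code> does with those nil returns is sketched below.  The <code>missing</code> recorder is an addition for illustration, and the <code>respond_to?</code> override is left out to keep the sketch short.</p>

```ruby
# Trimmed-down Item; the @missing recorder is hypothetical.
class Item
  attr_reader :missing

  def initialize
    @missing = []
  end

  def method_missing(name, *args)
    @missing.push(name)
    super
  end

  private

  def to_ary
    nil
  end
  alias :to_a :to_ary
end

item    = Item.new
wrapped = Array(item)

p wrapped.length             # one element: the item itself
p wrapped.first.equal?(item)
p item.missing               # nothing reached method_missing
```

<p>When both conversions return nil, <code>Array()</code> falls back to wrapping the object in a one element array.</p>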
<h3>Hmmmm</h3>
<p>Are warnings annoying? Yes. Is this strange behavior? I tend to think so. But if I&#8217;m forced to deal with a class that implements <code>method_missing</code>, I&#8217;d like to reduce the number of calls to <code>method_missing</code>.</p>
<p>Anyway.  I hope you found this informative.  Have a Happy Tuesday!!!!</p>
<p>&lt;3 &lt;3 &lt;3 &lt;3</p>
<h3>Small Side Note</h3>
<p>Rubygems raises exceptions when <code>-d</code> is enabled because it attempts to require files that are not shipped with rubygems.  These files are for packagers to provide.  For example, someone packaging rubygems for Debian may need to make customizations, and those files are where that happens.</p>
]]></content:encoded>
			<wfw:commentRss>http://tenderlovemaking.com/2011/06/28/til-its-ok-to-return-nil-from-to_ary/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Rack API is awkward</title>
		<link>http://tenderlovemaking.com/2011/03/03/rack-api-is-awkward/</link>
		<comments>http://tenderlovemaking.com/2011/03/03/rack-api-is-awkward/#comments</comments>
		<pubDate>Thu, 03 Mar 2011 19:24:59 +0000</pubDate>
		<dc:creator>Aaron Patterson</dc:creator>
				<category><![CDATA[computadora]]></category>
		<category><![CDATA[rails]]></category>

		<guid isPermaLink="false">http://tenderlovemaking.com/?p=522</guid>
		<description><![CDATA[TL;DR: Rack API is poor when you consider streaming response bodies. ZOMG!!!! HAPPY THURSDAY!!!! Maybe I shouldn&#8217;t be so excited now. I want to talk about stuff I&#8217;ve been working on in Rails 3.1, and problems &#8230; <a class="more-link" href="http://tenderlovemaking.com/2011/03/03/rack-api-is-awkward/">More<span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><strong>TL;DR: Rack API is poor when you consider streaming response bodies.</strong></p>
<p>ZOMG!!!!  HAPPY THURSDAY!!!!  Maybe I shouldn&#8217;t be so excited now.  I want to talk about stuff I&#8217;ve been working on in Rails 3.1, and problems I&#8217;m encountering today.  I want to use this blllurrrggghhh blog post to talk through the problems I&#8217;ve been having, and to share the pain with others.</p>
<h2>Pie is delicious!</h2>
<p>One feature that would be useful to add to Rails is having a streaming response body.  When Rails processes a response, the entire response is buffered in memory before it can be sent to the user.  Some information like Content Length (among other things) is derived, and the response is sent.</p>
<p>Sometimes buffering a response is less than ideal.  It would be nice if we could send the head tag along with any css or script includes to the browser as quickly as possible.  Then the browser can download external resources while we&#8217;re still processing data on the server.  If this were possible, total response time may remain the same, but the time to first byte would be decreased and the page would load faster because external resources can be downloaded in parallel.</p>
<p>This feature sounds great, but there are many things to think about before it can be implemented.  We need to support infinite streams, chunked encoding, prevent header manipulation, ensure database connections, blah, blah blah.</p>
<h2>Rack interface</h2>
<p>I&#8217;m getting ahead of myself.  Before we get to our ultimate &#8220;pie in the sky&#8221; streaming solution, let&#8217;s take a look at the Rack API.  Rack defines an interface for writing web applications.  A rack handler must respond to <code>call</code> which takes one parameter, the request environment.  <code>call</code> must return a three item list of:</p>
<ul>
<li>Response code</li>
<li>Headers</li>
<li>Body</li>
</ul>
<p>The response code should be a number (like 200), and the headers are a hash (like <code>{ 'X-Omg' =&gt; 'hello!' }</code>).  The body must respond to <code>each</code> and take a block.  The body must yield a string to the block, and the string will be output to the client.  Optionally, the body may respond to <code>close</code>, and rack will call <code>close</code> when output is complete.</p>
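<p>To make the contract concrete, here is a minimal sketch that plays both sides by hand, without loading Rack at all.  The <code>GreetingBody</code> class and its <code>closed</code> flag are made up for illustration.</p>

```ruby
# Hypothetical body: anything with #each (and optionally #close) qualifies.
class GreetingBody
  attr_reader :closed

  def initialize
    @closed = false
  end

  def each
    yield "hello "
    yield "world"
  end

  def close
    @closed = true
  end
end

# The "application": responds to call, returns [status, headers, body].
app = lambda { |env| [200, { 'X-Omg' => 'hello!' }, GreetingBody.new] }

# What a server does with the three item list.
status, headers, body = app.call({})
output = []
body.each { |string| output.push(string) }
body.close if body.respond_to?(:close)

p status
p headers
p output.join
p body.closed
```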
<h2>An Example Rack application</h2>
<p>Let&#8217;s write an example application.  Our sample application will simulate an ERb page.  We&#8217;ll add some <code>sleep</code> statements to simulate work happening during the ERb rendering process:</p>
<pre class="brush: ruby; title: ; notranslate">
class FooApplication
  class ErbPage
    def to_a
      head = &quot;the head tag&quot;
      sleep(2)
      body = &quot;the body tag&quot;
      sleep(2)
      [head, body]
    end
  end

  def call(env)
    [200, {}, ErbPage.new.to_a]
  end
end
</pre>
<p>For the purposes of demonstration, we&#8217;ll be using a fake implementation of rack:</p>
<pre class="brush: ruby; title: ; notranslate">
class FakeRack
  def serve(application)
    status, headers, body = application.call({})
    p :status  =&gt; status
    p :headers =&gt; headers

    body.each do |string|
      p string
    end

    body.close if body.respond_to?(:close)
  end
end
</pre>
<p>If we feed our application through FakeRack like this:</p>
<pre class="brush: ruby; title: ; notranslate">
app  = FooApplication.new
rack = FakeRack.new

rack.serve app
</pre>
<p>We&#8217;ll see output from the rack application, and the total program run time is about 4 seconds:</p>
<pre class="brush: ruby; title: ; notranslate">
$ time ruby foo.rb
{:status=&gt;200}
{:headers=&gt;{}}
&quot;the head tag&quot;
&quot;the body tag&quot;

real    0m4.008s
user    0m0.003s
sys     0m0.003s
</pre>
<p>Great!  So far, no problem.  Why don&#8217;t we add a middleware to time how long the response takes?</p>
<h2>Rack Middleware</h2>
<p>Rack Middleware is simply another Rack application.  With Rack, we set up a linked list of middleware that eventually point to the real application.  We give the head of the linked list to Rack, Rack calls <code>call</code> on the head of the list, and it is the list&#8217;s responsibility to call <code>call</code> on its link.</p>
<p>Here, we&#8217;ll write a Rack middleware to measure how long the &#8220;ERb render&#8221; takes and add a header indicating the response time.</p>
<pre class="brush: ruby; title: ; notranslate">
class ResponseTimer
  def initialize(app)
    @app = app
  end

  def call(env)
    now                        = Time.now
    status, headers, body      = @app.call(env)
    headers['X-Response-Took'] = Time.now - now

    [status, headers, body]
  end
end
</pre>
<p>When we construct the ResponseTimer, we pass it the real application.  Then we pass the response timer instance to rack:</p>
<pre class="brush: ruby; title: ; notranslate">
app   = FooApplication.new
timer = ResponseTimer.new app
rack  = FakeRack.new

rack.serve timer
</pre>
<p>When rack calls <code>call</code> on the response timer, it records the current time, then calls <code>call</code> on the real application.  When the real application returns, the response timer then adds a header with the time delta.  The output of this program will look like this:</p>
<pre class="brush: ruby; title: ; notranslate">
$ time ruby foo.rb
{:status=&gt;200}
{:headers=&gt;{&quot;X-Response-Took&quot;=&gt;3.999937}}
&quot;the head tag&quot;
&quot;the body tag&quot;

real    0m4.010s
user    0m0.004s
sys     0m0.004s
</pre>
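<p>The linked list idea is easier to see with more than one link.  Here is a minimal sketch with two layers of a made-up <code>Tagger</code> middleware wrapped around a lambda application.  The <code>:trace</code> key in the environment is an assumption of this sketch, not something Rack defines.</p>

```ruby
# Hypothetical middleware that notes when a request passes through it.
class Tagger
  def initialize(app, tag)
    @app = app
    @tag = tag
  end

  def call(env)
    env[:trace].push("#{@tag} in")
    status, headers, body = @app.call(env) # hand off to the next link
    env[:trace].push("#{@tag} out")
    [status, headers, body]
  end
end

endpoint = lambda do |env|
  env[:trace].push("app")
  [200, {}, ["ok"]]
end

# outer -> inner -> endpoint
stack = Tagger.new(Tagger.new(endpoint, "inner"), "outer")

env = { :trace => [] }
status, headers, body = stack.call(env)
p env[:trace]
```

<p>Each link only knows about the next one, which is why a middleware gets to run code both before and after the rest of the list returns from <code>call</code>.</p>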
<h2>Speeding up our response time</h2>
<p>We&#8217;ve noticed a problem with our Rack application.  When a client connects, it takes 4 seconds before they receive any data!  It would be nice if we could feed our client the head tag ASAP so they can download external resources.</p>
<p>We know that Rack will call <code>each</code> and (depending on your webserver) immediately send data to the client.  Rather than computing values in ERb ahead of time, we&#8217;ll compute them when Rack asks for them (when <code>each</code> is called).</p>
<p>Let&#8217;s refactor the ERb page to be lazy about calculating values:</p>
<pre class="brush: ruby; title: ; notranslate">
class FooApplication
  class ErbPage
    def each
      head = &quot;the head tag&quot;
      yield head

      sleep(2)

      body = &quot;the body tag&quot;
      yield body

      sleep(2)
    end
  end

  def call(env)
    [200, {}, ErbPage.new]
  end
end
</pre>
<p>Now no values are calculated until rack calls <code>each</code> on our body.  If we run the program, we&#8217;ll see output from the application more quickly than before.</p>
<p>However, the output is somewhat strange:</p>
<pre class="brush: ruby; title: ; notranslate">
$ time ruby foo.rb
{:status=&gt;200}
{:headers=&gt;{&quot;X-Response-Took&quot;=&gt;1.1e-05}}
&quot;the head tag&quot;
&quot;the body tag&quot;

real    0m4.032s
user    0m0.027s
sys     0m0.016s
</pre>
<p>The time command reports that our response was about 4 seconds.  But our response header says that the response took nearly 0 seconds!  Why is this?</p>
<p>If we look closely at our timer middleware, we can see it is only timing <em>how long it took for <code>call</code> to return</em>.</p>
<p>We cannot guarantee that <em>any</em> processing happened during the <code>call</code> method.</p>
<p>Let me say that again:</p>
<p><strong>We cannot guarantee that <em>any</em> processing happened during the <code>call</code> method.</strong></p>
<p>We wanted our response timer to time how long the ERb took to render, but really it is just timing how long the <code>call</code> method took.</p>
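<p>A minimal sketch that measures the two phases separately makes the gap visible.  The <code>LazyBody</code> class is made up for illustration, and the sleeps are shortened so it runs quickly.</p>

```ruby
# Hypothetical lazy body: all the "work" happens inside each.
class LazyBody
  def each
    yield "the head tag"
    sleep(0.05)
    yield "the body tag"
    sleep(0.05)
  end
end

app = lambda { |env| [200, {}, LazyBody.new] }

started = Time.now
status, headers, body = app.call({})
call_took = Time.now - started

started = Time.now
body.each { |string| string } # this is where the server would write to the client
each_took = Time.now - started

puts "call took #{call_took}s, each took #{each_took}s"
```

<p>Anything that wraps only <code>call</code> sees the first number.  The work shows up in the second one.</p>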
<h2>ZOMG HOW FIX?!?</h2>
<h3>Iterating over the body</h3>
<p>One way to fix this is to iterate over the body.  If the timer iterates over the body, then we can calculate the real time:</p>
<pre class="brush: ruby; title: ; notranslate">
class ResponseTimer
  def initialize(app)
    @app = app
  end

  def call(env)
    now                        = Time.now
    status, headers, body      = @app.call(env)

    newbody = []
    body.each { |str| newbody &lt;&lt; str }
    headers['X-Response-Took'] = Time.now - now

    [status, headers, newbody]
  end
end
</pre>
<p>But this solution is no good!  Our response timer now buffers the response, and our client ends up waiting for 4 seconds before they get any data.</p>
<p>We know that Rack calls <code>close</code> on the body after it&#8217;s done processing the request.  Why don&#8217;t we try hooking on that method?</p>
<h3>Introducing a Proxy Object</h3>
<p>One way we can hook on to the close method is by wrapping the response body in a proxy object.  Then we can intercept calls made on the body and perform any work we need done:</p>
<pre class="brush: ruby; title: ; notranslate">
class ResponseTimer
  class TimerProxy
    def initialize(body)
      @now     = Time.now
      @body    = body
    end

    def close
      @body.close if @body.respond_to?(:close)

      $stderr.puts({'X-Response-Took' =&gt; (Time.now - @now)})
    end

    def each(&amp;block)
      @body.each(&amp;block)
    end
  end

  def initialize(app)
    @app = app
  end

  def call(env)
    status, headers, body = @app.call(env)

    [status, headers, TimerProxy.new(body)]
  end
end
</pre>
<p>Wow!  Suddenly our middleware is not so simple.  This proxy solution is sub-optimal for a few reasons.  We&#8217;re required to make a new object for every request, and our proxy object will add another stack frame between calls from rack to the response body.  Even worse, every middleware that needs to do work after the response is finished must define this proxy object.</p>
<p>This solution does get the job done.  If we look at the output from the program, we&#8217;ll see that the TimerProxy in fact measures ERb processing time correctly:</p>
<pre class="brush: ruby; title: ; notranslate">
$ time ruby foo.rb
{:status=&gt;200}
{:headers=&gt;{}}
&quot;the head tag&quot;
&quot;the body tag&quot;
{&quot;X-Response-Took&quot;=&gt;4.000268}

real    0m4.044s
user    0m0.029s
sys     0m0.015s
</pre>
<p>Diligent readers will note that the response time is no longer part of the response headers.  This is because when the body is flushed, the headers must be flushed too.  We no longer have the opportunity to add extra headers when <code>each</code> is called on the body.</p>
<p>Our solution isn&#8217;t <em>too bad</em>, but it actually isn&#8217;t complete.  The full awkwardness of this API along with a complete solution can actually be felt (and read) <a href="https://github.com/rack/rack/blob/master/lib/rack/lock.rb">in the Rack source itself</a>.</p>
<h3>Lady Gaga Solution</h3>
<p>Another possible solution is to decorate the body using a module.  We can define a module, then simply call <code>extend</code> on the body with the module:</p>
<pre class="brush: ruby; title: ; notranslate">
class ResponseTimer
  def initialize(app)
    @app = app
  end

  def call(env)
    status, headers, body      = @app.call(env)
    body.extend(Module.new {
      now = Time.now

      define_method(:close) do
        super() if defined?(super) # explicit parens: bare super is rejected inside define_method on newer Rubies

        $stderr.puts({'X-Response-Took' =&gt; (Time.now - now)})
      end
    })

    [status, headers, body]
  end
end
</pre>
<p>The body is extended with an anonymous module.  During module definition, the time is recorded.  We use <code>define_method</code> because it uses a lambda which will keep a reference to the previously calculated time.  In the <code>close</code> method, we call super if it&#8217;s defined, then output our time.</p>
<p>This example also works, but has a few downsides.  It is different from the previous examples because we are timing <em>only</em> the ERb rendering and not <code>call</code> plus ERb rendering.  Using this solution, we&#8217;re required to create a new module on every request, and also break method caching on every request.  Similar to the proxy object solution, we must create a new module and extend for every middleware that must do processing after the response is finished.</p>
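<p>Here is a minimal sketch of the closure trick on its own, assuming a made-up <code>PlainBody</code> class that already defines <code>close</code>.  It calls <code>super()</code> with explicit parentheses, because newer Rubies refuse a bare <code>super</code> inside a method created by <code>define_method</code>.</p>

```ruby
# Hypothetical body class that has its own close method.
class PlainBody
  attr_reader :events

  def initialize
    @events = []
  end

  def each
    yield "chunk"
  end

  def close
    @events.push(:original_close)
  end
end

body = PlainBody.new
body.extend(Module.new {
  started = Time.now # a local variable, reachable only through the closure

  define_method(:close) do
    super() if defined?(super) # run the body's own close first
    @events.push(:timed)
    @elapsed = Time.now - started
  end

  define_method(:elapsed) { @elapsed }
})

body.close
p body.events
p body.elapsed.class
```

<p>A plain <code>def</code> would not work here, because <code>def</code> starts a fresh scope and cannot see the <code>started</code> local.</p>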
<h2>ZOMG YOUR EXAMPLE IS CONTRIVED</h2>
<p>Yup.  But I merely simplified a real world problem.  As I mentioned earlier, you can see the awkwardness of this API <a href="https://github.com/rack/rack/blob/master/lib/rack/lock.rb">in rack</a>.</p>
<p>But now that we know about this problem, we can identify middleware that will break streaming responses.  For example, Rails defines a middleware <a href="https://github.com/rails/rails/blob/master/activerecord/lib/active_record/connection_adapters/abstract/connection_pool.rb#L407-421">that checks connections back in to the connection pool</a>.  If our ERb in Rails was streaming, we would lose the database connection during ERb render.  The same is true with the <a href="https://github.com/rails/rails/blob/master/activerecord/lib/active_record/query_cache.rb#L30-34">query cache in active record</a>.  Surely, these cannot be the only middleware that will break when a streaming body is used!</p>
<h2>Lifecycle hooks</h2>
<p>I think a good solution to this problem would be if Rack provided lifecycle hooks.  A place where we can say &#8220;run this when the response is done&#8221;.  We can define something like that today using middleware:</p>
<pre class="brush: ruby; title: ; notranslate">
class EndOfLife
  attr_reader :callbacks

  def initialize(app)
    @app       = app
    @callbacks = []
  end

  def call(env)
    status, headers, body = @app.call(env)
    body.extend(Module.new {
      attr_accessor :eol

      def close
        super if defined?(super)
        eol.callbacks.each { |cb| cb.call }
      end
    })
    body.eol = self

    [status, headers, body]
  end
end

app = FooApplication.new
eol = EndOfLife.new app
eol.callbacks &lt;&lt; lambda { puts &quot;it finished!&quot; }

rack  = FakeRack.new

rack.serve eol
</pre>
<p>This keeps us from defining many proxy objects or module extensions during a response.  We only define one module extension, and hook any &#8220;end of life&#8221; hooks on to this instance.  The downside is that we cannot guarantee the position of this middleware in the middleware linked list.  That means that the &#8220;end of life&#8221; middleware may not actually execute at the end of the response!</p>
<h2>A &#8220;real&#8221; solution</h2>
<p>Rack&#8217;s interface is simple, and I like that.  The simplicity is attractive, but the API seems to fall on its face when we start talking about streaming web servers.  If I remember correctly, Apache 1.0 modules suffered the same problems that Rack is presenting us today.  Maybe we should look at Apache 2.0 <a href="http://www.apachetutor.org/dev/brigades">buckets and filters</a> and design our API using patterns from a project that has already solved this problem.</p>
<h2>ZOMG I AM TIRED OF TYPING!!</h2>
<p>I&#8217;m not happy with any of the solutions I&#8217;ve presented.  All of them have downsides that I find unattractive.  We can live with the downsides, but life will suck.  If any of you dear readers have better solutions for me, I am all ears!</p>
<p>Thanks for listening, and HAVE A GREAT DAY!!!!</p>
<p>&lt;3 &lt;3 &lt;3 &lt;3 &lt;3</p>
<p><strong>Edit:</strong> I just noticed that Rack contains a &#8220;timer&#8221; middleware similar to the one I&#8217;ve implemented in this blog post.  You can view the broken middleware <a href="https://github.com/rack/rack/blob/master/lib/rack/runtime.rb">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://tenderlovemaking.com/2011/03/03/rack-api-is-awkward/feed/</wfw:commentRss>
		<slash:comments>19</slash:comments>
		</item>
		<item>
		<title>rubycommitters.org design contest!</title>
		<link>http://tenderlovemaking.com/2011/01/04/rubycommitters-org-design-contest/</link>
		<comments>http://tenderlovemaking.com/2011/01/04/rubycommitters-org-design-contest/#comments</comments>
		<pubDate>Tue, 04 Jan 2011 17:06:10 +0000</pubDate>
		<dc:creator>Aaron Patterson</dc:creator>
				<category><![CDATA[computadora]]></category>
		<category><![CDATA[life]]></category>

		<guid isPermaLink="false">http://tenderlovemaking.com/?p=483</guid>
		<description><![CDATA[omg&#8230;. OMG&#8230;. ZOMG!!!!! HAPPY TUESDAY TO EVERYONE IN THE WORLD!!!! Alright, now that the formalities are out of the way, LET&#8217;S GET DOWN TO BUSINESS! Some of you may or may not know, I joined the &#8230; <a class="more-link" href="http://tenderlovemaking.com/2011/01/04/rubycommitters-org-design-contest/">More<span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<h2>omg&#8230;. OMG&#8230;. ZOMG!!!!!  HAPPY TUESDAY TO EVERYONE IN THE WORLD!!!!</h2>
<p>Alright, now that the formalities are out of the way, LET&#8217;S GET DOWN TO BUSINESS!  Some of you may or may not know, I joined the Ruby core team in October 2009.  I am very proud to be a member of ruby-core.  However, I have noticed that the Ruby core team does not have an awesome website like the <a href="http://rubyonrails.org/core">Rails core team</a>.</p>
<p>To rectify this situation, I registered a domain: <a href="http://rubycommitters.org/">rubycommitters.org</a>.  Unfortunately, my design skills are&#8230; sub-par.  I would like to <strike>fist</strike> fix this by having a contest.  I would like <strong>you</strong> to code the design for <a href="http://rubycommitters.org/">rubycommitters.org</a>.</p>
<h2>How do I enter?</h2>
<p>Just fork the rubycommitters.org project <a href="https://github.com/tenderlove/rubycommitters.org">on github</a>, follow the instructions in <a href="https://github.com/tenderlove/rubycommitters.org/blob/master/README.rdoc">the README</a>, and make the site look good!  When you&#8217;re done, send me a pull request.</p>
<h2>How many times can I enter?</h2>
<p>As many as you want.  However, you can only win one place.</p>
<h2>When is the due date?</h2>
<p>You must send me the pull request by January 19th, 23:59:59 PST.</p>
<h2>When will winners be announced?</h2>
<p>I will announce winners by January 21st, 23:59:59 PST.  Once I&#8217;ve decided, I&#8217;ll email the winners to get their PayPal information and transfer the <strike>money</strike> Love Bucks.</p>
<h2>What are the prizes?</h2>
<p><a href="http://engineering.attinteractive.com/">My employer</a> pays me in Tenderlove Cash (from here on referred to as &#8220;Love Bucks&#8221;), so the prize will be in Love Bucks.  Fortunately for all of us, Love Bucks exchange with the US Dollar at a 1:1 ratio.  So I will send you your prize via PayPal in the form of US Dollars.</p>
<p><strong>First Place: <strike>200</strike> 300 Love Bucks (in the form of US Dollars)</strong><br />
<strong>Second Place: 100 Love Bucks (in the form of US Dollars)</strong></p>
<p>All entrants will win a hug from me that is redeemable next time we meet each other!</p>
<h2>How will entries be judged?</h2>
<p>By me, however I want.  I&#8217;ll probably ask the intertubes for help, but I have the final say.</p>
<h2>Why is the prize so low?</h2>
<p>These Love Bucks are coming out of my own wallet!  Give me a break!</p>
<h2>Conclusion</h2>
<p>Since I announced I needed help last weekend, <a href="https://github.com/tenderlove/rubycommitters.org/network/members">already 9 people have started working</a>, so <strong>you&#8217;d better get cracking</strong>!</p>
<p>I am proud to be a member of the Ruby core team, I&#8217;m proud to be involved in the Rails development team, and most of all I&#8217;m proud to be a member of the best development community in the world.  Thanks to everyone that makes being a member of the Ruby community so awesome!</p>
<h2>EDIT!!!!</h2>
<p><a href="http://twitter.com/mariozig">@mariozig</a> has graciously offered to up the ante.  He has donated 100 Love Bucks to the first place winner!  That&#8217;s a total of 300 Love Bucks for the person who wins first place!  Awesome!!!</p>
]]></content:encoded>
			<wfw:commentRss>http://tenderlovemaking.com/2011/01/04/rubycommitters-org-design-contest/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
		<item>
		<title>Event based JSON and YAML parsing</title>
		<link>http://tenderlovemaking.com/2010/04/17/event-based-json-and-yaml-parsing/</link>
		<comments>http://tenderlovemaking.com/2010/04/17/event-based-json-and-yaml-parsing/#comments</comments>
		<pubDate>Sat, 17 Apr 2010 23:41:58 +0000</pubDate>
		<dc:creator>Aaron Patterson</dc:creator>
				<category><![CDATA[computadora]]></category>

		<guid isPermaLink="false">http://tenderlovemaking.com/?p=432</guid>
		<description><![CDATA[Let&#8217;s use Ruby 1.9.2 and Psych to build an event based twitter stream parser. Psych is a YAML parser that I wrote and is in the standard library in 1.9.2. Eventually, it will replace the current &#8230; <a class="more-link" href="http://tenderlovemaking.com/2010/04/17/event-based-json-and-yaml-parsing/">More<span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Let&#8217;s use Ruby 1.9.2 and <a href="http://github.com/tenderlove/psych">Psych</a> to build an event based twitter stream parser.  Psych is a YAML parser that I wrote and is in the standard library in 1.9.2.  Eventually, it will replace the current YAML parser, but we can still use it today!</p>
<p><strong>But you said YAML <em>and JSON</em>!  wtf?</strong></p>
<p>I know!  In the YAML 1.2 spec, <a href="http://www.yaml.org/spec/1.2/spec.html">JSON is a subset of YAML</a>.  Psych supports YAML 1.1 right now, so <em>much</em> (but not all) JSON is supported.  Once libyaml is upgraded to YAML 1.2, it will have full JSON support!</p>
<h3>Why do we want to do an event based parser?</h3>
<p><a href="http://apiwiki.twitter.com/Streaming-API-Documentation">Twitter streams</a> are a never ending flow of user status updates, and if we want a process to live forever consuming these updates, it would be nice if that process kept a low memory profile.  Psych is built in such a way that we can hand it an IO object, it will read from the IO object, then call callback methods as soon as possible.  It buffers as little as possible, sending events as soon as possible.  If you are familiar with SAX based XML parsing, this will be familiar to you.  Plus it is a fun problem!</p>
<p>Let&#8217;s start by writing an event listener for some sample JSON.</p>
<h3>Event Listener</h3>
<p>Our event listener is only going to listen for scalar events, meaning that when Psych parses a string, it will send that string to our listener.  There are many different events that can happen, so Psych ships with <a href="http://github.com/tenderlove/psych/blob/master/lib/psych/handler.rb">a handler from which you can inherit</a>.  If you check out the source for the <a href="http://github.com/tenderlove/psych/blob/master/lib/psych/handler.rb">base class handler</a>, you can see what types of events your handler can intercept.</p>
<p>For now, let&#8217;s write our scalar handler, and try it out.</p>
<pre class="brush: ruby; title: ; notranslate">
require 'psych'

class Listener &lt; Psych::Handler
  def scalar(value, anchor, tag, plain, quoted, style)
    puts value
  end
end

listener = Listener.new
parser   = Psych::Parser.new listener
parser.parse DATA

__END__
{&quot;foo&quot;:&quot;bar&quot;}
</pre>
<p>If you run this code, you should see the strings &#8220;foo&#8221; and &#8220;bar&#8221; printed.</p>
<p>In this example, our handler simply prints out every scalar value encountered.  We create a new instance of the listener, pass that listener to a new instance of the parser, and tell the parser to parse DATA.  We can hand the parser an IO object or a String object.  This is important because we&#8217;d like to hand the parser our socket connection, so that the parser can deal with reading from the socket for us.</p>
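<p>To see that the parser really does take either kind of input, here is a minimal sketch that feeds the same JSON through a <code>StringIO</code> and through a plain String.  The <code>Collector</code> handler is made up for illustration, and <code>StringIO</code> stands in for the socket.</p>

```ruby
require 'psych'
require 'stringio'

# Hypothetical handler that remembers scalars instead of printing them.
class Collector < Psych::Handler
  attr_reader :scalars

  def initialize
    @scalars = []
  end

  def scalar(value, anchor, tag, plain, quoted, style)
    @scalars.push(value)
  end
end

json = '{"foo":"bar"}'

from_io = Collector.new
Psych::Parser.new(from_io).parse(StringIO.new(json)) # IO-like input

from_string = Collector.new
Psych::Parser.new(from_string).parse(json)           # String input

p from_io.scalars
p from_string.scalars
```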
<h3>Hooking up to Twitter</h3>
<p>It would be convenient for us if Twitter&#8217;s stream were one continuous JSON document.  Why?  If it were, we could feed the socket straight to our JSON parser and start consuming events immediately.  Unfortunately, Twitter&#8217;s stream is not so kind to us event based consumers.  We&#8217;ll need to trick our JSON parser into thinking the feed is one continuous document.  We&#8217;ll get tricky with our data in a minute, but first let&#8217;s deal with authentication.</p>
<p><strong>Authentication</strong></p>
<p>Twitter requires us to authenticate before we can consume a feed.  <a href="http://apiwiki.twitter.com/Streaming-API-Documentation#Authentication">Stream authentication</a> is done via Basic Auth.  Let&#8217;s write a class that can authenticate and read from the stream.  Once we do that, we&#8217;ll concentrate on parsing the stream.</p>
<pre class="brush: ruby; title: ; notranslate">
require 'socket'

class StreamClient
  def initialize user, pass
    @ba = [&quot;#{user}:#{pass}&quot;].pack('m').chomp
  end

  def listen
    socket = TCPSocket.new 'stream.twitter.com', 80
    socket.write &quot;GET /1/statuses/sample.json HTTP/1.1\r\n&quot;
    socket.write &quot;Host: stream.twitter.com\r\n&quot;
    socket.write &quot;Authorization: Basic #{@ba}\r\n&quot;
    socket.write &quot;\r\n&quot;

    # Read the headers
    while((line = socket.readline) != &quot;\r\n&quot;); puts line if $DEBUG; end

    # Consume the feed
    while line = socket.readline
      puts line
    end
  end
end

StreamClient.new(ARGV[0], ARGV[1]).listen
</pre>
<p>This class takes a username and password and calculates the basic auth signature.  When &#8220;listen&#8221; is called, it opens a connection, authorizes, reads the response headers, and starts consuming the feed.</p>
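<p>The <code>pack('m')</code> call is doing the Base64 encoding for Basic Auth.  Here is a minimal sketch, with made-up credentials, of what it produces and of one thing to watch for: <code>pack('m')</code> wraps long output across lines, which would put a newline in the middle of the header, while <code>pack('m0')</code> does not wrap.</p>

```ruby
user, pass = "aaron", "secret" # made-up credentials

token = ["#{user}:#{pass}"].pack('m').chomp
p token
p token.unpack1('m') # decodes back to "user:pass"

# With a long password the encoded text spans several lines.
long_pass = "x" * 100
wrapped   = ["#{user}:#{long_pass}"].pack('m').chomp
unwrapped = ["#{user}:#{long_pass}"].pack('m0')

p wrapped.include?("\n")   # embedded newlines remain after chomp
p unwrapped.include?("\n") # m0 never inserts line breaks
```

<p>For short credentials the two forms are identical, so the class above works fine.  The <code>m0</code> form is simply the safer habit.</p>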
<h3>Processing the Feed</h3>
<p>If we look at the output from the previous script, we&#8217;ll see that the Twitter stream looks something like this:</p>
<pre>
512
{"in_reply_to_screen_name":null,...}

419
{"in_reply_to_screen_name":"tenderlove"...}
</pre>
<p>Which isn&#8217;t valid JSON.  Instead, it&#8217;s a header (the number) indicating the length of the JSON chunk, the JSON chunk, then a trailing &#8220;\r\n&#8221;.  We would <em>like</em> the stream to look something like this:</p>
<pre>
---
{"in_reply_to_screen_name":null,...}
...
---
{"in_reply_to_screen_name":"tenderlove"...}
...
</pre>
<p>This chunk is two valid YAML documents.  If the stream looked like this, we could feed it straight to our YAML processor no problem.  How can we modify the stream to be suitable for our parser?</p>
<h3>Fun with Thread and IO.pipe</h3>
<p>If we create a pipe, we can have one thread process input from Twitter and feed that in to the pipe.  We can then give the other end of the pipe to our JSON processor and let it read from our processed feed.  Let&#8217;s modify the &#8220;listen&#8221; method in our client to munge the feed to a pipe, and hand that off to our YAML processor.  I only care about the text of people&#8217;s tweets, so let&#8217;s modify our listener too.</p>
<p>Here is our completed program:</p>
<pre class="brush: ruby; title: ; notranslate">
require 'socket'
require 'psych'

class StreamClient
  def initialize user, pass
    @ba = [&quot;#{user}:#{pass}&quot;].pack('m').chomp
  end

  def listen listener
    socket = TCPSocket.new 'stream.twitter.com', 80
    socket.write &quot;GET /1/statuses/sample.json HTTP/1.1\r\n&quot;
    socket.write &quot;Host: stream.twitter.com\r\n&quot;
    socket.write &quot;Authorization: Basic #{@ba}\r\n&quot;
    socket.write &quot;\r\n&quot;

    # Read the headers
    while((line = socket.readline) != &quot;\r\n&quot;); puts line if $DEBUG; end

    reader, writer = IO.pipe
    producer = Thread.new(socket, writer) do |s, io|
      loop do
        io.write &quot;---\n&quot;
        io.write s.read(s.readline.strip.to_i(16)) # chunk length arrives as hex
        io.write &quot;...\n&quot;
        s.read 2 # strip the blank line
      end
    end

    parser = Psych::Parser.new listener
    parser.parse reader

    producer.join
  end
end

class Listener &lt; Psych::Handler
  def initialize
    @was_text = false
  end

  def scalar value, anchor, tag, plain, quoted, style
    puts value if @was_text
    @was_text = value == 'text'
  end
end

StreamClient.new(ARGV[0], ARGV[1]).listen Listener.new
</pre>
<p>Great!  In 30 lines, we&#8217;ve been able to provide an event based API for consuming Twitter streams.  Were it not for the feed munging, we could reduce that by 9 lines!</p>
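<p>If you want to play with the munging without a Twitter account, here is a minimal sketch that runs the same pipe and thread trick against canned data.  The <code>chunk</code> helper, the <code>TextListener</code> class, and the sample payloads are made up for illustration, and a <code>StringIO</code> stands in for the socket.</p>

```ruby
require 'psych'
require 'stringio'

# Hypothetical helper: hex length line, payload, trailing "\r\n".
def chunk(payload)
  "#{payload.bytesize.to_s(16)}\r\n#{payload}\r\n"
end

source = StringIO.new(
  chunk(%({"text":"hello"}\n)) + chunk(%({"text":"world"}\n))
)

class TextListener < Psych::Handler
  attr_reader :texts

  def initialize
    @texts    = []
    @was_text = false
  end

  def scalar(value, anchor, tag, plain, quoted, style)
    @texts.push(value) if @was_text
    @was_text = (value == 'text')
  end
end

reader, writer = IO.pipe
producer = Thread.new do
  until source.eof?
    size = source.readline.strip.to_i(16)
    writer.write "---\n"            # start of a YAML document
    writer.write source.read(size)  # the JSON payload
    writer.write "...\n"            # end of the YAML document
    source.read(2)                  # strip the trailing "\r\n"
  end
  writer.close                      # lets the parser see end of input
end

listener = TextListener.new
Psych::Parser.new(listener).parse(reader)
producer.join

p listener.texts
```

<p>Unlike the real stream, the canned input ends, so the producer closes the write end of the pipe and the parser returns.</p>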
<h3>Problems</h3>
<p>So far, there have only been two problems for me with this script.  The first is that we are forced to buffer the response from Twitter, but we cannot help that.  The second is that sometimes the JSON emitted from Twitter is not parseable by Psych.  I think this is just due to Psych only supporting YAML 1.1.</p>
<h3>Conclusion</h3>
<p>It&#8217;s true that we could have implemented this same interface without a pipe and a thread.  Rather than munging the stream, we could create a new parser instance for each status update.  But why create so many objects for parsing the stream when we only need one?</p>
<p>Anyway, have fun playing with this code, and I encourage you to try out Ruby 1.9.2.  I think it&#8217;s really fun!  PEW PEW PEW!  HAPPY SATURDAY!</p>
]]></content:encoded>
			<wfw:commentRss>http://tenderlovemaking.com/2010/04/17/event-based-json-and-yaml-parsing/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Compiling with Clang</title>
		<link>http://tenderlovemaking.com/2010/01/03/compiling-with-clang/</link>
		<comments>http://tenderlovemaking.com/2010/01/03/compiling-with-clang/#comments</comments>
		<pubDate>Mon, 04 Jan 2010 00:05:00 +0000</pubDate>
		<dc:creator>Aaron Patterson</dc:creator>
				<category><![CDATA[computadora]]></category>
		<category><![CDATA[nokogiri]]></category>

		<guid isPermaLink="false">http://tenderlovemaking.com/?p=414</guid>
		<description><![CDATA[HI EVERYONE AND HAPPY SUNDAY! Lately I&#8217;ve been trying to compile my ruby extensions with Clang. One reason I like trying out my extensions with Clang is because it catches some errors that GCC doesn&#8217;t. If &#8230; <a class="more-link" href="http://tenderlovemaking.com/2010/01/03/compiling-with-clang/">More<span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>HI EVERYONE AND HAPPY SUNDAY!</p>
<p>Lately I&#8217;ve been trying to compile my ruby extensions with <a href="http://clang.llvm.org/">Clang</a>.  One reason I like trying out my extensions with Clang is because it catches some errors that GCC doesn&#8217;t.  If you know the right things to set, it&#8217;s pretty easy to get your extension to compile with Clang.  Unfortunately finding the right thing isn&#8217;t always easy, but I found the right bits to flip and I want to share!</p>
<p>Here&#8217;s how to do it.  Add this line to your extconf.rb right after you require mkmf:</p>
<pre class="brush: ruby; title: ; notranslate">
require 'mkmf'

RbConfig::MAKEFILE_CONFIG['CC'] = ENV['CC'] if ENV['CC']

# ... rest of your extconf goes here
</pre>
<p>Then when you compile your extension, just set CC to point at clang:</p>
<pre><code>$ CC=/Developer/usr/bin/clang rake compile</code></pre>
<p>You can see it in action in the <a href="http://github.com/tenderlove/nokogiri/blob/master/ext/nokogiri/extconf.rb#L7">nokogiri extconf</a>.  You can even see where <a href="http://github.com/tenderlove/nokogiri/commit/6321c97963f657e8b4ece1783a6a25ca21504fda">clang helped me shake out some bugs</a>, and I think that&#8217;s pretty cool.</p>
]]></content:encoded>
			<wfw:commentRss>http://tenderlovemaking.com/2010/01/03/compiling-with-clang/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Full Text Search on Heroku</title>
		<link>http://tenderlovemaking.com/2009/10/17/full-text-search-on-heroku/</link>
		<comments>http://tenderlovemaking.com/2009/10/17/full-text-search-on-heroku/#comments</comments>
		<pubDate>Sat, 17 Oct 2009 23:29:50 +0000</pubDate>
		<dc:creator>Aaron Patterson</dc:creator>
				<category><![CDATA[computadora]]></category>
		<category><![CDATA[texticle]]></category>

		<guid isPermaLink="false">http://tenderlovemaking.com/?p=369</guid>
		<description><![CDATA[YA!! IT&#8217;S SATURDAY NIGHT! YOU ALL KNOW WHAT THAT MEANS! Time to get krunk and do some full text searching. OW! I&#8217;d like to share with my tens of loyal readers how I&#8217;m doing Full Text &#8230; <a class="more-link" href="http://tenderlovemaking.com/2009/10/17/full-text-search-on-heroku/">More<span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>YA!!  IT&#8217;S <strong>SATURDAY NIGHT</strong>!  YOU ALL KNOW WHAT THAT MEANS!  Time to get krunk and do some <strong>full text searching</strong>.  OW!  I&#8217;d like to share with my tens of loyal readers how I&#8217;m doing Full Text Search on Heroku.</p>
<p>Heroku&#8217;s documentation <a href="http://docs.heroku.com/full-text-indexing">lists two ways</a> to get full text indexing working with your Heroku application.  They talk about using <a href="http://www.davebalmain.com/">Ferret</a> and <a href="http://lucene.apache.org/solr/">Solr</a> for full text indexes.  The Ferret option looks OK, but it requires you to rebuild your indexes every time you push.  Solr would work, but it requires an EC2 instance or some third party server.  Since my budget is precisely $0, using Solr is out of the picture.</p>
<p>But there is a third option.  A very <em>secret</em> option.  A devious but fun option.  You see, <a href="http://docs.heroku.com/database">Heroku runs PostgreSQL</a> for each rails application database.  They&#8217;re running a version new enough (Version 8.3) to have full text index support built in.  If we&#8217;re willing to throw out database agnosticism, we can take advantage of the database&#8217;s indexing capability.  For this article, I&#8217;d like to hop on the Postgres train and show you how to get full text indexes working with Postgres in your rails application.  I&#8217;ll also show you how to get those indexes on Heroku so we can use them &#8220;in the cloud&#8221; (Heroku is in the cloud, right?).</p>
<p>For the rest of this article, I&#8217;m going to assume you have PostgreSQL version 8.3 or higher installed already and can get your rails application working with Postgres.  Installing postgres is outside the scope of this article, but I found <a href="http://www.gregbenedict.com/2009/08/31/installing-postgresql-on-snow-leopard-10-6/">these instructions</a> to be very helpful.</p>
<h3>Step 1: Go get some coffee</h3>
<p>I love it when instructions tell me to go get some coffee because I always do.  I have to follow the instructions right?</p>
<h3>Step 2: Install Texticle</h3>
<p><a href="http://texticle.rubyforge.org/">Texticle</a> is a gem I wrote to help you define your text indexes on a per model basis.  To install texticle, we just do the normal gem install:</p>
<pre>
  $ sudo gem install texticle
</pre>
<p>The gem is pure ruby and isn&#8217;t very long, so I encourage you to <a href="http://github.com/tenderlove/texticle">peek through the source</a>.</p>
<p>While we&#8217;re at it, we should configure rails to load the texticle gem.  We need to add it to our environment.rb file.  Here&#8217;s what mine looks like:</p>
<pre class="brush: ruby; title: ; notranslate">
RAILS_GEM_VERSION = '2.3.4' unless defined? RAILS_GEM_VERSION

require File.join(File.dirname(__FILE__), 'boot')

Rails::Initializer.run do |config|
  config.time_zone = 'UTC'

  config.gem 'texticle'
end
</pre>
<p>Texticle also comes with some handy rake tasks (which we&#8217;ll talk about later).  In order to get those we&#8217;ll need to update the rails Rakefile:</p>
<pre class="brush: ruby; title: ; notranslate">
require(File.join(File.dirname(__FILE__), 'config', 'boot'))

require 'rake'
require 'rake/testtask'
require 'rake/rdoctask'

require 'tasks/rails'

require 'rubygems'

## Our texticle rake tasks
require 'texticle/tasks'
</pre>
<h3>Step 3: Configuring your index</h3>
<p>Let&#8217;s pretend we have an Article model.  The Article model has a &#8220;title&#8221; field and a &#8220;body&#8221; field:</p>
<pre class="brush: ruby; title: ; notranslate">
class CreateArticles &lt; ActiveRecord::Migration
  def self.up
    create_table :articles do |t|
      t.string :title
      t.text   :body

      t.timestamps
    end
  end

  def self.down
    drop_table :articles
  end
end
</pre>
<p>To index those two fields, we just create an index block in the model and list the fields we want to index:</p>
<pre class="brush: ruby; title: ; notranslate">
class Article &lt; ActiveRecord::Base
  index do
    title
    body
  end
end
</pre>
<p>Declaring this index automatically defines a &#8220;search&#8221; method on the model that we can use to search our articles:</p>
<pre class="brush: ruby; title: ; notranslate">
&gt;&gt; Article.search('coffee instruction')
=&gt; [#&lt;Article id: 4, title: &quot;coffee&quot;, body: &quot;I like getting coffee to be in instructions&quot;, created_at: &quot;2009-10-17 21:42:13&quot;, updated_at: &quot;2009-10-17 21:42:13&quot;&gt;]
&gt;&gt; Article.create(:title =&gt; 'kittens', :body =&gt; 'kitten poop smells bad, but I still like kittens.')
=&gt; #&lt;Article id: 5, title: &quot;kittens&quot;, body: &quot;kitten poop smells bad, but I still like kittens.&quot;, created_at: &quot;2009-10-17 21:42:33&quot;, updated_at: &quot;2009-10-17 21:42:33&quot;&gt;
&gt;&gt; Article.search('kittens')
=&gt; [#&lt;Article id: 5, title: &quot;kittens&quot;, body: &quot;kitten poop smells bad, but I still like kittens.&quot;, created_at: &quot;2009-10-17 21:42:33&quot;, updated_at: &quot;2009-10-17 21:42:33&quot;&gt;]
&gt;&gt;
</pre>
<p>Great!  We can search our records.  There&#8217;s just one catch: we haven&#8217;t indexed our data.  Doing these types of searches will be slow against large sets of data unless we add an index.  Writing these indexes is a PITA, so texticle comes with a handy rake task for generating a migration to create your indexes:</p>
<pre>
  $ rake texticle:migration
  $ rake db:migrate
</pre>
<p>After running this, Postgres can use the prebuilt indexes when searching your data.</p>
<p>Just remember: every time you modify columns in your index block, or add new index blocks, you should create a new migration to update the indexes.  If you don&#8217;t update the indexes, searches will still work as expected; they just might be kind of slow.</p>
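<p>For a rough idea of what a full text index looks like at the database level, here is a minimal sketch.  The helper name fulltext_index_sql and the index name are made up for illustration, and the SQL that texticle actually generates may well differ:</p>
<pre class="brush: ruby; title: ; notranslate">
# Hypothetical helper, not texticle's own code: builds a CREATE INDEX
# statement for a GIN index over the concatenated text of some columns.
def fulltext_index_sql(table, columns, dictionary = 'english')
  document = columns.map { |column| "coalesce(#{column}, '')" }.join(" || ' ' || ")
  name     = "#{table}_fts_idx"
  "CREATE INDEX #{name} ON #{table} USING gin(to_tsvector('#{dictionary}', #{document}))"
end

puts fulltext_index_sql('articles', %w[title body])
</pre>
<p>Inside a migration, the resulting string could be passed to execute.  Keep in mind that Postgres only uses an expression index like this when the query searches on the same expression.</p>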
<h3>Step 4: Integrating With Heroku</h3>
<p>This part is pretty easy.  First we update our <a href="http://blog.heroku.com/archives/2009/3/10/gem_manifests/">heroku gem manifest</a>:</p>
<pre>
  $ echo "texticle" >> .gems
  $ git add .gems
  $ git commit -m'updating gem manifest'
  $ git push origin master
</pre>
<p>Once your code is up on heroku, just tell heroku to migrate the database:</p>
<pre>
  $ heroku rake db:migrate
</pre>
<p>It&#8217;s just that easy!  Your indexes should be available on the Heroku database server and your application can use them.</p>
<h3>Advanced Texticle Usage</h3>
<p>Texticle has a few more features I&#8217;d like to briefly mention.  The first one is search ranking.  We can tell Postgres which field has a higher priority.  For example, we can tell Postgres to weigh matches in the article&#8217;s title higher than matches in the body:</p>
<pre class="brush: ruby; title: ; notranslate">
class Article &lt; ActiveRecord::Base
  index do
    title 'A'
    body  'B'
  end
end
</pre>
<p>The ranks are &#8216;A&#8217; through &#8216;D&#8217;, and multiple fields can have the same rank.</p>
<p>We can also group indexes.  The index we&#8217;ve seen so far will search all columns listed.  We can add another index so that we only search the &#8220;title&#8221; field:</p>
<pre class="brush: ruby; title: ; notranslate">
class Article &lt; ActiveRecord::Base
  index do
    title 'A'
    body  'B'
  end

  index('title') { title }
end
</pre>
<p>This gives us a &#8220;search_title&#8221; method in addition to the &#8220;search&#8221; method:</p>
<pre class="brush: ruby; title: ; notranslate">
&gt;&gt; Article.search_title('kittens')
=&gt; [#&lt;Article id: 5, title: &quot;kittens&quot;, body: &quot;kitten poop smells bad, but I still like kittens.&quot;, created_at: &quot;2009-10-17 21:42:33&quot;, updated_at: &quot;2009-10-17 21:42:33&quot;&gt;]
&gt;&gt;
</pre>
<p>The last thing I want to mention is &#8220;rank&#8221;.  When you perform a search, texticle adds an extra field to your model called &#8220;rank&#8221;.  The rank indicates how well your record matched the search criteria:</p>
<pre class="brush: ruby; title: ; notranslate">
&gt;&gt; Article.search('like').map { |x| x.rank }
=&gt; [&quot;0.4&quot;, &quot;0.4&quot;]
&gt;&gt; Article.search('coffee').map { |x| x.rank }
=&gt; [&quot;1.4&quot;]
&gt;&gt;
</pre>
<p>Search results are already returned sorted by rank in descending order, so no need to worry about sorting.</p>
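<p>Note that in the output above the ranks come back as strings, so convert them before comparing.  Here is a small sketch of filtering on rank; the Result struct stands in for real Article records, and strong_matches is a hypothetical helper:</p>
<pre class="brush: ruby; title: ; notranslate">
# Stand-in for search results; with texticle these would be Article records.
Result = Struct.new(:title, :rank)

# Keep only results whose rank meets a minimum, assuming rank is a
# numeric string such as '0.4'.
def strong_matches(results, minimum)
  results.select { |result| result.rank.to_f >= minimum }
end

results = [Result.new('coffee', '1.4'), Result.new('kittens', '0.4')]
p strong_matches(results, 1.0).map { |result| result.title }
</pre>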
<h3>Conclusion</h3>
<p>I hope you enjoy tickling text with texticle as much as I do.  So far, I&#8217;ve been pretty happy with this solution.</p>
<p>Things I like:</p>
<ul>
<li>It&#8217;s the right price for use with Heroku (namely $0)</li>
<li>Easy to configure and deploy</li>
<li>No need to rebuild indexes on pushes</li>
<li>Postgres can be configured to use different dictionaries, so you aren&#8217;t stuck with English</li>
</ul>
<p>The only drawbacks I&#8217;ve found so far are:</p>
<ul>
<li>INSERTs and UPDATEs are slower</li>
<li>It&#8217;s database specific</li>
</ul>
<p>Inserts and updates will be slower, but that comes with the territory of adding database indexes.  My data is mostly doing reads, so it doesn&#8217;t bother me.  Texticle <strong>is</strong> database specific, but other databases are starting to have full text search support.  I think texticle could be extended to support other databases, but I&#8217;m quite happy with postgres.</p>
<p>Anyway, thanks for reading.  The final step is that you should go get another cup of coffee.</p>
]]></content:encoded>
			<wfw:commentRss>http://tenderlovemaking.com/2009/10/17/full-text-search-on-heroku/feed/</wfw:commentRss>
		<slash:comments>27</slash:comments>
		</item>
		<item>
		<title>Ruby and RFID tags</title>
		<link>http://tenderlovemaking.com/2009/09/19/ruby-and-rfid-tags/</link>
		<comments>http://tenderlovemaking.com/2009/09/19/ruby-and-rfid-tags/#comments</comments>
		<pubDate>Sun, 20 Sep 2009 05:52:26 +0000</pubDate>
		<dc:creator>Aaron Patterson</dc:creator>
				<category><![CDATA[computadora]]></category>
		<category><![CDATA[nfc]]></category>

		<guid isPermaLink="false">http://tenderlovemaking.com/?p=358</guid>
		<description><![CDATA[It&#8217;s been forever since I&#8217;ve written a blog entry, so LETS DO THIS. I want to talk about reading RFID tags with Ruby. I am a nerd, so even though I can&#8217;t think of a good &#8230; <a class="more-link" href="http://tenderlovemaking.com/2009/09/19/ruby-and-rfid-tags/">More<span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>It&#8217;s been forever since I&#8217;ve written a blog entry, so <strong>LETS DO THIS</strong>.  I want to talk about reading RFID tags with Ruby.  I am a nerd, so even though I can&#8217;t think of a good application, I am <em>compelled</em> to be able to read RFID tags.  I love programming Ruby, so of course, I have to do this with Ruby.</p>
<h3>Getting an RFID Reader</h3>
<p>The first thing to do is buy an RFID reader.  After searching around, I found the <a href="http://www.touchatag.com/">touchatag reader</a>.  I bought the <a href="http://store.touchatag.com/usshop/acatalog/touchatag_starter_pack.html">touchatag starter pack</a>.  It&#8217;s only $40, USB, and comes with 10 RFID tags.  Most importantly, it works well with <a href="http://www.libnfc.org/">libnfc</a> (more about that later).</p>
<p><a href="http://www.flickr.com/photos/aaronp/3935563715/" title="IMG_0315 by fakebeard, on Flickr"><img src="http://farm4.static.flickr.com/3436/3935563715_1f799ab95b.jpg" width="500" height="333" alt="IMG_0315" /></a></p>
<p>The tags that come with the reader have an adhesive back, so you can stick them to stuff.  They also have the unique identifier printed on them so that you can make sure your program output is correct.</p>
<p><a href="http://www.flickr.com/photos/aaronp/3936348912/" title="IMG_0317 by fakebeard, on Flickr"><img src="http://farm4.static.flickr.com/3529/3936348912_36d1b2fe2a_m.jpg" width="240" height="160" alt="IMG_0317" /></a><a href="http://www.flickr.com/photos/aaronp/3936350030/" title="IMG_0318 by fakebeard, on Flickr"><img src="http://farm3.static.flickr.com/2471/3936350030_ea34963cb7_m.jpg" width="240" height="160" alt="IMG_0318" /></a></p>
<h3>Interfacing with the reader</h3>
<p>Now that we&#8217;ve got the reader, let&#8217;s do something with it!  I mentioned earlier that the touchatag reader works with <a href="http://libnfc.org/">libnfc</a>.  Libnfc is a C library that knows how to work with NFC devices (nerd talk for &#8220;RFID readers&#8221;).  I&#8217;ve written a gem called <a href="http://github.com/tenderlove/nfc">nfc</a> that wraps up the C library into something we can use in Ruby.</p>
<p>First thing we need to do is install libnfc.  I use macports with OS X.  With macports, installing libnfc is quite easy:</p>
<pre>
    $ sudo port install libnfc
</pre>
<p>Installing on linux should be just as easy, but you&#8217;ll need to consult your package manager.  Make sure to install the devel packages too!</p>
<p>After that, simply install the nfc Ruby gem:</p>
<pre>
    $ sudo gem install nfc
</pre>
<p>Now that that is out of the way, we can actually read an RFID tag.  Here is our code:</p>
<pre class="brush: ruby; title: ; notranslate">
require 'rubygems'
require 'nfc'
# Find a tag
NFC.instance.find do |tag|
  # Print out the tag we find
  p tag
end
</pre>
<p>That&#8217;s it!  Run the code, then touch a tag to the reader, and boom!  We have output.  With the tag I&#8217;m using, the output looks like this:</p>
<pre>
$ ruby -I lib test.rb
(NFC) ISO14443A Tag
 ATQA (SENS_RES): 00  44
    UID (NFCID1): 04  D7  62  91  21  25  80
   SAK (SEL_RES): 00
</pre>
<p>The important part of this output is the UID field.  That field is the unique identifier for this tag.  The identifier comes back as a list of integers, but they are printed on the tag as hex.  We can adjust the program just a little bit to see that list, or to get the same string that&#8217;s printed on the tag:</p>
<pre class="brush: ruby; title: ; notranslate">
# Find a tag
NFC.instance.find do |tag|
  # Examine the raw numbers
  p tag.uid
  # Get just the UID as a string
  puts tag.to_s
end
</pre>
<p>The output looks like this:</p>
<pre>
$ ruby -I lib test.rb
[4, 215, 98, 145, 33, 37, 128]
04D76291212580
</pre>
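<p>The relationship between the two forms is just hex formatting.  Here is a sketch that needs no reader attached; uid_to_hex, PHRASES, and phrase_for are hypothetical names, not part of the nfc gem:</p>
<pre class="brush: ruby; title: ; notranslate">
# Format a list of UID bytes the way they are printed on the tag:
# two uppercase hex digits per byte.
def uid_to_hex(uid)
  uid.map { |byte| '%02X' % byte }.join
end

# Hypothetical lookup table from a tag's UID string to a phrase.
PHRASES = { '04D76291212580' => 'hello from the first tag' }

def phrase_for(uid)
  PHRASES.fetch(uid_to_hex(uid), 'unknown tag')
end

puts uid_to_hex([4, 215, 98, 145, 33, 37, 128])
puts phrase_for([4, 215, 98, 145, 33, 37, 128])
</pre>
<p>With a reader attached, the array passed in would be tag.uid from inside the find block, and the phrase could be handed to something like the OS X &#8220;say&#8221; command.</p>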
<p>That&#8217;s pretty much it.  Unfortunately, I can&#8217;t think of anything fun to do with my tags, but maybe you can!  <a href="http://www.flickr.com/photos/aaronp/3804698617/">I hooked my tags up to the &#8220;say&#8221; command that comes with OS X and made each tag say something different</a>.</p>
<h3>Non-Blocking NFC interaction</h3>
<p>Our previous example blocked until an RFID tag was read.  If you run the program without having an RFID tag on the reader, it will just sit there until it can read a tag.  Sometimes we might want to tell whether or not there is a tag on the reader <em>right now</em>.  In other words, we <em>don&#8217;t</em> want our program to block.</p>
<p>Calling find without providing a block will return immediately:</p>
<pre class="brush: ruby; title: ; notranslate">
p NFC.instance.find.to_s
</pre>
<p>You&#8217;ll get a return value immediately.  The tag returned will either contain a blank uid, or an actual UID.  Here is the output run once with a tag sitting on the reader, and once without a tag:</p>
<pre>
$ ruby -I lib test.rb
"04D76291212580"
$ ruby -I lib test.rb
""
</pre>
<h3>Conclusion</h3>
<p>That&#8217;s pretty much it.  Interacting with the touchatag reader is quite simple and straightforward.  Currently the nfc gem supports reading ISO14443A tags (the tags that come with the reader).  The reader should be able to read other tag types, but I haven&#8217;t had a chance to get other tags to test.</p>
<p>Touchatag provides an <a href="http://www.touchatag.com/developer/docs/guide">official API</a> for their readers.  But the API seems difficult and is dependent on a network connection.</p>
<p><a href="http://www.flickr.com/photos/aaronp/3804698617/">Here is a video of me reading some tags</a>.<br />
<a href="http://gist.github.com/164896">Here is the code from the video</a>.<br />
<a href="http://www.flickr.com/photos/aaronp/tags/touchatag/">Here you can find more photos of the reader</a>.<br />
Finally, <a href="http://github.com/tenderlove/nfc">here is the source of the NFC gem</a>.</p>
<p>Have fun reading some RFID tags!</p>
]]></content:encoded>
			<wfw:commentRss>http://tenderlovemaking.com/2009/09/19/ruby-and-rfid-tags/feed/</wfw:commentRss>
		<slash:comments>20</slash:comments>
		</item>
		<item>
		<title>String Encoding in Ruby 1.9 C extensions</title>
		<link>http://tenderlovemaking.com/2009/06/26/string-encoding-in-ruby-1-9-c-extensions/</link>
		<comments>http://tenderlovemaking.com/2009/06/26/string-encoding-in-ruby-1-9-c-extensions/#comments</comments>
		<pubDate>Fri, 26 Jun 2009 15:48:54 +0000</pubDate>
		<dc:creator>Aaron Patterson</dc:creator>
				<category><![CDATA[computadora]]></category>
		<category><![CDATA[nokogiri]]></category>

		<guid isPermaLink="false">http://tenderlovemaking.com/?p=315</guid>
		<description><![CDATA[One of the challenges of developing nokogiri has been dealing with String encodings in C. I would like to present one of the problems encountered, along with a solution. I will be using RubyInline in the &#8230; <a class="more-link" href="http://tenderlovemaking.com/2009/06/26/string-encoding-in-ruby-1-9-c-extensions/">More<span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>One of the challenges of <a href="http://github.com/tenderlove/nokogiri">developing nokogiri</a> has been dealing with String encodings in C.  I would like to present one of the problems encountered, along with a solution.  I will be using <a href="http://rubyforge.org/projects/rubyinline/">RubyInline</a> in the examples below, but the C code presented should be easy to port to your own C extensions.</p>
<h2>Examining the Encoding</h2>
<p>If you&#8217;ve developed a C extension before, you&#8217;re probably familiar with <b>rb_str_new2</b> and friends.  They all basically turn a <b>char *</b> into a string <b>VALUE</b>.  But in Ruby 1.9, what is the encoding of the returned Ruby String?  Well, using RubyInline, it&#8217;s easy enough to see by calling the &#8220;encoding&#8221; method.  Here is a script that works in Ruby 1.8 and Ruby 1.9:</p>
<pre class="brush: ruby; title: ; notranslate">
require 'rubygems'
require 'inline'

class HelloWorld
  inline do |builder|
    builder.c '
      static VALUE test() {
        return rb_str_new2(&quot;Hello world&quot;);
      }
    '
  end
end

string = HelloWorld.new.test

if string.respond_to? :encoding
  puts string.encoding
else
  puts string
end
</pre>
<p>In Ruby 1.8, this outputs the string, and in 1.9 we see the encoding.  In 1.9, the encoding returned is <b>ASCII-8BIT</b>.  Now <b>ASCII-8BIT</b> may be the encoding that you want, but then again, it may not.  In Nokogiri, the strings coming from libxml2 are already encoded according to the document declaration.  So strings returned must be marked with the appropriate encoding.  How can we update the encoding?</p>
<h2>Changing the Encoding</h2>
<p>In Ruby 1.9, we get a few new functions specifically for dealing with encoding.  These functions are defined in <b>&lt;ruby/encoding.h&gt;</b>.  We&#8217;re going to be dealing with two of them: <b>rb_enc_find_index</b> and <b>rb_enc_associate_index</b>.</p>
<p>The first function, <b>rb_enc_find_index</b>, given a <b>char *</b> will look up the index of your encoding.  The function takes a string like &#8220;UTF-8&#8221; and returns a magic index number for that encoding.</p>
<p>The second function, <b>rb_enc_associate_index</b>, will associate a string held in a <b>VALUE</b> with the encoding index returned from the first function.</p>
<p>Armed with this knowledge, we can modify our original program to return a string encoded with UTF-8.  The only modifications are to include <b>&lt;ruby/encoding.h&gt;</b>, get the index for the desired encoding, then associate the <b>VALUE</b> with the returned index:</p>
<pre class="brush: ruby; title: ; notranslate">
require 'rubygems'
require 'inline'

class HelloWorld
  inline do |builder|
    builder.include &quot;&lt;ruby/encoding.h&gt;&quot;

    builder.c '
      static VALUE test() {
        VALUE string = rb_str_new2(&quot;Hello World&quot;);
        int enc = rb_enc_find_index(&quot;UTF-8&quot;);
        rb_enc_associate_index(string, enc);
        return string;
      }
    '
  end
end

string = HelloWorld.new.test

if string.respond_to? :encoding
  puts string.encoding
else
  puts string
end
</pre>
<p>Great!  When this is run under Ruby 1.9, the encoding returned is UTF-8.  Unfortunately, this example is now specific for Ruby 1.9.  Ruby 1.8 does not ship with the correct header files, and definitely does not include the functions for looking up and assigning encoding.  This code will just not work under Ruby 1.8.  Luckily, this code can be refactored to work under either version of Ruby.</p>
<h2>Refactoring for 1.8 Support</h2>
<p>Both Ruby 1.8 and 1.9 provide a <b>&lt;ruby.h&gt;</b> header file.  The Ruby 1.9 version of that file defines a constant <b>HAVE_RUBY_ENCODING_H</b> that lets us determine whether the proper header file exists.  Our final attempt tests for the encoding constant, then defines a macro to wrap <b>rb_str_new2</b>.  If the version of Ruby we compile against has encoding support, the macro can add the encoding to the string, otherwise, it just ignores the encoding:</p>
<pre class="brush: ruby; title: ; notranslate">
require 'rubygems'
require 'inline'

class HelloWorld
  inline do |builder|

    builder.prefix &lt;&lt;-eoc
#include &lt;ruby.h&gt;

#ifdef HAVE_RUBY_ENCODING_H

#include &lt;ruby/encoding.h&gt;

#define ENCODED_STR_NEW2(str, encoding) \
  ({ \
    VALUE _string = rb_str_new2((const char *)str); \
    int _enc = rb_enc_find_index(encoding); \
    rb_enc_associate_index(_string, _enc); \
    _string; \
  })

#else

#define ENCODED_STR_NEW2(str, encoding) \
  rb_str_new2((const char *)str)

#endif
    eoc

    builder.c '
      static VALUE test() {
        return ENCODED_STR_NEW2(&quot;Hello world&quot;, &quot;UTF-8&quot;);
      }
    '
  end
end

string = HelloWorld.new.test

if string.respond_to? :encoding
  puts string.encoding
else
  puts string
end
</pre>
<p>In 1.8, the macro just returns the new string.  In 1.9, the macro returns the string and additionally sets the encoding.  Now if we use this macro wherever we create new strings, we&#8217;ll be working well with 1.8 and 1.9!</p>
<h2>Final Notes</h2>
<p>This example was slightly simplified.  Since the encoding index is determined at runtime, there could be problems.  If <b>rb_enc_find_index</b> cannot find the requested encoding, it simply returns a <b>-1</b>.  The macro should handle that case.</p>
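<p>The same lookup-with-fallback idea can be sketched at the Ruby level (1.9 only), where Encoding.find raises an ArgumentError for an unknown name instead of returning <b>-1</b>.  This is only an illustration of the logic, not the C macro itself; find_encoding and tag_with_encoding are hypothetical names, and falling back to ASCII-8BIT is an assumption about what you would want:</p>
<pre class="brush: ruby; title: ; notranslate">
# Look up an encoding by name, falling back when the name is unknown.
def find_encoding(name, fallback = Encoding::ASCII_8BIT)
  Encoding.find(name)
rescue ArgumentError
  fallback
end

# Relabel a copy of the string with the encoding; the bytes are not changed.
def tag_with_encoding(string, name)
  string.dup.force_encoding(find_encoding(name))
end

p tag_with_encoding('Hello world', 'UTF-8').encoding
p tag_with_encoding('Hello world', 'NOT-A-REAL-ENCODING').encoding
</pre>
<p>In the C macro, the equivalent would be checking the index returned by <b>rb_enc_find_index</b> before calling <b>rb_enc_associate_index</b>.</p>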
<p>Also, if you&#8217;re playing along at home, remember to save the file between running it with 1.8 and 1.9.  RubyInline examines the mtime of the ruby file, and will only recompile when the rb file has been written to.  That means if you run it with 1.8, then immediately run again with 1.9, it won&#8217;t recompile it for 1.9.  I suppose I should send in a patch.  <img src='http://tenderlovemaking.com/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /> </p>
<p>One last thing&#8230;  There may be better ways to do this.  I needed to determine the encoding at runtime because XML files declare their encoding scheme.  If you parse an XML file that declares its encoding as EUC-JP, it would make sense that the strings you pull out are encoded in EUC-JP, right?  If you know that you&#8217;re <i>always</i> going to be returning UTF-8 strings from your C extensions, it could be a different story.  Either way, using macros and checking for constants should make sure your code works with 1.8 or 1.9.</p>
]]></content:encoded>
			<wfw:commentRss>http://tenderlovemaking.com/2009/06/26/string-encoding-in-ruby-1-9-c-extensions/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>

