SafeBuffer which forces us to do runtime
checks that are probably unnecessary
I made a post about YAGNI methods hurting you where I said I would provide two examples, but then I got tired of writing the article so I just did one example. Here is the other example! The previous example demonstrated a memory leak that was introduced because the “footprint” (the number of methods implemented on the object) was too large.
This example will show how these YAGNI methods are impacting performance of your Rails application, and we’ll talk about how to fix it.
This problem is in
ActionView::OutputBuffer which inherits from
ActiveSupport::SafeBuffer, which inherits from Ruby’s
String. Let’s talk
about the behavior
ActiveSupport::SafeBuffer first, then we’ll see how it
impacts to the performance of
This class is the class that’s used to mark a string as being “html safe”. It’s how Rails detects whether a string has been html escaped or not. In Rails, a normal Ruby string is considered to be not “html safe”. For example:
>> x = "foo" => "foo" >> x.class => String >> x.html_safe? => false
If we call
html_safe on the string, it returns an instance of
ActiveSupport::SafeBuffer that is “tagged” as html safe:
>> x = "foo" => "foo" >> y = x.html_safe => "foo" >> y.class => ActiveSupport::SafeBuffer >> y.html_safe? => true
Whenever we html escape a string in Rails, it returns a safe buffer:
>> x = "<html>" => "<html>" >> x.html_safe? => false >> y = ERB::Util.html_escape x => "<html>" >> y.class => ActiveSupport::SafeBuffer >> y.html_safe? => true
Now, using the
html_safe? predicate, we can easily tell the difference between
strings that have been tagged as “html safe” and strings that haven’t been tagged as “html safe” (side note: just like encodings, tagging something does not mean that it actually is correct. We can tag things as “html safe” without them actually being html safe).
We can also concatenate unsafe strings to a safe string, and it will be automatically escaped:
>> x = ERB::Util.html_escape "<html>" => "<html>" >> x.concat "<blink>" => "<html><blink>" >> x.html_safe? => true
Finally, we can concatenate safe strings and they will not be double escaped:
>> x = ERB::Util.html_escape "<html>" => "<html>" >> x.concat ERB::Util.html_escape "<blink>" => "<html><blink>" >> x.html_safe? => true
Infecting a SafeBuffer
So far, the
html_safe? predicate is a 1:1 relationship with the class.
Meaning that if the class of the string is
html_safe? would return
true, and if the class was
Unfortunately it is not a 1:1 relationship. We can mutate a SafeBuffer, making it unsafe again. For example:
>> x = ERB::Util.html_escape "<html>" => "<html>" >> dangerous = "><script>" => "><script>" >> x.gsub!(/>/, dangerous) => "<html><script>" >> x => "<html><script>" >> x.class => ActiveSupport::SafeBuffer >> x.html_safe? => false
You can see that the string in
dangerous has been embedded in to the
SafeBuffer without being escaped. The class of
x is still
html_safe? will return
false. As you can see, the return value of
html_safe? is not a 1:1 relationship with the class.
Concatenating to an infected SafeBuffer
SafeBuffer still supports concatenation:
>> x => "<html><script>" >> x.class => ActiveSupport::SafeBuffer >> x.html_safe? => false >> x.concat "<blink>" => "<html><script><blink>"
But you can see that the “<blink>” string was not escaped. This means that the
concat behavior on a
SafeBuffer changes depending on the value of
html_safe?. If you look at the implementation of
along with it’s helper method,
you can indeed see that this is true.
What is OutputBuffer?
OutputBuffer is a buffer that is fed to Rack and then output to the client
over the wire. Templates instantiate an
OutputBuffer and concatenate computed
output to the buffer.
Impact of SafeBuffer on OutputBuffer
OutputBuffer? Let’s take a look at a compiled ERB
template to find out. Here is an ERB template after it’s been compiled to
@output_buffer = output_buffer || ActionView::OutputBuffer.new;@output_buffer.safe_append='<h1>Listing books</h1> <table> <thead> <tr> <th>Name</th> <th colspan="3"></th> </tr> </thead> <tbody> '.freeze; @books.each do |book| @output_buffer.safe_append=' <tr> <td>'.freeze;@output_buffer.append=( book.name );@output_buffer.safe_append='</td> <td>'.freeze;@output_buffer.append=( link_to 'Show', book );@output_buffer.safe_append='</td> </tr> '.freeze; end @output_buffer.safe_append=' </tbody> </table> <br> '.freeze;@output_buffer.append=( link_to 'New Book', new_book_path );@output_buffer.safe_append=' '.freeze;@output_buffer.to_s
The output isn’t pretty, but you can see that we essentially call two methods on
the output buffer:
safe_append= (why we have an
= is a
mystery for the ages).
append= is used when there are dynamic values like
<%= link_to ... %>, and
safe_append= is used for string literals in your
Let’s look at the implementation of
def safe_concat(value) return self if value.nil? super(value.to_s) end alias :safe_append= :safe_concat
First, notice that
value is never
nil. The ERB compiler ensures that fact, so the
value.nil? check is useless. However, this blurrrggghhh post is about how
YAGNI methods are hurting us. We do need to concatenate, so you are gonna
OutputBuffer is a subclass of
safe_concat supers in to
SafeBuffer. Let’s look at the implementation of
def safe_concat(value) raise SafeConcatError unless html_safe? original_concat(value) end
safe_concat raises an exception if the current object is not
Every time our ERB template concatenates a string, it checks whether or not the
current object is “html safe”.
As we saw earlier, the only way to make an instance of a
html safe is by calling an unsafe method on that
YAGNI method pain
Every time we concatenate to an
OutputBuffer object, we’re forced to check a
flag. This flag is directly related to people calling
gsub! on the
OutputBuffer. This bothers me because, as you can imagine, we concatenate on
OutputBuffer extremely frequently.
I’m betting that it’s extremely rare for someone to call
gsub! on an
OutputBuffer. I’m also willing to bet that people calling
gsub! on an
OutputBuffer would be willing to call
to_s on it before doing their
mutation, so why do we “support” this?
OutputBuffer is punishing most people for a feature that is extremely
rare. This is thanks to the features we got “for free” from our superclass.
Secondly, if you look in the same file that defines
OutputBuffer, you’ll see
a class called
This class is meant to be a replacement for
OutputBuffer, but it streams to
the client. Notice that this class does not inherit from
Further evidence that
OutputBuffer is a YAGNI method.
How do we fix this?
I think we can fix this by disconnecting
OutputBuffer from it’s superclass. I
suspect that the methods people actually use on
OutputBuffer are extremely
few. I’d also say that this class should be completely private. In other
words, we (the Rails team) should design our API such that people never actually
touch instances of the
If we disconnect
OutputBuffer from it’s superclass, I think we can even do
some tricks with the ERB compiler to attain faster output.