YAGNI methods slow us down
Jun 4, 2014 @ 11:49 amTL;DR: OutputBuffer
subclasses SafeBuffer
which forces us to do runtime
checks that are probably unnecessary
I made a post about YAGNI methods hurting you where I said I would provide two examples, but then I got tired of writing the article so I just did one example. Here is the other example! The previous example demonstrated a memory leak that was introduced because the “footprint” (the number of methods implemented on the object) was too large.
This example will show how these YAGNI methods are impacting performance of your Rails application, and we’ll talk about how to fix it.
This problem is in ActionView::OutputBuffer
which inherits from
ActiveSupport::SafeBuffer
, which inherits from Ruby’s String
. Let’s talk
about the behavior ActiveSupport::SafeBuffer
first, then we’ll see how it
impacts to the performance of ActionView::OutputBuffer
.
ActiveSupport::SafeBuffer
This class is the class that’s used to mark a string as being “html safe”. It’s how Rails detects whether a string has been html escaped or not. In Rails, a normal Ruby string is considered to be not “html safe”. For example:
>> x = "foo"
=> "foo"
>> x.class
=> String
>> x.html_safe?
=> false
If we call html_safe
on the string, it returns an instance of
ActiveSupport::SafeBuffer
that is “tagged” as html safe:
>> x = "foo"
=> "foo"
>> y = x.html_safe
=> "foo"
>> y.class
=> ActiveSupport::SafeBuffer
>> y.html_safe?
=> true
Whenever we html escape a string in Rails, it returns a safe buffer:
>> x = "<html>"
=> "<html>"
>> x.html_safe?
=> false
>> y = ERB::Util.html_escape x
=> "<html>"
>> y.class
=> ActiveSupport::SafeBuffer
>> y.html_safe?
=> true
Now, using the html_safe?
predicate, we can easily tell the difference between
strings that have been tagged as “html safe” and strings that haven’t been tagged as “html safe” (side note: just like encodings, tagging something does not mean that it actually is correct. We can tag things as “html safe” without them actually being html safe).
We can also concatenate unsafe strings to a safe string, and it will be automatically escaped:
>> x = ERB::Util.html_escape "<html>"
=> "<html>"
>> x.concat "<blink>"
=> "<html><blink>"
>> x.html_safe?
=> true
Finally, we can concatenate safe strings and they will not be double escaped:
>> x = ERB::Util.html_escape "<html>"
=> "<html>"
>> x.concat ERB::Util.html_escape "<blink>"
=> "<html><blink>"
>> x.html_safe?
=> true
Infecting a SafeBuffer
So far, the html_safe?
predicate is a 1:1 relationship with the class.
Meaning that if the class of the string is ActiveSupport::SafeBuffer
, then
html_safe?
would return true
, and if the class was String
, html_safe?
would return false
.
Unfortunately it is not a 1:1 relationship. We can mutate a SafeBuffer, making it unsafe again. For example:
>> x = ERB::Util.html_escape "<html>"
=> "<html>"
>> dangerous = "><script>"
=> "><script>"
>> x.gsub!(/>/, dangerous)
=> "<html><script>"
>> x
=> "<html><script>"
>> x.class
=> ActiveSupport::SafeBuffer
>> x.html_safe?
=> false
You can see that the string in dangerous
has been embedded in to the
SafeBuffer without being escaped. The class of x
is still SafeBuffer
, but
calling html_safe?
will return false
. As you can see, the return value of
html_safe?
is not a 1:1 relationship with the class.
Concatenating to an infected SafeBuffer
Our infected SafeBuffer
still supports concatenation:
>> x
=> "<html><script>"
>> x.class
=> ActiveSupport::SafeBuffer
>> x.html_safe?
=> false
>> x.concat "<blink>"
=> "<html><script><blink>"
But you can see that the “<blink>” string was not escaped. This means that the
concat
behavior on a SafeBuffer
changes depending on the value of html_safe?
. If you look at the implementation of concat
,
along with it’s helper method,
you can indeed see that this is true.
What is OutputBuffer?
OutputBuffer
is a buffer that is fed to Rack and then output to the client
over the wire. Templates instantiate an OutputBuffer
and concatenate computed
output to the buffer.
Impact of SafeBuffer on OutputBuffer
How does SafeBuffer
impact OutputBuffer
? Let’s take a look at a compiled ERB
template to find out. Here is an ERB template after it’s been compiled to
Ruby:
@output_buffer = output_buffer || ActionView::OutputBuffer.new
@output_buffer.safe_append=' <tr>
<td>'.freeze
@output_buffer.append=( book.name )
The output isn’t pretty, but you can see that we essentially call two methods on
the output buffer: append=
, and safe_append=
(why we have an =
is a
mystery for the ages). append=
is used when there are dynamic values like
<%= link_to ... %>
, and safe_append=
is used for string literals in your
template.
Let’s look at the implementation of
safe_append=
:
def safe_concat(value)
return self if value.nil?
super(value.to_s)
end
alias :safe_append= :safe_concat
First, notice that value
is never nil
. The ERB compiler ensures that fact, so the
value.nil?
check is useless. However, this blurrrggghhh post is about how
YAGNI methods are hurting us. We do need to concatenate, so you are gonna
need this.
OutputBuffer
is a subclass of SafeBuffer
, and safe_concat
supers in to
SafeBuffer
. Let’s look at the implementation of safe_concat
on
SafeBuffer
:
def safe_concat(value)
raise SafeConcatError unless html_safe?
original_concat(value)
end
safe_concat
raises an exception if the current object is not html_safe?
.
Every time our ERB template concatenates a string, it checks whether or not the
current object is “html safe”.
As we saw earlier, the only way to make an instance of a SafeBuffer
not
html safe is by calling an unsafe method on that
instance.
YAGNI method pain
Every time we concatenate to an OutputBuffer
object, we’re forced to check a
flag. This flag is directly related to people calling gsub!
on the
OutputBuffer
. This bothers me because, as you can imagine, we concatenate on
to the OutputBuffer
extremely frequently.
I’m betting that it’s extremely rare for someone to call gsub!
on an
OutputBuffer
. I’m also willing to bet that people calling gsub!
on an
OutputBuffer
would be willing to call to_s
on it before doing their
mutation, so why do we “support” this?
Our OutputBuffer
is punishing most people for a feature that is extremely
rare. This is thanks to the features we got “for free” from our superclass.
Secondly, if you look in the same file that defines OutputBuffer
, you’ll see
a class called
StreamingBuffer
.
This class is meant to be a replacement for OutputBuffer
, but it streams to
the client. Notice that this class does not inherit from SafeBuffer
.
Further evidence that gsub!
on OutputBuffer
is a YAGNI method.
How do we fix this?
I think we can fix this by disconnecting OutputBuffer
from it’s superclass. I
suspect that the methods people actually use on OutputBuffer
are extremely
few. I’d also say that this class should be completely private. In other
words, we (the Rails team) should design our API such that people never actually
touch instances of the OutputBuffer
.
If we disconnect OutputBuffer
from it’s superclass, I think we can even do
some tricks with the ERB compiler to attain faster output.