Removing config.threadsafe!
Jun 18, 2012 @ 11:04 am

TL;DR: config.threadsafe! can be removed, but for Rails 4.0 we should just enable it by default.
A while back a ticket was filed on the Rails tracker to turn on config.threadsafe! mode by default in production. Unfortunately, this change was met with some resistance. Rather than make resistance to change a negative thing, I would like to make it a positive thing by talking about exactly what config.threadsafe! does. My goal is to prove that enabling config.threadsafe! in multi-threaded environments and multi-process environments is beneficial, and that the option can therefore be removed.
Before we discuss the impact of config.threadsafe! on multi-process vs multi-threaded environments, let’s understand exactly what the option does.
config.threadsafe!: what does it do?
Let’s take a look at the threadsafe! method:

def threadsafe!
  @preload_frameworks = true
  @cache_classes = true
  @dependency_loading = false
  @allow_concurrency = true
  self
end
Calling this method sets four options in our app configuration. Let’s walk through each option and talk about what it does.
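For reference, those four assignments correspond to settings you could flip by hand instead of calling the shortcut; a sketch of the equivalent lines in a Rails 3.x-era config/environments/production.rb:

```ruby
# Equivalent of config.threadsafe!, written out as individual settings.
config.cache_classes      = true   # never reload class definitions
config.preload_frameworks = true   # eagerly load the framework on boot
config.dependency_loading = false  # missing constants raise instead of loading
config.allow_concurrency  = true   # drop Rack::Lock from the middleware stack
```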
Preloading Frameworks
The first option, @preload_frameworks, does pretty much what it says: it forces the Rails framework to be eagerly loaded on boot. When this option is not enabled, framework classes are loaded lazily via autoload. In multi-threaded environments, the framework needs to be eagerly loaded before any threads are created because of thread safety issues with autoload. We know that loading the framework isn’t threadsafe, so the strategy is to load it all up before any threads are ready to handle requests.
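To see the lazy half of this in action, here is a small plain-Ruby sketch (not Rails code; the Widget class and its temp file are made up for illustration) showing that autoload registers a constant without reading its file until the first reference:

```ruby
require "tmpdir"

dir  = Dir.mktmpdir
path = File.join(dir, "widget.rb")
File.write(path, "class Widget; end\n")

# Lazy: register the constant; the file has not been read yet.
Object.autoload :Widget, path
pending = !Object.autoload?(:Widget).nil?

# The first reference triggers the load. This load step is the part that
# is not safe to hit from multiple threads at once.
Widget
loaded = Object.autoload?(:Widget).nil?

puts "pending before reference: #{pending}, loaded after: #{loaded}"
```

Eager loading is just doing that `require` work up front, before any threads exist to race on it.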
Caching classes
The @cache_classes option controls whether or not classes get reloaded. Remember when you’re doing “TDD” in your application? You modify a controller, then reload the page to “test” it and see that things changed? Ya, that’s what this option controls. When this option is false, as in development, your classes will be reloaded when they are modified. Without this option, we wouldn’t be able to do our “F5DD” (yes, that’s F5 Driven Development).
In production, we know that classes aren’t going to be modified on the fly, so the work of figuring out whether or not to reload classes just wastes resources; it makes sense to never reload class definitions.
Dependency loading
This option, @dependency_loading, controls code loading when missing constants are encountered. For example, a controller references the User model, but the User constant isn’t defined. In that case, if @dependency_loading is true, Rails will find the file that contains the User constant and load that file. We already talked about how code loading is not thread safe, so the idea here is that we should load the framework, then load all user code, then disable dependency loading. Once dependency loading is disabled, framework code and app code should be loaded, and any missing constants will simply raise an exception rather than attempt to load code.
We justify disabling this option in production because (as was mentioned earlier) code loading is not threadsafe, and we expect to have all code loaded before any threads can handle requests.
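A toy version of this mechanism can be written with const_missing, using a hypothetical DEFINITIONS registry standing in for Rails’ constant-name-to-file convention (this is a sketch of the idea, not Rails’ actual implementation):

```ruby
$dependency_loading = true

# Hypothetical registry: in Rails, the constant name maps to a file on disk
# (User => app/models/user.rb). Here a proc stands in for loading the file.
DEFINITIONS = { :User => proc { Object.const_set(:User, Class.new) } }

def Object.const_missing(name)
  if $dependency_loading && (loader = DEFINITIONS[name])
    loader.call                # "load the file" that defines the constant
  else
    super                      # plain NameError, as in threadsafe! mode
  end
end

User                           # missing constant gets loaded on demand
puts "User defined: #{!!defined?(User)}"

$dependency_loading = false
begin
  Missing                      # no loading now: straight to NameError
rescue NameError
  puts "Missing raised NameError, no load attempted"
end
```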
Allowing concurrency
@allow_concurrency is my favorite option. This option controls whether or not the Rack::Lock middleware is used in your stack. Rack::Lock wraps a mutex around your request: the idea is that if you have code that is not threadsafe, this mutex will prevent multiple threads from executing your controller code at the same time. When threadsafe! is set, this middleware is removed, and controller code can be executed in parallel.
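To make that concrete, here is a minimal sketch of the idea behind Rack::Lock (the real middleware is a bit more involved; for one thing it holds the lock until the response body is closed, which this toy version ignores):

```ruby
require "thread"

# A toy Rack middleware that serializes requests with a mutex, in the
# spirit of Rack::Lock.
class NaiveLock
  def initialize(app)
    @app   = app
    @mutex = Mutex.new
  end

  # Only one request is processed at a time, no matter how many server
  # threads call into the middleware concurrently.
  def call(env)
    @mutex.synchronize { @app.call(env) }
  end
end

# Wrap a trivial Rack app and make a request through the lock.
app = NaiveLock.new(->(env) { [200, { "Content-Type" => "text/plain" }, ["ok"]] })
status, _headers, body = app.call({})
puts status
```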
Multi Process vs Multi Thread
Whether a multi-process setup or a multi-threaded setup is best for your application is beyond the scope of this article. Instead, let’s look at how the threadsafe! option impacts each configuration (multi-process vs multi-threaded) and compare and contrast the two.
Code loading and caching
I’m going to lump the first three options (@preload_frameworks, @cache_classes, and @dependency_loading) together because they control roughly the same thing: code loading. We know autoload is not threadsafe, so it makes sense that in a threaded environment we should do these things in advance to avoid deadlocks.
@cache_classes is enabled by default in production regardless of your concurrency model. In production, Rails automatically preloads your application code, so if we were to disable @dependency_loading in either a multi-process or a multi-threaded model, it would have no impact.
Among these settings, the one that differs most depending on concurrency model is @preload_frameworks. In a multi-process environment, if @preload_frameworks is enabled, it’s possible that total memory consumption could go up, but this depends on how much of the framework your application uses. For example, if your Rails application makes no use of Active Record, enabling @preload_frameworks will load Active Record into memory even though it isn’t used.
So the worst case scenario in a multi-process environment is that a process might take up slightly more memory. This is the situation today, but I think that with smarter application loading techniques we could actually remove the @preload_frameworks option and maintain minimal memory usage.
Rack::Lock and the multi-threaded Bogeyman
Rack::Lock is a middleware that is inserted into the Rails middleware stack in order to protect our applications from the multi-threaded Bogeyman. This middleware is supposed to protect us from nasty race conditions and deadlocks by wrapping our requests with a mutex. The middleware locks a mutex at the beginning of the request, and unlocks the mutex when the request finishes.
To study the impact of this middleware, let’s write a controller that is not threadsafe and see what happens with different combinations of webservers and different settings of config.threadsafe!.

Here is the code we’ll use for comparing concurrency models and usage of Rack::Lock:
class UsersController < ApplicationController
  @counter = 0

  class << self
    attr_accessor :counter
  end

  trap(:INFO) {
    $stderr.puts "Count: #{UsersController.counter}"
  }

  def index
    counter = self.class.counter # read
    sleep(0.1)
    counter += 1                 # update
    sleep(0.1)
    self.class.counter = counter # write

    @users = User.all

    respond_to do |format|
      format.html # index.html.erb
      format.json { render json: @users }
    end
  end
end
This controller has a classic read-update-write race condition. Typically you would see this code in the form of variable += 1, but in this case it’s expanded to each step, along with a sleep, in order to exacerbate the concurrency problems. Our code increments a counter every time the action is run, and we’ve set a trap so that we can ask the controller what the count is.
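The same read-update-write race can be demonstrated in a few lines of plain Ruby; the sleep between the read and the write makes every thread read the stale value before anyone writes:

```ruby
counter = 0

threads = 5.times.map do
  Thread.new do
    value = counter     # read
    sleep 0.1           # every thread reads before anyone writes
    counter = value + 1 # update + write: stale reads clobber each other
  end
end
threads.each(&:join)

puts counter            # far fewer than the 5 increments we ran
```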
We’ll run the following code to test our controller:
require 'net/http'

uri = URI('http://localhost:9292/users')

100.times {
  5.times.map {
    Thread.new { Net::HTTP.get_response(uri) }
  }.each(&:join)
}
This code generates 500 requests, doing 5 requests simultaneously 100 times.
Rack::Lock and a multi-threaded webserver
First, let’s test against a threaded webserver with threadsafe! disabled. That means we’ll have Rack::Lock in our middleware stack. For the threaded examples, we’re going to use the puma webserver. Puma is set up to handle 16 concurrent requests by default, so we’ll just start the server in one window:
[aaron@higgins omglol]$ RAILS_ENV=production puma
Puma 1.4.0 starting...
* Min threads: 0, max threads: 16
* Listening on tcp://0.0.0.0:9292
Use Ctrl-C to stop
Then run our test in the other and send a SIGINFO to the webserver:
[aaron@higgins omglol]$ time ruby multireq.rb
real 1m46.591s
user 0m0.709s
sys 0m0.369s
[aaron@higgins omglol]$ kill -INFO 59717
[aaron@higgins omglol]$
If we look at the webserver terminal, we see the count is 500, just like we expected:
127.0.0.1 - - [16/Jun/2012 16:25:58] "GET /users HTTP/1.1" 200 - 0.8815
127.0.0.1 - - [16/Jun/2012 16:25:59] "GET /users HTTP/1.1" 200 - 1.0946
Count: 500
Now let’s retry our test, but enable config.threadsafe! so that Rack::Lock is not in our middleware:
[aaron@higgins omglol]$ time ruby multireq.rb
real 0m24.452s
user 0m0.724s
sys 0m0.382s
[aaron@higgins omglol]$ kill -INFO 59753
[aaron@higgins omglol]$
This time the webserver reports a count of 200, not even close to the 500 we expected:
127.0.0.1 - - [16/Jun/2012 16:30:50] "GET /users HTTP/1.1" 200 - 0.2232
127.0.0.1 - - [16/Jun/2012 16:30:50] "GET /users HTTP/1.1" 200 - 0.4259
Count: 200
So we see that Rack::Lock is ensuring that our requests run in a thread safe environment. You may be thinking to yourself, “This is awesome! I don’t want to think about threading, let’s disable threadsafe! all the time!” However, let’s look at the cost of adding Rack::Lock. Did you notice the run times of our test program? The first run took 1 min 46 sec, while the second run took 24 sec. The reason is that Rack::Lock ensures we have only one concurrent request at a time. If we can only handle one request at a time, it defeats the purpose of having a threaded webserver in the first place. Hence the option to remove Rack::Lock.
Rack::Lock and a multi-process webserver
Now let’s look at the impact Rack::Lock has on a multi-process webserver. For this test, we’re going to use the Unicorn webserver. We’ll use the same test program to generate 5 concurrent requests 100 times.
First let’s test with threadsafe! disabled, so Rack::Lock is in the middleware stack:
[aaron@higgins omglol]$ unicorn -E production
I, [2012-06-16T16:45:48.942354 #59827] INFO -- : listening on addr=0.0.0.0:8080 fd=5
I, [2012-06-16T16:45:48.942688 #59827] INFO -- : worker=0 spawning...
I, [2012-06-16T16:45:48.943922 #59827] INFO -- : master process ready
I, [2012-06-16T16:45:48.945477 #59829] INFO -- : worker=0 spawned pid=59829
I, [2012-06-16T16:45:48.946027 #59829] INFO -- : Refreshing Gem list
I, [2012-06-16T16:45:51.983627 #59829] INFO -- : worker=0 ready
Unicorn only forks one process by default, so we’ll increase it to 5 processes and run our test program:
[aaron@higgins omglol]$ kill -SIGTTIN 59827
[aaron@higgins omglol]$ kill -SIGTTIN 59827
[aaron@higgins omglol]$ kill -SIGTTIN 59827
[aaron@higgins omglol]$ kill -SIGTTIN 59827
[aaron@higgins omglol]$ time ruby multireq.rb
real 0m23.080s
user 0m0.634s
sys 0m0.320s
[aaron@higgins omglol]$ kill -INFO 59829 59843 59854 59865 59876
[aaron@higgins omglol]$
We have to run kill on multiple pids because we have multiple processes listening for requests. If we look at the logs:
[aaron@higgins omglol]$ unicorn -E production
I, [2012-06-16T16:45:48.942354 #59827] INFO -- : listening on addr=0.0.0.0:8080 fd=5
I, [2012-06-16T16:45:48.942688 #59827] INFO -- : worker=0 spawning...
I, [2012-06-16T16:45:48.943922 #59827] INFO -- : master process ready
I, [2012-06-16T16:45:48.945477 #59829] INFO -- : worker=0 spawned pid=59829
I, [2012-06-16T16:45:48.946027 #59829] INFO -- : Refreshing Gem list
I, [2012-06-16T16:45:51.983627 #59829] INFO -- : worker=0 ready
I, [2012-06-16T16:46:54.379332 #59827] INFO -- : worker=1 spawning...
I, [2012-06-16T16:46:54.382832 #59843] INFO -- : worker=1 spawned pid=59843
I, [2012-06-16T16:46:54.384204 #59843] INFO -- : Refreshing Gem list
I, [2012-06-16T16:46:56.624781 #59827] INFO -- : worker=2 spawning...
I, [2012-06-16T16:46:56.635782 #59854] INFO -- : worker=2 spawned pid=59854
I, [2012-06-16T16:46:56.636441 #59854] INFO -- : Refreshing Gem list
I, [2012-06-16T16:46:57.703947 #59827] INFO -- : worker=3 spawning...
I, [2012-06-16T16:46:57.708788 #59865] INFO -- : worker=3 spawned pid=59865
I, [2012-06-16T16:46:57.709620 #59865] INFO -- : Refreshing Gem list
I, [2012-06-16T16:46:58.091562 #59843] INFO -- : worker=1 ready
I, [2012-06-16T16:46:58.799433 #59827] INFO -- : worker=4 spawning...
I, [2012-06-16T16:46:58.804126 #59876] INFO -- : worker=4 spawned pid=59876
I, [2012-06-16T16:46:58.804822 #59876] INFO -- : Refreshing Gem list
I, [2012-06-16T16:47:01.281589 #59854] INFO -- : worker=2 ready
I, [2012-06-16T16:47:02.292327 #59865] INFO -- : worker=3 ready
I, [2012-06-16T16:47:02.989091 #59876] INFO -- : worker=4 ready
Count: 100
Count: 100
Count: 100
Count: 100
Count: 100
We see the counts total 500. Great! No surprises; we expected a total of 500.
Now let’s run the same test but with threadsafe! enabled. We learned from our previous tests that we’ll get a race condition, so let’s see the race condition in action in a multi-process environment. We enable threadsafe mode to eliminate Rack::Lock, and fire up our webserver:
[aaron@higgins omglol]$ unicorn -E production
I, [2012-06-16T16:53:48.480272 #59920] INFO -- : listening on addr=0.0.0.0:8080 fd=5
I, [2012-06-16T16:53:48.480630 #59920] INFO -- : worker=0 spawning...
I, [2012-06-16T16:53:48.482540 #59920] INFO -- : master process ready
I, [2012-06-16T16:53:48.484182 #59921] INFO -- : worker=0 spawned pid=59921
I, [2012-06-16T16:53:48.484672 #59921] INFO -- : Refreshing Gem list
I, [2012-06-16T16:53:51.666293 #59921] INFO -- : worker=0 ready
Now increase to 5 processes and run our test:
[aaron@higgins omglol]$ kill -SIGTTIN 59920
[aaron@higgins omglol]$ kill -SIGTTIN 59920
[aaron@higgins omglol]$ kill -SIGTTIN 59920
[aaron@higgins omglol]$ kill -SIGTTIN 59920
[aaron@higgins omglol]$ time ruby multireq.rb
real 0m22.920s
user 0m0.641s
sys 0m0.327s
[aaron@higgins omglol]$ kill -INFO 59932 59921 59943 59953 59958
Finally, take a look at our webserver output:
[aaron@higgins omglol]$ unicorn -E production
I, [2012-06-16T16:53:48.480272 #59920] INFO -- : listening on addr=0.0.0.0:8080 fd=5
I, [2012-06-16T16:53:48.480630 #59920] INFO -- : worker=0 spawning...
I, [2012-06-16T16:53:48.482540 #59920] INFO -- : master process ready
I, [2012-06-16T16:53:48.484182 #59921] INFO -- : worker=0 spawned pid=59921
I, [2012-06-16T16:53:48.484672 #59921] INFO -- : Refreshing Gem list
I, [2012-06-16T16:53:51.666293 #59921] INFO -- : worker=0 ready
I, [2012-06-16T16:54:56.393218 #59920] INFO -- : worker=1 spawning...
I, [2012-06-16T16:54:56.420914 #59932] INFO -- : worker=1 spawned pid=59932
I, [2012-06-16T16:54:56.421824 #59932] INFO -- : Refreshing Gem list
I, [2012-06-16T16:54:57.962304 #59920] INFO -- : worker=2 spawning...
I, [2012-06-16T16:54:57.966149 #59943] INFO -- : worker=2 spawned pid=59943
I, [2012-06-16T16:54:57.966804 #59943] INFO -- : Refreshing Gem list
I, [2012-06-16T16:54:59.799125 #59920] INFO -- : worker=3 spawning...
I, [2012-06-16T16:54:59.803206 #59953] INFO -- : worker=3 spawned pid=59953
I, [2012-06-16T16:54:59.803816 #59953] INFO -- : Refreshing Gem list
I, [2012-06-16T16:55:00.927141 #59920] INFO -- : worker=4 spawning...
I, [2012-06-16T16:55:00.931436 #59958] INFO -- : worker=4 spawned pid=59958
I, [2012-06-16T16:55:00.932026 #59958] INFO -- : Refreshing Gem list
I, [2012-06-16T16:55:01.808953 #59932] INFO -- : worker=1 ready
I, [2012-06-16T16:55:05.292524 #59943] INFO -- : worker=2 ready
I, [2012-06-16T16:55:06.491235 #59953] INFO -- : worker=3 ready
I, [2012-06-16T16:55:06.955906 #59958] INFO -- : worker=4 ready
Count: 100
Count: 100
Count: 100
Count: 100
Count: 100
Strange. Our counts total 500 again, despite the fact that we clearly saw this code has a horrible race condition. The fact of the matter is that we don’t need Rack::Lock in a multi-process environment. We don’t need the lock because the socket is our lock. In a multi-process environment, when one process is handling a request, it cannot listen for another request at the same time (you would need threads to do that). That means that wrapping a mutex around the request is useless overhead.
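We can sketch that serialization with a tiny preforking-style example: while the worker is busy handling a connection, it is not calling accept, so it simply cannot pick up a second request (this is a toy illustration, Unix-only since it uses fork):

```ruby
require "socket"

server = TCPServer.new(0)       # bind to any free port
port   = server.addr[1]

pid = fork do
  # A worker's whole life is: accept a connection, handle it, repeat.
  # Between accept and close it is NOT listening, so requests to this
  # process are serialized without any mutex.
  client = server.accept
  client.write "handled by pid #{Process.pid}\n"
  client.close
  exit!(0)
end

sock = TCPSocket.new("127.0.0.1", port)
line = sock.gets
sock.close
Process.wait(pid)
puts line
```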
Conclusion
I think this blurgh post is getting too long, so let’s wrap it up. The first three options that config.threadsafe! controls (@preload_frameworks, @cache_classes, and @dependency_loading) are either already used in a multi-process environment or would have little to no overhead if used in one. The final option, @allow_concurrency, is completely useless in a multi-process environment.
In a multi-threaded environment, the first three options that config.threadsafe! controls are either already used by default or are absolutely necessary. Rack::Lock cripples a multi-threaded server, so @allow_concurrency should always be enabled in a multi-threaded environment. In other words, if you’re using code that is not thread safe, you should either fix that code or consider moving to the multi-process model.
Because enabling config.threadsafe! would have little to no impact in a multi-process environment, and is absolutely necessary in a multi-threaded environment, I think we should enable this flag by default in new Rails applications, with the intention of removing the flag in future versions of Rails.
The End!
<3<3<3<3<3