
Parallelizing Queries with Rails 7's `load_async`

As you're likely well aware, Rails 7 was released last month, bringing a number of new features with it. One of the features we're most excited about is load_async. This feature allows multiple Active Record queries to be executed in parallel, which can be a great tool for speeding up slow requests.

Since Rails introduced an entirely new infrastructure for load_async, Skylight's existing integration wasn't capturing all of these queries. But don't worry, because the brand new Skylight 5.3 handles them correctly!

To see how this works in practice, consider the following scenario:

class UsersController < ApplicationController
  def index
    @users = User.slow.all
    @apps = App.slow.all
    @invoices = Invoice.slow.all
  end
end
app/controllers/users_controller.rb
<h1>Users</h1>
<ul>
  <%- for user in @users -%>
    <li><%= user.name %></li>
  <%- end -%>
</ul>

<h1>Apps</h1>
<ul>
  <%- for app in @apps -%>
    <li><%= app.title %></li>
  <%- end -%>
</ul>

<h1>Invoices</h1>
<ul>
  <%- for invoice in @invoices -%>
    <li><%= invoice.amount %></li>
  <%- end -%>
</ul>
app/views/users/index.html.erb

This is a pretty straightforward example, but let's break down how Rails will process it. When /users is requested, we'll enter the UsersController#index action. This will initialize our three instance variables, but the queries will not actually be executed until the results are needed.

💡
Note that we're using a custom scope called slow. This simulates a slow query with pg_sleep, like so: scope :slow, -> { where("SELECT true FROM pg_sleep(1)") }

Once Rails begins rendering the view, it will hit the @users for loop. It will attempt to coerce @users to an array by calling to_a. This causes the query to execute synchronously, meaning that we'll have to wait for the query to complete before rendering can progress. Once the query has executed, we'll continue on, repeating the process for @apps and @invoices.
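To make the deferred execution concrete, here's a minimal plain-Ruby sketch. LazyRelation is a hypothetical stand-in we made up for illustration. It is not how Active Record is implemented, but it mimics the behavior described above: nothing runs until the records are actually needed, and the caller blocks while they load.

```ruby
# Hypothetical stand-in for an Active Record relation; not Rails code.
class LazyRelation
  include Enumerable

  attr_reader :query_count

  # The block plays the role of the database query.
  def initialize(&query)
    @query = query
    @records = nil
    @query_count = 0
  end

  def loaded?
    !@records.nil?
  end

  # Runs the query the first time the records are needed, then reuses them.
  def to_a
    return @records if loaded?

    @query_count += 1
    @records = @query.call
  end

  # Iterating (for example with a `for` loop) forces the load.
  def each(&block)
    to_a.each(&block)
  end
end

# Building the object is cheap; the simulated slow query has not run yet.
users = LazyRelation.new { sleep 0.1; ["Ada", "Grace"] }
puts users.loaded?

# The loop blocks here until the "query" finishes.
for name in users
  puts name
end
puts users.loaded?
```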

Here's what it looks like in Skylight:

Event sequence without load_async

Unsurprisingly, the bulk of our time is spent in these very slow queries. Each one takes a full second to execute, and the entire request takes over 3 seconds in total.

We could try to change things by calling to_a in the controller action instead, like so:

@users = User.slow.all.to_a
@apps = App.slow.all.to_a
@invoices = Invoice.slow.all.to_a
Event sequence with to_a

However, this only moves the work slightly earlier in the process. We still won't see any performance benefits since each query still has to execute sequentially. In general, calling to_a like this isn't recommended.

Enter load_async

As you probably guessed, load_async is going to help solve this problem. Using load_async we can rewrite this as:

@users = User.slow.all.load_async
@apps = App.slow.all.load_async
@invoices = Invoice.slow.all.load_async

When load_async is called, the query starts executing immediately on a global thread pool. So in our example, all three queries will execute in parallel. When we hit the view, we'll still have to wait for @users to be loaded, but while we're waiting, @apps and @invoices are also loading. Once we hit those in the rendering process they'll either already be loaded or at least well on their way there!
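The timing effect is easy to reproduce outside of Rails. The following is a plain-Ruby sketch using threads and sleep. It is not how Rails implements load_async, and slow_query, run_sequentially, and run_in_background are hypothetical helpers invented for this illustration.

```ruby
# Plain-Ruby illustration of the timing effect; not Rails's implementation.
QUERIES = %w[users apps invoices].freeze

# Hypothetical stand-in for a query that takes `seconds` to return.
def slow_query(name, seconds)
  sleep(seconds)
  "#{name} rows"
end

# Measures how long the given block takes, in seconds.
def elapsed
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

# Each query waits for the previous one to finish.
def run_sequentially(seconds)
  QUERIES.map { |name| slow_query(name, seconds) }
end

# Every query starts right away on its own thread; we only block
# when we ask for a result, similar in spirit to load_async.
def run_in_background(seconds)
  threads = QUERIES.map { |name| Thread.new { slow_query(name, seconds) } }
  threads.map(&:value)
end

sequential = elapsed { run_sequentially(0.2) }
concurrent = elapsed { run_in_background(0.2) }
puts format("sequential: %.2fs, concurrent: %.2fs", sequential, concurrent)
```

The sequential run takes roughly the sum of the three waits, while the concurrent run takes roughly as long as a single wait.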

Event Sequence with load_async

We can see that this is indeed what happened. We're still blocked on the users query since we didn't actually make it any faster, but we can see that we no longer have to wait for the apps or invoices queries. Our total request now only takes a bit over 1 second, which is a significant improvement.

💡
Before this will work, you do need to configure the thread pool executor. This can be done in your Rails application config by setting config.active_record.async_query_executor to :global_thread_pool. There are a handful of other options, so it's worth taking a look at the Rails documentation.
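As a rough sketch, the setting could live in config/application.rb like this. MyApp is a placeholder module name, so substitute your own application's.

```ruby
# config/application.rb
module MyApp # placeholder application name
  class Application < Rails::Application
    # Run load_async queries on a single pool shared by the whole app.
    config.active_record.async_query_executor = :global_thread_pool
  end
end
```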

A Word of Warning

If we still want to make this request faster (and we should, since a 1 second response time is still pretty bad!), then we could work on speeding up the users query.

Event Sequence with optimized users query

Unfortunately, this isn't really any better, and our whole request still takes over 1 second to complete. So what happened?

As before, the queries all executed in parallel. However, the apps query is still slow, so even though our users query finished much faster, we still end up blocked waiting for the apps query to finish.

As with almost all performance optimizations, load_async isn't a panacea. We're still only going to be as fast as our slowest query. However, as we saw in our initial work, there can still be big benefits to running these queries in parallel rather than sequentially.
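A back-of-the-envelope model shows why. In this idealized sketch, which ignores rendering time and assumes every query gets its own thread and connection, sequential time is the sum of the query durations and parallel time is the longest single duration. The 0.1 second figure for the optimized users query is an assumed number for illustration, not a measurement.

```ruby
# Idealized model: durations are in seconds and overhead is ignored.
def sequential_time(durations)
  durations.values.sum
end

def parallel_time(durations)
  durations.values.max
end

before = { users: 1.0, apps: 1.0, invoices: 1.0 }
after  = { users: 0.1, apps: 1.0, invoices: 1.0 } # 0.1 is an assumed figure

puts sequential_time(before) # 3.0
puts parallel_time(before)   # 1.0
puts parallel_time(after)    # 1.0, still bounded by the slowest query
```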

One Final Detail

One important configuration option I didn't mention was the concurrency option. By default the global executor that we configured will only execute a maximum of 4 queries simultaneously. When the pool is full, the queries become synchronous, behaving as they would if load_async was not used. Having a low default is good to ensure that your database isn't overloaded, but if your database can handle the additional connections and load it may be worth increasing this value with config.active_record.global_executor_concurrency. (Check out the documentation for the correct options if you're using an alternate pool.)
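For example, raising the limit might look like the following. The value 8 is an arbitrary illustration, not a recommendation, and MyApp is a placeholder module name. The sketch assumes the :global_thread_pool executor.

```ruby
# config/application.rb
module MyApp # placeholder application name
  class Application < Rails::Application
    config.active_record.async_query_executor = :global_thread_pool
    # Allow more queries to run at once than the default of 4.
    # Make sure your database can handle the extra connections and load.
    config.active_record.global_executor_concurrency = 8
  end
end
```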

Conclusion

While load_async won't solve all your performance problems, it's definitely something you should consider in any case where you have multiple sequential queries. As we saw, running queries simultaneously can bring significant benefits over running them sequentially. Enjoy your newfound performance-optimizing potential!
