Rails Multi-Database Best Practices Roundup

Seamless second database integration for Rails.

2016-01-11: We just finished a new version of SecondBase, our own gem that provides support for Rails to manage dual databases by extending ActiveRecord tasks that create, migrate, and test your application. It supports Rails 4.x and up. Check it out on GitHub.

Since landing my first job programming with Ruby, most Rails applications I have worked with have managed two or more database connections. Often times these connections are both readable and writable. Collectively they represent a comprehensive business domain. Today I would like to cover everything I know about this topic, without going into the bike shed of micro services or schema management gems.

The goal is simple! To give you the tools you need to keep your application running smoothly when connecting to multiple databases.

Topics & Scenarios

Each topic or scenario below is tested within a demo Rails application that you can checkout on GitHub. See the resources section at the bottom of this post for more information. Examples assume that Rails 3.2.x to 4.2.x are the same behavior. When this is not the case, it will be explicitly pointed out.

Reasons for multiple database connections vastly differ. To illustrate a common setup, this demo application will use PostgreSQL as the base "legacy" connection. Just imagine this base is something like Oracle or SQL Server. Our second connection is going to be MySQL. This is the database we want to migrate our primary new business logic onto.

To easily follow along, MySQL models will be prefixed as such and all ActiveRecord logs have been tagged with the database connection that executed the query. This setup will give us more than enough opportunity to explore and have fun. Lets go!

Topic 1: Connection Management

First and foremost we are going to need a base class that champions a database connection. This is critical so that each subclass shares the same connection/pool. For example:

class MysqlBase < ActiveRecord::Base
  establish_connection configurations['mysql'][Rails.env]
  self.abstract_class = true
end

Do not use establish_connection for other models. Only use it for the base connection class. Setting your base connection class as an abstract_class keeps ActiveRecord from querying the database for column information. Meaning this model has no attributes. Use this connection class as the superclass for each of your MySQL models. For example:

class User < MysqlBase
end
class Order < MysqlBase
end

Now, each model will will share the same connection pool. Benefits include:

Shared transaction blocks.
Reduced database connections.
Single point for connection management.

Topic 2: The ActiveRecord Query Cache

ActiveRecord's query cache can be a beneficial performance optimization. On by default, it caches SQL results for the same query during the request/response cycle. This caching really shines with child to parent associations where n children find the same parent and only hit the database once.

Unfortunately ActiveRecord can only cache your base connection. It knows not of other databases. So let's "cash in" on some of our simple code dividends and put that base connection class to work.

class ApplicationController < ActionController::Base

  around_filter :cache_other_db_connections

  private

  def cache_other_db_connections
    MysqlBase.connection.cache { yield }
  end

end

Yup, it is that easy! If you had multiple database connections, just nest them in the outer block and yield once to the controller's around filter.

Scenario 1: Implicit Multi-DB Transactions

So what happens when a model for one database creates or updates another model?

class MysqlUser < MysqlBase
  self.table_name = :users
  after_save :account_create

  private
  def account_create
    Account.create_from_user!(self)
  end
end

Our MysqlUser model is going to save a legacy account model using the after_save callback. Remember, ActiveRecord runs creates or saves inside of a transaction.

new_mysql_user.account_create = true
new_mysql_user.save

MySQL:  BEGIN
MySQL:  INSERT INTO `users` (...) VALUES (...)
 Base:  BEGIN
 Base:  INSERT INTO "accounts" (...) VALUES (...) RETURNING "id"
 Base:  COMMIT
MySQL:  COMMIT

So the outcome is pretty clear here. We have two distinct database transactions. One is not going to affect the other. What would happen if the account raised an exception when being created? Possibly due to a validation error?

new_mysql_user.account_create = true
new_mysql_user.account_fails_validation = true
new_mysql_user.save

MySQL:  BEGIN
MySQL:  INSERT INTO `users` (...) VALUES (...)
 Base:  BEGIN
 Base:  ROLLBACK
MySQL:  ROLLBACK

Not bad. We can see that ActiveRecord handles this simple setup quite well. Both the user and the account are rolled back.

Topic 3: The ActiveRecord::Rollback Exception

With our first scenario understood, it is time for another topic. Time to talk about the ActiveRecord::Rollback exception object. Please take a moment to read the ActiveRecord documentation.

Transaction uses this exception to distinguish a deliberate rollback from other exceptional situations. Normally, raising an exception will cause the transaction method to rollback the database transaction and pass on the exception. But if you raise an ActiveRecord::Rollback exception, then the database transaction will be rolled back, without passing on the exception.

Did you see that? Using an ActiveRecord::Rollback will NOT pass the exception to a higher transaction block. To illustrate what happens, let's take our last scenario example and change the account to raise an ActiveRecord::Rollback vs a validation exception.

new_mysql_user.account_create = true
new_mysql_user.account_raise_rollback = true
new_mysql_user.save

MySQL:  BEGIN
MySQL:  INSERT INTO `users` (...) VALUES (...)
 Base:  BEGIN
 Base:  ROLLBACK
MySQL:  COMMIT

And there we go! The MySQL user just carries along on its happy way and commits the change. I will have more rollback examples below, but I might go so far as to say that you may never want to use ActiveRecord::Rollback unless you have a very clear grasp on how to manage transactions.

Scenario 2: Explicit Multi-DB Transactions

Implicit transactions are great for one or two model instances, but sooner are later you will need to orchestrate more objects within a larger commit. Explicit transactions give you this control. Here is a simple multi-database example:

MysqlUser.transaction do
  MysqlUser.create! email: 'one@one.com'
  MysqlUser.create! email: 'two@two.com'
  Account.create! email: 'one@one.com'
  Account.create! email: 'two@two.com'
end

MySQL:  BEGIN
MySQL:  INSERT INTO `users` (...) VALUES (...)
MySQL:  INSERT INTO `users` (...) VALUES (...)
 Base:  BEGIN
 Base:  INSERT INTO "accounts" (...) VALUES (...) RETURNING "id"
 Base:  COMMIT
 Base:  BEGIN
 Base:  INSERT INTO "accounts" (...) VALUES (...) RETURNING "id"
 Base:  COMMIT
MySQL:  COMMIT

The outer MySQL transaction was automatically joinable during the first two creates. This is because the model creates detected that the connection was already under a sharable transaction. The problem here is that the base connection had no joinable transaction block. Despite which connection we started the outer block with, the problem is the same. The solution is to open a transaction for each connection before doing any work.

MysqlUser.transaction do
  Account.transaction do
    MysqlUser.create! email: 'one@one.com'
    MysqlUser.create! email: 'two@two.com'
    Account.create! email: 'one@one.com'
    Account.create! email: 'two@two.com'
  end
end

MySQL:  BEGIN
 Base:  BEGIN
MySQL:  INSERT INTO `users` (...)
MySQL:  INSERT INTO `users` (...)
 Base:  INSERT INTO "accounts" (...) RETURNING "id"
 Base:  INSERT INTO "accounts" (...) RETURNING "id"
 Base:  COMMIT
MySQL:  COMMIT

Now we have a tidy atomic transaction across both databases. Any exception raised besides ActiveRecord::Rollback will automatically rollback both transactions. This pattern is so useful, I suggest making a helper that does the work for you. Again, this leverages the base connection class for the second database – I did promise it would come in handy.

ActiveRecord::Base.class_eval do
  def self.multi_transaction
    ActiveRecord::Base.transaction do
      MysqlBase.transaction { yield }
    end
  end
  def multi_transaction
    self.class.multi_transaction { yield }
  end
end

Now anytime you need to commit large changes, just wrap your code in ActiveRecord::Base.multi_transaction { ... } and sleep better knowing that your data is clean. If you want to learn more about ActiveRecord's transactions, please read the Rails API Docs. They cover advanced topics like isolation levels and savepoints. None of which are covered in this post.

Topic 4: Explicit Transactions Are Serious!

If you are doing explicit transactions, you likely mean business. If so, make your code serious by using ActiveRecord's bang methods. Doing so helps keep code explicit to the task at hand with automatic rollbacks to two or more databases. Examples include:

save!
create!
update_attributes!
update!
destroy!

If you must raise exceptions during explicit transactions to multiple databases, please stay away from using ActiveRecord::Rollback, this could complicate your code and in most cases need to be re-raised.

Topic 5: Shared Connection/Pools

Every now and then you may use a gem whose model(s) subclasses ActiveRecord::Base and you want to force that connection to another database. The Delayed::Job library is a good example. You could use establish_connection, but that would generate a distinct connection pool and keep transactions from being shared. Here is a bullet proof freedom patch to use with any model.

# In config/initializers/delayed_job.rb

Delayed::Backend::ActiveRecord::Job.class_eval do
  class << self

    def connection_pool
      MysqlBase.connection_pool
    end

    def retrieve_connection
      MysqlBase.retrieve_connection
    end

    def connected?
      MysqlBase.connected?
    end

    def remove_connection(klass = self)
      MysqlBase.remove_connection
    end

  end
end

This would change Delayed::Job from using our base connection to using our second MySQL connection. It is best to do patches like this in a Rails initializer. This guarantees gems are already reqired and in most cases have not yet connected to the database. If your model connects before these patches are applied, you could end up with an orphaned connection pool to the wrong database.

The patch works by overriding critical connection specification/handling class methods defined on ActiveRecord::Base. The first two methods connection_pool and retrieve_connection will make all your queries work. The second two will ensure all remaining connection management methods like remove_connection and clear_all_connections! do the same. But remember! If you need to do any connection management, you really only need do it on the connection class, in this case, MysqlBase.

Topic 6: Forked Processes & Clearing Connections

Most libraries that fork your Rails application process know how to reset ActiveRecord's base connections. This includes web servers like Passenger or Unicorn. However, it is up to you to do the same for any other database connections. This is easy to do by leveraging our connection class in the first topic.

ActiveRecord::Base.clear_all_connections!
MysqlBase.establish_connection MysqlBase.configurations['mysql'][Rails.env]

First we clear all connections. This is normally done before forking. Methods like clear_all_connections! and clear_active_connections! iterate across all connection pools. You only need to call this once, not per connection. Second, we call establish_connection on our MySQL connection class using the same configuration method we did in the first topic.

Topic 7: Rails Transactional Fixtures

If you are using transactional fixtures and you have fixtures for each connection, then ActiveRecord will do all the work for you by making sure each test is wrapped with transactional savepoints to each database. However, if you do not have fixtures for all connections then it is up to you to start the first transaction for each additional connection. Again, using our MySQL connection class as an example:

# Rails 3.2.x
before do
  MysqlBase.connection.increment_open_transactions
  MysqlBase.connection.transaction_joinable = false
  MysqlBase.connection.begin_db_transaction
end
after do
  if MysqlBase.connection.open_transactions != 0
    MysqlBase.connection.rollback_db_transaction
    MysqlBase.connection.decrement_open_transactions
  end
end

# Rails 4.2.x
before { MysqlBase.connection.begin_transaction joinable: false }
after  { MysqlBase.connection.rollback_transaction }

Please remember that ActiveRecord uses savepoints for nested transactions. Depending on your database, you will see SAVEPOINT and RELEASE SAVEPOINT for tests in your logs. Only the outer most transactions will use BEGIN and ROLLBACK.

Resources

Thanks for reading this far. Did I miss anything important or maybe you would like to ask a question? If so, drop a comment and I would be more than happy to expand on any topic.

Each topic or scenario here is fully tested in a demo rails application. Check it out on GitHub if you are interested in playing around with these or any other multi-database topics that interest you. If you find a new one or care to share your own, please drop a comment or a pull request.

Tested Multi-Database Rails Application