Since landing my first job programming with Ruby, most Rails applications I have worked with have managed two or more database connections. Often times these connections are both readable and writable. Collectively they represent a comprehensive business domain. Today I would like to cover everything I know about this topic, without going into the bike shed of micro services or schema management gems.
The goal is simple! To give you the tools you need to keep your application running smoothly when connecting to multiple databases.
Each topic or scenario below is tested within a demo Rails application that you can checkout on GitHub. See the resources section at the bottom of this post for more information. Examples assume that Rails 3.2.x to 4.2.x are the same behavior. When this is not the case, it will be explicitly pointed out.
Reasons for multiple database connections vastly differ. To illustrate a common setup, this demo application will use PostgreSQL as the base "legacy" connection. Just imagine this base is something like Oracle or SQL Server. Our second connection is going to be MySQL. This is the database we want to migrate our primary new business logic onto.
To easily follow along, MySQL models will be prefixed as such and all ActiveRecord logs have been tagged with the database connection that executed the query. This setup will give us more than enough opportunity to explore and have fun. Lets go!
First and foremost we are going to need a base class that champions a database connection. This is critical so that each subclass shares the same connection/pool. For example:
class MysqlBase < ActiveRecord::Base
establish_connection configurations['mysql'][Rails.env]
self.abstract_class = true
end
Do not use establish_connection for other models. Only use it for the base connection class. Setting your base connection class as an abstract_class
keeps ActiveRecord from querying the database for column information. Meaning this model has no attributes. Use this connection class as the superclass for each of your MySQL models. For example:
class User < MysqlBase
end
class Order < MysqlBase
end
Now, each model will will share the same connection pool. Benefits include:
ActiveRecord's query cache can be a beneficial performance optimization. On by default, it caches SQL results for the same query during the request/response cycle. This caching really shines with child to parent associations where n
children find the same parent and only hit the database once.
Unfortunately ActiveRecord can only cache your base connection. It knows not of other databases. So let's "cash in" on some of our simple code dividends and put that base connection class to work.
class ApplicationController < ActionController::Base
around_filter :cache_other_db_connections
private
def cache_other_db_connections
MysqlBase.connection.cache { yield }
end
end
Yup, it is that easy! If you had multiple database connections, just nest them in the outer block and yield once to the controller's around filter.
So what happens when a model for one database creates or updates another model?
class MysqlUser < MysqlBase
self.table_name = :users
after_save :account_create
private
def account_create
Account.create_from_user!(self)
end
end
Our MysqlUser
model is going to save a legacy account model using the after_save
callback. Remember, ActiveRecord runs creates or saves inside of a transaction.
new_mysql_user.account_create = true
new_mysql_user.save
MySQL: BEGIN
MySQL: INSERT INTO `users` (...) VALUES (...)
Base: BEGIN
Base: INSERT INTO "accounts" (...) VALUES (...) RETURNING "id"
Base: COMMIT
MySQL: COMMIT
So the outcome is pretty clear here. We have two distinct database transactions. One is not going to affect the other. What would happen if the account raised an exception when being created? Possibly due to a validation error?
new_mysql_user.account_create = true
new_mysql_user.account_fails_validation = true
new_mysql_user.save
MySQL: BEGIN
MySQL: INSERT INTO `users` (...) VALUES (...)
Base: BEGIN
Base: ROLLBACK
MySQL: ROLLBACK
Not bad. We can see that ActiveRecord handles this simple setup quite well. Both the user and the account are rolled back.
With our first scenario understood, it is time for another topic. Time to talk about the ActiveRecord::Rollback
exception object. Please take a moment to read the ActiveRecord documentation.
Transaction uses this exception to distinguish a deliberate rollback from other exceptional situations. Normally, raising an exception will cause the transaction method to rollback the database transaction and pass on the exception. But if you raise an ActiveRecord::Rollback exception, then the database transaction will be rolled back, without passing on the exception.
Did you see that? Using an ActiveRecord::Rollback
will NOT pass the exception to a higher transaction block. To illustrate what happens, let's take our last scenario example and change the account to raise an ActiveRecord::Rollback
vs a validation exception.
new_mysql_user.account_create = true
new_mysql_user.account_raise_rollback = true
new_mysql_user.save
MySQL: BEGIN
MySQL: INSERT INTO `users` (...) VALUES (...)
Base: BEGIN
Base: ROLLBACK
MySQL: COMMIT
And there we go! The MySQL user just carries along on its happy way and commits the change. I will have more rollback examples below, but I might go so far as to say that you may never want to use ActiveRecord::Rollback
unless you have a very clear grasp on how to manage transactions.
Implicit transactions are great for one or two model instances, but sooner are later you will need to orchestrate more objects within a larger commit. Explicit transactions give you this control. Here is a simple multi-database example:
MysqlUser.transaction do
MysqlUser.create! email: 'one@one.com'
MysqlUser.create! email: 'two@two.com'
Account.create! email: 'one@one.com'
Account.create! email: 'two@two.com'
end
MySQL: BEGIN
MySQL: INSERT INTO `users` (...) VALUES (...)
MySQL: INSERT INTO `users` (...) VALUES (...)
Base: BEGIN
Base: INSERT INTO "accounts" (...) VALUES (...) RETURNING "id"
Base: COMMIT
Base: BEGIN
Base: INSERT INTO "accounts" (...) VALUES (...) RETURNING "id"
Base: COMMIT
MySQL: COMMIT
The outer MySQL transaction was automatically joinable during the first two creates. This is because the model creates detected that the connection was already under a sharable transaction. The problem here is that the base connection had no joinable transaction block. Despite which connection we started the outer block with, the problem is the same. The solution is to open a transaction for each connection before doing any work.
MysqlUser.transaction do
Account.transaction do
MysqlUser.create! email: 'one@one.com'
MysqlUser.create! email: 'two@two.com'
Account.create! email: 'one@one.com'
Account.create! email: 'two@two.com'
end
end
MySQL: BEGIN
Base: BEGIN
MySQL: INSERT INTO `users` (...)
MySQL: INSERT INTO `users` (...)
Base: INSERT INTO "accounts" (...) RETURNING "id"
Base: INSERT INTO "accounts" (...) RETURNING "id"
Base: COMMIT
MySQL: COMMIT
Now we have a tidy atomic transaction across both databases. Any exception raised besides ActiveRecord::Rollback
will automatically rollback both transactions. This pattern is so useful, I suggest making a helper that does the work for you. Again, this leverages the base connection class for the second database – I did promise it would come in handy.
ActiveRecord::Base.class_eval do
def self.multi_transaction
ActiveRecord::Base.transaction do
MysqlBase.transaction { yield }
end
end
def multi_transaction
self.class.multi_transaction { yield }
end
end
Now anytime you need to commit large changes, just wrap your code in ActiveRecord::Base.multi_transaction { ... }
and sleep better knowing that your data is clean. If you want to learn more about ActiveRecord's transactions, please read the Rails API Docs. They cover advanced topics like isolation levels and savepoints. None of which are covered in this post.
If you are doing explicit transactions, you likely mean business. If so, make your code serious by using ActiveRecord's bang methods. Doing so helps keep code explicit to the task at hand with automatic rollbacks to two or more databases. Examples include:
save!
create!
update_attributes!
update!
destroy!
If you must raise exceptions during explicit transactions to multiple databases, please stay away from using ActiveRecord::Rollback
, this could complicate your code and in most cases need to be re-raised.
Every now and then you may use a gem whose model(s) subclasses ActiveRecord::Base
and you want to force that connection to another database. The Delayed::Job
library is a good example. You could use establish_connection
, but that would generate a distinct connection pool and keep transactions from being shared. Here is a bullet proof freedom patch to use with any model.
# In config/initializers/delayed_job.rb
Delayed::Backend::ActiveRecord::Job.class_eval do
class << self
def connection_pool
MysqlBase.connection_pool
end
def retrieve_connection
MysqlBase.retrieve_connection
end
def connected?
MysqlBase.connected?
end
def remove_connection(klass = self)
MysqlBase.remove_connection
end
end
end
This would change Delayed::Job from using our base connection to using our second MySQL connection. It is best to do patches like this in a Rails initializer. This guarantees gems are already reqired and in most cases have not yet connected to the database. If your model connects before these patches are applied, you could end up with an orphaned connection pool to the wrong database.
The patch works by overriding critical connection specification/handling class methods defined on ActiveRecord::Base. The first two methods connection_pool
and retrieve_connection
will make all your queries work. The second two will ensure all remaining connection management methods like remove_connection
and clear_all_connections!
do the same. But remember! If you need to do any connection management, you really only need do it on the connection class, in this case, MysqlBase
.
Most libraries that fork your Rails application process know how to reset ActiveRecord's base connections. This includes web servers like Passenger or Unicorn. However, it is up to you to do the same for any other database connections. This is easy to do by leveraging our connection class in the first topic.
ActiveRecord::Base.clear_all_connections!
MysqlBase.establish_connection MysqlBase.configurations['mysql'][Rails.env]
First we clear all connections. This is normally done before forking. Methods like clear_all_connections!
and clear_active_connections!
iterate across all connection pools. You only need to call this once, not per connection. Second, we call establish_connection
on our MySQL connection class using the same configuration method we did in the first topic.
If you are using transactional fixtures and you have fixtures for each connection, then ActiveRecord will do all the work for you by making sure each test is wrapped with transactional savepoints to each database. However, if you do not have fixtures for all connections then it is up to you to start the first transaction for each additional connection. Again, using our MySQL connection class as an example:
# Rails 3.2.x
before do
MysqlBase.connection.increment_open_transactions
MysqlBase.connection.transaction_joinable = false
MysqlBase.connection.begin_db_transaction
end
after do
if MysqlBase.connection.open_transactions != 0
MysqlBase.connection.rollback_db_transaction
MysqlBase.connection.decrement_open_transactions
end
end
# Rails 4.2.x
before { MysqlBase.connection.begin_transaction joinable: false }
after { MysqlBase.connection.rollback_transaction }
Please remember that ActiveRecord uses savepoints for nested transactions. Depending on your database, you will see SAVEPOINT
and RELEASE SAVEPOINT
for tests in your logs. Only the outer most transactions will use BEGIN
and ROLLBACK
.
Thanks for reading this far. Did I miss anything important or maybe you would like to ask a question? If so, drop a comment and I would be more than happy to expand on any topic.
Each topic or scenario here is fully tested in a demo rails application. Check it out on GitHub if you are interested in playing around with these or any other multi-database topics that interest you. If you find a new one or care to share your own, please drop a comment or a pull request.