From 8574c3e4fa6d13193bd0e98205b7e75ee6ee6543 Mon Sep 17 00:00:00 2001
From: eileencodes
Date: Mon, 3 Jun 2019 11:24:13 -0400
Subject: Document multiple databases in Rails

This file documents how to use multiple databases, what features are
supported, what features are coming soon, and caveats.
---
 guides/source/active_record_multiple_databases.md | 269 ++++++++++++++++++++++
 guides/source/documents.yaml                      |   5 +
 2 files changed, 274 insertions(+)
 create mode 100644 guides/source/active_record_multiple_databases.md

(limited to 'guides/source')

diff --git a/guides/source/active_record_multiple_databases.md b/guides/source/active_record_multiple_databases.md
new file mode 100644
index 0000000000..3e445b57a1
--- /dev/null
+++ b/guides/source/active_record_multiple_databases.md
@@ -0,0 +1,269 @@
+**DO NOT READ THIS FILE ON GITHUB, GUIDES ARE PUBLISHED ON https://guides.rubyonrails.org.**
+
+Multiple Databases with Active Record
+=====================================
+
+This guide covers using multiple databases with your Rails application.
+
+After reading this guide you will know:
+
+* How to set up your application for multiple databases.
+* How automatic connection switching works.
+* What features are supported and what's still a work in progress.
+
+--------------------------------------------------------------------------------
+
+As an application grows in popularity and usage, you'll need to scale the application
+to support your new users and their data. One way in which your application may need
+to scale is at the database level. Rails now has support for multiple databases
+so you don't have to store your data all in one place.
+
+At this time the following features are supported:
+
+* Multiple primary databases and a replica for each
+* Automatic connection switching for the model you're working with
+* Automatic swapping between the primary and replica depending on the HTTP verb
+and recent writes
+* Rails tasks for creating, dropping, migrating, and interacting with the multiple
+databases
+
+The following features are not (yet) supported:
+
+* Sharding
+* Joining across clusters
+* Load balancing replicas
+
+## Setting up your application
+
+While Rails tries to do most of the work for you, there are still some steps you'll
+need to take to get your application ready for multiple databases.
+
+Let's say we have an application with a single primary database and we need to add a
+new database for some new tables we're adding. The name of the new database will be
+"animals".
+
+The database.yml looks like this:
+
+```yaml
+production:
+  database: my_primary_database
+  user: root
+  adapter: mysql2
+```
+
+Let's add a replica for the primary, a new writer called animals, and a replica for that
+as well. To do this we need to change our database.yml from a 2-tier to a 3-tier config.
+
+```yaml
+production:
+  primary:
+    database: my_primary_database
+    user: root
+    adapter: mysql2
+  primary_replica:
+    database: my_primary_database
+    user: root_readonly
+    adapter: mysql2
+    replica: true
+  animals:
+    database: my_animals_database
+    user: animals_root
+    adapter: mysql2
+    migrations_paths: db/animals_migrate
+  animals_replica:
+    database: my_animals_database
+    user: animals_readonly
+    adapter: mysql2
+    replica: true
+```
+
+When using multiple databases there are a few important settings.
+
+First, the database name for the primary and replica should be the same because they contain
+the same data. Second, the username for the primary and replica should be different, and the
+replica user's permissions should allow reads but not writes.
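
The two rules above (same database name, different user) can be checked mechanically. The following is only a sketch under stated assumptions: `replica_config_problems` is a hypothetical helper that is not part of Rails, and it assumes each replica entry sets `replica: true` and is named `<primary>_replica`, as in the example config. It cannot verify that the replica user is actually read-only; that still has to be checked on the database server.

```ruby
require "yaml"

# Hypothetical helper, not part of Rails. It assumes every replica entry sets
# `replica: true` and is named "<primary>_replica", as in the example config.
def replica_config_problems(env_config)
  problems = []

  env_config.each do |name, config|
    next unless config["replica"]

    primary = env_config[name.sub(/_replica\z/, "")]
    if primary.nil? || primary.equal?(config)
      problems << "#{name}: no matching primary entry"
      next
    end

    # Rule 1: primary and replica hold the same data, so the names match.
    problems << "#{name}: database differs from its primary" if config["database"] != primary["database"]
    # Rule 2: the replica should connect as a separate user.
    problems << "#{name}: shares a user with its primary" if config["user"] == primary["user"]
  end

  problems
end

config = YAML.safe_load(<<~YAML)
  production:
    primary:
      database: my_primary_database
      user: root
    primary_replica:
      database: my_primary_database
      user: root_readonly
      replica: true
YAML

p replica_config_problems(config["production"]) # => []
```

An empty array means both rules hold for every replica found; each string in a non-empty result names the entry that breaks a rule.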
+
+When using a replica database you need to add a `replica: true` entry to the replica in the
+`database.yml`. This is because Rails otherwise has no way of knowing which one is a replica
+and which one is the primary.
+
+Lastly, for new primary databases you need to set the `migrations_paths` to the directory
+where you will store migrations for that database. We'll look more at `migrations_paths`
+later on in this guide.
+
+Now that we have a new database, let's set up the model. In order to use the new database we
+need to create a new abstract class and connect to the animals database.
+
+```ruby
+class AnimalsBase < ApplicationRecord
+  self.abstract_class = true
+
+  connects_to database: { writing: :animals, reading: :animals_replica }
+end
+```
+Then we need to
+update `ApplicationRecord` to be aware of our new replica.
+
+```ruby
+class ApplicationRecord < ActiveRecord::Base
+  self.abstract_class = true
+
+  connects_to database: { writing: :primary, reading: :primary_replica }
+end
+```
+
+By default Rails expects the database roles to be `writing` and `reading` for the primary
+and replica respectively. If you have a legacy system you may already have roles set up that
+you don't want to change. In that case you can set a new role name in your application config.
+
+```ruby
+config.active_record.writing_role = :default
+config.active_record.reading_role = :readonly
+```
+
+Now that we have the database.yml and the new model set up, it's time to create the databases.
+Rails 6.0 ships with all the Rails tasks you need to use multiple databases.
+
+You can run `rails -T` to see all the commands you're able to run. You should see the following:
+
+```
+$ rails -T
+rails db:create                  # Creates the database from DATABASE_URL or config/database.yml for the ...
+rails db:create:animals          # Create animals database for current environment
+rails db:create:primary          # Create primary database for current environment
+rails db:drop                    # Drops the database from DATABASE_URL or config/database.yml for the cu...
+rails db:drop:animals            # Drop animals database for current environment
+rails db:drop:primary            # Drop primary database for current environment
+rails db:migrate                 # Migrate the database (options: VERSION=x, VERBOSE=false, SCOPE=blog)
+rails db:migrate:animals         # Migrate animals database for current environment
+rails db:migrate:primary         # Migrate primary database for current environment
+rails db:migrate:status          # Display status of migrations
+rails db:migrate:status:animals  # Display status of migrations for animals database
+rails db:migrate:status:primary  # Display status of migrations for primary database
+```
+
+Running a command like `rails db:create` will create both the primary and animals databases.
+Note that there is no command for creating the users and you'll need to do that manually
+to support the readonly users for your replicas. If you want to create just the animals
+database you can run `rails db:create:animals`.
+
+## Migrations
+
+Migrations for multiple databases should live in their own folders prefixed with the
+name of the database key in the configuration.
+
+You also need to set the `migrations_paths` in the database configurations to tell Rails
+where to find the migrations.
+
+For example the `animals` database would look in the `db/animals_migrate` directory and
+`primary` would look in `db/migrate`. Rails generators now take a `--database` option
+so that the file is generated in the correct directory. The command can be run like so:
+
+```
+$ rails g migration CreateDogs name:string --database animals
+```
+
+## Activating automatic connection switching
+
+Finally, in order to use the read-only replica in your application you'll need to activate
+the middleware for automatic switching.
+
+Automatic switching allows the application to switch from the primary to replica or replica
+to primary based on the HTTP verb and whether there was a recent write.
+
+If the application is receiving a POST, PUT, DELETE, or PATCH request the application will
+automatically write to the primary. For the specified time after the write the application
+will read from the primary. For a GET or HEAD request the application will read from the
+replica unless there was a recent write.
+
+To activate the automatic connection switching middleware, add or uncomment the following
+lines in your application config.
+
+```ruby
+config.active_record.database_selector = { delay: 2.seconds }
+config.active_record.database_resolver = ActiveRecord::Middleware::DatabaseSelector::Resolver
+config.active_record.database_resolver_context = ActiveRecord::Middleware::DatabaseSelector::Resolver::Session
+```
+
+Rails guarantees "read your own write" and will send your GET or HEAD request to the
+primary if it's within the `delay` window. By default the delay is set to 2 seconds. You
+should change this based on your database infrastructure. Rails doesn't guarantee "read
+a recent write" for other users within the delay window and will send GET and HEAD requests
+to the replicas unless they wrote recently.
+
+The automatic connection switching in Rails is relatively primitive and deliberately doesn't
+do a whole lot. The goal was a system that demonstrated how to do automatic connection
+switching that was flexible enough to be customizable by app developers.
+
+The setup in Rails allows you to easily change how the switching is done and what
+parameters it's based on. Let's say you want to use a cookie instead of a session to
+decide when to swap connections. You can write your own class:
+
+```ruby
+class MyCookieResolver
+  # code for your cookie class
+end
+```
+
+And then pass it to the middleware:
+
+```ruby
+config.active_record.database_selector = { delay: 2.seconds }
+config.active_record.database_resolver = ActiveRecord::Middleware::DatabaseSelector::Resolver
+config.active_record.database_resolver_context = MyCookieResolver
+```
+
+## Using manual connection switching
+
+There are some cases where you may want your application to connect to a primary or a replica
+and the automatic connection switching isn't adequate. For example, you may know that for a
+particular request you always want to send the request to a replica, even when you are in a
+POST request path.
+
+To do this Rails provides a `connected_to` method that will switch to the connection you
+need.
+
+```ruby
+ActiveRecord::Base.connected_to(role: :reading) do
+  # all code in this block will be connected to the reading role
+end
+```
+
+The "role" in the `connected_to` call looks up the connections that are connected on that
+connection handler (or role). The `reading` connection handler will hold all the connections
+that were connected via `connects_to` with the role name of `reading`.
+
+There also may be a case where you have a database that you don't always want to connect to
+on application boot but may need for a slow query or analytics. After defining that database
+in the database.yml you can connect by passing a database argument to `connected_to`:
+
+```ruby
+ActiveRecord::Base.connected_to(database: { reading_slow: :animals_slow_replica }) do
+  # do something while connected to the slow replica
+end
+```
+
+The `database` argument for `connected_to` will take a symbol or a config hash.
+
+Note that `connected_to` with a role will look up an existing connection and switch
+using the connection specification name. This means that if you pass an unknown role
+like `connected_to(role: :nonexistent)` you will get an error that says
+`ActiveRecord::ConnectionNotEstablished (No connection pool with 'AnimalsBase' found
+for the 'nonexistent' role.)`
+
+## Caveats
+
+As noted at the top, Rails doesn't (yet) support sharding. We had to do a lot of work
+to support multiple databases for Rails 6.0. The lack of support for sharding isn't
+an oversight, but does require additional work that didn't make it in for 6.0. For now
+if you need sharding it may be advisable to continue using one of the many gems
+that support this.
+
+Rails also doesn't support automatic load balancing of replicas. This is very
+dependent on your infrastructure. We may implement basic, primitive load balancing
+in the future, but for an application at scale this should be something your application
+handles outside of Rails.
+
+Lastly, you cannot join across databases. Rails 6.1 will support using `has_many`
+relationships and creating 2 queries instead of joining, but Rails 6.0 will require
+you to split the joins into 2 selects manually.
diff --git a/guides/source/documents.yaml b/guides/source/documents.yaml
index 1e67b2bce7..90674e8456 100644
--- a/guides/source/documents.yaml
+++ b/guides/source/documents.yaml
@@ -160,6 +160,11 @@
       work_in_progress: true
       url: active_record_postgresql.html
       description: This guide covers PostgreSQL specific usage of Active Record.
+    -
+      name: Multiple Databases with Active Record
+      work_in_progress: true
+      url: active_record_multiple_databases.html
+      description: This guide covers using multiple databases in your application.
     -
       name: Extending Rails
--
cgit v1.2.3