From 0abcec416b6ec11faffa03d40e5661c0a4a8b092 Mon Sep 17 00:00:00 2001 From: Eileen Uchitelle Date: Thu, 17 Jan 2019 13:33:48 -0500 Subject: Adds basic automatic database switching to Rails The following PR adds behavior to Rails to allow an application to automatically switch it's connection from the primary to the replica. A request will be sent to the replica if: * The request is a read request (`GET` or `HEAD`) * AND It's been 2 seconds since the last write to the database (because we don't want to send a user to a replica if the write hasn't made it to the replica yet) A request will be sent to the primary if: * It's not a GET/HEAD request (ie is a POST, PATCH, etc) * Has been less than 2 seconds since the last write to the database The implementation that decides when to switch reads (the 2 seconds) is "safe" to use in production but not recommended without adequate testing with your infrastructure. At GitHub in addition to the a 5 second delay we have a curcuit breaker that checks the replication delay and will send the query to a replica before the 5 seconds has passed. This is specific to our application and therefore not something Rails should be doing for you. You'll need to test and implement more robust handling of when to switch based on your infrastructure. The auto switcher in Rails is meant to be a basic implementation / API that acts as a guide for how to implement autoswitching. The impementation here is meant to be strict enough that you know how to implement your own resolver and operations classes but flexible enough that we're not telling you how to do it. The middleware is not included automatically and can be installed in your application with the classes you want to use for the resolver and operations passed in. If you don't pass any classes into the middleware the Rails default Resolver and Session classes will be used. The Resolver decides what parameters define when to switch, Operations sets timestamps for the Resolver to read from. For example you may want to use cookies instead of a session so you'd implement a Resolver::Cookies class and pass that into the middleware via configuration options. ``` config.active_record.database_selector = { delay: 2.seconds } config.active_record.database_resolver = MyResolver config.active_record.database_operations = MyResolver::MyCookies ``` Your classes can inherit from the existing classes and reimplment the methods (or implement more methods) that you need to do the switching. You only need to implement methods that you want to change. For example if you wanted to set the session token for the last read from a replica you would reimplement the `read_from_replica` method in your resolver class and implement a method that updates a new timestamp in your operations class. --- activerecord/CHANGELOG.md | 33 ++++++++ activerecord/lib/active_record.rb | 7 ++ activerecord/lib/active_record/core.rb | 1 + .../active_record/middleware/database_selector.rb | 74 ++++++++++++++++++ .../middleware/database_selector/resolver.rb | 88 ++++++++++++++++++++++ .../database_selector/resolver/session.rb | 45 +++++++++++ activerecord/lib/active_record/railtie.rb | 8 ++ activerecord/test/cases/database_selector_test.rb | 77 +++++++++++++++++++ .../templates/config/environments/production.rb.tt | 20 +++++ 9 files changed, 353 insertions(+) create mode 100644 activerecord/lib/active_record/middleware/database_selector.rb create mode 100644 activerecord/lib/active_record/middleware/database_selector/resolver.rb create mode 100644 activerecord/lib/active_record/middleware/database_selector/resolver/session.rb create mode 100644 activerecord/test/cases/database_selector_test.rb diff --git a/activerecord/CHANGELOG.md b/activerecord/CHANGELOG.md index 67cf5881d3..654caafc92 100644 --- a/activerecord/CHANGELOG.md +++ b/activerecord/CHANGELOG.md @@ -1,3 +1,36 @@ +* Allow applications to automatically switch connections. + + Adds a middleware and configuration options that can be used in your + application to automatically switch between the writing and reading + database connections. + + `GET` and `HEAD` requests will read from the replica unless there was + a write in the last 2 seconds, otherwise they will read from the primary. + Non-get requests will always write to the primary. The middleware accepts + an argument for a Resolver class and a Operations class where you are able + to change how the auto-switcher works to be most beneficial for your + application. + + To use the middleware in your application you can use the following + configuration options: + + ``` + config.active_record.database_selector = { delay: 2.seconds } + config.active_record.database_resolver = ActiveRecord::Middleware::DatabaseSelector::Resolver + config.active_record.database_operations = ActiveRecord::Middleware::DatabaseSelector::Resolver::Session + ``` + + To change the database selection strategy, pass a custom class to the + configuration options: + + ``` + config.active_record.database_selector = { delay: 10.seconds } + config.active_record.database_resolver = MyResolver + config.active_record.database_operations = MyResolver::MyCookies + ``` + + *Eileen M. Uchitelle* + * MySQL: Support `:size` option to change text and blob size. *Ryuta Kamizono* diff --git a/activerecord/lib/active_record.rb b/activerecord/lib/active_record.rb index de94f9693f..7d66158f47 100644 --- a/activerecord/lib/active_record.rb +++ b/activerecord/lib/active_record.rb @@ -74,6 +74,7 @@ module ActiveRecord autoload :Translation autoload :Validations autoload :SecureToken + autoload :DatabaseSelector, "active_record/middleware/database_selector" eager_autoload do autoload :ActiveRecordError, "active_record/errors" @@ -153,6 +154,12 @@ module ActiveRecord end end + module Middleware + extend ActiveSupport::Autoload + + autoload :DatabaseSelector, "active_record/middleware/database_selector" + end + module Tasks extend ActiveSupport::Autoload diff --git a/activerecord/lib/active_record/core.rb b/activerecord/lib/active_record/core.rb index 369d63e40a..519acd7605 100644 --- a/activerecord/lib/active_record/core.rb +++ b/activerecord/lib/active_record/core.rb @@ -101,6 +101,7 @@ module ActiveRecord # environment where dumping schema is rarely needed. mattr_accessor :dump_schema_after_migration, instance_writer: false, default: true + mattr_accessor :database_selector, instance_writer: false ## # :singleton-method: # Specifies which database schemas to dump when calling db:structure:dump. diff --git a/activerecord/lib/active_record/middleware/database_selector.rb b/activerecord/lib/active_record/middleware/database_selector.rb new file mode 100644 index 0000000000..06529bc1c9 --- /dev/null +++ b/activerecord/lib/active_record/middleware/database_selector.rb @@ -0,0 +1,74 @@ +# frozen_string_literal: true + +require "active_record/middleware/database_selector/resolver" + +module ActiveRecord + module Middleware + # The DatabaseSelector Middleware provides a framework for automatically + # swapping from the primary to the replica database connection. Rails + # provides a basic framework to determine when to swap and allows for + # applications to write custom strategy classes to override the default + # behavior. + # + # The resolver class defines when the application should switch (i.e. read + # from the primary if a write occurred less than 2 seconds ago) and an + # operations class that sets a value that helps the resolver class decide + # when to switch. + # + # Rails default middleware uses the request's session to set a timestamp + # that informs the application when to read from a primary or read from a + # replica. + # + # To use the DatabaseSelector in your application with default settings add + # the following options to your environment config: + # + # config.active_record.database_selector = { delay: 2.seconds } + # config.active_record.database_resolver = ActiveRecord::Middleware::DatabaseSelector::Resolver + # config.active_record.database_operations = ActiveRecord::Middleware::DatabaseSelector::Resolver::Session + # + # New applications will include these lines commented out in the production.rb. + # + # The default behavior can be changed by setting the config options to a + # custom class: + # + # config.active_record.database_selector = { delay: 2.seconds } + # config.active_record.database_resolver = MyResolver + # config.active_record.database_operations = MyResolver::MySession + class DatabaseSelector + def initialize(app, resolver_klass = Resolver, operations_klass = Resolver::Session) + @app = app + @resolver_klass = resolver_klass + @operations_klass = operations_klass + end + + attr_reader :resolver_klass, :operations_klass + + # Middleware that determines which database connection to use in a mutliple + # database application. + def call(env) + request = ActionDispatch::Request.new(env) + + select_database(request) do + @app.call(env) + end + end + + private + + def select_database(request, &blk) + operations = operations_klass.build(request) + database_resolver = resolver_klass.call(operations) + + if reading_request?(request) + database_resolver.read(&blk) + else + database_resolver.write(&blk) + end + end + + def reading_request?(request) + request.get? || request.head? + end + end + end +end diff --git a/activerecord/lib/active_record/middleware/database_selector/resolver.rb b/activerecord/lib/active_record/middleware/database_selector/resolver.rb new file mode 100644 index 0000000000..1c056e2156 --- /dev/null +++ b/activerecord/lib/active_record/middleware/database_selector/resolver.rb @@ -0,0 +1,88 @@ +# frozen_string_literal: true + +require "active_record/middleware/database_selector/resolver/session" + +module ActiveRecord + module Middleware + class DatabaseSelector + # The Resolver class is used by the DatabaseSelector middleware to + # determine which database the request should use. + # + # To change the behavior of the Resolver class in your application, + # create a custom resolver class that inherts from + # DatabaseSelector::Resolver and implements the methods that need to + # be changed. + # + # By default the Resolver class will send read traffic to the replica + # if it's been 2 seconds since the last write. + class Resolver # :nodoc: + SEND_TO_REPLICA_DELAY = 2.seconds + + def self.call(resolver) + new(resolver) + end + + def initialize(resolver) + @resolver = resolver + @instrumenter = ActiveSupport::Notifications.instrumenter + end + + attr_reader :resolver, :instrumenter + + def read(&blk) + if read_from_primary? + read_from_primary(&blk) + else + read_from_replica(&blk) + end + end + + def write(&blk) + write_to_primary(&blk) + end + + private + + def read_from_primary(&blk) + ActiveRecord::Base.connection.while_preventing_writes do + ActiveRecord::Base.connected_to(role: :writing) do + instrumenter.instrument("database_selector.active_record.read_from_primary") do + yield + end + end + end + end + + def read_from_replica(&blk) + ActiveRecord::Base.connected_to(role: :reading) do + instrumenter.instrument("database_selector.active_record.read_from_replica") do + yield + end + end + end + + def write_to_primary(&blk) + ActiveRecord::Base.connected_to(role: :writing) do + instrumenter.instrument("database_selector.active_record.wrote_to_primary") do + resolver.update_last_write_timestamp + yield + end + end + end + + def read_from_primary? + !time_since_last_write_ok? + end + + def send_to_replica_delay + (ActiveRecord::Base.database_selector && ActiveRecord::Base.database_selector[:delay]) || + SEND_TO_REPLICA_DELAY + end + + def time_since_last_write_ok? + Time.now - resolver.last_write_timestamp >= send_to_replica_delay + end + end + end + end +end diff --git a/activerecord/lib/active_record/middleware/database_selector/resolver/session.rb b/activerecord/lib/active_record/middleware/database_selector/resolver/session.rb new file mode 100644 index 0000000000..33e0af5ee4 --- /dev/null +++ b/activerecord/lib/active_record/middleware/database_selector/resolver/session.rb @@ -0,0 +1,45 @@ +# frozen_string_literal: true + +module ActiveRecord + module Middleware + class DatabaseSelector + class Resolver + # The session class is used by the DatabaseSelector::Resolver to save + # timestamps of the last write in the session. + # + # The last_write is used to determine whether it's safe to read + # from the replica or the request needs to be sent to the primary. + class Session # :nodoc: + def self.build(request) + new(request.session) + end + + # Converts time to a timestamp that represents milliseconds since + # epoch. + def self.convert_time_to_timestamp(time) + time.to_i * 1000 + time.usec / 1000 + end + + # Converts milliseconds since epoch timestamp into a time object. + def self.convert_timestamp_to_time(timestamp) + timestamp ? Time.at(timestamp / 1000, (timestamp % 1000) * 1000) : Time.at(0) + end + + def initialize(session) + @session = session + end + + attr_reader :session + + def last_write_timestamp + self.class.convert_timestamp_to_time(session[:last_write]) + end + + def update_last_write_timestamp + session[:last_write] = self.class.convert_time_to_timestamp(Time.now) + end + end + end + end + end +end diff --git a/activerecord/lib/active_record/railtie.rb b/activerecord/lib/active_record/railtie.rb index 6346a95d57..a981fa97d9 100644 --- a/activerecord/lib/active_record/railtie.rb +++ b/activerecord/lib/active_record/railtie.rb @@ -88,6 +88,14 @@ module ActiveRecord end end + initializer "active_record.database_selector" do + if config.active_record.database_selector + resolver = config.active_record.delete(:database_resolver) + operations = config.active_record.delete(:database_operations) + config.app_middleware.use ActiveRecord::Middleware::DatabaseSelector, resolver, operations + end + end + initializer "Check for cache versioning support" do config.after_initialize do |app| ActiveSupport.on_load(:active_record) do diff --git a/activerecord/test/cases/database_selector_test.rb b/activerecord/test/cases/database_selector_test.rb new file mode 100644 index 0000000000..f961fa754b --- /dev/null +++ b/activerecord/test/cases/database_selector_test.rb @@ -0,0 +1,77 @@ +# frozen_string_literal: true + +require "cases/helper" +require "models/person" +require "action_dispatch" + +module ActiveRecord + class DatabaseSelectorTest < ActiveRecord::TestCase + setup do + @session_store = {} + @session = ActiveRecord::Middleware::DatabaseSelector::Resolver::Session.new(@session_store) + end + + def test_empty_session + assert_equal Time.at(0), @session.last_write_timestamp + end + + def test_writing_the_session_timestamps + assert @session.update_last_write_timestamp + + session2 = ActiveRecord::Middleware::DatabaseSelector::Resolver::Session.new(@session_store) + assert_equal @session.last_write_timestamp, session2.last_write_timestamp + end + + def test_writing_session_time_changes + assert @session.update_last_write_timestamp + + before = @session.last_write_timestamp + sleep(0.1) + + assert @session.update_last_write_timestamp + assert_not_equal before, @session.last_write_timestamp + end + + def test_read_from_replicas + @session_store[:last_write] = ActiveRecord::Middleware::DatabaseSelector::Resolver::Session.convert_time_to_timestamp(Time.now - 5.seconds) + + resolver = ActiveRecord::Middleware::DatabaseSelector::Resolver.new(@session) + + called = false + resolver.read do + called = true + assert ActiveRecord::Base.connected_to?(role: :reading) + end + assert called + end + + def test_read_from_primary + @session_store[:last_write] = ActiveRecord::Middleware::DatabaseSelector::Resolver::Session.convert_time_to_timestamp(Time.now) + + resolver = ActiveRecord::Middleware::DatabaseSelector::Resolver.new(@session) + + called = false + resolver.read do + called = true + assert ActiveRecord::Base.connected_to?(role: :writing) + end + assert called + end + + def test_the_middleware_chooses_writing_role_with_POST_request + middleware = ActiveRecord::Middleware::DatabaseSelector.new(lambda { |env| + assert ActiveRecord::Base.connected_to?(role: :writing) + [200, {}, ["body"]] + }) + assert_equal [200, {}, ["body"]], middleware.call("REQUEST_METHOD" => "POST") + end + + def test_the_middleware_chooses_reading_role_with_GET_request + middleware = ActiveRecord::Middleware::DatabaseSelector.new(lambda { |env| + assert ActiveRecord::Base.connected_to?(role: :reading) + [200, {}, ["body"]] + }) + assert_equal [200, {}, ["body"]], middleware.call("REQUEST_METHOD" => "GET") + end + end +end diff --git a/railties/lib/rails/generators/rails/app/templates/config/environments/production.rb.tt b/railties/lib/rails/generators/rails/app/templates/config/environments/production.rb.tt index 08befd9196..94f7dd0c79 100644 --- a/railties/lib/rails/generators/rails/app/templates/config/environments/production.rb.tt +++ b/railties/lib/rails/generators/rails/app/templates/config/environments/production.rb.tt @@ -98,4 +98,24 @@ Rails.application.configure do # Do not dump schema after migrations. config.active_record.dump_schema_after_migration = false <%- end -%> + + # Inserts middleware to perform automatic connection switching. + # The `database_selector` hash is used to pass options to the DatabaseSelector + # middleware. The `delay` is used to determine how long to wait after a write + # to send a subsequent read to the primary. + # + # The `database_resolver` class is used by the middleware to determine which + # database is appropriate to use based on the time delay. + # + # The `database_operations` class is used by the middleware to set timestamps + # for the last write to the primary. The resolver uses the operations class + # timestamps to determine how long to wait before reading from the replica. + # + # By default Rails will store a last write timestamp in the session. The + # DatabaseSelector middleware is designed as such you can define your own + # strategy for connection switching and pass that into the middleware through + # these configuration options. + # config.active_record.database_selector = { delay: 2.seconds } + # config.active_record.database_resolver = ActiveRecord::Middleware::DatabaseSelector::Resolver + # config.active_record.database_operations = ActiveRecord::Middleware::DatabaseSelector::Resolver::Session end -- cgit v1.2.3