From 4f7ab453e4afbf2302291d320967f38735de2e31 Mon Sep 17 00:00:00 2001
From: Matthew Draper
Date: Thu, 29 Dec 2016 21:55:42 +1030
Subject: Start on a guide for the Executor & Load Interlock

---
 guides/source/threading_and_code_execution.md | 262 ++++++++++++++++++++++++++
 1 file changed, 262 insertions(+)
 create mode 100644 guides/source/threading_and_code_execution.md

(limited to 'guides/source/threading_and_code_execution.md')

diff --git a/guides/source/threading_and_code_execution.md b/guides/source/threading_and_code_execution.md
new file mode 100644
index 0000000000..b35b2fdee4
--- /dev/null
+++ b/guides/source/threading_and_code_execution.md
@@ -0,0 +1,262 @@
+**DO NOT READ THIS FILE ON GITHUB, GUIDES ARE PUBLISHED ON http://guides.rubyonrails.org.**
+
+Threading and Code Execution in Rails
+=====================================
+
+After reading this guide, you will know:
+
+* What code Rails will automatically execute concurrently
+* How to integrate manual concurrency with Rails internals
+* How to wrap all application code
+* How to affect application reloading
+
+--------------------------------------------------------------------------------
+
+Automatic Concurrency
+---------------------
+
+Rails automatically allows various operations to be performed at the same time.
+
+When using a threaded web server, such as the default Puma, multiple HTTP
+requests will be served simultaneously, with each request provided its own
+controller instance.
+
+Threaded Active Job adapters, including the built-in Async, will likewise
+execute several jobs at the same time. Action Cable channels are managed this
+way too.
+
+These mechanisms all involve multiple threads, each managing work for a unique
+instance of some object (controller, job, channel), while sharing the global
+process space (such as classes and their configurations, and global variables).
+As long as your code doesn't modify any of those shared things, it can mostly
+ignore that other threads exist.
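
To make the caveat about shared state concrete, here is a minimal plain-Ruby
sketch (no Rails involved). `HitCounter` is a hypothetical class invented for
this illustration. It shows class-level state, which every thread in the
process shares, being updated from several threads, with a `Mutex` guarding the
read-modify-write so that no increments are lost.

```ruby
# Hypothetical example: a process-wide counter shared by every thread.
class HitCounter
  @count = 0
  @lock  = Mutex.new

  class << self
    attr_reader :count

    # Serialize the read-modify-write so concurrent threads cannot
    # interleave and lose updates.
    def increment
      @lock.synchronize { @count += 1 }
    end
  end
end

# Four threads, each incrementing the shared counter 1,000 times.
threads = 4.times.map do
  Thread.new { 1_000.times { HitCounter.increment } }
end
threads.each(&:join)

puts HitCounter.count
```

Code that only touches its own controller, job, or channel instance needs no
such lock; the lock matters only because `HitCounter` lives in the shared
process space.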
+
+The rest of this guide describes the mechanisms Rails uses to make it "mostly
+ignorable", and how extensions and applications with special needs can use them.
+
+Executor
+--------
+
+The Rails Executor separates application code from framework code: any time the
+framework invokes code you've written in your application, it will be wrapped by
+the Executor.
+
+The Executor consists of two callbacks: `to_run` and `to_complete`. The Run
+callback is called before the application code, and the Complete callback is
+called after.
+
+### Default callbacks
+
+In a default Rails application, the Executor callbacks are used to:
+
+* track which threads are in safe positions for autoloading and reloading
+* enable and disable the Active Record query cache
+* return acquired Active Record connections to the pool
+* constrain internal cache lifetimes
+
+### Wrapping application code
+
+If you're writing a library or component that will invoke application code, you
+should wrap it with a call to the executor:
+
+```ruby
+Rails.application.executor.wrap do
+  # call application code here
+end
+```
+
+TIP: If you repeatedly invoke application code from a long-running process, you
+may want to wrap using the Reloader instead.
+
+Each thread should be wrapped before it runs application code, so if your
+application manually delegates work to other threads, such as via `Thread.new`
+or Concurrent Ruby features that use thread pools, you should immediately wrap
+the block:
+
+```ruby
+Thread.new do
+  Rails.application.executor.wrap do
+    # your code here
+  end
+end
+```
+
+NOTE: Concurrent Ruby uses a `ThreadPoolExecutor`, which it sometimes configures
+with an `executor` option. Despite the name, it is unrelated.
+
+The Executor is safely re-entrant; if it is already active on the current
+thread, `wrap` is a no-op.
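
The callback ordering and the re-entrant behaviour described above can be
pictured with a toy model in plain Ruby. This is only a sketch under stated
assumptions: `ToyExecutor` and the `:toy_executor_active` thread variable are
hypothetical names invented here, and the class is not how the Rails Executor
is implemented. It only mimics the observable behaviour: Run callbacks fire
before the block, Complete callbacks fire after it, and a nested `wrap` on the
same thread fires neither.

```ruby
# Toy model only: NOT the Rails Executor, just an illustration of its
# run/complete ordering and its re-entrant `wrap`.
class ToyExecutor
  def initialize
    @to_run      = []
    @to_complete = []
  end

  def to_run(&block)
    @to_run << block
  end

  def to_complete(&block)
    @to_complete << block
  end

  def wrap
    # Already active on this thread: just run the block, no callbacks.
    return yield if Thread.current.thread_variable_get(:toy_executor_active)

    Thread.current.thread_variable_set(:toy_executor_active, true)
    begin
      @to_run.each(&:call)
      yield
    ensure
      @to_complete.each(&:call)
      Thread.current.thread_variable_set(:toy_executor_active, false)
    end
  end
end

events   = []
executor = ToyExecutor.new
executor.to_run      { events << :run }
executor.to_complete { events << :complete }

executor.wrap do
  executor.wrap { events << :inner } # nested call fires no callbacks
end

p events
```

Each callback appears once in `events` even though `wrap` was called twice,
which is the property that makes it safe for libraries to wrap defensively.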
+
+If it's impractical to physically wrap the application code in a block (for
+example, the Rack API makes this problematic), you can also use the `run!` /
+`complete!` pair.
+
+### Concurrency
+
+The Executor will put the current thread into `running` mode in the Load
+Interlock. This operation will block temporarily if another thread is currently
+either autoloading a constant or unloading/reloading the application.
+
+Reloader
+--------
+
+Like the Executor, the Reloader also wraps application code. If the Executor is
+not already active on the current thread, the Reloader will invoke it for you,
+so you only need to call one. This also guarantees that everything the Reloader
+does, including all its callback invocations, occurs wrapped inside the
+Executor.
+
+```ruby
+Rails.application.reloader.wrap do
+  # call application code here
+end
+```
+
+### Callbacks
+
+Before entering the wrapped block, the Reloader will check whether the running
+application needs to be reloaded -- for example, because a model's source file has
+been modified. If it determines a reload is required, it will wait until it's
+safe, and then do so, before continuing. When the application is configured to
+always reload regardless of whether any changes are detected, the reload is
+instead performed at the end of the block.
+
+The Reloader also provides `to_run` and `to_complete` callbacks; they are
+invoked at the same points as those of the Executor, but only when the current
+execution has initiated an application reload. When no reload is deemed
+necessary, the Reloader will invoke the wrapped block with no other callbacks.
+
+### Class Unload
+
+The most significant part of the reloading process is the Class Unload, where
+all autoloaded classes are removed, ready to be loaded again. This will occur
+immediately before either the Run or Complete callback, depending on the
+`reload_classes_only_on_change` setting.
+
+Often, additional reloading actions need to be performed either just before or
+just after the Class Unload, so the Reloader also provides `before_class_unload`
+and `after_class_unload` callbacks.
+
+### Concurrency
+
+Only long-running "top level" processes should invoke the Reloader, because if
+it determines a reload is needed, it will block until all other threads have
+completed and left any Executor block.
+
+If this were to occur in a "child" thread, with a waiting parent inside the
+Executor, it would cause an unavoidable deadlock: the reload must occur before
+the child thread is executed, but it cannot be safely performed while the parent
+thread is mid-execution.
+
+Child threads should use the Executor instead.
+
+Load Interlock
+--------------
+
+The Load Interlock allows autoloading and reloading to be enabled in a
+multi-threaded runtime environment.
+
+When one thread is performing an autoload by evaluating the class definition
+from the appropriate file, it is important no other thread encounters a
+reference to the partially-defined constant.
+
+Similarly, it is only safe to perform an unload/reload when no application code
+is in mid-execution: after the reload, the `User` constant, for example, may
+point to a different class. Without this rule, a poorly-timed reload would mean
+`User.new.class == User`, or even `User == User`, could be false.
+
+Both of these constraints are addressed by the Load Interlock. It keeps track of
+which threads are currently running application code, loading a class, or
+unloading autoloaded constants.
+
+Only one thread may load or unload at a time, and to do either, it must wait
+until no other threads are running application code. If a thread is waiting to
+perform a load, it doesn't prevent other threads from loading (in fact, they'll
+cooperate, and each perform their queued load in turn, before all resuming
+running together).
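
The core rule above (many threads may run together, but a load or unload needs
everyone else out of the way) resembles a shared/exclusive lock. The following
plain-Ruby sketch illustrates only that rule. `ToyInterlock` and its method
names are hypothetical, invented for this example; it is not the real
interlock, and it deliberately leaves out the cooperative queuing of loaders
and the upgrade from `running` to `load`.

```ruby
# Toy model only: NOT the Rails Load Interlock. It shows just the
# "shared running, exclusive load/unload" rule.
class ToyInterlock
  def initialize
    @mutex     = Mutex.new
    @changed   = ConditionVariable.new
    @running   = 0      # threads currently inside #running
    @exclusive = false  # true while one thread is inside #exclusive
  end

  # Shared mode: any number of threads may run application code together.
  def running
    @mutex.synchronize do
      @changed.wait(@mutex) while @exclusive
      @running += 1
    end
    begin
      yield
    ensure
      @mutex.synchronize do
        @running -= 1
        @changed.broadcast
      end
    end
  end

  # Exclusive mode (load or unload): waits until nothing is running.
  # In this simplified model it must not be called from inside #running.
  def exclusive
    @mutex.synchronize do
      @changed.wait(@mutex) while @exclusive || @running > 0
      @exclusive = true
    end
    begin
      yield
    ensure
      @mutex.synchronize do
        @exclusive = false
        @changed.broadcast
      end
    end
  end
end

interlock = ToyInterlock.new
log       = Queue.new
started   = Queue.new
release   = Queue.new

worker = Thread.new do
  interlock.running do
    started << true        # signal that "application code" has begun
    release.pop            # keep running until the main thread releases us
    log << :app_code_done
  end
end

started.pop                # the worker is now in shared `running` mode
loader = Thread.new do
  # Cannot enter while the worker is still inside #running.
  interlock.exclusive { log << :loaded }
end

release << true
[worker, loader].each(&:join)
order = [log.pop, log.pop]
p order
```

The loader's entry is always recorded after the worker's, because the exclusive
section cannot begin until the running count has dropped to zero.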
+
+### `permit_concurrent_loads`
+
+The Executor automatically acquires a `running` lock for the duration of its
+block, and autoload knows when to upgrade to a `load` lock, and switch back to
+`running` again afterwards.
+
+Other blocking operations performed inside the Executor block (which includes
+all application code), however, can needlessly retain the `running` lock. If
+another thread encounters a constant it must autoload, this can cause a
+deadlock.
+
+For example, assuming `User` is not yet loaded, the following will deadlock:
+
+```ruby
+Rails.application.executor.wrap do
+  th = Thread.new do
+    Rails.application.executor.wrap do
+      User # inner thread waits here; it cannot load
+           # User while another thread is running
+    end
+  end
+
+  th.join # outer thread waits here, holding 'running' lock
+end
+```
+
+To prevent this deadlock, the outer thread can `permit_concurrent_loads`. By
+calling this method, the thread guarantees it will not dereference any
+possibly-autoloaded constant inside the supplied block. The safest way to meet
+that promise is to put it as close as possible to the blocking call only:
+
+```ruby
+Rails.application.executor.wrap do
+  th = Thread.new do
+    Rails.application.executor.wrap do
+      User # inner thread can acquire the load lock,
+           # load User, and continue
+    end
+  end
+
+  ActiveSupport::Dependencies.interlock.permit_concurrent_loads do
+    th.join # outer thread waits here, but has no lock
+  end
+end
+```
+
+Another example, using Concurrent Ruby:
+
+```ruby
+Rails.application.executor.wrap do
+  futures = 3.times.collect do |i|
+    Concurrent::Future.execute do
+      Rails.application.executor.wrap do
+        # do work here
+      end
+    end
+  end
+
+  values = ActiveSupport::Dependencies.interlock.permit_concurrent_loads do
+    futures.collect(&:value)
+  end
+end
+```
+
+
+### ActionDispatch::DebugLocks
+
+If your application is deadlocking and you think the Load Interlock may be
+involved, you can temporarily add the ActionDispatch::DebugLocks middleware to
+`config/application.rb`:
+
+```ruby
+config.middleware.insert_before Rack::Sendfile,
+                                ActionDispatch::DebugLocks
+```
+
+If you then restart the application and re-trigger the deadlock condition,
+`/rails/locks` will show a summary of all threads currently known to the
+interlock, which lock level they are holding or awaiting, and their current
+backtrace.
+
+Generally a deadlock will be caused by the interlock conflicting with some other
+external lock or blocking I/O call. Once you find it, you can wrap it with
+`permit_concurrent_loads`.
+
-- 
cgit v1.2.3