Caching with Rails: An overview
===============================
Everyone caches. This guide will teach you what you need to know about
avoiding that expensive round-trip to your database and returning what you
need to return to those hungry web clients in the shortest time possible.
== Basic Caching
This is an introduction to the three types of caching techniques that Rails
provides by default without the use of any third-party plugins.
To get started, make sure `config.action_controller.perform_caching` is set
to `true` for your environment. This flag is normally set in the
corresponding config/environments/*.rb file. Caching is disabled by default
there for development and test, and enabled for production.
[source, ruby]
-----------------------------------------------------
config.action_controller.perform_caching = true
-----------------------------------------------------
=== Page Caching
Page caching is a Rails mechanism which allows the request for a generated
page to be fulfilled by the webserver, without ever having to go through the
Rails stack at all. Obviously, this is super-fast. Unfortunately, it can't be
applied to every situation (such as pages that need authentication) and since
the webserver is literally just serving a file from the filesystem, cache
expiration is an issue that needs to be dealt with.
So, how do you enable this super-fast cache behavior? Simple: let's say you
have a controller called ProductsController and a 'list' action that lists all
the products:
[source, ruby]
-----------------------------------------------------
class ProductsController < ActionController::Base
  caches_page :list
  def list; end
end
-----------------------------------------------------
The first time anyone requests products/list, Rails will generate a file
called `list.html` and the webserver will then look for that file before it
passes the next request for products/list to your Rails application.
By default, the page cache directory is set to Rails.public_path (which is
usually set to `RAILS_ROOT + "/public"`) and this can be configured by
changing the configuration setting `config.action_controller.page_cache_directory`.
Changing the default from /public helps avoid naming conflicts, since you may
want to put other static html in /public, but changing this will require web
server reconfiguration to let the web server know where to serve the cached
files from.
The Page Caching mechanism will automatically add a `.html` extension to
requests for pages that do not have an extension, to make it easy for the
webserver to find those pages. This can be configured by changing the
configuration setting `config.action_controller.page_cache_extension`.
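The extension rule described above can be sketched in plain Ruby. This is only an illustration: `page_cache_file` here is a hypothetical stand-alone helper, not the Rails implementation, and the `"/index"` fallback for the root path is an assumption made for the example.

```ruby
# Hypothetical helper: maps a request path to the file name a page cache
# might write under the page cache directory.
def page_cache_file(path, extension = ".html")
  # The root path has no file name of its own, so fall back to "index".
  name = (path.empty? || path == "/") ? "/index" : path.chomp("/")
  # Only append the extension when the last path segment has none.
  last_segment = name.split("/").last
  last_segment.include?(".") ? name : name + extension
end

puts page_cache_file("/products/list")      # /products/list.html
puts page_cache_file("/products/list.xml")  # /products/list.xml
puts page_cache_file("/")                   # /index.html
```

Passing a different second argument plays the role of changing `page_cache_extension`.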
In order to expire this page when a new product is added we could extend our
example controller like this:
[source, ruby]
-----------------------------------------------------
class ProductsController < ActionController::Base
  caches_page :list
  def list; end
  def create
    expire_page :action => :list
  end
end
-----------------------------------------------------
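To make the file-based nature of page caching and expiry concrete, here is a minimal plain-Ruby sketch that uses a temporary directory in place of the public directory. `TinyPageCache` and its methods are hypothetical names invented for this illustration; the sketch only assumes that a cached page is a file on disk and that expiring it means removing that file.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical stand-in for the web server plus page caching:
# serve the file if it exists, otherwise "render" the page and write it.
class TinyPageCache
  attr_reader :renders

  def initialize(directory)
    @directory = directory
    @renders = 0
  end

  def fetch(path)
    file = File.join(@directory, path + ".html")
    return File.read(file) if File.exist?(file) # cache hit: no rendering
    @renders += 1                               # cache miss: render the page
    body = yield
    FileUtils.mkdir_p(File.dirname(file))
    File.write(file, body)
    body
  end

  # Similar in spirit to expire_page: remove the cached file.
  def expire(path)
    file = File.join(@directory, path + ".html")
    File.delete(file) if File.exist?(file)
  end
end

Dir.mktmpdir do |dir|
  cache = TinyPageCache.new(dir)
  cache.fetch("/products/list") { "<ul>2 products</ul>" }
  cache.fetch("/products/list") { "<ul>3 products</ul>" } # stale copy served from disk
  cache.expire("/products/list")                          # e.g. after create
  puts cache.fetch("/products/list") { "<ul>3 products</ul>" }
  puts cache.renders
end
```

The second request returns the stale two-product page, which is exactly why expiration has to be handled explicitly.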
If you want a more complicated expiration scheme, you can use cache sweepers
to expire cached objects when things change. This is covered in the section on Sweepers.
=== Action Caching
One of the issues with Page Caching is that you cannot use it for pages that
need to restrict access in some way. This is where Action Caching comes in.
Action Caching works like Page Caching, except that the incoming web request
does go from the webserver to the Rails stack and Action Pack, so that before
filters can run on it before the cache is served. This allows authentication
and other restrictions to be applied while still serving the result from a
cached copy.
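The order of operations described above (filters first, cache second) can be sketched in plain Ruby. `TinyActionCache` is a hypothetical class written for this illustration and does not reflect how Action Pack is implemented internally.

```ruby
# Hypothetical miniature of the action caching flow: the filter runs on
# every request, and only then is the cached body (or the action) used.
class TinyActionCache
  attr_reader :action_runs

  def initialize
    @store = {}
    @action_runs = 0
  end

  def process(path, logged_in)
    # The before filter runs whether or not a cached copy exists.
    return [401, "Please log in"] unless logged_in

    body = @store[path]
    if body.nil?
      @action_runs += 1 # cache miss: run the action body
      body = "Edit form for #{path}"
      @store[path] = body
    end
    [200, body]
  end
end

app = TinyActionCache.new
p app.process("/products/1/edit", true)  # runs the action and fills the cache
p app.process("/products/1/edit", true)  # served from the cache
p app.process("/products/1/edit", false) # the filter still blocks the request
puts app.action_runs
```

Even with a warm cache, the unauthenticated request never sees the cached body, which is the property Page Caching cannot offer.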
Clearing the cache works in the exact same way as with Page Caching.
Let's say you only wanted authenticated users to edit or create a Product
object, but still cache those pages:
[source, ruby]
-----------------------------------------------------
class ProductsController < ActionController::Base
  before_filter :authenticate, :only => [ :edit, :create ]
  caches_page :list
  caches_action :edit
  def list; end
  def create
    expire_page :action => :list
    expire_action :action => :edit
  end
  def edit; end
end
-----------------------------------------------------
You can also use `:if` (or `:unless`) to pass a Proc that specifies when the
action should be cached. In addition, you can use `:layout => false` to cache
without the layout, so that dynamic information in the layout, such as the
logged-in user's details or the number of items in the cart, is left uncached.
This feature is available as of Rails 2.2.
=== Fragment Caching
Life would be perfect if we could get away with caching the entire contents of
a page or action and serving it out to the world. Unfortunately, dynamic web
applications usually build pages from a variety of components, not all of which
have the same caching characteristics. To handle such dynamically created
pages, where different parts of the page need to be cached and expired
differently, Rails provides a mechanism called Fragment Caching.
Fragment Caching allows a fragment of view logic to be wrapped in a cache
block and served out of the cache store when the next request comes in.
As an example, if you wanted to show all the orders placed on your website
in real time and didn't want to cache that part of the page, but did want
to cache the part of the page which lists all products available, you
could use this piece of code:
[source, ruby]
-----------------------------------------------------
<% Order.find_recent.each do |o| %>
  <%= o.buyer.name %> bought <%= o.product.name %>
<% end %>
<% cache do %>
  All available products:
  <% Product.find(:all).each do |p| %>
    <%= link_to p.name, product_url(p) %>
  <% end %>
<% end %>
-----------------------------------------------------
The cache block in our example will bind to the action that called it and is
written out to the same place as the Action Cache, which means that if you
want to cache multiple fragments per action, you should provide an `action_suffix` to the cache call:
[source, ruby]
-----------------------------------------------------
<% cache(:action => 'recent', :action_suffix => 'all_products') do %>
  All available products:
<% end %>
-----------------------------------------------------
You can expire it using the `expire_fragment` method, like so:
[source, ruby]
-----------------------------------------------------
expire_fragment(:controller => 'products', :action => 'recent', :action_suffix => 'all_products')
-----------------------------------------------------
If you don't want the cache block to bind to the action that called it, you can
also use globally keyed fragments by calling the cache method with a key, like
so:
[source, ruby]
-----------------------------------------------------
<% cache(['all_available_products', @latest_product.created_at].join(':')) do %>
  All available products:
<% end %>
-----------------------------------------------------
This fragment is then available to all actions in the ProductsController using
the key and can be expired the same way:
[source, ruby]
-----------------------------------------------------
expire_fragment(['all_available_products', @latest_product.created_at].join(':'))
-----------------------------------------------------
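The way a keyed fragment is stored, reused and expired can be sketched without Rails. `TinyFragmentStore` is a hypothetical class made up for this illustration, and the key and timestamp string below are sample values only.

```ruby
# Hypothetical fragment store: a Hash keyed by fragment name.
class TinyFragmentStore
  def initialize
    @fragments = {}
  end

  # Return the stored fragment, or run the block once and remember its result.
  def cache(key)
    @fragments.fetch(key) { @fragments[key] = yield }
  end

  # Forget the fragment so the next call to cache renders it again.
  def expire(key)
    @fragments.delete(key)
  end
end

store = TinyFragmentStore.new
renders = 0
render_list = lambda { renders += 1; "All available products: A, B" }

key = ["all_available_products", "2008-11-01"].join(":")
store.cache(key) { render_list.call }
store.cache(key) { render_list.call } # reuses the stored text
store.expire(key)
store.cache(key) { render_list.call } # rendered again after expiry
puts renders
```

Because the key includes a timestamp, a newer product would also produce a different key, so the old fragment would simply stop being read even without an explicit expiry.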
=== Sweepers
Cache sweeping is a mechanism which allows you to get around having a ton of
expire_{page,action,fragment} calls in your code. It does this by moving all
the work required to expire cached content into an
`ActionController::Caching::Sweeper` class. This class is an Observer that
looks for changes to an object via callbacks, and when a change occurs it
expires the caches associated with that object in an around or after filter.
Continuing with our Product controller example, we could rewrite it with a
sweeper such as the following:
[source, ruby]
-----------------------------------------------------
class StoreSweeper < ActionController::Caching::Sweeper
  observe Product # This sweeper is going to keep an eye on the Product model
  # If our sweeper detects that a Product was created call this
  def after_create(product)
    expire_cache_for(product)
  end
  # If our sweeper detects that a Product was updated call this
  def after_update(product)
    expire_cache_for(product)
  end
  # If our sweeper detects that a Product was deleted call this
  def after_destroy(product)
    expire_cache_for(product)
  end
  private
  def expire_cache_for(record)
    # Expire the list page now that we added a new product
    expire_page(:controller => 'products', :action => 'list')
    # Expire a fragment
    expire_fragment(:controller => 'products', :action => 'recent', :action_suffix => 'all_products')
  end
end
-----------------------------------------------------
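The observer idea behind the sweeper can be illustrated in plain Ruby. `TinyProduct` and `TinySweeper` are hypothetical classes invented for this sketch; they only mimic the shape of the callback flow and are not how Rails wires sweepers to models.

```ruby
# Hypothetical model that notifies its observers after it is saved.
class TinyProduct
  @observers = []

  class << self
    attr_reader :observers
  end

  attr_reader :name

  def initialize(name)
    @name = name
  end

  def save
    self.class.observers.each { |observer| observer.after_create(self) }
    true
  end
end

# Hypothetical sweeper: reacts to the callback by dropping a cache entry.
class TinySweeper
  def initialize(cache)
    @cache = cache
  end

  def after_create(product)
    @cache.delete("products/list")
  end
end

cache = { "products/list" => "<ul>old list</ul>" }
TinyProduct.observers << TinySweeper.new(cache)
TinyProduct.new("Widget").save
p cache.key?("products/list") # false
```

The model never mentions the cache, and the code that saves the product contains no expiry call; the sweeper is the only place that knows which entries depend on a product.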
Then we add it to our controller to tell it to call the sweeper when certain
actions are called. So, if we wanted to expire the cached content for the
list and edit actions when the create action was called, we could do the
following:
[source, ruby]
-----------------------------------------------------
class ProductsController < ActionController::Base
  before_filter :authenticate, :only => [ :edit, :create ]
  caches_page :list
  caches_action :edit
  cache_sweeper :store_sweeper, :only => [ :create ]
  def list; end
  def create
    expire_page :action => :list
    expire_action :action => :edit
  end
  def edit; end
end
-----------------------------------------------------
=== SQL Caching
Query caching is a Rails feature that caches the result set returned by each
query so that if Rails encounters the same query again for that request, it
will use the cached result set as opposed to running the query against the
database again.
For example:
[source, ruby]
-----------------------------------------------------
class ProductsController < ActionController::Base
  before_filter :authenticate, :only => [ :edit, :create ]
  caches_page :list
  caches_action :edit
  cache_sweeper :store_sweeper, :only => [ :create ]
  def list
    # Run a find query
    Product.find(:all)
    ...
    # Run the same query again
    Product.find(:all)
  end
  def create
    expire_page :action => :list
    expire_action :action => :edit
  end
  def edit; end
end
-----------------------------------------------------
In the 'list' action above, the result set returned by the first
Product.find(:all) will be cached and will be used to avoid querying the
database again the second time that finder is called.
Query caches are created at the start of an action and destroyed at the end of
that action and thus persist only for the duration of the action.
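The lifetime described above can be sketched in plain Ruby. `TinyQueryCache` is a hypothetical class written for this illustration: it memoizes results by SQL string and is cleared when the "action" ends. It makes no claim about how ActiveRecord implements its query cache.

```ruby
# Hypothetical connection wrapper that remembers SELECT results by SQL string.
class TinyQueryCache
  attr_reader :database_hits

  def initialize
    @results = {}
    @database_hits = 0
  end

  def select_all(sql)
    @results.fetch(sql) do
      @database_hits += 1 # only reached on a cache miss
      @results[sql] = run_query(sql)
    end
  end

  # Called when the action finishes, so nothing leaks into the next request.
  def clear
    @results.clear
  end

  private

  # Stand-in for a real database round-trip.
  def run_query(sql)
    ["row for: #{sql}"]
  end
end

conn = TinyQueryCache.new
conn.select_all("SELECT * FROM products")
conn.select_all("SELECT * FROM products") # answered from the cache
puts conn.database_hits                   # 1
conn.clear                                # end of the action
conn.select_all("SELECT * FROM products") # the next request hits the database
puts conn.database_hits                   # 2
```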
=== Cache stores
Rails provides different stores for the cached data for action and fragment
caches. Page caches are always stored on disk.
The cache stores provided include:
1) Memory store: Cached data is stored in the memory allocated to the Rails
process, which is fine for WEBrick and for FCGI (if you
don't care that each FCGI process holds its own fragment
store). It's not suitable for CGI as the process is thrown
away at the end of each request. It can potentially also
take up a lot of memory since each process keeps all the
caches in memory.
[source, ruby]
-----------------------------------------------------
ActionController::Base.cache_store = :memory_store
-----------------------------------------------------
2) File store: Cached data is stored on disk. This is the default store,
and the default path for this store is /tmp/cache. It works
well for all types of environments and allows all processes
running from the same application directory to access the
cached content.
[source, ruby]
-----------------------------------------------------
ActionController::Base.cache_store = :file_store, "/path/to/cache/directory"
-----------------------------------------------------
3) DRb store: Cached data is stored in a separate shared DRb process that all
servers communicate with. This works for all environments and
only keeps one cache around for all processes, but requires
that you run and manage a separate DRb process.
[source, ruby]
-----------------------------------------------------
ActionController::Base.cache_store = :drb_store, "druby://localhost:9192"
-----------------------------------------------------
4) MemCached store: Works like DRbStore, but uses Danga's memcached instead.
Rails uses the bundled memcache-client gem by default.
[source, ruby]
-----------------------------------------------------
ActionController::Base.cache_store = :mem_cache_store, "localhost"
-----------------------------------------------------
5) Custom store: You can define your own cache store (new in Rails 2.1)
[source, ruby]
-----------------------------------------------------
ActionController::Base.cache_store = MyOwnStore.new("parameter")
-----------------------------------------------------
+Note: config.cache_store can be used in place of
ActionController::Base.cache_store in your Rails::Initializer.run block in
environment.rb+
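As a rough idea of what a custom store such as `MyOwnStore` might look like, here is a stand-alone sketch. It assumes the store needs to respond to `read`, `write`, `delete` and `exist?`; a real store would typically subclass `ActiveSupport::Cache::Store` rather than stand alone, and the namespacing behaviour shown here is purely hypothetical.

```ruby
# Hypothetical in-memory store that prefixes every key with a namespace.
class MyOwnStore
  def initialize(namespace)
    @namespace = namespace
    @data = {}
  end

  def read(key, options = nil)
    @data[namespaced(key)]
  end

  def write(key, value, options = nil)
    @data[namespaced(key)] = value
  end

  def delete(key, options = nil)
    @data.delete(namespaced(key))
  end

  def exist?(key, options = nil)
    @data.key?(namespaced(key))
  end

  private

  def namespaced(key)
    "#{@namespace}:#{key}"
  end
end

store = MyOwnStore.new("parameter")
store.write("views/products/list", "<ul>products</ul>")
puts store.read("views/products/list")
store.delete("views/products/list")
p store.exist?("views/products/list") # false
```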
== Conditional GET support
Conditional GETs are a feature of the HTTP specification that provides a way
for web servers to tell browsers that the response to a GET request hasn’t
changed since the last request and can be safely pulled from the browser cache.
They work by using the HTTP_IF_NONE_MATCH and HTTP_IF_MODIFIED_SINCE headers to
pass back and forth both a unique content identifier and the timestamp of when
the content was last changed. If the browser makes a request where the content
identifier (etag) or last modified since timestamp matches the server’s version
then the server only needs to send back an empty response with a not modified
status.
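The freshness check itself can be sketched in plain Ruby. `fresh?` below is a hypothetical stand-alone helper, not the Rails method of the same name, and it makes one assumption that should be stated loudly: every validator the browser sent must match for the request to count as fresh. The etag format is also just an example.

```ruby
require "digest/md5"
require "time"

# Hypothetical helper: decides whether a 304 Not Modified can be sent.
def fresh?(request_headers, etag, last_modified)
  none_match = request_headers["HTTP_IF_NONE_MATCH"]
  modified_since = request_headers["HTTP_IF_MODIFIED_SINCE"]
  # With no validators at all, the full response has to be sent.
  return false if none_match.nil? && modified_since.nil?

  # Assumption: each validator that is present must match.
  etag_ok = none_match.nil? || none_match == etag
  time_ok = modified_since.nil? || Time.httpdate(modified_since) >= last_modified
  etag_ok && time_ok
end

updated_at = Time.utc(2008, 11, 1, 12, 0, 0)
etag = %("#{Digest::MD5.hexdigest("product-1-#{updated_at.to_i}")}")

first_visit = {}
revisit = {
  "HTTP_IF_NONE_MATCH" => etag,
  "HTTP_IF_MODIFIED_SINCE" => updated_at.httpdate
}

puts fresh?(first_visit, etag, updated_at) ? 304 : 200 # 200
puts fresh?(revisit, etag, updated_at) ? 304 : 200     # 304
```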
It is the server’s (i.e. our) responsibility to look for a last modified
timestamp and the if-none-match header and determine whether or not to send
back the full response. With conditional GET support in Rails this is a pretty
easy task:
[source, ruby]
-----------------------------------------------------
class ProductsController < ApplicationController
  def show
    @product = Product.find(params[:id])
    # If the request is stale according to the given timestamp and etag value
    # (i.e. it needs to be processed again) then execute this block
    if stale?(:last_modified => @product.updated_at.utc, :etag => @product)
      respond_to do |wants|
        # ... normal response processing
      end
    end
    # If the request is fresh (i.e. it's not modified) then you don't need to do
    # anything. The default render checks for this using the parameters
    # used in the previous call to stale? and will automatically send a
    # :not_modified. So that's it, you're done.
  end
end
-----------------------------------------------------
If you don’t have any special response processing and are using the default
rendering mechanism (i.e. you’re not using respond_to or calling render
yourself) then you’ve got an easy helper in fresh_when:
[source, ruby]
-----------------------------------------------------
class ProductsController < ApplicationController
  # This will automatically send back a :not_modified if the request is fresh,
  # and will render the default template (show.*) if it's stale.
  def show
    @product = Product.find(params[:id])
    fresh_when :last_modified => @product.published_at.utc, :etag => @product
  end
end
-----------------------------------------------------
== Advanced Caching
Along with the built-in mechanisms outlined above, a number of excellent
plugins exist to help with finer grained control over caching. These include
Chris Wanstrath's excellent cache_fu plugin (more info here:
http://errtheblog.com/posts/57-kickin-ass-w-cachefu) and Evan Weaver's
interlock plugin (more info here:
http://blog.evanweaver.com/articles/2007/12/13/better-rails-caching/). Both
of these plugins play nice with memcached and are a must-see for anyone
seriously considering optimizing their caching needs.