author: schneems <richard.schneeman+foo@gmail.com> 2018-09-07 21:41:55 -0500
committer: schneems <richard.schneeman+foo@gmail.com> 2018-10-17 11:05:05 -0500
commit: 04454839a1a07cacac58cdf756a6b8e3adde0ef5
tree: e45385ac4bd770874fa76e0237987969184cf790 /activerecord/lib
parent: ead868315f9b0fedb351c9b451aa1f66a2dc8038
Use raw time string from DB to generate ActiveRecord#cache_version
Currently, the `updated_at` field is used to generate a `cache_version`. Some database adapters return this timestamp as a string that must then be converted to a Time value. This conversion allocates a lot of memory and costs even more CPU time. When the value is only being used for a cache version, we can skip the Time conversion and use the string value directly.

- This PR preserves the existing cache format by converting a UTC string from the database to `:usec` format.
- Some databases return an already converted Time object; in those instances, we can directly use `updated_at`.
- The `updated_at_before_type_cast` value can come from either the database or the user. We only want to optimize the case where it comes from the database.
- If the format of the cache version has been changed, we cannot apply this optimization, and it is skipped.
- If the time zone of the time in the database is not UTC, we cannot apply this optimization, and it is skipped.

Some databases (notably PostgreSQL) return a variable-length fractional-seconds value in the time string. If the value ends in zeros, they are truncated. For instance, instead of `2018-10-12 05:00:00.000000` the value `2018-10-12 05:00:00` is returned. We detect this case and pad the remaining zeros to ensure consistent cache version generation.

Before: Total allocated: 743842 bytes (6626 objects)
After: Total allocated: 702955 bytes (6063 objects)

(743842 - 702955) / 743842.0 # => ~0.055 (about 5.5%) ⚡️⚡️⚡️⚡️⚡️

Using the CodeTriage application and derailed benchmarks, this PR shows a statistically significant performance improvement of between 9% and 11% versus the commit before it.

Special thanks to @lsylvester for helping to figure out a way to preserve the usec format and for helping with many implementation details.
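The equivalence the message relies on can be checked outside Rails. The following is a minimal sketch, not the Rails implementation: `fast_cache_version` and `slow_cache_version` are hypothetical names, and plain `Time#strftime` with `"%Y%m%d%H%M%S%6N"` stands in for ActiveSupport's `:usec` format. It assumes the raw string is already in UTC.

```ruby
require "time"

# 14 digits for YYYYMMDDHHMMSS plus 6 digits of microseconds.
USEC_WIDTH = 20

# Hypothetical fast path: strip the separators from the raw database string
# and right-pad with zeros in case trailing zeros were truncated.
def fast_cache_version(raw)
  raw.delete("- :.").ljust(USEC_WIDTH, "0")
end

# Hypothetical slow path: parse into a Time (assumed UTC) and format it.
def slow_cache_version(raw)
  Time.parse("#{raw} UTC").getutc.strftime("%Y%m%d%H%M%S%6N")
end

samples = [
  "2018-10-15 20:02:15.266505", # full microsecond precision
  "2018-10-12 05:00:00",        # fraction dropped entirely
  "2018-10-12 05:00:00.5"       # trailing zeros truncated
]

samples.each do |raw|
  puts "#{raw.inspect}: fast=#{fast_cache_version(raw)} slow=#{slow_cache_version(raw)}"
end
```

Both paths should print the same 20-character key for each sample; only the slow path allocates a Time object along the way.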
Diffstat (limited to 'activerecord/lib')
-rw-r--r-- activerecord/lib/active_record/integration.rb 48
1 files changed, 46 insertions, 2 deletions
diff --git a/activerecord/lib/active_record/integration.rb b/activerecord/lib/active_record/integration.rb
index 456689ec6d..43f6afbb42 100644
--- a/activerecord/lib/active_record/integration.rb
+++ b/activerecord/lib/active_record/integration.rb
@@ -96,8 +96,14 @@ module ActiveRecord
     # Note, this method will return nil if ActiveRecord::Base.cache_versioning is set to
     # +false+ (which it is by default until Rails 6.0).
     def cache_version
-      if cache_versioning && timestamp = try(:updated_at)
-        timestamp.utc.to_s(:usec)
+      return unless cache_versioning
+      return unless has_attribute?("updated_at")
+
+      timestamp = updated_at_before_type_cast
+      if can_use_fast_cache_version?(timestamp)
+        raw_timestamp_to_cache_version(timestamp)
+      elsif timestamp = updated_at
+        timestamp.utc.to_s(cache_timestamp_format)
       end
     end
@@ -151,5 +157,43 @@ module ActiveRecord
         end
       end
     end
+
+    private
+      # Detects if the value before type cast
+      # can be used to generate a cache_version.
+      #
+      # The fast cache version only works with a
+      # string value directly from the database.
+      #
+      # We also must check if the timestamp format has been changed
+      # or if the timezone is not set to UTC then
+      # we cannot apply our transformations correctly.
+      def can_use_fast_cache_version?(timestamp)
+        timestamp.is_a?(String) &&
+          cache_timestamp_format == :usec &&
+          default_timezone == :utc &&
+          !updated_at_came_from_user?
+      end
+
+      # Converts a raw database string to `:usec`
+      # format.
+      #
+      # Example:
+      #
+      #   timestamp = "2018-10-15 20:02:15.266505"
+      #   raw_timestamp_to_cache_version(timestamp)
+      #   # => "20181015200215266505"
+      #
+      # Postgres truncates trailing zeros, https://bit.ly/2QUlXiZ
+      # to account for this we pad the output with zeros
+      def raw_timestamp_to_cache_version(timestamp)
+        key = timestamp.delete("- :.")
+        padding = 20 - key.length
+        if padding != 0
+          key << "0" * padding
+        else
+          key
+        end
+      end
   end
 end
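
The guard-then-fallback flow in this diff can be exercised without a database. The sketch below is illustrative only: `FakeRecord`, `FORMATS`, and `cache_version_for` are hypothetical names, the `from_user` flag stands in for `updated_at_came_from_user?`, and `strftime` patterns are assumed equivalents of the ActiveSupport `:usec` and `:number` formats.

```ruby
# Hypothetical stand-in for a record; not an ActiveRecord class.
FakeRecord = Struct.new(:raw_updated_at, :updated_at, :from_user, keyword_init: true)

# Assumed strftime equivalents of the named timestamp formats.
FORMATS = { usec: "%Y%m%d%H%M%S%6N", number: "%Y%m%d%H%M%S" }.freeze

def cache_version_for(record, format: :usec, timezone: :utc)
  raw = record.raw_updated_at
  # Fast path only for an untouched database string in the default format and UTC.
  fast = raw.is_a?(String) && format == :usec && timezone == :utc && !record.from_user
  if fast
    raw.delete("- :.").ljust(20, "0")
  elsif (time = record.updated_at)
    # Fallback: go through the Time object, as the method did before this change.
    time.getutc.strftime(FORMATS.fetch(format))
  end
end

from_db = FakeRecord.new(raw_updated_at: "2018-10-12 05:00:00",
                         updated_at: Time.utc(2018, 10, 12, 5, 0, 0),
                         from_user: false)
from_user = FakeRecord.new(raw_updated_at: "2018-10-12 05:00:00.25",
                           updated_at: Time.utc(2018, 10, 12, 5, 0, 0, 250_000),
                           from_user: true)

puts cache_version_for(from_db)                   # fast path
puts cache_version_for(from_user)                 # fallback: value assigned by the user
puts cache_version_for(from_db, format: :number)  # fallback: non-default format
```

Keeping the guard as a single boolean makes it easy to see that any one failing condition routes the call back to the Time-based path, so the cache key format never depends on which path produced it.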