diff options
author | schneems <richard.schneeman+foo@gmail.com> | 2018-09-07 21:41:55 -0500 |
---|---|---|
committer | schneems <richard.schneeman+foo@gmail.com> | 2018-10-17 11:05:05 -0500 |
commit | 04454839a1a07cacac58cdf756a6b8e3adde0ef5 (patch) | |
tree | e45385ac4bd770874fa76e0237987969184cf790 /activerecord/lib | |
parent | ead868315f9b0fedb351c9b451aa1f66a2dc8038 (diff) | |
download | rails-04454839a1a07cacac58cdf756a6b8e3adde0ef5.tar.gz rails-04454839a1a07cacac58cdf756a6b8e3adde0ef5.tar.bz2 rails-04454839a1a07cacac58cdf756a6b8e3adde0ef5.zip |
Use raw time string from DB to generate ActiveRecord#cache_version
Currently, the `updated_at` field is used to generate a `cache_version`. Some database adapters return this timestamp value as a string that must then be converted to a Time value. This process requires a lot of memory and even more CPU time. In the case where this value is only being used for a cache version, we can skip the Time conversion by using the string value directly.
- This PR preserves existing cache format by converting a UTC string from the database to `:usec` format.
- Some databases return an already converted Time object, in those instances, we can directly use `created_at`.
- The `updated_at_before_type_cast` can be a value that comes from either the database or the user. We only want to optimize the case where it is from the database.
- If the format of the cache version has been changed, we cannot apply this optimization, and it is skipped.
- If the format of the time in the database is not UTC, then we cannot use this optimization, and it is skipped.
Some databases (notably PostgreSQL) returns a variable length nanosecond value in the time string. If the value ends in a zero, then it is truncated For instance instead of `2018-10-12 05:00:00.000000` the value `2018-10-12 05:00:00` is returned. We detect this case and pad the remaining zeros to ensure consistent cache version generation.
Before: Total allocated: 743842 bytes (6626 objects)
After: Total allocated: 702955 bytes (6063 objects)
(743842 - 702955) / 743842.0 # => 5.4% ⚡️⚡️⚡️⚡️⚡️
Using the CodeTriage application and derailed benchmarks this PR shows between 9-11% (statistically significant) performance improvement versus the commit before it.
Special thanks to @lsylvester for helping to figure out a way to preserve the usec format and for helping with many implementation details.
Diffstat (limited to 'activerecord/lib')
-rw-r--r-- | activerecord/lib/active_record/integration.rb | 48 |
1 files changed, 46 insertions, 2 deletions
diff --git a/activerecord/lib/active_record/integration.rb b/activerecord/lib/active_record/integration.rb index 456689ec6d..43f6afbb42 100644 --- a/activerecord/lib/active_record/integration.rb +++ b/activerecord/lib/active_record/integration.rb @@ -96,8 +96,14 @@ module ActiveRecord # Note, this method will return nil if ActiveRecord::Base.cache_versioning is set to # +false+ (which it is by default until Rails 6.0). def cache_version - if cache_versioning && timestamp = try(:updated_at) - timestamp.utc.to_s(:usec) + return unless cache_versioning + return unless has_attribute?("updated_at") + + timestamp = updated_at_before_type_cast + if can_use_fast_cache_version?(timestamp) + raw_timestamp_to_cache_version(timestamp) + elsif timestamp = updated_at + timestamp.utc.to_s(cache_timestamp_format) end end @@ -151,5 +157,43 @@ module ActiveRecord end end end + + private + # Detects if the value before type cast + # can be used to generate a cache_version. + # + # The fast cache version only works with a + # string value directly from the database. + # + # We also must check if the timestamp format has been changed + # or if the timezone is not set to UTC then + # we cannot apply our transformations correctly. + def can_use_fast_cache_version?(timestamp) + timestamp.is_a?(String) && + cache_timestamp_format == :usec && + default_timezone == :utc && + !updated_at_came_from_user? + end + + # Converts a raw database string to `:usec` + # format. + # + # Example: + # + # timestamp = "2018-10-15 20:02:15.266505" + # raw_timestamp_to_cache_version(timestamp) + # # => "20181015200215266505" + # + # Postgres truncates trailing zeros, https://bit.ly/2QUlXiZ + # to account for this we pad the output with zeros + def raw_timestamp_to_cache_version(timestamp) + key = timestamp.delete("- :.") + padding = 20 - key.length + if padding != 0 + key << "0" * padding + else + key + end + end end end |