author schneems <richard.schneeman@gmail.com> 2016-04-20 15:39:45 -0500
committer schneems <richard.schneeman@gmail.com> 2016-04-20 15:39:45 -0500
commit 54243fecfc8867c79aa4adbfd9948ecd8dcbd8fc
tree a17e31a5c3efb23817bd8b145dc817ded42ead38
parent 697384df36a939e565b7c08725017d49dc83fe40
Speed up String#blank? Regex

Follow up on https://github.com/rails/rails/commit/697384df36a939e565b7c08725017d49dc83fe40#commitcomment-17184696.

The regex to detect a blank string `/\A[[:space:]]*\z/` will loop through every character in the string to ensure that all of them are a `:space:` type. We can invert this logic and instead look for any non-`:space:` characters. When that happens, we would return on the first character found and the regex engine does not need to keep looking.

Thanks @nellshamrell for the regex talk at LSRC.

By defining a "blank" string as any string that does not have a non-whitespace character (yes, double negative) we can get a substantial speed bump.

Also an inline regex is (barely) faster than a regex in a constant, since it skips the constant lookup. A regex literal is frozen by default.

```ruby
require 'benchmark/ips'

def string_generate
  str = " abcdefghijklmnopqrstuvwxyz\t".freeze
  str[rand(0..(str.length - 1))] * rand(0..23)
end

strings = 100.times.map { string_generate }

ALL_WHITESPACE_STAR = /\A[[:space:]]*\z/

Benchmark.ips do |x|
  x.report('current regex ') { strings.each {|str| str.empty? || ALL_WHITESPACE_STAR === str } }
  x.report('+ instead of * ') { strings.each {|str| str.empty? || /\A[[:space:]]+\z/ === str } }
  x.report('not a non-whitespace char') { strings.each {|str| str.empty? || !(/[[:^space:]]/ === str) } }
  x.compare!
end

# Warming up --------------------------------------
# current regex
#   1.744k i/100ms
# not a non-whitespace char
#   2.264k i/100ms
# Calculating -------------------------------------
# current regex
#   18.078k (± 8.9%) i/s - 90.688k
# not a non-whitespace char
#   23.580k (± 7.1%) i/s - 117.728k
# Comparison:
# not a non-whitespace char: 23580.3 i/s
# current regex : 18078.2 i/s - 1.30x slower
```

This makes the method roughly 30% faster `(23.580 - 18.078)/18.078 * 100`.

cc/ @fxn

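As a sanity check on the inverted logic described above, here is a minimal sketch comparing the two predicates on a handful of strings. The helper names `blank_old?` and `blank_new?`, the constant `OLD_BLANK_RE`, and the sample list are illustrative assumptions; only the two regexes are taken from this commit.

```ruby
# Hypothetical helper names for illustration; only the regexes come from the commit.
OLD_BLANK_RE = /\A[[:space:]]*\z/

# Previous approach: the whole string must consist of [[:space:]] characters.
def blank_old?(str)
  str.empty? || OLD_BLANK_RE === str
end

# New approach: the string is blank unless some non-[[:space:]] character exists.
def blank_new?(str)
  str.empty? || !(/[[:^space:]]/ === str)
end

# Assumed sample inputs: empty, ASCII whitespace, Unicode whitespace, and
# strings with a non-space character at various positions.
samples = ["", " ", "\t\n ", "\u00a0", "\u3000", "a", "  a  ", "\n\nx", " " * 1_000 + "z"]

# Collect any input on which the two predicates disagree.
mismatches = samples.reject { |s| blank_old?(s) == blank_new?(s) }
puts "mismatches: #{mismatches.inspect}"
```

If the two definitions are equivalent, as the commit message argues, the list of mismatches should come back empty for any set of valid strings, not just these samples.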
Diffstat (limited to 'activesupport')
-rw-r--r-- activesupport/lib/active_support/core_ext/object/blank.rb | 9
1 file changed, 3 insertions, 6 deletions
diff --git a/activesupport/lib/active_support/core_ext/object/blank.rb b/activesupport/lib/active_support/core_ext/object/blank.rb
index 71d411b6d6..f7efa1e01a 100644
--- a/activesupport/lib/active_support/core_ext/object/blank.rb
+++ b/activesupport/lib/active_support/core_ext/object/blank.rb
@@ -112,12 +112,9 @@ class String
   #
   # @return [true, false]
   def blank?
-    # In practice, the majority of blank strings are empty. As of this writing
-    # checking for empty? is about 3.5x faster than matching against the regexp
-    # in MRI, so we call the predicate first, and then fallback.
-    #
-    # The penalty for blank strings with whitespace or present ones is marginal.
-    empty? || BLANK_RE === self
+    # Regex check is slow, only check non-empty strings.
+    # A string is not blank if it contains a single non-space character.
+    empty? || !(/[[:^space:]]/ === self)
   end
 end
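
The patched line uses an inline literal instead of the `BLANK_RE` constant. A small sketch of why that does not rebuild the regex on every call, assuming MRI: a literal without interpolation evaluates to the same `Regexp` object each time, whereas an interpolated literal is rebuilt. The method names `inline_re` and `interpolated_re` are hypothetical and exist only for this illustration.

```ruby
# Hypothetical methods for illustration; neither is part of the patch.

# A plain literal: on MRI this is compiled once and the same object is reused.
def inline_re
  /[[:^space:]]/
end

# An interpolated literal (no /o flag): a new Regexp is built on each call.
def interpolated_re(cls = "space")
  /[[:^#{cls}:]]/
end

same_literal = inline_re.equal?(inline_re)
same_interpolated = interpolated_re.equal?(interpolated_re)

puts "literal reused: #{same_literal}"
puts "interpolated reused: #{same_interpolated}"
```

This only shows object identity; whether the inline form is measurably faster than the constant lookup is a separate question that the benchmark in the commit message addresses.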