author | Cory Gwin @gwincr11 <gwincr11@github.com> | 2017-11-03 21:26:36 -0400 |
---|---|---|
committer | Cory Gwin @gwincr11 <gwincr11@github.com> | 2017-11-17 20:45:43 -0600 |
commit | 7b44ffa21efe3b9608254701ffdf44f743b1a324 | |
tree | 14425d4c642443336983ffa7c202650131aaac91 /activesupport/lib | |
parent | 015239a729d7a247278659e7aa1116a3eddc1dc7 | |
Add support for multiple encodings in String.blank?
Motivation:
- When a string is encoded with `.encode("UTF-16LE")`, `.blank?` raises
an `Encoding::CompatibilityError` exception.
- We tested multiple implementations to see which was fastest;
rescuing the exception was the fastest
option we could find.
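
The failure described above can be reproduced in plain Ruby, without Rails. This is a minimal sketch: the local names `blank_re`, `utf16`, `outcome`, and `ascii_ok` are illustrative and not part of the patch, and the regexp is copied from `BLANK_RE` in the diff.

```ruby
# Same pattern as BLANK_RE in blank.rb; the literal has no fixed encoding.
blank_re = /\A[[:space:]]*\z/

# An ASCII-compatible string matches without any problem.
ascii_ok = blank_re.match?("  ")

# UTF-16LE is not ASCII-compatible, so the match cannot be attempted.
utf16 = "  ".encode("UTF-16LE")

outcome =
  begin
    blank_re.match?(utf16)
    :matched
  rescue Encoding::CompatibilityError
    # This is the exception the commit rescues inside String#blank?.
    :compatibility_error
  end

puts "ASCII-compatible match: #{ascii_ok}"
puts "UTF-16LE outcome: #{outcome}"
```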
Related Issues:
- #28953
Changes:
- Add a rescue to catch the exception.
- Add a `Concurrent::Map`, keyed by encoding, that caches the regex
objects built for each requested encoding.
- Use the new `Concurrent::Map` cache to return the correct regex for
the string being checked.
Diffstat (limited to 'activesupport/lib')
-rw-r--r-- | activesupport/lib/active_support/core_ext/object/blank.rb | 11 |
1 files changed, 10 insertions, 1 deletions
diff --git a/activesupport/lib/active_support/core_ext/object/blank.rb b/activesupport/lib/active_support/core_ext/object/blank.rb
index e42ad852dd..2ca431ab10 100644
--- a/activesupport/lib/active_support/core_ext/object/blank.rb
+++ b/activesupport/lib/active_support/core_ext/object/blank.rb
@@ -1,6 +1,7 @@
 # frozen_string_literal: true
 
 require "active_support/core_ext/regexp"
+require "concurrent/map"
 
 class Object
   # An object is blank if it's false, empty, or a whitespace string.
@@ -102,6 +103,9 @@ end
 
 class String
   BLANK_RE = /\A[[:space:]]*\z/
+  ENCODED_BLANKS = Concurrent::Map.new do |h, enc|
+    h[enc] = Regexp.new(BLANK_RE.source.encode(enc), BLANK_RE.options | Regexp::FIXEDENCODING)
+  end
 
   # A string is blank if it's empty or contains whitespaces only:
   #
@@ -119,7 +123,12 @@ class String
     # The regexp that matches blank strings is expensive. For the case of empty
     # strings we can speed up this method (~3.5x) with an empty? call. The
     # penalty for the rest of strings is marginal.
-    empty? || BLANK_RE.match?(self)
+    empty? ||
+      begin
+        BLANK_RE.match?(self)
+      rescue Encoding::CompatibilityError
+        ENCODED_BLANKS[self.encoding].match?(self)
+      end
   end
 end
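
The patched logic can be tried outside Rails with a standalone sketch like the one below. It makes two assumptions that differ from the actual change: a plain `Hash` with a default block stands in for `Concurrent::Map` (so no gem is needed, at the cost of thread safety), and the check lives in a hypothetical `BlankSketch.blank?` module method instead of `String#blank?`.

```ruby
# Standalone sketch of the rescue-and-cache approach from the diff.
# BlankSketch is a hypothetical name; it is not part of Active Support.
module BlankSketch
  BLANK_RE = /\A[[:space:]]*\z/

  # Assumption: Hash replaces Concurrent::Map here. It is not thread-safe,
  # but it shows the same lazy, per-encoding caching of the regexp.
  ENCODED_BLANKS = Hash.new do |cache, enc|
    cache[enc] = Regexp.new(
      BLANK_RE.source.encode(enc),
      BLANK_RE.options | Regexp::FIXEDENCODING
    )
  end

  def self.blank?(str)
    str.empty? ||
      begin
        # Fast path: works for ASCII-compatible encodings.
        BLANK_RE.match?(str)
      rescue Encoding::CompatibilityError
        # Slow path: use (and cache) a regexp in the string's own encoding.
        ENCODED_BLANKS[str.encoding].match?(str)
      end
  end
end

puts BlankSketch.blank?("  \n".encode("UTF-16LE"))
puts BlankSketch.blank?("my value".encode("UTF-16LE"))
```

Rescuing only on the failure path keeps the common case (ASCII-compatible strings) as a single `match?` call, which is consistent with the commit message's note that rescuing was the fastest option tested.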