author | Michael Koziarski <michael@koziarski.com> | 2009-08-31 12:16:22 -0700 |
---|---|---|
committer | Michael Koziarski <michael@koziarski.com> | 2009-09-04 09:25:38 +1200 |
commit | 9a73630d935e360f3dc896e50dd673afb97cf3b5 (patch) | |
tree | e404f7dbfc142a10b45b758f0cea23812abc9f23 /activesupport/lib/active_support/multibyte/utils.rb | |
parent | 5e6dab8b34152bc48c89032d20e5bda1511e28fb (diff) | |
Add verify and clean methods to ActiveSupport::Multibyte.
When accepting character input from outside of your application you can't
blindly trust that all strings are properly encoded. With these methods
you can check incoming strings and clean them up if necessary.
Signed-off-by: Michael Koziarski <michael@koziarski.com>
Conflicts:
activesupport/lib/active_support/multibyte.rb
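
As a rough illustration of the check described in the commit message, the sketch below mirrors only the Ruby 1.9+ branch of `verify` and `verify!` from the diff. `MultibyteSketch` is a hypothetical stand-in module, not part of Rails, used so the example runs without loading ActiveSupport; the sample strings are assumptions as well.

```ruby
# Hypothetical stand-in mirroring the Ruby 1.9+ branch of the commit.
# The real methods live on ActiveSupport::Multibyte.
module MultibyteSketch
  # True when every byte sequence is valid for the string's encoding.
  def self.verify(string)
    string.valid_encoding?
  end

  # Raises EncodingError when the string is not validly encoded.
  def self.verify!(string)
    raise EncodingError, "Found characters with invalid encoding" unless verify(string)
  end
end

# Assumed sample input: one well-formed UTF-8 string, and one carrying a
# lone Latin-1 byte (0xE9) that is not a valid UTF-8 sequence.
good = "caf\u00e9"
bad  = "caf\xE9".dup.force_encoding(Encoding::UTF_8)

good_ok = MultibyteSketch.verify(good)
bad_ok  = MultibyteSketch.verify(bad)

# verify! signals the problem with an exception instead of a boolean.
rejected = nil
begin
  MultibyteSketch.verify!(bad)
rescue EncodingError => e
  rejected = e.message
end
```

In an application, a check like this would typically sit at the boundary where external input enters (request parameters, uploaded files), so that invalid strings are rejected or cleaned before they reach the rest of the code.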
Diffstat (limited to 'activesupport/lib/active_support/multibyte/utils.rb')
-rw-r--r-- | activesupport/lib/active_support/multibyte/utils.rb | 61 |
1 files changed, 61 insertions, 0 deletions
diff --git a/activesupport/lib/active_support/multibyte/utils.rb b/activesupport/lib/active_support/multibyte/utils.rb
new file mode 100644
index 0000000000..acef84da91
--- /dev/null
+++ b/activesupport/lib/active_support/multibyte/utils.rb
@@ -0,0 +1,61 @@
+# encoding: utf-8
+
+module ActiveSupport #:nodoc:
+  module Multibyte #:nodoc:
+    if Kernel.const_defined?(:Encoding)
+      # Returns a regular expression that matches valid characters in the current encoding
+      def self.valid_character
+        VALID_CHARACTER[Encoding.default_internal.to_s]
+      end
+    else
+      def self.valid_character
+        case $KCODE
+        when 'UTF8'
+          VALID_CHARACTER['UTF-8']
+        when 'SJIS'
+          VALID_CHARACTER['Shift_JIS']
+        end
+      end
+    end
+
+    if 'string'.respond_to?(:valid_encoding?)
+      # Verifies the encoding of a string
+      def self.verify(string)
+        string.valid_encoding?
+      end
+    else
+      def self.verify(string)
+        if expression = valid_character
+          for c in string.split(//)
+            return false unless valid_character.match(c)
+          end
+        end
+        true
+      end
+    end
+
+    # Verifies the encoding of the string and raises an exception when it's not valid
+    def self.verify!(string)
+      raise EncodingError.new("Found characters with invalid encoding") unless verify(string)
+    end
+
+    if 'string'.respond_to?(:force_encoding)
+      # Removes all invalid characters from the string.
+      #
+      # Note: this method is a no-op in Ruby 1.9
+      def self.clean(string)
+        string
+      end
+    else
+      def self.clean(string)
+        if expression = valid_character
+          stripped = []; for c in string.split(//)
+            stripped << c if valid_character.match(c)
+          end; stripped.join
+        else
+          string
+        end
+      end
+    end
+  end
+end
\ No newline at end of file
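
The pre-1.9 branch of `clean` in the diff keeps only the characters that match the valid-character expression, while the 1.9 branch returns the string unchanged. The sketch below shows the same filter-and-rejoin idea on a modern Ruby, swapping the regular-expression match for a per-character `valid_encoding?` check. `CleanSketch` and the sample string are hypothetical and not part of the commit; this is one possible reading of the technique, not the Rails implementation.

```ruby
# Hypothetical illustration of the filter-and-rejoin approach used by the
# pre-1.9 branch of `clean`, written against Ruby 1.9+ String APIs.
module CleanSketch
  # Keeps only the characters that are valid in the string's encoding.
  def self.clean(string)
    string.each_char.select { |c| c.valid_encoding? }.join
  end
end

# Assumed sample input: a UTF-8 string with a stray Latin-1 byte (0xE9).
dirty   = "caf\xE9!".dup.force_encoding(Encoding::UTF_8)
cleaned = CleanSketch.clean(dirty)

# Since Ruby 2.1, String#scrub with an empty replacement drops invalid
# byte sequences in a single built-in call.
scrubbed = dirty.scrub("")
```

Dropping invalid bytes silently loses data, so callers that need to know whether anything was removed can compare the cleaned string against the original or call `verify` first.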