path: root/activesupport/lib/active_support/multibyte
Commit message / Author / Age / Files / Lines
* applies remaining conventions across the projectXavier Noria2016-08-061-10/+9
|
* normalizes indentation and whitespace across the projectXavier Noria2016-08-061-17/+17
|
* modernizes hash syntax in activesupportXavier Noria2016-08-061-1/+1
|
* applies new string literal convention in activesupport/libXavier Noria2016-08-062-19/+19
| The current code base is not uniform. After some discussion, we have chosen to go with double quotes by default.
* systematic revision of =~ usage in ASXavier Noria2016-07-221-1/+2
| Where appropriate, prefer the more concise Regexp#match?, String#include?, String#start_with?, and String#end_with?
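To illustrate the kind of rewrite this commit describes, here is a minimal sketch. The sample string and variable names are assumptions chosen for the example, not code from the commit, and `Regexp#match?` needs Ruby 2.4 or later.

```ruby
# Illustrative sample value (an assumption, not taken from the Rails source).
path = "active_support/multibyte/chars.rb"

# =~ returns the index of the match (or nil) and populates $~ as a side effect.
index = path =~ /multibyte/

# The predicate forms return true/false and skip the MatchData bookkeeping.
regexp_hit    = /multibyte/.match?(path)
substring_hit = path.include?("multibyte")
prefix_hit    = path.start_with?("active_support/")
suffix_hit    = path.end_with?(".rb")

puts [index, regexp_hit, substring_hit, prefix_hit, suffix_hit].inspect
```

When only a yes/no answer is needed, the predicate forms state the intent directly. The plain-string variants avoid the regexp engine entirely.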
* Merge pull request #12877 from aroben/extended-graphemesRafael França2015-12-311-13/+38
|\
| | Support extended grapheme clusters and UAX 29
| * Support extended grapheme clusters and UAX 29Adam Roben2013-11-131-0/+15
| | http://www.unicode.org/reports/tr29/tr29-21.html is the version of UAX 29 that corresponds to Unicode 6.2.0. Unicode.unpack_graphemes now implements all the rules listed there, including the ones for extended grapheme clusters.
| |
| | I added a new optional test, test/multibyte_grapheme_break_conformance.rb, that is heavily based on test/multibyte_normalization_conformance.rb, which runs the Unicode test suite.
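The difference between codepoints and grapheme clusters can be sketched with Ruby's built-in `String#grapheme_clusters` (Ruby 2.5 or later). This is an assumption-laden illustration of the UAX 29 idea only. It does not call `Unicode.unpack_graphemes` and says nothing about how that method is implemented.

```ruby
# "e" followed by U+0301 COMBINING ACUTE ACCENT: two codepoints, one user-perceived character.
accented = "e\u0301"

# UAX 29 rule GB3: do not break between CR and LF.
crlf = "\r\n"

accented_codepoints = accented.codepoints.length
accented_clusters   = accented.grapheme_clusters
crlf_clusters       = crlf.grapheme_clusters

puts "codepoints: #{accented_codepoints}, clusters: #{accented_clusters.length}"
puts "CR LF clusters: #{crlf_clusters.length}"
```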
| * Refactor Unicode.unpack_graphemes slightlyAdam Roben2013-11-131-13/+23
| | This will make it easier to add the rest of the rules listed in UAX 29.
* | Update #20737 to address feedbackSean Griffin2015-10-201-2/+5
| | Given that this pull request affects a mutable value, we need to test for and document the effects on the receiver in this case. Additionally, this pull request was missing a CHANGELOG entry.
* | Fixed slice! behavior: return nil for out-of-bound parametersGourav Tiwari2015-10-201-1/+2
| |
* | [ci skip] default_normalization_form accessing from UnicodeGaurav Sharma2015-09-291-1/+1
| |
* | File encoding is defaulted to utf-8 in Ruby >= 2.1Akira Matsuda2015-09-182-2/+0
| |
* | Update Unicode Version to 8.0.0Anshul Sharma2015-09-041-1/+1
| |
* | replace each with each_key when only the key is neededAaron Lasseigne2015-08-081-1/+1
| | Using each_key is faster and more intention revealing.
| |
| | Calculating -------------------------------------
| |                 each    31.378k i/100ms
| |             each_key    33.790k i/100ms
| | -------------------------------------------------
| |                 each    450.225k (± 7.0%) i/s -      2.259M
| |             each_key    494.459k (± 6.3%) i/s -      2.467M
| |
| | Comparison:
| |             each_key:   494459.4 i/s
| |                 each:   450225.1 i/s - 1.10x slower
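A minimal sketch of the substitution, assuming a small hash invented for the example (it is not taken from the Rails source):

```ruby
# Illustrative data (an assumption for this sketch).
table = { "a" => 1, "b" => 2, "c" => 3 }

# Hash#each yields key/value pairs even when the value is ignored.
keys_via_each = []
table.each { |key, _value| keys_via_each << key }

# Hash#each_key yields only the key, which states the intent directly.
keys_via_each_key = []
table.each_key { |key| keys_via_each_key << key }

puts keys_via_each == keys_via_each_key
```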
* | String#freeze optimizationsschneems2015-07-301-1/+1
| |
* | Freeze string literals when not mutated.schneems2015-07-191-1/+1
| | I wrote a utility that helps find areas where you could optimize your program using a frozen string instead of a string literal; it's called [let_it_go](https://github.com/schneems/let_it_go). After going through the output and adding `.freeze`, I was able to eliminate the creation of 1,114 string objects on EVERY request to [codetriage](codetriage.com). How does this impact execution?
| |
| | To look at memory:
| |
| | ```ruby
| | require 'get_process_mem'
| |
| | mem = GetProcessMem.new
| | GC.start
| | GC.disable
| | before = mem.mb        # measure before allocating the strings
| | 1_114.times { " " }
| | after = mem.mb         # measure again after allocating them
| | GC.enable
| | puts "Diff: #{after - before} mb"
| | ```
| |
| | Creating 1,114 string objects results in `Diff: 0.03125 mb` of RAM allocated on every request. Or 1mb every 32 requests.
| |
| | To look at raw speed:
| |
| | ```ruby
| | require 'benchmark/ips'
| |
| | number_of_objects_reduced = 1_114
| |
| | Benchmark.ips do |x|
| |   x.report("freeze")    { number_of_objects_reduced.times { " ".freeze } }
| |   x.report("no-freeze") { number_of_objects_reduced.times { " " } }
| | end
| | ```
| |
| | We get the results
| |
| | ```
| | Calculating -------------------------------------
| |               freeze     1.428k i/100ms
| |            no-freeze   609.000  i/100ms
| | -------------------------------------------------
| |               freeze     14.363k (± 8.5%) i/s -     71.400k
| |            no-freeze      6.084k (± 8.1%) i/s -     30.450k
| | ```
| |
| | Now we can do some maths:
| |
| | ```ruby
| | number_of_objects_reduced = 1_114
| |
| | ips = 6_226 # iterations / 1 second
| | call_time_before = 1.0 / ips # seconds per iteration
| |
| | ips = 15_254 # iterations / 1 second
| | call_time_after = 1.0 / ips # seconds per iteration
| |
| | diff = call_time_before - call_time_after
| |
| | number_of_objects_reduced * diff * 100
| |
| | # => 0.4530373333993266 milliseconds saved per request (figure as reported by the author)
| | ```
| |
| | So we're shaving off 1 second of execution time for every 220 requests.
| |
| | Is this going to be an insane speed boost to any Rails app: nope. Should we merge it: yep.
| |
| | p.s. If you know of a method call that doesn't modify a string input such as [String#gsub](https://github.com/schneems/let_it_go/blob/b0e2da69f0cca87ab581022baa43291cdf48638c/lib/let_it_go/core_ext/string.rb#L37) please [give me a pull request to the appropriate file](https://github.com/schneems/let_it_go/blob/b0e2da69f0cca87ab581022baa43291cdf48638c/lib/let_it_go/core_ext/string.rb#L37), or open an issue in LetItGo so we can track and freeze more strings.
| |
| | Keep those strings Frozen
| |
| | ![](https://www.dropbox.com/s/z4dj9fdsv213r4v/let-it-go.gif?dl=1)
* | Merge pull request #20297 from gouravtiwari/patch-9Claudio B.2015-05-261-0/+6
|\ \
| | | Added multibyte slice! example to doc [ci skip]
| * | Added multibyte slice! example to doc [ci skip]Gourav Tiwari2015-05-261-0/+6
| | |
* | | Remove redundant 'like' from doc of slice! method [ci skip]Mehmet Emin İNAÇ2015-05-261-1/+1
|/ /
* | String already respond_to scrub at Ruby 2.2Rafael Mendonça França2015-01-041-2/+1
| |
* | Update to Unicode 7.0.0Benjamin Fleischer2014-11-151-1/+1
| | 7.0.0 was released on June 16, 2014
| | http://unicode-inc.blogspot.com.ar/2014/10/unicode-version-70-complete-text-of.html
| |
| | ruby bin/generate_tables
* | As of Unicode 6.3, Mongolian Vowel Separator is not whitespaceMatthew Draper2014-09-151-1/+0
| | Ruby 2.2 knows this, and no longer matches it with [[:space:]], so it's not a good candidate for testing String#squish.
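The behaviour described here can be checked with a short sketch, assuming Ruby 2.4 or later (for `String#match?`). The no-break space is included only as a contrasting character that `[[:space:]]` still matches. The variable names are made up for the example.

```ruby
mvs  = "\u180E" # MONGOLIAN VOWEL SEPARATOR, reclassified as a format character in Unicode 6.3
nbsp = "\u00A0" # NO-BREAK SPACE, still classified as whitespace

mvs_is_space  = mvs.match?(/[[:space:]]/)
nbsp_is_space = nbsp.match?(/[[:space:]]/)

puts "U+180E matches [[:space:]]: #{mvs_is_space}"
puts "U+00A0 matches [[:space:]]: #{nbsp_is_space}"
```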
* | Preload UnicodeDatabase outside the loopAkira Matsuda2014-08-181-0/+1
| | This fixes random multibyte_chars_test failures under Ruby 1.9.3. I don't know why the tests fail, and I really don't know why this fixes them. Maybe we need some more investigation...
* | formatAkira Matsuda2014-08-181-2/+1
| |
* | Prevent using String#scrub on RubiniusRobin Dupret2014-07-301-1/+2
| | Rubinius has built-in support for String#scrub, but it does not yet support ASCII-incompatible chars, so for now we should rely on the old implementation of #tidy_bytes.
* | Fix tidy_bytes for JRubyJustin Coyne2014-02-101-3/+3
| | The previous implementation was broken because JRuby (1.7.10) doesn't have a code converter for UTF-8 to UTF8-MAC.
* | use feature detection to decide which implementation to useAaron Patterson2014-02-081-1/+1
| | Decouple the code from the particular Ruby version.
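A hedged sketch of what feature detection can look like for the `String#scrub` case discussed in the neighbouring commits. The `tidy` helper and its fallback branch are hypothetical, written for this illustration. They are not ActiveSupport's `tidy_bytes`.

```ruby
# Hypothetical helper for illustration only; not the ActiveSupport implementation.
def tidy(string)
  if string.respond_to?(:scrub)
    # Feature detected: reinterpret each invalid byte sequence as Windows-1252.
    string.scrub do |bad|
      bad.encode(Encoding::UTF_8, Encoding::Windows_1252, invalid: :replace, undef: :replace)
    end
  else
    # Simplified fallback (an assumption): replace each invalid character with "?".
    string.chars.map { |char| char.valid_encoding? ? char : "?" }.join
  end
end

# 0xE9 is "é" in Windows-1252 but is not a valid UTF-8 sequence on its own.
broken = "caf\xE9".dup.force_encoding(Encoding::UTF_8)
tidied = tidy(broken)

puts tidied
```

Asking the object whether it responds to the method, rather than comparing `RUBY_VERSION`, also covers alternative implementations that ship the method on a different release schedule.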
* | Update to Unicode 6.3.0Norman Clarke2013-12-271-1/+1
| | 6.3.0 was released on September 30, 2013.
| | http://unicode-inc.blogspot.com.ar/2013/09/announcing-unicode-standard-version-63.html
* | Use String#scrub when available to tidy bytesNorman Clarke2013-12-261-35/+35
|/
* Initializing Codepoint object with default valuesHitendra Singh2013-09-201-0/+7
|
* Drying up method_missing codeHitendra Singh2013-09-201-2/+1
|
* compatability => compatibilityVipul A M2013-05-261-3/+3
|
* Use ruby's Encoding support for tidy_bytesBurke Libbey2013-05-081-39/+19
| The previous implementation was quite slow. This leverages some of the transcoding abilities built into Ruby 1.9 instead. It is roughly 96% faster.
|
| The roundtrip through UTF_8_MAC here is because Ruby won't let you transcode from UTF_8 to UTF_8. I chose the closest encoding I could find as an intermediate.
* Update to latest Unicode data.Norman Clarke2013-02-101-1/+1
| Release notes at: http://www.unicode.org/versions/Unicode6.2.0/
* Revert "Use flat_map { } instead of map {}.flatten"Santiago Pastorino2012-10-051-2/+2
| This reverts commit abf8de85519141496a6773310964ec03f6106f3f.
|
| We should take a deeper look at those cases; flat_map doesn't do deep flattening.
|
| irb(main):002:0> [[[1,3], [1,2]]].map{|i| i}.flatten
| => [1, 3, 1, 2]
| irb(main):003:0> [[[1,3], [1,2]]].flat_map{|i| i}
| => [[1, 3], [1, 2]]
* Use flat_map { } instead of map {}.flattenSantiago Pastorino2012-10-051-2/+2
|
* update AS/log_subscriber and AS/multibyte docs [ci skip]Francesco Rodriguez2012-09-142-45/+67
|
* Avoid unnecessary catching of Exception instead of StandardError.Dylan Smith2012-06-171-1/+1
|
* fix warning in Ruby2.0.0takkanm2012-06-111-1/+1
| ```
| rails/activesupport/lib/active_support/multibyte/chars.rb:136: warning: character class has duplicated range: /\b('?[\S])/
| ```
* make AS::Multibyte::Chars work w/o multibyte core extSergey Nartimov2012-05-281-1/+1
| Use ActiveSupport::Multibyte::Chars.new instead of String#mb_chars. This allows ActiveSupport::Multibyte::Chars to be used without requiring the String multibyte core extension.
* removing unnecessary 'examples' noise from activesupportFrancesco Rodriguez2012-05-132-14/+0
|
* Use respond_to_missing? for CharsMarc-Andre Lafortune2012-05-051-2/+2
|
* Update Unicode database to recently-released 6.1.Norman Clarke2012-02-031-1/+1
| http://www.geek.com/articles/geek-pick/unicode-6-1-released-complete-with-emoji-characters-and-a-pile-of-poo-2012022/
* Build fix when running isolated testArun Agrawal2012-02-011-0/+1
|
* Added as_json method for multibyte stringsDmitriy Vorotilin2012-02-011-0/+4
|
* Improve doc consistencyNorman Clarke2012-01-061-3/+3
|
* Implement Chars#swapcase.Norman Clarke2012-01-062-0/+16
|
* Use friendlier method nameNorman Clarke2012-01-051-2/+2
|
* Use friendlier method names for upcasing/downcasingNorman Clarke2012-01-052-12/+20
|
* Use more descriptive method namesNorman Clarke2012-01-052-8/+8
|