path: root/activesupport/lib/active_support/multibyte/unicode.rb
Commit message (Author, Age, Files, Lines)
* Enable `Layout/EmptyLinesAroundAccessModifier` cop (Ryuta Kamizono, 2019-06-13, 1 file, -1/+0)
|
| We sometimes say "✂️ newline after `private`" in a code review (e.g.
| https://github.com/rails/rails/pull/18546#discussion_r23188776,
| https://github.com/rails/rails/pull/34832#discussion_r244847195).
|
| Now the `Layout/EmptyLinesAroundAccessModifier` cop has a new enforced style,
| `EnforcedStyle: only_before` (https://github.com/rubocop-hq/rubocop/pull/7059).
|
| That cop and enforced style will reduce our code review cost.
* Deprecate Unicode's #pack_graphemes and #unpack_graphemes methods (Francesco Rodríguez, 2018-10-18, 1 file, -0/+10)
|
| in favor of `array.flatten.pack("U*")` and `string.scan(/\X/).map(&:codepoints)`, respectively.
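
The two replacements named in this deprecation can be tried in plain Ruby. This is a minimal sketch, not the Rails implementation; the sample string and variable names are made up for illustration.

```ruby
# Illustrative input (an assumption): "e" + COMBINING ACUTE ACCENT, then "a".
string = "e\u0301a"

# Suggested replacement for unpack_graphemes:
# \X matches one extended grapheme cluster, so each cluster
# becomes its own array of codepoints.
graphemes = string.scan(/\X/).map(&:codepoints)

# Suggested replacement for pack_graphemes:
# flatten the clusters and pack the codepoints back into a UTF-8 string.
packed = graphemes.flatten.pack("U*")

p graphemes        # [[101, 769], [97]]
p packed == string # true
```

The base letter and its combining mark stay together in one inner array, which is the property the deprecated helpers provided.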
* Deprecate Unicode#normalize and Chars#normalize (#34202) (Francesco Rodríguez, 2018-10-12, 1 file, -11/+22)
|
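
The subject line above does not name a replacement. Assuming the intended substitute is Ruby's built-in `String#unicode_normalize` (available since Ruby 2.2), a minimal sketch of the equivalent calls might look like this; the sample strings are illustrative only.

```ruby
composed   = "\u00E9"   # "é" as a single codepoint (NFC form)
decomposed = "e\u0301"  # "e" followed by COMBINING ACUTE ACCENT (NFD form)

# Normalize in both directions with the core String method.
nfc = decomposed.unicode_normalize(:nfc)
nfd = composed.unicode_normalize(:nfd)

p nfc == composed    # true
p nfd == decomposed  # true
p nfc.codepoints     # [233]
```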
* Deprecate Unicode#downcase/upcase/swapcase. (Francesco Rodríguez, 2018-10-12, 1 file, -10/+9)
|
| Use String methods directly instead.
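
As a rough illustration of "use String methods directly", the sketch below assumes Ruby 2.4 or later, where the core case-mapping methods handle non-ASCII characters. The sample word is made up.

```ruby
# Assumes Ruby 2.4+: core String case mapping covers non-ASCII letters.
word = "\u00DCn\u00EFcode" # "Ünïcode"

up   = word.upcase
down = word.downcase
swap = word.swapcase

p up   # "ÜNÏCODE"
p down # "ünïcode"
p swap # "üNÏCODE"
```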
* Remove `AS::Multibyte`'s unicode table (Fumiaki MATSUSHIMA, 2018-02-20, 1 file, -272/+15)
|
* Enable autocorrect for `Lint/EndAlignment` cop (Koichi ITO, 2018-01-18, 1 file, -1/+1)
|
| ### Summary
|
| This PR changes .rubocop.yml.
|
| Regarding code that uses `if ... else ... end`, I think the coding style that Rails expects is as follows.
|
| ```ruby
| var = if cond
|   a
| else
|   b
| end
| ```
|
| However, the current .rubocop.yml setting does not register an offense for the following code.
|
| ```ruby
| var = if cond
|         a
|       else
|         b
|       end
| ```
|
| I think the above code should be reported as an offense.
|
| Moreover, the layout produced by autocorrect is unnatural.
|
| ```ruby
| var = if cond
|   a
|       else
|         b
|       end
| ```
|
| This PR adds a setting to .rubocop.yml so that the offense is reported and autocorrected as the coding style expects.
| This change also fixes `case ... when ... end`.
|
| This PR itself is also an example of arranging the layout using `rubocop -a`.
|
| ### Other Information
|
| Autocorrect of the `Lint/EndAlignment` cop is `false` by default.
| https://github.com/bbatsov/rubocop/blob/v0.51.0/config/default.yml#L1443
|
| This PR changes this value to `true`.
|
| This PR also enables the `Layout/ElseAlignment` cop, which is necessary for this behavior.
* [Active Support] `rubocop -a --only Layout/EmptyLineAfterMagicComment` (Koichi ITO, 2017-07-11, 1 file, -0/+1)
|
* Use frozen-string-literal in ActiveSupport (Kir Shatrov, 2017-07-09, 1 file, -0/+1)
|
* Revert "Merge pull request #29540 from kirs/rubocop-frozen-string" (Matthew Draper, 2017-07-02, 1 file, -1/+0)
|
| This reverts commit 3420a14590c0e6915d8b6c242887f74adb4120f9, reversing
| changes made to afb66a5a598ce4ac74ad84b125a5abf046dcf5aa.
* Enforce frozen string in Rubocop (Kir Shatrov, 2017-07-01, 1 file, -0/+1)
|
* Define path with __dir__ (bogdanvlviv, 2017-05-23, 1 file, -1/+1)
|
| ".. with __dir__ we can restore order in the Universe." - by @fxn
|
| Related to 5b8738c2df003a96f0e490c43559747618d10f5f
* Update Unicode Version to 9.0.0 (Fumiaki MATSUSHIMA, 2017-01-28, 1 file, -8/+18)
|
| 9.0.0 was released on June 21, 2016
|
| http://blog.unicode.org/2016/06/announcing-unicode-standard-version-90.html
| http://www.unicode.org/versions/Unicode9.0.0/
|
| There are some changes about grapheme cluster in Unicode 9.0.0:
| http://unicode.org/reports/tr29/#Grapheme_Cluster_Boundary_Rules
|
| ------------
|
| I noticed that `unpack_graphemes` returns [Other] when the argument is Other ÷ Prepend
| (it must be [Other, Prepend]).
|
| But in [Unicode 8.0.0's Prepend has no characters](http://www.unicode.org/reports/tr29/tr29-27.html#Prepend)
| so we don't have to backport following patch:
|
| ```diff
|  should_break =
| +  if pos == eoc
| +    true
| ```
* No need to nodoc private methods (Akira Matsuda, 2016-12-24, 1 file, -1/+1)
|
* Add more rubocop rules about whitespaces (Rafael Mendonça França, 2016-10-29, 1 file, -14/+14)
|
* Remove dead constants (Fumiaki MATSUSHIMA, 2016-09-06, 1 file, -30/+0)
|
| It seems that we forgot to remove some code in
| https://github.com/rails/rails/commit/7ab47751068c6480e7e44fc9265a7e690dd4af3b
* fixes remaining RuboCop issues [Vipul A M, Xavier Noria] (Xavier Noria, 2016-09-01, 1 file, -12/+12)
|
* Add three new rubocop rules (Rafael Mendonça França, 2016-08-16, 1 file, -1/+1)
|
| Style/SpaceBeforeBlockBraces
| Style/SpaceInsideBlockBraces
| Style/SpaceInsideHashLiteralBraces
|
| Fix all violations in the repository.
* applies remaining conventions across the project (Xavier Noria, 2016-08-06, 1 file, -10/+9)
|
* normalizes indentation and whitespace across the project (Xavier Noria, 2016-08-06, 1 file, -17/+17)
|
* applies new string literal convention in activesupport/lib (Xavier Noria, 2016-08-06, 1 file, -8/+8)
|
| The current code base is not uniform. After some discussion,
| we have chosen to go with double quotes by default.
* Merge pull request #12877 from aroben/extended-graphemes (Rafael França, 2015-12-31, 1 file, -13/+38)
|\
| |
| | Support extended grapheme clusters and UAX 29
| * Support extended grapheme clusters and UAX 29 (Adam Roben, 2013-11-13, 1 file, -0/+15)
| |
| | http://www.unicode.org/reports/tr29/tr29-21.html is the version of UAX 29
| | that corresponds to Unicode 6.2.0. Unicode.unpack_graphemes now implements
| | all the rules listed there, including the ones for extended grapheme clusters.
| |
| | I added a new optional test, test/multibyte_grapheme_break_conformance.rb,
| | that is heavily based on test/multibyte_normalization_conformance.rb, which
| | runs the Unicode test suite.
| * Refactor Unicode.unpack_graphemes slightly (Adam Roben, 2013-11-13, 1 file, -13/+23)
| |
| | This will make it easier to add the rest of the rules listed in UAX 29.
* | [ci skip] default_normalization_form accessing from Unicode (Gaurav Sharma, 2015-09-29, 1 file, -1/+1)
| |
* | File encoding is defaulted to utf-8 in Ruby >= 2.1 (Akira Matsuda, 2015-09-18, 1 file, -1/+0)
| |
* | Update Unicode Version to 8.0.0 (Anshul Sharma, 2015-09-04, 1 file, -1/+1)
| |
* | replace each with each_key when only the key is needed (Aaron Lasseigne, 2015-08-08, 1 file, -1/+1)
| |
| | Using each_key is faster and more intention revealing.
| |
| | Calculating -------------------------------------
| |                 each    31.378k i/100ms
| |             each_key    33.790k i/100ms
| | -------------------------------------------------
| |                 each    450.225k (± 7.0%) i/s -      2.259M
| |             each_key    494.459k (± 6.3%) i/s -      2.467M
| |
| | Comparison:
| |             each_key:   494459.4 i/s
| |                 each:   450225.1 i/s - 1.10x slower
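
The change described above can be pictured with a small sketch. The hash below is hypothetical; the hash actually iterated in unicode.rb is not shown in this log.

```ruby
# Hypothetical lookup table, used only for illustration.
names = { 0x41 => "LATIN CAPITAL LETTER A", 0x42 => "LATIN CAPITAL LETTER B" }

# Before: each yields key and value, but the value is ignored.
via_each = []
names.each { |codepoint, _name| via_each << codepoint }

# After: each_key yields only the key, which states the intent directly.
via_each_key = []
names.each_key { |codepoint| via_each_key << codepoint }

p via_each == via_each_key # true
```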
* | String#freeze optimizations (schneems, 2015-07-30, 1 file, -1/+1)
| |
* | Freeze string literals when not mutated. (schneems, 2015-07-19, 1 file, -1/+1)
| |
| | I wrote a utility that helps find areas where you could optimize your program using a frozen string instead of a string literal, it's called [let_it_go](https://github.com/schneems/let_it_go). After going through the output and adding `.freeze` I was able to eliminate the creation of 1,114 string objects on EVERY request to [codetriage](codetriage.com). How does this impact execution?
| |
| | To look at memory:
| |
| | ```ruby
| | require 'get_process_mem'
| |
| | mem = GetProcessMem.new
| | GC.start
| | GC.disable
| | before = mem.mb            # measure before allocating the strings
| | 1_114.times { " " }
| |
| | after = mem.mb
| | GC.enable
| | puts "Diff: #{after - before} mb"
| | ```
| |
| | Creating 1,114 string objects results in `Diff: 0.03125 mb` of RAM allocated on every request. Or 1mb every 32 requests.
| |
| | To look at raw speed:
| |
| | ```ruby
| | require 'benchmark/ips'
| |
| | number_of_objects_reduced = 1_114
| |
| | Benchmark.ips do |x|
| |   x.report("freeze")    { number_of_objects_reduced.times { " ".freeze } }
| |   x.report("no-freeze") { number_of_objects_reduced.times { " " } }
| | end
| | ```
| |
| | We get the results
| |
| | ```
| | Calculating -------------------------------------
| |               freeze     1.428k i/100ms
| |            no-freeze   609.000  i/100ms
| | -------------------------------------------------
| |               freeze     14.363k (± 8.5%) i/s -     71.400k
| |            no-freeze      6.084k (± 8.1%) i/s -     30.450k
| | ```
| |
| | Now we can do some maths:
| |
| | ```ruby
| | ips = 6_226 # iterations / 1 second
| | call_time_before = 1.0 / ips # seconds per iteration
| |
| | ips = 15_254 # iterations / 1 second
| | call_time_after = 1.0 / ips # seconds per iteration
| |
| | diff = call_time_before - call_time_after
| |
| | number_of_objects_reduced * diff * 100
| |
| | # => 0.4530373333993266 milliseconds saved per request
| | ```
| |
| | So we're shaving off 1 second of execution time for every 220 requests.
| |
| | Is this going to be an insane speed boost to any Rails app: nope. Should we merge it: yep.
| |
| | p.s. If you know of a method call that doesn't modify a string input such as [String#gsub](https://github.com/schneems/let_it_go/blob/b0e2da69f0cca87ab581022baa43291cdf48638c/lib/let_it_go/core_ext/string.rb#L37) please [give me a pull request to the appropriate file](https://github.com/schneems/let_it_go/blob/b0e2da69f0cca87ab581022baa43291cdf48638c/lib/let_it_go/core_ext/string.rb#L37), or open an issue in LetItGo so we can track and freeze more strings.
| |
| | Keep those strings Frozen
| |
| | ![](https://www.dropbox.com/s/z4dj9fdsv213r4v/let-it-go.gif?dl=1)
* | String already respond_to scrub at Ruby 2.2 (Rafael Mendonça França, 2015-01-04, 1 file, -2/+1)
| |
* | Update to Unicode 7.0.0 (Benjamin Fleischer, 2014-11-15, 1 file, -1/+1)
| |
| | 7.0.0 was released on June 16, 2014
| | http://unicode-inc.blogspot.com.ar/2014/10/unicode-version-70-complete-text-of.html
| |
| | ruby bin/generate_tables
* | As of Unicode 6.3, Mongolian Vowel Separator is not whitespace (Matthew Draper, 2014-09-15, 1 file, -1/+0)
| |
| | Ruby 2.2 knows this, and no longer matches it with [[:space:]], so it's
| | not a good candidate for testing String#squish.
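
The behavior this message describes can be checked directly. The sketch below assumes Ruby 2.2 or later; the comparison with NO-BREAK SPACE is added here for contrast and is not part of the commit.

```ruby
mvs  = "\u180E" # MONGOLIAN VOWEL SEPARATOR, no longer whitespace as of Unicode 6.3
nbsp = "\u00A0" # NO-BREAK SPACE, still classified as whitespace

# =~ returns nil when the POSIX bracket class does not match.
mvs_is_space  = !(mvs =~ /[[:space:]]/).nil?
nbsp_is_space = !(nbsp =~ /[[:space:]]/).nil?

p mvs_is_space  # false on Ruby 2.2+
p nbsp_is_space # true
```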
* | Preload UnicodeDatabase outside the loop (Akira Matsuda, 2014-08-18, 1 file, -0/+1)
| |
| | This fixes a random multibyte_chars_test failure under Ruby 1.9.3.
| |
| | I don't know why the tests fail. And I really don't know why this fixes it.
| | Maybe we need some more investigation...
* | format (Akira Matsuda, 2014-08-18, 1 file, -2/+1)
| |
* | Prevent using String#scrub on Rubinius (Robin Dupret, 2014-07-30, 1 file, -1/+2)
| |
| | Rubinius has built-in support for String#scrub but doesn't yet support
| | ASCII-incompatible chars, so for now we should rely on the old
| | implementation of #tidy_bytes.
* | Fix tidy_bytes for JRuby (Justin Coyne, 2014-02-10, 1 file, -3/+3)
| |
| | The previous implementation was broken because JRuby (1.7.10) doesn't have a
| | code converter for UTF-8 to UTF8-MAC.
* | use feature detection to decide which implementation to use (Aaron Patterson, 2014-02-08, 1 file, -1/+1)
| |
| | Decouple the code from the particular Ruby version.
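
Feature detection of this kind usually means asking the object whether it supports a method instead of comparing `RUBY_VERSION`. The sketch below illustrates the idea with `String#scrub`; the helper name and the fallback branch are hypothetical and are not taken from unicode.rb.

```ruby
# Hypothetical helper: replaces invalid bytes with "?".
def replace_invalid_bytes(string)
  if string.respond_to?(:scrub)
    # Preferred path: the runtime provides String#scrub.
    string.scrub("?")
  else
    # Illustrative fallback for runtimes without String#scrub.
    string.chars.map { |char| char.valid_encoding? ? char : "?" }.join
  end
end

p replace_invalid_bytes("abc\xFFdef") # "abc?def"
```

Checking for the method keeps the code working on any runtime that provides it, whatever version string that runtime reports.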
* | Update to Unicode 6.3.0 (Norman Clarke, 2013-12-27, 1 file, -1/+1)
| |
| | 6.3.0 was released on September 30, 2013.
| |
| | http://unicode-inc.blogspot.com.ar/2013/09/announcing-unicode-standard-version-63.html
* | Use String#scrub when available to tidy bytes (Norman Clarke, 2013-12-26, 1 file, -35/+35)
|/
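
A block passed to `String#scrub` receives each run of invalid bytes and returns its replacement. The sketch below assumes, as a simplification, that every invalid byte should be read as Windows-1252; the method name is hypothetical and this is not the code from the commit.

```ruby
# Hypothetical sketch: reinterpret invalid bytes as Windows-1252 text.
def tidy_bytes_sketch(string)
  string.scrub do |bad|
    # encode(destination, source) treats the raw bytes as Windows-1252;
    # bytes with no mapping are replaced instead of raising.
    bad.encode(Encoding::UTF_8, Encoding::Windows_1252,
               invalid: :replace, undef: :replace)
  end
end

tidied = tidy_bytes_sketch("caf\xE9") # 0xE9 is "é" in Windows-1252
p tidied == "caf\u00E9"              # true
p tidied.valid_encoding?             # true
```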
* Initializing Codepoint object with default values (Hitendra Singh, 2013-09-20, 1 file, -0/+7)
|
* compatability => compatibility (Vipul A M, 2013-05-26, 1 file, -3/+3)
|
* Use ruby's Encoding support for tidy_bytes (Burke Libbey, 2013-05-08, 1 file, -39/+19)
|
| The previous implementation was quite slow. This leverages some of the
| transcoding abilities built into Ruby 1.9 instead. It is roughly 96% faster.
|
| The roundtrip through UTF_8_MAC here is because ruby won't let you
| transcode from UTF_8 to UTF_8. I chose the closest encoding I could
| find as an intermediate.
* Update to latest Unicode data. (Norman Clarke, 2013-02-10, 1 file, -1/+1)
|
| Release notes at: http://www.unicode.org/versions/Unicode6.2.0/
* Revert "Use flat_map { } instead of map {}.flatten" (Santiago Pastorino, 2012-10-05, 1 file, -2/+2)
|
| This reverts commit abf8de85519141496a6773310964ec03f6106f3f.
|
| We should take a deeper look at those cases;
| flat_map doesn't do deep flattening.
|
| irb(main):002:0> [[[1,3], [1,2]]].map{|i| i}.flatten
| => [1, 3, 1, 2]
| irb(main):003:0> [[[1,3], [1,2]]].flat_map{|i| i}
| => [[1, 3], [1, 2]]
* Use flat_map { } instead of map {}.flatten (Santiago Pastorino, 2012-10-05, 1 file, -2/+2)
|
* update AS/log_subscriber and AS/multibyte docs [ci skip] (Francesco Rodriguez, 2012-09-14, 1 file, -21/+31)
|
* Avoid unnecessary catching of Exception instead of StandardError. (Dylan Smith, 2012-06-17, 1 file, -1/+1)
|
* removing unnecessary 'examples' noise from activesupport (Francesco Rodriguez, 2012-05-13, 1 file, -3/+0)
|
* Update Unicode database to recently-released 6.1. (Norman Clarke, 2012-02-03, 1 file, -1/+1)
|
| http://www.geek.com/articles/geek-pick/unicode-6-1-released-complete-with-emoji-characters-and-a-pile-of-poo-2012022/
* Implement Chars#swapcase. (Norman Clarke, 2012-01-06, 1 file, -0/+8)
|