author | Kasper Timm Hansen <kaspth@gmail.com> | 2016-01-18 19:57:10 +0100 |
---|---|---|
committer | Kasper Timm Hansen <kaspth@gmail.com> | 2016-01-18 19:57:10 +0100 |
commit | 426b3127dd60e874a463ea022c85bc3eaf6ea4b3 | |
tree | 5b5c727cb31ca133b9b024c3646d943ff6fad5aa | |
parent | b505d45a5cd2343eb5b662cda5f48b4a63b397f1 | |
parent | da26934313a31ae530b7537aba8a7662152f4dfe | |
Merge pull request #23099 from vipulnsward/change_start_at_end_at
Changed options for find_each and variants to have options start/finish
-rw-r--r-- | activerecord/CHANGELOG.md | 4 |
-rw-r--r-- | activerecord/lib/active_record/relation/batches.rb | 67 |
-rw-r--r-- | activerecord/lib/active_record/relation/batches/batch_enumerator.rb | 12 |
-rw-r--r-- | activerecord/test/cases/batches_test.rb | 45 |
-rw-r--r-- | guides/source/active_record_querying.md | 20 |
5 files changed, 56 insertions, 92 deletions
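Note for callers: the new signatures declare only `start`, `finish` and `batch_size` (or `of` and `load`), so Ruby rejects `begin_at:` or `end_at:` as unknown keywords once this merge is applied. The sketch below shows one way a caller might translate the interim names before passing options on. `RENAMED_BATCH_OPTIONS` and `normalize_batch_options` are hypothetical names used for illustration; they are not part of Rails or of this commit, and the sketch assumes symbol keys.

```ruby
# Hypothetical helper, NOT part of Rails: maps the interim option names
# (begin_at/end_at) onto the names this commit settles on (start/finish).
RENAMED_BATCH_OPTIONS = { begin_at: :start, end_at: :finish }.freeze

def normalize_batch_options(options)
  options.each_with_object({}) do |(key, value), normalized|
    new_key = RENAMED_BATCH_OPTIONS.fetch(key, key)
    # Refuse ambiguous input such as { start: 1, begin_at: 2 }.
    if normalized.key?(new_key)
      raise ArgumentError, "#{new_key.inspect} was given under both its old and new name"
    end
    normalized[new_key] = value
  end
end

opts = normalize_batch_options({ begin_at: 2000, end_at: 10_000, batch_size: 5000 })
# opts can then be splatted into the renamed API inside a Rails app, e.g.
#   User.find_each(**opts) { |user| ... }   # needs ActiveRecord; not run here
```

Raising on a duplicate rather than silently preferring one value keeps a half-migrated call site from quietly changing which records are processed.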
diff --git a/activerecord/CHANGELOG.md b/activerecord/CHANGELOG.md
index e14a9a972c..3d251f95cc 100644
--- a/activerecord/CHANGELOG.md
+++ b/activerecord/CHANGELOG.md
@@ -634,7 +634,7 @@
 *   Add `ActiveRecord::Relation#in_batches` to work with records and relations
     in batches.

-    Available options are `of` (batch size), `load`, `begin_at`, and `end_at`.
+    Available options are `of` (batch size), `load`, `start`, and `finish`.

     Examples:

@@ -1282,7 +1282,7 @@

     *Yves Senn*

-*   `find_in_batches` now accepts an `:end_at` parameter that complements the `:start`
+*   `find_in_batches` now accepts an `:finish` parameter that complements the `:start`
     parameter to specify where to stop batch processing.

     *Vipul A M*
diff --git a/activerecord/lib/active_record/relation/batches.rb b/activerecord/lib/active_record/relation/batches.rb
index 221bc73680..54587ae18e 100644
--- a/activerecord/lib/active_record/relation/batches.rb
+++ b/activerecord/lib/active_record/relation/batches.rb
@@ -29,15 +29,15 @@ module ActiveRecord
     #
     # ==== Options
     # * <tt>:batch_size</tt> - Specifies the size of the batch. Default to 1000.
-    # * <tt>:begin_at</tt> - Specifies the primary key value to start from, inclusive of the value.
-    # * <tt>:end_at</tt> - Specifies the primary key value to end at, inclusive of the value.
+    # * <tt>:start</tt> - Specifies the primary key value to start from, inclusive of the value.
+    # * <tt>:finish</tt> - Specifies the primary key value to end at, inclusive of the value.
     #   This is especially useful if you want multiple workers dealing with
     #   the same processing queue. You can make worker 1 handle all the records
     #   between id 0 and 10,000 and worker 2 handle from 10,000 and beyond
-    #   (by setting the +:begin_at+ and +:end_at+ option on each worker).
+    #   (by setting the +:start+ and +:finish+ option on each worker).
     #
     #   # Let's process for a batch of 2000 records, skipping the first 2000 rows
-    #   Person.find_each(begin_at: 2000, batch_size: 2000) do |person|
+    #   Person.find_each(start: 2000, batch_size: 2000) do |person|
     #     person.party_all_night!
     #   end
     #
@@ -48,22 +48,15 @@ module ActiveRecord
     #
     # NOTE: You can't set the limit either, that's used to control
     # the batch sizes.
-    def find_each(begin_at: nil, end_at: nil, batch_size: 1000, start: nil)
-      if start
-        begin_at = start
-        ActiveSupport::Deprecation.warn(<<-MSG.squish)
-          Passing `start` value to find_each is deprecated, and will be removed in Rails 5.1.
-          Please pass `begin_at` instead.
-        MSG
-      end
+    def find_each(start: nil, finish: nil, batch_size: 1000)
       if block_given?
-        find_in_batches(begin_at: begin_at, end_at: end_at, batch_size: batch_size) do |records|
+        find_in_batches(start: start, finish: finish, batch_size: batch_size) do |records|
           records.each { |record| yield record }
         end
       else
-        enum_for(:find_each, begin_at: begin_at, end_at: end_at, batch_size: batch_size) do
+        enum_for(:find_each, start: start, finish: finish, batch_size: batch_size) do
           relation = self
-          apply_limits(relation, begin_at, end_at).size
+          apply_limits(relation, start, finish).size
         end
       end
     end
@@ -88,15 +81,15 @@ module ActiveRecord
     #
     # ==== Options
     # * <tt>:batch_size</tt> - Specifies the size of the batch. Default to 1000.
-    # * <tt>:begin_at</tt> - Specifies the primary key value to start from, inclusive of the value.
-    # * <tt>:end_at</tt> - Specifies the primary key value to end at, inclusive of the value.
+    # * <tt>:start</tt> - Specifies the primary key value to start from, inclusive of the value.
+    # * <tt>:finish</tt> - Specifies the primary key value to end at, inclusive of the value.
     #   This is especially useful if you want multiple workers dealing with
     #   the same processing queue. You can make worker 1 handle all the records
     #   between id 0 and 10,000 and worker 2 handle from 10,000 and beyond
-    #   (by setting the +:begin_at+ and +:end_at+ option on each worker).
+    #   (by setting the +:start+ and +:finish+ option on each worker).
     #
     #   # Let's process the next 2000 records
-    #   Person.find_in_batches(begin_at: 2000, batch_size: 2000) do |group|
+    #   Person.find_in_batches(start: 2000, batch_size: 2000) do |group|
     #     group.each { |person| person.party_all_night! }
     #   end
     #
@@ -107,24 +100,16 @@ module ActiveRecord
     #
     # NOTE: You can't set the limit either, that's used to control
     # the batch sizes.
-    def find_in_batches(begin_at: nil, end_at: nil, batch_size: 1000, start: nil)
-      if start
-        begin_at = start
-        ActiveSupport::Deprecation.warn(<<-MSG.squish)
-          Passing `start` value to find_in_batches is deprecated, and will be removed in Rails 5.1.
-          Please pass `begin_at` instead.
-        MSG
-      end
-
+    def find_in_batches(start: nil, finish: nil, batch_size: 1000)
       relation = self
       unless block_given?
-        return to_enum(:find_in_batches, begin_at: begin_at, end_at: end_at, batch_size: batch_size) do
-          total = apply_limits(relation, begin_at, end_at).size
+        return to_enum(:find_in_batches, start: start, finish: finish, batch_size: batch_size) do
+          total = apply_limits(relation, start, finish).size
           (total - 1).div(batch_size) + 1
         end
       end

-      in_batches(of: batch_size, begin_at: begin_at, end_at: end_at, load: true) do |batch|
+      in_batches(of: batch_size, start: start, finish: finish, load: true) do |batch|
         yield batch.to_a
       end
     end
@@ -153,18 +138,18 @@ module ActiveRecord
     # ==== Options
     # * <tt>:of</tt> - Specifies the size of the batch. Default to 1000.
     # * <tt>:load</tt> - Specifies if the relation should be loaded. Default to false.
-    # * <tt>:begin_at</tt> - Specifies the primary key value to start from, inclusive of the value.
-    # * <tt>:end_at</tt> - Specifies the primary key value to end at, inclusive of the value.
+    # * <tt>:start</tt> - Specifies the primary key value to start from, inclusive of the value.
+    # * <tt>:finish</tt> - Specifies the primary key value to end at, inclusive of the value.
     #
     # This is especially useful if you want to work with the
     # ActiveRecord::Relation object instead of the array of records, or if
     # you want multiple workers dealing with the same processing queue. You can
     # make worker 1 handle all the records between id 0 and 10,000 and worker 2
-    # handle from 10,000 and beyond (by setting the +:begin_at+ and +:end_at+
+    # handle from 10,000 and beyond (by setting the +:start+ and +:finish+
     # option on each worker).
     #
     #   # Let's process the next 2000 records
-    #   Person.in_batches(of: 2000, begin_at: 2000).update_all(awesome: true)
+    #   Person.in_batches(of: 2000, start: 2000).update_all(awesome: true)
     #
     # An example of calling where query method on the relation:
     #
@@ -186,10 +171,10 @@ module ActiveRecord
     #
     # NOTE: You can't set the limit either, that's used to control the batch
     # sizes.
-    def in_batches(of: 1000, begin_at: nil, end_at: nil, load: false)
+    def in_batches(of: 1000, start: nil, finish: nil, load: false)
       relation = self
       unless block_given?
-        return BatchEnumerator.new(of: of, begin_at: begin_at, end_at: end_at, relation: self)
+        return BatchEnumerator.new(of: of, start: start, finish: finish, relation: self)
       end

       if logger && (arel.orders.present? || arel.taken.present?)
@@ -197,7 +182,7 @@ module ActiveRecord
       end

       relation = relation.reorder(batch_order).limit(of)
-      relation = apply_limits(relation, begin_at, end_at)
+      relation = apply_limits(relation, start, finish)
       batch_relation = relation

       loop do
@@ -225,9 +210,9 @@ module ActiveRecord

     private

-    def apply_limits(relation, begin_at, end_at)
-      relation = relation.where(table[primary_key].gteq(begin_at)) if begin_at
-      relation = relation.where(table[primary_key].lteq(end_at)) if end_at
+    def apply_limits(relation, start, finish)
+      relation = relation.where(table[primary_key].gteq(start)) if start
+      relation = relation.where(table[primary_key].lteq(finish)) if finish
       relation
     end

diff --git a/activerecord/lib/active_record/relation/batches/batch_enumerator.rb b/activerecord/lib/active_record/relation/batches/batch_enumerator.rb
index 153aae9584..c6e39814dd 100644
--- a/activerecord/lib/active_record/relation/batches/batch_enumerator.rb
+++ b/activerecord/lib/active_record/relation/batches/batch_enumerator.rb
@@ -3,11 +3,11 @@ module ActiveRecord
     class BatchEnumerator
       include Enumerable

-      def initialize(of: 1000, begin_at: nil, end_at: nil, relation:) #:nodoc:
+      def initialize(of: 1000, start: nil, finish: nil, relation:) #:nodoc:
         @of = of
         @relation = relation
-        @begin_at = begin_at
-        @end_at = end_at
+        @start = start
+        @finish = finish
       end

       # Looping through a collection of records from the database (using the
@@ -34,7 +34,7 @@ module ActiveRecord
       def each_record
         return to_enum(:each_record) unless block_given?

-        @relation.to_enum(:in_batches, of: @of, begin_at: @begin_at, end_at: @end_at, load: true).each do |relation|
+        @relation.to_enum(:in_batches, of: @of, start: @start, finish: @finish, load: true).each do |relation|
           relation.to_a.each { |record| yield record }
         end
       end
@@ -46,7 +46,7 @@ module ActiveRecord
       #   People.in_batches.update_all('age = age + 1')
       [:delete_all, :update_all, :destroy_all].each do |method|
         define_method(method) do |*args, &block|
-          @relation.to_enum(:in_batches, of: @of, begin_at: @begin_at, end_at: @end_at, load: false).each do |relation|
+          @relation.to_enum(:in_batches, of: @of, start: @start, finish: @finish, load: false).each do |relation|
             relation.send(method, *args, &block)
           end
         end
@@ -58,7 +58,7 @@ module ActiveRecord
       #     relation.update_all(awesome: true)
       #   end
       def each
-        enum = @relation.to_enum(:in_batches, of: @of, begin_at: @begin_at, end_at: @end_at, load: false)
+        enum = @relation.to_enum(:in_batches, of: @of, start: @start, finish: @finish, load: false)
         return enum.each { |relation| yield relation } if block_given?
         enum
       end
diff --git a/activerecord/test/cases/batches_test.rb b/activerecord/test/cases/batches_test.rb
index da65336305..3602ee7ba2 100644
--- a/activerecord/test/cases/batches_test.rb
+++ b/activerecord/test/cases/batches_test.rb
@@ -38,7 +38,7 @@ class EachTest < ActiveRecord::TestCase
   if Enumerator.method_defined? :size
     def test_each_should_return_a_sized_enumerator
       assert_equal 11, Post.find_each(batch_size: 1).size
-      assert_equal 5, Post.find_each(batch_size: 2, begin_at: 7).size
+      assert_equal 5, Post.find_each(batch_size: 2, start: 7).size
       assert_equal 11, Post.find_each(batch_size: 10_000).size
     end
   end
@@ -101,16 +101,16 @@ class EachTest < ActiveRecord::TestCase

   def test_find_in_batches_should_start_from_the_start_option
     assert_queries(@total) do
-      Post.find_in_batches(batch_size: 1, begin_at: 2) do |batch|
+      Post.find_in_batches(batch_size: 1, start: 2) do |batch|
         assert_kind_of Array, batch
         assert_kind_of Post, batch.first
       end
     end
   end

-  def test_find_in_batches_should_end_at_the_end_option
+  def test_find_in_batches_should_finish_the_end_option
     assert_queries(6) do
-      Post.find_in_batches(batch_size: 1, end_at: 5) do |batch|
+      Post.find_in_batches(batch_size: 1, finish: 5) do |batch|
         assert_kind_of Array, batch
         assert_kind_of Post, batch.first
       end
@@ -175,7 +175,7 @@ class EachTest < ActiveRecord::TestCase

   def test_find_in_batches_should_not_modify_passed_options
     assert_nothing_raised do
-      Post.find_in_batches({ batch_size: 42, begin_at: 1 }.freeze){}
+      Post.find_in_batches({ batch_size: 42, start: 1 }.freeze){}
     end
   end

@@ -184,7 +184,7 @@ class EachTest < ActiveRecord::TestCase
     start_nick = nick_order_subscribers.second.nick

     subscribers = []
-    Subscriber.find_in_batches(batch_size: 1, begin_at: start_nick) do |batch|
+    Subscriber.find_in_batches(batch_size: 1, start: start_nick) do |batch|
       subscribers.concat(batch)
     end

@@ -311,15 +311,15 @@ class EachTest < ActiveRecord::TestCase
   def test_in_batches_should_start_from_the_start_option
     post = Post.order('id ASC').where('id >= ?', 2).first
     assert_queries(2) do
-      relation = Post.in_batches(of: 1, begin_at: 2).first
+      relation = Post.in_batches(of: 1, start: 2).first
       assert_equal post, relation.first
     end
   end

-  def test_in_batches_should_end_at_the_end_option
+  def test_in_batches_should_finish_the_end_option
     post = Post.order('id DESC').where('id <= ?', 5).first
     assert_queries(7) do
-      relation = Post.in_batches(of: 1, end_at: 5, load: true).reverse_each.first
+      relation = Post.in_batches(of: 1, finish: 5, load: true).reverse_each.first
       assert_equal post, relation.last
     end
   end
@@ -371,7 +371,7 @@ class EachTest < ActiveRecord::TestCase

   def test_in_batches_should_not_modify_passed_options
     assert_nothing_raised do
-      Post.in_batches({ of: 42, begin_at: 1 }.freeze){}
+      Post.in_batches({ of: 42, start: 1 }.freeze){}
     end
   end

@@ -380,7 +380,7 @@ class EachTest < ActiveRecord::TestCase
     start_nick = nick_order_subscribers.second.nick

     subscribers = []
-    Subscriber.in_batches(of: 1, begin_at: start_nick) do |relation|
+    Subscriber.in_batches(of: 1, start: start_nick) do |relation|
       subscribers.concat(relation)
     end

@@ -441,32 +441,11 @@ class EachTest < ActiveRecord::TestCase
     assert_equal 2, person.reload.author_id # incremented only once
   end

-  def test_find_in_batches_start_deprecated
-    assert_deprecated do
-      assert_queries(@total) do
-        Post.find_in_batches(batch_size: 1, start: 2) do |batch|
-          assert_kind_of Array, batch
-          assert_kind_of Post, batch.first
-        end
-      end
-    end
-  end
-
-  def test_find_each_start_deprecated
-    assert_deprecated do
-      assert_queries(@total) do
-        Post.find_each(batch_size: 1, start: 2) do |post|
-          assert_kind_of Post, post
-        end
-      end
-    end
-  end
-
   if Enumerator.method_defined? :size
     def test_find_in_batches_should_return_a_sized_enumerator
       assert_equal 11, Post.find_in_batches(:batch_size => 1).size
       assert_equal 6, Post.find_in_batches(:batch_size => 2).size
-      assert_equal 4, Post.find_in_batches(batch_size: 2, begin_at: 4).size
+      assert_equal 4, Post.find_in_batches(batch_size: 2, start: 4).size
       assert_equal 4, Post.find_in_batches(:batch_size => 3).size
       assert_equal 1, Post.find_in_batches(:batch_size => 10_000).size
     end
diff --git a/guides/source/active_record_querying.md b/guides/source/active_record_querying.md
index 674f498ae4..784be91845 100644
--- a/guides/source/active_record_querying.md
+++ b/guides/source/active_record_querying.md
@@ -348,7 +348,7 @@ end

 The `find_each` method accepts most of the options allowed by the regular `find` method, except for `:order` and `:limit`, which are reserved for internal use by `find_each`.

-Three additional options, `:batch_size`, `:begin_at` and `:end_at`, are available as well.
+Three additional options, `:batch_size`, `:start` and `:finish`, are available as well.

 **`:batch_size`**

@@ -360,34 +360,34 @@ User.find_each(batch_size: 5000) do |user|
 end
 ```

-**`:begin_at`**
+**`:start`**

-By default, records are fetched in ascending order of the primary key, which must be an integer. The `:begin_at` option allows you to configure the first ID of the sequence whenever the lowest ID is not the one you need. This would be useful, for example, if you wanted to resume an interrupted batch process, provided you saved the last processed ID as a checkpoint.
+By default, records are fetched in ascending order of the primary key, which must be an integer. The `:start` option allows you to configure the first ID of the sequence whenever the lowest ID is not the one you need. This would be useful, for example, if you wanted to resume an interrupted batch process, provided you saved the last processed ID as a checkpoint.

 For example, to send newsletters only to users with the primary key starting from 2000, and to retrieve them in batches of 5000:

 ```ruby
-User.find_each(begin_at: 2000, batch_size: 5000) do |user|
+User.find_each(start: 2000, batch_size: 5000) do |user|
   NewsMailer.weekly(user).deliver_now
 end
 ```

-**`:end_at`**
+**`:finish`**

-Similar to the `:begin_at` option, `:end_at` allows you to configure the last ID of the sequence whenever the highest ID is not the one you need.
-This would be useful, for example, if you wanted to run a batch process, using a subset of records based on `:begin_at` and `:end_at`
+Similar to the `:start` option, `:finish` allows you to configure the last ID of the sequence whenever the highest ID is not the one you need.
+This would be useful, for example, if you wanted to run a batch process, using a subset of records based on `:start` and `:finish`

 For example, to send newsletters only to users with the primary key starting from 2000 up to 10000 and to retrieve them in batches of 5000:

 ```ruby
-User.find_each(begin_at: 2000, end_at: 10000, batch_size: 5000) do |user|
+User.find_each(start: 2000, finish: 10000, batch_size: 5000) do |user|
   NewsMailer.weekly(user).deliver_now
 end
 ```

 Another example would be if you wanted multiple workers handling the same
 processing queue. You could have each worker handle 10000 records by setting the
-appropriate `:begin_at` and `:end_at` options on each worker.
+appropriate `:start` and `:finish` options on each worker.

 #### `find_in_batches`

@@ -402,7 +402,7 @@ end

 ##### Options for `find_in_batches`

-The `find_in_batches` method accepts the same `:batch_size`, `:start` and `:finish` options as `find_each`.
+The `find_in_batches` method accepts the same `:batch_size`, `:start` and `:finish` options as `find_each`.

 Conditions
 ----------
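Note on the multi-worker use case: `apply_limits` uses `gteq` and `lteq`, so both `:start` and `:finish` are inclusive. Two workers whose ranges share a boundary id (one finishing at 10,000, the next starting at 10,000) would therefore both select that record. The sketch below shows one way to derive non-overlapping ranges, assuming integer primary keys. `worker_ranges` and `ids_in_range` are hypothetical helpers, not part of Rails; `ids_in_range` is a plain-Ruby stand-in for the database filter so the example runs without ActiveRecord.

```ruby
# Hypothetical helper, NOT part of Rails: splits the inclusive id range
# min_id..max_id into { start:, finish: } hashes of at most per_worker ids.
def worker_ranges(min_id, max_id, per_worker)
  raise ArgumentError, "per_worker must be positive" unless per_worker.positive?

  (min_id..max_id).step(per_worker).map do |first|
    { start: first, finish: [first + per_worker - 1, max_id].min }
  end
end

# Plain-Ruby stand-in for the inclusive bounds that apply_limits adds
# (id >= start AND id <= finish); real code would query the database.
def ids_in_range(ids, start: nil, finish: nil)
  ids.select { |id| (start.nil? || id >= start) && (finish.nil? || id <= finish) }
end

ids = (1..25).to_a

# Sharing the boundary id selects it twice, once per worker.
shared = ids_in_range(ids, finish: 10) & ids_in_range(ids, start: 10)

# Non-overlapping ranges cover every id exactly once.
ranges  = worker_ranges(1, 25, 10)
covered = ranges.flat_map { |range| ids_in_range(ids, **range) }

# Inside a Rails app each hash could be passed straight through, e.g.
#   User.find_each(**range, batch_size: 1000) { |user| ... }   # not run here
```

The ranges are computed over id values, not row counts, so gaps in the id sequence leave some workers with fewer records than others.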