├── lib
    ├── delete_in_batches
    │   └── version.rb
    └── delete_in_batches.rb
├── Gemfile
├── Rakefile
├── .gitignore
├── CHANGELOG.md
├── .github
    └── workflows
    │   └── build.yml
├── delete_in_batches.gemspec
├── LICENSE.txt
├── test
    ├── test_helper.rb
    └── delete_in_batches_test.rb
└── README.md

/lib/delete_in_batches/version.rb:
--------------------------------------------------------------------------------
module DeleteInBatches
  VERSION = "0.2.1"
end
--------------------------------------------------------------------------------
/Gemfile:
--------------------------------------------------------------------------------
source "https://rubygems.org"

gemspec

gem "rake"
gem "minitest"
gem "sqlite3"
gem "pg"
gem "mysql2"
--------------------------------------------------------------------------------
/Rakefile:
--------------------------------------------------------------------------------
require "bundler/gem_tasks"
require "rake/testtask"

task default: :test
Rake::TestTask.new do |t|
  t.libs << "test"
  t.pattern = "test/**/*_test.rb"
  t.warning = false
end
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
*.gem
*.rbc
.bundle
.config
.yardoc
Gemfile.lock
InstalledFiles
_yardoc
coverage
doc/
lib/bundler/man
pkg
rdoc
spec/reports
test/tmp
test/version_tmp
tmp
*.lock
--------------------------------------------------------------------------------
/CHANGELOG.md:
--------------------------------------------------------------------------------
## 0.2.1 (2021-03-06)

- Added warning for non-Postgres databases
- Added `sleep` option

## 0.2.0 (2020-05-27)

- Improved Active Record integration
- Dropped support for Rails < 5

## 0.1.0 (2018-09-05)

- Added support for MySQL

## 0.0.2 (2014-01-27)

- Added missing dependency

## 0.0.1 (2014-01-27)

- First release
--------------------------------------------------------------------------------
/.github/workflows/build.yml:
--------------------------------------------------------------------------------
name: build
on: [push, pull_request]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: ruby/setup-ruby@v1
        with:
          ruby-version: 3.2
          bundler-cache: true

      - run: bundle exec rake test

      - uses: ankane/setup-postgres@v1
        with:
          database: delete_in_batches_test
      - run: ADAPTER=postgresql bundle exec rake test

      - uses: ankane/setup-mysql@v1
        with:
          database: delete_in_batches_test
      - run: ADAPTER=mysql2 bundle exec rake test
--------------------------------------------------------------------------------
/delete_in_batches.gemspec:
--------------------------------------------------------------------------------
require_relative "lib/delete_in_batches/version"

Gem::Specification.new do |spec|
  spec.name = "delete_in_batches"
  spec.version = DeleteInBatches::VERSION
  spec.summary = "Fast batch deletes for Active Record and Postgres"
  spec.homepage = "https://github.com/ankane/delete_in_batches"
  spec.license = "MIT"

  spec.author = "Andrew Kane"
  spec.email = "andrew@ankane.org"

  spec.files = Dir["*.{md,txt}", "{lib}/**/*"]
  spec.require_path = "lib"

  spec.required_ruby_version = ">= 2.4"

  spec.add_dependency "activerecord", ">= 5"
end
--------------------------------------------------------------------------------
/LICENSE.txt:
--------------------------------------------------------------------------------
Copyright (c) 2014-2021 Andrew Kane

MIT License

Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
"Software"), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to
the following conditions:

The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
--------------------------------------------------------------------------------
/lib/delete_in_batches.rb:
--------------------------------------------------------------------------------
# dependencies
require "active_support"

# modules
require "delete_in_batches/version"

module DeleteInBatches
  # TODO use keyword arguments
  def delete_in_batches(options = {})
    batch_size = options[:batch_size] || 10000

    pk = "#{quoted_table_name}.#{quoted_primary_key}"
    sql_proc = proc { select(pk).limit(batch_size).to_sql }
    sql = connection.try(:unprepared_statement, &sql_proc) || sql_proc.call

    if %w(MySQL Mysql2 Mysql2Spatial).include?(connection.adapter_name)
      sql = "SELECT * FROM (#{sql}) AS t"
    end

    unless connection.adapter_name =~ /postg/i
      # TODO raise error
      warn "[delete_in_batches] Use in_batches(of: #{batch_size.to_i}).delete_all instead of this gem for non-Postgres databases"
    end

    while connection.delete("DELETE FROM #{quoted_table_name} WHERE #{pk} IN (#{sql})") == batch_size
      yield if block_given?
      sleep(options[:sleep]) if options[:sleep]
    end
  end
end

ActiveSupport.on_load(:active_record) do
  extend DeleteInBatches
end
--------------------------------------------------------------------------------
/test/test_helper.rb:
--------------------------------------------------------------------------------
require "bundler/setup"
Bundler.require(:default)
require "minitest/autorun"
require "minitest/pride"
require "active_record"

ActiveRecord::Base.logger = ActiveSupport::Logger.new(ENV["VERBOSE"] ? STDOUT : nil)

# rails does this in activerecord/lib/active_record/railtie.rb
ActiveRecord::Base.default_timezone = :utc
ActiveRecord::Base.time_zone_aware_attributes = true

# migrations
case ENV["ADAPTER"]
when "postgresql"
  ActiveRecord::Base.establish_connection(adapter: "postgresql", database: "delete_in_batches_test")
when "mysql2"
  ActiveRecord::Base.establish_connection(adapter: "mysql2", database: "delete_in_batches_test")
else
  ActiveRecord::Base.establish_connection(adapter: "sqlite3", database: ":memory:")
end

ActiveRecord::Schema.verbose = ENV["VERBOSE"]
ActiveRecord::Schema.define do
  create_table :tweets, force: true do |t|
    t.integer :user_id
  end

  create_table :users, force: true do |t|
  end
end

class Tweet < ActiveRecord::Base
  belongs_to :user
end

class User < ActiveRecord::Base
  has_many :tweets
end
--------------------------------------------------------------------------------
/test/delete_in_batches_test.rb:
--------------------------------------------------------------------------------
require_relative "test_helper"

class TestDeleteInBatches < Minitest::Test
  def setup
    User.delete_all
    Tweet.delete_all
  end

  def test_basic
    create_tweets
    Tweet.create!(user_id: 2)

    Tweet.where(user_id: 1).delete_in_batches(batch_size: 2)

    assert_equal 1, Tweet.count
    assert_equal 2, Tweet.first.user_id
  end

  def test_all
    Tweet.create!(user_id: 1)

    Tweet.delete_in_batches

    assert_equal 0, Tweet.count
  end

  def test_sleep
    create_tweets

    started_at = Time.now
    Tweet.where(user_id: 1).delete_in_batches(batch_size: 2, sleep: 0.01)
    assert_operator(Time.now - started_at, :>=, 0.05)
  end

  def test_progress
    create_tweets

    i = 0
    Tweet.where(user_id: 1).delete_in_batches(batch_size: 2) do
      i += 1
    end

    assert_equal 5, i
  end

  def test_association
    user = User.create!
    user.tweets.create!

    user.tweets.delete_in_batches

    assert_equal 0, user.tweets.count
  end

  def test_join
    user = User.create!
    user.tweets.create!

    Tweet.joins(:user).where(users: {id: user.id}).delete_in_batches

    assert_equal 0, Tweet.count
  end

  def create_tweets
    tweets = 10.times.map { {user_id: 1} }
    if Tweet.respond_to?(:insert_all)
      Tweet.insert_all(tweets)
    else
      Tweet.create!(tweets)
    end
  end
end
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# delete_in_batches

:fire: Fast batch deletes for Active Record and Postgres

[![Build Status](https://github.com/ankane/delete_in_batches/workflows/build/badge.svg?branch=master)](https://github.com/ankane/delete_in_batches/actions)

## Installation

Add this line to your application’s Gemfile:

```ruby
gem 'delete_in_batches'
```

## How to Use

Delete rows in batches

```ruby
Tweet.where(user_id: 1).delete_in_batches
```

**Important:** Be sure to test your query before running it in production

Change the batch size

```ruby
Tweet.where(user_id: 1).delete_in_batches(batch_size: 50000) # defaults to 10000
```

Sleep between batches

```ruby
Tweet.where(user_id: 1).delete_in_batches(sleep: 0.01)
```

Show progress

```ruby
Tweet.where(user_id: 1).delete_in_batches do
  puts "Another batch deleted"
end
```

Works with associations

```ruby
user.tweets.delete_in_batches
```

To delete all rows in a table, `TRUNCATE` is fastest.

```ruby
ActiveRecord::Base.connection.execute("TRUNCATE tweets")
```

## History

View the [changelog](https://github.com/ankane/delete_in_batches/blob/master/CHANGELOG.md)

**Note:** This project originally had the description “the fastest way to delete 100k+ rows with ActiveRecord” but a single `DELETE` statement will likely be faster. See [this discussion](https://github.com/ankane/delete_in_batches/issues/4) for more details.

## Contributing

Everyone is encouraged to help improve this project. Here are a few ways you can help:

- [Report bugs](https://github.com/ankane/delete_in_batches/issues)
- Fix bugs and [submit pull requests](https://github.com/ankane/delete_in_batches/pulls)
- Write, clarify, or fix documentation
- Suggest or add new features

To get started with development:

```sh
git clone https://github.com/ankane/delete_in_batches.git
cd delete_in_batches
bundle install
bundle exec rake test
```
--------------------------------------------------------------------------------
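The batching loop in `lib/delete_in_batches.rb` keeps issuing the same limited `DELETE` until a statement removes fewer rows than `batch_size`. The following is a rough, database-free sketch of that control flow, not the gem's implementation: an in-memory Array stands in for the table, and the names `delete_batch`, `delete_in_batches_sketch`, and `pause` are hypothetical helpers introduced only for illustration.

```ruby
# Hypothetical stand-in for one "DELETE ... WHERE pk IN (SELECT ... LIMIT n)"
# statement: removes up to batch_size matching rows from the Array and
# returns how many rows were removed.
def delete_batch(rows, batch_size, &matches)
  ids = rows.select(&matches).first(batch_size).map { |row| row[:id] }
  rows.reject! { |row| ids.include?(row[:id]) }
  ids.size
end

# Mirrors the shape of the gem's while loop: repeat while a statement
# deletes exactly batch_size rows, pausing between full batches if asked.
def delete_in_batches_sketch(rows, batch_size: 10_000, pause: nil, &matches)
  statements = 0
  full_batches = 0
  loop do
    statements += 1
    break if delete_batch(rows, batch_size, &matches) < batch_size
    full_batches += 1 # the gem yields to its progress block at this point
    sleep(pause) if pause
  end
  { statements: statements, full_batches: full_batches }
end

# Ten rows for user 1 plus one row for user 2, as in test_basic.
rows = (1..10).map { |id| { id: id, user_id: 1 } }
rows << { id: 11, user_id: 2 }

result = delete_in_batches_sketch(rows, batch_size: 2) { |row| row[:user_id] == 1 }
puts "#{result[:statements]} statements, #{result[:full_batches]} full batches"
```

With ten matching rows and a batch size of 2, the sketch runs six statements: five full batches, then a final one that deletes nothing and ends the loop. That matches `test_progress`, where the progress block is expected to run five times.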