Reduce default batch sizes for background data migrations
With defaults this large, it is easy to shoot yourself in the foot. Smaller numbers are generally safer,
and people can override them globally or set them per migration.
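
To make the size difference concrete, here is a minimal sketch of how one batch breaks down into sub-batches under the old and new defaults. The `sub_batch_ranges` helper is hypothetical and not part of the gem; it also assumes contiguous integer ids, which real tables rarely have, so treat the counts as an upper bound on rows touched per statement rather than an exact figure.

```ruby
# Hypothetical helper, not part of the gem: splits one batch of ids into
# consecutive sub-batch ranges. Assumes contiguous integer ids.
def sub_batch_ranges(first_id, batch_size, sub_batch_size)
  last_id = first_id + batch_size - 1
  (first_id..last_id).step(sub_batch_size).map do |start|
    # The final sub-batch is clamped so it never runs past the batch end.
    start..[start + sub_batch_size - 1, last_id].min
  end
end

old_defaults = sub_batch_ranges(1, 20_000, 1_000) # batch_size / sub_batch_size before this commit
new_defaults = sub_batch_ranges(1, 1_000, 100)    # batch_size / sub_batch_size after this commit

puts "old: #{old_defaults.size} sub-batches of up to #{old_defaults.first.size} rows"
puts "new: #{new_defaults.size} sub-batches of up to #{new_defaults.first.size} rows"
```

Under these assumptions each individual statement covers at most 100 rows instead of 1_000, and a single job run covers at most 1_000 rows instead of 20_000.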
fatkodima committed Jan 12, 2025
1 parent 374ff6f commit ce5da26
Showing 5 changed files with 13 additions and 9 deletions.
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,9 @@
## master (unreleased)

- Reduce default batch sizes for background data migrations

`batch_size` was 20_000, now 1_000; `sub_batch_size` was 1_000, now 100

- Remove deprecated code
- Drop support for PostgreSQL < 12
- Drop support for Ruby < 3.0 and Rails < 7.0
4 changes: 2 additions & 2 deletions lib/generators/online_migrations/templates/initializer.rb.tt
@@ -100,10 +100,10 @@ OnlineMigrations.configure do |config|
config.background_migrations.migrations_module = "OnlineMigrations::BackgroundMigrations"

# The number of rows to process in a single background migration run.
-config.background_migrations.batch_size = 20_000
+config.background_migrations.batch_size = 1_000

# The smaller batches size that the batches will be divided into.
-config.background_migrations.sub_batch_size = 1000
+config.background_migrations.sub_batch_size = 100

# The pause interval between each background migration job's execution (in seconds).
config.background_migrations.batch_pause = 0.seconds
8 changes: 4 additions & 4 deletions lib/online_migrations/background_migrations/config.rb
@@ -13,12 +13,12 @@ class Config
attr_accessor :migrations_module

# The number of rows to process in a single background migration run
-# @return [Integer] defaults to 20_000
+# @return [Integer] defaults to 1_000
#
attr_accessor :batch_size

# The smaller batches size that the batches will be divided into
-# @return [Integer] defaults to 1000
+# @return [Integer] defaults to 100
#
attr_accessor :sub_batch_size

@@ -61,8 +61,8 @@ class Config
def initialize
@migrations_path = "lib"
@migrations_module = "OnlineMigrations::BackgroundMigrations"
-@batch_size = 20_000
-@sub_batch_size = 1000
+@batch_size = 1_000
+@sub_batch_size = 100
@batch_pause = 0.seconds
@sub_batch_pause_ms = 100
@batch_max_attempts = 5
@@ -335,8 +335,8 @@ def perform_action_on_relation_in_background(model_name, conditions, action, upd
# defaults to `SELECT MIN(batch_column_name)`
# @option options [Integer] :max_value Value in the column the batching will end at,
# defaults to `SELECT MAX(batch_column_name)`
-# @option options [Integer] :batch_size (20_000) Number of rows to process in a single background migration run
-# @option options [Integer] :sub_batch_size (1000) Smaller batches size that the batches will be divided into
+# @option options [Integer] :batch_size (1_000) Number of rows to process in a single background migration run
+# @option options [Integer] :sub_batch_size (100) Smaller batches size that the batches will be divided into
# @option options [Integer] :batch_pause (0) Pause interval between each background migration job's execution (in seconds)
# @option options [Integer] :sub_batch_pause_ms (100) Number of milliseconds to sleep between each sub_batch execution
# @option options [Integer] :batch_max_attempts (5) Maximum number of batch run attempts
2 changes: 1 addition & 1 deletion lib/online_migrations/schema_statements.rb
@@ -12,7 +12,7 @@ module SchemaStatements
# @param column_name [String, Symbol]
# @param value value for the column. It is typically a literal. To perform a computed
# update, an Arel literal can be used instead
-# @option options [Integer] :batch_size (1000) size of the batch
+# @option options [Integer] :batch_size (1_000) size of the batch
# @option options [String, Symbol] :batch_column_name (primary key) option is for tables without primary key, in this
# case another unique integer column can be used. Example: `:user_id`
# @option options [Proc, Boolean] :progress (false) whether to show progress while running.
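
The commit message notes that the new defaults can be overridden globally or per migration. The sketch below illustrates that precedence with stand-in names: `DemoConfig` and `effective_batching` are hypothetical and are not the gem's code, and the rule that per-migration options win over the global config is an assumption based on the `@option` defaults documented above.

```ruby
# Stand-in for the gem's config object; holds only the two settings
# changed in this commit. Hypothetical, for illustration.
DemoConfig = Struct.new(:batch_size, :sub_batch_size)

# Hypothetical helper: per-migration options take precedence, and the
# global config fills in anything the caller did not specify.
def effective_batching(config, options = {})
  {
    batch_size: options.fetch(:batch_size, config.batch_size),
    sub_batch_size: options.fetch(:sub_batch_size, config.sub_batch_size),
  }
end

config = DemoConfig.new(1_000, 100) # the new defaults from this commit

p effective_batching(config)                     # nothing overridden
p effective_batching(config, batch_size: 5_000)  # one migration opts into larger batches
```

A migration that was tuned against the previous defaults could pass `batch_size: 20_000, sub_batch_size: 1_000` explicitly in the same way to keep its old behavior.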