Make backfill batch selection exclude rows inserted or updated after backfill start #652
Backfill only rows present at backfill start. This is the third approach to solving #583. The previous two are:
This is the most direct approach to solving the problem. At the same time as the up/down triggers are created to perform a backfill, a `_pgroll_needs_backfill` column is also created on the table to be backfilled. The column has a `DEFAULT` of `true`; the constant default ensures that this extra column can be added quickly, without holding an `ACCESS EXCLUSIVE` lock for a long time. The column is removed when the operation is rolled back or completed.

The up/down triggers are modified to set `_pgroll_needs_backfill` to `false` whenever they update a row.

The backfill itself is updated to select only rows having `_pgroll_needs_backfill` set to `true`. This ensures that the backfill updates only rows that existed before the triggers were installed and have not been touched by a trigger since. The backfill process still needs to read every row in the table, including those inserted or updated after backfill start, but it updates only the rows that are still flagged.

The main disadvantage of this approach is that backfill now requires an extra column to be created on the target table.
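
To make the flag logic concrete, here is a small in-memory sketch in Python. It is a toy model and not pgroll's implementation: the `Row` class, the `name_upper` column, and the `up_trigger`, `insert`, `update`, and `backfill` helpers are all hypothetical names invented for illustration. The `needs_backfill` field stands in for `_pgroll_needs_backfill`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Row:
    id: int
    name: str
    name_upper: Optional[str] = None  # hypothetical new column filled by the "up" trigger
    needs_backfill: bool = True       # stands in for _pgroll_needs_backfill (DEFAULT true)


def up_trigger(row: Row) -> None:
    """Toy "up" trigger: populate the new column and clear the flag."""
    row.name_upper = row.name.upper()
    row.needs_backfill = False


def insert(table: List[Row], row_id: int, name: str) -> None:
    """Insert a row after the triggers are installed; the trigger fires on it."""
    row = Row(row_id, name)
    up_trigger(row)
    table.append(row)


def update(table: List[Row], row_id: int, name: str) -> None:
    """Update a row after the triggers are installed; the trigger fires on it."""
    for row in table:
        if row.id == row_id:
            row.name = name
            up_trigger(row)


def backfill(table: List[Row], batch_size: int = 2) -> List[int]:
    """Read every row in id order, in batches, but touch only flagged rows.

    Returns the ids of the rows the backfill actually updated.
    """
    updated: List[int] = []
    last_id = 0
    while True:
        remaining = [r for r in sorted(table, key=lambda r: r.id) if r.id > last_id]
        batch = remaining[:batch_size]
        if not batch:
            return updated
        for row in batch:
            if row.needs_backfill:
                up_trigger(row)  # stands in for the UPDATE that fires the trigger
                updated.append(row.id)
        last_id = batch[-1].id


if __name__ == "__main__":
    # Rows 1-3 exist before the triggers are installed, so they start flagged.
    table = [Row(1, "a"), Row(2, "b"), Row(3, "c")]
    insert(table, 4, "d")    # inserted after backfill start: flag already cleared
    update(table, 2, "bb")   # updated after backfill start: flag already cleared
    print(backfill(table))   # prints [1, 3]
```

In this model the backfill still visits all four rows, but it writes only to rows 1 and 3, which mirrors the trade-off described above: every row is read, and only rows still flagged are updated.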
NOTE: We'd need to update some docs (especially the tutorial) to mention this new column if we go with this solution.