Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: merge back delete request when we are done processing all its splits #15968

Merged
merged 5 commits into from
Jan 28, 2025

Conversation

sandeepsukhani
Copy link
Contributor

What this PR does / why we need it:
By default, we split delete requests with line filters by 24-hour intervals. The splits are not visible to the user and are managed internally by creating a single entry per split in the database. The problem with that is that each split takes up space in the database and adds overhead every time we load requests. In this PR, I have added support for merging the splits back so that we have a single entry in the database for those requests. The splits would be merged only when we are done processing all the splits for a request.

Special notes for your reviewer:
I have also made the following few changes:

  • Changed the order of performing deletion and insertion from a boltdb write batch. Earlier we used to perform insertion before deletion which means if we have duplicate entries lined up in deletion and insertion, we will end up losing the entries since deletion updates were applied at the end. This change was required because we do have some similar database entries between sharded vs merged requests. I have checked the code, and this is the only place where we will have both inserts and deletes in the same batch, so this change should not cause any problems.
  • Removed code to parse deletion queries(selector) every time we load requests. Parsing deletion queries every time we load requests is one of the major contributors to increased memory usage. The parsed deletion query is used only when executing deletion, and we have checks in place to parse the query when required in those places.

Checklist

  • Tests updated

@sandeepsukhani sandeepsukhani requested a review from a team as a code owner January 27, 2025 08:32
@sandeepsukhani sandeepsukhani enabled auto-merge (squash) January 27, 2025 09:30
for _, req := range deleteRequests {
// do not consider requests which do not have an id. Request ID won't be set in some tests or there is a bug in our code for loading requests.
if req.RequestID == "" {
continue
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we log this information before continuing?

@sandeepsukhani sandeepsukhani merged commit 07ccf7c into main Jan 28, 2025
58 checks passed
@sandeepsukhani sandeepsukhani deleted the merge-back-delete-request-splits branch January 28, 2025 06:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants