4.0: use Ra checkpoints in rabbit_fifo for sub-linear time recovery of QQs on boot #10487

Closed
wants to merge 9 commits into from


the-mikedavis
Member

@the-mikedavis the-mikedavis commented Feb 5, 2024

This is the companion PR for rabbitmq/ra#415

rabbit_fifo currently has an ad-hoc checkpointing system where it saves {release_cursor, RaftIdx, State} effects in-memory periodically and emits them as the release cursor moves up. By building checkpointing into ra we can save the checkpoints on disk instead, reducing QQ memory footprint and enabling us to recover even very long queues in logarithmic time (w.r.t. the length of the queue). See rabbitmq/ra#415 for in-depth details about checkpoints.
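To illustrate the recovery idea, here is a minimal Python sketch. It is not Ra's or rabbit_fifo's Erlang implementation: the toy state machine (a running sum), the `Checkpoint` record, and the function names are all assumptions made for illustration. It only shows that recovering from a checkpoint replays the log entries after the checkpoint index while arriving at the same state as a full replay.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Checkpoint:
    """Hypothetical on-disk checkpoint: machine state as of a given log index."""
    index: int  # number of log entries already folded into `state`
    state: int  # toy machine state: a running sum of applied commands


def apply_command(state: int, command: int) -> int:
    """Toy state-machine transition; a real queue machine is far richer."""
    return state + command


def recover(log: List[int], checkpoint: Optional[Checkpoint] = None) -> Tuple[int, int]:
    """Rebuild the state from the log and return (state, entries_replayed).

    Without a checkpoint the whole log is replayed. With one, replay starts
    right after `checkpoint.index`.
    """
    if checkpoint is None:
        start, state = 0, 0
    else:
        start, state = checkpoint.index, checkpoint.state
    replayed = 0
    for command in log[start:]:
        state = apply_command(state, command)
        replayed += 1
    return state, replayed


def take_checkpoint(log: List[int], index: int) -> Checkpoint:
    """Build a checkpoint covering the first `index` log entries."""
    state, _ = recover(log[:index])
    return Checkpoint(index=index, state=state)


log = list(range(1, 1001))          # 1000 toy commands
full_state, full_replayed = recover(log)
checkpoint = take_checkpoint(log, 990)
fast_state, fast_replayed = recover(log, checkpoint)
```

In this sketch the replay cost depends on how far behind the end of the log the latest checkpoint sits, not on the total log length. How checkpoints are actually spaced and stored is described in rabbitmq/ra#415.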

Connects #8261

@the-mikedavis the-mikedavis self-assigned this Feb 5, 2024
@mergify mergify bot added the bazel label Feb 5, 2024
@michaelklishin michaelklishin changed the title from "Use Ra checkpoints in rabbit_fifo" to "Use Ra checkpoints in rabbit_fifo for constant time recovery of QQs on boot" Feb 5, 2024
@michaelklishin michaelklishin added this to the 4.0.0 milestone Feb 5, 2024
@michaelklishin michaelklishin changed the title from "Use Ra checkpoints in rabbit_fifo for constant time recovery of QQs on boot" to "4.0: use Ra checkpoints in rabbit_fifo for constant time recovery of QQs on boot" Feb 5, 2024
@the-mikedavis the-mikedavis changed the title from "4.0: use Ra checkpoints in rabbit_fifo for constant time recovery of QQs on boot" to "4.0: use Ra checkpoints in rabbit_fifo for sub-linear time recovery of QQs on boot" Feb 8, 2024
@kjnilsson kjnilsson mentioned this pull request Feb 28, 2024
12 tasks
@the-mikedavis the-mikedavis changed the base branch from main to qq-v4 February 29, 2024 15:50
@the-mikedavis the-mikedavis force-pushed the md-ra-checkpoints branch 3 times, most recently from 6ef156b to 309600f Compare February 29, 2024 20:49
@kjnilsson kjnilsson force-pushed the qq-v4 branch 5 times, most recently from 75291c7 to bb89f2d Compare March 5, 2024 14:29
@kjnilsson kjnilsson force-pushed the qq-v4 branch 3 times, most recently from a479434 to b6d9b85 Compare March 8, 2024 10:56
@kjnilsson kjnilsson mentioned this pull request Apr 2, 2024
9 tasks
@kjnilsson kjnilsson force-pushed the qq-v4 branch 3 times, most recently from e5db089 to 4bd8c1f Compare April 30, 2024 16:27
kjnilsson added 2 commits May 1, 2024 09:12
Create the new version without including any changes yet.

fix

QQ: force delete followers after leader has terminated.

Also try a longer sleep for mqtt_shared_SUITE so that the
delete operation stands a chance to time out and move on
to the forced deletion stage.

In some mixed machine version scenarios some followers will never
apply the poison pill command so we may as well force delete them
just in case.

QQ: skip test in amqp_client that cannot pass with mixed machine versions

QQ: remove dead code

Code relating to prior machine versions and state conversions.

formatting / readability

rabbit_fifo_prop_SUITE fixes
Also update rabbit_fifo_* suites to test more relevant code versions
where applicable.

add ff mock

QQ: always use the updated credit mode format

QQv4: use more compact consumer reference in settle, credit, return

This introduces a new type: consumer_key() which is either the consumer_id
or the raft index the checkout was processed at. If the consumer is
using one of the updated credit spec formats rabbit_fifo will use the
raft index as the primary key for the consumer such that the rabbit
fifo client can then use the more space efficient integer index
instead of the full consumer id in subsequent commands.

There is compatibility code to still accept the consumer id in
settle, return, discard and credit commands but this is slightly
slower and of course less space efficient.

The old form will be used in cases where the fifo client may have
already removed the local consumer state (as happens after a cancel).

Lots of test refactorings of the rabbit_fifo_SUITE to begin to use
the new forms.
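
The consumer_key() idea from the commit message above can be sketched roughly as follows. This is a Python illustration, not the Erlang code in rabbit_fifo: the `ConsumerTable` class, its method names, and the `uses_new_credit_format` flag are hypothetical, the tuple shape of the consumer id is an assumption, and the linear scan is only one plausible reason the compatibility path would be slower.

```python
from typing import Dict, Tuple, Union

ConsumerId = Tuple[str, str]          # assumed shape: (consumer tag, channel id)
ConsumerKey = Union[int, ConsumerId]  # raft index, or the full consumer id


class ConsumerTable:
    """Toy consumer registry keyed by a compact consumer key."""

    def __init__(self) -> None:
        self._by_key: Dict[ConsumerKey, dict] = {}

    def checkout(self, consumer_id: ConsumerId, raft_index: int,
                 uses_new_credit_format: bool) -> ConsumerKey:
        # Consumers on the updated credit format are keyed by the raft index
        # at which their checkout was processed; others keep the full id.
        key: ConsumerKey = raft_index if uses_new_credit_format else consumer_id
        self._by_key[key] = {"id": consumer_id, "settled": 0}
        return key

    def _find(self, ref: ConsumerKey) -> ConsumerKey:
        if ref in self._by_key:
            return ref  # fast path: caller passed the primary key
        # Compatibility path: caller passed the full consumer id for a
        # consumer that is keyed by raft index, so scan for it.
        for key, consumer in self._by_key.items():
            if consumer["id"] == ref:
                return key
        raise KeyError(ref)

    def settle(self, ref: ConsumerKey, count: int = 1) -> int:
        """Record `count` settled messages; return the running total."""
        consumer = self._by_key[self._find(ref)]
        consumer["settled"] += count
        return consumer["settled"]


table = ConsumerTable()
new_style_id = ("ctag-1", "chan-1")
key = table.checkout(new_style_id, raft_index=42, uses_new_credit_format=True)
table.settle(key)            # compact integer reference
table.settle(new_style_id)   # full consumer id, still accepted
```

Subsequent commands only need to carry the small integer key, while a client that has already dropped its local consumer state can still refer to the consumer by its full id.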
@kjnilsson kjnilsson force-pushed the qq-v4 branch 4 times, most recently from 6be53b0 to 1bc6b22 Compare May 13, 2024 08:48
@kjnilsson kjnilsson force-pushed the qq-v4 branch 3 times, most recently from 6b296a1 to 93dfb0f Compare May 16, 2024 14:06
@kjnilsson kjnilsson force-pushed the qq-v4 branch 5 times, most recently from 7753e09 to 84b15d3 Compare June 13, 2024 08:32
@kjnilsson kjnilsson force-pushed the qq-v4 branch 2 times, most recently from 5928e6c to b487f7e Compare June 17, 2024 14:24
@kjnilsson kjnilsson force-pushed the qq-v4 branch 3 times, most recently from 90d8cfa to 4fc8565 Compare July 2, 2024 14:35
@ansd ansd force-pushed the qq-v4 branch 2 times, most recently from 5f1122e to 95d1994 Compare July 12, 2024 11:53
@the-mikedavis
Member Author

This has been folded into #10637

@the-mikedavis the-mikedavis deleted the md-ra-checkpoints branch July 13, 2024 13:48
3 participants