Add a gauge for the effective machine version in ra_counters #426

the-mikedavis · 2024-03-12T16:22:26Z

This allows callers to cheaply query a server's current effective machine version.

Closes #424
This should be useful for rabbitmq/khepri#250

kjnilsson

One comment to look into, otherwise fine!

src/ra_server.erl

[Why]
The "client" side of `khepri_machine`, implemented in `process_sync_command/3`, has a retry mechanism for when `ra:process_command/3` returns an error such as `noproc`, `nodedown` or `shutdown`. However, this retry mechanism can't tell whether the state machine already received the command and just couldn't reply, for instance because a node is stopping or leadership is changing. Therefore, the same command may be submitted twice and thus processed twice. That's fine for idempotent commands, but it may not be acceptable for every transaction, for example. That's why we need a deduplication mechanism that ensures the same command is not applied multiple times.

[How]
Two new commands are introduced to implement the deduplication system:
* `#dedup{}`, which wraps the command to protect and assigns a unique reference to it
* `#dedup_ack{}`, which is sent at the end of the retry loop to let the state machine know that the "client" side received the reply

When the state machine receives a command wrapped in a `#dedup{}` command, it remembers the reply from the initial processing of that command. For any subsequent copy of the same `#dedup{}` (identified by the unique reference), the state machine does not apply the wrapped command again and simply returns the reply it remembered from the first application.

Later, when the state machine receives a `#dedup_ack{}`, it drops the cached reply for that reference.

In case the client never sends a `#dedup_ack{}`, the state machine also drops any expired cached entries. The expiration time is based on the command timeout. If the timeout is `infinity`, it defaults to 15 minutes.

The whole deduplication mechanism can be enabled or disabled through the new `protect_against_dups` command option, which takes a boolean. This option is off by default, except for R/W transactions. Thus, if the caller knows the transaction is idempotent, it can turn the dedup mechanism off.
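The flow described above can be modeled roughly as follows. This is a minimal sketch written in Python purely for illustration; it is not Khepri's Erlang implementation. The names `Dedup`, `DedupAck`, `Machine`, `expiry_for` and the explicit `now` argument are assumptions made for the example, and the real record fields and cache layout may differ.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional, Tuple

# Fallback lifetime of a cached reply when the command timeout is "infinity".
DEFAULT_EXPIRY_SECONDS = 15 * 60


@dataclass(frozen=True)
class Dedup:
    """Hypothetical stand-in for #dedup{}: wraps a command with a unique ref."""
    ref: str
    command: Any
    expiry: float  # absolute time after which the cached reply may be dropped


@dataclass(frozen=True)
class DedupAck:
    """Hypothetical stand-in for #dedup_ack{}: the client got the reply."""
    ref: str


def expiry_for(timeout_seconds: Optional[float], now: float) -> float:
    """Derive the expiry from the command timeout; None models `infinity`."""
    if timeout_seconds is None:
        timeout_seconds = DEFAULT_EXPIRY_SECONDS
    return now + timeout_seconds


class Machine:
    """Toy state machine that caches replies of deduplicated commands."""

    def __init__(self, apply_fn: Callable[[Any], Any]) -> None:
        self._apply_fn = apply_fn
        self._cache: Dict[str, Tuple[Any, float]] = {}  # ref -> (reply, expiry)

    def apply(self, command: Any, now: float) -> Any:
        # Drop expired entries first, in case a client never sent its ack.
        self._cache = {
            ref: entry for ref, entry in self._cache.items() if entry[1] > now
        }
        if isinstance(command, Dedup):
            if command.ref in self._cache:
                # Duplicate submission: return the remembered reply only.
                return self._cache[command.ref][0]
            reply = self._apply_fn(command.command)
            self._cache[command.ref] = (reply, command.expiry)
            return reply
        if isinstance(command, DedupAck):
            self._cache.pop(command.ref, None)
            return "ok"
        # Unprotected command: applied every time it is received.
        return self._apply_fn(command)
```

The time is passed in explicitly so that every replica replaying the same log makes the same expiry decisions; how the real state machine obtains its notion of time is not stated in the description above.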
Because the state machine's state grows with a new field and handles two new commands, we bump the machine version from 0 to 1.

V2: We now use the `effective_machine_version` counter provided by `ra_counters:counters/2` if it is available, as it is faster than querying the Ra server. If the counter is unavailable, we fall back to the query. The new counter is added by rabbitmq/ra#426 and will be used once a Ra release contains this change.
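The "counter first, query as fallback" lookup can be sketched as below. Again this is an illustrative Python model, not the Erlang code from either pull request: `read_counters` stands in for a call to `ra_counters:counters/2`, `query_server` stands in for the slower query to the Ra server, and both are passed in as hypothetical callables. Whether a missing counter shows up as a missing key or as no counters at all is an assumption here, so the sketch handles both.

```python
from typing import Callable, Mapping, Optional


def effective_machine_version(
    read_counters: Callable[[], Optional[Mapping[str, int]]],
    query_server: Callable[[], int],
) -> int:
    """Return the effective machine version, preferring the cheap counter."""
    counters = read_counters()
    if counters is not None and "effective_machine_version" in counters:
        # Fast path: the gauge is available, no round trip to the server.
        return counters["effective_machine_version"]
    # Slow path: the counter is unavailable (for example with an older Ra
    # release), so fall back to asking the server.
    return query_server()
```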

the-mikedavis requested a review from kjnilsson March 12, 2024 16:22

the-mikedavis self-assigned this Mar 12, 2024

the-mikedavis force-pushed the effective-machine-version-gauge branch from 6c1cfee to f0d0ef6 Compare March 12, 2024 16:24

the-mikedavis marked this pull request as draft March 13, 2024 16:37

the-mikedavis marked this pull request as ready for review March 14, 2024 17:47

kjnilsson requested changes Mar 28, 2024

src/ra_server.erl (outdated, resolved)

Add a gauge for the effective machine version in ra_counters

a66eac3

This allows callers to cheaply query a server's current effective machine version.

the-mikedavis force-pushed the effective-machine-version-gauge branch from f0d0ef6 to a66eac3 Compare March 28, 2024 13:35

kjnilsson approved these changes Mar 28, 2024

kjnilsson merged commit 0c4deea into main Mar 28, 2024
6 of 7 checks passed

the-mikedavis deleted the effective-machine-version-gauge branch March 28, 2024 13:48

the-mikedavis mentioned this pull request Mar 28, 2024

khepri_machine: Add command deduplication mechanism rabbitmq/khepri#250

Merged
