
generates erasure codes in-place using mutable references into shreds' payload #4609

Merged

Conversation

behzadnouri

Problem

When making Merkle shreds from data, parity shards are first generated externally in a vector of vectors:
https://github.com/anza-xyz/agave/blob/6ff4dee5f/ledger/src/shred/merkle.rs#L1325

and then copied into the coding shreds' payload:
https://github.com/anza-xyz/agave/blob/6ff4dee5f/ledger/src/shred/merkle.rs#L1346

There are also many intermediate vector allocations in the process.

Summary of Changes

The commit avoids this and minimizes allocations by first initializing all data and coding shreds in all erasure batches in a single vector. Then the erasure codes are generated and populated in-place using mutable references into the coding shreds' payload.

This change is similar to #4356.
This commit modifies the leader code when generating shreds, whereas #4356 updated the non-leader side when recovering shreds from erasure codes.
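
Below is a minimal sketch of the in-place approach, assuming the reed-solomon-erasure crate's encode_sep and a simplified, hypothetical shred layout (the shard size, struct fields, and slicing are placeholders for illustration, not the actual agave types):

```rust
// Sketch only: write parity bytes directly into mutable slices of the
// coding shreds' payloads, so no intermediate Vec<Vec<u8>> of parity
// shards and no extra copy are needed.
use reed_solomon_erasure::galois_8::ReedSolomon;

const ERASURE_SHARD_SIZE: usize = 1024; // placeholder shard size

struct Shred {
    // Placeholder layout: payload holds the erasure-coded shard up front.
    payload: Vec<u8>,
}

fn encode_in_place(
    shreds: &mut [Shred], // data shreds followed by pre-allocated coding shreds
    num_data: usize,
    num_code: usize,
) -> Result<(), reed_solomon_erasure::Error> {
    let (data, code) = shreds.split_at_mut(num_data);
    assert_eq!(code.len(), num_code);
    // Immutable views of the data shards ...
    let data: Vec<&[u8]> = data
        .iter()
        .map(|shred| &shred.payload[..ERASURE_SHARD_SIZE])
        .collect();
    // ... and mutable views into the coding shreds' payloads, filled in place.
    let mut code: Vec<&mut [u8]> = code
        .iter_mut()
        .map(|shred| &mut shred.payload[..ERASURE_SHARD_SIZE])
        .collect();
    ReedSolomon::new(num_data, num_code)?.encode_sep(&data, &mut code)
}
```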

@behzadnouri force-pushed the make-merkle-shreds-from-data-in-place branch 3 times, most recently from b066dff to 1492e51 on January 27, 2025 17:50
@behzadnouri requested a review from steviez on January 28, 2025 21:44
@behzadnouri force-pushed the make-merkle-shreds-from-data-in-place branch from 1492e51 to cb2040a on January 29, 2025 02:39
@behzadnouri force-pushed the make-merkle-shreds-from-data-in-place branch from cb2040a to b107d27 on January 29, 2025 22:49
@AshwinSekar left a comment

looks good, just some comments for my understanding

.take(num_data_shreds);
make_shreds_data(&mut common_header_data, data_header, chunks).map(Shred::ShredData)
});
if let Some(Shred::ShredData(shred)) = shreds.last() {


For my understanding, at this point there must be at least one data shred, right? Is it better to expect?

@behzadnouri (Author)

I believe that is true.
Though it is a fair sanity check, I am a bit hesitant to add new panics into the code, in case some subtle condition is missed.
I will add a debug_assert in a follow-up PR, though.
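
For illustration, such a check might look like the hypothetical debug_assert below (not part of this PR; shreds and Shred::ShredData as in the snippet above):

```rust
// Hypothetical follow-up check: the iterator above must have produced at
// least one shred, and the last one must be a data shred.
debug_assert!(matches!(shreds.last(), Some(Shred::ShredData(_))));
```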

// Split the data into erasure batches and initialize
// data and coding shreds for each batch.
let mut shreds = Vec::<ShredData>::new();
while data.len() >= 2 * chunk_size || data.len() == chunk_size {


For my understanding, the || data.len() == chunk_size here is to avoid having to perform the proof-size search in the last FEC set calculation below?

@AshwinSekar (Jan 30, 2025)

Also curious about this case: chunk_size < data.len() < 2 * chunk_size.
I understand that generally bigger FEC sets are better, so if we have 63 shreds' worth of entries, the code below will create one FEC set of 63:63 rather than doing something like 32:32 + 31:32.
Is there any downside to always doing this? i.e. removing the loop and making one big set of size data.len() / data_buffer_size, plus another padding set if necessary.

@behzadnouri (Author)

> the || data.len() == chunk_size here is to avoid having to perform the proof-size search in the last FEC set calculation below?

Right, though I guess it is very unlikely to be the case.

> Is there any downside to always doing this?

The only downside I can think of right now is that with larger erasure batches the Merkle tree, and hence the Merkle proof, becomes larger. If the Merkle proof has one additional entry, it takes an additional 20 bytes off each shred.
In other words, if you have 2 erasure batches of 32:32 each, and the data shreds in both are full (no more room in their data buffer), then the aggregated data does not fit in a single 64:64 batch, because each shred in the 64:64 batch uses an additional 20 bytes for the Merkle proof and so has a smaller capacity.
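
A rough back-of-the-envelope illustration of that trade-off; only the 20-byte proof entry comes from the comment above, and the per-shred capacity is a made-up placeholder:

```rust
// Illustrative arithmetic only.
const PROOF_ENTRY_SIZE: usize = 20; // bytes per additional Merkle proof entry
const CAPACITY_32: usize = 1_000; // hypothetical data capacity per shred in a 32:32 batch

fn main() {
    // Two full 32:32 batches: 64 data shreds at the larger capacity.
    let two_batches = 2 * 32 * CAPACITY_32;
    // One 64:64 batch: the Merkle tree has twice as many leaves, so each
    // shred carries one more proof entry and loses 20 bytes of capacity.
    let one_batch = 64 * (CAPACITY_32 - PROOF_ENTRY_SIZE);
    // The single big batch holds 64 * 20 = 1280 bytes less, so data that
    // exactly fills two 32:32 batches cannot fit in one 64:64 batch.
    assert_eq!(two_batches - one_batch, 64 * PROOF_ENTRY_SIZE);
}
```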

@behzadnouri merged commit 536c5bf into anza-xyz:master on Jan 30, 2025
47 checks passed
@behzadnouri deleted the make-merkle-shreds-from-data-in-place branch on January 30, 2025 18:30