Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BucketListDB Random Eviction Cache #4632

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

SirTyson
Copy link
Contributor

@SirTyson SirTyson commented Jan 29, 2025

Description

Resolves #3696

Adds a random eviction cache to LiveBucket using DiskIndex. Currently, I've set the default to cache up to 25% of Bucket entries, which puts total stellar-core memory usage at about 5-6 GB by default.

Checklist

  • Reviewed the contributing document
  • Rebased on top of master (no merge commits)
  • Ran clang-format v8.0.0 (via make format or the Visual Studio extension)
  • Compiles
  • Ran all tests
  • If change impacts performance, include supporting evidence per the performance document

@SirTyson SirTyson force-pushed the bl-random-eviction-cache branch from d8fafbb to 144938b Compare February 4, 2025 03:09
@SirTyson SirTyson marked this pull request as ready for review February 4, 2025 03:09
@SirTyson SirTyson requested review from dmkozh and sisuresh February 4, 2025 03:09
src/bucket/HotArchiveBucketIndex.h Outdated Show resolved Hide resolved
# than BUCKETLIST_DB_INDEX_CUTOFF. Note that this value does not impact
# Buckets smaller than BUCKETLIST_DB_INDEX_CUTOFF, as they are always
# completely held in memory.
BUCKETLIST_DB_CACHED_PERCENT = 25
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder how is one supposed to reason about this value. Are the RAM requirements roughly BL size * cached percent? Maybe it's worth adding a comment regarding this.
I'm also curious why do we use percentage here instead of e.g. the number of entries or even the rough RAM boundary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, usage is basically BL size * cached percentage. The reason I did percentage is primarily for easier implementation. Max number of items to cache is a poor metric, since we now have high variance in LedgerEntry size because of Soroban entries.

RAM based caps are more useful, but pose some implementation challenges. For example, suppose we need to cache a large WASM entry. If we use a random eviction cache, we might need to evict many entries to free up enough space for the WASM entry. It is also a little complicated to determine how you allocate RAM among the different buckets. Suppose you have a limit of 1 GB, how do we divy this up among the different buckets? You could do something like each bucket gets a cache size relative to how much of the total BucketList size the bucket makes up. However, we now have to initialize caches wrt total BucketList size, which is challenging since Bucket objects are created individually in a background thread well before they are required.

All of these are probably solvable issues, but I doubt anyone will actually modify this value. In practice we can probably manage this value ourselves as needed by changing the default value in point release should bucketList increase significantly. In practice this ratio is just the upper bound. Especially for smaller buckets, I imagine they will churn before filling up their cache anyway.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Max number of items to cache is a poor metric, since we now have high variance in LedgerEntry size because of Soroban entries.

But this value is set based on the entry count as well, isn't it? I just think it's much easier to argue about the RAM usage when you deal with a constant number of entries - given that this number is large enough and given the random eviction safe, I think it's safe to just multiply this by the mean entry size to get the RAM estimate. While I agree that this is unlikely to be changed, I find it hard to argue about this value even for ourselves, especially if BL grows and we just forget about this.

Copy link
Contributor

@dmkozh dmkozh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forgot to approve this; I think we can change the config semantics later if we want. This still needs to be rebased though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add persistent cache to BucketListDB
2 participants