
Refactored account tree in order to store last N reverse updates, compute opening for recent blocks #643

Open · wants to merge 15 commits into base: next

Conversation


@polydez polydez commented Jan 24, 2025

In this PR we refactored the account's in-memory storage to store the last $N$ reverse updates alongside the latest account SMT. Updates are persisted in files, and old updates are deleted. We also reorganized the storage files by moving them into the storage directory, and implemented a compute_opening method which constructs an account's opening for any recent block.

This solution is not expected to be significantly slower than constructing an opening for the SMT itself, but we will measure this in benchmarks.

We will need to reimplement the calculation of the reverse mutation set in order to keep the application of changes transactional (the current solution might leave the DB and in-memory states inconsistent if writing an update fails).
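The idea of keeping the last $N$ reverse updates and rewinding them to answer historical queries can be sketched roughly as follows. This is a minimal, hypothetical model using plain maps instead of the real SMT and node types; `ReverseUpdate`, `AccountTree`, and `value_at` are illustrative names, not the PR's API:

```rust
use std::collections::{HashMap, VecDeque};

/// Reverse update for one block: for every key written in that block,
/// the value it had *before* the block (None = key did not exist).
struct ReverseUpdate {
    block_num: u32,
    reverted: HashMap<u64, Option<u64>>,
}

/// Latest state plus a ring buffer of the last `max_updates` reverse updates.
struct AccountTree {
    latest: HashMap<u64, u64>,
    max_updates: usize,
    updates: VecDeque<ReverseUpdate>, // oldest at the front, newest at the back
}

impl AccountTree {
    fn new(max_updates: usize) -> Self {
        Self { latest: HashMap::new(), max_updates, updates: VecDeque::new() }
    }

    /// Apply a block's writes, recording the reverse update and evicting
    /// the oldest one once more than `max_updates` are retained.
    fn apply_block(&mut self, block_num: u32, writes: HashMap<u64, u64>) {
        let mut reverted = HashMap::new();
        for (key, value) in writes {
            reverted.insert(key, self.latest.insert(key, value));
        }
        self.updates.push_back(ReverseUpdate { block_num, reverted });
        if self.updates.len() > self.max_updates {
            self.updates.pop_front();
        }
    }

    /// Value of `key` as of `block_num`: start from the latest state and
    /// rewind reverse updates, newest first, until the target block.
    fn value_at(&self, key: u64, block_num: u32) -> Option<u64> {
        let mut value = self.latest.get(&key).copied();
        for update in self.updates.iter().rev() {
            if update.block_num <= block_num {
                break; // target block reached; stop rewinding
            }
            if let Some(previous) = update.reverted.get(&key) {
                value = *previous;
            }
        }
        value
    }
}

fn main() {
    let mut tree = AccountTree::new(4);
    tree.apply_block(1, HashMap::from([(7, 100), (8, 5)]));
    tree.apply_block(2, HashMap::from([(7, 200)]));
    assert_eq!(tree.value_at(7, 2), Some(200)); // latest value
    assert_eq!(tree.value_at(7, 1), Some(100)); // rewound by one block
    assert_eq!(tree.value_at(9, 2), None);      // never written
}
```

A real implementation would also rebuild the Merkle path against the rewound leaves; this sketch only shows the rewind bookkeeping.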

@polydez polydez self-assigned this Jan 24, 2025
# Conflicts:
#	Cargo.lock
#	Cargo.toml
#	crates/store/src/state.rs
@Mirko-von-Leipzig
Contributor

@polydez is this ready for review? Still marked as a draft so just checking.

@polydez
Contributor Author

polydez commented Jan 24, 2025

> @polydez is this ready for review? Still marked as a draft so just checking.

It's almost ready, thanks for noticing. I will update CHANGELOG.md and mark the PR as ready for review.

@polydez polydez changed the title Refactored account tree in order to store last $N$ reverse updates, compute opening for recent blocks Refactored account tree in order to store last N reverse updates, compute opening for recent blocks Jan 24, 2025
@polydez polydez marked this pull request as ready for review January 24, 2025 08:02

@Mirko-von-Leipzig Mirko-von-Leipzig left a comment


Some initial comments; still need to review the actual tree.

Do we need persistent storage for this? Could we not just store this in-memory only in a ring-buffer? Or are these huge? If they're huge then we will also have disk IO issues I think.

@@ -17,6 +17,7 @@ workspace = true
[dependencies]
deadpool-sqlite = { version = "0.9.0", features = ["rt_tokio_1"] }
hex = { version = "0.4" }
miden-crypto = { version = "0.13" }
Contributor


Should this not stay the workspace version so it gets the patch?

Contributor Author


It gets the patch as long as the crate has the same version (major.minor.* per semver) as the patched one. We usually specify a dependency in the Cargo.toml of a specific crate if it's used only there. But I'm not sure this is the best approach; I would rather specify all versions in a BoM (the workspace's Cargo.toml).

Contributor


Yeah it does get a bit weird.

Contributor Author


As far as I know, @bobbinth has a different opinion on this; we may need to discuss which approach is better. My motivation for specifying all versions in the workspace's Cargo.toml is that it should help reduce incompatibilities between dependency versions in our project (when the same structure/trait from different versions of a dependency is used across different crates of the same workspace). This approach is called a "bill of materials" (BoM) and helps developers manage dependency versions across the subprojects of complex projects. It won't eliminate all incompatibilities (dependencies outside the workspace may still depend on different versions), but I do think we should use it, at least to simplify dependency management within our workspace(s).

}
}

#[allow(async_fn_in_trait)]
Contributor


Are we sure we want to suppress this?

Contributor Author


Yes, we're going to use the trait only in our own code, so it's okay to suppress it (or perhaps even better to change it to #[expect(async_fn_in_trait)]).

Contributor Author

@polydez polydez Jan 27, 2025


Other approaches would be to desugar the async functions into methods returning impl Future<...>, or to use the async_trait crate (but I think we shouldn't use async_trait anyway).
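A minimal sketch of the desugared form, assuming a simplified storage trait. The `UpdatesStorage`/`InMemory` names and the hand-rolled `block_on` executor are illustrative, not from the PR; naming the return type explicitly avoids the `async_fn_in_trait` lint and lets callers see (and bound) the future type:

```rust
use std::collections::HashMap;
use std::future::Future;
use std::pin::pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// Desugared form: return `impl Future` instead of writing `async fn`.
// The `+ Send` bound is now expressible directly in the trait.
trait UpdatesStorage {
    fn load(&self, block_num: u32) -> impl Future<Output = Option<u64>> + Send;
}

struct InMemory(HashMap<u32, u64>);

impl UpdatesStorage for InMemory {
    fn load(&self, block_num: u32) -> impl Future<Output = Option<u64>> + Send {
        let value = self.0.get(&block_num).copied();
        async move { value } // future is immediately ready
    }
}

/// Minimal executor so the example runs without an async runtime.
fn block_on<F: Future>(fut: F) -> F::Output {
    fn noop(_: *const ()) {}
    fn clone(_: *const ()) -> RawWaker { RawWaker::new(std::ptr::null(), &VTABLE) }
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    let waker = unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) };
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(v) = fut.as_mut().poll(&mut cx) {
            return v;
        }
    }
}

fn main() {
    let storage = InMemory(HashMap::from([(5, 42)]));
    assert_eq!(block_on(storage.load(5)), Some(42));
    assert_eq!(block_on(storage.load(6)), None);
}
```

Return-position `impl Trait` in traits is stable since Rust 1.75, the same release that stabilized `async fn` in traits.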

Comment on lines +304 to +308
pub trait PersistentUpdatesStorage {
async fn load(&self, block_num: BlockNumber) -> Result<Option<Update>, DatabaseError>;
async fn save(&self, block_num: BlockNumber, update: &Update) -> Result<(), DatabaseError>;
async fn remove(&self, block_num: BlockNumber) -> Result<(), DatabaseError>;
}
Contributor


Why do we need a trait? If it's only for test purposes then I would prefer using something like a tempfs for testing over having traits/generics.

Contributor Author


Yes, we need this trait for testing and benchmarking. This approach also allows us to mock storage implementations in tests.
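A mock along those lines might look like this. This is a hypothetical sketch: the type aliases and `MockStorage` are stand-ins for the PR's real types, and `block_on` is a toy executor so the example runs without an async runtime:

```rust
use std::collections::HashMap;
use std::future::Future;
use std::pin::pin;
use std::sync::Mutex;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// Hypothetical stand-ins for the PR's real types.
pub type BlockNumber = u32;
pub type Update = Vec<u8>;
#[derive(Debug, PartialEq)]
pub struct DatabaseError;

#[allow(async_fn_in_trait)]
pub trait PersistentUpdatesStorage {
    async fn load(&self, block_num: BlockNumber) -> Result<Option<Update>, DatabaseError>;
    async fn save(&self, block_num: BlockNumber, update: &Update) -> Result<(), DatabaseError>;
    async fn remove(&self, block_num: BlockNumber) -> Result<(), DatabaseError>;
}

/// In-memory mock: no disk IO, usable in unit tests and benchmarks.
#[derive(Default)]
pub struct MockStorage {
    map: Mutex<HashMap<BlockNumber, Update>>,
}

impl PersistentUpdatesStorage for MockStorage {
    async fn load(&self, block_num: BlockNumber) -> Result<Option<Update>, DatabaseError> {
        Ok(self.map.lock().unwrap().get(&block_num).cloned())
    }
    async fn save(&self, block_num: BlockNumber, update: &Update) -> Result<(), DatabaseError> {
        self.map.lock().unwrap().insert(block_num, update.clone());
        Ok(())
    }
    async fn remove(&self, block_num: BlockNumber) -> Result<(), DatabaseError> {
        self.map.lock().unwrap().remove(&block_num);
        Ok(())
    }
}

/// Minimal executor so the example runs without an async runtime.
fn block_on<F: Future>(fut: F) -> F::Output {
    fn noop(_: *const ()) {}
    fn clone(_: *const ()) -> RawWaker { RawWaker::new(std::ptr::null(), &VTABLE) }
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    let waker = unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) };
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(v) = fut.as_mut().poll(&mut cx) {
            return v;
        }
    }
}

fn main() {
    let storage = MockStorage::default();
    block_on(storage.save(7, &vec![1, 2, 3])).unwrap();
    assert_eq!(block_on(storage.load(7)), Ok(Some(vec![1, 2, 3])));
    block_on(storage.remove(7)).unwrap();
    assert_eq!(block_on(storage.load(7)), Ok(None));
}
```

In production code the same trait would be implemented by the file-backed storage, so tests and benchmarks can swap the mock in without touching the tree logic.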

@polydez
Contributor Author

polydez commented Jan 28, 2025

I benchmarked my solution, which computes an opening for one of the recent blocks, against computing the opening in the latest account SMT.

For 10k accounts with 10k updates in the 1st block and 1k updates per each of remaining 99 blocks:

```
SMT opening elapsed (for 10000 accounts, average): 108 ms
AccountTree `compute_opening` elapsed (for 10000 accounts, average): 321 ms
```

So compute_opening over reverse updates is about 3 times slower than an opening in a single SMT. This is slower than I expected, but I think it's still pretty good, since computing the opening of a single account takes only ~0.03 ms (on an M3 Max MacBook Pro).

@polydez
Contributor Author

polydez commented Jan 30, 2025

> Do we need persistent storage for this? Could we not just store this in-memory only in a ring-buffer? Or are these huge? If they're huge then we will also have disk IO issues I think.

I apologize, I forgot to answer this question. We need persistent storage in order to reconstruct the in-memory updates on node restart. Otherwise, we would need to wait for $N$ blocks to be applied before being able to serve our future endpoint for up to $N$ recent proofs.
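A one-file-per-block layout under a storage directory, as described in the PR summary, could be sketched like this. This is hypothetical: `FileUpdatesStorage` and the `updates/<block>.dat` naming are illustrative, not the PR's actual implementation:

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical on-disk layout: one serialized reverse update per block,
/// stored as `<storage>/updates/<block_num>.dat`.
struct FileUpdatesStorage {
    dir: PathBuf,
}

impl FileUpdatesStorage {
    fn new(storage_dir: &Path) -> io::Result<Self> {
        let dir = storage_dir.join("updates");
        fs::create_dir_all(&dir)?;
        Ok(Self { dir })
    }

    fn path(&self, block_num: u32) -> PathBuf {
        self.dir.join(format!("{block_num}.dat"))
    }

    fn save(&self, block_num: u32, update: &[u8]) -> io::Result<()> {
        fs::write(self.path(block_num), update)
    }

    fn load(&self, block_num: u32) -> io::Result<Option<Vec<u8>>> {
        match fs::read(self.path(block_num)) {
            Ok(bytes) => Ok(Some(bytes)),
            Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(None),
            Err(e) => Err(e),
        }
    }

    fn remove(&self, block_num: u32) -> io::Result<()> {
        fs::remove_file(self.path(block_num))
    }

    /// On restart, list the block numbers that still have a persisted update,
    /// so the in-memory ring buffer can be rebuilt without replaying blocks.
    fn persisted_blocks(&self) -> io::Result<Vec<u32>> {
        let mut blocks = Vec::new();
        for entry in fs::read_dir(&self.dir)? {
            let name = entry?.file_name();
            if let Some(num) = name
                .to_str()
                .and_then(|n| n.strip_suffix(".dat"))
                .and_then(|n| n.parse().ok())
            {
                blocks.push(num);
            }
        }
        blocks.sort_unstable();
        Ok(blocks)
    }
}

fn main() -> io::Result<()> {
    // Demo in a process-unique temp directory.
    let base = std::env::temp_dir().join(format!("updates_demo_{}", std::process::id()));
    let storage = FileUpdatesStorage::new(&base)?;
    storage.save(10, b"reverse-update-bytes")?;
    assert_eq!(storage.load(10)?, Some(b"reverse-update-bytes".to_vec()));
    storage.remove(10)?;
    assert_eq!(storage.load(10)?, None);
    Ok(())
}
```

Evicting an old update then amounts to a single `remove` call for the block falling out of the retained window.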

@Mirko-von-Leipzig
Contributor

> Do we need persistent storage for this? Could we not just store this in-memory only in a ring-buffer? Or are these huge? If they're huge then we will also have disk IO issues I think.
>
> I apologize, I forgot to answer this question. We need persistent storage in order to reconstruct the in-memory updates on node restart. Otherwise, we would need to wait for N blocks to be applied before being able to serve our future endpoint for up to N recent proofs.

Is the alternative of rebuilding them on restart not possible? I'd prefer to minimize disk IO as much as possible - every new file is something that can lead to strange failures.

@polydez
Contributor Author

polydez commented Jan 30, 2025

> Is the alternative of rebuilding them on restart not possible? I'd prefer to minimize disk IO as much as possible - every new file is something that can lead to strange failures.

I was thinking about this, and it seems too expensive: we would need to construct the reverse mutation sets, which is only possible when we know the previous account hash for each recent block. Blocks contain all the required information, but this might require processing blocks from the genesis up to the latest block, because some accounts might have changed only in the first blocks. In the past we had the previous account hash in the BlockAccountUpdate structure (if I remember correctly), but we no longer do.
