
feat: store seed command in stress test CLI #657

Open · wants to merge 42 commits into base: next
Conversation

SantiagoPittella
Collaborator

@SantiagoPittella SantiagoPittella commented Jan 29, 2025

@bobbinth I'm applying your suggestion of a new structure (your comment here). The only thing I changed is that I create all the accounts beforehand.

I added some queries to check the size of each table, and the avg size of each entry in every table. This is the current output of the binary:

For example, this is the current output for 100k accounts:

Creating new faucet account...
Generated 100000 accounts in 19.377 seconds
Creating notes...
Created notes: inserted 393 blocks with avg insertion time 641 ms
Store file size every 50 blocks:
Block 0: 4096 bytes
Block 50: 2645048 bytes
Block 100: 72965512 bytes
Block 150: 137780152 bytes
Block 200: 201362360 bytes
Block 250: 264911800 bytes
Block 300: 328420280 bytes
Block 350: 392027064 bytes
Block 400: 455551928 bytes
Average growth rate: 1159154.7888040713 bytes per block
Total time: 274.344 seconds
DB Stats for account_deltas: 4096 bytes, 4096.0 bytes/entry
DB Stats for block_headers: 122880 bytes, 311.088607594937 bytes/entry
DB Stats for account_fungible_asset_deltas: 4096 bytes,  bytes/entry
DB Stats for notes: 460816384 bytes, 4608.16384 bytes/entry
DB Stats for account_non_fungible_asset_updates: 4096 bytes,  bytes/entry
DB Stats for nullifiers: 4370432 bytes, 43.7218087234894 bytes/entry
DB Stats for account_storage_map_updates: 4096 bytes,  bytes/entry
DB Stats for settings: 4096 bytes, 2048.0 bytes/entry
DB Stats for account_storage_slot_updates: 4096 bytes, 1365.33333333333 bytes/entry
DB Stats for transactions: 5644288 bytes, 56.244337488665 bytes/entry
DB Stats for accounts: 6062080 bytes, 60.6438446609712 bytes/entry
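
For reference, the "Average growth rate" figure above can be reproduced from the sampled sizes. A minimal sketch, assuming the rate is computed as (final size - initial size) / blocks inserted:

```python
# Hedged sketch: reproducing the "Average growth rate" metric from the
# sampled store sizes above. Values are copied from the 100k-account
# run's output; the formula is an assumption, not taken from the code.
initial_size = 4_096        # bytes at block 0
final_size = 455_551_928    # bytes at the last sample (block 400)
blocks_inserted = 393       # "inserted 393 blocks" above

growth_rate = (final_size - initial_size) / blocks_inserted
print(growth_rate)  # ≈ 1159154.79 bytes per block, matching the output
```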

This PR was built on top of #621.

@bobbinth
Contributor

I'm getting like 1.56 blocks/second. I'm now investigating what the bottleneck might be.

Do you know where the bottlenecks are? Taking a brief look at the code it seems like we are instantiating a lot of random number generators - these could be quite expensive (especially the RPO ones). So, I'd switch to lighter versions and also try to use the same instance as much as possible.

Store file size every 1k batches:
0: 4096 bytes
1000: 4096 bytes
2000: 78749696 bytes
3000: 160059392 bytes
4000: 238931968 bytes
5000: 318918656 bytes
6000: 397672448 bytes
7000: 477810688 bytes
Average growth rate: 1215792.854961832 bytes per batch

Something doesn't seem right here:

  • Why does the first batch of 1000 blocks not affect storage size? I guess after that we get consistent growth of about 75MB per 1000 blocks.
  • 7000 blocks with 256 accounts created per block should result in 1.8M accounts. Where does the 100K accounts number come from?
  • Related to the above, if there are only 100K accounts, a database size of almost 500MB doesn't really make a lot of sense (this would imply almost 5MB per account). If it is more like 1.8M then it seems a bit low (about 270 bytes per account) - though, maybe it is possible.
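
The per-account arithmetic in the last bullet can be checked quickly. A sketch using the last sampled size, under the 1.8M-account interpretation:

```python
# Bytes per account if the 7,000 batches really created ~1.8M accounts.
# Sizes are copied from the sampled output above; this is a sanity check
# of the estimate, not a measured value.
db_size = 477_810_688          # bytes after 7,000 batches
accounts = 1_800_000
print(db_size / accounts)      # ≈ 265 bytes/account ("about 270" above)
```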

@Mirko-von-Leipzig
Contributor

Something doesn't seem right here:

  • Why does the first batch of 1000 blocks not affect storage size? I guess after that we get consistent growth of about 75MB per 1000 blocks.

I didn't check how size is measured; but the write could be stuck in the WAL file still?
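
One way to rule this out is to measure the main database file together with its WAL sidecar files. A minimal sketch (the helper name and the throwaway demo database are made up for illustration, not part of the stress-test code):

```python
import os
import sqlite3
import tempfile

def store_size(db_path: str) -> int:
    """Total on-disk size of an SQLite store, including the -wal and -shm
    sidecar files, since committed writes may still sit in the WAL."""
    return sum(
        os.path.getsize(db_path + suffix)
        for suffix in ("", "-wal", "-shm")
        if os.path.exists(db_path + suffix)
    )

# Demo: in WAL mode, freshly committed rows live in the -wal file, so the
# main file alone under-reports the real footprint.
db = os.path.join(tempfile.mkdtemp(), "demo.sqlite3")
con = sqlite3.connect(db)
con.execute("PRAGMA journal_mode=WAL")
con.execute("CREATE TABLE t (x)")
con.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(1000)])
con.commit()
wal_held = store_size(db) > os.path.getsize(db)
print(wal_held)  # True while the committed data is still in the WAL
con.close()
```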

@SantiagoPittella
Collaborator Author

Why does the first batch of 1000 blocks not affect storage size? I guess after that we get consistent growth of about 75MB per 1000 blocks.

I was checking only the size of miden-store.sqlite3; I'm now running it again for 1M accounts, using the combined size of both files (as Mirko mentioned) as the total size.

Though I want to clarify that the sampling is every 1000 batches; each block is 16 batches in this implementation, so we track the size increase every 62.5 blocks.

7000 blocks with 256 accounts created per block should result in 1.8M accounts. Where does the 100K accounts number come from?

The 7k are batches, so it is ~440 blocks. We are using 255 accounts per block, plus 1 tx to mint assets to each of the accounts.
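
To make the batch-to-block conversion explicit (numbers from the discussion above):

```python
# Each block holds 16 batches in this implementation, so the 1,000-batch
# sampling interval corresponds to 62.5 blocks, and 7,000 batches to
# roughly 440 blocks.
BATCHES_PER_BLOCK = 16
print(1000 / BATCHES_PER_BLOCK)  # 62.5 blocks per sampling interval
print(7000 / BATCHES_PER_BLOCK)  # 437.5, i.e. ~440 blocks in total
```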

Related to the above, if there are only 100K accounts, a database size of almost 500MB doesn't really make a lot of sense (this would imply almost 5MB per account). If it is more like 1.8M then it seems a bit low (about 270 bytes per account) - though, maybe it is possible.

I will come back with the results of this new run with 1M accounts and with the store size measurement fixed. Currently it takes ~1 hour to run the whole process for 1M accounts.

@SantiagoPittella
Collaborator Author

Here is a flamegraph for 1M accounts (I removed the preview because it was too big):
https://github.com/user-attachments/assets/f7639a8c-63cc-4384-99cb-fedc50e8cc58

@SantiagoPittella
Collaborator Author

I added a couple more metrics, and re-ran a couple of times with different numbers of accounts.

I'm consistently getting 4550~ish bytes/account

@SantiagoPittella
Collaborator Author

I'm consistently getting 4550~ish bytes/account

It is worth mentioning that this number is the result of total_db_size / total_number_accounts, so it includes blocks and everything else that we store in the DB.
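
As a quick cross-check, that figure matches the final sampled store size divided by the account count. A sketch with values from the 100k run's output:

```python
# total_db_size / total_number_accounts, using the final sampled size
# from the 100k-account run above (so the result includes blocks, notes,
# and everything else stored alongside the accounts).
total_db_size = 455_551_928   # bytes at block 400
total_accounts = 100_000
print(total_db_size / total_accounts)  # ≈ 4555.5 bytes per account
```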

I ran the following query against the DB:

SELECT
    name,
    SUM(pgsize) AS size_bytes,
    (SUM(pgsize) * 1.0) / (SELECT COUNT(*) FROM accounts) AS bytes_per_row
FROM dbstat
WHERE name = 'accounts';

And got these results:
accounts | 6025216 bytes | 60.2750645245193 bytes per account

@bobbinth
Contributor

I added a couple more metrics, and re-ran a couple of times with different numbers of accounts.

I'm consistently getting 4550~ish bytes/account

4.5KB is better than 5MB - but still looks pretty high. Part of this is nullifiers and notes - but I don't see how these contribute more than 1KB (about 40 bytes for nullifiers + 80 bytes for notes + 500 bytes for note authentication paths). But it is also possible I'm missing something.

Another possibility is that SQLite doesn't do compaction, and maybe there is a lot of "slack" in the file.

@igamigo
Collaborator

igamigo commented Jan 30, 2025

There's also the overhead of block headers and indices, but those are probably too small to matter.
One test you can do is run a VACUUM command and see whether the file sizes change considerably after the initial seeding finishes.
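
A self-contained way to see what VACUUM can reclaim (this uses a throwaway demo database, not the miden store itself):

```python
import os
import sqlite3
import tempfile

# Create a database with some slack: insert pages, then delete the rows,
# which leaves free pages in the file without shrinking it.
db = os.path.join(tempfile.mkdtemp(), "vacuum_demo.sqlite3")
con = sqlite3.connect(db)
con.execute("CREATE TABLE notes (data BLOB)")
con.executemany("INSERT INTO notes VALUES (?)", [(b"x" * 4096,) for _ in range(500)])
con.commit()
con.execute("DELETE FROM notes")
con.commit()

size_before = os.path.getsize(db)
con.execute("VACUUM")        # rebuilds the file, dropping the free pages
con.close()
size_after = os.path.getsize(db)
print(size_after < size_before)  # True: the slack was reclaimed
```

If the miden store shrinks substantially after a VACUUM, a lot of the measured size is slack rather than live data.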

@SantiagoPittella
Collaborator Author

I added some queries to check the size of each table, and the avg size of each entry in every table. This is the current output of the binary:

Creating new faucet account...
Generated 100000 accounts in 19.377 seconds
Creating notes...
Created notes: inserted 393 blocks with avg insertion time 641 ms
Store file size every 50 blocks:
Block 0: 4096 bytes
Block 50: 2645048 bytes
Block 100: 72965512 bytes
Block 150: 137780152 bytes
Block 200: 201362360 bytes
Block 250: 264911800 bytes
Block 300: 328420280 bytes
Block 350: 392027064 bytes
Block 400: 455551928 bytes
Average growth rate: 1159154.7888040713 bytes per block
Average space used per account: 4545.705054133613 bytes
Total time: 274.344 seconds
DB Stats for account_deltas: 4096 bytes, 4096.0 bytes/entry
DB Stats for block_headers: 122880 bytes, 311.088607594937 bytes/entry
DB Stats for account_fungible_asset_deltas: 4096 bytes,  bytes/entry
DB Stats for notes: 460816384 bytes, 4608.16384 bytes/entry
DB Stats for account_non_fungible_asset_updates: 4096 bytes,  bytes/entry
DB Stats for nullifiers: 4370432 bytes, 43.7218087234894 bytes/entry
DB Stats for account_storage_map_updates: 4096 bytes,  bytes/entry
DB Stats for settings: 4096 bytes, 2048.0 bytes/entry
DB Stats for account_storage_slot_updates: 4096 bytes, 1365.33333333333 bytes/entry
DB Stats for transactions: 5644288 bytes, 56.244337488665 bytes/entry
DB Stats for accounts: 6062080 bytes, 60.6438446609712 bytes/entry

@SantiagoPittella SantiagoPittella force-pushed the santiagopittella-stress-testing-copy branch from 58cac73 to 348b54c Compare January 31, 2025 15:00
@SantiagoPittella SantiagoPittella changed the base branch from tomasarrachea-stress-test to next January 31, 2025 15:01
@SantiagoPittella SantiagoPittella marked this pull request as ready for review January 31, 2025 15:18
@SantiagoPittella SantiagoPittella changed the title draft: store seed refactor feat: store seed command in stress test CLI Jan 31, 2025
@SantiagoPittella
Collaborator Author

I could not reduce the time each block takes. It is currently at 1.6 blocks/sec.

@bobbinth
Contributor

I could not reduce the time each block takes. It is currently at 1.6 blocks/sec.

Do you have a breakdown of the different components of this? Specifically, how much of the 1.6 seconds is taken up by block construction vs. block insertion?

@bobbinth
Contributor

DB Stats for account_deltas: 4096 bytes, 4096.0 bytes/entry
DB Stats for block_headers: 122880 bytes, 311.088607594937 bytes/entry
DB Stats for account_fungible_asset_deltas: 4096 bytes,  bytes/entry
DB Stats for notes: 460816384 bytes, 4608.16384 bytes/entry
DB Stats for account_non_fungible_asset_updates: 4096 bytes,  bytes/entry
DB Stats for nullifiers: 4370432 bytes, 43.7218087234894 bytes/entry
DB Stats for account_storage_map_updates: 4096 bytes,  bytes/entry
DB Stats for settings: 4096 bytes, 2048.0 bytes/entry
DB Stats for account_storage_slot_updates: 4096 bytes, 1365.33333333333 bytes/entry
DB Stats for transactions: 5644288 bytes, 56.244337488665 bytes/entry
DB Stats for accounts: 6062080 bytes, 60.6438446609712 bytes/entry

Interesting! It seems like 97% of the database is in the notes table (we are adding 4.5KB of data per note). Let's create an issue to optimize note storage. I think we should be able to reduce the size by about 10x.
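
The 97% figure can be recomputed from the per-table stats. A sketch with the sizes copied from the output (it comes out to roughly 96.6%):

```python
# Share of the database taken up by the notes table, using the per-table
# sizes (in bytes) reported above.
table_sizes = {
    "account_deltas": 4_096,
    "block_headers": 122_880,
    "account_fungible_asset_deltas": 4_096,
    "notes": 460_816_384,
    "account_non_fungible_asset_updates": 4_096,
    "nullifiers": 4_370_432,
    "account_storage_map_updates": 4_096,
    "settings": 4_096,
    "account_storage_slot_updates": 4_096,
    "transactions": 5_644_288,
    "accounts": 6_062_080,
}
notes_share = table_sizes["notes"] / sum(table_sizes.values())
print(round(notes_share * 100, 1))  # ≈ 96.6 (the "97%" above)
```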

@SantiagoPittella SantiagoPittella mentioned this pull request Jan 31, 2025
@SantiagoPittella
Collaborator Author

@bobbinth I created #662 to address the notes issue.

Do you have the breakdown of different components of this? Specifically, how much, out of 1.6 seconds is taken up by block construction vs. block insertion?

According to the flamegraph, a big part of this process is spent in miden_processor::Process::execute_mast_node. Though I'm making some changes to get more insight into this.

@bobbinth
Contributor

Do you have the breakdown of different components of this? Specifically, how much, out of 1.6 seconds is taken up by block construction vs. block insertion?

According to the flamegraph, a big part of this process is spent in miden_processor::Process::execute_mast_node. Though I'm making some changes to get more insight into this.

Looking at the code, I think what we are measuring as insertion time is actually BlockBuilder::build_block() time. This actually covers much more than just inserting the block into the store (e.g., it also executes the block kernel to build the block), so it will be much longer than what we want. What I really mean by insertion time is the time it takes the store to process the apply_block() request. My hope is that this is less than 100ms - but we'll need to confirm.

What may make sense to do is build blocks manually (and probably the same for batches as well), without using block and batch builders. This is somewhat difficult now, but should become much easier after #659 and subsequent work is done.

So, I guess we have 3 paths forward:

  1. Review & merge this work roughly as is, and then refactor it once manual block/batch building becomes easier.
  2. Try to implement manual block/batch building now, and then update it with the new structure later.
  3. Put this on hold and then re-work once manual block/batch building becomes easier.

Assuming this is pretty close to being done, path 1 probably makes the most sense - but let me know what you think.

@@ -15,35 +15,37 @@ version.workspace = true
workspace = true

[features]
testing = ["dep:miden-air", "dep:miden-node-test-macro", "dep:rand_chacha", "dep:winterfell"]
Contributor

With a more manual approach to batch/block building, we probably wouldn't need to introduce the testing feature. So, maybe that's an argument for putting the work on hold for now.
