feat: add chunk application stats #12797
base: master
Conversation
    *block_hash,
    shard_uid.shard_id(),
    apply_result.stats,
);
Saving chunk stats here means that only chunks applied inside blocks will have their stats saved. Stateless chunk validators will not save any stats. In the future we could change it to save it somewhere else, but it's good enough for the first version.
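A minimal sketch of what the save path could look like, assuming the 40-byte `BlockHash || ShardId` key described later in this PR and near-store's generic `set_ser` helper; the helper name below is illustrative, the real call site is the chain-update snippet above:

```rust
use borsh::BorshSerialize;
use near_store::{DBCol, StoreUpdate};

// Sketch only: persist the stats produced by chunk application under the
// DBCol::ChunkApplyStats column added in this PR. The function name and the
// exact key layout (BlockHash || ShardId) are assumptions for illustration.
fn save_chunk_apply_stats<T: BorshSerialize>(
    store_update: &mut StoreUpdate,
    block_shard_key: &[u8], // 32-byte block hash followed by 8-byte shard id
    stats: &T,              // e.g. the stats taken from ApplyChunkResult
) -> std::io::Result<()> {
    store_update.set_ser(DBCol::ChunkApplyStats, block_shard_key, stats)
}
```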
@@ -462,7 +467,8 @@ impl DBCol {
             | DBCol::StateHeaders
             | DBCol::TransactionResultForBlock
             | DBCol::Transactions
-            | DBCol::StateShardUIdMapping => true,
+            | DBCol::StateShardUIdMapping
+            | DBCol::ChunkApplyStats => true,
I hope that marking this column as `cold` is enough to avoid garbage collection on archival nodes? I think these stats should be kept forever on archival nodes. They are not that big and it would be nice to be able to view stats for chunks older than three epochs.
    /// The stats can be read to analyze what happened during chunk application.
    /// - *Rows*: BlockShardId (BlockHash || ShardId) - 40 bytes
    /// - *Column type*: `ChunkApplyStats`
    ChunkApplyStats,
At first I thought that I could use `ChunkHash` as a key in the database, but that doesn't really work. The same chunk can be applied multiple times when there are missing chunks, and I think chunks created using the same `prev_block` would have the same hash (?).
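For illustration, a minimal sketch of how such a 40-byte `BlockHash || ShardId` key can be assembled; the 32-byte hash width comes from the column docs above, while the little-endian shard id encoding is an assumption and may differ from nearcore's actual helper:

```rust
// Builds the 40-byte BlockShardId key described in the column docs:
// a 32-byte block hash followed by the shard id as 8 bytes (little-endian here).
fn block_shard_id_key(block_hash: &[u8; 32], shard_id: u64) -> Vec<u8> {
    let mut key = Vec::with_capacity(40);
    key.extend_from_slice(block_hash);
    key.extend_from_slice(&shard_id.to_le_bytes());
    key
}

fn main() {
    let key = block_shard_id_key(&[0u8; 32], 3);
    assert_eq!(key.len(), 40);
}
```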
@@ -648,6 +648,7 @@ impl<'a> ChainStoreUpdate<'a> {
         self.gc_outgoing_receipts(&block_hash, shard_id);
         self.gc_col(DBCol::IncomingReceipts, &block_shard_id);
         self.gc_col(DBCol::StateTransitionData, &block_shard_id);
+        self.gc_col(DBCol::ChunkApplyStats, &block_shard_id);
I wonder if we could use some other garbage collection logic to keep the stats for longer than three epochs. Maybe something similar to `LatestWitnesses`, where the last N witnesses are kept in the database? It's annoying that useful data like these stats disappears after three epochs, especially in tests which have to run for a few epochs. Can be changed later.
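A rough sketch of the "keep only the last N entries" idea; the names and eviction policy below are hypothetical and not how `LatestWitnesses` is actually implemented:

```rust
use std::collections::VecDeque;

// Hypothetical bounded retention: remember the keys of the most recent stats
// entries and delete the oldest ones once a fixed limit is exceeded, instead
// of tying deletion to the regular three-epoch GC.
const MAX_STATS_ENTRIES: usize = 1000;

struct StatsRetention {
    recent_keys: VecDeque<Vec<u8>>, // oldest key at the front
}

impl StatsRetention {
    fn on_insert(&mut self, key: Vec<u8>, mut delete_from_db: impl FnMut(&[u8])) {
        self.recent_keys.push_back(key);
        while self.recent_keys.len() > MAX_STATS_ENTRIES {
            if let Some(old) = self.recent_keys.pop_front() {
                delete_from_db(&old); // drop the oldest stats entry from the column
            }
        }
    }
}
```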
@@ -336,7 +327,7 @@ impl Runtime {
         apply_state: &ApplyState,
         signed_transaction: &SignedTransaction,
         transaction_cost: &TransactionCost,
-        stats: &mut ApplyStats,
+        stats: &mut ChunkApplyStatsV0,
     ) -> Result<(Receipt, ExecutionOutcomeWithId), InvalidTxError> {
         let span = tracing::Span::current();
         metrics::TRANSACTION_PROCESSED_TOTAL.inc();
Runtime metrics could probably be refactored so that first we collect the stats and at the very end we record all of the stats in the metrics. Would reduce clutter in the runtime code.
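An illustrative sketch of that collect-then-report pattern using plain Prometheus counters; the struct, field, and metric names here are assumptions, and nearcore's real counters live in its own metrics module:

```rust
use prometheus::IntCounter;

// Accumulate counts in a stats struct while the chunk is applied, then push
// everything into the metrics in one place at the end of chunk application.
#[derive(Default)]
struct TxProcessingStats {
    processed_total: u64,
    failed_total: u64,
}

impl TxProcessingStats {
    fn report(&self, processed: &IntCounter, failed: &IntCounter) {
        processed.inc_by(self.processed_total);
        failed.inc_by(self.failed_total);
    }
}

fn main() -> Result<(), prometheus::Error> {
    let processed = IntCounter::new("transaction_processed_total", "processed txs")?;
    let failed = IntCounter::new("transaction_processed_failed_total", "failed txs")?;

    let mut stats = TxProcessingStats::default();
    stats.processed_total += 1; // bump the in-memory stats instead of the metric

    stats.report(&processed, &failed); // single flush at the end
    Ok(())
}
```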
Codecov Report

Attention: Patch coverage is

Additional details and impacted files

@@            Coverage Diff            @@
##            master   #12797    +/-   ##
==========================================
  Coverage    70.53%   70.53%
==========================================
  Files          846      847      +1
  Lines       174927   175254    +327
  Branches    174927   175254    +327
==========================================
+ Hits        123389   123623    +234
- Misses       46285    46376     +91
- Partials      5253     5255      +2

Flags with carried forward coverage won't be shown.

☔ View full report in Codecov by Sentry.
This is the first step towards per-chunk metrics (#12758).
This PR adds a new struct, `ChunkApplyStats`, which keeps information about things that happened during chunk application: for example, how many transactions there were, how many receipts, what the outgoing limits were, how many receipts were forwarded, buffered, and so on.
For now `ChunkApplyStats` contains mainly data relevant to the bandwidth scheduler; in the future more stats can be added to measure other things that we're interested in. I didn't want to add too much stuff at once, to keep the PR size reasonable.
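To give a feel for the shape of the data, here is an illustrative sketch based purely on the description above; the field names and exact layout of the real `ChunkApplyStatsV0` and `BalanceStats` types in this PR are assumptions:

```rust
// Illustrative only: fields mirror the categories mentioned in the description
// (transactions, receipts, bandwidth scheduler limits, balance checker data).
pub type Balance = u128;

#[derive(Debug, Default)]
pub struct BalanceStats {
    pub tx_burnt_amount: Balance, // data previously tracked by ApplyStats
}

#[derive(Debug, Default)]
pub struct ChunkApplyStatsV0 {
    pub transactions_num: u64,
    pub incoming_receipts_num: u64,
    pub forwarded_receipts_num: u64,
    pub buffered_receipts_num: u64,
    pub outgoing_size_limit: u64, // bandwidth scheduler limit for this shard
    pub balance: BalanceStats,
}

#[derive(Debug)]
pub enum ChunkApplyStats {
    V0(ChunkApplyStatsV0),
}
```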
There was already a struct called `ApplyStats`, but it was used only for the balance checker. I replaced it with `BalanceStats` inside `ChunkApplyStats`.

`ChunkApplyStats` are returned in `ApplyChunkResult` and saved to the database for later use. A new database column is added to keep the chunk application stats. The column is included in the standard garbage collection logic to keep the size of saved data reasonable.
Running `neard view-state chunk-apply-stats` allows a node operator to view chunk application stats for a given chunk. Example output for a mainnet chunk:

Click to expand
The stats are also available in `ChainStore`, making it easy to read them from tests. In the future we could also add an RPC endpoint to make the stats available in `debug-ui`.

The PR is divided into commits for easier review.