-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Head tracking batch #417
base: main
Are you sure you want to change the base?
Head tracking batch #417
Conversation
@dipkakwani Maybe I should have been more explicit about it. This is a sketch of the idea. I wanted to gather some input while running the benchmarks with the current setup. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CodeQL found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.
This PR introduces a notion of a
IMultiHeadChain
. The multi head approach enables creating multiple heads of the blockchain that track the recent state properly. If no reorganization happens, the sameIHead
is used over and over for a prolonged period of time while ensuring that all the reads are served in a speed of reading the raw underlying database. This is done by applying COW not only on the underlying database level but also in the head with a help of aPageTable
. With this we have:ancestors
' notionThe reusability is addressed by reusing the head tracking branch through multiple blocks. Every time a block is committed, it's state is applied to internal state of
HeadTrackingBatch
in a specific manner, allowing tracking of blocks.HeadTrackingBatch
has no notion of ancestors as it keeps all the modified pages in a simple lookup, that allows to perceive a squashed view of the world.No
BitFilter
is requires as data are always queried like there would be from the database. The only difference is that some pages are represented as modified in-memory copies. These copies eventually are applied to the disk, but for sake of lookups, are treated like there were applied already.IHead
The idea is implemented by introduction of one more level of abstraction over a batch. Before this PR, only a single
PageDb.Batch
could exist. This PR introduces a new type of the batch, calledIMultiHeadChain
. From the chain you can getIHead
, that accumulates changes in memory, meaning, that at a given batch number, multiple propositions can exist before they are committed/discarded. To make it work, the head introduces two dictionaries that allow resolvingPage
<->DbAddress
mapping in both sides. There are used to track pages that were already overwritten. It's possible that a given page will be overwritten multiple times though, so that the mappings keep only the last version. The mapping using dictionary seems to be costly (it was shown in the profiling). To address this, a dummy array-based lookup is provided in front of it.ProposedBatch
A simple construct of a
ProposedBatch
is used to track:(DbAddress,Page)
tuplesRootPage
These proposed blocks are created every time
HeadTrackingBatch
is committed, by selecting pages that were updated during the given batch. Once a finality is reached, proposed batches that have block numbers lower than the finalized one, are applied on the database. The application should be fast and almost instantaneous as it copies payloads to proper pages, updates theRootPage
and decreases the counter on theProposedBatch
(counter based managed similar to blockchain).Readers
Readers manage the read access. They are created whenever a block is committed/finalized and are just retrieved by the client code. No readers should be created in the hot paths!
Benchmarks
Some areas of interest
BeforeCommit
that represents MerkleizationCommitImpl
that applies the changes to the in-memory representation. Potentially can be offloadedCommit
that registers the proposal