New node store interface (#625)
Co-authored-by: Dan Laine <[email protected]>
Co-authored-by: hhao <[email protected]>
3 people authored Aug 13, 2024
1 parent 8eef1a5 commit 2f9d07b
Showing 109 changed files with 6,266 additions and 17,851 deletions.
18 changes: 9 additions & 9 deletions .github/workflows/ci.yaml
@@ -37,9 +37,9 @@ jobs:
${{ runner.os }}-deps-${{ hashFiles('**/Cargo.toml') }}-
${{ runner.os }}-deps-
- name: Check
run: cargo check --workspace --tests --examples --benches --all-features
run: cargo check --workspace --tests --examples --benches
- name: Build
run: cargo build --workspace --tests --examples --benches --all-features
run: cargo build --workspace --tests --examples --benches
# Always update the cache
- name: Cleanup
run: |
@@ -100,7 +100,7 @@ jobs:
- name: Format
run: cargo fmt -- --check
- name: Clippy
run: cargo clippy --tests --examples --benches --all-features -- -D warnings
run: cargo clippy --tests --examples --benches -- -D warnings

test:
needs: build
@@ -122,7 +122,7 @@ jobs:
target/
key: ${{ needs.build.outputs.cache-key }}
- name: Run tests
run: cargo test --all-features --verbose
run: cargo test --verbose

examples:
needs: build
@@ -142,11 +142,11 @@ jobs:
~/.cargo/git/db/
target/
key: ${{ needs.build.outputs.cache-key }}
# benchmarks were not being done in --release mode, we can enable this again later
# - name: Run benchmark example
# run: RUST_BACKTRACE=1 cargo run --example benchmark -- --nbatch 100 --batch-size 1000
- name: Run insert example
run: RUST_BACKTRACE=1 cargo run --example insert
# benchmarks were not being done in --release mode, we can enable this again later
# - name: Run benchmark example
# run: RUST_BACKTRACE=1 cargo run --example benchmark -- --nbatch 100 --batch-size 1000
# - name: Run insert example
# run: RUST_BACKTRACE=1 cargo run --example insert

docs:
needs: build
3 changes: 1 addition & 2 deletions Cargo.toml
@@ -2,8 +2,7 @@
members = [
"firewood",
"fwdctl",
"growth-ring",
"libaio",
"storage",
"grpc-testtool",
]
resolver = "2"
86 changes: 9 additions & 77 deletions README.md
@@ -19,20 +19,10 @@ but compaction is not required to maintain the index. Firewood was first conceiv
a very fast storage layer for the EVM but could be used on any blockchain that
requires an authenticated state.

Firewood only attempts to store the latest state on-disk and will actively clean up
unused data when state diffs are committed. To avoid reference counting trie nodes,
Firewood does not copy-on-write (COW) the state trie and instead keeps
the latest version of the trie index on disk and applies in-place updates to it.
Firewood keeps some configurable number of previous states in memory to power
state sync (which may occur at a few roots behind the current state).

Firewood provides OS-level crash recovery via a write-ahead log (WAL). The WAL
guarantees atomicity and durability in the database, but also offers
“reversibility”: some portion of the old WAL can be optionally kept around to
allow a fast in-memory rollback to recover some past versions of the entire
store back in memory. While running the store, new changes will also contribute
to the configured window of changes (at batch granularity) to access any past
versions with no additional cost at all.
Firewood only attempts to store recent revisions on disk and will actively clean up
unused data when revisions expire. Firewood keeps a configurable number of previous states in memory and on disk to power state sync (which may occur at a few roots behind the current state). To do this, a new root is always created for each revision; it can reference either new nodes from this revision or nodes from a prior revision. When a revision is created, a list of nodes that are no longer needed is computed and saved to disk in a future-delete log (FDL) as well as kept in memory. When a revision expires, the nodes that were deleted when it was created are returned to the free space.

Firewood guarantees recoverability by not referencing the new nodes in a new revision before they are flushed to disk, as well as carefully managing the free list during the creation and expiration of revisions.
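
The revision lifecycle described above can be sketched roughly as follows. This is a toy, in-memory illustration only, not Firewood's actual implementation: `NodeAddr`, `Revision`, `RevisionManager`, and all of their fields and methods are hypothetical names invented for this example, and the sketch does not model persisting the future-delete log or flushing nodes to disk before they are referenced.

```rust
use std::collections::{BTreeSet, VecDeque};

/// Hypothetical address of a node in the backing store.
type NodeAddr = u64;

/// One revision: its root plus the nodes that became obsolete when it was created.
struct Revision {
    root: NodeAddr,
    /// In-memory copy of this revision's future-delete log entry.
    deleted: Vec<NodeAddr>,
}

/// Toy revision manager that retains at most `max_revisions` revisions.
struct RevisionManager {
    max_revisions: usize,
    revisions: VecDeque<Revision>,
    /// Addresses that may be reused for new nodes.
    free_list: BTreeSet<NodeAddr>,
}

impl RevisionManager {
    fn new(max_revisions: usize) -> Self {
        Self {
            max_revisions,
            revisions: VecDeque::new(),
            free_list: BTreeSet::new(),
        }
    }

    /// Record a new revision together with the nodes it no longer needs,
    /// then expire the oldest revisions until the limit is respected.
    fn commit(&mut self, root: NodeAddr, deleted: Vec<NodeAddr>) {
        self.revisions.push_back(Revision { root, deleted });
        while self.revisions.len() > self.max_revisions {
            self.expire_oldest();
        }
    }

    /// Expire the oldest revision: the nodes that were deleted when it was
    /// created are returned to the free space.
    fn expire_oldest(&mut self) {
        if let Some(old) = self.revisions.pop_front() {
            self.free_list.extend(old.deleted);
        }
    }

    /// Root of the oldest revision that is still retained, if any.
    fn oldest_root(&self) -> Option<NodeAddr> {
        self.revisions.front().map(|r| r.root)
    }
}
```

In this sketch, nodes listed in a revision's `deleted` set stay allocated for as long as that revision is retained, so any older root that still references them remains readable until it has expired.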

## Architecture Diagram

@@ -73,69 +63,11 @@ versions with no additional cost at all.
`Revision`.

## Roadmap

**LEGEND**

- [ ] Not started
- [ ] :runner: In progress
- [x] Complete

### Green Milestone

This milestone will focus on additional code cleanup, including supporting
concurrent access to a specific revision, as well as cleaning up the basic
reader and writer interfaces to have consistent read/write semantics.

- [x] Concurrent readers of pinned revisions while allowing additional batches
to commit, to support parallel reads for the past consistent states. The revisions
are uniquely identified by root hashes.
- [x] Pin a reader to a specific revision, so that future commits or other
operations do not see any changes.
- [x] Be able to read-your-write in a batch that is not committed. Uncommitted
changes will not be shown to any other concurrent readers.
- [x] Add some metrics framework to support timings and volume for future milestones
To support this, a new method Db::metrics() returns an object that can be serialized
into prometheus metrics or json (it implements [serde::Serialize])

### Seasoned milestone

This milestone will add support for proposals, including proposed future
branches, with a cache to make committing these branches efficient.

- [x] Be able to support multiple proposed revisions against the latest committed
version.
- [x] Be able to propose a batch against the existing committed revision, or
propose a batch against any existing proposed revision.
- [x] Committing a batch that has been proposed will invalidate all other proposals
that are not children of the committed proposed batch.
- [x] Be able to quickly commit a batch that has been proposed.
- [x] Remove RLP encoding

### Dried milestone

The focus of this milestone will be to support synchronization to other
instances to replicate the state. A synchronization library should also
be developed for this milestone.

- [x] Migrate to a fully async interface
- [x] Pluggable encoding for nodes, for optional compatibility with MerkleDB
- [ ] :runner: MerkleDB root hash in parity for a seamless transition between MerkleDB
and Firewood.
- [ ] :runner: Support replicating the full state with corresponding range proofs that
verify the correctness of the data.
- [ ] Pluggable IO subsystem (tokio\_uring, monoio, etc)
- [ ] Add metric reporting
- [ ] Enforce limits on the size of the range proof as well as keys to make
synchronization easier for clients.
- [ ] Add support for Ava Labs generic test tool via grpc client
- [ ] Support replicating the delta state from the last sync point with
corresponding change proofs that verify the correctness of the data.
- [ ] Refactor `Shale` to be more idiomatic, consider rearchitecting it

## Build

Firewood currently is Linux-only, as it has a dependency on the asynchronous
I/O provided by the Linux kernel (see `libaio`).
- [ ] Complete the proof code
- [ ] Complete the revision manager
- [ ] Complete the API implementation
- [ ] Implement a node cache
- [ ] Hook up the RPC

## Run

2 changes: 1 addition & 1 deletion docs/assets/architecture.svg
23 changes: 8 additions & 15 deletions firewood/Cargo.toml
@@ -18,38 +18,31 @@ readme = "../README.md"
[dependencies]
aquamarine = "0.5.0"
async-trait = "0.1.77"
bytemuck = { version = "1.14.3", features = ["derive"] }
enum-as-inner = "0.6.0"
growth-ring = { version = "0.0.4", path = "../growth-ring" }
libaio = {version = "0.0.4", path = "../libaio" }
storage = { version = "0.0.4", path = "../storage" }
futures = "0.3.30"
hex = "0.4.3"
lru = "0.12.2"
metered = "0.9.0"
nix = {version = "0.28.0", features = ["fs", "uio"]}
parking_lot = "0.12.1"
serde = { version = "1.0", features = ["derive"] }
sha3 = "0.10.8"
serde = { version = "1.0" }
sha2 = "0.10.8"
thiserror = "1.0.57"
tokio = { version = "1.36.0", features = ["rt", "sync", "macros", "rt-multi-thread"] }
typed-builder = "0.18.1"
bincode = "1.3.3"
bitflags = { version = "2.4.2", features = ["bytemuck"] }
env_logger = { version = "0.11.2", optional = true }
log = { version = "0.4.20", optional = true }
test-case = "3.3.1"
integer-encoding = "4.0.0"

[features]
logger = ["dep:env_logger", "log"]
default = []
logger = ["log"]
nightly = []

[dev-dependencies]
criterion = {version = "0.5.1", features = ["async_tokio"]}
keccak-hasher = "0.15.3"
rand = "0.8.5"
triehash = "0.8.4"
assert_cmd = "2.0.13"
predicates = "3.1.0"
clap = { version = "4.5.0", features = ['derive'] }
test-case = "3.3.1"
pprof = { version = "0.13.0", features = ["flamegraph"] }

[[bench]]