Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

litep2p: Provide partial results to speedup GetRecord queries #7099

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

lexnv
Copy link
Contributor

@lexnv lexnv commented Jan 9, 2025

This PR provides the partial results of the GetRecord kademlia query.

This significantly improves the authority discovery records, from ~37 minutes to ~2/3 minutes.
In contrast, libp2p discovers authority records in around ~10 minutes.

The authority discovery was slow because litep2p provided the records only after the Kademlia query was completed. A normal Kademlia query completes in around 40 seconds to a few minutes.
In this PR, partial records are provided as soon as they are discovered from the network.

Testing Done

Started a node in Kusama with --validator and litep2p backend.
The node discovered 996/1000 authority records in ~ 1 minute 45 seconds.

Screenshot 2025-01-09 at 12 26 08

Before & After

In this image, on the left side is libp2p, in the middle litep2p without this PR, on the right litep2p with this PR

Screenshot 2025-01-07 at 17 57 56

Closes: #7077

cc @paritytech/networking

@lexnv lexnv added T0-node This PR/Issue is related to the topic “node”. I5-enhancement An additional feature request. labels Jan 9, 2025
@lexnv lexnv self-assigned this Jan 9, 2025
@@ -850,7 +850,7 @@ linked-hash-map = { version = "0.5.4" }
linked_hash_set = { version = "0.1.4" }
linregress = { version = "0.5.1" }
lite-json = { version = "0.2.0", default-features = false }
litep2p = { version = "0.8.4", features = ["websocket"] }
litep2p = { git = "https://github.com/paritytech/litep2p.git", branch = "lexnv/kad-found-records", features = ["websocket"] }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Release is needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, will do one soon, wanted to double-check first on our Kusama validator that this indeed solves the problem 🙏

@lexnv
Copy link
Contributor Author

lexnv commented Jan 10, 2025

CPU

Litep2p without this PR was running before ~7:15 (spike—left side). On the right side is Litep2p with this PR.

The CPU consumption looks overall stable, with some fluctuations at ~10-minute intervals.
The peak CPU consumption has increased by ~6%, from 171% before to 182% current.

Considering the network's randomness and the ~18x speedup improvement in authority-discovery records, this small tradeoff (if any) appears justified.

Screenshot 2025-01-10 at 12 30 51

Triage Report

The Some network error occurred when fetching erasure chunk warning appears in the first 2 minutes of starting the node. In contrast, before this PR the warning appeared for ~30 minutes.

Repo Count Level Triage report
https://github.com/paritytech/polkadot-sdk/ 437 warn_if_frequent Some network error occurred when fetching erasure chunk
https://github.com/paritytech/polkadot-sdk/ 25 warn Recovery of available data failed.
https://github.com/paritytech/polkadot-sdk/ 25 warn Data unavailable for candidate .*
https://github.com/paritytech/polkadot-sdk/ 12 warn Report .: . to .. Reason: .. Banned, disconnecting. ( A collator provided a collation for the wrong para. Banned, disconnecting.)
https://github.com/paritytech/polkadot-sdk/ 4 warn fetch_pov_job
https://github.com/paritytech/polkadot-sdk/ 3 warn Importing locally an already known assignment
https://github.com/paritytech/polkadot-sdk/ 1 warn Report .: . to .. Reason: .. Banned, disconnecting. ( Same block request multiple times. Banned, disconnecting.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
I5-enhancement An additional feature request. T0-node This PR/Issue is related to the topic “node”.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

litep2p authority discovery is very slow 5m vs 1h
2 participants