Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: more metrics #122

Closed
wants to merge 1 commit into from
Closed

feat: more metrics #122

wants to merge 1 commit into from

Conversation

pete-eiger
Copy link
Contributor

@pete-eiger pete-eiger commented Dec 13, 2023

Issue: #83

Metrics updates:

  • Total Message Count - this was already covered by CACHED_PPOI_MESSAGES, I think
  • Average message frequency - I think we can use CACHED_PPOI_MESSAGES for this, with this promql query on Grafana rate(cached_ppoi_messages[ $__rate_interval ])
  • Average Processing Time - new metric AVERAGE_PROCESSING_TIME that tracks the execution time of process_ppoi_message
  • Frequent senders - new metric FREQUENT_SENDERS_COUNTER, which is updated in process_ppoi_message, presented in Grafana using topk(10, sum(rate(frequent_senders_counter[$__interval])) by (indexer_address))
  • Divergence Rate - using the current metrics CACHED_PPOI_MESSAGES and DIVERGING_SUBGRAPHS and this query (DIVERGING_SUBGRAPHS / sum(CACHED_PPOI_MESSAGES)) * 100 we can display the divergence rate
  • Latest Message Timestamp - to make things easier we have a new metric LATEST_MESSAGE_TIMESTAMP which also gets updated in process_ppoi_message
  • POI stakes for Deployment - new metric MAX_STAKE_POI that gets updated in compare_attestations when we get the most attested poi. Still mulling over the name though 🤔
  • Network connectivity - we already have the gossip peers, it was mislabelled on the dashboard, and frequent senders reflects the actual Indexers now

Other changes:

  • updated grafana.json

@pete-eiger pete-eiger requested a review from hopeyen December 14, 2023 06:59
@pete-eiger pete-eiger marked this pull request as ready for review December 14, 2023 06:59
@@ -109,6 +113,8 @@ pub async fn process_ppoi_message(
messages: Vec<GraphcastMessage<PublicPoiMessage>>,
callbook: &CallBook,
) -> Result<RemoteAttestationsMap, AttestationError> {
let start_time = Instant::now();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would you explore a bit on how to use the metrics generated by the autometrics macro?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the interest of time I created an Issue for this, it will take more time since between the version we're using and 1.0.0 there are a fair amount of breaking changes in autometrics-rs.
#125

let most_attested_poi = sorted_remote_attestations.last().cloned();

if let Some(attestation) = &most_attested_poi {
MAX_STAKE_POI
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ATTESTED_MAX_STAKE_WEIGHT?

Copy link
Collaborator

@hopeyen hopeyen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • CACHED_PPOI_MESSAGES is a gauge, Total Message Count is meant to be a running total of all the POI messages that have been processed, more like a counter, this might be instead the RECEIVED_MESSAGES.
  • Thinking about Divergence Rate = (DIVERGING_SUBGRAPHS / sum(CACHED_PPOI_MESSAGES)) * 100 again, I'm not sure how an user would interpret this number... I think there could be another metric in addition to track a counter of Comparison results labeled with deployment, then show a metrics on the rate of results counter / received messages

would you please also post a screenshot of the dashboard?

@pete-eiger pete-eiger force-pushed the petko/add-more-metrics branch 2 times, most recently from 067b8ab to 5604214 Compare January 22, 2024 12:47
@pete-eiger pete-eiger force-pushed the petko/add-more-metrics branch from 5604214 to 4933912 Compare January 22, 2024 16:59
@pete-eiger
Copy link
Contributor Author

I will reopen this PR or create a new one with the changes when the memory leak is fixed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants