Add ingest/processed/bytes metric #17581
base: master
@@ -329,6 +331,27 @@ public void run(final QueryListener queryListener) throws Exception
}
// Call onQueryComplete after Closer is fully closed, ensuring no controller-related processing is ongoing.
queryListener.onQueryComplete(reportPayload);

long totalProcessedBytes = reportPayload.getCounters().copyMap().values().stream()
This seems like the wrong place to put this logic.
ingest/processed/bytes seems like an ingestion-only metric, no?
If that is the case, we should emit the metric only if the query is an ingestion query.
You could probably expose a method here: https://github.com/apache/druid/blob/9bebe7f1e5ab0f40efbff620769d0413c943683c/extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/ControllerImpl.java#L517
named something like "emit summary metrics", with the task report and the query passed to it.
Moved the logic
Also, there are some static check failures which need to be looked at.
@cryptoe fixed
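The suggestion in this thread (sum the byte counters from the report, and emit the metric only for ingestion queries) can be sketched roughly as below. This is a minimal, self-contained illustration and not the code in the PR: `SummaryMetricsSketch`, `sumProcessedBytes`, and `emitSummaryMetrics` are hypothetical names, and the nested `Map` of `long[]` values is only an assumed stand-in for whatever `reportPayload.getCounters().copyMap()` actually returns.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names are Druid APIs.
final class SummaryMetricsSketch
{
  /**
   * Sums bytes over an assumed stage -> worker -> per-channel byte counts layout.
   * A missing (null) counters map contributes 0, mirroring the ": 0" fallback in the diff.
   */
  static long sumProcessedBytes(Map<Integer, Map<Integer, long[]>> countersByStage)
  {
    if (countersByStage == null) {
      return 0L;
    }
    return countersByStage.values().stream()
        .flatMap(workers -> workers.values().stream())
        .mapToLong(channelBytes -> Arrays.stream(channelBytes).sum())
        .sum();
  }

  /**
   * Returns the metrics that would be emitted. The ingestion check comes first,
   * so non-ingestion (SELECT-style) queries emit nothing.
   */
  static List<String> emitSummaryMetrics(
      boolean isIngestionQuery,
      Map<Integer, Map<Integer, long[]>> countersByStage
  )
  {
    final List<String> emitted = new ArrayList<>();
    if (isIngestionQuery) {
      emitted.add("ingest/processed/bytes=" + sumProcessedBytes(countersByStage));
    }
    return emitted;
  }
}
```

Keeping the ingestion check and the summation in one helper means the controller's `run` method only has to pass in the report and the query, which is roughly what the reviewer asked for.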
try {
emitMetric(toolbox.getEmitter(), "ingest/processed/bytes", rowStatsForRunningTasks.getProcessedBytes());
}
catch (Exception e) {
I don't think you need a try-catch here. rowStatsForRunningTasks should always be non-null afaict.
.setMetric("ingest/processed/bytes", bytesProcessed)
);
}
catch (Exception e) {
Probably don't need a try-catch.
ServiceMetricEvent.builder()
.setDimension("taskId", task.getId())
.setDimension("dataSource", task.getDataSource())
Instead of this, you could use IndexTaskUtils.setTaskDimensions() to set all task-related dimensions.
What about the .setMetric("ingest/processed/bytes", bytesProcessed) call? @kfaraz
Does this look right?
IndexTaskUtils.setTaskDimensions(new ServiceMetricEvent.Builder(), task);
toolbox.getEmitter().emit(
ServiceMetricEvent.builder().setMetric("ingest/processed/bytes", bytesProcessed)
);
Yeah, you would need to set that separately. Something like:
final ServiceMetricEvent.Builder metricBuilder = new ServiceMetricEvent.Builder();
IndexTaskUtils.setTaskDimensions(metricBuilder, task);
toolbox.getEmitter().emit(metricBuilder.setMetric("segment/txn/failure", 1));
Using IndexTaskUtils.setTaskDimensions() helps avoid duplicating the code.
Thanks, updated
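The difference between the two snippets in this thread is easy to miss: the first sets the task dimensions on a builder that is then thrown away, and emits from a second, empty builder. The sketch below reproduces that behaviour with hypothetical stand-in classes. `MetricBuilderSketch.Builder` and `setTaskDimensions` here are illustrative only and are not Druid's `ServiceMetricEvent.Builder` or `IndexTaskUtils`; only the dimension and metric names come from the diff above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; these classes only imitate the shape of a metric builder.
final class MetricBuilderSketch
{
  /** Stand-in for a fluent metric event builder. */
  static final class Builder
  {
    final Map<String, String> dimensions = new LinkedHashMap<>();
    String metric;
    Number value;

    Builder setDimension(String name, String dimValue)
    {
      dimensions.put(name, dimValue);
      return this;
    }

    Builder setMetric(String name, Number metricValue)
    {
      this.metric = name;
      this.value = metricValue;
      return this;
    }
  }

  /** Stand-in for a shared helper that sets every task-related dimension in one place. */
  static void setTaskDimensions(Builder builder, String taskId, String dataSource)
  {
    builder.setDimension("taskId", taskId);
    builder.setDimension("dataSource", dataSource);
  }

  /** Mirrors the first attempt: the builder carrying the dimensions is discarded. */
  static Builder discardedBuilder(String taskId, String dataSource, long bytesProcessed)
  {
    setTaskDimensions(new Builder(), taskId, dataSource);
    return new Builder().setMetric("ingest/processed/bytes", bytesProcessed);
  }

  /** Mirrors the reviewer's version: one builder receives both dimensions and metric. */
  static Builder reusedBuilder(String taskId, String dataSource, long bytesProcessed)
  {
    final Builder metricBuilder = new Builder();
    setTaskDimensions(metricBuilder, taskId, dataSource);
    return metricBuilder.setMetric("ingest/processed/bytes", bytesProcessed);
  }
}
```

With these stand-ins, `discardedBuilder` yields an event with no dimensions at all, while `reusedBuilder` yields one that carries both `taskId` and `dataSource`.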
}).sum()).sum()
: 0;

log.info("Total processed bytes: %d, query: %s", totalProcessedBytes, querySpec.getQuery());
This can be a debug log.
- log.info("Total processed bytes: %d, query: %s", totalProcessedBytes, querySpec.getQuery());
+ log.debug("Processed bytes[%d] for query[%s].", totalProcessedBytes, querySpec.getQuery());
A new metric, ingest/processed/bytes, has been introduced to track the total number of bytes processed during ingestion tasks, including native batch ingestion, streaming ingestion, and multi-stage query (MSQ) ingestion tasks. This metric helps provide a unified view of data processing across different ingestion pathways.

Key changed/added classes in this PR
This metric was added in three key ingestion task classes:
This PR has: