Simplify profile stack trace representation #615

Draft
wants to merge 1 commit into
base: main

Conversation

felixge
Member

@felixge felixge commented Jan 9, 2025

  • Introduce a first-class Stack message type and lookup table.
  • Replace location index range based stack trace encoding on Sample with a single stack_index reference.
  • Remove the location_indices lookup table.
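To make the shape of the change concrete, here is a minimal Python sketch of a stack lookup table referenced by `stack_index`. It is an illustration only, not the actual protobuf definition: apart from `Stack` and `stack_index`, every name (`Profile`, `stack_table`, `intern_stack`, `add_sample`, `value`, and the field on `Stack`) is an assumption made for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Stack:
    # Indices into a location table (field name assumed for this sketch).
    location_indices: tuple


@dataclass
class Sample:
    # Single reference into the stack lookup table.
    stack_index: int
    value: int


@dataclass
class Profile:
    stack_table: list = field(default_factory=list)
    samples: list = field(default_factory=list)
    # Helper map used only while building; not part of the encoded data.
    _seen: dict = field(default_factory=dict)

    def intern_stack(self, location_indices):
        """Return the table index for this stack, adding it if it is new."""
        key = tuple(location_indices)
        idx = self._seen.get(key)
        if idx is None:
            idx = len(self.stack_table)
            self.stack_table.append(Stack(key))
            self._seen[key] = idx
        return idx

    def add_sample(self, location_indices, value):
        self.samples.append(Sample(self.intern_stack(location_indices), value))


profile = Profile()
profile.add_sample([3, 2, 1], 10)
profile.add_sample([4, 2, 1], 5)
profile.add_sample([3, 2, 1], 7)  # same stack as the first sample
```

In this sketch a repeated stack costs one integer per sample, since the first and third samples both point at the same table entry.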

The primary motivation is laying the groundwork for timestamp based profiling where the same stack trace needs to be referenced much more frequently compared to aggregation based on low cardinality attributes.

Timestamp based profiling is also expected to be used with the upcoming Off-CPU profiling feature in the eBPF profiler. Off-CPU stack traces have a different distribution compared to CPU stack traces. In particular, Off-CPU stack traces are much more repetitive because they typically occur at special leaf functions such as syscalls (async preemption being a notable exception). For the same reason it is also uncommon to see a stack trace that is a root-prefix of a previously observed stack trace.
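As a rough illustration of the two properties mentioned here (repetition and root-prefix relationships), the Python sketch below counts both over a handful of invented samples. The frame names and the sample list are made up for the example and are not measurements; `is_root_prefix` is a hypothetical helper.

```python
from collections import Counter


def is_root_prefix(candidate, observed):
    """True if `candidate` (frames listed root first) is a proper prefix of `observed`."""
    return len(candidate) < len(observed) and observed[:len(candidate)] == candidate


# Invented timestamped off-CPU samples, frames listed root first.
samples = [
    ("main", "serve", "read"),
    ("main", "serve", "read"),
    ("main", "worker", "futex"),
    ("main", "serve", "read"),
    ("main", "worker", "futex"),
]

# How often each distinct stack trace repeats.
repeats = Counter(samples)

# How many samples are a root-prefix of some earlier sample.
prefix_hits = sum(
    1
    for i, s in enumerate(samples)
    if any(is_root_prefix(s, earlier) for earlier in samples[:i])
)
```

With data shaped like this, every sample repeats an earlier stack exactly or is new, and none is a root-prefix of an earlier one. Whether real Off-CPU workloads look like this is what the benchmarks would need to confirm.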

We might need to revisit the previous benchmarks to confirm these claims, as the previous analysis seems to have shown that the location range based encoding is always either comparable or better than the mechanism proposed here.

The secondary motivation is simplicity. Arguably the proposed change here will make it easier to write exporters, processors as well as receivers.

It seems like we had rough consensus around this change in previous SIG meetings, and it seems like a good incremental step to make progress on the timestamp proposal.

- Introduce a first-class Stack message type and lookup table.
- Replace location index range based stack trace encoding on Sample with
  a single stack_index reference.
- Remove the location_indices lookup table.

The primary motivation is laying the groundwork for [timestamp based
profiling][timestamp proposal] where the same stack trace needs to be
referenced much more frequently compared to aggregation based on low
cardinality attributes.

Timestamp based profiling is also expected to be used with the
upcoming [Off-CPU profiling][off-cpu pr] feature in the eBPF profiler.
Off-CPU stack traces have a different distribution compared to CPU
samples. In particular, stack traces are much more repetitive because
they only occur at call sites such as syscalls. For the same reason it
is also uncommon to see a stack trace that is a root-prefix of a
previously observed stack trace.

We might need to revisit the [previous benchmarks][benchmarks]
to confirm these claims.

The secondary motivation is simplicity. Arguably the proposed change
here will make it easier to write exporters, processors as well as
receivers.

It seems like we had rough consensus around this change in previous SIG
meetings, and it seems like a good incremental step to make progress on
the timestamp proposal.

[timestamp proposal]: #594
[off-cpu pr]: open-telemetry/opentelemetry-ebpf-profiler#196
[benchmarks]: https://docs.google.com/spreadsheets/d/1Q-6MlegV8xLYdz5WD5iPxQU2tsfodX1-CDV1WeGzyQ0/edit?gid=2069300294#gid=2069300294