Simplify profile stack trace representation #615
Draft
The primary motivation is laying the groundwork for timestamp-based profiling, where the same stack trace needs to be referenced much more frequently than with aggregation based on low-cardinality attributes.
Timestamp-based profiling is also expected to be used with the upcoming Off-CPU profiling feature in the eBPF profiler. Off-CPU stack traces have a different distribution compared to CPU stack traces. In particular, Off-CPU stack traces are much more repetitive because they typically occur at special leaf functions (async preemption being a notable exception). For the same reason, it is also uncommon to see a stack trace that is a root-prefix of a previously observed stack trace.
We might need to revisit the previous benchmarks to confirm these claims, as the previous analysis seems to have shown that the location-range-based encoding is always either comparable to or better than the mechanism proposed here.
The secondary motivation is simplicity. Arguably, the change proposed here will make it easier to write exporters, processors, and receivers.
It seems like we had rough consensus around this change in previous SIG meetings, and it seems like a good incremental step to make progress on the timestamp proposal.