Semantic Conventions - Multi-Registry Proposal #348
`attributes` section of a group can reference an imported attribute.

In the current semantic conventions specification, referencing a group is currently unsupported. Uniqueness within
groups is scoped by the type of group. It is entirely possible to have an event and a metric identified by the same ID.
Maybe we should fix this first - This is something we could do BEFORE allowing multiple registries and I think would make our lives better overall.
cc @lmolkova for thoughts on that.
@jsuereth I think it's a feature, not a bug - #348 (comment)
E.g. `http.request.body.size` can in theory be an attribute and a metric name. Or `messaging.message.time_in_queue`.
I don't believe we have real examples of this but there were some discussions in the past where it would make sense to use the same attribute name on an event/span as some existing metric name.
It could be a good feature for spans/events->metrics pipeline (take specific attribute and convert it to metric) and I'd like to check if we can keep this door open.
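The spans/events->metrics idea above can be sketched as follows. This is a minimal illustration, not Weaver or OpenTelemetry SDK code: the span representation (plain dicts), the `MetricSummary` type, and the `attribute_to_metric` helper are all assumptions made for this example. It only shows how a metric could reuse the identifier of the attribute it is derived from.

```python
from dataclasses import dataclass
from typing import Any, Iterable, Mapping


@dataclass(frozen=True)
class MetricSummary:
    """Hypothetical aggregated metric that reuses the source attribute's name."""
    name: str
    count: int
    total: float


def attribute_to_metric(spans: Iterable[Mapping[str, Any]], attribute: str) -> MetricSummary:
    """Aggregate a numeric span attribute into a metric with the same identifier."""
    values = []
    for span in spans:
        value = span.get("attributes", {}).get(attribute)
        # Keep only real numbers; bool is a subclass of int, so exclude it explicitly.
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            values.append(value)
    return MetricSummary(name=attribute, count=len(values), total=float(sum(values)))


# Hypothetical spans; only two of them carry the attribute.
spans = [
    {"name": "POST /upload", "attributes": {"http.request.body.size": 1024}},
    {"name": "POST /upload", "attributes": {"http.request.body.size": 2048}},
    {"name": "GET /health", "attributes": {}},
]
summary = attribute_to_metric(spans, "http.request.body.size")
```

Here the derived metric is named `http.request.body.size`, exactly like the attribute, which is the kind of name sharing the comment wants to keep possible.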
> It is entirely possible to have an event and a metric identified by the same ID.
I think it's no longer the case - we have unique group ids enforced across signals. So perhaps this discussion can be resolved?
@lmolkova I removed the part mentioning the possible lack of uniqueness between different types of signals. However, we still have an issue when a registry wants to extend the `attribute_group` of another registry. The `attribute_group` ID, I believe, is unique within a specific version of the registry. However, group IDs, in general, are not truly persistent across registry versions, unlike the `name` fields (used for the signals). I added a note in the proposal to keep track of this. Please let me know if you disagree.
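The two uniqueness rules discussed in this thread (IDs unique per group type versus unique across all signals) can be contrasted with a small sketch. It is only an illustration: the `duplicate_group_ids` helper, the dict-based group representation, and the sample IDs are assumptions, not Weaver's actual data model or checks.

```python
from collections import Counter
from typing import Iterable, Mapping


def duplicate_group_ids(groups: Iterable[Mapping[str, str]], scoped_by_type: bool = False) -> list:
    """Return group IDs that violate uniqueness.

    With scoped_by_type=True, an ID only collides with groups of the same type
    (the older behavior described in the proposal text). With the default,
    an ID must be unique across all signal types.
    """
    counts = Counter(
        (group["type"] if scoped_by_type else None, group["id"]) for group in groups
    )
    return sorted({group_id for (_, group_id), n in counts.items() if n > 1})


# Hypothetical groups: the same ID is used by a metric and an event.
groups = [
    {"id": "messaging.message.time_in_queue", "type": "metric"},
    {"id": "messaging.message.time_in_queue", "type": "event"},
    {"id": "http.request.body.size", "type": "metric"},
]
across_signals = duplicate_group_ids(groups)
per_type = duplicate_group_ids(groups, scoped_by_type=True)
```

Under the cross-signal rule the shared ID is reported as a duplicate; under the per-type rule nothing is reported. Neither check says anything about whether an ID stays stable across registry versions, which is the remaining concern raised above.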
I don't feel qualified to approve this, but the motivation and vision look good to me. Thanks @lquerel!
@lquerel Is it time to revisit this and get it merged?
Codecov Report

All modified and coverable lines are covered by tests ✅

Additional details and impacted files (coverage diff):

| | main | #348 | +/- |
|---|---|---|---|
| Coverage | 74.5% | 74.6% | |
| Files | 51 | 51 | |
| Lines | 3965 | 3965 | |
| Hits | 2957 | 2959 | +2 |
| Misses | 1008 | 1006 | -2 |
I've read through this a few times now and this proposal seems sound to me. What do we need to get this proposal merged? Do we just need all the Open Questions resolved? Do we need to move any items out of scope?

Separately, Multi-Registry widens the user-base for semconv authoring and Weaver beyond those few who are very invested today. There is a steep learning curve not only to writing good quality definitions but all the periphery items like rego policies, minijinja templates, jq, etc. So, in parallel with this work I think we need to consider how to stabilize and simplify the authoring process. Perhaps we can make Weaver more "batteries-included" too so you get a standard set of document and code generation templates and validation policies out-of-the-box and so on.
weaver_registry.yaml
``` | ||

### `weaver_registry.yaml` File
I think we should decide if/how it aligns with the 3-file-schema-approach we discussed in the semconv tooling call:
- 3 files
  - Existing file - marker
    - contains pointer to diff
    - contains pointer to def
  - Model file
    - Contains the full set of defined semantic conventions for that version
  - Migration file
    - Contains transformations you can use to keep versions compatible *when they are non-breaking differences*
    - Should be cohesive/stand-alone for the transformation use case.

There seems to be a lot of overlap with what we'd put into the 2.0 version of `schema.yaml`, and we should figure out if we need another one for the registry manifest.
So far, we had:
- The current semantic conventions, which are a series of YAML files adhering to a syntax defined by a JSON schema maintained in the Weaver repository. This format includes concepts of references, with or without overrides, and group extensions.
- The telemetry schema file for a specific version of the registry, containing the transformations required to transition from one version of the registry to another (manually maintained).
With the introduction of multi-registry support, we are defining:
- A manifest file to inform Weaver of the dependencies of the registry containing this manifest. This manifest is used by Weaver to produce a resolved and packaged registry (see next bullet point).
- A packaged version of a registry (self-contained format) to facilitate its publication and import into other registries. The packaged version of a registry includes within it the version of the manifest that enabled its construction.
- An update to the file currently referenced by the `schema_url` fields in OpenTelemetry streams. This file is currently called a telemetry schema file (it needs to be renamed, IMO), and our previous discussions have led to a complete redefinition of this file. The main idea is for this file to be as minimal as possible, containing primarily pointers to files dedicated to specific needs (packaged/self-contained registry description, description of version migration transformations, and optionally a description of changes between the “current” version and the previous one).
To conclude, in my view, the role of the manifest file and the file containing pointers to artifacts produced by Weaver from this manifest file are two distinct things.
Apart from the manifest file, this proposal does not go into too much detail regarding the structure of these new files. While defining the files is important, I don’t believe that the specifics of their definition would change the general idea presented in this proposal. And to be completely honest, I’d like to have this proposal merged into the repository to feel like we’re making progress :-)
One of the next steps will be to define these different formats more precisely. Do you think this approach is satisfactory?
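To make the distinction between the manifest and the pointer file more concrete, the relationships described in this comment could be modeled roughly as below. This is a sketch under stated assumptions: the proposal does not define these formats in detail, so every class name, field name, and URL here is hypothetical and chosen only to illustrate which file contains or points to which.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Dependency:
    """Hypothetical entry in a registry manifest: one imported registry."""
    name: str
    registry_url: str


@dataclass(frozen=True)
class Manifest:
    """Hypothetical manifest: tells the tool which registries this one depends on."""
    name: str
    version: str
    dependencies: Tuple[Dependency, ...] = ()


@dataclass(frozen=True)
class PackagedRegistry:
    """Self-contained registry; embeds the manifest version that produced it."""
    manifest: Manifest
    groups: Tuple[dict, ...]


@dataclass(frozen=True)
class SchemaFile:
    """Minimal file behind `schema_url`: mostly pointers to other artifacts."""
    packaged_registry_url: str
    migrations_url: str
    changes_url: Optional[str] = None  # optional description of changes since the previous version


def package(manifest: Manifest, resolved_groups) -> PackagedRegistry:
    """Bundle already-resolved groups with the manifest used to build them."""
    return PackagedRegistry(manifest=manifest, groups=tuple(resolved_groups))


# Hypothetical names and URLs, for illustration only.
manifest = Manifest(
    name="acme",
    version="1.0.0",
    dependencies=(Dependency("otel", "https://example.com/otel-registry"),),
)
packaged = package(manifest, [{"id": "acme.checkout", "type": "span"}])
schema = SchemaFile(
    packaged_registry_url="https://example.com/acme/1.0.0/registry",
    migrations_url="https://example.com/acme/1.0.0/migrations",
)
```

In this sketch the manifest is an input (it lists dependencies), while the schema file is an output that only points at artifacts built from that manifest, which mirrors the "two distinct things" argument above.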
References to experimental entities across registries are allowed under certain conditions:

- An entity referencing an experimental entity must also be marked as experimental.
- The flag `allow_experimental_ref` in the `weaver_registry.yaml` file must be set to `true`.
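The two conditions above can be read as a simple validation rule. The following sketch assumes a simplified entity model with a `stability` field set to either `"stable"` or `"experimental"`; the `Entity` class and the `check_cross_registry_ref` function are hypothetical and are not part of Weaver. Only the `allow_experimental_ref` flag name comes from the proposal text.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Entity:
    """Hypothetical, simplified registry entity."""
    id: str
    stability: str  # assumed to be "stable" or "experimental"


def check_cross_registry_ref(referrer: Entity, target: Entity, allow_experimental_ref: bool) -> List[str]:
    """Return the list of violated conditions; an empty list means the reference is allowed."""
    if target.stability != "experimental":
        return []  # the conditions only apply to experimental targets
    errors = []
    if referrer.stability != "experimental":
        errors.append(f"{referrer.id} must be experimental to reference {target.id}")
    if not allow_experimental_ref:
        errors.append("allow_experimental_ref must be set to true in weaver_registry.yaml")
    return errors


# Hypothetical entities from two registries.
target = Entity("otel.some.attribute", "experimental")
ok = check_cross_registry_ref(Entity("acme.a", "experimental"), target, allow_experimental_ref=True)
bad = check_cross_registry_ref(Entity("acme.b", "stable"), target, allow_experimental_ref=False)
```

Returning every violated condition, rather than stopping at the first, keeps the check easy to turn into a policy-style report if the rule later moves out of the manifest.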
`allow_experimental_ref` - could we instead recommend (and include) a set of default policies to run when referencing things? I assume there will be tons of other policy-related things (charset, naming, deprecation/removal/back-compat, etc.) that depend on the project's needs, and it'd be nice to make this configurable/expandable via a policy set rather than config options in the manifest.
I really like the direction this is going in! There are a lot of details to discuss and agree on that are implementation details and can be solved one way or another. I don't know if merging this specific PR is necessary though. If it is, I'd minimize it to cover the essential parts and remove the implementation details that need discussion. E.g. I think it'd be great to define
But I feel like having smaller focused PRs would be easier.
@jsuereth @lmolkova @jerbly I’ve made a significant number of changes to the document to address most of your feedback. Regarding the open questions initially present in the document, I’ve either answered them by adding to the proposal or removed them because they were not necessarily important within the specific scope of this proposal.

Unless you see major issues to address in this new version, I’d like to get your approval and merge this proposal. This proposal has been sitting in the list of PRs for six months (mostly due to me). I believe it now provides a good representation of the direction we’re heading in terms of supporting multiple registries and the implications for the ecosystem. Having a merged version in the repo will also simplify communication. On a personal note, it would be nice to no longer see it in the list of PRs :-)

My next focus will be finalizing the PR on schema diffs to then begin the first round of implementation for this multi-registry support. What do you think?
I think this should be merged now - there's more than enough here to start iterating. There's growing interest in this area with more people working out their own workarounds.
This PR proposes to support multiple semantic convention registries in OTEL and Weaver.
If you’re like me and prefer to read a rendered version of the markdown spec, it’s available here.
Note: This proposal could eventually be transformed into an OTEP if needed.
See GH issue #215