-
Notifications
You must be signed in to change notification settings - Fork 34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Registry manifest and Schema diff #400
base: main
Are you sure you want to change the base?
Conversation
# Conflicts: # crates/weaver_semconv/src/group.rs
# Conflicts: # .clippy.toml # Cargo.toml # crates/weaver_semconv_gen/src/lib.rs # src/registry/search.rs # src/registry/stats.rs # src/registry/update_markdown.rs
Co-authored-by: Jeremy Blythe <[email protected]>
Co-authored-by: Jeremy Blythe <[email protected]>
…l-weaver into generate-otel-schema
…cies to be part of the public API.
…cies to be part of the public API.
docs/schema-changes.md
Outdated
semconv_version: v1.26.0 | ||
changes: | ||
attributes: | ||
- type: deprecated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit and personal taste, so definitely not blocking
having
- name: http.server_name
type: deprecated
looks more readable to me, maybe because it's closer to how we write attribute definitions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I will make this change in the documentation.
and the deprecation note is stored in the note attribute. | ||
- `removed`: An item in the baseline registry was removed in the head | ||
registry. The name of the removed item is stored in the name attribute. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we'll eventually need a way to describe other (breaking) changes such as attribute type, metric unit or instrument change.
Maybe we can add other
category that would be used for change types that we don't formally support yet? This would allow someone to look at the diff and understand that there was some change and maybe some manual intervention is necessary.
Otherwise, the lack of any mention could be perceived as 'there was no change, all looks good and compatible'.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will add a note and describe how this could be represented as a potential future extension.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still haven't finished some of the more mechanical parts, but did review the diff algorithm, deserialization core code.
I think we have an issue around attribute identity we may need to solve. PTAL at comments.
} | ||
|
||
/// Returns the number of non-fatal errors, or 1 if the result is a fatal error, 0 otherwise. | ||
pub fn error_count(&self) -> usize { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Should we use num_errors
?
@@ -22,7 +22,7 @@ Brief: Span attributes used by non-OTLP exporters to represent OpenTelemetry Sco | |||
- Examples: [ | |||
"io.opentelemetry.contrib.mongodb", | |||
] | |||
- Deprecated: use the `otel.scope.name` attribute. | |||
- Deprecated: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this test be updated?
@@ -39,7 +39,7 @@ Brief: {{ resource.brief }} | |||
- Sampling relevant: {{ attribute.sampling_relevant }} | |||
{%- endif %} | |||
{%- if attribute.deprecated %} | |||
- Deprecated: {{ attribute.deprecated }} | |||
- Deprecated: {{ attribute.deprecated.note }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably need to put this in the changelog or have some kind of migration guide for this then.
/// Creates a new string attribute. | ||
/// Note: This constructor is used for testing purposes. | ||
#[cfg(test)] | ||
pub(crate) fn string<S: AsRef<str>>(name: S, brief: S, note: S) -> Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: name, brief and note would have to be the same concrete type for this helper function.
You should either have three different type parameters for them or use something like impl Into<String>
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. I like the proposal.
|
||
/// Sets the deprecated field of the attribute. | ||
/// Note: This method is used for testing purposes. | ||
#[cfg(test)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like these helper methods. May replicate the pattern elsewhere.
|
||
/// Get the attributes of the resolved telemetry schema. | ||
#[must_use] | ||
pub fn attribute_map(&self) -> HashMap<&str, &Attribute> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there's some faulty assumptions here.
- This assumes that all Attributes MUST show up in an attribute group. That is true for Semantic Conventions (due to custom policy), but not enforced by weaver.
- This assumes that all attribute groups are registries. This is not true for Semantic Conventions. We have some attribute groups that just "share" attributes for other non-attribute groups. These may be selected before the registry attributes.
Sadly, I think we likely need a more rigid identifier here than name. I haven't had a chance to look through the rest of the code for implications, but I don't think we can rely on this method to only give us unique attributes or attributes from the registry.
/// | ||
/// Note: At the moment (2024-12-30), I don't know a better way to identify | ||
/// the "registry" attributes other than by checking if the group ID starts | ||
/// with "registry.". |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is because "registry" is a semantic convention concept, not a weaver one. Should we elevate this to weaver itself?
|
||
/// Get the groups of a specific type from the resolved telemetry schema. | ||
#[must_use] | ||
pub fn groups(&self, group_type: GroupType) -> HashMap<String, &Group> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Any reason why this one uses String
instead of &str
like the other helpers?
semconv_specs: Vec<(String, SemConvSpec)>, | ||
) -> SemConvRegistry { | ||
) -> Result<SemConvRegistry, Error> { | ||
static VERSION_REGEX: LazyLock<Regex> = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: We should use: https://docs.rs/semver/latest/semver/ and URL parser that can give us the last element of the path to send to the parser.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clippy found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.
docs/schema-changes-use-cases.md
Outdated
```yaml | ||
# Version n+1 | ||
groups: | ||
- id: registry.network.deprecated | ||
type: attribute_group | ||
attributes: | ||
- id: net.peer.name | ||
type: string | ||
brief: Deprecated, use `server.address` on client spans and `client.address` on server spans. | ||
deprecated: | ||
type: conditionally_renamed | ||
forward: > | ||
switch span_kind { | ||
case 'client' => attributes['server.address'] = attributes['net.peer.name'], | ||
case 'server' => attributes['client.address'] = attributes['net.peer.name'] | ||
} | ||
backward: > | ||
switch span_kind { | ||
case 'client' => attributes['net.peer.name'] = attributes['server.address'], | ||
case 'server' => attributes['net.peer.name'] = attributes['client.address'] | ||
} | ||
stability: experimental | ||
examples: ['example.com'] | ||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this example, the attribute has been deprecated in the attribute_group but the condition is related to its usage in a span. Should the forward / backward instructions be on the span definition where net.peer.name
is referenced? There may be different instructions for different spans that use net.peer.name
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. It's a good topic for discussion for our next Semantic Conventions Tooling SIG.
docs/schema-changes-use-cases.md
Outdated
```yaml | ||
# Version n+1 | ||
groups: | ||
- id: registry.db.deprecated | ||
type: attribute_group | ||
stability: experimental | ||
attributes: | ||
- id: db.instance.id | ||
type: string | ||
brief: 'Deprecated, no general replacement at this time. For Elasticsearch, use `db.elasticsearch.node.name` instead.' | ||
deprecated: | ||
type: conditionally_deprecated | ||
forward: > | ||
if attributes['db.system'] == 'elasticsearch' then attributes['db.elasticsearch.node.name'] = attributes['db.instance.id'] | ||
else drop attributes['db.instance.id'] | ||
backward: > | ||
if attributes['db.system'] == 'elasticsearch' then attributes['db.instance.id'] = attributes['db.elasticsearch.node.name'] | ||
stability: experimental | ||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like my previous comment. When the instructions involve other attributes then the change is dependent on the presence of those attributes in that signal definition. Perhaps defining the instructions on the attribute_group means it's universal for all uses via references? Maybe this can be overridden with further instructions on the span, for example, to deviate from this default?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, in the attribute_group
, the setting is intended to be universal. However, I agree that we should demonstrate how this setting can be overridden when dealing with a specific signal.
IMPORTANT NOTE: At the moment, the only important part to review is the document
docs/schema-changes-use-cases.md
, which is also available in a more readable format here.https://github.com/open-telemetry/weaver/blob/1388d1e3b7200cefda95769a2e3f3d298a1957db/docs/schema-changes-use-cases.md
Note: The scope of this PR has been reduced to focus only focus on the schema diff feature. Github issues have been created to track the features that have been postponed #482, #483.
This PR implements the command
registry diff
, see the following example:In this example, the diff is displayed in markdown format. The following formats are supported: json, yaml, markdown, ansi, ansi_stats
A detailed description of the schema diff data model and the diffing process is visible here.
Notes:
weaver_otel_schema
is not essential for this PR; it was initially included as part of the preparations for theregistry schema-update
command. We have decided to implement this command in a future PR. However, for simplicity, I prefer to keep the preparation code in place instead of removing it. Same thing forall_changes
inweaver_version
.List of modifications to apply to the semantic conventions repository after the release of the Weaver containing the current PR:
registry-manifest.yaml
file with the version of the next release.Closes: #186