Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[all] limitations of our type exports and proposed ways forward #5283

Open
pichlermarc opened this issue Dec 18, 2024 · 2 comments
Open

[all] limitations of our type exports and proposed ways forward #5283

pichlermarc opened this issue Dec 18, 2024 · 2 comments
Labels
feature-request target:next-major-release This PR targets the next major release (`next` branch)

Comments

@pichlermarc
Copy link
Member

pichlermarc commented Dec 18, 2024

Description

The way we currently export our types in this repository is difficult to maintain and is the root-cause of important issues that need to be addressed. This issue is intended as a

  • documentation of the status-quo
  • a starting point to improvement in upcoming releases (SDK 2.x, 3.x and beyond)

Status-quo

As it stands today, we export classes directly, including associated types, with all internal "private" properties as part of our "public" API. This routinely causes trouble, as we expect the exact version of a package to be used when using other packages (non-exhaustive list: #5062, #2845, #3944, #5090). While breaking is expected when a packages are pinned and an exact class type is provided, this is still a source of end-user pain, because:

  • we don't always have a more generic interface that could be used instead
  • in places where we have generic interfaces it's still easy to get it wrong by using the more concrete type, even though there's (in my opinion) few compelling reasons to have it exported at all (more on that later)

One key problem here is that experimental packages like @opentelemetry/sdk-node depend on pinned versions of other SDK packages and expect types from these packages as part of the public API. If a package used by the end-user not match that version exactly (see this demostrator) then that type may be rejected by the typescript compiler due to re-declaration of private properties on that class.

This problem cannot be fully fixed by just un-pinning the dependency, since end-users may still have another version somewhere in their lockfile and that old version may be incompatible, even though there were no changes to the class between these versions. So while an un-pinned version allows for de-duping of the package, the break may still happen, but but it may now happen depending on the lockfile history of the package and the pre-existing node_modules directory, which makes troubleshooting extremely difficult for us, since the user would have to supply their lockfile for us to investigate (which often a level of detail they are not allowed to provide).

This sudden increase in complexity from un-pinning means that attempting to solve this through unpinning alone may immobilize us to an extent where we cannot make any notable improvements to OTel JS, as we will be busy triaging issues and troubleshooting suddenly-occuring end-user type problems. Issue triage and bug handling is already a key limiting factor for the project today, even with pinned dependencies.

Keeping packages pinned, however, is also not a solution. While pinned packages make it easier to spot such incompatibilities, it does introduce another pain-point for end-users: packages that cannot be de-duped and can blow up the size of node_modules to an extent where certain limits, like the maximum size of a lambda (250 MB), are reached, which may prevent a new or exiting OTel user from deploying their app. Similar issues apply to use of OTel JS on the web, but it also drives cost significantly on server-side deployments.

Exporting class types also poses a different problem: it promotes sub-classing our exports to modify behavior, which introduces a set of users that we have to account for during pull request reviews - and that limits us in which kind of optimizations we can apply. In most cases, users are extending existing classes using inheritance even tough they're not actively trying to make use of the polymorphic structure. There, the cleaner approach would be to use composition over inheritance, as the base that is desired is always the extension interface, rather than the base class. We even do that ourselves, which leads to APIs being exposed to end-users that would otherwise stay private. An example of this is BasicTracerProvider and NodeTracerProvider. There's no real benefit of having BasicTracerProvider be part of the polymorphic structure of the NodeTracerProvider, as the base interface the user will likely be interested in is TracerProvider from @opentelemetry/api.

A side note on why this is less of a problem in contrib

This problem is only minimally present in the contrib repo, as most packages there are instrumentation packages.

These often only depend on only three packages:

  • @opentelemetry/api
  • @opentelemetry/core
  • @opentelemetry/instrumentation

Only @opentelemetry/api and @opentelemetry/instrumentation are being used in the public API of the packages. With @opentelemetry/api being a peer dependency in all of these packages, it's fairly uncommon for multiple instances of @opentelemetry/api ending up in the node_modules of the end-user. For @opentelemetry/instrumentation, this problem occurs more often, but has less impact through the Instrumentation interface which stays compatible across versions through a lack of private private properties. If a user were to consume InstrumentationAbstract, they may run into similar issues as described above, but simply having an alternative (the Instrumentation interface) solves a large part of the problem.

Proposed Solution

As stop-gap solution, we introduce interfaces as a replacement for concrete types (classes, abstract classes) that we currently consume in the public API of our packages. We review all our types and functions and if they are public, we ensure that we only consume interfaces over concrete classes in the public APIs (the resolution #3597 is an example for such a change). While doing this we should also make sure to mark anything that's private today as conventionally private (starting with an underline), as these properties will of course still be part of the object and end-users (or plain-js users) may get the wrong idea that it's safe to depend on this. This alone should resolve most of the type issues that prevent us from un-pinning dependencies today. This is the change that I propose should go into SDK 2.0.

As a follow-up (3.0 and beyond), we stop exporting classes wherever possible, and pivot to using factory-functions with separately defined return-types/interfaces instead. This will cut down a large amount of our public API, and since via this approach we won't expose private properties anymore, which usually cause this conflict. This means that two subsequent feature versions may be compatible with each other (types from the the newer may get assigned to an older one, and vice-versa - as long as they're actually compatible). An example of this approach can be seen in the re-structured OTLP exporter packages that I have been working on over the past few months (example).

I further propose we adopt Semantic Versioning for TypeScript types (SemVer TS). This specification outlines principles that I had partially already been applying independently in the re-structured exporters. These principles seemed intuitively correct based on my experience guiding end-users through resolving type-related issues. SemVer TS goes a step further by introducing proper documentation for types that should not be constructed directly by end-users. This is something we should also adopt for @opentelemetry/api to make expectations for SDK implementers more explicit.

A side note on differences between the practical guidance of SemVer TS and what I'm proposing

SemVer TS does offers practical guidance on how to still export class types so that they can be referenced by consumers. I don't think exporting classes like this is what we'd want to do as most of the types we export are based on so-called plugin interfaces by the OTel Specification. There the only part of the polymorphic structure that's important is that it conforms to that interface, so by exposing the class type we just add to the public interface without really getting any benefits for doing that.

Moving largely to factory-functions and hiding implementation details through hand-rolled return types is a breaking change, but has significant upsides:

  • we can apply significant performance and bundle-size improvements in feature-releases as we don't have to consider the internals part of the public API anymore, as anything will be hidden behind an interface.
  • it forces us to clean up any tests that assert on internal properties, which slows down development today as changes in one package often break tests in another one, that bloats up PRs and makes review more difficult. It also set a better example of how tests should be written and it forces us to write cleaner tests going forward
  • most crucially, it enables us to un-pin our dependencies as the types used in the public API become stable and very predictable not only for our end-users, but also for us, which gives us breathing room to apply this change without being overwhelmed by a flood of hard-to-debug typescript compile issues.
  • it makes approaches of packages like @opentelemetry/sdk-node more robust. We currently re-export types from SDK packages to circumvent this (I assume this was the original motivation), but with the approach I described, this is not necessary anymore and cleans up the public API of the package significantly.
  • adopting SemVer TS means we have to set exactOptionalPropertyTypes: true, which will solve Set the exactOptionalPropertyTypes tsconfig option #3713

Doing so also has some downsides:

  • it prevents end-users/third-party distributions from subclassing our exports, which means that that users who do this today, would need to shift towards a composition-based approach over their current inheritance-based approach.
    • it also prevent us from sub-classing and we may have to change exports that exist today to a composition-based approach. MetricReader is an example of something that would need to change.
  • it breaks the end-users's setup code, so this can only be done in an SDK major release. Fortunately, the change for end-users that don't extend classes can be a fairly straight-forward search and replace operation. See feat(sdk-metrics)!: export factory functions over classes  #4932 for an example of what changes for us and the end-user.

Once we've applied these changes, I propose we start un-pinning dependencies to packages that have received these changes, which will allow more flexibility in the types accepted by packages and will allow for easier de-duping of packages without introducing an unsustainable maintenance overhead for us.

What this will not solve:

@pichlermarc pichlermarc changed the title [work-in-progress][all] limitations of our type exports and proposed ways forward [all] limitations of our type exports and proposed ways forward Jan 9, 2025
@pichlermarc pichlermarc added feature-request target:next-major-release This PR targets the next major release (`next` branch) labels Jan 9, 2025
@trentm
Copy link
Contributor

trentm commented Jan 10, 2025

I'm very supportive of this plan. (I haven't read through much of semver-ts.org yet.)

As a follow-up (3.0 and beyond), we stop exporting classes wherever possible, and pivot to using factory-functions with separately defined return-types/interfaces instead.

Those factory-functions could be introduced in the 2.x line, slowing deprecating (but not removing) the class-based creation, leading up to 3.0.

@pichlermarc
Copy link
Member Author

Those factory-functions could be introduced in the 2.x line, slowing deprecating (but not removing) the class-based creation, leading up to 3.0.

Yes that was also my thinking - doing it for 2.0 immediately would be a large undertaking that may not be possible to do until Feb 17, and it may break a lot of users without warning.

By introducing the factory functions and deprecating the classes in 2.0 feature versions we give users adequate time to make changes in their apps and give feedback on the proposed approach (and also ourselves over at the contrib repo and opentelemetry.io). 🙂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature-request target:next-major-release This PR targets the next major release (`next` branch)
Projects
None yet
Development

No branches or pull requests

2 participants