Support OCI registries for provider and module distribution. #308
Comments
Are there well-established OCI registries that you can easily operate, though? I ask because we are on-prem and container registries are one of the few things we are not operating ourselves (we still lean on DockerHub for that). I didn't exactly find an S3-backed Docker registry (based on the registry image on DockerHub) smooth sailing when I tried ~2 years ago (I could give it another go to see if things have improved since then), and Harbor is complex enough that the official instructions advocate using Kubernetes to operate it (which you need to think really hard about if you are inclined to use it as a foundational technology for your platform in a small team, which your Terraform provider registry will be). I haven't really looked at Dragonfly yet. Ironically, compared to that, S3 hasn't been that bad with MinIO or Ceph (the latter comes with most OpenStack installations and the former is pretty nice to operate). Edit: Actually, I now recall that GitLab also comes with a container registry, though it's a bit overblown if you don't need the git functionality that comes with it. I think Nexus OSS also supports it, though it is not HA. |
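As a point of reference for the S3-backed route mentioned above, here is a minimal sketch of the stock `registry` image configured against a MinIO (S3-compatible) backend. This is an assumption about the typical setup rather than a tested deployment, and the endpoint, bucket, and credentials are placeholders:

```sh
# Sketch only: the stock "registry" image with an S3-compatible (MinIO) backend.
# Endpoint, bucket, region, and credentials below are illustrative placeholders.
docker run -d -p 5000:5000 --name registry \
  -e REGISTRY_STORAGE=s3 \
  -e REGISTRY_STORAGE_S3_REGIONENDPOINT=https://minio.internal.example:9000 \
  -e REGISTRY_STORAGE_S3_REGION=us-east-1 \
  -e REGISTRY_STORAGE_S3_BUCKET=registry \
  -e REGISTRY_STORAGE_S3_ACCESSKEY=minio-access-key \
  -e REGISTRY_STORAGE_S3_SECRETKEY=minio-secret-key \
  registry:2
```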
@Magnitus- GitHub offers it as well. |
Yep, exactly what @ilmax said. There are many container registry implementations and it's a widely-used standard (on AWS you'd probably use the Elastic Container Registry, rather than S3). There's also an official headless registry implementation by Docker, useful for creating internal mirrors of the images you use, though I believe Harbor would be the fully-featured self-hosted option. I think it makes much more sense to reuse all of this effort than to have to author registries with a custom protocol, but as @ilmax said, this is meant to be an alternative, not a replacement for the registry protocol. |
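To illustrate the "internal mirror" point: the headless Docker registry mentioned above can run as a pull-through cache with a single setting. A sketch of the documented setup, using the usual port and the Docker Hub upstream:

```sh
# Sketch: the official headless registry as a pull-through cache of Docker Hub.
docker run -d -p 5000:5000 \
  -e REGISTRY_PROXY_REMOTEURL=https://registry-1.docker.io \
  registry:2
```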
We go with DockerHub because it is cheap ($25/month) and expensing any recurring service is a pain here. However, the automation is non-existent (a really old-school dashboard experience, which is OK if you have a small number of repositories in a small team). Large cloud suppliers are less attractive for services that download non-negligible amounts of data when you are otherwise not on them, because they charge ridiculous egress fees (i.e., $0.09/GB for AWS, and it goes down with volume, though not by that much), although I should periodically take a look at what the more reasonable cloud suppliers (like DigitalOcean with its $0.01/GB egress fee) are offering. I'll have to look at GitHub. But yes, at least there is choice.
Yes, these are the two solutions I've looked at on-prem. The former was beautiful on my local machine using the filesystem for storage, but failed spectacularly once I tried to scale it against a Ceph object store. It looked so elegant and yet it didn't work, and we were really short on time, so we went with DockerHub. It broke my heart. Harbor looks very robust, though it also looks like it's trying hard to outdo OpenStack, Kubernetes, or Kafka in terms of operational complexity. I'll have to try the registry implementation again; it's been about 2 years now. Perhaps that has changed.
Well, if it's optional, then more choice is always nice. And yes, it is more standard (although finding a robust open-source offering that is not too hard to operate has felt a little like pulling teeth so far). |
@Magnitus- also worth noting that a Terraform module shouldn't be as big as a regular container image; most modules are a matter of KBs vs. tens or hundreds of MBs for a regular container, so I am not expecting a cloud container registry used to store Terraform modules to be very expensive |
I think it's more like a couple of MBs, but fair point. |
Defining an ignore file like |
On the costs side, I'd like to bring to your attention a no-egress-fee object storage, in case it might be useful later: (Disclaimer: I have no affiliation with CF, other than customer/user) |
Chiming in to express my support for this proposal. I've brought it up on hashicorp/terraform in the past; some possibly useful additional context and information can be found there: hashicorp/terraform#31463 |
@itspngu Glad to hear that! If you can, please edit your comment to describe the issue here in place (or the important points from it), we generally try for the issue discussion here to be self-contained and not link to legacy Terraform issues / PRs / documents. |
I've already opened #376 (and closed it again after finding this issue), the contents are replicated there. However, the original issue's comments/discussion contain some useful additional information, and it'd feel weird to copy a series of posts by other people here if that makes any sense. |
GitLab's container registry supports OCI pretty well; I'm using it to host one of my custom Steampipe plugins. |
I should have re-read that. I was thinking of providers (Go binaries), which are measured in MBs. Modules, which are just HCL files, would indeed be measured more in KBs. |
Harbor is the registry we're using outside of cloud. It's a CNCF Graduated project and is pretty solid. I'd contribute to this PR to help with the OCI bits and to get it working/tested on Harbor. |
We'd appreciate the community adding reactions to the post, or posting comments with their use-cases to support the implementation of this. |
@cube2222 does this need an RFC in order to be discussed and (hopefully) accepted and implemented? |
Hey @janosdebugs I can explain my use case if you want |
@ilmax please do! |
@janosdebugs OCI registries are a ubiquitous pattern with an effectively zero barrier to entry (GHCR). What alternative approach offers this ease of use? I think the Helm community experience could be used as a case in point, and in that scenario there was already a relatively simple self-hosted option with automation support. |
Hey @stevehipwell, thank you. We are looking for specific things that not having OCI registries as a distribution method prevents OpenTofu users from doing. |
For me, it's less about not having OCI registries and more about having "yet another registry and format" that I have to maintain. I don't think the current mechanism needs to be deprecated, but making OCI an option relieves a team from having to provision, configure, and deploy another registry. Helm is a good example here because, like tofu's, a Helm registry was a static HTTP site with a specific document structure. When they moved to OCI (though it was less of a move and more of an additional target), they didn't have to manage a separate registry. GHCR is one of many compatible OCI registries, meaning hosting a new tofu package would be part of the pipeline to release a new package--the distribution of the package is handled by OCI. I (and others) pressed HashiCorp about Terraform using an OCI registry. They were never motivated because it competes with TF Enterprise. Tofu doesn't have that issue. |
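For the Helm parallel drawn above, the post-OCI flow looks roughly like this (Helm 3.8+; the registry path and chart name here are placeholders):

```sh
# Sketch of Helm's OCI-based flow; org/chart names are placeholders.
helm registry login ghcr.io
helm push mychart-0.1.0.tgz oci://ghcr.io/my-org/charts
helm install my-release oci://ghcr.io/my-org/charts/mychart --version 0.1.0
```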
If it's a question of contributors to make the changes, then I volunteer as tribute and encourage others who can work on it to react to this comment. |
Hey folks, we had a discussion with the core team about this and we believe that this would be a valuable addition to OpenTofu and we should implement it. That being said, there are a few things that are unclear and require an RFC.
If anyone feels up to the task of writing this RFC, please let us know. |
1 and 2 are probably up for discussion as there are a few of us with ideas. I'm down to contribute on the RFC/code portion as well (as I suspect many are). Going on vacation in a few, but I'll check back after, and if no one else starts it, I can take a first pass. |
At any rate, these should be addressed in the RFC with a detailed rundown of how the implementation should work. |
Unassigning @cube2222 so someone else can write an RFC based on the PoC
I am currently working on a PoC around this issue to support OCI-based images in my own module registry, terrapak, and I am happy to contribute my findings to an RFC. Some thoughts I have, based on the questions raised:
tofu registry login | pull | push | add | rm |
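To make the proposed subcommands above concrete, a purely hypothetical session might look like the following; none of these commands exist in OpenTofu, and the registry paths are invented for illustration:

```sh
# Hypothetical only: proposed "tofu registry" subcommands, not an implemented CLI.
tofu registry login ghcr.io
tofu registry push ghcr.io/my-org/modules/vpc:1.2.0 ./modules/vpc
tofu registry pull ghcr.io/my-org/modules/vpc:1.2.0
```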
Hey @eunanhardy, thank you very much. |
In my opinion, an |
Regarding the RFC, one of the bigger unknowns is the actual structure/schema of the OCI artifact. OCI storage mostly consists of content-addressable blobs, manifests (JSON documents that typically reference blobs or other manifests), and mutable tags that point at blobs or manifests. E.g., multi-platform Docker images have a central manifest (for the overall image) that points to one manifest per platform (among other metadata). Each per-platform manifest contains a list of references to its layers (among other metadata). Layers are blobs which contain tarballs of directory structures.

In my PoC I made a somewhat sensible structure that supports providers for many operating systems and architectures, but it wasn't well thought out. It was loosely inspired by multi-platform Docker images, though simplified. It would be good if someone (either as a comment, or as part of an RFC) would suggest a concrete schema and structure for the manifests we'd store: what should the mediaType and artifactType be? Where should we reuse existing ones, and where should we introduce our own? Maybe just using the format used by container images is the way to go (after all, a provider is in fact a tarball of files with an entrypoint, which is almost what a container image is if you squint hard enough)? Or maybe not. It really just needs careful consideration and a proposal. |
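For illustration only, here is one possible shape such a provider manifest could take, loosely modeled on OCI 1.1 artifact manifests. The artifactType, layer media type, digests, and sizes are made-up placeholders, not a proposed standard (the config entry uses the well-known digest of the empty JSON blob `{}`):

```sh
# Illustration only: a made-up manifest shape, not a proposed or adopted schema.
# All media types, names, digests, and sizes below are placeholders.
cat > manifest.json <<'EOF'
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.manifest.v1+json",
  "artifactType": "application/vnd.opentofu.provider",
  "config": {
    "mediaType": "application/vnd.oci.empty.v1+json",
    "digest": "sha256:44136fa355b3678a1146ad16f7e8649e94fb4fc21fe77e8310c060f61caaff8a",
    "size": 2
  },
  "layers": [
    {
      "mediaType": "application/zip",
      "digest": "sha256:<digest-of-the-provider-zip>",
      "size": 12345,
      "annotations": {
        "org.opencontainers.image.title": "terraform-provider-example_1.0.0_linux_amd64.zip"
      }
    }
  ]
}
EOF
# A raw manifest like this can be pushed with the ORAS CLI:
oras manifest push ghcr.io/my-org/providers/example:1.0.0 manifest.json
```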
It may be an interesting read for inspiration - how OCI images are used to ship WASM plugins to e.g. Istio - it's basically a |
The CUE project just landed their implementation of modules. They have some nicely written documentation around it. They open-sourced their OCI code here: https://github.com/cue-labs/oci |
Personally, I like the WASM approach mentioned by @programmer04 because it would allow standard security tools in OCI registries to work on it. The only concern is that someone might misunderstand that layout as something where they can ship additional files alongside the binary, which is, of course, not the case. It's also worth mentioning that unpacking a container image isn't necessarily the simplest process. |
You're right, @janosdebugs. I think unpacking is not great, but it is not terrible - see the implementation |
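To put "not terrible" in perspective: flattening and unpacking an image with existing tooling is close to a one-liner, e.g. with crane from go-containerregistry (the image name here is a placeholder):

```sh
# Sketch: flatten an image's filesystem with crane and extract it locally.
mkdir -p ./unpacked
crane export ghcr.io/my-org/some-image:latest - | tar -x -C ./unpacked
```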
Is that not the case? You can already do that by including additional files in the provider tarballs, and I think that's fine; nothing prohibits you from shipping them and having the provider use those files when running. Either way, I really like that |
I like the … I am also happy to implement this feature. I am already implementing all the core components in other projects and would be happy to dedicate some time to this. |
I might be misunderstanding the conversation about the structure/schema of the OCI artifact, but is this the sort of situation ORAS is trying to standardize? Helm uses this under the hood for storing Helm charts in OCI registries. It feels like a similar concept?

The usage for a module would then end up something like the below (using the oras CLI here, but it would really be their Go SDK):

```sh
oras push <your-registry>/<your-module-name>:<your-module-version> \
  --artifact-type application/vnd.opentofu.module \
  <path-to-terraform-module>
```
|
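The consuming side would presumably be the symmetric pull (same placeholders as above; `-o` selects the output directory):

```sh
oras pull <your-registry>/<your-module-name>:<your-module-version> -o ./module
```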
Seconding @nick-williams-spirit on the ORAS "standardisation" that uses media types. An alternative project that I haven't seen mentioned is FluxCD's usage of OCI to store kustomize bundles. They're looking to push this to kustomize upstream, so their layer design might be worth looking at. At the end of the day, from what I understand, it's quite ORAS-compatible in the sense that it's a tarball with the correct labelling of the layer. |
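For comparison, FluxCD's flow for publishing a directory as an OCI artifact looks roughly like this (a sketch; the repository path is a placeholder):

```sh
# Sketch: FluxCD packaging a directory of manifests as an OCI artifact.
flux push artifact oci://ghcr.io/my-org/manifests/app:v1.0.0 \
  --path=./deploy \
  --source="$(git config --get remote.origin.url)" \
  --revision="$(git rev-parse HEAD)"
```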
Hey folks, I filed a preliminary draft as #2163, but it still needs a lot of work on the technical side as far as OpenTofu-specific questions are concerned. Feel free to leave your comments on the PR as reviews. |
Hey folks, we need your help! As we are finalizing the technical design of this feature, we want your input. Please help us by filling out this short survey if you are interested in using OCI with OpenTofu. This survey is open until January 31, 2025. |
Right now OpenTF relies on a specific registry implementation to host providers and modules. That works fine and is easily hostable using static files on something like S3, but generally, it's not trivial to self-host, and it's OpenTF+Terraform-specific.
There is, however, a widely used common component - OCI registries, like DockerHub or the GitHub Container Registry - that is basically perfect for the use case, as they really are generic storage for content-addressable, hash-identified blobs. If OpenTF supported these, then you could use any container registry you already have on hand to host providers and modules, either as a way to host custom ones, or in order to have an internal mirror of the public registry.
This does not mean that we'd somehow use docker to run providers, or anything like that. It's just using OCI registries to host the provider zipfiles and module tarballs. You can take a look at the ORAS website to learn more about using OCI registries for generic artifact distribution.
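To make "just hosting the zipfiles" concrete, here is a minimal sketch with the ORAS CLI; the registry path, file name, and media type are illustrative placeholders rather than anything specified by this proposal:

```sh
# Sketch: pushing an existing provider release zip to any OCI registry.
# Registry path, file name, and media type are illustrative placeholders.
oras push ghcr.io/my-org/providers/example:1.0.0 \
  terraform-provider-example_1.0.0_linux_amd64.zip:application/zip
```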
I've already done a PoC of this for providers, the diff of which you can find here and which also requires some changes to the terraform-registry-address repository.
Here's a short video of it:
https://github.com/opentffoundation/opentf/assets/7013055/c9477fe9-df0c-416e-bb2e-c2572e6476da
I will prepare a proper RFC for this eventually, once I figure out a few details related to media types, artifact types, and the exact schema we should be using here.