
Adjust controller lock to be based on controller name #258

Status: Closed (2 commits)

Conversation

@mallardduck (Contributor) commented Jan 16, 2025:

This PR was created while exploring solutions to an issue with Prometheus Federator's use of helm-controller. More context here: rancher/prometheus-federator#141 (comment)

From what I've found, the lock mechanism is one of the remaining changes needed to more elegantly support, for lack of a better term, "multiple helm-controllers in a single cluster". Our use case is specific to PromFed; however, I suspect it's better to generalize support for the idea, since it faces the same challenges without involving the external project.

@mallardduck force-pushed the allow-managedby-locks branch from 7a8ca5c to 08ac400 on January 16, 2025 at 20:26.
Review comment on lines +1 to +2 of the example manifest:

```yaml
# This is the cluster-scoped example - in k3s/RKE2 this is unnecessary as it will be deployed out of the box.
# To use this example on k3s/RKE2 you should exclude this part, or disable the embedded controller.
```
@brandond (Member) commented Jan 16, 2025:

I'm confused about what's going on with this example - we deploy one controller to the default namespace that watches all namespaces, and another to the helm-controller namespace that watches only its own namespace?

Unless I'm missing something, this doesn't seem like a valid configuration. You could run one controller that watches all namespaces, or multiple that watch specific namespaces, but the two should not be mixed; otherwise the global one will still step on the namespaced ones.

@mallardduck (Contributor, Author) replied:

> You could run one controller that watches all namespaces, or multiple that watch specific namespaces, but the two should not be mixed; otherwise the global one will still step on the namespaced ones.

That is essentially the behavior this PR seeks to modify: expanding on the prior work around the ManagedBy field to allow multiple controllers to coexist, whether as helm-controller-style deployments or when the controller is reused as a library, as with PrometheusFederator.

Code under review:

```go
controllerLockName := "helm-controller-lock"
if controllerName != "helm-controller" {
	klog.Infof("Starting helm controller using alias `%s`", controllerName)
	controllerLockName = strings.Join([]string{controllerName, controllerLockName}, "-")
}
```
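
For readers outside the codebase, here is a minimal sketch of where a lock name like this typically ends up, assuming the lock ultimately feeds Kubernetes Lease-based leader election via client-go (as wrangler-style controllers generally do). The `runWithLock` helper and the timing values are illustrative, not helm-controller's actual code:

```go
package main

import (
	"context"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/leaderelection"
	"k8s.io/client-go/tools/leaderelection/resourcelock"
)

// runWithLock is an illustrative helper: it runs `run` only while holding a
// Lease named lockName in the given namespace. With the change above, each
// differently named controller contends for its own Lease instead of all of
// them fighting over "helm-controller-lock".
func runWithLock(ctx context.Context, cfg *rest.Config, namespace, lockName, identity string, run func(context.Context)) error {
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		return err
	}
	leaderelection.RunOrDie(ctx, leaderelection.LeaderElectionConfig{
		Lock: &resourcelock.LeaseLock{
			LeaseMeta:  metav1.ObjectMeta{Namespace: namespace, Name: lockName},
			Client:     client.CoordinationV1(),
			LockConfig: resourcelock.ResourceLockConfig{Identity: identity},
		},
		// Timing values are illustrative defaults, not helm-controller's.
		LeaseDuration:   45 * time.Second,
		RenewDeadline:   30 * time.Second,
		RetryPeriod:     2 * time.Second,
		ReleaseOnCancel: true,
		Callbacks: leaderelection.LeaderCallbacks{
			OnStartedLeading: run,
			OnStoppedLeading: func() {},
		},
	})
	return nil
}
```

Under this scheme, a controller named `prometheus-federator` would contend for a Lease named `prometheus-federator-helm-controller-lock`, while the default controller keeps `helm-controller-lock`.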
@brandond (Member) replied:

Just to be clear, this is only necessary if you're running multiple in the same namespace, right? I guess it was envisioned that the controllers would run in separate namespaces and watch whatever namespace they're deployed to. It sounds like you're trying to deploy multiple controllers to the same namespace, while watching other namespaces?

@mallardduck (Contributor, Author) commented Jan 16, 2025:

> this is only necessary if you're running multiple in the same namespace, right?

I would clarify this to say: "multiple in the same scope".

PrometheusFederator runs (by default) as a cluster-scoped controller. However, it sets a managedBy annotation matching its own name, which means any controller named helm-controller will never deploy those charts.

And similarly, the PrometheusFederator controller will never get the lock, because an embedded k3s helm-controller will always acquire it first.

The example manifest is just an (admittedly contrived) attempt to show that dynamic without needing to directly involve PrometheusFederator images/charts in testing/fixing this.
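
To make that managedBy dynamic concrete, here is a rough sketch of the kind of gate being described; the annotation key and the fallback behavior are assumptions for illustration, not a quote of helm-controller's implementation:

```go
// Assumed annotation key for illustration; helm-controller defines its own constant.
const managedByAnnotation = "helmcharts.helm.cattle.io/managed-by"

// shouldManage is a hypothetical predicate: a controller only reconciles
// charts whose managed-by annotation matches its own name, so a chart
// annotated "prometheus-federator" is invisible to a controller named
// "helm-controller", and vice versa.
func shouldManage(annotations map[string]string, controllerName string) bool {
	managedBy, ok := annotations[managedByAnnotation]
	if !ok {
		// Assumed fallback: unannotated charts go to the default controller.
		return controllerName == "helm-controller"
	}
	return managedBy == controllerName
}
```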

@mallardduck (Contributor, Author) commented:

Hiya @brandond - I just realized I overlooked some important details when crafting this fix and considering the "bug" I thought I saw. I see your point now about why this change doesn't make much sense. My initial understanding overlooked that the leases were already namespace-scoped.

So even when using multiple namespaced controllers, it doesn't matter if the leases/locks are all named helm-controller; they exist in different namespaces, so that's fine. Similarly, if a package (PromFed) implementing the controller deploys to its own namespace (as it should), then the lock not matching the name of the custom controller may be a cosmetic concern, but not a functional issue.

That also assumes the package uses the top-level helm-controller/controllers. Alternatively, one could work around the static helm-controller names by targeting helm-controller/controllers/chart instead and implementing their own lease, etc.

Beyond my confusion, the only other valid potential concern is that CRDs are (understandably) installed outside of the controller logic. However, that also (perhaps sub-optimally) means that every instance updates the CRDs every time it starts up. I'm just less clear on whether there's a simple/obvious solution to that aspect.


So out of all of that, before I close this I have two questions:

  1. Is it worth providing a much simpler PR that updates the lock name to match the input controller-name (without the incorrect examples)? Or, for use cases with custom controller names, is it preferred to use the chart controller directly and implement top-level controllers?
  2. Is it worth exploring a lease for CRDs (see the sketch after this list)? I'd imagine this is helpful in the context of running multiple helm-controllers on a cluster in different namespaces.
    • I'd imagine this one would ideally be global/universal - so in kube-system, with a consistent helm-controller-crds name.
    • The lifespan of the lock should be long enough that the "controller instance" that installed the CRDs initially is likely to keep it.
    • Then, in a namespaced setup, only one instance would "own" updating the CRDs at a time, potentially preventing weird situations where different projects/namespaces update the version of helm-controller in use.
    • Today, when the first new instance starts up, the CRDs are updated to the newest version (probably fine). However, if a pod with an old version dies and restarts, it will overwrite the CRDs with older versions, potentially creating errors in the previously successful deployment of the new version (in a different namespace).
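
A minimal sketch of what question 2 might look like, assuming a create-if-absent Lease in kube-system; the function, names, and duration here are hypothetical, not part of helm-controller:

```go
package main

import (
	"context"

	coordinationv1 "k8s.io/api/coordination/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/utils/ptr"
)

// tryAcquireCRDLease is hypothetical: each instance tries to create a single
// cluster-wide Lease before syncing CRDs; only the holder applies them.
func tryAcquireCRDLease(ctx context.Context, client kubernetes.Interface, identity string) (bool, error) {
	leases := client.CoordinationV1().Leases("kube-system")
	_, err := leases.Create(ctx, &coordinationv1.Lease{
		ObjectMeta: metav1.ObjectMeta{Name: "helm-controller-crds"},
		Spec: coordinationv1.LeaseSpec{
			HolderIdentity: ptr.To(identity),
			// Long-lived, so the instance that installed the CRDs tends to keep it.
			LeaseDurationSeconds: ptr.To(int32(3600)),
		},
	}, metav1.CreateOptions{})
	if apierrors.IsAlreadyExists(err) {
		existing, getErr := leases.Get(ctx, "helm-controller-crds", metav1.GetOptions{})
		if getErr != nil {
			return false, getErr
		}
		held := existing.Spec.HolderIdentity != nil && *existing.Spec.HolderIdentity == identity
		return held, nil // a real implementation would also handle expiry and renewal
	}
	return err == nil, err
}
```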

@brandond (Member) commented Jan 21, 2025:

> Is it worth providing a much simpler PR that updates the lock name to match the input controller-name?

Yeah, that seems useful.

> Is it worth exploring a lease for CRDs? I'd imagine this is helpful in the context of running multiple helm-controllers on a cluster in different namespaces.

IIRC, if the controller's RBAC doesn't allow it to manage CRDs, it will just skip syncing them on startup. So maybe in your case, just create the CRDs externally (in the embedding controller, or so on) and ensure that the running controllers' RBAC does not include write access to CRDs.
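
One way a controller could detect that situation is a SelfSubjectAccessReview probe; this is illustrative of the skip-if-forbidden behavior described above, not necessarily how helm-controller implements it:

```go
package main

import (
	"context"

	authorizationv1 "k8s.io/api/authorization/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// canWriteCRDs asks the API server whether the current identity may update
// CRDs; a controller could skip CRD sync when this returns false.
func canWriteCRDs(ctx context.Context, client kubernetes.Interface) (bool, error) {
	review := &authorizationv1.SelfSubjectAccessReview{
		Spec: authorizationv1.SelfSubjectAccessReviewSpec{
			ResourceAttributes: &authorizationv1.ResourceAttributes{
				Group:    "apiextensions.k8s.io",
				Resource: "customresourcedefinitions",
				Verb:     "update",
			},
		},
	}
	resp, err := client.AuthorizationV1().SelfSubjectAccessReviews().Create(ctx, review, metav1.CreateOptions{})
	if err != nil {
		return false, err
	}
	return resp.Status.Allowed, nil
}
```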

@mallardduck (Contributor, Author) commented:

> if the controller's RBAC doesn't allow it to manage CRDs, it will just skip syncing them on startup.

Oh, that's a neat idea. We can definitely explore that as an option for the PromFed situation, since it seems viable.

Appreciate the feedback here. I'll rebuild this PR with the smaller scope of adjusting the name. Then at some point soon, but not now, I may still explore options for a CRD-focused lease/lock.

@mallardduck (Contributor, Author) commented:

A simpler PR to update the remaining static controllerName references: #260
