Clean up Endpoints object #16

Open · wants to merge 2 commits into base: main
81 changes: 78 additions & 3 deletions pkg/ha/ha_service.go
@@ -12,7 +12,8 @@ import (

"github.com/go-logr/logr"
corev1 "k8s.io/api/core/v1"
"k8s.io/apimachinery/pkg/api/errors"
apierrors "k8s.io/apimachinery/pkg/api/errors"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"sigs.k8s.io/controller-runtime/pkg/client"
ctlmgr "sigs.k8s.io/controller-runtime/pkg/manager"

@@ -68,7 +69,7 @@ func (ha *HAService) setEndpoints(ctx context.Context) error {
// Bypass client cache to avoid triggering a cluster wide list-watch for Endpoints - our RBAC does not allow it
err := ha.manager.GetAPIReader().Get(ctx, client.ObjectKey{Namespace: ha.namespace, Name: app.Name}, &endpoints)
if err != nil {
- if !errors.IsNotFound(err) {
+ if !apierrors.IsNotFound(err) {
return fmt.Errorf("updating the service endpoint to point to the new leader: retrieving endpoints: %w", err)
}

@@ -98,6 +99,7 @@ func (ha *HAService) Start(ctx context.Context) error {

select {
case <-ctx.Done():
+ _ = ha.cleanUpServiceEndpoints()
return fmt.Errorf("starting HA service: %w", ctx.Err())
case <-ha.testIsolation.TimeAfter(retryPeriod):
}
@@ -108,5 +110,78 @@ func (ha *HAService) Start(ctx context.Context) error {
}
}

- return nil
+ <-ctx.Done()
+ err := ha.cleanUpServiceEndpoints()
+ if err == nil {
+ err = ctx.Err()
+ }
+ return err
}

// cleanUpServiceEndpoints is executed upon ending leadership. Its purpose is to remove the Endpoints object created upon acquiring
// leadership.
func (ha *HAService) cleanUpServiceEndpoints() error {
// Use our own context. This function executes when the main application context is closed.
// Also, try to finish before a potential 15 seconds termination grace timeout.
Collaborator

Where do these 15s of potential termination grace timeout come from?
According to https://github.com/gardener/gardener-custom-metrics/blob/c43b2064794e5534f2a0d7a831285210620f9ed8/example/custom-metrics-deployment.yaml#L72 we should have 30s from the SIGTERM signal until the SIGKILL.

Contributor Author (@andrerun, Apr 4, 2024)

15 is a nice, round number, and also half of the default 30. I'm speculating that upon a hypothetical future shortening of the grace period, 15 will be a likely choice (the other obvious choice being 10, of course).

This is not a critical choice. I'm simply picking a value which is likely to work slightly better with potential future changes.

Collaborator

There is an option for the manager which allows specifying the termination grace period for all runnables: GracefulShutdownTimeout, and it defaults to 30s: https://github.com/kubernetes-sigs/controller-runtime/blob/76d3d0826fa9dca267c70c68c706f6de40084043/pkg/manager/internal.go#L55
Not sure if it makes sense to use a (doubly) short time for this function.

Either way, if you have a strong reason not to use the default or not to make it configurable, and to keep it at 15s, can you please add the reason as a comment? Otherwise people will wonder where the magic number comes from.
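
For reference, a minimal sketch of what the reviewer describes (an assumption for illustration, not code from this PR): the manager-wide shutdown budget is the single GracefulShutdownTimeout field on controller-runtime's manager options. The newManager helper and the 15s value here are hypothetical.

```go
package ha

import (
	"time"

	ctrl "sigs.k8s.io/controller-runtime"
)

// newManager is a hypothetical constructor showing how an explicit shutdown
// budget could be wired in; GracefulShutdownTimeout is part of
// controller-runtime's manager.Options and defaults to 30s when unset.
func newManager() (ctrl.Manager, error) {
	gracefulShutdownTimeout := 15 * time.Second // hypothetical value, half the 30s default
	return ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		GracefulShutdownTimeout: &gracefulShutdownTimeout,
	})
}
```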

ctx, cancel := context.WithTimeout(context.Background(), 14*time.Second)
defer cancel()
seedClient := ha.manager.GetClient()
Collaborator

Nit: I wouldn't call it a seedClient. In all other places we call it client. Tomorrow, we might need to support the runtime cluster to scale gardener-apiserver or virtual-kube-apiserver.

Suggested change
seedClient := ha.manager.GetClient()
client := ha.manager.GetClient()

Contributor Author

Agreed, but since this program is talking to multiple clusters, I don't want to use "client". I agree with your point that the name will need to change in the future, but at that time I'll also have the context which will allow me to come up with the right generalisation, without resorting to the excessively general (IMO) "client".


attempt := 0
var err error
for {
Contributor Author (@andrerun, Apr 4, 2024)

I admit that it may have been better to use a poll function, but I don't think it's worth refactoring. Replacing a "for", with which everybody is familiar, with a callback-based function in a domain-specific library is a matter of preference.

Collaborator

I'd personally prefer to use Poll<...> here unless there is a strong argument for the max number of attempts. Then I would stick with the for loop.
Generally, one benefit is that Poll<...> should already be tested. Another is that when someone tries to do a similar wait and sees the for{...} loop he might decide to copy it and change it a bit, instead of simply reusing the Poll<...> function. As for domain-specificity - I think both GCMx and the functions in the https://github.com/kubernetes/kubernetes/blob/v1.28.0/staging/src/k8s.io/apimachinery/pkg/util/wait/poll.go share the same domain.

Collaborator

Imagine that we have (or introduce) a bug in the func by forgetting to break/exit in one case. This for { would then result in an endless loop. Why should we have custom retry logic when there is a package that already does this?

Contributor Author

@ialidzhikov @plkokanov
Poll() is a better fit. No good argument against it. It was just that I didn't know if refactoring is worth it.

But the fact is that I haven't used Poll() before, so I can't really judge. I expect you're probably right and the improved readability justifies the refactoring.

I'll use Poll().
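
For illustration, a rough sketch of that direction (an assumption, not the final PR code): the hand-rolled attempt loop below re-expressed with the apimachinery wait helpers. tryDeleteEndpoints is a hypothetical helper that would wrap the Get + precondition-Delete logic from the loop body and report done=true once the Endpoints object is deleted, missing, or abandoned.

```go
import (
	"context"
	"time"

	"k8s.io/apimachinery/pkg/util/wait"
)

// cleanUpServiceEndpointsWithPoll is a sketch only: retry roughly once per
// second, but never longer than the 14s cleanup budget carried by ctx.
func (ha *HAService) cleanUpServiceEndpointsWithPoll(ctx context.Context) error {
	return wait.PollUntilContextTimeout(ctx, 1*time.Second, 14*time.Second, true,
		func(ctx context.Context) (bool, error) {
			// hypothetical helper wrapping the Get + precondition Delete below
			return ha.tryDeleteEndpoints(ctx)
		})
}
```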

endpoints := corev1.Endpoints{
ObjectMeta: metav1.ObjectMeta{
Name: app.Name,
Namespace: ha.namespace,
},
}
err = seedClient.Get(ctx, client.ObjectKeyFromObject(&endpoints), &endpoints)
if err != nil {
if apierrors.IsNotFound(err) {
ha.log.V(app.VerbosityVerbose).Info("The endpoints object cleanup succeeded: the object was missing")
return nil
}

ha.log.V(app.VerbosityInfo).Info("Failed to retrieve the endpoints object", "error", err.Error())
} else {
// Avoid data race. We don't want to delete the endpoint if it is sending traffic to a replica other than this one.
if !ha.isEndpointStillPointingToOurReplica(&endpoints) {
// Someone else is using the endpoint. We can't perform safe cleanup. Abandon the object.
ha.log.V(app.VerbosityWarning).Info(
"Abandoning endpoints object because it was modified by an external actor")
return nil
}

// Only delete the endpoint if it is the resource version for which we confirmed that it points to us.
deletionPrecondition := client.Preconditions{UID: &endpoints.UID, ResourceVersion: &endpoints.ResourceVersion}
err = seedClient.Delete(ctx, &endpoints, deletionPrecondition)
if client.IgnoreNotFound(err) == nil {
Comment on lines +158 to +159
Collaborator

Suggested change
err = seedClient.Delete(ctx, &endpoints, deletionPrecondition)
if client.IgnoreNotFound(err) == nil {
deletionPrecondition := client.Preconditions{UID: &endpoints.UID, ResourceVersion: &endpoints.ResourceVersion}
if err = seedClient.Delete(ctx, endpoints, deletionPrecondition); client.IgnoreNotFound(err) == nil {

Contributor Author

I'm intentionally keeping "delete with precondition" on a separate line here. It's an uncommon construct, which is likely to give the reader pause, and I don't want to force other logic onto the same line.

// The endpoint was deleted (even if not by us). We call that successful cleanup.
ha.log.V(app.VerbosityVerbose).Info("The endpoints object cleanup succeeded")
return nil
}
ha.log.V(app.VerbosityInfo).Info("Failed to delete the endpoints object", "error", err.Error())
}

// Deletion request failed, possibly because of a midair collision. Wait a bit and retry.
attempt++
if attempt >= 10 {
break
}
time.Sleep(1 * time.Second)

Collaborator

What about using a time.NewTicker, or better yet clock.RealClock's NewTicker, which allows you to use a Clock interface that can be mocked for tests?
Tickers take into account the time that was actually spent while executing the Get/Update calls. Additionally, since you will have to add a select statement for it, you can use the select to check whether the context has expired in the meantime.
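
As a rough sketch of that suggestion (an assumption, not part of this PR), the fixed time.Sleep could be replaced with a mockable ticker plus a context check between attempts; waitBeforeRetry is a hypothetical helper name, and production code would pass clock.RealClock{}.

```go
import (
	"context"
	"time"

	"k8s.io/utils/clock"
)

// waitBeforeRetry waits for the next retry slot using a mockable clock and
// aborts early if the cleanup context is cancelled.
func waitBeforeRetry(ctx context.Context, clk clock.Clock) error {
	ticker := clk.NewTicker(1 * time.Second)
	defer ticker.Stop()
	select {
	case <-ctx.Done():
		return ctx.Err() // cleanup budget exhausted; stop retrying
	case <-ticker.C():
		return nil // proceed with the next delete attempt
	}
}
```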

}

ha.log.V(app.VerbosityError).Error(err, "All retries to delete the endpoints object failed. Abandoning object.")
return fmt.Errorf("HAService cleanup: deleting endponts object: retrying failed, last error: %w", err)
}

// Does the endpoints object hold the same values as the ones we previously set to it?
func (ha *HAService) isEndpointStillPointingToOurReplica(endpoints *corev1.Endpoints) bool {
return len(endpoints.Subsets) == 1 &&
len(endpoints.Subsets[0].Addresses) == 1 &&
endpoints.Subsets[0].Addresses[0].IP == ha.servingIPAddress &&
len(endpoints.Subsets[0].Ports) == 1 &&
endpoints.Subsets[0].Ports[0].Port == int32(ha.servingPort) &&
endpoints.Subsets[0].Ports[0].Protocol == corev1.ProtocolTCP
}