How to set custom terminationGracePeriodSeconds for a knative Service pod #15555

Open
sebastianjohnk opened this issue Oct 8, 2024 · 3 comments
Labels
kind/question Further information is requested lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale.

Comments

@sebastianjohnk

Ask your question here:

Hi. I'm working on a small POC that creates some Knative Services.
The image I'm providing for the pod currently contains a small Flask app that listens on a port.
Right now I'm testing how these Services scale down.
It seems that once Knative decides to scale down the number of replicas of a pod from, say, 3 to 2, or even to 0, the pods being removed stay in a "Terminating" state for a long time, close to 4 or 5 minutes I'd say.
While terminating, they show 1/2 containers ready.
I checked the container logs. It seems the queue-proxy container is shutting down properly, but my Flask app container is not.

But anyway, I learned that every pod has a terminationGracePeriodSeconds value that decides how long a pod can stay in this "Terminating" state before Kubernetes force-kills it.

Now here is the problem. terminationGracePeriodSeconds seems to default to 300 for all pods spawned as part of a Service, with seemingly no option to set it in the Service YAML spec.

I'm able to specify this in a Pod YAML spec, deploy that pod individually, and see the value reflected in the pod (when I fetch the pod YAML using kubectl).

apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  containers:
    - name: my-container
      image: nginx
      ports:
        - containerPort: 80
  terminationGracePeriodSeconds: 120

But when I try to deploy a Service using a YAML spec whose pod template contains the same setting, something like this:

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello-world
  namespace: default
  labels:
    app: hello-world
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/minScale: "1" # Minimum number of Pods
        autoscaling.knative.dev/maxScale: "5" # Maximum number of Pods
    spec:
      terminationGracePeriodSeconds: 99 # Custom termination grace period
      containers:
        - image: gcr.io/knative-samples/helloworld
          ports:
            - containerPort: 8080

I get an error saying
Error from server (BadRequest): error when creating "helloservice.yaml": Service in version "v1" cannot be handled as a Service: strict decoding error: unknown field "spec.template.spec.terminationGracePeriodSeconds"

But if I remove this field from the Service yaml, the service gets deployed, and the pod seems to have a default terminationGracePeriodSeconds value of 300.

I also checked the default for a pod deployed directly from a Pod YAML spec (without terminationGracePeriodSeconds specified) to see whether 300 is a Kubernetes default, but it seems to be 30.
So the default seems to be 30 for individual pods and 300 for pods that are part of a Service.

I guess my question is: how is this default terminationGracePeriodSeconds value of 300 being set for pods belonging to Services, and is there any way I can change it, either in my Service YAML spec or by changing some Kubernetes/Knative configuration?

Any help would be much appreciated, thank you.

@sebastianjohnk sebastianjohnk added the kind/question Further information is requested label Oct 8, 2024
@sebastianjohnk
Author

Update

It looks like the terminationGracePeriodSeconds value is taken directly from the revision-timeout-seconds value specified in the config-defaults ConfigMap in the Knative namespace.

Is there any way I can have different values for these two?
Because I don't want my pod to be stuck in the Terminating state for more than 25 seconds.
But I might still have requests coming in to my pod that take longer than 25 seconds, and I don't want those to hit a timeout error.
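
For reference, the setting described in this update would look roughly like the fragment below. This is only a sketch, not a dump from a real cluster: the knative-serving namespace is an assumption based on a standard Knative Serving install (the update only says "the knative namespace"), and the value "300" is inferred from the 300-second grace period observed on the pods.

```yaml
# Sketch of the config-defaults ConfigMap entry referred to above.
# ASSUMPTION: namespace is knative-serving (standard install location).
# ASSUMPTION: "300" is inferred from the grace period seen on the pods.
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-defaults
  namespace: knative-serving
data:
  # Default request timeout for revisions that do not set their own;
  # the pod's terminationGracePeriodSeconds appears to follow this value.
  revision-timeout-seconds: "300"
```

Changing this value would affect every revision that relies on the default, so it is a cluster-wide setting rather than a per-Service one.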

@skonto
Contributor

skonto commented Oct 9, 2024

Hi @sebastianjohnk, Knative manages the pod termination cycle itself. It:
a) sets a preStop hook for draining connections and managing in-flight requests. This hook queries a queue-proxy endpoint to check whether the drainer has finished. The drainer (run by queue-proxy) waits 30 seconds before it returns, assuming no new requests have arrived. Any new request resets the timer.
b) sets terminationGracePeriodSeconds = rev.Spec.TimeoutSeconds so that requests have enough time to finish and are treated the same as any other request. If you don't set timeoutSeconds in the ksvc spec, the value comes from the defaults ConfigMap.
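
Point (b) suggests that the per-Service knob is the revision's timeoutSeconds field rather than terminationGracePeriodSeconds. A minimal sketch, reusing the hello-world Service from the question and assuming the value 99 is within the cluster's allowed maximum timeout:

```yaml
# Sketch: set the revision timeout instead of terminationGracePeriodSeconds.
# Per the explanation above, Knative copies this value into the pod's
# terminationGracePeriodSeconds. The value 99 is illustrative only.
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello-world
  namespace: default
spec:
  template:
    spec:
      timeoutSeconds: 99 # Request timeout; also becomes the pod's grace period
      containers:
        - image: gcr.io/knative-samples/helloworld
          ports:
            - containerPort: 8080
```

Because the grace period is derived from the same field, this does not give the two settings different values: lowering timeoutSeconds to shorten termination also shortens the time allowed for each request.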


github-actions bot commented Jan 8, 2025

This issue is stale because it has been open for 90 days with no
activity. It will automatically close after 30 more days of
inactivity. Reopen the issue with /reopen. Mark the issue as
fresh by adding the comment /remove-lifecycle stale.

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 8, 2025