OCPBUGS-48276: Make the revision controller report degraded on error #1918

Merged
merged 1 commit into from
Jan 13, 2025

deads2k
Contributor

@deads2k deads2k commented Jan 10, 2025

This makes the error handling slightly simpler, keeps the retry-on-error logic, and leaves in place the requeue on self-modification that we rely upon.

This doesn't introduce a VAP (ValidatingAdmissionPolicy) to prevent writes to "immutable" configmaps and secrets, but I'm open to doing that to prevent stale caches from creating unstable and inconsistent revisions.

new

  - apiVersion: operator.openshift.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        f:conditions:
          k:{"type":"RevisionControllerDegraded"}:
            .: {}
            f:lastTransitionTime: {}
            f:reason: {}
            f:status: {}
            f:type: {}
    manager: RevisionController-reportDegraded
    operation: Apply
    subresource: status
    time: "2025-01-10T20:20:59Z"
  - apiVersion: operator.openshift.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        f:latestAvailableRevision: {}
    manager: kube-apiserver-RevisionController
    operation: Apply
    subresource: status
    time: "2025-01-10T20:30:45Z"

old

  - apiVersion: operator.openshift.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        f:conditions:
          k:{"type":"RevisionControllerDegraded"}:
            .: {}
            f:lastTransitionTime: {}
            f:status: {}
            f:type: {}
        f:latestAvailableRevision: {}
    manager: kube-apiserver-RevisionController
    operation: Apply
    subresource: status
    time: "2025-01-11T07:38:04Z"
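
To make the difference between the two managedFields layouts concrete, here is a minimal sketch of field ownership under server-side apply. It is a toy model, not the real API machinery: the `ownership` type, its `apply` method, and the field path strings are hypothetical names made up for illustration. It assumes only the general server-side apply rule that a manager which omits a field it previously owned gives up that field.

```go
package main

import (
	"fmt"
	"sort"
)

// ownership maps a status field path to the set of field managers owning it.
// Hypothetical toy model of metadata.managedFields.
type ownership map[string]map[string]bool

// apply records that manager owns exactly the given fields, releasing any
// field it owned before but left out of this apply. It returns the fields
// that ended up with no owner at all.
func (o ownership) apply(manager string, fields ...string) []string {
	wanted := map[string]bool{}
	for _, f := range fields {
		wanted[f] = true
		if o[f] == nil {
			o[f] = map[string]bool{}
		}
		o[f][manager] = true
	}
	var orphaned []string
	for f, owners := range o {
		if wanted[f] || !owners[manager] {
			continue
		}
		delete(owners, manager)
		if len(owners) == 0 {
			delete(o, f)
			orphaned = append(orphaned, f)
		}
	}
	sort.Strings(orphaned)
	return orphaned
}

const (
	degradedField = `status.conditions[type="RevisionControllerDegraded"]`
	latestField   = "status.latestAvailableRevision"
)

func main() {
	// Old layout: one manager owns both fields, so every apply has to
	// carry both of them or it gives one up.
	old := ownership{}
	old.apply("kube-apiserver-RevisionController", degradedField, latestField)
	fmt.Println("old layout, condition-only apply orphans:",
		old.apply("kube-apiserver-RevisionController", degradedField))

	// New layout: each manager applies only the field it owns.
	split := ownership{}
	split.apply("kube-apiserver-RevisionController", latestField)
	fmt.Println("new layout, condition-only apply orphans:",
		split.apply("RevisionController-reportDegraded", degradedField))
}
```

In this model the split layout lets the degraded condition be reported without the request mentioning latestAvailableRevision at all.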

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 10, 2025
@deads2k
Contributor Author

deads2k commented Jan 10, 2025

This requires a passing proof in openshift/cluster-kube-apiserver-operator#1783 confirming that the low-level condition names are the same between 4.18 and master.

@benluddy
Contributor

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jan 13, 2025
Contributor

openshift-ci bot commented Jan 13, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: benluddy, deads2k


@benluddy
Contributor

I think this patch will fix the following scenario that causes the revision controller to attempt to decrease LatestAvailableRevision:

  1. Observe RV=N LatestAvailable=X Degraded=False
  2. Write RV=N+1 LatestAvailable=X+1 Degraded=False
  3. Observe RV=N LatestAvailable=X Degraded=False
  4. Write RV=N+2 LatestAvailable=X Degraded=True
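
The four steps above can be sketched as a small simulation. This is only an illustration under stated assumptions: the `status` type and the helpers `writeWholeStatus` and `writeDegradedOnly` are hypothetical and do not come from library-go. It assumes the pre-patch writer sends LatestAvailable together with the Degraded condition, and that the patched writer sends only the condition.

```go
package main

import "fmt"

// status is a toy stand-in for the operator status fields in the scenario.
type status struct {
	ResourceVersion int
	LatestAvailable int
	Degraded        bool
}

// writeWholeStatus models a single writer that sends every field it observed,
// so a stale observation overwrites LatestAvailable as a side effect.
func writeWholeStatus(server, observed status, degraded bool) status {
	return status{
		ResourceVersion: server.ResourceVersion + 1,
		LatestAvailable: observed.LatestAvailable,
		Degraded:        degraded,
	}
}

// writeDegradedOnly models a writer that owns only the Degraded condition
// and leaves LatestAvailable as the server has it.
func writeDegradedOnly(server status, degraded bool) status {
	server.ResourceVersion++
	server.Degraded = degraded
	return server
}

func main() {
	const n, x = 10, 5 // arbitrary values standing in for N and X

	// Steps 1 and 3: the controller observes RV=N from a stale cache.
	stale := status{ResourceVersion: n, LatestAvailable: x}
	// Step 2: the server already holds RV=N+1 with LatestAvailable=X+1.
	server := status{ResourceVersion: n + 1, LatestAvailable: x + 1}

	// Step 4 with a combined write: LatestAvailable falls back to X.
	fmt.Printf("combined write:      %+v\n", writeWholeStatus(server, stale, true))
	// Step 4 with a condition-only write: LatestAvailable stays at X+1.
	fmt.Printf("condition-only write: %+v\n", writeDegradedOnly(server, true))
}
```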

/retitle OCPBUGS-48276: Make the revision controller report degraded on error

@openshift-ci openshift-ci bot changed the title Make the revision controller report degraded on error OCPBUGS-48276: Make the revision controller report degraded on error Jan 13, 2025
@openshift-ci-robot openshift-ci-robot added jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Jan 13, 2025
@openshift-ci-robot

@deads2k: This pull request references Jira Issue OCPBUGS-48276, which is invalid:

  • expected the bug to target the "4.19.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.


@benluddy
Contributor

/jira refresh

@openshift-ci-robot openshift-ci-robot added jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. and removed jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Jan 13, 2025
@openshift-ci-robot

@benluddy: This pull request references Jira Issue OCPBUGS-48276, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.19.0) matches configured target version for branch (4.19.0)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @wangke19


@benluddy
Contributor

/cherry-pick release-4.18

@openshift-ci openshift-ci bot requested a review from wangke19 January 13, 2025 18:27
@openshift-cherrypick-robot

@benluddy: once the present PR merges, I will cherry-pick it on top of release-4.18 in a new PR and assign it to you.


Contributor

openshift-ci bot commented Jan 13, 2025

@deads2k: all tests passed!


@openshift-merge-bot openshift-merge-bot bot merged commit 020245f into openshift:master Jan 13, 2025
4 checks passed
@openshift-ci-robot

@deads2k: Jira Issue OCPBUGS-48276: All pull requests linked via external trackers have merged:

Jira Issue OCPBUGS-48276 has been moved to the MODIFIED state.


@openshift-cherrypick-robot

@benluddy: new pull request created: #1920

