Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

multi-cluster-diagnostics pod #2780

Merged
merged 37 commits into from
Dec 12, 2023
Merged

multi-cluster-diagnostics pod #2780

merged 37 commits into from
Dec 12, 2023

Conversation

jessegoodier
Copy link
Collaborator

@jessegoodier jessegoodier commented Nov 21, 2023

What does this PR change?

Creates a new diagnostic deployment for monitoring health in multi-cluster Kubecost environments

Does this PR rely on any other PRs?

KCM PR: https://github.com/kubecost/kubecost-cost-model/pull/1922

How does this PR impact users? (This is the kind of thing that goes in release notes!)

Creates a new "diagnostics" deployment that has two functions:

  1. On non-primary clusters: the deployment sends Kubecost agent health to a central object-store
  2. On the Primary Kubecost cluster: the diagnostics deployment monitors the data sent by the agents and makes this information available via and API
    More information available on our docs (link TBD)

Links to Issues or tickets this PR addresses or fixes

NA

What risks are associated with merging this PR? What is required to fully test this PR?

Minimal risk. we are using this internally for a month now.

How was this PR tested?

Many QA deployments

Have you made an update to documentation? If so, please provide the corresponding PR.

We will have a doc by GA.

@jessegoodier jessegoodier enabled auto-merge (squash) November 29, 2023 21:42
Copy link
Collaborator

@chipzoller chipzoller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests failing as of these changes because kubecostProductConfigs is currently in a commented-out state. If diagnostics must be enabled by default, then kubecostProductConfigs.clusterName must be enabled by default.

if you see this:
https://github.com/kubecost/cost-analyzer-helm-chart/pull/2780/files#diff-f6d5a9ddec47ad0c6043e1a7258057f261cc6a135446693326d5cd4d27fc1497R124

diagnostics are only enabled if federation is enabled. if federation is enabled, you must set a cluster name.

@brstuder
Copy link

brstuder commented Dec 1, 2023

Let me know what you help you need on this re: documentation

@nikovacevic
Copy link
Contributor

Should we change this PR to a draft until the dependencies are merged?

@chipzoller
Copy link
Collaborator

Should we change this PR to a draft until the dependencies are merged?

Yes

@chipzoller chipzoller marked this pull request as draft December 1, 2023 21:09
auto-merge was automatically disabled December 1, 2023 21:09

Pull request was converted to draft

@thomasvn thomasvn marked this pull request as ready for review December 12, 2023 01:28
@thomasvn
Copy link
Member

@jessegoodier @chipzoller I've cleaned this up a bit, and believe it's ready for review! Code for the diagnostics backend has been merged.

@jessegoodier
Copy link
Collaborator Author

@thomasvn this is good by me. mind approving?

@jessegoodier jessegoodier enabled auto-merge (squash) December 12, 2023 02:10
@thomasvn thomasvn dismissed chipzoller’s stale review December 12, 2023 02:14

Helm tests no longer failing

@jessegoodier jessegoodier merged commit ec23beb into develop Dec 12, 2023
28 of 29 checks passed
@jessegoodier jessegoodier deleted the hackathon-diagnostics branch December 12, 2023 02:14
@jessegoodier
Copy link
Collaborator Author

/cherry-pick v1.108

jessegoodier added a commit that referenced this pull request Dec 12, 2023
* diagnostics deployment

Signed-off-by: chipzoller <[email protected]>
Co-authored-by: thomasvn <[email protected]>
Co-authored-by: Chip Zoller <[email protected]>
jessegoodier added a commit that referenced this pull request Dec 12, 2023
* diagnostics deployment

Signed-off-by: chipzoller <[email protected]>
Co-authored-by: thomasvn <[email protected]>
Co-authored-by: Chip Zoller <[email protected]>
@chipzoller
Copy link
Collaborator

Forgot to label this.

## Ref: https://docs.kubecost.com/install-and-configure/install/diagnostics
##
diagnostics:
enabled: true
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't look disabled by default.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We had some conversations with product and engineering and the thought was launch and learn.

I do feel that the risk is low and the potential benefit is high.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can I suggest updating the values comment then so it's not confusing?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants