Guess etcd replicas number function #239

Open
kvaps opened this issue Jun 19, 2024 · 6 comments

@kvaps
Member

kvaps commented Jun 19, 2024

At the latest meeting (2024-06-18 MINUTES) we decided that we need a function that guesses the required number of etcd replicas.

It can be used for recovering a non-existent STS object and also for scaling from 0.
Design ref: #181

Proposal:

  • Create variable guessed=0
  • Check cluster-state configmap
    • if configmap exists and initial-cluster-members defined
      • if there are any hostnames defined in initial-cluster-members
        • take the hostname of pod with highest number and +1
          • save value into guessed variable
  • Check endpoints for etcd-headless service
    • if there are any endpoints
      • connect to the cluster using endpoint and collect information from member list
        • if there are any members in output from etcd
          • take the hostname with highest number and +1
            • if value is greater than value in guessed, save value into guessed variable
      • read endpoints from kubernetes object:
        • take the hostname of the pod for endpoint with highest number and +1
          • if value is greater than value in guessed, save value into guessed variable
  • read persistent volume claims that fall under the StatefulSet label selector
    • if there are any pvcs
      • take the name of the pvc with highest number and +1
        • if value is greater than value in guessed, save value into guessed variable
  • read pods that fall under the StatefulSet label selector
    • if there are any pods
      • take the pod name with highest number and +1
        • if value is greater than value in guessed, save value into guessed variable
  • return guessed
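
Every step above reduces to the same operation: take a list of names, find the highest ordinal suffix, and add one. A minimal Go sketch of that helper follows. The names `ordinal` and `replicasFromNames` are hypothetical, and it assumes names follow the usual StatefulSet pattern of ending in `-<ordinal>` (for example `etcd-2` for a pod or `data-etcd-2` for a PVC).

```go
package main

import (
	"strconv"
	"strings"
)

// ordinal extracts the trailing ordinal from a name such as "etcd-2" or
// "data-etcd-2". It reports false if the name has no numeric suffix.
// Hypothetical helper, not taken from the operator's code.
func ordinal(name string) (int, bool) {
	// Hostnames may be fully qualified; keep only the first DNS label.
	if i := strings.IndexByte(name, '.'); i >= 0 {
		name = name[:i]
	}
	i := strings.LastIndexByte(name, '-')
	if i < 0 {
		return 0, false
	}
	n, err := strconv.Atoi(name[i+1:])
	if err != nil {
		return 0, false
	}
	return n, true
}

// replicasFromNames returns the highest ordinal found plus one, or 0 when
// no name carries an ordinal.
func replicasFromNames(names []string) int {
	guessed := 0
	for _, name := range names {
		if n, ok := ordinal(name); ok && n+1 > guessed {
			guessed = n + 1
		}
	}
	return guessed
}
```

With this sketch, `replicasFromNames([]string{"etcd-0", "etcd-2"})` yields 3 even though `etcd-1` is missing, which is the point of using the highest ordinal rather than the count.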
@lllamnyp
Collaborator

I would definitely like to drop these steps altogether.

Check cluster-state configmap

  • if configmap exists and initial-cluster-members defined

    • if there are any hostnames defined in initial-cluster-members

      • take the hostname of pod with highest number and +1

        • save value into guessed variable

This seems redundant, as we already have this info from checking the Endpoints object:

read pods that fall under the StatefulSet label selector

  • if there are any pods

    • take the pod name with highest number and +1

      • if value is greater than value in guessed, save value into guessed variable

I don't like this step at all:

if value is greater than value in guessed, save value into guessed variable

IMO, if we found a value from a reliable source, such as member list, we should never fall back to a less reliable source, such as "number of endpoints". Only if the more reliable source is unavailable (e.g. we cannot get member list due to lack of quorum), should we try guessing the right number of replicas from Endpoints or PVCs.
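
To illustrate the difference from the "take the maximum" approach, here is a rough Go sketch of a strict priority order. The `source` type and `firstAvailable` function are hypothetical names for illustration only; each source reports `false` when it is unavailable (for instance, the member list when there is no quorum).

```go
package main

// source is one way of estimating the replica count. It returns false when
// the source is unavailable. Hypothetical type for illustration.
type source func() (replicas int, ok bool)

// firstAvailable walks the sources from most to least reliable and returns
// the first answer it gets. A less reliable source is consulted only when
// every more reliable one is unavailable, and never overrides it.
func firstAvailable(sources ...source) int {
	for _, s := range sources {
		if n, ok := s(); ok {
			return n
		}
	}
	return 0
}
```

Called as `firstAvailable(memberList, endpoints, pvcs)`, this would return the member list value whenever it is available, even if the PVC-based guess is larger.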

@kvaps
Member Author

kvaps commented Jun 19, 2024

@lllamnyp

I would definitely like to drop these steps:

Check cluster-state configmap

It is created at initialization and exists from then on. It should always contain correct information unless someone removes it, so why not use it?

read pods pods that falls under StatefulSet label selector
This seems redundant, as we already have this info from checking the Endpoints object

Do all our pods always end up in the service endpoints? If so, it can be omitted.
Also, is there any chance that the service and endpoints will not exist when this check runs?

If we consider the member list a reliable source, then you're right, let's return it directly.

v2:

  • Create variable guessed=0
  • Check endpoints for etcd-headless service
    • if there are any endpoints
      • connect to the cluster using endpoint and collect information from member list
        • if there are any members in output from etcd
          • take the hostname with highest number and +1
            • return value
      • read endpoints from kubernetes object:
        • take the hostname of the pod for endpoint with highest number and +1
          • if value is greater than value in guessed, save value into guessed variable
  • Check cluster-state configmap
    • if configmap exists and initial-cluster-members defined
      • if there are any hostnames defined in initial-cluster-members
        • take the hostname of pod with highest number and +1
          • save value into guessed variable
  • read persistent volume claims that fall under the StatefulSet label selector
    • if there are any pvcs
      • take the name of the pvc with highest number and +1
        • if value is greater than value in guessed, save value into guessed variable
  • return guessed

@Kirill-Garbar
Collaborator

Kirill-Garbar commented Jun 19, 2024

The etcd-headless service will always have endpoints: it doesn't rely on readiness probes, so all created pods with IP addresses will be in the headless service.
This service is ensured at the very beginning, so it must exist.

I personally do not like checking the cluster-state configmap because in the past we agreed that it is some kind of cache and that it would be nice to get this info from the etcd PVCs. So the number of PVCs is, in my opinion, a more reliable source than the cluster-state cm.
So the cm can be checked, but only as a last resort.

@kvaps
Member Author

kvaps commented Jun 19, 2024

Okay, it seems the cluster-state configmap check makes no sense, so I removed it:

v3:

  • Create variable guessed=0
  • Check endpoints for etcd-headless service
    • if there are any endpoints
      • connect to the cluster using endpoint and collect information from member list
        • if there are any members in output from etcd
          • take the hostname with highest number and +1
            • return value
      • read endpoints from kubernetes object:
        • take the hostname of the pod for endpoint with highest number and +1
          • if value is greater than value in guessed, save value into guessed variable
  • read persistent volume claims that fall under the StatefulSet label selector
    • if there are any pvcs
      • take the name of the pvc with highest number and +1
        • if value is greater than value in guessed, save value into guessed variable
  • return guessed
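
The v3 flow could be sketched in Go roughly as follows. This is only an illustration under stated assumptions: the `inputs` struct, `highestPlusOne`, and `guessReplicas` are hypothetical names, the lists are assumed to have been collected from the cluster beforehand, and `members` is assumed to be empty when there are no endpoints or the member list cannot be obtained.

```go
package main

import (
	"strconv"
	"strings"
)

// inputs holds names already collected from the cluster.
// Hypothetical type for illustration only.
type inputs struct {
	members   []string // names from the etcd member list; empty if unreachable
	endpoints []string // pod hostnames from the etcd-headless Endpoints object
	pvcs      []string // PVC names matching the StatefulSet label selector
}

// highestPlusOne returns the highest "-<ordinal>" suffix plus one, or 0 if
// no name ends in a numeric ordinal.
func highestPlusOne(names []string) int {
	result := 0
	for _, name := range names {
		i := strings.LastIndex(name, "-")
		if i < 0 {
			continue
		}
		if n, err := strconv.Atoi(name[i+1:]); err == nil && n+1 > result {
			result = n + 1
		}
	}
	return result
}

// guessReplicas follows the v3 order: the member list is returned directly
// when it yields a value; otherwise the larger of the endpoint-based and
// PVC-based guesses is used.
func guessReplicas(in inputs) int {
	if n := highestPlusOne(in.members); n > 0 {
		return n
	}
	guessed := highestPlusOne(in.endpoints)
	if n := highestPlusOne(in.pvcs); n > guessed {
		guessed = n
	}
	return guessed
}
```

Keeping the collection of names separate from the decision logic makes the ordering easy to unit-test without a running cluster.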

@kvaps kvaps added this to the v0.4.0 milestone Jul 9, 2024
@lllamnyp
Collaborator

lllamnyp commented Jul 9, 2024

Okay it seems cluster-state configmap check makes no sense, so removed:

v3:

  • return guessed

LGTM

@lllamnyp
Collaborator

This function is tentatively implemented here as

func (o *observables) desiredReplicas() (max int) {}

@kvaps kvaps modified the milestones: v0.4.0, v0.5.0 Oct 10, 2024
3 participants