
Keep original server names in HAproxy backends #59

Closed

Conversation

@gdubicki (Contributor) commented Jun 4, 2020

Mostly to make the HAproxy logs more useful for tracing the requests.

Although it doesn't make much sense without #51, it works independently, so it can be reviewed on its own.

Also related to #45.

gdubicki pushed a commit to egnyte/haproxy-consul-connect that referenced this pull request on Jun 17, 2020: "to make the logs more useful"

@aiharos requested a review from @ShimmerGlass on Jul 28, 2020, 13:07
@aiharos (Collaborator) commented Jul 28, 2020

Functionally looks good, but I'd like to hear from @ShimmerGlass too.

One thing I'm not entirely certain about is changing the generated server names to "disabled_server*". If there is a possibility of these server names being retained while the address information is updated, creating confusion about which servers are active and which are not, I'd suggest keeping the old non-informative "server_xyz" naming scheme.

@ShimmerGlass (Collaborator) commented Jul 29, 2020

@aiharos @gdubicki, we cannot use the original server names: there is a mechanism that reuses server slots to avoid reloading HAproxy every time an instance comes up or goes down. The name of a server slot must stay stable even if the IP/port changes or the server goes down.
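For context on how slot reuse avoids reloads: HAproxy's runtime (stats) socket can repoint an existing, fixed-name server slot at a new address. Below is a minimal sketch of that technique, not this project's actual code; the socket path, backend and slot names are hypothetical.

package main

import (
	"fmt"
	"net"
)

// setServerAddr repoints a fixed server slot at a new address/port over the
// HAproxy runtime (stats) socket, so no reload is needed. The slot name
// (e.g. "srv_3") stays constant; only the address information changes.
func setServerAddr(socketPath, backend, slot, addr string, port int) error {
	conn, err := net.Dial("unix", socketPath)
	if err != nil {
		return err
	}
	defer conn.Close()

	// "set server <backend>/<server> addr <ip> port <port>" is a standard
	// HAproxy runtime API command (available since HAproxy 1.8).
	cmd := fmt.Sprintf("set server %s/%s addr %s port %d\n", backend, slot, addr, port)
	if _, err := conn.Write([]byte(cmd)); err != nil {
		return err
	}

	reply := make([]byte, 4096)
	n, err := conn.Read(reply)
	if err != nil {
		return err
	}
	fmt.Printf("haproxy: %s", reply[:n])
	return nil
}

func main() {
	// Hypothetical socket path, backend and slot, for illustration only.
	if err := setServerAddr("/var/run/haproxy.sock", "back_my-service", "srv_3", "10.0.0.42", 8080); err != nil {
		fmt.Println("error:", err)
	}
}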

@pierresouchay (Collaborator) left a comment


This cannot work as it is, as node.Node might appear twice in the list.

if host == "" {
	host = s.Node.Address
}

name := s.Node.Node

name would not be unique: the proxy might try to target several instances of the same service on a single node.
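To illustrate the collision: assuming s is a Consul *api.ServiceEntry (an assumption based on the fields used above), a per-instance name would also have to include the service ID, and even then it would change as instances come and go, which conflicts with the stable-slot requirement described earlier. A hypothetical helper:

package main

import (
	"fmt"

	"github.com/hashicorp/consul/api"
)

// instanceName is a hypothetical sketch: the node name alone collides when
// one node runs several instances of the same service, so a unique name has
// to include the Consul service ID as well.
func instanceName(s *api.ServiceEntry) string {
	return fmt.Sprintf("%s_%s", s.Node.Node, s.Service.ID)
}

func main() {
	s := &api.ServiceEntry{
		Node:    &api.Node{Node: "node-a"},
		Service: &api.AgentService{ID: "web-1"},
	}
	fmt.Println(instanceName(s)) // prints "node-a_web-1"
}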

@ShimmerGlass (Collaborator) commented
Closing this because I don't see a way to implement this right now.

@gdubicki (Contributor, Author) commented
One more thing:

Why is reloading HAproxy so bad, @ShimmerGlass and @pierresouchay? Assuming that only some of the servers change and it happens only every few days, is that a problem?

I am asking because perhaps we could add an alternative mode to haproxy-consul-connect that WILL reload BUT will provide more features, like this one.

@ShimmerGlass (Collaborator) commented
@gdubicki, this would mean every server change triggers a reload. This is not so bad with a stable bare-metal architecture and few servers; however, it becomes an issue for:

  • Services with a large number of instances: more instances means more health check flapping (health checks constantly switching between passing and critical), and thus more reloads.
  • A single flapping instance would cause all HAproxies to reload. The typical Consul health check interval is 10s (see the sketch below), so that means a reload every 10 seconds, degrading the performance of the platform as a whole because of a single bad apple in the basket. With more flapping instances the problem quickly gets out of hand.
  • Deployments by design cause all instances to flap in cascade, causing a lot of reloads and impacting client services.

We need to decouple haproxy reloads from server health for a stable service.
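For reference on the 10s figure above, here is a minimal sketch (not from this project) of registering a service in Consul with a 10-second health check via the official Consul Go API; the service name and endpoint are hypothetical. With a reload-on-change mode, every passing/critical transition of such a check would trigger an HAproxy reload.

package main

import "github.com/hashicorp/consul/api"

func main() {
	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		panic(err)
	}

	reg := &api.AgentServiceRegistration{
		Name: "my-service", // hypothetical service name
		Port: 8080,
		Check: &api.AgentServiceCheck{
			HTTP:     "http://localhost:8080/health", // hypothetical endpoint
			Interval: "10s",                          // the typical interval mentioned above
			Timeout:  "1s",
		},
	}
	if err := client.Agent().ServiceRegister(reg); err != nil {
		panic(err)
	}
}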

I don't think an option for that would be a good idea. It is pretty low level, relates to an HAproxy implementation detail, and it is hard for users to understand exactly how this can go wrong. Instead, if observability is the issue, I propose we focus on Prometheus metrics, for example, or on an HTML status page.
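As a rough illustration of the Prometheus direction (an assumption, not existing project code), per-slot metrics could carry the human-readable instance identity as labels instead of encoding it in the HAproxy server name; the metric and label names are hypothetical.

package main

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// serverUp maps a generic server slot to the Consul instance it currently
// serves; the labels carry the identity that the slot name cannot.
var serverUp = prometheus.NewGaugeVec(
	prometheus.GaugeOpts{
		Name: "haproxy_connect_server_up", // hypothetical metric name
		Help: "1 if the instance mapped to a server slot is passing, 0 otherwise.",
	},
	[]string{"backend", "slot", "node", "service_id"},
)

func main() {
	prometheus.MustRegister(serverUp)

	// Example: slot "srv_3" currently serves instance "web-1" on node "node-a".
	serverUp.WithLabelValues("back_my-service", "srv_3", "node-a", "web-1").Set(1)

	http.Handle("/metrics", promhttp.Handler())
	if err := http.ListenAndServe(":9090", nil); err != nil {
		panic(err)
	}
}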
