Support for high availability redundancy #1015

Open
lornasong opened this issue Aug 2, 2022 · 0 comments
Labels
enhancement New feature or request enterprise Feature will be delivered as a part of Enterprise binary
Milestone
v0.7.0

lornasong commented Aug 2, 2022

Description

Currently CTS does not natively support high availability. When CTS becomes unavailable, it relies on an external system or process to intervene and restart it.

We want to support redundancy in CTS by allowing multiple CTS instances to form a cluster of one leader and any number of followers. The leader instance would be responsible for executing tasks. The follower instances would be backups, ready to take over task execution when the leader becomes unavailable. This provides reliable failover that minimizes the time the network infrastructure is out of date and removes the need for external intervention.
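
To make the leader/follower behavior concrete, here is a minimal in-memory sketch of failover. It is not CTS code, and this issue does not say how leadership would be decided. The lease-based approach, the `Lease` and `Instance` names, and the `ttl` value are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Lease:
    """A time-limited claim on leadership (hypothetical, not a CTS type)."""
    holder: Optional[str] = None
    expires_at: float = 0.0


class Instance:
    """One CTS-like instance that executes tasks only while it holds the lease."""

    def __init__(self, name: str, lease: Lease, ttl: float = 10.0) -> None:
        self.name = name
        self.lease = lease  # shared by every instance in the cluster
        self.ttl = ttl      # assumed lease duration, in seconds
        self.executed: List[str] = []

    def tick(self, now: float, healthy: bool = True) -> str:
        """Run one loop iteration and return this instance's current role."""
        if not healthy:
            # An unavailable instance cannot renew, so its lease will expire.
            return "unavailable"
        free = self.lease.holder is None or now >= self.lease.expires_at
        if self.lease.holder == self.name or free:
            # Acquire or renew leadership, then do the leader's work.
            self.lease.holder = self.name
            self.lease.expires_at = now + self.ttl
            self.executed.append(f"tasks@{now}")
            return "leader"
        # Another instance holds a valid lease: stand by as a backup.
        return "follower"


if __name__ == "__main__":
    lease = Lease()
    a, b = Instance("cts-a", lease), Instance("cts-b", lease)
    print(a.tick(0))                 # cts-a takes the free lease
    print(b.tick(1))                 # cts-b stands by
    print(a.tick(5, healthy=False))  # cts-a stops renewing
    print(b.tick(10))                # lease has expired, cts-b takes over
```

In this sketch the time without a leader is bounded by the lease duration, which is the trade-off a real design would need to tune: a shorter lease means faster failover but more frequent renewals.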

Use Cases

Keeping network infrastructure up to date is mission-critical for application delivery. Without high availability, CTS can become a single point of failure for the network automation workflows it is responsible for.

Alternative Solutions

Currently, there are workarounds that minimize CTS downtime, such as using an orchestrator like Nomad or Kubernetes. However, this puts the burden on users to make CTS more highly available.

Additional context

Some highly available systems distribute load among cluster members. For example, CTS followers could share task execution responsibilities. That would be a task-distribution feature and is a separate enhancement from the redundancy feature described in this issue.

New Cluster Status Endpoint

To support high availability, a new API endpoint will be added, GET /status/cluster. This will allow users to get information about the members in the CTS cluster, including health information and leadership status.
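
As a rough illustration of how a client might use this endpoint, the sketch below requests `GET /status/cluster` and looks for a healthy leader. The response schema is not defined in this issue, so the `members`, `id`, `leader`, and `healthy` fields are hypothetical. The helper names and the base URL passed in by the caller are also assumptions.

```python
import json
from typing import Any, Dict, Optional
from urllib.request import urlopen


def fetch_cluster_status(base_url: str, timeout: float = 5.0) -> Dict[str, Any]:
    """Send GET <base_url>/status/cluster and decode the JSON body."""
    url = f"{base_url.rstrip('/')}/status/cluster"
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def find_leader(status: Dict[str, Any]) -> Optional[str]:
    """Return the id of the healthy leader, or None if there is none.

    Assumes a hypothetical payload shape, not a documented one:
    {"members": [{"id": str, "leader": bool, "healthy": bool}, ...]}
    """
    for member in status.get("members", []):
        if member.get("leader") and member.get("healthy"):
            return member.get("id")
    return None
```

A monitoring script could call `find_leader(fetch_cluster_status(address))` with the address of any CTS instance and alert when the result is `None`.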

@lornasong lornasong added enhancement New feature or request enterprise Feature will be delivered as a part of Enterprise binary labels Aug 2, 2022
@lornasong lornasong added this to the v0.7.0 milestone Aug 2, 2022