Clustering stability #879
Ran some tests on this and sometimes (like 3 out of 10 times) I'm getting this error when shutting down the leader, and the follower will not take over as leader. Haven't looked further into why.
Thanks, fixed in 57b583d
0667e3e to a336fa3
Catch and explicitly reraise IO::Errors in etcd. Otherwise, when an Etcd method yielded and the inner call raised IO::Error, that error was interpreted as an Etcd error.
Extra logging related to Following
Start etcd lease keepalive after won election
Apparently there's no need to update the lease TTL until the election is won
Refactor Leadership lease keepalive
Don't log "Lost leadership" if manually revoked
etcd errors are sometimes JSON, sometimes not
Don't let Launcher know about clustering/leases
Let it be a concern for Clustering Controller
No need to poll the data dir lock, because it's only required for NFS disks.
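To illustrate the first commit message, here is a minimal Python sketch of the error-handling shape it describes: a client method that hands control to caller code must not relabel the caller's I/O errors as etcd errors. The project itself is not written in Python, and `FakeEtcd`, `EtcdError`, `watch`, and `classify` are hypothetical names invented for this example. How the real code tells the two cases apart is not shown in this PR.

```python
class EtcdError(Exception):
    """Hypothetical error type for failures talking to etcd."""


class FakeEtcd:
    """Hypothetical client; only the error-handling shape matters here."""

    def watch(self, key, on_change):
        try:
            # Pretend a change arrived from etcd and hand it to the caller.
            on_change(key, "new-value")
        except OSError:
            # I/O errors raised inside the caller's callback are not etcd
            # failures, so pass them through unchanged.
            raise
        except Exception as exc:
            # Anything else raised in this method is reported as an etcd error.
            raise EtcdError(f"etcd request failed: {exc}") from exc


def failing_callback(key, value):
    # Simulates the inner call hitting a broken socket.
    raise ConnectionResetError("follower socket closed")


def classify(etcd, callback):
    """Return which kind of error the caller ends up seeing."""
    try:
        etcd.watch("leader", callback)
    except EtcdError:
        return "etcd error"
    except OSError:
        return "io error"
    return "ok"
```

Without the explicit `except OSError: raise` branch, the broad handler would wrap the callback's I/O error in `EtcdError`, which is the misclassification the commit message describes.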
We want to time out when waiting for acks if the follower is unresponsive
use custom ports for the specs
can't see the need
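The ack timeout mentioned in the commit messages above could look roughly like the following Python sketch. It is an illustration only: `AckWaiter`, `replicate`, and the timeout values are hypothetical, and what the real code does with an unresponsive follower after the timeout is an assumption.

```python
import threading


class AckWaiter:
    """Tracks whether a follower has acknowledged replicated data."""

    def __init__(self):
        self._acked = threading.Event()

    def ack(self):
        # Called when the follower confirms it has received the data.
        self._acked.set()

    def wait_for_ack(self, timeout):
        # Event.wait returns False if the timeout expires before ack() is called.
        return self._acked.wait(timeout)


def replicate(waiter, timeout=0.1):
    """Return 'synced' if the follower acked in time, else 'timed out'."""
    if waiter.wait_for_ack(timeout):
        return "synced"
    # Assumption: an unresponsive follower would be dropped here so it
    # cannot block the leader indefinitely.
    return "timed out"
```

The point of the bounded wait is that a leader waiting for acks never blocks forever on a follower that has stopped responding.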
1fbb459 to f599d2c
Config.instance was used heavily in Server
If the Launcher receives an Etcd instance, it creates, and later closes, the ClusteringServer instance.
No need to use the getter
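The ownership rule described in the commit messages above (the Launcher creates the ClusteringServer when it is given an Etcd, and closes it later) can be sketched in Python as follows. The class bodies are hypothetical stand-ins, not the project's code, and the `stop` method name is an assumption.

```python
class ClusteringServer:
    """Hypothetical stand-in for the clustering server."""

    def __init__(self, etcd):
        self.etcd = etcd
        self.closed = False

    def close(self):
        self.closed = True


class Launcher:
    """Creates a ClusteringServer only when an etcd client is supplied."""

    def __init__(self, etcd=None):
        # Without etcd the launcher runs standalone and owns no clustering server.
        self.clustering_server = ClusteringServer(etcd) if etcd is not None else None

    def stop(self):
        # The launcher created the server, so it is also responsible for closing it.
        if self.clustering_server is not None:
            self.clustering_server.close()
```

Keeping creation and shutdown in the same object means no other component needs to know whether clustering is enabled.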
f599d2c to 02969a3
All commits are independent and do different things; only the first is really related to this PR.
WHAT is this pull request doing?
Improving clustering
HOW can this pull request be tested?