Add robustness failpoint for IO stall in raft loop #16859
Signed-off-by: ZhouJianMS <[email protected]>
Force-pushed from f31c859 to 827dc18.
@ZhouJianMS Thanks for the contribution! I have been looking into adding sleep failpoints to the robustness tests for some time (#16776). The main blocker was etcd-io/gofail#47: the addition of an API that would allow us to confirm that the sleep was really executed. I really like the idea of deactivating the failpoint; it should allow us to use longer sleep times without breaking the minimal QPS requirement.
Sleep failpoints were blocked for a long time. Having them strict is important; however, I don't think we should block on this any more.
This PR already brings us some value.
Oops, I missed a couple of error handling issues.
Signed-off-by: ZhouJianMS <[email protected]>
@serathius Error handling added, please review.
Introduces a new failpoint to simulate an IO stall in the Raft loop during robustness tests. This can be extended to check whether a stalled Raft loop is handled properly once #15247 (comment) or a similar fix is merged.