-
Notifications
You must be signed in to change notification settings - Fork 62
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix for Generic error for persistent task on starting replication #1003
base: main
Are you sure you want to change the base?
Changes from 1 commit
3c5187c
40b9eee
6293b13
fd04709
6dcbd7d
734c13f
0f00726
0069331
278f08d
731ce6d
aa95027
faa1612
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -828,7 +828,11 @@ open class IndexReplicationTask(id: Long, type: String, action: String, descript | |
} | ||
} catch(e: Exception) { | ||
val err = "Unable to initiate restore call for $followerIndexName from $leaderAlias:${leaderIndex.name}" | ||
val aliasErrMsg = "cannot rename index [${leaderIndex.name}] into [$followerIndexName] because of conflict with an alias with the same name" | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Relying on exception message isn't robust as any simple change in string will impact the logic. Have we exhausted other ways to detect? (Exception type, error code etc). If so, we may want to check on substring instead of complete sentence. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is just to show specific error message in case of alias conflict. Main concern is to handle failed state in case of restore state. If string will change in future as well, then also we get the failed state with message related to "unable to initiate restore" and more specific error will come in logs There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This scenario is one of cases in which the replication fails to start and proper message is not thrown. We need to make it generic to cover all cases. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. +1 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, For other cases if something fails in restore state then we will get this message in index replication status API "Unable to initiate restore call for followerIndexName from leaderAlias:{leaderIndex.name}". Initially when failed state was not handled properly, we get generic error of "Timed out when waiting for persistent task after 1m". Which might happened because of failure in restore state or error in any other state. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Updated the failed state message with whatever exception message will come |
||
log.error(err, e) | ||
if (e.message!!.contains(aliasErrMsg)) { | ||
return FailedState (Collections.emptyMap(), aliasErrMsg) | ||
} | ||
return FailedState(Collections.emptyMap(), err) | ||
} | ||
cso.waitForNextChange("remote restore start") { inProgressRestore(it) != null } | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does handling for FAILED exist?
(Also a nit - there is a space after the
.
)Does this need any update to test case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ohh yes thanks for pointing out the space after
.