chore(fix): Hitting unhealthy node 10 times #2819
base: main
…roduces excludeCurrent method in the List class Signed-off-by: ivaylogarnev-limechain <[email protected]>
Signed-off-by: ivaylogarnev-limechain <[email protected]>
Signed-off-by: ivaylogarnev-limechain <[email protected]>
…hgraph/hedera-sdk-js into fix/hitting-unhealthy-node-ten-times
…d added unit tests Signed-off-by: ivaylogarnev-limechain <[email protected]>
continue;
}

this._nodeAccountIds.advance();
Some notes:
- Before, we always `advance`'d (before the `if (!node.isHealthy())`); now we have 2 `advance` calls. Is this necessary?
- Around line 740 in executable.js, we have `client._network.increaseBackoff(node);`, which removes the node from the healthy nodes list and sets a new value for `_readmitTime`. This logic works as expected, right?
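The backoff behaviour described in the second note can be sketched roughly as follows. This is a minimal, hypothetical model and not the SDK's actual implementation: the `SketchNode` and `SketchNetwork` classes, the backoff bounds, and the explicit `now` parameter are all assumptions made for illustration; only the names `increaseBackoff`, `isHealthy`, and the idea of a readmit time come from the discussion above.

```javascript
// Hypothetical model of a node with a readmit time (not the SDK's real class).
class SketchNode {
    constructor(id, minBackoffMs = 250, maxBackoffMs = 8000) {
        this.id = id;
        this.backoffMs = minBackoffMs; // assumed starting backoff
        this.maxBackoffMs = maxBackoffMs; // assumed upper bound
        this.readmitTime = 0; // timestamp (ms) from which the node counts as healthy again
    }

    // A node is treated as healthy once its readmit time has passed.
    isHealthy(now) {
        return this.readmitTime <= now;
    }
}

// Hypothetical model of the network's healthy-node bookkeeping.
class SketchNetwork {
    constructor(nodes) {
        this.healthyNodes = [...nodes];
    }

    increaseBackoff(node, now) {
        // Remove the node from the healthy list...
        this.healthyNodes = this.healthyNodes.filter((n) => n !== node);
        // ...set a new readmit time...
        node.readmitTime = now + node.backoffMs;
        // ...and double the backoff for the next failure, up to the maximum.
        node.backoffMs = Math.min(node.backoffMs * 2, node.maxBackoffMs);
    }
}

const nodeA = new SketchNode("0.0.3");
const nodeB = new SketchNode("0.0.4");
const network = new SketchNetwork([nodeA, nodeB]);

// Simulate a failed request to nodeA at t = 1000 ms.
network.increaseBackoff(nodeA, 1000);
```

In this sketch, `nodeA` stays out of `healthyNodes` and reports `isHealthy(now) === false` until `now` reaches its readmit time, while `nodeB` is unaffected.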
1. Both advances handle different scenarios in the retry/rotation logic.
   1.1 The first `advance()` inside the health check is specifically for when a node is unhealthy.
   1.2 The second `advance()` is part of the normal node rotation logic when trying different nodes for retries (tested in `AccountInfoMocking.js` - should retry on `INTERNAL` and retry multiple nodes).
2. Yes, it still reaches that point because this code accounts for the scenario where the node is initially healthy but throws an error after making a request. If the error is a `GrpcService` or `HttpError`, the `increaseBackoff()` method is triggered, as expected.
About the 1st one: this means we advance in both cases. Can we have only 1 advance above `if (!node.isHealthy())` on line 644?
The issue is that with a single advance before the health check, we're effectively skipping the health check of the first node. This would break the node health checking functionality, as demonstrated by the failing test "should skip unhealthy node and execute with healthy node".
I see.
const responses1 = [
    { response: ACCOUNT_INFO_QUERY_COST_RESPONSE },
    { response: ACCOUNT_INFO_QUERY_RESPONSE },
];
We have `it("should retry on UNAVAILABLE", async function () {`. What is the behaviour there with more than 1 node?
Test added that covers this case.
…ocking Signed-off-by: ivaylogarnev-limechain <[email protected]>
Did you manage to test the new functionality on testnet, where we actually have unhealthy nodes? I know it's hard to write integration tests for this functionality, because you can't easily make localnode unhealthy, and you would then need a second account to execute the transaction.
This functionality is being tested in Currently, node 5 on the testnet is down, yet the tests still pass. If you explicitly hardcode the nodeAccountId to only use node 5 with:
it will throw the error:
Signed-off-by: ivaylogarnev-limechain <[email protected]>
Signed-off-by: ivaylogarnev-limechain <[email protected]>
Description:
This PR refactors and fixes the unhealthy node error handling logic in `Executable.js`. Additionally, it introduces a few unit tests to confirm this.

Related issue(s):
#2804
Checklist