Heads stuck in a state without being able to progress snapshots #1773
Comments
One of the main aims of #1468 is to solve this problem. Let us know if you think that would help you! If so, it is coming up very soon on our roadmap, so hopefully that will help :)
This will definitely help with fanning out a subset of usable UTxOs. There may be something I'm missing, but does it also help with reconciling forked local states?
@twwu123 Thanks for reporting this and evaluating the Hydra project in-depth! You raise very valid points and your understanding is just right. The As you can see from past issues, this can happen if networking, persistence or version incompatibility was preventing smooth off-chain protocol progress. Besides making the individual components more resilient to faults (e.g. by following up a successful experiment with #1720), we also discussed fallback mechanisms for situations where it's purely a technical hiccup, and not a loss of consensus, that is preventing progress, for example: #1284 Your expected behavior indicates that clearing the diverging local views, or resetting to the last confirmed snapshot, would be a solution for you?
The ideal solution for me would be a way to allow one If this isn't possible, then at the very least, resetting everyone's local state to the latest agreed-upon snapshot would be my second option. My understanding is that both of these should be possible. However, the first option may raise small security concerns, and might require some sort of manual "approval" of a specific state for every
Yes, exactly. A node can't just trust another node by adopting their state. That being said, we should make the snapshotting logic more defensive and retry more, as this could be an issue here too.
Context & versions
Hydra version: 0.19.0
At some point, some of the hydra-nodes failed to sign new snapshots. As a result, the local ledger states of the hydra-nodes deviate from each other, essentially forming forked states. It is still possible to submit transactions to the hydra-nodes, but doing so is useless: snapshots do not update unless all of the hydra-nodes sign them, and each node starts accepting transactions based on a completely different state, so the nodes stop agreeing on which UTxOs have been spent.
Actual behavior
Firstly, I'm unsure why the hydra-nodes stop signing new snapshots, but once they do, the situation seems to be unrecoverable. There is no way to forcibly sync the snapshots of the hydra-nodes, and the nodes do not attempt to sign any new states. They don't even seem to attempt to update the snapshots in any way.
Unfortunately, in my use case the head is in a state that is impossible to close for the majority of the time; we only reconcile it to a closable state right before a planned close. This means that whenever this happens, the head is completely doomed and unrecoverable.
Expected behavior
Hopefully there is some way to recover from such a situation. There does seem to be a snapshot that all the hydra-nodes agree on, but somehow the local ledger states start to deviate from each other. I suspect the easiest solution would be to allow some way to reset the nodes' ledger state to the most recently agreed-upon snapshot.
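
To make the requested behavior concrete, here is a minimal toy sketch in Python. It is not hydra-node code and does not use any Hydra API: `ToyNode`, `apply_tx`, `reset_to_confirmed`, `new_node` and the UTxO labels are all hypothetical names chosen for illustration. It only assumes that each node remembers the last snapshot signed by everyone and keeps a separate local view that can run ahead of it.

```python
from dataclasses import dataclass


@dataclass
class ToyNode:
    """Toy stand-in for a head participant; NOT the hydra-node data model."""
    name: str
    confirmed_number: int      # number of the last snapshot signed by everyone
    confirmed_utxo: frozenset  # UTxO set of that snapshot
    local_utxo: set            # local ledger view, may run ahead of the snapshot

    def apply_tx(self, spent: set, produced: set) -> None:
        """Apply a transaction to the local view only; no snapshot is confirmed."""
        if not spent <= self.local_utxo:
            missing = spent - self.local_utxo
            raise ValueError(f"{self.name}: inputs {missing} not available")
        self.local_utxo = (self.local_utxo - spent) | produced

    def reset_to_confirmed(self) -> None:
        """Hypothetical recovery step: drop everything after the last confirmed snapshot."""
        self.local_utxo = set(self.confirmed_utxo)


def new_node(name: str, number: int, utxo: set) -> ToyNode:
    """Create a node whose local view equals the confirmed snapshot."""
    return ToyNode(name, number, frozenset(utxo), set(utxo))


alice = new_node("alice", 7, {"a#0", "b#0"})
bob = new_node("bob", 7, {"a#0", "b#0"})

# Snapshot signing has stalled, so each node only updates its own local view.
alice.apply_tx({"a#0"}, {"c#0"})
bob.apply_tx({"a#0"}, {"d#0"})
diverged = alice.local_utxo != bob.local_utxo

# Resetting both nodes brings them back to the same agreed-upon state.
alice.reset_to_confirmed()
bob.reset_to_confirmed()
reconciled = alice.local_utxo == bob.local_utxo == {"a#0", "b#0"}
```

In this toy model both nodes spend `a#0` in different ways while no snapshot is confirmed, which is the forked situation described above. The reset discards those unconfirmed transactions on every node, so they would have to be resubmitted afterwards.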