-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
htlcswitch+routerrpc: support unencrypted failure reasons #7067
base: master
Are you sure you want to change the base?
htlcswitch+routerrpc: support unencrypted failure reasons #7067
Conversation
3b61a30
to
57823b6
Compare
57823b6
to
6067bc3
Compare
not that i think this is in scope of the issue but it would also be cool generally if you could return the error as if it was from the next hop, if you know the private key associated with the next hop like for a virtual peer returning an invalid payment hash error |
Assuming I understand the comment correctly, this is already supported on the htlc interceptor api via an encrypted failure message? For example in lnmux, we encrypt for the sender outside of lnd. |
221fe21
to
818c44e
Compare
yes exactly, what i mean is to do this at the API level in an unencrypted way, to avoid the caller having to build the encrypted failure message |
818c44e
to
215c2c4
Compare
bytes failure_message = 4; | ||
|
||
// Indicates whether the failure message still needs to be encrypted for the | ||
// first hop. | ||
bool failure_message_unencrypted = 6; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes exactly, what i mean is to do this at the API level in an unencrypted way, to avoid the caller having to build the encrypted failure message
Okay got it. I think it is debatable whether this is functionality that should live inside lnd. Passing in the private key is a security risk too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that's fair, the problem though is that it seems like people using this API are messing up routing because they are returning errors from the wrong node and I hear reports about people needing to write workarounds for that. Another option to satisfy the no private key material constraint could be to not pass a private key but instead use an arbitrary child key for the message
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This discussion thread still relevant?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a good point, and indeed just passing in the shared secret for the virtual hop and letting lnd do the encryption could be a middle way.
This PR is about adding support for unencrypted failure reasons though, not making it easier to use the encrypted reason functionality that is pre-existing. I think that would be more suitable as a follow-up? In that a failure_message_shared_secret
could be added to indicate that failure_message
still has to be encrypted. At that point, failure_message_unencrypted = false
can be deprecated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@joostjager IIUC, this is intended for a use case for "virtual nodes", so anycast-like behavior using special hop hints. In that case, how does this help? It'll let you send a plaintext error (from this interceptor sub-system) and have lnd encrypted that, where it's the source node. IIUC though, you actually want to be able to send errors from the "virtual node", which means you need to use the shared secret for that hop for the path finding retry/apply logic to work properly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@joostjager IIUC, this is intended for a use case for "virtual nodes", so anycast-like behavior using special hop hints. In that case, how does this help? It'll let you send a plaintext error (from this interceptor sub-system) and have lnd encrypted that, where it's the source node. IIUC though, you actually want to be able to send errors from the "virtual node", which means you need to use the shared secret for that hop for the path finding retry/apply logic to work properly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC though, you actually want to be able to send errors from the "virtual node", which means you need to use the shared secret for that hop for the path finding retry/apply logic to work properly.
That is correct.
But sometimes though the virtual node may want to return an error originating from the preceding real lnd node. An example of this is when there is a routing fee associated with the virtual channel. NYDIG-OSS/lnmux#101 illustrates this. The real node has to encrypt the insufficient fee failure message.
Intercepted htlcs do not pass through the switch in the real lnd instance. This means that any forwarding logic (such as the fee check) needs to be replicated by the external interceptor, and the interceptor requires the ability to instruct the real node to produce certain failure messages.
This has been possible in master for some time (
lnd/lnrpc/routerrpc/router.proto
Line 930 in bf5aab9
lnrpc.Failure.FailureCode failure_code = 5; |
failure_code
field. This however is not a complete API. Ideally the complete failure message structure would need to be replicated on the proto level to be able to signal any failure. This PR takes the approach of accepting an encoded but not yet encrypted failure message to avoid that.
In summary this PR:
- Completes the interceptor API by allowing users to supply a full failure message rather than a restrictive set of failure codes without additional data fields.
- Enable use cases that require different failure messages to be returned, such as a non-zero virtual channel fee.
dbc18c2
to
a5504ea
Compare
a5504ea
to
613479c
Compare
613479c
to
a2d9d8f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes look pretty good, I think the last commit needs to be squashed somewhere, or re-worked slightly. Main question here is re the intended behavior now, and if what is wanted is actually already supported, eg: lnd
node N
, makes virtual nodes M_i...
, then makes invoices with pubkey of node M
, they can then process the onion twice to decode full contents, and utilize the existing method to send back the error as if it came from node M
.
@@ -134,7 +135,7 @@ func (r *forwardInterceptor) resolveFromClient( | |||
// message size + hmac + two uint16 lengths. See BOLT | |||
// #4. | |||
if len(in.FailureMessage) != | |||
lnwire.FailureMessageLength+32+2+2 { | |||
lnwire.FailureMessageLength+sha256.Size+2+2 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah had thought it was introduced in this diff for some reason, I guess a rebase removed an earlier portion (was stacked on something else?).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't remember the exact history, but may be the case that a later refactor left just this.
} | ||
|
||
code, err := unmarshalFailCode(in.FailureCode) | ||
err := unmarshallFailureResolution(in, fwdRes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't think it's blocking, but we could have this instead return the new value and assign it here, s.t the new function is a pure function. Then tests can just exercise that the right value is returned vs asserting that the input is mutated as expected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been trying out multiple ways to make it a pure function, but after the changes in the later commit that extend this function it never really worked out in a clean way. So settled for this.
bytes failure_message = 4; | ||
|
||
// Indicates whether the failure message still needs to be encrypted for the | ||
// first hop. | ||
bool failure_message_unencrypted = 6; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@joostjager IIUC, this is intended for a use case for "virtual nodes", so anycast-like behavior using special hop hints. In that case, how does this help? It'll let you send a plaintext error (from this interceptor sub-system) and have lnd encrypted that, where it's the source node. IIUC though, you actually want to be able to send errors from the "virtual node", which means you need to use the shared secret for that hop for the path finding retry/apply logic to work properly.
bytes failure_message = 4; | ||
|
||
// Indicates whether the failure message still needs to be encrypted for the | ||
// first hop. | ||
bool failure_message_unencrypted = 6; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@joostjager IIUC, this is intended for a use case for "virtual nodes", so anycast-like behavior using special hop hints. In that case, how does this help? It'll let you send a plaintext error (from this interceptor sub-system) and have lnd encrypted that, where it's the source node. IIUC though, you actually want to be able to send errors from the "virtual node", which means you need to use the shared secret for that hop for the path finding retry/apply logic to work properly.
@@ -93,6 +94,11 @@ type InterceptableSwitch struct { | |||
// currentHeight is the currently best known height. | |||
currentHeight int32 | |||
|
|||
// signChannelUpdate is used when an intercepting application includes | |||
// an unsigned channel update to be signed by us. | |||
signChannelUpdate func(u *lnwire.ChannelUpdate) (*ecdsa.Signature, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The context in which this is to be used is to update the routing policy of a virtual channel. The channel isn't broadcasted through gossip and doesn't exist in the graph on disk.
Assuming the virtual channel has a distinct pubkey (and also shared secret in the onion), then how would this work in practice? lnd would use diff logic for if it was the final hop vs the penultimate hop.
return err | ||
} | ||
|
||
err = s.resignChannelUpdate(update) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we also commit this update to disk as well? Apologies if this was already an older review comment...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, because the real node isn't aware of this virtual channel at all. It's all up to the external interceptor logic to keep track of the virtual channel properties. As mentioned in #7067 (comment), the external interceptor is taking over part of the responsibilities of the switch.
update *lnwire.ChannelUpdate) error { | ||
|
||
// The maxHTLC flag is mandatory. | ||
if !update.MessageFlags.HasMaxHtlc() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More validation is needed here, eg: max HTLC shouldn't be greater than the channel capacity, etc, etc. I think we should just call the normal validation routine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wanted to call
Line 165 in bf5aab9
func ValidateChannelUpdateFields(capacity btcutil.Amount, |
But that would create a circular reference. For now, I just copied over the additional max_htlc >= min_htlc
check. Capacity checking is not possible, because the real lnd doesn't know what the capacity is. It's probably infinite because the channel is typically virtual.
If desired I can do a bigger refactor and move the validation logic into htlcswitch
. But not sure if worth it.
Yes, will squash once I am certain we're all on the same page with this PR.
Hopefully my replies in this round have clarified this. I know it is confusing, but the scope of this PR is not to change anything about errors encrypted by M. |
Keep indentation under control and prepare for additional validation.
Prepares for unencrypted failure messages.
e415913
to
f5b80e8
Compare
Prepare for adding a header to a pre-encoded message.
This commit adds the possibility for users of the htlc interception api to supply arbitrary pre-encoded failure messages. LND will only add an hmac and encrypt the message for the first hop.
f5b80e8
to
58dfa61
Compare
Is this PR ready to be merged? |
@saubyk Is this PR planned any time soon? |
Hi @JssDWt I've added this pr to 18's backlog. But at this point, uncertain if it will be included, as there is still a lot of work to get through to complete the in-flight work. Will make a determination in another week or so, if it will stay in the scope or moved to a minor release |
This PR adds the possibility for users of the htlc interception api to supply arbitrary pre-encoded failure messages. LND will only add an hmac and encrypt the message for the first hop.
Interceptor applications that implement a virtual channel can use this to return for example a
FeeInsufficient
failure as if it was generated within the LND node.The existing
failure_code
field on the htlc interceptor response is deprecated. Extending that with all possible failure codes and failure attributes doesn't look like an attractive path.