fix: Enable MLKEM768X25519 by default #2389
base: main
Conversation
Failed Interop Tests
QUIC Interop Runner, client vs. server, differences relative to 108fb8d.
neqo-latest as client
neqo-latest as server
All results

Succeeded Interop Tests
QUIC Interop Runner, client vs. server
neqo-latest as client
neqo-latest as server

Unsupported Interop Tests
QUIC Interop Runner, client vs. server
neqo-latest as client
neqo-latest as server
Note to self: put MLKEM behind a config flag (that defaults to on). CC @jschanck
Force-pushed from c38e4b1 to 1681758.
Signed-off-by: Lars Eggert <[email protected]>
Codecov Report
Attention: Patch coverage is
Additional details and impacted files:

@@            Coverage Diff             @@
##             main    #2389      +/-   ##
==========================================
- Coverage   95.31%   95.30%   -0.01%
==========================================
  Files         114      114
  Lines       36869    36954      +85
  Branches    36869    36954      +85
==========================================
+ Hits        35142    35220      +78
- Misses       1721     1728       +7
  Partials        6        6

☔ View full report in Codecov by Sentry.
Benchmark results
Performance differences relative to 1090de3.

decode 4096 bytes, mask ff: No change in performance detected. time: [11.851 µs 11.884 µs 11.924 µs], change: [-0.6905% +0.0091% +0.6244%] (p = 0.99 > 0.05)
decode 1048576 bytes, mask ff: Change within noise threshold. time: [2.9217 ms 2.9309 ms 2.9417 ms], change: [+0.8098% +1.2620% +1.7254%] (p = 0.00 < 0.05)
decode 4096 bytes, mask 7f: No change in performance detected. time: [19.793 µs 19.843 µs 19.901 µs], change: [-0.1053% +2.4502% +6.0057%] (p = 0.14 > 0.05)
decode 1048576 bytes, mask 7f: Change within noise threshold. time: [5.0495 ms 5.0617 ms 5.0750 ms], change: [-0.9013% -0.5243% -0.1546%] (p = 0.01 < 0.05)
decode 4096 bytes, mask 3f: No change in performance detected. time: [6.8901 µs 6.8938 µs 6.9012 µs], change: [-1.7885% -0.7947% +0.0617%] (p = 0.09 > 0.05)
decode 1048576 bytes, mask 3f: No change in performance detected. time: [1.4154 ms 1.4222 ms 1.4293 ms], change: [-0.4874% +0.1987% +0.7945%] (p = 0.56 > 0.05)
coalesce_acked_from_zero 1+1 entries: No change in performance detected. time: [99.019 ns 99.355 ns 99.698 ns], change: [-0.2396% +0.1567% +0.6178%] (p = 0.48 > 0.05)
coalesce_acked_from_zero 3+1 entries: No change in performance detected. time: [117.39 ns 117.73 ns 118.11 ns], change: [-0.0742% +0.3418% +0.7544%] (p = 0.11 > 0.05)
coalesce_acked_from_zero 10+1 entries: Change within noise threshold. time: [117.13 ns 117.62 ns 118.20 ns], change: [+0.3880% +0.7998% +1.2989%] (p = 0.00 < 0.05)
coalesce_acked_from_zero 1000+1 entries: No change in performance detected. time: [97.534 ns 97.682 ns 97.844 ns], change: [-0.4304% +0.4917% +1.3941%] (p = 0.33 > 0.05)
RxStreamOrderer::inbound_frame(): Change within noise threshold. time: [111.83 ms 111.88 ms 111.93 ms], change: [+0.4982% +0.5591% +0.6228%] (p = 0.00 < 0.05)
SentPackets::take_ranges: No change in performance detected. time: [5.3888 µs 5.4676 µs 5.5437 µs], change: [-2.9725% -0.5511% +1.9954%] (p = 0.67 > 0.05)
transfer/pacing-false/varying-seeds: Change within noise threshold. time: [42.008 ms 42.102 ms 42.200 ms], change: [+1.8974% +2.2183% +2.5649%] (p = 0.00 < 0.05)
transfer/pacing-true/varying-seeds: Change within noise threshold. time: [42.093 ms 42.183 ms 42.285 ms], change: [+1.1567% +1.4529% +1.7753%] (p = 0.00 < 0.05)
transfer/pacing-false/same-seed: Change within noise threshold. time: [42.287 ms 42.373 ms 42.473 ms], change: [+1.8840% +2.1879% +2.4937%] (p = 0.00 < 0.05)
transfer/pacing-true/same-seed: Change within noise threshold. time: [42.506 ms 42.591 ms 42.687 ms], change: [+1.3298% +1.6156% +1.9003%] (p = 0.00 < 0.05)
1-conn/1-100mb-resp/mtu-1504 (aka. Download)/client: 💔 Performance has regressed. time: [926.86 ms 935.21 ms 943.64 ms], thrpt: [105.97 MiB/s 106.93 MiB/s 107.89 MiB/s]; change: time: [+2.8820% +4.3313% +5.7442%] (p = 0.00 < 0.05), thrpt: [-5.4321% -4.1515% -2.8013%]
1-conn/10_000-parallel-1b-resp/mtu-1504 (aka. RPS)/client: Change within noise threshold. time: [323.80 ms 327.28 ms 330.81 ms], thrpt: [30.229 Kelem/s 30.555 Kelem/s 30.884 Kelem/s]; change: time: [+0.7914% +2.1290% +3.5416%] (p = 0.00 < 0.05), thrpt: [-3.4204% -2.0846% -0.7852%]
1-conn/1-1b-resp/mtu-1504 (aka. HPS)/client: 💚 Performance has improved. time: [25.569 ms 25.748 ms 25.934 ms], thrpt: [38.560 elem/s 38.839 elem/s 39.110 elem/s]; change: time: [-25.713% -25.036% -24.415%] (p = 0.00 < 0.05), thrpt: [+32.301% +33.398% +34.612%]
1-conn/1-100mb-resp/mtu-1504 (aka. Upload)/client: 💔 Performance has regressed. time: [1.8980 s 1.9171 s 1.9366 s], thrpt: [51.638 MiB/s 52.163 MiB/s 52.687 MiB/s]; change: time: [+16.161% +18.001% +19.881%] (p = 0.00 < 0.05), thrpt: [-16.584% -15.255% -13.913%]

Client/server transfer results
Transfer of 33554432 bytes over loopback.
I'm not 100% happy with the complexity increase in the tests, but they do seem to be limited enough.
BTW, the changes to crypto frame sending are probably separable. No harm done and hard to extricate from the rest of this business, but it might have been better to tackle that part first.
There are some important changes that I suggest, but nothing too bad.
neqo-transport/src/connection/mod.rs (outdated)

@@ -2503,6 +2515,14 @@ impl Connection {
            // but wait until after sending an ACK.
            self.discard_keys(PacketNumberSpace::Handshake, now);
        }

        // If the client has more Initial CRYPTO data queued up, do not coalesce.
Can you say a little more about why you needed to add this check? My understanding was that we'd have filled the buffer if this was the case. You could maybe replace this with the same `if builder.is_full()` check as above in that case.
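To make the two options concrete, here is a small, self-contained sketch of the coalescing decision under discussion. `PacketBuilderLike` and `more_initial_crypto_queued` are hypothetical stand-ins, not neqo's actual types; only the `is_full()` idea comes from the comment above.

```rust
// Hypothetical stand-in for neqo's packet builder; only `is_full()` mirrors
// the check referenced in the review comment.
struct PacketBuilderLike {
    used: usize,
    limit: usize,
}

impl PacketBuilderLike {
    fn is_full(&self) -> bool {
        self.used >= self.limit
    }
}

/// Decide whether to coalesce the next packet number space into this datagram.
/// Returns false either because more Initial CRYPTO data is queued (the PR's
/// explicit check) or because the builder has no room left (the alternative
/// suggested above).
fn should_coalesce(builder: &PacketBuilderLike, more_initial_crypto_queued: bool) -> bool {
    if more_initial_crypto_queued {
        return false;
    }
    !builder.is_full()
}

fn main() {
    let full = PacketBuilderLike { used: 1200, limit: 1200 };
    let roomy = PacketBuilderLike { used: 300, limit: 1200 };
    assert!(!should_coalesce(&full, false));
    assert!(!should_coalesce(&roomy, true));
    assert!(should_coalesce(&roomy, false));
}
```

If the two conditions really coincide (the queued Initial CRYPTO data having already filled the builder), the explicit flag collapses into the `is_full()` check, which is the reviewer's point.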
So the issue here is that with SNI slicing and 0-RTT, and without this change, the client sends a first Initial packet that contains the second part of the CRYPTO data, coalesced with a 0-RTT packet. The server doesn't seem to save that 0-RTT data for later processing ("keys for ZeroRtt already discarded", which is clearly wrong). Maybe that is the bug?
Ah, yes:

neqo/neqo-transport/src/crypto.rs, lines 923 to 937 at 379722c:

/// Whether keys for processing packets in the indicated space are pending.
/// This allows the caller to determine whether to save a packet for later
/// when keys are not available.
/// NOTE: 0-RTT keys are not considered here. The expectation is that a
/// server will have to save 0-RTT packets in a different place. Though it
/// is possible to attribute 0-RTT packets to an existing connection if there
/// is a multi-packet Initial, that is an unusual circumstance, so we
/// don't do caching for that in those places that call this function.
pub fn rx_pending(&self, space: CryptoSpace) -> bool {
    match space {
        CryptoSpace::Initial | CryptoSpace::ZeroRtt => false,
        CryptoSpace::Handshake => self.handshake.is_none() && !self.initials.is_empty(),
        CryptoSpace::ApplicationData => self.app_read.is_none(),
    }
}
It seems we're not saving those 0-RTT packets anywhere currently?
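To illustrate the point, here is a minimal sketch, with assumed types rather than neqo's actual receive path, of how a caller acts on `rx_pending`: Handshake packets that arrive before Handshake keys get saved, while 0-RTT never reports as pending, so it would need the separate buffering mentioned in the NOTE above.

```rust
// Illustrative stand-ins only; the real `rx_pending` is shown above and
// operates on key state rather than these booleans.
#[derive(Clone, Copy)]
enum CryptoSpace {
    Initial,
    ZeroRtt,
    Handshake,
    ApplicationData,
}

struct CryptoStates {
    handshake_keys_ready: bool,
    have_initial_data: bool,
    app_read_keys_ready: bool,
}

impl CryptoStates {
    // Same shape as the snippet above: Initial and 0-RTT never report
    // pending keys, so callers won't save 0-RTT packets via this path.
    fn rx_pending(&self, space: CryptoSpace) -> bool {
        match space {
            CryptoSpace::Initial | CryptoSpace::ZeroRtt => false,
            CryptoSpace::Handshake => !self.handshake_keys_ready && self.have_initial_data,
            CryptoSpace::ApplicationData => !self.app_read_keys_ready,
        }
    }
}

// Hypothetical receive step: buffer the packet if its keys are still pending,
// otherwise process (or drop) it. 0-RTT always takes the `else` branch here,
// matching the observation above that 0-RTT packets aren't saved anywhere.
fn receive(
    crypto: &CryptoStates,
    space: CryptoSpace,
    packet: Vec<u8>,
    saved: &mut Vec<(CryptoSpace, Vec<u8>)>,
) {
    if crypto.rx_pending(space) {
        saved.push((space, packet));
    } else {
        // decrypt and process, or drop if the keys were already discarded
        let _ = packet;
    }
}

fn main() {
    let crypto = CryptoStates {
        handshake_keys_ready: false,
        have_initial_data: true,
        app_read_keys_ready: false,
    };
    let mut saved = Vec::new();
    receive(&crypto, CryptoSpace::Handshake, vec![0u8; 16], &mut saved); // saved
    receive(&crypto, CryptoSpace::ZeroRtt, vec![0u8; 16], &mut saved); // not saved
    assert_eq!(saved.len(), 1);
}
```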
Co-authored-by: Martin Thomson <[email protected]> Signed-off-by: Lars Eggert <[email protected]>
This enables MLKEM768X25519 by default on the server and client. There is a parameter to turn it off.
This broke a ton of tests, since various handshake flights become longer than single packets. I fixed those tests where doing so was straightforward, and turned off MLKEM for those where it wasn't or where I wasn't sure the test was still doing the right thing.
There are some non-test changes:
- neqo-transport/src/connection/mod.rs had some code in `start_handshake` that needed to move to `postprocess_packet`, because we can now be in `WaitInitial` for longer.
- neqo-transport/src/crypto.rs has changes to the SNI slicing code in `write_frame` to make it less wasteful (i.e., generate less padding) than the old code did for longer-than-MTU CRYPTO frames.
- `TxBuffers` gained an `is_empty` function, so when retransmitting sliced CRYPTO frames, we don't stop (and pad the rest of the packet) if there are more discontiguous CRYPTO frames queued up for sending.
- neqo-transport/src/connection/params.rs adds the `mlkem` option to `ConnectionParameters` (see the sketch after this description).
Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1943471
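As a usage sketch for the new option: this assumes the `mlkem` setting is a boolean toggle on `ConnectionParameters` in the same builder style as existing options such as `grease` or `pacing`; the exact method signature is inferred from the description above and may differ from the PR.

```rust
// Hedged sketch, not the PR's exact API: turn MLKEM768X25519 off for a test
// that relies on single-packet handshake flights. The default is on per this PR.
use neqo_transport::ConnectionParameters;

fn client_params_without_mlkem() -> ConnectionParameters {
    ConnectionParameters::default().mlkem(false)
}
```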