ssh: fix mlkem768x25519 hybrid shared secret encoding#11209
Open
cole-christensen wants to merge 1 commit into
Open
ssh: fix mlkem768x25519 hybrid shared secret encoding#11209cole-christensen wants to merge 1 commit into
cole-christensen wants to merge 1 commit into
Conversation
The classical X25519 part of the mlkem768x25519-sha256 hybrid key exchange shared secret was reconstructed by taking the last 32 bytes of the mpint encoding of the ECDH secret. mpint is a minimal-length big-endian representation, so when the shared secret's most significant byte is 0x00 and the following byte is < 0x80 the payload is only 31 bytes and "the last 32 bytes" pulls in a byte of the 4-byte length prefix (0x1f) in place of the genuine leading 0x00. hybrid_common/4 then hashes a different octet string than the peer (e.g. OpenSSH), producing a different exchange hash H. The server signs an H the client never computed, and the client aborts the connection with "incorrect signature". This affects roughly 1 in 512 handshakes (P[msb=0x00] * P[next<0x80] = 1/256 * 1/2). Encode the ECDH shared secret as a fixed-width big-endian octet string so a genuine leading 0x00 byte is preserved, matching the peer. Add a regression test (ssh_algorithms_SUITE:mlkem768x25519_hybrid_secret_encoding) that searches for an X25519 key pair whose ECDH secret triggers the condition and asserts hybrid_common/4 hashes it as a fixed-width 32-byte octet string. The test fails against the old code and passes with the fix. Introduced in 95d0848 (ssh: MLKEM kex ssh); present since OTP-28.4.
Contributor
CT Test Results 2 files 29 suites 26m 5s ⏱️ Results for commit 869e6b5. ♻️ This comment has been updated with latest results. To speed up review, make sure that you have read Contributing to Erlang/OTP and that all checks pass. See the TESTING and DEVELOPMENT HowTo guides for details about how to run test locally. Artifacts
// Erlang/OTP Github Action Bot |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #11208
Summary
mlkem768x25519-sha256key exchange intermittently fails (~1 in 512 handshakes)with
incorrect signatureon the peer.hybrid_common/4reconstructed theclassical X25519 shared secret by taking the last
?X25519_PUBLICKEY_SIZEbytesof its mpint encoding. Because mpint is minimal-length, a shared secret whose
most significant byte is
0x00and whose next byte is< 0x80encodes to a31-byte payload, and "the last 32 bytes" then includes a byte of the 4-byte
length prefix (
0x1f) in place of the genuine leading0x00. The two peers hashdifferent octet strings, derive a different exchange hash
H, and the signaturecheck fails.
This encodes the ECDH shared secret as a fixed-width big-endian octet string,
preserving leading zero bytes.
draft-ietf-sshm-mlkem-hybrid-kex
§2.4 specifies this encoding — the component secrets are hashed as fixed-length
big-endian byte arrays — and OpenSSH's
kexmlkem768x25519.cmatches it.hybrid_common(K_pq_secret, Curve, PeerPublic, MyPrivate) -> K_cl_secret = compute_key(ecdh, PeerPublic, MyPrivate, Curve), - K_cl_secret_mpint = <<?Empint(K_cl_secret)>>, - K_cl_secret_mpint_trim = - binary:part(K_cl_secret_mpint, byte_size(K_cl_secret_mpint), -?X25519_PUBLICKEY_SIZE), - crypto:hash(sha(Curve), <<K_pq_secret/binary, K_cl_secret_mpint_trim/binary>>). + K_cl_secret_fixed = <<K_cl_secret:(?X25519_PUBLICKEY_SIZE*8)/big-unsigned-integer>>, + crypto:hash(sha(Curve), <<K_pq_secret/binary, K_cl_secret_fixed/binary>>).The new encoding is byte-for-byte identical to the old one in every case except
the previously-broken one, so it only changes behaviour where the handshake was
already failing.
Probability / impact
P(trigger) = P(
secret[0]==0x00) × P(secret[1] < 0x80) = 1/256 × 1/2 =1/512 ≈ 0.195% per
mlkem768x25519-sha256handshake. At any non-trivialconnection volume this is a frequent, hard-to-diagnose interop failure against
OpenSSH (which is correct).
Introduced in
95d0848c60("ssh: MLKEM kex ssh"), released in OTP-28.4.Verification
Deterministic unit check (
<<0,1,0:240>>→ old code yields1F01…, fixedyields
0001…). See linked issue.Monte-Carlo: 300k random secrets, old-vs-fixed mismatch rate 0.194%, every
mismatch satisfies
secret[0]==0x00 ∧ secret[1] < 0x80.End-to-end against OpenSSH 10.3p1, BOTH roles, with the hybrid code
instrumented to count how many handshakes actually hit the trigger ("case C")
so success is not luck-of-the-draw:
Regression test in the suite (
ssh_algorithms_SUITE:mlkem768x25519_hybrid_secret_encoding):passes with the fix, fails against the stock encoding. Run via the supported
framework:
ts:run(ssh, ssh_algorithms_SUITE, mlkem768x25519_hybrid_secret_encoding, [batch])→
TEST COMPLETE, 1 ok, 0 failed.Full
ssh_algorithms_SUITE:TEST COMPLETE, 930 ok, 0 failedof 930