All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jiayuan Chen <jiayuan.chen@linux.dev>
To: Christopher Lusk <clusk@northecho.dev>, Jakub Kicinski <kuba@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>,
	Sabrina Dubroca <sd@queasysnail.net>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Paolo Abeni <pabeni@redhat.com>, Simon Horman <horms@kernel.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	netdev@vger.kernel.org, bpf@vger.kernel.org,
	linux-kernel@vger.kernel.org, stable@vger.kernel.org
Subject: Re: [PATCH net v3] net: tls: use sync AEAD for sk_msg BPF sockets
Date: Tue, 26 May 2026 14:44:24 +0800	[thread overview]
Message-ID: <d92bc603-e345-4dee-9ae9-6ad45e4e6642@linux.dev> (raw)
In-Reply-To: <20260526025154.60607-1-clusk@northecho.dev>


On 5/26/26 10:51 AM, Christopher Lusk wrote:
> The kTLS TX path can hand an open record to a sk_msg verdict
> program before encryption.  If the verdict applies fewer bytes
> than the open record contains, tls_push_record() splits
> ctx->open_rec into the record being encrypted and a remainder.
> The synchronous path reattaches that remainder before continuing.
>
> With an async AEAD provider, crypto_aead_encrypt() can return
> -EINPROGRESS after ctx->open_rec has been unhooked but before the
> split remainder is reattached.  The remainder is no longer
> reachable through ctx->open_rec or ctx->tx_list, silently dropping
> transmitted data and leaking the unreachable tls_rec.  The same
> composition also entangles the user-page zerocopy lifetime rules
> with an async completion path.
>
> A sockmap cannot be attached to a socket after an inet ULP is
> installed: sk_psock_init() returns -EINVAL when
> inet_csk_has_ulp() is true.  So the supported ordering for
> sockmap + kTLS TX is sockmap first, TLS_TX setup second.  When
> TLS_TX setup sees an existing sk_psock, allocate the AEAD with
> CRYPTO_ALG_ASYNC masked out and latch the TX zerocopy gate
> (sw_ctx_tx->async_capable) so the buggy composition becomes
> structurally unreachable.  Ordinary kTLS sockets without sk_msg
> BPF attached are unaffected and continue to use async-capable
> providers.
>
> Fixes: d3b18ad31f93 ("tls: add bpf support to sk_msg handling")
> Cc: stable@vger.kernel.org # 4.20+
> Signed-off-by: Christopher Lusk <clusk@northecho.dev>
> Assisted-by: Codex:gpt-5.5
> Assisted-by: Claude:claude-opus-4-7
> ---
>
> Changes since v2 [1]:
> - Per netdev maintainer guidance [2], replace the Option-C
>    drain-on-error fix with a setup-time surface narrowing in
>    tls_set_sw_offload(): when a sockmap is already attached at
>    TLS_TX setup, request a synchronous AEAD (CRYPTO_ALG_ASYNC in
>    the allocation mask) and set sw_ctx_tx->async_capable = 1.
>    Both moves are needed: latching async_capable alone disables
>    zerocopy but tls_do_encryption() can still return -EINPROGRESS
>    on the copy path; selecting a sync provider removes that return
>    path for sk_msg-attached sockets.
> - Drop the selftest from the series per Jakub's note that the
>    existing sockmap + TLS coverage at
>    tools/testing/selftests/bpf/prog_tests/sockmap_ktls.c exercises
>    this configuration [3].  That suite covers sockmap + kTLS
>    policy paths broadly; the specific async-pcrypt pass-then-drop
>    failure mode from the v2 reproducer was validated for v3 on
>    QEMU/KVM with a KASAN+LOCKDEP-instrumented kernel against net
>    base 2156a29aecff before send.
> - Single-patch series.
>
> Changes since v1:
> - v1's remainder-rooting fix was incomplete; Sashiko AI review
>    surfaced a real UAF in the v2 follow-up that John Fastabend
>    endorsed on the v1 thread [4].  The surface-narrowing approach
>    in v3 makes both failure modes unreachable by avoiding the
>    async + sk_msg composition entirely rather than patching each
>    continuation point.
>
> [1] https://lore.kernel.org/all/20260521025840.976378-1-clusk@northecho.dev/
> [2] https://lore.kernel.org/all/20260525133028.58494274@kernel.org/
> [3] https://lore.kernel.org/all/20260525133048.2dc6d8d3@kernel.org/
> [4] https://lore.kernel.org/all/huduxtn6parzgiaf5cyiyrrvjjvx6jsdedowvrd4nkwmuyeind@j6migjgofh2i/
>
>   net/tls/tls_sw.c | 15 ++++++++++++++-
>   1 file changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
> index 964ebc268..0000000 100644
> --- a/net/tls/tls_sw.c
> +++ b/net/tls/tls_sw.c
> @@ -2867,7 +2867,20 @@ int tls_set_sw_offload(struct sock *sk, int tx,
>   	rec_seq = crypto_info_rec_seq(src_crypto_info, cipher_desc);
>
>   	if (!*aead) {
> -		*aead = crypto_alloc_aead(cipher_desc->cipher_name, 0, 0);
> +		u32 mask = 0;
> +
> +		if (tx) {
> +			struct sk_psock *psock;
> +
> +			psock = sk_psock_get(sk);
> +			if (psock) {
> +				mask = CRYPTO_ALG_ASYNC;
> +				sw_ctx_tx->async_capable = 1;
> +				sk_psock_put(sk, psock);
> +			}
> +		}
> +
> +		*aead = crypto_alloc_aead(cipher_desc->cipher_name, 0, mask);
>   		if (IS_ERR(*aead)) {
>   			rc = PTR_ERR(*aead);
>   			*aead = NULL;
> --
> 2.54.0

If async_capable is set to 1, the zerocopy path in tls_sw_sendmsg() is 
skipped.
Unfortunately ktls with bpf_msg_pop_data() does not work correctly under 
this
copy path.

tls_clone_plaintext_msg() aliases msg_pl onto msg_en's plaintext area 
(in-place encryption).

BPF runs bpf_msg_pop_data(msg, 0, 2). This shifts msg_pl's SG entry 
forward by 2 bytes.
The two SGs now point to the same page at different offsets. Physical 
memory overlaps but the start of
address differ.

I think selecting a sync provider via mask = CRYPTO_ALG_ASYNC is 
sufficient to
remove the -EINPROGRESS return path.

May be time to remove skmsg from ktls? (disable by default first, 
re-enable via a new ktls module_param?)



  reply	other threads:[~2026-05-26  6:44 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-26  2:51 [PATCH net v3] net: tls: use sync AEAD for sk_msg BPF sockets Christopher Lusk
2026-05-26  6:44 ` Jiayuan Chen [this message]
2026-05-26 23:11   ` Jakub Kicinski
2026-05-27  5:09     ` Jiayuan Chen
2026-05-27 19:16       ` John Fastabend
2026-05-28 23:47         ` Jakub Kicinski
2026-05-27  8:36     ` Sabrina Dubroca
2026-05-27 12:24     ` Christopher Lusk

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d92bc603-e345-4dee-9ae9-6ad45e4e6642@linux.dev \
    --to=jiayuan.chen@linux.dev \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=clusk@northecho.dev \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=horms@kernel.org \
    --cc=john.fastabend@gmail.com \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=sd@queasysnail.net \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.