* [PATCH net v5 0/2] bpf, skmsg: fix verdict sk_data_ready racing with ktls rx
@ 2026-05-17 14:56 Xingwang Xiang
2026-05-17 14:56 ` [PATCH net v5 1/2] " Xingwang Xiang
2026-05-17 14:56 ` [PATCH net v5 2/2] selftests/bpf: add regression test for ktls+sockmap verdict UAF Xingwang Xiang
0 siblings, 2 replies; 5+ messages in thread
From: Xingwang Xiang @ 2026-05-17 14:56 UTC
To: john.fastabend, kuba, mrpre
Cc: jakub, sd, davem, pabeni, horms, netdev, daniel, bpf,
Xingwang Xiang
sk_psock_verdict_data_ready() lacks the tls_sw_has_ctx_rx() guard that
sk_psock_strp_data_ready() gained in commit e91de6afa81c ("bpf: Fix
running sk_skb program types with ktls"). When a socket is inserted into
a sockmap (BPF_SK_SKB_VERDICT) before TLS RX is configured, the missing
guard lets tcp_read_skb() drain sk_receive_queue without advancing
copied_seq. The TLS strparser is left with a dangling frag_list pointer
that tls_decrypt_sg() later walks: a use-after-free.
Patch 1 mirrors the fix from e91de6afa81c: add the tls_sw_has_ctx_rx()
check to sk_psock_verdict_data_ready() so that when a TLS RX context is
present the function defers to psock->saved_data_ready (sock_def_readable)
instead of calling tcp_read_skb().
Patch 2 adds a selftest that drives the vulnerable sequence end-to-end
and verifies recv() returns the correct decrypted data.
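For readers who want the accounting mismatch spelled out, here is a
minimal userspace sketch of the drain described above. It is a toy
model, not kernel code: every toy_* name is hypothetical, and the only
assumption carried over from the description is that readable bytes are
derived from rcv_nxt - copied_seq while the verdict path unlinks buffers
without touching copied_seq.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy userspace model; every name here is hypothetical, not a kernel API. */
struct toy_sock {
	uint32_t rcv_nxt;     /* next sequence number expected from the peer */
	uint32_t copied_seq;  /* first sequence number not yet consumed */
	size_t   queued_skbs; /* buffers currently on the receive queue */
};

/* Bytes the socket advertises as readable (rcv_nxt - copied_seq). */
static uint32_t toy_inq(const struct toy_sock *sk)
{
	return sk->rcv_nxt - sk->copied_seq;
}

/* Data arrival: one buffer of len bytes is queued. */
static void toy_receive(struct toy_sock *sk, uint32_t len)
{
	sk->rcv_nxt += len;
	sk->queued_skbs++;
}

/* Verdict-style drain: unlinks every buffer but leaves copied_seq alone. */
static void toy_drain_without_eat(struct toy_sock *sk)
{
	sk->queued_skbs = 0;
}

/* With the guard, a socket that has a TLS RX context is left untouched. */
static void toy_verdict_data_ready(struct toy_sock *sk, int tls_rx_present,
				   int guarded)
{
	if (guarded && tls_rx_present)
		return;
	toy_drain_without_eat(sk);
}

/* Inconsistent state: bytes advertised as readable, but nothing queued. */
static int toy_queue_is_stale(const struct toy_sock *sk)
{
	return toy_inq(sk) > 0 && sk->queued_skbs == 0;
}

/* Replays one record arriving on a socket that has a TLS RX context. */
static int toy_replay(int guarded)
{
	struct toy_sock sk = { .rcv_nxt = 1000, .copied_seq = 1000, .queued_skbs = 0 };
	const int tls_rx_present = 1;

	toy_receive(&sk, 64);
	toy_verdict_data_ready(&sk, tls_rx_present, guarded);
	return toy_queue_is_stale(&sk);
}
```

In this model toy_replay(0) leaves 64 bytes advertised with an empty
queue, which is the state the TLS strparser trips over, while
toy_replay(1) leaves the queue and the sequence numbers in agreement.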
Xingwang Xiang (2):
bpf, skmsg: fix verdict sk_data_ready racing with ktls rx
selftests/bpf: add regression test for ktls+sockmap verdict UAF
net/core/skmsg.c | 9 +-
.../selftests/bpf/prog_tests/sockmap_ktls.c | 103 ++++++++++++++++++
.../selftests/bpf/progs/test_sockmap_ktls.c | 21 ++++
3 files changed, 131 insertions(+), 2 deletions(-)
* [PATCH net v5 1/2] bpf, skmsg: fix verdict sk_data_ready racing with ktls rx
From: Xingwang Xiang @ 2026-05-17 14:56 UTC
To: john.fastabend, kuba, mrpre
Cc: jakub, sd, davem, pabeni, horms, netdev, daniel, bpf,
Xingwang Xiang
sk_psock_strp_data_ready() already checks tls_sw_has_ctx_rx() and
defers to psock->saved_data_ready when a TLS RX context is present,
avoiding a conflict with the TLS strparser's ownership of the receive
queue (commit e91de6afa81c, "bpf: Fix running sk_skb program types
with ktls").
sk_psock_verdict_data_ready() has no equivalent guard. When a socket
is inserted into a sockmap (BPF_SK_SKB_VERDICT) before TLS RX is
configured, tls_sw_strparser_arm() saves sk_psock_verdict_data_ready
as rx_ctx->saved_data_ready. On data arrival:
tls_data_ready -> tls_strp_data_ready -> tls_rx_msg_ready
-> saved_data_ready() = sk_psock_verdict_data_ready()
-> tcp_read_skb() drains sk_receive_queue via __skb_unlink()
without calling tcp_eat_skb(), so copied_seq is not advanced.
tls_strp_msg_load() then finds tcp_inq() >= full_len (stale), calls
tcp_recv_skb() on the now-empty queue, hits WARN_ON_ONCE(!first), and
returns with rx_ctx->strp.anchor.frag_list pointing at a psock-owned
(potentially freed) skb. tls_decrypt_sg() subsequently walks that
frag_list: use-after-free.
Apply the same fix as sk_psock_strp_data_ready(): if a TLS RX context
is present, call psock->saved_data_ready (sock_def_readable) to wake
recv() waiters and return immediately, leaving the receive queue
untouched. TLS retains sole ownership of the queue and decrypts the
record normally through tls_sw_recvmsg().
Signed-off-by: Xingwang Xiang <v3rdant.xiang@gmail.com>
---
net/core/skmsg.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index 6187a83bd..e1850caf1 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -1268,12 +1268,19 @@ static int sk_psock_verdict_recv(struct sock *sk, struct sk_buff *skb)
static void sk_psock_verdict_data_ready(struct sock *sk)
{
const struct proto_ops *ops = NULL;
+ struct sk_psock *psock;
struct socket *sock;
int copied;
trace_sk_data_ready(sk);
rcu_read_lock();
+ psock = sk_psock(sk);
+ if (psock && tls_sw_has_ctx_rx(sk)) {
+ psock->saved_data_ready(sk);
+ rcu_read_unlock();
+ return;
+ }
sock = READ_ONCE(sk->sk_socket);
if (likely(sock))
ops = READ_ONCE(sock->ops);
@@ -1283,8 +1290,6 @@ static void sk_psock_verdict_data_ready(struct sock *sk)
copied = ops->read_skb(sk, sk_psock_verdict_recv);
if (copied >= 0) {
- struct sk_psock *psock;
-
rcu_read_lock();
psock = sk_psock(sk);
if (psock)
* [PATCH net v5 2/2] selftests/bpf: add regression test for ktls+sockmap verdict UAF
From: Xingwang Xiang @ 2026-05-17 14:56 UTC
To: john.fastabend, kuba, mrpre
Cc: jakub, sd, davem, pabeni, horms, netdev, daniel, bpf,
Xingwang Xiang
Test the scenario where a socket is inserted into a sockmap with a
BPF_SK_SKB_VERDICT program before TLS RX is configured. Previously
sk_psock_verdict_data_ready() would call tcp_read_skb() and drain the
receive queue without advancing copied_seq, causing tls_decrypt_sg()
to walk a dangling frag_list pointer (use-after-free).
The test drives the full vulnerable sequence and verifies that after
the fix recv() returns the correct decrypted data.
Signed-off-by: Xingwang Xiang <v3rdant.xiang@gmail.com>
---
.../selftests/bpf/prog_tests/sockmap_ktls.c | 103 ++++++++++++++++++
.../selftests/bpf/progs/test_sockmap_ktls.c | 21 ++++
2 files changed, 124 insertions(+)
diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_ktls.c b/tools/testing/selftests/bpf/prog_tests/sockmap_ktls.c
index b87e7f39e..6ed8e149e 100644
--- a/tools/testing/selftests/bpf/prog_tests/sockmap_ktls.c
+++ b/tools/testing/selftests/bpf/prog_tests/sockmap_ktls.c
@@ -417,6 +417,107 @@ static void run_tests(int family, enum bpf_map_type map_type)
close(map);
}
+/*
+ * Regression test for the KTLS + sockmap (verdict) reverse-order UAF.
+ *
+ * Vulnerable sequence:
+ * 1. Insert receiver socket into sockmap with BPF_SK_SKB_VERDICT program.
+ * sk->sk_data_ready becomes sk_psock_verdict_data_ready.
+ * 2. Configure TLS RX: tls_sw_strparser_arm() saves
+ * sk_psock_verdict_data_ready as rx_ctx->saved_data_ready.
+ *
+ * When data arrives, tls_rx_msg_ready() calls saved_data_ready() =
+ * sk_psock_verdict_data_ready(), which calls tcp_read_skb() and drains
+ * sk_receive_queue via __skb_unlink() without advancing copied_seq.
+ * tls_strp_msg_load() then finds the queue empty while tcp_inq() is still
+ * non-zero, hits WARN_ON_ONCE(!first), and leaves a dangling frag_list
+ * pointer that tls_decrypt_sg() walks — a use-after-free.
+ *
+ * The fix adds a tls_sw_has_ctx_rx() check to sk_psock_verdict_data_ready(),
+ * mirroring what sk_psock_strp_data_ready() already does: when a TLS RX
+ * context is present, defer to psock->saved_data_ready (sock_def_readable)
+ * instead of calling tcp_read_skb(), so TLS retains sole ownership of the
+ * receive queue. Data is then decrypted and returned correctly by
+ * tls_sw_recvmsg().
+ */
+static void test_sockmap_ktls_verdict_with_tls_rx(int family, int sotype)
+{
+ struct tls12_crypto_info_aes_gcm_128 crypto_info = {};
+ char send_buf[] = "hello ktls sockmap reverse order";
+ char recv_buf[sizeof(send_buf)] = {};
+ struct test_sockmap_ktls *skel;
+ int c = -1, p = -1, zero = 0;
+ int prog_fd, map_fd;
+ ssize_t n;
+ int err;
+
+ skel = test_sockmap_ktls__open_and_load();
+ if (!ASSERT_TRUE(skel, "open_and_load"))
+ return;
+
+ err = create_pair(family, sotype, &c, &p);
+ if (!ASSERT_OK(err, "create_pair"))
+ goto out;
+
+ prog_fd = bpf_program__fd(skel->progs.prog_skb_verdict_pass);
+ map_fd = bpf_map__fd(skel->maps.sock_map_verdict);
+
+ err = bpf_prog_attach(prog_fd, map_fd, BPF_SK_SKB_VERDICT, 0);
+ if (!ASSERT_OK(err, "bpf_prog_attach sk_skb verdict"))
+ goto out;
+
+ /* Step 1: configure TLS TX on sender (no sockmap involvement) */
+ err = setsockopt(c, IPPROTO_TCP, TCP_ULP, "tls", strlen("tls"));
+ if (!ASSERT_OK(err, "setsockopt(TCP_ULP) client"))
+ goto out;
+
+ crypto_info.info.version = TLS_1_2_VERSION;
+ crypto_info.info.cipher_type = TLS_CIPHER_AES_GCM_128;
+ memset(crypto_info.key, 0x01, sizeof(crypto_info.key));
+ memset(crypto_info.salt, 0x02, sizeof(crypto_info.salt));
+
+ err = setsockopt(c, SOL_TLS, TLS_TX, &crypto_info, sizeof(crypto_info));
+ if (!ASSERT_OK(err, "setsockopt(TLS_TX)"))
+ goto out;
+
+ /* Step 2: insert receiver into sockmap BEFORE TLS RX */
+ err = bpf_map_update_elem(map_fd, &zero, &p, BPF_NOEXIST);
+ if (!ASSERT_OK(err, "bpf_map_update_elem"))
+ goto out;
+
+ /* Step 3: configure TLS RX AFTER sockmap insertion */
+ err = setsockopt(p, IPPROTO_TCP, TCP_ULP, "tls", strlen("tls"));
+ if (!ASSERT_OK(err, "setsockopt(TCP_ULP) server"))
+ goto out;
+
+ err = setsockopt(p, SOL_TLS, TLS_RX, &crypto_info, sizeof(crypto_info));
+ if (!ASSERT_OK(err, "setsockopt(TLS_RX)"))
+ goto out;
+
+ /*
+ * A buggy kernel hits WARN_ON_ONCE in tls_strp_load_anchor_with_queue
+ * and may UAF in tls_decrypt_sg here. With the fix,
+ * sk_psock_verdict_data_ready defers to sock_def_readable and TLS
+ * decrypts the record normally.
+ */
+ n = send(c, send_buf, sizeof(send_buf), 0);
+ if (!ASSERT_EQ(n, (ssize_t)sizeof(send_buf), "send"))
+ goto out;
+
+ n = recv_timeout(p, recv_buf, sizeof(recv_buf), 0, 5);
+ if (!ASSERT_EQ(n, (ssize_t)sizeof(send_buf), "recv"))
+ goto out;
+
+ ASSERT_OK(memcmp(send_buf, recv_buf, sizeof(send_buf)), "data integrity");
+
+out:
+ if (c != -1)
+ close(c);
+ if (p != -1)
+ close(p);
+ test_sockmap_ktls__destroy(skel);
+}
+
static void run_ktls_test(int family, int sotype)
{
if (test__start_subtest("tls simple offload"))
@@ -429,6 +530,8 @@ static void run_ktls_test(int family, int sotype)
test_sockmap_ktls_tx_no_buf(family, sotype, true);
if (test__start_subtest("tls tx with pop"))
test_sockmap_ktls_tx_pop(family, sotype);
+ if (test__start_subtest("tls verdict with tls rx"))
+ test_sockmap_ktls_verdict_with_tls_rx(family, sotype);
}
void test_sockmap_ktls(void)
diff --git a/tools/testing/selftests/bpf/progs/test_sockmap_ktls.c b/tools/testing/selftests/bpf/progs/test_sockmap_ktls.c
index 83df4919c..facafeaf4 100644
--- a/tools/testing/selftests/bpf/progs/test_sockmap_ktls.c
+++ b/tools/testing/selftests/bpf/progs/test_sockmap_ktls.c
@@ -17,6 +17,13 @@ struct {
__type(value, int);
} sock_map SEC(".maps");
+struct {
+ __uint(type, BPF_MAP_TYPE_SOCKMAP);
+ __uint(max_entries, 2);
+ __type(key, int);
+ __type(value, int);
+} sock_map_verdict SEC(".maps");
+
SEC("sk_msg")
int prog_sk_policy(struct sk_msg_md *msg)
{
@@ -38,3 +45,17 @@ int prog_sk_policy_redir(struct sk_msg_md *msg)
bpf_msg_apply_bytes(msg, apply_bytes);
return bpf_msg_redirect_map(msg, &sock_map, two, 0);
}
+
+/*
+ * Verdict program for the reverse-order TLS/sockmap regression test.
+ * Returns SK_PASS so tcp_read_skb() drains the receive queue via
+ * sk_psock_verdict_recv() without calling tcp_eat_skb(), which is
+ * the precondition for the KTLS strparser frag_list UAF.
+ */
+SEC("sk_skb/verdict")
+int prog_skb_verdict_pass(struct __sk_buff *skb)
+{
+ return SK_PASS;
+}
+
+char _license[] SEC("license") = "GPL";
* Re: [PATCH net v5 1/2] bpf, skmsg: fix verdict sk_data_ready racing with ktls rx
From: sashiko-bot @ 2026-05-18 14:57 UTC
To: Xingwang Xiang; +Cc: bpf
Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [Critical] Concurrent sockmap deletion causes a deterministic NULL pointer dereference and kernel panic.
- [High] Type confusion leading to out-of-bounds reads and BPF verdict bypass due to blind ULP context cast.
--
commit 99c04859dba221cd63c5238a47d1833983548f83
Author: Xingwang Xiang <v3rdant.xiang@gmail.com>
bpf, skmsg: fix verdict sk_data_ready racing with ktls rx
sk_psock_strp_data_ready() already checks tls_sw_has_ctx_rx() and
defers to psock->saved_data_ready when a TLS RX context is present.
This patch applies the same logic to sk_psock_verdict_data_ready() to
prevent a use-after-free caused by tcp_read_skb draining the queue
without advancing copied_seq.
> diff --git a/net/core/skmsg.c b/net/core/skmsg.c
> index 6187a83bd7411..e1850caf1a71a 100644
> --- a/net/core/skmsg.c
> +++ b/net/core/skmsg.c
> @@ -1268,12 +1268,19 @@ static int sk_psock_verdict_recv(struct sock *sk, struct sk_buff *skb)
> static void sk_psock_verdict_data_ready(struct sock *sk)
> {
> const struct proto_ops *ops = NULL;
> + struct sk_psock *psock;
> struct socket *sock;
> int copied;
>
> trace_sk_data_ready(sk);
>
> rcu_read_lock();
> + psock = sk_psock(sk);
> + if (psock && tls_sw_has_ctx_rx(sk)) {
Is it possible for tls_sw_has_ctx_rx() to cause a type confusion regression here?
tls_sw_has_ctx_rx() calls tls_get_ctx(), which casts icsk->icsk_ulp_data
to a struct tls_context pointer. If a socket has a different ULP configured,
like espintcp, and is attached to a sockmap, icsk_ulp_data will point to a
different structure like espintcp_ctx.
Reading from this mismatched structure could yield garbage data. If that
data is non-zero, the code might falsely assume a KTLS RX context is active
and incorrectly skip the BPF verdict program.
While this problem already exists in sk_psock_strp_data_ready(), does this
change spread the same issue to the verdict path?
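To make the concern concrete, here is a small userspace sketch of
checking which ULP owns the context before casting it. It is only an
illustration of the question above, not a proposed kernel change: the
toy_* structures and the name comparison are assumptions made for the
example and do not reflect how the kernel identifies a ULP.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; none of these are kernel types. */
struct toy_ulp_ops   { const char *name; };
struct toy_tls_ctx   { int rx_conf; };        /* non-zero: TLS RX configured */
struct toy_other_ctx { long unrelated[4]; };  /* some other ULP's private data */

struct toy_sock {
	const struct toy_ulp_ops *ulp_ops; /* which ULP is attached, if any */
	void *ulp_data;                    /* that ULP's private context */
};

/* Only interpret ulp_data as a TLS context when the attached ULP is "tls". */
static int toy_has_tls_rx(const struct toy_sock *sk)
{
	const struct toy_tls_ctx *ctx;

	if (!sk->ulp_ops || !sk->ulp_ops->name || !sk->ulp_data)
		return 0;
	if (strcmp(sk->ulp_ops->name, "tls") != 0)
		return 0; /* foreign context: never cast it */

	ctx = sk->ulp_data;
	return ctx->rx_conf != 0;
}

/* Builds a socket with the named ULP attached and runs the check. */
static int toy_check_with_ulp(const char *ulp_name)
{
	struct toy_tls_ctx tls = { .rx_conf = 1 };
	struct toy_other_ctx other = { .unrelated = { 7, 7, 7, 7 } };
	struct toy_ulp_ops ops = { .name = ulp_name };
	struct toy_sock sk = { .ulp_ops = &ops, .ulp_data = NULL };

	if (ulp_name && strcmp(ulp_name, "tls") == 0)
		sk.ulp_data = &tls;
	else
		sk.ulp_data = &other; /* non-zero bytes a blind cast would misread */

	return toy_has_tls_rx(&sk);
}
```

With the identity check in place, the non-zero bytes of the foreign
context are never interpreted as a TLS RX flag.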
> + psock->saved_data_ready(sk);
Can this lead to a NULL pointer dereference regression during concurrent
sockmap deletion?
sk_psock_verdict_data_ready() executes here without holding
sk->sk_callback_lock. If userspace concurrently removes the socket from the
sockmap, sk_psock_stop_verdict() will be called, which explicitly sets
psock->saved_data_ready to NULL.
Since this code doesn't hold read_lock_bh(&sk->sk_callback_lock) or check if
saved_data_ready is NULL before calling it, could it execute the newly nullified
pointer and cause a crash?
I notice this vulnerability already exists in sk_psock_strp_data_ready(), but
does this duplicate it into the verdict handler?
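One defensive pattern for this kind of race is to load the callback
pointer once into a local and test it before calling. The sketch below
shows that pattern in userspace with C11 atomics. It is an assumption-
laden illustration, not the kernel fix: the toy_* names are
hypothetical, and it says nothing about whether a local snapshot or
taking sk->sk_callback_lock is the right answer for skmsg.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are kernel types. */
struct toy_sock { int wakeups; };
typedef void (*toy_ready_fn)(struct toy_sock *sk);

struct toy_psock {
	_Atomic(toy_ready_fn) saved_data_ready; /* may be cleared concurrently */
};

static void toy_wake(struct toy_sock *sk)
{
	sk->wakeups++;
}

/*
 * Load the callback exactly once, then test and call the local copy.
 * Returns 1 if the callback ran, 0 if it had already been cleared.
 */
static int toy_call_saved(struct toy_psock *psock, struct toy_sock *sk)
{
	toy_ready_fn fn = atomic_load(&psock->saved_data_ready);

	if (!fn)
		return 0;
	fn(sk);
	return 1;
}

/* Runs the call with the callback either still set or already cleared. */
static int toy_demo(int detach_first)
{
	struct toy_sock sk = { .wakeups = 0 };
	struct toy_psock psock;
	toy_ready_fn none = NULL;

	atomic_init(&psock.saved_data_ready, toy_wake);
	if (detach_first)
		atomic_store(&psock.saved_data_ready, none); /* models detach */

	toy_call_saved(&psock, &sk);
	return sk.wakeups;
}
```

The snapshot only prevents calling through a pointer that was cleared
before the load. It does not address the lifetime of whatever the
callback touches.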
> + rcu_read_unlock();
> + return;
> + }
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260517145630.20521-1-v3rdant.xiang@gmail.com?part=1
* Re: [PATCH net v5 2/2] selftests/bpf: add regression test for ktls+sockmap verdict UAF
From: sashiko-bot @ 2026-05-18 14:57 UTC
To: Xingwang Xiang; +Cc: bpf
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Low] Usage of ASSERT_TRUE for skeleton pointer checks instead of ASSERT_OK_PTR.
--
commit 2da876f992ba88c42d69b2910d90f780e2c1b89f
Author: Xingwang Xiang <v3rdant.xiang@gmail.com>
selftests/bpf: add regression test for ktls+sockmap verdict UAF
This commit adds a regression test for a use-after-free vulnerability
involving KTLS and sockmap verdict programs. The test creates a scenario
where TLS RX is configured after inserting the receiver socket into a
sockmap.
> diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_ktls.c b/tools/testing/selftests/bpf/prog_tests/sockmap_ktls.c
> index b87e7f39e15a8..6ed8e149e3d58 100644
> --- a/tools/testing/selftests/bpf/prog_tests/sockmap_ktls.c
> +++ b/tools/testing/selftests/bpf/prog_tests/sockmap_ktls.c
[ ... ]
> @@ -417,6 +417,107 @@ static void run_tests(int family, enum bpf_map_type map_type)
> close(map);
> }
>
> +/*
> + * Regression test for the KTLS + sockmap (verdict) reverse-order UAF.
> + *
> + * Vulnerable sequence:
[ ... ]
> +static void test_sockmap_ktls_verdict_with_tls_rx(int family, int sotype)
> +{
> + struct tls12_crypto_info_aes_gcm_128 crypto_info = {};
> + char send_buf[] = "hello ktls sockmap reverse order";
> + char recv_buf[sizeof(send_buf)] = {};
> + struct test_sockmap_ktls *skel;
> + int c = -1, p = -1, zero = 0;
> + int prog_fd, map_fd;
> + ssize_t n;
> + int err;
> +
> + skel = test_sockmap_ktls__open_and_load();
> + if (!ASSERT_TRUE(skel, "open_and_load"))
This isn't a bug, but should this use ASSERT_OK_PTR() instead?
ASSERT_OK_PTR() handles IS_ERR_OR_NULL checks and provides better
debugging output for pointers on failure.
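The difference between a truthiness check and an IS_ERR_OR_NULL-style
check can be sketched in userspace as follows. The toy_* helpers are
hypothetical and only model the usual "small negative errno stored in a
pointer" encoding. Whether the skeleton loader used in this test can
return such a pointer is not established in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: the top 4095 addresses encode a negative errno. */
#define TOY_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static void *toy_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* True when the pointer value falls in the error-encoding range. */
static int toy_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-TOY_MAX_ERRNO;
}

/* What a plain truthiness check sees: only NULL is rejected. */
static int toy_truthy(const void *p)
{
	return p != NULL;
}

/* What an IS_ERR_OR_NULL-style check sees: NULL and error values rejected. */
static int toy_ok_ptr(const void *p)
{
	return p != NULL && !toy_is_err(p);
}

/* Address of a real object, for comparison with the encoded errors. */
static void *toy_valid(void)
{
	static int object;

	return &object;
}
```

An error-encoded pointer passes the truthiness check but fails the
stricter one, which is the case the suggestion above is guarding
against.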
> + return;
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260517145630.20521-1-v3rdant.xiang@gmail.com?part=2