* [PATCH net v5 0/2] bpf, skmsg: fix verdict sk_data_ready racing with ktls rx
@ 2026-05-17 14:56 Xingwang Xiang
2026-05-17 14:56 ` [PATCH net v5 1/2] " Xingwang Xiang
2026-05-17 14:56 ` [PATCH net v5 2/2] selftests/bpf: add regression test for ktls+sockmap verdict UAF Xingwang Xiang
0 siblings, 2 replies; 3+ messages in thread
From: Xingwang Xiang @ 2026-05-17 14:56 UTC (permalink / raw)
To: john.fastabend, kuba, mrpre
Cc: jakub, sd, davem, pabeni, horms, netdev, daniel, bpf,
Xingwang Xiang
sk_psock_verdict_data_ready() lacks the tls_sw_has_ctx_rx() guard that
sk_psock_strp_data_ready() gained in e91de6afa81c. When a socket is
inserted into a sockmap (BPF_SK_SKB_VERDICT) before TLS RX is configured,
the missing guard lets tcp_read_skb() drain sk_receive_queue without
advancing copied_seq. The TLS strparser is left with a dangling
frag_list pointer, which tls_decrypt_sg() then walks: a use-after-free.
Patch 1 mirrors the fix from e91de6afa81c: add the tls_sw_has_ctx_rx()
check to sk_psock_verdict_data_ready() so that when a TLS RX context is
present the function defers to psock->saved_data_ready (sock_def_readable)
instead of calling tcp_read_skb().
Patch 2 adds a selftest that drives the vulnerable sequence end-to-end
and verifies recv() returns the correct decrypted data.
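For readers unfamiliar with how the two subsystems chain their
sk_data_ready callbacks, the ordering problem can be sketched in
userspace. This is only a toy model: struct toy_sock, the toy_*
functions and the "guarded" flag are all hypothetical names invented
for illustration, and draining is reduced to zeroing a counter. It is
not kernel code and does not model locking or skb lifetimes.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_sock;
typedef void (*toy_ready_fn)(struct toy_sock *sk);

/* Hypothetical stand-in for the callback state spread over sk, psock and rx_ctx. */
struct toy_sock {
	toy_ready_fn data_ready;   /* models sk->sk_data_ready */
	toy_ready_fn psock_saved;  /* models psock->saved_data_ready */
	toy_ready_fn tls_saved;    /* models rx_ctx->saved_data_ready */
	bool tls_rx;               /* models tls_sw_has_ctx_rx(sk) */
	bool guarded;              /* true: verdict callback has the new check */
	int queued;                /* skbs still on the receive queue */
	int wakeups;               /* calls that reached the default callback */
};

/* Models sock_def_readable(): only wakes readers, never touches the queue. */
static void toy_def_readable(struct toy_sock *sk)
{
	sk->wakeups++;
}

/* Models sk_psock_verdict_data_ready(), with or without the guard. */
static void toy_verdict_data_ready(struct toy_sock *sk)
{
	if (sk->guarded && sk->tls_rx) {
		sk->psock_saved(sk);   /* defer, leave the queue to TLS */
		return;
	}
	sk->queued = 0;                /* unguarded path drains the queue */
}

/* Models the TLS data-ready path calling whatever it saved when armed. */
static void toy_tls_data_ready(struct toy_sock *sk)
{
	sk->tls_saved(sk);
}

/*
 * Sets up the reverse order (sockmap first, TLS RX second), delivers one
 * data-ready event, and returns how many skbs remain queued for TLS.
 */
int toy_deliver(bool guarded, int *wakeups)
{
	struct toy_sock sk = { .data_ready = toy_def_readable, .guarded = guarded };

	/* Sockmap insertion: save the default callback, install verdict. */
	sk.psock_saved = sk.data_ready;
	sk.data_ready = toy_verdict_data_ready;

	/* TLS RX setup: saves the verdict callback, not the default one. */
	sk.tls_saved = sk.data_ready;
	sk.data_ready = toy_tls_data_ready;
	sk.tls_rx = true;
	assert(sk.tls_saved == toy_verdict_data_ready);

	sk.queued = 1;
	sk.data_ready(&sk);

	*wakeups = sk.wakeups;
	return sk.queued;
}
```

In this model toy_deliver(false, &w) leaves nothing queued for TLS,
while toy_deliver(true, &w) leaves the skb in place and reaches the
default callback once.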
Xingwang Xiang (2):
bpf, skmsg: fix verdict sk_data_ready racing with ktls rx
selftests/bpf: add regression test for ktls+sockmap verdict UAF
net/core/skmsg.c | 9 +-
.../selftests/bpf/prog_tests/sockmap_ktls.c | 103 ++++++++++++++++++
.../selftests/bpf/progs/test_sockmap_ktls.c | 21 ++++
3 files changed, 131 insertions(+), 2 deletions(-)
* [PATCH net v5 1/2] bpf, skmsg: fix verdict sk_data_ready racing with ktls rx
From: Xingwang Xiang @ 2026-05-17 14:56 UTC (permalink / raw)
To: john.fastabend, kuba, mrpre
Cc: jakub, sd, davem, pabeni, horms, netdev, daniel, bpf,
Xingwang Xiang
sk_psock_strp_data_ready() already checks tls_sw_has_ctx_rx() and
defers to psock->saved_data_ready when a TLS RX context is present,
avoiding a conflict with the TLS strparser's ownership of the receive
queue; see commit e91de6afa81c ("bpf: Fix running sk_skb program types
with ktls").
sk_psock_verdict_data_ready() has no equivalent guard. When a socket
is inserted into a sockmap (BPF_SK_SKB_VERDICT) before TLS RX is
configured, tls_sw_strparser_arm() saves sk_psock_verdict_data_ready
as rx_ctx->saved_data_ready. On data arrival:
  tls_data_ready -> tls_strp_data_ready -> tls_rx_msg_ready
    -> saved_data_ready() = sk_psock_verdict_data_ready()
      -> tcp_read_skb() drains sk_receive_queue via __skb_unlink()
         without calling tcp_eat_skb(), so copied_seq is not advanced.
tls_strp_msg_load() then sees a stale tcp_inq() >= full_len and calls
tls_strp_load_anchor_with_queue(), where tcp_recv_skb() runs against
the now-empty queue and trips WARN_ON_ONCE(!first). The function
returns with rx_ctx->strp.anchor.frag_list still pointing at a
psock-owned (potentially freed) skb. tls_decrypt_sg() subsequently
walks that frag_list: use-after-free.
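
The stale tcp_inq() value can be illustrated with a small userspace
model of the sequence accounting. Everything here is an assumption made
for illustration: struct toy_rx and the toy_* helpers are hypothetical,
and the in-queue byte count is simplified to rcv_nxt - copied_seq,
ignoring FIN handling and socket state.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, heavily simplified view of the receive-side counters. */
struct toy_rx {
	uint32_t rcv_nxt;     /* next sequence number expected from the peer */
	uint32_t copied_seq;  /* first sequence number not yet consumed */
	uint32_t queued;      /* bytes actually present on the receive queue */
};

/* Simplified model of tcp_inq(): bytes the counters claim are readable. */
uint32_t toy_inq(const struct toy_rx *rx)
{
	return rx->rcv_nxt - rx->copied_seq;
}

/* New data arrives and is appended to the receive queue. */
void toy_receive(struct toy_rx *rx, uint32_t len)
{
	rx->rcv_nxt += len;
	rx->queued += len;
}

/* Models draining by unlinking only: the queue empties, copied_seq stays. */
void toy_drain_no_eat(struct toy_rx *rx)
{
	rx->queued = 0;
}

/* Models a normal consume: data leaves the queue and copied_seq advances. */
void toy_consume(struct toy_rx *rx, uint32_t len)
{
	assert(len <= rx->queued);
	rx->queued -= len;
	rx->copied_seq += len;
}

/*
 * Returns 1 when the counters promise a full record of full_len bytes
 * but the queue no longer holds it, which is the inconsistent state the
 * TLS strparser runs into in the scenario described above.
 */
int toy_is_stale(const struct toy_rx *rx, uint32_t full_len)
{
	return toy_inq(rx) >= full_len && rx->queued < full_len;
}
```

After toy_receive(&rx, 100) followed by toy_drain_no_eat(&rx), the
model still reports toy_inq(&rx) == 100 with an empty queue, whereas
toy_consume(&rx, 100) brings both values back to zero.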
Apply the same fix as sk_psock_strp_data_ready(): if a TLS RX context
is present, call psock->saved_data_ready (sock_def_readable) to wake
recv() waiters and return immediately, leaving the receive queue
untouched. TLS retains sole ownership of the queue and decrypts the
record normally through tls_sw_recvmsg().
Signed-off-by: Xingwang Xiang <v3rdant.xiang@gmail.com>
---
net/core/skmsg.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index 6187a83bd..e1850caf1 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -1268,12 +1268,19 @@ static int sk_psock_verdict_recv(struct sock *sk, struct sk_buff *skb)
static void sk_psock_verdict_data_ready(struct sock *sk)
{
const struct proto_ops *ops = NULL;
+ struct sk_psock *psock;
struct socket *sock;
int copied;
trace_sk_data_ready(sk);
rcu_read_lock();
+ psock = sk_psock(sk);
+ if (psock && tls_sw_has_ctx_rx(sk)) {
+ psock->saved_data_ready(sk);
+ rcu_read_unlock();
+ return;
+ }
sock = READ_ONCE(sk->sk_socket);
if (likely(sock))
ops = READ_ONCE(sock->ops);
@@ -1283,8 +1290,6 @@ static void sk_psock_verdict_data_ready(struct sock *sk)
copied = ops->read_skb(sk, sk_psock_verdict_recv);
if (copied >= 0) {
- struct sk_psock *psock;
-
rcu_read_lock();
psock = sk_psock(sk);
if (psock)
* [PATCH net v5 2/2] selftests/bpf: add regression test for ktls+sockmap verdict UAF
From: Xingwang Xiang @ 2026-05-17 14:56 UTC (permalink / raw)
To: john.fastabend, kuba, mrpre
Cc: jakub, sd, davem, pabeni, horms, netdev, daniel, bpf,
Xingwang Xiang
Test the scenario where a socket is inserted into a sockmap with a
BPF_SK_SKB_VERDICT program before TLS RX is configured. Previously
sk_psock_verdict_data_ready() would call tcp_read_skb() and drain the
receive queue without advancing copied_seq, causing tls_decrypt_sg()
to walk a dangling frag_list pointer (use-after-free).
The test drives the full vulnerable sequence and verifies that, with
the fix applied, recv() returns the correct decrypted data.
Signed-off-by: Xingwang Xiang <v3rdant.xiang@gmail.com>
---
.../selftests/bpf/prog_tests/sockmap_ktls.c | 103 ++++++++++++++++++
.../selftests/bpf/progs/test_sockmap_ktls.c | 21 ++++
2 files changed, 124 insertions(+)
diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_ktls.c b/tools/testing/selftests/bpf/prog_tests/sockmap_ktls.c
index b87e7f39e..6ed8e149e 100644
--- a/tools/testing/selftests/bpf/prog_tests/sockmap_ktls.c
+++ b/tools/testing/selftests/bpf/prog_tests/sockmap_ktls.c
@@ -417,6 +417,107 @@ static void run_tests(int family, enum bpf_map_type map_type)
close(map);
}
+/*
+ * Regression test for the KTLS + sockmap (verdict) reverse-order UAF.
+ *
+ * Vulnerable sequence:
+ * 1. Insert receiver socket into sockmap with BPF_SK_SKB_VERDICT program.
+ * sk->sk_data_ready becomes sk_psock_verdict_data_ready.
+ * 2. Configure TLS RX: tls_sw_strparser_arm() saves
+ * sk_psock_verdict_data_ready as rx_ctx->saved_data_ready.
+ *
+ * When data arrives, tls_rx_msg_ready() calls saved_data_ready() =
+ * sk_psock_verdict_data_ready(), which calls tcp_read_skb() and drains
+ * sk_receive_queue via __skb_unlink() without advancing copied_seq.
+ * tls_strp_msg_load() then finds the queue empty while tcp_inq() is still
+ * non-zero, hits WARN_ON_ONCE(!first), and leaves a dangling frag_list
+ * pointer that tls_decrypt_sg() walks — a use-after-free.
+ *
+ * The fix adds a tls_sw_has_ctx_rx() check to sk_psock_verdict_data_ready(),
+ * mirroring what sk_psock_strp_data_ready() already does: when a TLS RX
+ * context is present, defer to psock->saved_data_ready (sock_def_readable)
+ * instead of calling tcp_read_skb(), so TLS retains sole ownership of the
+ * receive queue. Data is then decrypted and returned correctly by
+ * tls_sw_recvmsg().
+ */
+static void test_sockmap_ktls_verdict_with_tls_rx(int family, int sotype)
+{
+ struct tls12_crypto_info_aes_gcm_128 crypto_info = {};
+ char send_buf[] = "hello ktls sockmap reverse order";
+ char recv_buf[sizeof(send_buf)] = {};
+ struct test_sockmap_ktls *skel;
+ int c = -1, p = -1, zero = 0;
+ int prog_fd, map_fd;
+ ssize_t n;
+ int err;
+
+ skel = test_sockmap_ktls__open_and_load();
+ if (!ASSERT_TRUE(skel, "open_and_load"))
+ return;
+
+ err = create_pair(family, sotype, &c, &p);
+ if (!ASSERT_OK(err, "create_pair"))
+ goto out;
+
+ prog_fd = bpf_program__fd(skel->progs.prog_skb_verdict_pass);
+ map_fd = bpf_map__fd(skel->maps.sock_map_verdict);
+
+ err = bpf_prog_attach(prog_fd, map_fd, BPF_SK_SKB_VERDICT, 0);
+ if (!ASSERT_OK(err, "bpf_prog_attach sk_skb verdict"))
+ goto out;
+
+ /* Step 1: configure TLS TX on sender (no sockmap involvement) */
+ err = setsockopt(c, IPPROTO_TCP, TCP_ULP, "tls", strlen("tls"));
+ if (!ASSERT_OK(err, "setsockopt(TCP_ULP) client"))
+ goto out;
+
+ crypto_info.info.version = TLS_1_2_VERSION;
+ crypto_info.info.cipher_type = TLS_CIPHER_AES_GCM_128;
+ memset(crypto_info.key, 0x01, sizeof(crypto_info.key));
+ memset(crypto_info.salt, 0x02, sizeof(crypto_info.salt));
+
+ err = setsockopt(c, SOL_TLS, TLS_TX, &crypto_info, sizeof(crypto_info));
+ if (!ASSERT_OK(err, "setsockopt(TLS_TX)"))
+ goto out;
+
+ /* Step 2: insert receiver into sockmap BEFORE TLS RX */
+ err = bpf_map_update_elem(map_fd, &zero, &p, BPF_NOEXIST);
+ if (!ASSERT_OK(err, "bpf_map_update_elem"))
+ goto out;
+
+ /* Step 3: configure TLS RX AFTER sockmap insertion */
+ err = setsockopt(p, IPPROTO_TCP, TCP_ULP, "tls", strlen("tls"));
+ if (!ASSERT_OK(err, "setsockopt(TCP_ULP) server"))
+ goto out;
+
+ err = setsockopt(p, SOL_TLS, TLS_RX, &crypto_info, sizeof(crypto_info));
+ if (!ASSERT_OK(err, "setsockopt(TLS_RX)"))
+ goto out;
+
+ /*
+ * A buggy kernel hits WARN_ON_ONCE in tls_strp_load_anchor_with_queue
+ * and may UAF in tls_decrypt_sg here. With the fix,
+ * sk_psock_verdict_data_ready defers to sock_def_readable and TLS
+ * decrypts the record normally.
+ */
+ n = send(c, send_buf, sizeof(send_buf), 0);
+ if (!ASSERT_EQ(n, (ssize_t)sizeof(send_buf), "send"))
+ goto out;
+
+ n = recv_timeout(p, recv_buf, sizeof(recv_buf), 0, 5);
+ if (!ASSERT_EQ(n, (ssize_t)sizeof(send_buf), "recv"))
+ goto out;
+
+ ASSERT_OK(memcmp(send_buf, recv_buf, sizeof(send_buf)), "data integrity");
+
+out:
+ if (c != -1)
+ close(c);
+ if (p != -1)
+ close(p);
+ test_sockmap_ktls__destroy(skel);
+}
+
static void run_ktls_test(int family, int sotype)
{
if (test__start_subtest("tls simple offload"))
@@ -429,6 +530,8 @@ static void run_ktls_test(int family, int sotype)
test_sockmap_ktls_tx_no_buf(family, sotype, true);
if (test__start_subtest("tls tx with pop"))
test_sockmap_ktls_tx_pop(family, sotype);
+ if (test__start_subtest("tls verdict with tls rx"))
+ test_sockmap_ktls_verdict_with_tls_rx(family, sotype);
}
void test_sockmap_ktls(void)
diff --git a/tools/testing/selftests/bpf/progs/test_sockmap_ktls.c b/tools/testing/selftests/bpf/progs/test_sockmap_ktls.c
index 83df4919c..facafeaf4 100644
--- a/tools/testing/selftests/bpf/progs/test_sockmap_ktls.c
+++ b/tools/testing/selftests/bpf/progs/test_sockmap_ktls.c
@@ -17,6 +17,13 @@ struct {
__type(value, int);
} sock_map SEC(".maps");
+struct {
+ __uint(type, BPF_MAP_TYPE_SOCKMAP);
+ __uint(max_entries, 2);
+ __type(key, int);
+ __type(value, int);
+} sock_map_verdict SEC(".maps");
+
SEC("sk_msg")
int prog_sk_policy(struct sk_msg_md *msg)
{
@@ -38,3 +45,17 @@ int prog_sk_policy_redir(struct sk_msg_md *msg)
bpf_msg_apply_bytes(msg, apply_bytes);
return bpf_msg_redirect_map(msg, &sock_map, two, 0);
}
+
+/*
+ * Verdict program for the reverse-order TLS/sockmap regression test.
+ * Returns SK_PASS so tcp_read_skb() drains the receive queue via
+ * sk_psock_verdict_recv() without calling tcp_eat_skb(), which is
+ * the precondition for the KTLS strparser frag_list UAF.
+ */
+SEC("sk_skb/verdict")
+int prog_skb_verdict_pass(struct __sk_buff *skb)
+{
+ return SK_PASS;
+}
+
+char _license[] SEC("license") = "GPL";