* [PATCH bpf v8 0/2] bpf: tcp: Fix null-ptr-deref in arbitrary SYN Cookie
@ 2026-06-08 12:58 Jiayuan Chen
2026-06-08 12:58 ` [PATCH bpf v8 1/2] net: Validate protocol in skb_steal_sock() for BPF-assigned sockets Jiayuan Chen
2026-06-08 12:58 ` [PATCH bpf v8 2/2] selftests/bpf: Add protocol check test for bpf_sk_assign_tcp_reqsk() Jiayuan Chen
0 siblings, 2 replies; 14+ messages in thread
From: Jiayuan Chen @ 2026-06-08 12:58 UTC (permalink / raw)
To: bpf
Cc: Jiayuan Chen, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Simon Horman, Willem de Bruijn, Andrii Nakryiko,
Eduard Zingerman, Alexei Starovoitov, Daniel Borkmann,
Martin KaFai Lau, Kumar Kartikeya Dwivedi, Song Liu,
Yonghong Song, Jiri Olsa, Shuah Khan, Joe Stringer,
Kuniyuki Iwashima, netdev, linux-kernel, linux-kselftest
bpf_sk_assign_tcp_reqsk() does not validate the L4 protocol of the skb,
only checking skb->protocol (L3). A BPF program that calls this kfunc on
a non-TCP skb (e.g. UDP) will succeed, attaching a TCP reqsk to the skb.
When the skb enters the UDP receive path, skb_steal_sock() returns the
TCP listener socket from the reqsk. The UDP code then casts this TCP
socket to udp_sock and accesses UDP-specific fields at invalid offsets,
causing a null pointer dereference:
BUG: KASAN: null-ptr-deref in __udp_enqueue_schedule_skb+0x19d/0x1df0
Read of size 4 at addr 0000000000000008 by task test_progs/537
CPU: 1 UID: 0 PID: 537 Comm: test_progs Not tainted 7.0.0-rc4+ #46 PREEMPT
Call Trace:
<IRQ>
dump_stack_lvl (lib/dump_stack.c:123)
print_report (mm/kasan/report.c:487)
kasan_report (mm/kasan/report.c:597)
__kasan_check_read (mm/kasan/shadow.c:32)
__udp_enqueue_schedule_skb (net/ipv4/udp.c:1719)
udp_queue_rcv_one_skb (net/ipv4/udp.c:2370 net/ipv4/udp.c:2500)
udp_queue_rcv_skb (net/ipv4/udp.c:2532)
udp_unicast_rcv_skb (net/ipv4/udp.c:2684)
__udp4_lib_rcv (net/ipv4/udp.c:2742)
udp_rcv (net/ipv4/udp.c:2937)
ip_protocol_deliver_rcu (net/ipv4/ip_input.c:209)
ip_local_deliver_finish (./include/linux/rcupdate.h:879 net/ipv4/ip_input.c:242)
ip_local_deliver (net/ipv4/ip_input.c:265)
__netif_receive_skb_one_core (net/core/dev.c:6164 (discriminator 4))
__netif_receive_skb (net/core/dev.c:6280)
Solution
Validating the protocol in the helper is not enough: a BPF program can
bypass an ip_hdr(skb)->protocol check via TOCTOU by rewriting the header
around the call, and bpf_sk_assign() has the same problem since it can
assign any socket type to any skb. So validate the protocol where the
assigned socket is consumed instead.
Patch 1: Validate the L4 protocol in skb_steal_sock(). Each caller passes
the protocol it handles (TCP or UDP), and a prefetched socket whose
protocol does not match is rejected, regardless of how it was assigned.
Patch 2: Add a selftest that calls bpf_sk_assign_tcp_reqsk() on a UDP skb
and verifies the stack no longer crashes.
---
v1: https://lore.kernel.org/bpf/20260323105510.51990-1-jiayuan.chen@linux.dev/
v2: https://lore.kernel.org/bpf/20260326062657.88446-1-jiayuan.chen@linux.dev/
v3: https://lore.kernel.org/bpf/20260327133915.286037-1-jiayuan.chen@linux.dev/
v4: https://lore.kernel.org/bpf/20260330080746.319680-1-jiayuan.chen@linux.dev/
v5: https://lore.kernel.org/bpf/20260401110511.73355-1-jiayuan.chen@linux.dev/
v6: https://lore.kernel.org/all/20260403015851.148209-1-jiayuan.chen@linux.dev/
Changes in v6 & v7:
- resend and keep selftest.
Changes in v5:
- use skb_header_pointer instead of pskb_may_pull.
Changes in v5:
- Add pskb_may_pull before accessing IP/IPv6 headers in kfunc
- Use buf[] instead of buf[32], verify recv data with ASSERT_STREQ
- Remove unnecessary variable initializations in selftest and BPF
Changes in v4:
- Check if assign_ret is EINVAL instead of checking if it is 0
Changes in v3:
- Add IPv6 test coverage, reuse test_cases[] to iterate over both
address families
- Share TCP/UDP port to simplify BPF program, remove unnecessary
global variables
- Use connect_to_fd() + send()/recv() instead of manual sockaddr
construction
- Suggested by Kuniyuki Iwashima
Changes in v2:
- Add Reviewed-by tag from Kuniyuki Iwashima for patch 1
- Use UDP socket recv() instead of kern_sync_rcu() for synchronization
in selftest
Jiayuan Chen (2):
net: Validate protocol in skb_steal_sock() for BPF-assigned sockets
selftests/bpf: Add protocol check test for bpf_sk_assign_tcp_reqsk()
include/net/inet6_hashtables.h | 7 +-
include/net/inet_hashtables.h | 7 +-
include/net/request_sock.h | 16 ++-
net/ipv4/udp.c | 2 +-
net/ipv6/udp.c | 2 +-
.../bpf/prog_tests/tcp_custom_syncookie.c | 87 ++++++++++++++-
.../bpf/progs/test_tcp_custom_syncookie.c | 102 ++++++++++++++++++
7 files changed, 210 insertions(+), 13 deletions(-)
--
2.43.0
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH bpf v8 1/2] net: Validate protocol in skb_steal_sock() for BPF-assigned sockets
2026-06-08 12:58 [PATCH bpf v8 0/2] bpf: tcp: Fix null-ptr-deref in arbitrary SYN Cookie Jiayuan Chen
@ 2026-06-08 12:58 ` Jiayuan Chen
2026-06-08 13:31 ` sashiko-bot
` (2 more replies)
2026-06-08 12:58 ` [PATCH bpf v8 2/2] selftests/bpf: Add protocol check test for bpf_sk_assign_tcp_reqsk() Jiayuan Chen
1 sibling, 3 replies; 14+ messages in thread
From: Jiayuan Chen @ 2026-06-08 12:58 UTC (permalink / raw)
To: bpf
Cc: Jiayuan Chen, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Simon Horman, Willem de Bruijn, Alexei Starovoitov,
Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
Eduard Zingerman, Kumar Kartikeya Dwivedi, Song Liu,
Yonghong Song, Jiri Olsa, Shuah Khan, Joe Stringer,
Kuniyuki Iwashima, netdev, linux-kernel, linux-kselftest
bpf_sk_assign_tcp_reqsk() can assign a TCP reqsk to a non-TCP skb,
causing a panic when the skb enters the wrong L4 receive path [1].
An initial attempt tried to fix this in the BPF helper by checking
iph->protocol, but Sashiko [2] revealed that BPF programs can bypass
this check via a TOCTOU attack by modifying iph->protocol around the
call:
iph->protocol = IPPROTO_TCP;
bpf_sk_assign_tcp_reqsk(udp_skb, tcp_sk);
iph->protocol = IPPROTO_UDP;
Furthermore, bpf_sk_assign() has had the same class of vulnerability
since its introduction — it can assign any socket type to any skb
without protocol validation. Since the BPF helper check alone cannot
prevent a malicious BPF program from crashing the kernel, add protocol
validation in skb_steal_sock() to reject mismatched sockets regardless
of how they were assigned.
The check is applied to all prefetched sockets. Early demux paths
already only assign matching protocols (e.g., UDP early demux only
assigns UDP sockets to UDP skbs), so they pass the check naturally and
the extra branch is negligible.
Pass the expected protocol from callers rather than extracting it from
the IP header. For IPv6, walking extension headers to find the L4
protocol is complex and unnecessary since each caller already knows
the protocol it handles.
[1] https://lore.kernel.org/bpf/20260403015851.148209-1-jiayuan.chen@linux.dev/
[2] https://sashiko.dev/#/patchset/20260403015851.148209-1-jiayuan.chen%40linux.dev
Fixes: cf7fbe660f2d ("bpf: Add socket assign support")
Fixes: e472f88891ab ("bpf: tcp: Support arbitrary SYN Cookie.")
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
---
include/net/inet6_hashtables.h | 7 ++++---
include/net/inet_hashtables.h | 7 ++++---
include/net/request_sock.h | 16 +++++++++++++++-
net/ipv4/udp.c | 2 +-
net/ipv6/udp.c | 2 +-
5 files changed, 25 insertions(+), 9 deletions(-)
diff --git a/include/net/inet6_hashtables.h b/include/net/inet6_hashtables.h
index 2cc5d416bbb5a..218498373a9c1 100644
--- a/include/net/inet6_hashtables.h
+++ b/include/net/inet6_hashtables.h
@@ -106,12 +106,13 @@ static inline
struct sock *inet6_steal_sock(struct net *net, struct sk_buff *skb, int doff,
const struct in6_addr *saddr, const __be16 sport,
const struct in6_addr *daddr, const __be16 dport,
- bool *refcounted, inet6_ehashfn_t *ehashfn)
+ bool *refcounted, inet6_ehashfn_t *ehashfn,
+ int protocol)
{
struct sock *sk, *reuse_sk;
bool prefetched;
- sk = skb_steal_sock(skb, refcounted, &prefetched);
+ sk = skb_steal_sock(skb, refcounted, &prefetched, protocol);
if (!sk)
return NULL;
@@ -153,7 +154,7 @@ static inline struct sock *__inet6_lookup_skb(struct sk_buff *skb, int doff,
struct sock *sk;
sk = inet6_steal_sock(net, skb, doff, &ip6h->saddr, sport, &ip6h->daddr, dport,
- refcounted, inet6_ehashfn);
+ refcounted, inet6_ehashfn, IPPROTO_TCP);
if (IS_ERR(sk))
return NULL;
if (sk)
diff --git a/include/net/inet_hashtables.h b/include/net/inet_hashtables.h
index 6e2fe186d0dcb..a2a044f93cc4b 100644
--- a/include/net/inet_hashtables.h
+++ b/include/net/inet_hashtables.h
@@ -446,12 +446,13 @@ static inline
struct sock *inet_steal_sock(struct net *net, struct sk_buff *skb, int doff,
const __be32 saddr, const __be16 sport,
const __be32 daddr, const __be16 dport,
- bool *refcounted, inet_ehashfn_t *ehashfn)
+ bool *refcounted, inet_ehashfn_t *ehashfn,
+ int protocol)
{
struct sock *sk, *reuse_sk;
bool prefetched;
- sk = skb_steal_sock(skb, refcounted, &prefetched);
+ sk = skb_steal_sock(skb, refcounted, &prefetched, protocol);
if (!sk)
return NULL;
@@ -494,7 +495,7 @@ static inline struct sock *__inet_lookup_skb(struct sk_buff *skb,
struct sock *sk;
sk = inet_steal_sock(net, skb, doff, iph->saddr, sport, iph->daddr, dport,
- refcounted, inet_ehashfn);
+ refcounted, inet_ehashfn, IPPROTO_TCP);
if (IS_ERR(sk))
return NULL;
if (sk)
diff --git a/include/net/request_sock.h b/include/net/request_sock.h
index 5a9c826a7092d..c2b8c6350b624 100644
--- a/include/net/request_sock.h
+++ b/include/net/request_sock.h
@@ -89,9 +89,11 @@ static inline struct sock *req_to_sk(struct request_sock *req)
* @skb: sk_buff to steal the socket from
* @refcounted: is set to true if the socket is reference-counted
* @prefetched: is set to true if the socket was assigned from bpf
+ * @protocol: expected L4 protocol
*/
static inline struct sock *skb_steal_sock(struct sk_buff *skb,
- bool *refcounted, bool *prefetched)
+ bool *refcounted, bool *prefetched,
+ int protocol)
{
struct sock *sk = skb->sk;
@@ -103,6 +105,18 @@ static inline struct sock *skb_steal_sock(struct sk_buff *skb,
*prefetched = skb_sk_is_prefetched(skb);
if (*prefetched) {
+ /* A non-full socket here is either a reqsk or a
+ * timewait sock, both only contain sock_common and
+ * lack sk_protocol. Since both can only be TCP,
+ * use IPPROTO_TCP as the protocol.
+ */
+ if (unlikely(((sk_fullsock(sk) ? sk->sk_protocol : IPPROTO_TCP) != protocol))) {
+ skb_orphan(skb);
+ *prefetched = false;
+ *refcounted = false;
+ return NULL;
+ }
+
#if IS_ENABLED(CONFIG_SYN_COOKIES)
if (sk->sk_state == TCP_NEW_SYN_RECV && inet_reqsk(sk)->syncookie) {
struct request_sock *req = inet_reqsk(sk);
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 0ac2bf4f8759b..ceb4d29a64aca 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -2618,7 +2618,7 @@ int udp_rcv(struct sk_buff *skb)
goto csum_error;
sk = inet_steal_sock(net, skb, sizeof(struct udphdr), saddr, uh->source, daddr, uh->dest,
- &refcounted, udp_ehashfn);
+ &refcounted, udp_ehashfn, IPPROTO_UDP);
if (IS_ERR(sk))
goto no_sk;
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 15e032194eccc..d9c12cce5acef 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -1106,7 +1106,7 @@ INDIRECT_CALLABLE_SCOPE int udpv6_rcv(struct sk_buff *skb)
/* Check if the socket is already available, e.g. due to early demux */
sk = inet6_steal_sock(net, skb, sizeof(struct udphdr), saddr, uh->source, daddr, uh->dest,
- &refcounted, udp6_ehashfn);
+ &refcounted, udp6_ehashfn, IPPROTO_UDP);
if (IS_ERR(sk))
goto no_sk;
--
2.43.0
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH bpf v8 2/2] selftests/bpf: Add protocol check test for bpf_sk_assign_tcp_reqsk()
2026-06-08 12:58 [PATCH bpf v8 0/2] bpf: tcp: Fix null-ptr-deref in arbitrary SYN Cookie Jiayuan Chen
2026-06-08 12:58 ` [PATCH bpf v8 1/2] net: Validate protocol in skb_steal_sock() for BPF-assigned sockets Jiayuan Chen
@ 2026-06-08 12:58 ` Jiayuan Chen
2026-06-08 13:07 ` sashiko-bot
2026-06-08 13:31 ` bot+bpf-ci
1 sibling, 2 replies; 14+ messages in thread
From: Jiayuan Chen @ 2026-06-08 12:58 UTC (permalink / raw)
To: bpf
Cc: Jiayuan Chen, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Simon Horman, Willem de Bruijn, Andrii Nakryiko,
Eduard Zingerman, Alexei Starovoitov, Daniel Borkmann,
Martin KaFai Lau, Kumar Kartikeya Dwivedi, Song Liu,
Yonghong Song, Jiri Olsa, Shuah Khan, Joe Stringer,
Kuniyuki Iwashima, netdev, linux-kernel, linux-kselftest
Add test_tcp_custom_syncookie_protocol_check to test assigning a TCP sk to
UDP skb->sk. The test sends a UDP packet through TC ingress, where a BPF
program calls bpf_sk_assign_tcp_reqsk() on it. A UDP server's recv() is
used for synchronization to ensure the BPF program has finished processing
before checking the result.
Without the fix in bpf_sk_assign_tcp_reqsk(), this will cause a null
pointer dereference panic when the kernel processes it through the UDP
receive path.
Test result:
./test_progs -a tcp_custom_syncookie_protocol_check -v
setup_netns:PASS:create netns 0 nsec
setup_netns:PASS:ip 0 nsec
write_sysctl:PASS:open sysctl 0 nsec
write_sysctl:PASS:write sysctl 0 nsec
setup_netns:PASS:write_sysctl 0 nsec
test_tcp_custom_syncookie_protocol_check:PASS:open_and_load 0 nsec
setup_tc:PASS:qdisc add dev lo clsact 0 nsec
setup_tc:PASS:filter add dev lo ingress 0 nsec
run_protocol_check:PASS:start tcp_server 0 nsec
run_protocol_check:PASS:start udp_server 0 nsec
run_protocol_check:PASS:connect udp_client 0 nsec
run_protocol_check:PASS:send udp 0 nsec
run_protocol_check:PASS:recv udp 0 nsec
run_protocol_check:PASS:recv data 0 nsec
run_protocol_check:PASS:udp_intercepted 0 nsec
#492/1 tcp_custom_syncookie_protocol_check/IPv4 TCP:OK
run_protocol_check:PASS:start tcp_server 0 nsec
run_protocol_check:PASS:start udp_server 0 nsec
run_protocol_check:PASS:connect udp_client 0 nsec
run_protocol_check:PASS:send udp 0 nsec
run_protocol_check:PASS:recv udp 0 nsec
run_protocol_check:PASS:recv data 0 nsec
run_protocol_check:PASS:udp_intercepted 0 nsec
#492/2 tcp_custom_syncookie_protocol_check/IPv6 TCP:OK
#492 tcp_custom_syncookie_protocol_check:OK
Summary: 1/2 PASSED, 0 SKIPPED, 0 FAILED
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
---
.../bpf/prog_tests/tcp_custom_syncookie.c | 87 ++++++++++++++-
.../bpf/progs/test_tcp_custom_syncookie.c | 102 ++++++++++++++++++
2 files changed, 185 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/tcp_custom_syncookie.c b/tools/testing/selftests/bpf/prog_tests/tcp_custom_syncookie.c
index 00d5c32674fc9..03103b7b9c5bf 100644
--- a/tools/testing/selftests/bpf/prog_tests/tcp_custom_syncookie.c
+++ b/tools/testing/selftests/bpf/prog_tests/tcp_custom_syncookie.c
@@ -5,6 +5,7 @@
#include <sched.h>
#include <stdlib.h>
#include <net/if.h>
+#include <netinet/in.h>
#include "test_progs.h"
#include "cgroup_helpers.h"
@@ -47,11 +48,10 @@ static int setup_netns(void)
return -1;
}
-static int setup_tc(struct test_tcp_custom_syncookie *skel)
+static int setup_tc(int prog_fd)
{
LIBBPF_OPTS(bpf_tc_hook, qdisc_lo, .attach_point = BPF_TC_INGRESS);
- LIBBPF_OPTS(bpf_tc_opts, tc_attach,
- .prog_fd = bpf_program__fd(skel->progs.tcp_custom_syncookie));
+ LIBBPF_OPTS(bpf_tc_opts, tc_attach, .prog_fd = prog_fd);
qdisc_lo.ifindex = if_nametoindex("lo");
if (!ASSERT_OK(bpf_tc_hook_create(&qdisc_lo), "qdisc add dev lo clsact"))
@@ -147,7 +147,7 @@ void test_tcp_custom_syncookie(void)
if (!ASSERT_OK_PTR(skel, "open_and_load"))
return;
- if (setup_tc(skel))
+ if (setup_tc(bpf_program__fd(skel->progs.tcp_custom_syncookie)))
goto destroy_skel;
for (i = 0; i < ARRAY_SIZE(test_cases); i++) {
@@ -165,6 +165,85 @@ void test_tcp_custom_syncookie(void)
destroy_skel:
system("tc qdisc del dev lo clsact");
+ test_tcp_custom_syncookie__destroy(skel);
+}
+
+/* TCP and UDP servers share the same port. The BPF program intercepts
+ * the UDP packet, looks up the TCP listener via the dest port, and
+ * attempts to assign a TCP reqsk to the UDP skb.
+ * Although bpf_sk_assign_tcp_reqsk() assign udp skb to tcp reqsk, the
+ * network stack should not crash.
+ */
+static void run_protocol_check(struct test_tcp_custom_syncookie *skel,
+ int family, const char *addr)
+{
+ int tcp_server, udp_server, udp_client;
+ char buf[] = "test";
+ int port, ret;
+
+ tcp_server = start_server(family, SOCK_STREAM, addr, 0, 0);
+ if (!ASSERT_NEQ(tcp_server, -1, "start tcp_server"))
+ return;
+
+ port = ntohs(get_socket_local_port(tcp_server));
+
+ /* UDP server on same port for synchronization and port sharing */
+ udp_server = start_server(family, SOCK_DGRAM, addr, port, 0);
+ if (!ASSERT_NEQ(udp_server, -1, "start udp_server"))
+ goto close_tcp;
+
+ skel->bss->udp_intercepted = false;
+
+ udp_client = connect_to_fd(udp_server, 0);
+ if (!ASSERT_NEQ(udp_client, -1, "connect udp_client"))
+ goto close_udp_server;
+
+ ret = send(udp_client, buf, sizeof(buf), 0);
+ if (!ASSERT_EQ(ret, sizeof(buf), "send udp"))
+ goto close_udp_client;
+
+ memset(buf, 0, sizeof(buf));
+
+ /* recv() ensures TC ingress BPF has processed the skb */
+ ret = recv(udp_server, buf, sizeof(buf), 0);
+ if (!ASSERT_EQ(ret, sizeof(buf), "recv udp"))
+ goto close_udp_client;
+
+ ASSERT_STREQ(buf, "test", "recv data");
+ ASSERT_EQ(skel->bss->udp_intercepted, true, "udp_intercepted");
+
+close_udp_client:
+ close(udp_client);
+close_udp_server:
+ close(udp_server);
+close_tcp:
+ close(tcp_server);
+}
+
+void test_tcp_custom_syncookie_protocol_check(void)
+{
+ struct test_tcp_custom_syncookie *skel;
+ int i;
+
+ if (setup_netns())
+ return;
+ skel = test_tcp_custom_syncookie__open_and_load();
+ if (!ASSERT_OK_PTR(skel, "open_and_load"))
+ return;
+
+ if (setup_tc(bpf_program__fd(skel->progs.tcp_custom_syncookie_badproto)))
+ goto destroy_skel;
+
+ for (i = 0; i < ARRAY_SIZE(test_cases); i++) {
+ if (!test__start_subtest(test_cases[i].name))
+ continue;
+
+ run_protocol_check(skel, test_cases[i].family,
+ test_cases[i].addr);
+ }
+
+destroy_skel:
+ system("tc qdisc del dev lo clsact");
test_tcp_custom_syncookie__destroy(skel);
}
diff --git a/tools/testing/selftests/bpf/progs/test_tcp_custom_syncookie.c b/tools/testing/selftests/bpf/progs/test_tcp_custom_syncookie.c
index 7d5293de19523..5cf99d2a12aa5 100644
--- a/tools/testing/selftests/bpf/progs/test_tcp_custom_syncookie.c
+++ b/tools/testing/selftests/bpf/progs/test_tcp_custom_syncookie.c
@@ -588,4 +588,106 @@ int tcp_custom_syncookie(struct __sk_buff *skb)
return tcp_handle_ack(&ctx);
}
+/* Test: call bpf_sk_assign_tcp_reqsk() on a UDP skb. */
+bool udp_intercepted;
+
+static int badproto_lookup_assign(struct __sk_buff *skb,
+ struct bpf_sock_tuple *tuple, u32 tuple_size)
+{
+ struct bpf_tcp_req_attrs attrs = {};
+ struct bpf_sock *skc;
+ struct sock *sk;
+
+ skc = bpf_skc_lookup_tcp(skb, tuple, tuple_size, -1, 0);
+ if (!skc)
+ return TC_ACT_OK;
+
+ if (skc->state != TCP_LISTEN) {
+ bpf_sk_release(skc);
+ return TC_ACT_OK;
+ }
+
+ sk = (struct sock *)bpf_skc_to_tcp_sock(skc);
+ if (!sk) {
+ bpf_sk_release(skc);
+ return TC_ACT_OK;
+ }
+
+ attrs.mss = 1460;
+ attrs.wscale_ok = 1;
+ attrs.snd_wscale = 7;
+ attrs.rcv_wscale = 7;
+ attrs.sack_ok = 1;
+
+ bpf_sk_assign_tcp_reqsk(skb, sk, &attrs, sizeof(attrs));
+
+ bpf_sk_release(skc);
+ return TC_ACT_OK;
+}
+
+SEC("tc")
+int tcp_custom_syncookie_badproto(struct __sk_buff *skb)
+{
+ void *data = (void *)(long)skb->data;
+ void *data_end = (void *)(long)skb->data_end;
+ struct bpf_sock_tuple tuple = {};
+ struct ethhdr *eth;
+ struct iphdr *iph;
+ struct ipv6hdr *ip6h;
+ struct udphdr *udp;
+
+ eth = (struct ethhdr *)data;
+ if (eth + 1 > data_end)
+ return TC_ACT_OK;
+
+ switch (bpf_ntohs(eth->h_proto)) {
+ case ETH_P_IP:
+ iph = (struct iphdr *)(eth + 1);
+ if (iph + 1 > data_end)
+ return TC_ACT_OK;
+
+ if (iph->protocol != IPPROTO_UDP)
+ return TC_ACT_OK;
+
+ udp = (struct udphdr *)(iph + 1);
+ if (udp + 1 > data_end)
+ return TC_ACT_OK;
+
+ udp_intercepted = true;
+
+ tuple.ipv4.saddr = iph->saddr;
+ tuple.ipv4.daddr = iph->daddr;
+ tuple.ipv4.sport = udp->source;
+ tuple.ipv4.dport = udp->dest;
+
+ return badproto_lookup_assign(skb, &tuple,
+ sizeof(tuple.ipv4));
+ case ETH_P_IPV6:
+ ip6h = (struct ipv6hdr *)(eth + 1);
+ if (ip6h + 1 > data_end)
+ return TC_ACT_OK;
+
+ if (ip6h->nexthdr != IPPROTO_UDP)
+ return TC_ACT_OK;
+
+ udp = (struct udphdr *)(ip6h + 1);
+ if (udp + 1 > data_end)
+ return TC_ACT_OK;
+
+ udp_intercepted = true;
+
+ __builtin_memcpy(tuple.ipv6.saddr, &ip6h->saddr,
+ sizeof(tuple.ipv6.saddr));
+ __builtin_memcpy(tuple.ipv6.daddr, &ip6h->daddr,
+ sizeof(tuple.ipv6.daddr));
+ tuple.ipv6.sport = udp->source;
+ tuple.ipv6.dport = udp->dest;
+
+ return badproto_lookup_assign(skb, &tuple,
+ sizeof(tuple.ipv6));
+ default:
+ return TC_ACT_OK;
+ }
+}
+
char _license[] SEC("license") = "GPL";
--
2.43.0
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH bpf v8 2/2] selftests/bpf: Add protocol check test for bpf_sk_assign_tcp_reqsk()
2026-06-08 12:58 ` [PATCH bpf v8 2/2] selftests/bpf: Add protocol check test for bpf_sk_assign_tcp_reqsk() Jiayuan Chen
@ 2026-06-08 13:07 ` sashiko-bot
2026-06-08 13:31 ` bot+bpf-ci
1 sibling, 0 replies; 14+ messages in thread
From: sashiko-bot @ 2026-06-08 13:07 UTC (permalink / raw)
To: Jiayuan Chen; +Cc: bpf
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Low] Multi-line comment opening `/*` is not on its own line, violating BPF subsystem style.
--
commit f9aa00c40e1871ad0a9a7344e06b70dd2c1b3f81
Author: Jiayuan Chen <jiayuan.chen@linux.dev>
selftests/bpf: Add protocol check test for bpf_sk_assign_tcp_reqsk()
This commit adds a new BPF selftest test_tcp_custom_syncookie_protocol_check
to verify that bpf_sk_assign_tcp_reqsk() correctly handles assigning a TCP
socket to a UDP skb. It ensures the network stack doesn't crash when passing
the skb through the UDP receive path.
> diff --git a/tools/testing/selftests/bpf/prog_tests/tcp_custom_syncookie.c b/tools/testing/selftests/bpf/prog_tests/tcp_custom_syncookie.c
> --- a/tools/testing/selftests/bpf/prog_tests/tcp_custom_syncookie.c
> +++ b/tools/testing/selftests/bpf/prog_tests/tcp_custom_syncookie.c
[ ... ]
> +/* TCP and UDP servers share the same port. The BPF program intercepts
> + * the UDP packet, looks up the TCP listener via the dest port, and
> + * attempts to assign a TCP reqsk to the UDP skb.
> + * Although bpf_sk_assign_tcp_reqsk() assign udp skb to tcp reqsk, the
> + * network stack should not crash.
> + */
> +static void run_protocol_check(struct test_tcp_custom_syncookie *skel,
> + int family, const char *addr)
[Severity: Low]
This isn't a bug, but does this multi-line comment format match the
preferred BPF subsystem style? The guidelines require the opening /* to be
on its own line.
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260608125846.157004-1-jiayuan.chen@linux.dev?part=2
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH bpf v8 1/2] net: Validate protocol in skb_steal_sock() for BPF-assigned sockets
2026-06-08 12:58 ` [PATCH bpf v8 1/2] net: Validate protocol in skb_steal_sock() for BPF-assigned sockets Jiayuan Chen
@ 2026-06-08 13:31 ` sashiko-bot
2026-06-08 13:31 ` bot+bpf-ci
2026-06-08 17:21 ` Kuniyuki Iwashima
2 siblings, 0 replies; 14+ messages in thread
From: sashiko-bot @ 2026-06-08 13:31 UTC (permalink / raw)
To: Jiayuan Chen; +Cc: bpf
Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
Pre-existing issues:
- [Critical] The patch adds a protocol validation check to skb_steal_sock() to prevent mismatched sockets from causing type confusion. However, it restricts this check exclusively to 'prefetched' sockets. Sockets assigned via TCP early demux (tcp_v4_early_demux()) evaluate prefetched to false. An attacker can use a Netfilter rule at the NF_INET_LOCAL_IN hook to rewrite the IP protocol of an incoming TCP packet to IPPROTO_UDP after early demux has attached a struct tcp_sock. The packet will then be routed to udp_rcv(), which calls skb_steal_sock() and bypasses the check.
- [High] The protocol validation in skb_steal_sock() prevents mismatches like UDP sockets on TCP packets, but fails to prevent type confusion with RAW sockets. A RAW socket can be created with an arbitrary protocol (e.g., socket(AF_INET, SOCK_RAW, IPPROTO_TCP)). Because the patch only verifies sk->sk_protocol against the expected protocol without checking sk->sk_type, a malicious BPF program can assign a RAW socket to a TCP packet. The receive path will then dangerously cast the raw_sock to a tcp_sock, causing out-of-bounds access and kernel panics.
--
commit 5523558843e68f090fac91db8bdd58ed67db17e4
Author: Jiayuan Chen <jiayuan.chen@linux.dev>
net: Validate protocol in skb_steal_sock() for BPF-assigned sockets
This commit adds protocol validation in skb_steal_sock() to reject mismatched
sockets, preventing malicious BPF programs from causing kernel panics by
assigning incorrect socket types to packets. It updates various callers to
pass their expected protocol during the socket steal operation.
Link: https://lore.kernel.org/bpf/20260403015851.148209-1-jiayuan.chen@linux.dev/
Link: https://sashiko.dev/#/patchset/20260403015851.148209-1-jiayuan.chen%40linux.dev
> diff --git a/include/net/request_sock.h b/include/net/request_sock.h
> --- a/include/net/request_sock.h
> +++ b/include/net/request_sock.h
[ ... ]
> @@ -103,6 +105,18 @@ static inline struct sock *skb_steal_sock(struct sk_buff *skb,
>
> *prefetched = skb_sk_is_prefetched(skb);
> if (*prefetched) {
[Severity: Critical]
This is a pre-existing issue, but does wrapping the validation inside the
prefetched check leave early demux sockets vulnerable?
Sockets assigned via TCP early demux evaluate prefetched to false because
their destructor is sock_edemux. An attacker could use a Netfilter rule at
the NF_INET_LOCAL_IN hook to rewrite the IP protocol of an incoming TCP
packet to IPPROTO_UDP after early demux has attached a tcp_sock.
The packet would then be routed to udp_rcv(), which calls skb_steal_sock()
and bypasses the validation entirely since prefetched is false. This could
allow the UDP receive path to operate on a tcp_sock, potentially leading to
memory corruption or arbitrary execution via overlapping fields like
encap_rcv.
Should the protocol validation be moved outside of this check?
> + /* A non-full socket here is either a reqsk or a
> + * timewait sock, both only contain sock_common and
> + * lack sk_protocol. Since both can only be TCP,
> + * use IPPROTO_TCP as the protocol.
> + */
> + if (unlikely(((sk_fullsock(sk) ? sk->sk_protocol : IPPROTO_TCP) != protocol))) {
[Severity: High]
This is a pre-existing issue, but does checking only sk_protocol without
verifying sk_type allow type confusion with RAW sockets?
A RAW socket can be created with an arbitrary protocol, such as
socket(AF_INET, SOCK_RAW, IPPROTO_TCP). If a BPF program assigns this RAW
socket to a TCP packet, the sk_protocol check will pass, but the receive
path will dangerously cast the raw_sock to a tcp_sock.
Since raw_sock and tcp_sock have completely different layouts, could this
lead to out-of-bounds accesses and kernel panics when the TCP stack tries
to use it? Does sk_type also need to be validated here?
> + skb_orphan(skb);
> + *prefetched = false;
> + *refcounted = false;
> + return NULL;
> + }
> +
> #if IS_ENABLED(CONFIG_SYN_COOKIES)
> if (sk->sk_state == TCP_NEW_SYN_RECV && inet_reqsk(sk)->syncookie) {
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260608125846.157004-1-jiayuan.chen@linux.dev?part=1
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH bpf v8 1/2] net: Validate protocol in skb_steal_sock() for BPF-assigned sockets
2026-06-08 12:58 ` [PATCH bpf v8 1/2] net: Validate protocol in skb_steal_sock() for BPF-assigned sockets Jiayuan Chen
2026-06-08 13:31 ` sashiko-bot
@ 2026-06-08 13:31 ` bot+bpf-ci
2026-06-08 17:21 ` Kuniyuki Iwashima
2 siblings, 0 replies; 14+ messages in thread
From: bot+bpf-ci @ 2026-06-08 13:31 UTC (permalink / raw)
To: jiayuan.chen, bpf
Cc: jiayuan.chen, davem, edumazet, kuba, pabeni, horms,
willemdebruijn.kernel, ast, daniel, andrii, martin.lau, eddyz87,
memxor, song, yonghong.song, jolsa, shuah, joe, kuniyu, netdev,
linux-kernel, linux-kselftest, ast, andrii, daniel, martin.lau,
eddyz87, yonghong.song, clm, ihor.solodrai
[-- Attachment #1: Type: text/plain, Size: 1536 bytes --]
> diff --git a/include/net/request_sock.h b/include/net/request_sock.h
> index 5a9c826a7092..c2b8c6350b62 100644
> --- a/include/net/request_sock.h
> +++ b/include/net/request_sock.h
[ ... ]
> @@ -103,6 +105,18 @@ static inline struct sock *skb_steal_sock(struct sk_buff *skb,
>
> *prefetched = skb_sk_is_prefetched(skb);
> if (*prefetched) {
> + /* A non-full socket here is either a reqsk or a
> + * timewait sock, both only contain sock_common and
> + * lack sk_protocol. Since both can only be TCP,
> + * use IPPROTO_TCP as the protocol.
> + */
> + if (unlikely(((sk_fullsock(sk) ? sk->sk_protocol : IPPROTO_TCP) != protocol))) {
^^^
This isn't a bug, but there are extra parentheses around the expression
inside unlikely(). Kuniyuki Iwashima requested adding unlikely() in v7
review, but the v8 code has unlikely(((expression))) instead of
unlikely(expression).
Reference: https://lore.kernel.org/bpf/CAAVpQUA6z4bZVReNDMToRVL1V7xmYg4Tq2YMnWyU8AYXWoeB5A@mail.gmail.com/
> + skb_orphan(skb);
> + *prefetched = false;
> + *refcounted = false;
> + return NULL;
> + }
> +
> #if IS_ENABLED(CONFIG_SYN_COOKIES)
> if (sk->sk_state == TCP_NEW_SYN_RECV && inet_reqsk(sk)->syncookie) {
> struct request_sock *req = inet_reqsk(sk);
[ ... ]
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/27139886597
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH bpf v8 2/2] selftests/bpf: Add protocol check test for bpf_sk_assign_tcp_reqsk()
2026-06-08 12:58 ` [PATCH bpf v8 2/2] selftests/bpf: Add protocol check test for bpf_sk_assign_tcp_reqsk() Jiayuan Chen
2026-06-08 13:07 ` sashiko-bot
@ 2026-06-08 13:31 ` bot+bpf-ci
1 sibling, 0 replies; 14+ messages in thread
From: bot+bpf-ci @ 2026-06-08 13:31 UTC (permalink / raw)
To: jiayuan.chen, bpf
Cc: jiayuan.chen, davem, edumazet, kuba, pabeni, horms,
willemdebruijn.kernel, andrii, eddyz87, ast, daniel, martin.lau,
memxor, song, yonghong.song, jolsa, shuah, joe, kuniyu, netdev,
linux-kernel, linux-kselftest, ast, andrii, daniel, martin.lau,
eddyz87, yonghong.song, clm, ihor.solodrai
[-- Attachment #1: Type: text/plain, Size: 1529 bytes --]
> diff --git a/tools/testing/selftests/bpf/prog_tests/tcp_custom_syncookie.c b/tools/testing/selftests/bpf/prog_tests/tcp_custom_syncookie.c
> index eaf441dc7e79..b644acd3e903 100644
> --- a/tools/testing/selftests/bpf/prog_tests/tcp_custom_syncookie.c
> +++ b/tools/testing/selftests/bpf/prog_tests/tcp_custom_syncookie.c
[ ... ]
> @@ -145,6 +145,85 @@ void test_tcp_custom_syncookie(void)
>
> destroy_skel:
> system("tc qdisc del dev lo clsact");
> + test_tcp_custom_syncookie__destroy(skel);
> +}
> +
> +/* TCP and UDP servers share the same port. The BPF program intercepts
The BPF subsystem style guidelines ask that the opening /* of a multi-line
comment be on its own line. Should this be:
/*
* TCP and UDP servers share the same port. The BPF program intercepts
* the UDP packet, looks up the TCP listener via the dest port, and
* attempts to assign a TCP reqsk to the UDP skb.
...
*/
The same nit was raised on v8 by the automated review at:
https://lore.kernel.org/bpf/20260608130746.218891F00898@smtp.kernel.org
> + * the UDP packet, looks up the TCP listener via the dest port, and
> + * attempts to assign a TCP reqsk to the UDP skb.
> + * Although bpf_sk_assign_tcp_reqsk() assign udp skb to tcp reqsk, the
> + * network stack should not crash.
> + */
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/27139886597
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH bpf v8 1/2] net: Validate protocol in skb_steal_sock() for BPF-assigned sockets
2026-06-08 12:58 ` [PATCH bpf v8 1/2] net: Validate protocol in skb_steal_sock() for BPF-assigned sockets Jiayuan Chen
2026-06-08 13:31 ` sashiko-bot
2026-06-08 13:31 ` bot+bpf-ci
@ 2026-06-08 17:21 ` Kuniyuki Iwashima
2026-06-08 20:02 ` Alexei Starovoitov
2 siblings, 1 reply; 14+ messages in thread
From: Kuniyuki Iwashima @ 2026-06-08 17:21 UTC (permalink / raw)
To: Jiayuan Chen, Alexei Starovoitov
Cc: bpf, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman, Willem de Bruijn, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
Song Liu, Yonghong Song, Jiri Olsa, Shuah Khan, Joe Stringer,
netdev, linux-kernel, linux-kselftest
On Mon, Jun 8, 2026 at 5:59 AM Jiayuan Chen <jiayuan.chen@linux.dev> wrote:
>
> bpf_sk_assign_tcp_reqsk() can assign a TCP reqsk to a non-TCP skb,
> causing a panic when the skb enters the wrong L4 receive path [1].
> An initial attempt tried to fix this in the BPF helper by checking
> iph->protocol, but Sashiko [2] revealed that BPF programs can bypass
> this check via a TOCTOU attack by modifying iph->protocol around the
> call:
>
> iph->protocol = IPPROTO_TCP;
> bpf_sk_assign_tcp_reqsk(udp_skb, tcp_sk);
> iph->protocol = IPPROTO_UDP;
>
> Furthermore, bpf_sk_assign() has had the same class of vulnerability
> since its introduction — it can assign any socket type to any skb
> without protocol validation. Since the BPF helper check alone cannot
> prevent a malicious BPF program from crashing the kernel, add protocol
I'm curious about the BPF maintainers' stance on this kind of "bug"
where admin tries to shoot oneself in the foot.
I saw Alexei said this recently, and I guess it applies here as well ?
https://lore.kernel.org/bpf/CAADnVQLh7VEKAtckzz=XOVPT8ovpDQshvPPCWHDQu2OWQx27_w@mail.gmail.com/
---8<---
Not every "bug" needs a fix.
If a malicious bpf user wants to crash the kernel they will
find a way to do so. Especially with agents.
We cannot realistically close all of the holes.
Right now the priority is to fix the issues that normal
users can hit and not bots.
---8<---
> validation in skb_steal_sock() to reject mismatched sockets regardless
> of how they were assigned.
>
> The check is applied to all prefetched sockets. Early demux paths
> already only assign matching protocols (e.g., UDP early demux only
> assigns UDP sockets to UDP skbs), so they pass the check naturally and
> the extra branch is negligible.
>
> Pass the expected protocol from callers rather than extracting it from
> the IP header. For IPv6, walking extension headers to find the L4
> protocol is complex and unnecessary since each caller already knows
> the protocol it handles.
>
> [1] https://lore.kernel.org/bpf/20260403015851.148209-1-jiayuan.chen@linux.dev/
> [2] https://sashiko.dev/#/patchset/20260403015851.148209-1-jiayuan.chen%40linux.dev
>
> Fixes: cf7fbe660f2d ("bpf: Add socket assign support")
> Fixes: e472f88891ab ("bpf: tcp: Support arbitrary SYN Cookie.")
> Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
> ---
> include/net/inet6_hashtables.h | 7 ++++---
> include/net/inet_hashtables.h | 7 ++++---
> include/net/request_sock.h | 16 +++++++++++++++-
> net/ipv4/udp.c | 2 +-
> net/ipv6/udp.c | 2 +-
> 5 files changed, 25 insertions(+), 9 deletions(-)
>
> diff --git a/include/net/inet6_hashtables.h b/include/net/inet6_hashtables.h
> index 2cc5d416bbb5a..218498373a9c1 100644
> --- a/include/net/inet6_hashtables.h
> +++ b/include/net/inet6_hashtables.h
> @@ -106,12 +106,13 @@ static inline
> struct sock *inet6_steal_sock(struct net *net, struct sk_buff *skb, int doff,
> const struct in6_addr *saddr, const __be16 sport,
> const struct in6_addr *daddr, const __be16 dport,
> - bool *refcounted, inet6_ehashfn_t *ehashfn)
> + bool *refcounted, inet6_ehashfn_t *ehashfn,
> + int protocol)
> {
> struct sock *sk, *reuse_sk;
> bool prefetched;
>
> - sk = skb_steal_sock(skb, refcounted, &prefetched);
> + sk = skb_steal_sock(skb, refcounted, &prefetched, protocol);
> if (!sk)
> return NULL;
>
> @@ -153,7 +154,7 @@ static inline struct sock *__inet6_lookup_skb(struct sk_buff *skb, int doff,
> struct sock *sk;
>
> sk = inet6_steal_sock(net, skb, doff, &ip6h->saddr, sport, &ip6h->daddr, dport,
> - refcounted, inet6_ehashfn);
> + refcounted, inet6_ehashfn, IPPROTO_TCP);
> if (IS_ERR(sk))
> return NULL;
> if (sk)
> diff --git a/include/net/inet_hashtables.h b/include/net/inet_hashtables.h
> index 6e2fe186d0dcb..a2a044f93cc4b 100644
> --- a/include/net/inet_hashtables.h
> +++ b/include/net/inet_hashtables.h
> @@ -446,12 +446,13 @@ static inline
> struct sock *inet_steal_sock(struct net *net, struct sk_buff *skb, int doff,
> const __be32 saddr, const __be16 sport,
> const __be32 daddr, const __be16 dport,
> - bool *refcounted, inet_ehashfn_t *ehashfn)
> + bool *refcounted, inet_ehashfn_t *ehashfn,
> + int protocol)
> {
> struct sock *sk, *reuse_sk;
> bool prefetched;
>
> - sk = skb_steal_sock(skb, refcounted, &prefetched);
> + sk = skb_steal_sock(skb, refcounted, &prefetched, protocol);
> if (!sk)
> return NULL;
>
> @@ -494,7 +495,7 @@ static inline struct sock *__inet_lookup_skb(struct sk_buff *skb,
> struct sock *sk;
>
> sk = inet_steal_sock(net, skb, doff, iph->saddr, sport, iph->daddr, dport,
> - refcounted, inet_ehashfn);
> + refcounted, inet_ehashfn, IPPROTO_TCP);
> if (IS_ERR(sk))
> return NULL;
> if (sk)
> diff --git a/include/net/request_sock.h b/include/net/request_sock.h
> index 5a9c826a7092d..c2b8c6350b624 100644
> --- a/include/net/request_sock.h
> +++ b/include/net/request_sock.h
> @@ -89,9 +89,11 @@ static inline struct sock *req_to_sk(struct request_sock *req)
> * @skb: sk_buff to steal the socket from
> * @refcounted: is set to true if the socket is reference-counted
> * @prefetched: is set to true if the socket was assigned from bpf
> + * @protocol: expected L4 protocol
> */
> static inline struct sock *skb_steal_sock(struct sk_buff *skb,
> - bool *refcounted, bool *prefetched)
> + bool *refcounted, bool *prefetched,
> + int protocol)
> {
> struct sock *sk = skb->sk;
>
> @@ -103,6 +105,18 @@ static inline struct sock *skb_steal_sock(struct sk_buff *skb,
>
> *prefetched = skb_sk_is_prefetched(skb);
> if (*prefetched) {
> + /* A non-full socket here is either a reqsk or a
> + * timewait sock, both only contain sock_common and
> + * lack sk_protocol. Since both can only be TCP,
> + * use IPPROTO_TCP as the protocol.
> + */
> + if (unlikely(((sk_fullsock(sk) ? sk->sk_protocol : IPPROTO_TCP) != protocol))) {
> + skb_orphan(skb);
> + *prefetched = false;
> + *refcounted = false;
> + return NULL;
> + }
> +
> #if IS_ENABLED(CONFIG_SYN_COOKIES)
> if (sk->sk_state == TCP_NEW_SYN_RECV && inet_reqsk(sk)->syncookie) {
> struct request_sock *req = inet_reqsk(sk);
> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> index 0ac2bf4f8759b..ceb4d29a64aca 100644
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -2618,7 +2618,7 @@ int udp_rcv(struct sk_buff *skb)
> goto csum_error;
>
> sk = inet_steal_sock(net, skb, sizeof(struct udphdr), saddr, uh->source, daddr, uh->dest,
> - &refcounted, udp_ehashfn);
> + &refcounted, udp_ehashfn, IPPROTO_UDP);
> if (IS_ERR(sk))
> goto no_sk;
>
> diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
> index 15e032194eccc..d9c12cce5acef 100644
> --- a/net/ipv6/udp.c
> +++ b/net/ipv6/udp.c
> @@ -1106,7 +1106,7 @@ INDIRECT_CALLABLE_SCOPE int udpv6_rcv(struct sk_buff *skb)
>
> /* Check if the socket is already available, e.g. due to early demux */
> sk = inet6_steal_sock(net, skb, sizeof(struct udphdr), saddr, uh->source, daddr, uh->dest,
> - &refcounted, udp6_ehashfn);
> + &refcounted, udp6_ehashfn, IPPROTO_UDP);
> if (IS_ERR(sk))
> goto no_sk;
>
> --
> 2.43.0
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH bpf v8 1/2] net: Validate protocol in skb_steal_sock() for BPF-assigned sockets
2026-06-08 17:21 ` Kuniyuki Iwashima
@ 2026-06-08 20:02 ` Alexei Starovoitov
2026-06-08 20:55 ` Kuniyuki Iwashima
0 siblings, 1 reply; 14+ messages in thread
From: Alexei Starovoitov @ 2026-06-08 20:02 UTC (permalink / raw)
To: Kuniyuki Iwashima, Jiayuan Chen, Alexei Starovoitov
Cc: bpf, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman, Willem de Bruijn, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
Song Liu, Yonghong Song, Jiri Olsa, Shuah Khan, Joe Stringer,
netdev, linux-kernel, linux-kselftest
On Mon Jun 8, 2026 at 10:21 AM PDT, Kuniyuki Iwashima wrote:
> On Mon, Jun 8, 2026 at 5:59 AM Jiayuan Chen <jiayuan.chen@linux.dev> wrote:
>>
>> bpf_sk_assign_tcp_reqsk() can assign a TCP reqsk to a non-TCP skb,
>> causing a panic when the skb enters the wrong L4 receive path [1].
>> An initial attempt tried to fix this in the BPF helper by checking
>> iph->protocol, but Sashiko [2] revealed that BPF programs can bypass
>> this check via a TOCTOU attack by modifying iph->protocol around the
>> call:
>>
>> iph->protocol = IPPROTO_TCP;
>> bpf_sk_assign_tcp_reqsk(udp_skb, tcp_sk);
>> iph->protocol = IPPROTO_UDP;
>>
>> Furthermore, bpf_sk_assign() has had the same class of vulnerability
>> since its introduction — it can assign any socket type to any skb
>> without protocol validation. Since the BPF helper check alone cannot
>> prevent a malicious BPF program from crashing the kernel, add protocol
>
> I'm curious about the BPF maintainers' stance on this kind of "bug"
> where admin tries to shoot oneself in the foot.
>
> I saw Alexei said this recently, and I guess it applies here as well ?
> https://lore.kernel.org/bpf/CAADnVQLh7VEKAtckzz=XOVPT8ovpDQshvPPCWHDQu2OWQx27_w@mail.gmail.com/
>
> ---8<---
> Not every "bug" needs a fix.
> If a malicious bpf user wants to crash the kernel they will
> find a way to do so. Especially with agents.
> We cannot realistically close all of the holes.
> Right now the priority is to fix the issues that normal
> users can hit and not bots.
> ---8<---
In addition to that I have to add that skb_steal_sock() is performance
critical path of networking stack. Adding runtime overheard there
because bots can find a way to abuse the interfaces is not a good trade off.
If there is no simple way to fix it completely on the bpf side
then we have to flag this issue as "won't fix" and move on.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH bpf v8 1/2] net: Validate protocol in skb_steal_sock() for BPF-assigned sockets
2026-06-08 20:02 ` Alexei Starovoitov
@ 2026-06-08 20:55 ` Kuniyuki Iwashima
2026-06-08 21:25 ` Alexei Starovoitov
0 siblings, 1 reply; 14+ messages in thread
From: Kuniyuki Iwashima @ 2026-06-08 20:55 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Jiayuan Chen, Alexei Starovoitov, bpf, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
Willem de Bruijn, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
Song Liu, Yonghong Song, Jiri Olsa, Shuah Khan, Joe Stringer,
netdev, linux-kernel, linux-kselftest
On Mon, Jun 8, 2026 at 1:02 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Mon Jun 8, 2026 at 10:21 AM PDT, Kuniyuki Iwashima wrote:
> > On Mon, Jun 8, 2026 at 5:59 AM Jiayuan Chen <jiayuan.chen@linux.dev> wrote:
> >>
> >> bpf_sk_assign_tcp_reqsk() can assign a TCP reqsk to a non-TCP skb,
> >> causing a panic when the skb enters the wrong L4 receive path [1].
> >> An initial attempt tried to fix this in the BPF helper by checking
> >> iph->protocol, but Sashiko [2] revealed that BPF programs can bypass
> >> this check via a TOCTOU attack by modifying iph->protocol around the
> >> call:
> >>
> >> iph->protocol = IPPROTO_TCP;
> >> bpf_sk_assign_tcp_reqsk(udp_skb, tcp_sk);
> >> iph->protocol = IPPROTO_UDP;
> >>
> >> Furthermore, bpf_sk_assign() has had the same class of vulnerability
> >> since its introduction — it can assign any socket type to any skb
> >> without protocol validation. Since the BPF helper check alone cannot
> >> prevent a malicious BPF program from crashing the kernel, add protocol
> >
> > I'm curious about the BPF maintainers' stance on this kind of "bug"
> > where admin tries to shoot oneself in the foot.
> >
> > I saw Alexei said this recently, and I guess it applies here as well ?
> > https://lore.kernel.org/bpf/CAADnVQLh7VEKAtckzz=XOVPT8ovpDQshvPPCWHDQu2OWQx27_w@mail.gmail.com/
> >
> > ---8<---
> > Not every "bug" needs a fix.
> > If a malicious bpf user wants to crash the kernel they will
> > find a way to do so. Especially with agents.
> > We cannot realistically close all of the holes.
> > Right now the priority is to fix the issues that normal
> > users can hit and not bots.
> > ---8<---
>
> In addition to that I have to add that skb_steal_sock() is performance
> critical path of networking stack. Adding runtime overheard there
> because bots can find a way to abuse the interfaces is not a good trade off.
> If there is no simple way to fix it completely on the bpf side
> then we have to flag this issue as "won't fix" and move on.
I asked if we could fix this in the verifier but it seems difficult. [0]
Since this is triggered only by BPF misuse, treating it as "won't fix"
makes more sense to me.
[0]: https://lore.kernel.org/all/202648182730.i4ki.martin.lau@linux.dev/
Thanks
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH bpf v8 1/2] net: Validate protocol in skb_steal_sock() for BPF-assigned sockets
2026-06-08 20:55 ` Kuniyuki Iwashima
@ 2026-06-08 21:25 ` Alexei Starovoitov
2026-06-08 21:35 ` Kuniyuki Iwashima
0 siblings, 1 reply; 14+ messages in thread
From: Alexei Starovoitov @ 2026-06-08 21:25 UTC (permalink / raw)
To: Kuniyuki Iwashima
Cc: Jiayuan Chen, Alexei Starovoitov, bpf, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
Willem de Bruijn, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
Song Liu, Yonghong Song, Jiri Olsa, Shuah Khan, Joe Stringer,
Network Development, LKML, open list:KERNEL SELFTEST FRAMEWORK
On Mon, Jun 8, 2026 at 1:55 PM Kuniyuki Iwashima <kuniyu@google.com> wrote:
>
> On Mon, Jun 8, 2026 at 1:02 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Mon Jun 8, 2026 at 10:21 AM PDT, Kuniyuki Iwashima wrote:
> > > On Mon, Jun 8, 2026 at 5:59 AM Jiayuan Chen <jiayuan.chen@linux.dev> wrote:
> > >>
> > >> bpf_sk_assign_tcp_reqsk() can assign a TCP reqsk to a non-TCP skb,
> > >> causing a panic when the skb enters the wrong L4 receive path [1].
> > >> An initial attempt tried to fix this in the BPF helper by checking
> > >> iph->protocol, but Sashiko [2] revealed that BPF programs can bypass
> > >> this check via a TOCTOU attack by modifying iph->protocol around the
> > >> call:
> > >>
> > >> iph->protocol = IPPROTO_TCP;
> > >> bpf_sk_assign_tcp_reqsk(udp_skb, tcp_sk);
> > >> iph->protocol = IPPROTO_UDP;
> > >>
> > >> Furthermore, bpf_sk_assign() has had the same class of vulnerability
> > >> since its introduction — it can assign any socket type to any skb
> > >> without protocol validation. Since the BPF helper check alone cannot
> > >> prevent a malicious BPF program from crashing the kernel, add protocol
> > >
> > > I'm curious about the BPF maintainers' stance on this kind of "bug"
> > > where admin tries to shoot oneself in the foot.
> > >
> > > I saw Alexei said this recently, and I guess it applies here as well ?
> > > https://lore.kernel.org/bpf/CAADnVQLh7VEKAtckzz=XOVPT8ovpDQshvPPCWHDQu2OWQx27_w@mail.gmail.com/
> > >
> > > ---8<---
> > > Not every "bug" needs a fix.
> > > If a malicious bpf user wants to crash the kernel they will
> > > find a way to do so. Especially with agents.
> > > We cannot realistically close all of the holes.
> > > Right now the priority is to fix the issues that normal
> > > users can hit and not bots.
> > > ---8<---
> >
> > In addition to that I have to add that skb_steal_sock() is performance
> > critical path of networking stack. Adding runtime overheard there
> > because bots can find a way to abuse the interfaces is not a good trade off.
> > If there is no simple way to fix it completely on the bpf side
> > then we have to flag this issue as "won't fix" and move on.
>
> I asked if we could fix this in the verifier but it seems difficult. [0]
> Since this is triggered only by BPF misuse, treating it as "won't fix"
> makes more sense to me.
>
> [0]: https://lore.kernel.org/all/202648182730.i4ki.martin.lau@linux.dev/
btw is earlier sashiko TOCTOU concern real?
iph->protocol = IPPROTO_TCP;
bpf_sk_assign_tcp_reqsk(udp_skb, tcp_sk);
iph->protocol = IPPROTO_UDP;
as far as I can tell past bpf_sk_assign_tcp_reqsk()
the networking stack only looks at skb->protocol,
so modifying iph->protocol will only mess up
things on the wire (if this packet goes out).
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH bpf v8 1/2] net: Validate protocol in skb_steal_sock() for BPF-assigned sockets
2026-06-08 21:25 ` Alexei Starovoitov
@ 2026-06-08 21:35 ` Kuniyuki Iwashima
2026-06-08 22:16 ` Alexei Starovoitov
0 siblings, 1 reply; 14+ messages in thread
From: Kuniyuki Iwashima @ 2026-06-08 21:35 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Jiayuan Chen, Alexei Starovoitov, bpf, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
Willem de Bruijn, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
Song Liu, Yonghong Song, Jiri Olsa, Shuah Khan, Joe Stringer,
Network Development, LKML, open list:KERNEL SELFTEST FRAMEWORK
On Mon, Jun 8, 2026 at 2:26 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Mon, Jun 8, 2026 at 1:55 PM Kuniyuki Iwashima <kuniyu@google.com> wrote:
> >
> > On Mon, Jun 8, 2026 at 1:02 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Mon Jun 8, 2026 at 10:21 AM PDT, Kuniyuki Iwashima wrote:
> > > > On Mon, Jun 8, 2026 at 5:59 AM Jiayuan Chen <jiayuan.chen@linux.dev> wrote:
> > > >>
> > > >> bpf_sk_assign_tcp_reqsk() can assign a TCP reqsk to a non-TCP skb,
> > > >> causing a panic when the skb enters the wrong L4 receive path [1].
> > > >> An initial attempt tried to fix this in the BPF helper by checking
> > > >> iph->protocol, but Sashiko [2] revealed that BPF programs can bypass
> > > >> this check via a TOCTOU attack by modifying iph->protocol around the
> > > >> call:
> > > >>
> > > >> iph->protocol = IPPROTO_TCP;
> > > >> bpf_sk_assign_tcp_reqsk(udp_skb, tcp_sk);
> > > >> iph->protocol = IPPROTO_UDP;
> > > >>
> > > >> Furthermore, bpf_sk_assign() has had the same class of vulnerability
> > > >> since its introduction — it can assign any socket type to any skb
> > > >> without protocol validation. Since the BPF helper check alone cannot
> > > >> prevent a malicious BPF program from crashing the kernel, add protocol
> > > >
> > > > I'm curious about the BPF maintainers' stance on this kind of "bug"
> > > > where admin tries to shoot oneself in the foot.
> > > >
> > > > I saw Alexei said this recently, and I guess it applies here as well ?
> > > > https://lore.kernel.org/bpf/CAADnVQLh7VEKAtckzz=XOVPT8ovpDQshvPPCWHDQu2OWQx27_w@mail.gmail.com/
> > > >
> > > > ---8<---
> > > > Not every "bug" needs a fix.
> > > > If a malicious bpf user wants to crash the kernel they will
> > > > find a way to do so. Especially with agents.
> > > > We cannot realistically close all of the holes.
> > > > Right now the priority is to fix the issues that normal
> > > > users can hit and not bots.
> > > > ---8<---
> > >
> > > In addition to that I have to add that skb_steal_sock() is performance
> > > critical path of networking stack. Adding runtime overheard there
> > > because bots can find a way to abuse the interfaces is not a good trade off.
> > > If there is no simple way to fix it completely on the bpf side
> > > then we have to flag this issue as "won't fix" and move on.
> >
> > I asked if we could fix this in the verifier but it seems difficult. [0]
> > Since this is triggered only by BPF misuse, treating it as "won't fix"
> > makes more sense to me.
> >
> > [0]: https://lore.kernel.org/all/202648182730.i4ki.martin.lau@linux.dev/
>
> btw is earlier sashiko TOCTOU concern real?
Yes, the selftest in patch 2 should reproduce null-ptr-deref.
(I tested one in a prior version)
> iph->protocol = IPPROTO_TCP;
> bpf_sk_assign_tcp_reqsk(udp_skb, tcp_sk);
> iph->protocol = IPPROTO_UDP;
>
> as far as I can tell past bpf_sk_assign_tcp_reqsk()
> the networking stack only looks at skb->protocol,
> so modifying iph->protocol will only mess up
> things on the wire (if this packet goes out).
ip_rcv_finish_core() uses iph->protocol for early demux
and ip_local_deliver_finish() also uses it for demux.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH bpf v8 1/2] net: Validate protocol in skb_steal_sock() for BPF-assigned sockets
2026-06-08 21:35 ` Kuniyuki Iwashima
@ 2026-06-08 22:16 ` Alexei Starovoitov
2026-06-08 22:34 ` Kuniyuki Iwashima
0 siblings, 1 reply; 14+ messages in thread
From: Alexei Starovoitov @ 2026-06-08 22:16 UTC (permalink / raw)
To: Kuniyuki Iwashima
Cc: Jiayuan Chen, Alexei Starovoitov, bpf, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
Willem de Bruijn, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
Song Liu, Yonghong Song, Jiri Olsa, Shuah Khan, Joe Stringer,
Network Development, LKML, open list:KERNEL SELFTEST FRAMEWORK
On Mon Jun 8, 2026 at 2:35 PM PDT, Kuniyuki Iwashima wrote:
>>
>> btw is earlier sashiko TOCTOU concern real?
>
> Yes, the selftest in patch 2 should reproduce null-ptr-deref.
> (I tested one in a prior version)
>
>
>> iph->protocol = IPPROTO_TCP;
>> bpf_sk_assign_tcp_reqsk(udp_skb, tcp_sk);
>> iph->protocol = IPPROTO_UDP;
>>
>> as far as I can tell past bpf_sk_assign_tcp_reqsk()
>> the networking stack only looks at skb->protocol,
>> so modifying iph->protocol will only mess up
>> things on the wire (if this packet goes out).
>
> ip_rcv_finish_core() uses iph->protocol for early demux
> and ip_local_deliver_finish() also uses it for demux.
ok. Another idea... we can keep the checks on bpf side and
teach the verifier to invalidate packet pointers at bpf_sk_assign_tcp_reqsk()
and introduce new concept of "readonly skb".
So after invalidation the reloaded sbk->data will be read only.
There is no such thing today. The verifier only understands PTR_TO_PACKET
as read/write. So it's not easy, but probably good thing to have
for this and other cases where bpf prog shouldn't touch the packet
once it called some kfunc/helper.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH bpf v8 1/2] net: Validate protocol in skb_steal_sock() for BPF-assigned sockets
2026-06-08 22:16 ` Alexei Starovoitov
@ 2026-06-08 22:34 ` Kuniyuki Iwashima
0 siblings, 0 replies; 14+ messages in thread
From: Kuniyuki Iwashima @ 2026-06-08 22:34 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Jiayuan Chen, Alexei Starovoitov, bpf, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
Willem de Bruijn, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
Song Liu, Yonghong Song, Jiri Olsa, Shuah Khan, Joe Stringer,
Network Development, LKML, open list:KERNEL SELFTEST FRAMEWORK
On Mon, Jun 8, 2026 at 3:16 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Mon Jun 8, 2026 at 2:35 PM PDT, Kuniyuki Iwashima wrote:
> >>
> >> btw is earlier sashiko TOCTOU concern real?
> >
> > Yes, the selftest in patch 2 should reproduce null-ptr-deref.
> > (I tested one in a prior version)
> >
> >
> >> iph->protocol = IPPROTO_TCP;
> >> bpf_sk_assign_tcp_reqsk(udp_skb, tcp_sk);
> >> iph->protocol = IPPROTO_UDP;
> >>
> >> as far as I can tell past bpf_sk_assign_tcp_reqsk()
> >> the networking stack only looks at skb->protocol,
> >> so modifying iph->protocol will only mess up
> >> things on the wire (if this packet goes out).
> >
> > ip_rcv_finish_core() uses iph->protocol for early demux
> > and ip_local_deliver_finish() also uses it for demux.
>
> ok. Another idea... we can keep the checks on bpf side and
> teach the verifier to invalidate packet pointers at bpf_sk_assign_tcp_reqsk()
> and introduce new concept of "readonly skb".
> So after invalidation the reloaded sbk->data will be read only.
Yes, this is exactly what I had in mind in the other thread :)
---8<---
On top of L4 validation in bpf_sk_assign() and bpf_sk_assign_tcp_reqsk(),
can't we mark such an skb immutable after the helpers and catch
subsequent writes to skb->data on the verifier ?
---8<---
> There is no such thing today. The verifier only understands PTR_TO_PACKET
> as read/write. So it's not easy, but probably good thing to have
> for this and other cases where bpf prog shouldn't touch the packet
> once it called some kfunc/helper.
This would be useful and we could (re)move some validations from
the fast path.
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2026-06-08 22:35 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-08 12:58 [PATCH bpf v8 0/2] bpf: tcp: Fix null-ptr-deref in arbitrary SYN Cookie Jiayuan Chen
2026-06-08 12:58 ` [PATCH bpf v8 1/2] net: Validate protocol in skb_steal_sock() for BPF-assigned sockets Jiayuan Chen
2026-06-08 13:31 ` sashiko-bot
2026-06-08 13:31 ` bot+bpf-ci
2026-06-08 17:21 ` Kuniyuki Iwashima
2026-06-08 20:02 ` Alexei Starovoitov
2026-06-08 20:55 ` Kuniyuki Iwashima
2026-06-08 21:25 ` Alexei Starovoitov
2026-06-08 21:35 ` Kuniyuki Iwashima
2026-06-08 22:16 ` Alexei Starovoitov
2026-06-08 22:34 ` Kuniyuki Iwashima
2026-06-08 12:58 ` [PATCH bpf v8 2/2] selftests/bpf: Add protocol check test for bpf_sk_assign_tcp_reqsk() Jiayuan Chen
2026-06-08 13:07 ` sashiko-bot
2026-06-08 13:31 ` bot+bpf-ci
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox