* [PATCH bpf v6 0/2] bpf: tcp: Fix null-ptr-deref in arbitrary SYN Cookie @ 2026-04-03 1:58 Jiayuan Chen 2026-04-03 1:58 ` [PATCH bpf v6 1/2] bpf: tcp: Reject non-TCP skb in bpf_sk_assign_tcp_reqsk() Jiayuan Chen 2026-04-03 1:58 ` [PATCH bpf v6 2/2] selftests/bpf: Add protocol check test for bpf_sk_assign_tcp_reqsk() Jiayuan Chen 0 siblings, 2 replies; 6+ messages in thread From: Jiayuan Chen @ 2026-04-03 1:58 UTC (permalink / raw) To: bpf Cc: Jiayuan Chen, Martin KaFai Lau, Daniel Borkmann, John Fastabend, Stanislav Fomichev, Alexei Starovoitov, Andrii Nakryiko, Eduard Zingerman, Kumar Kartikeya Dwivedi, Song Liu, Yonghong Song, Jiri Olsa, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman, Shuah Khan, Jiayuan Chen, Kuniyuki Iwashima, netdev, linux-kernel, linux-kselftest From: Jiayuan Chen <jiayuan.chen@shopee.com> bpf_sk_assign_tcp_reqsk() does not validate the L4 protocol of the skb, only checking skb->protocol (L3). A BPF program that calls this kfunc on a non-TCP skb (e.g. UDP) will succeed, attaching a TCP reqsk to the skb. When the skb enters the UDP receive path, skb_steal_sock() returns the TCP listener socket from the reqsk. The UDP code then casts this TCP socket to udp_sock and accesses UDP-specific fields at invalid offsets, causing a null pointer dereference: BUG: KASAN: null-ptr-deref in __udp_enqueue_schedule_skb+0x19d/0x1df0 Read of size 4 at addr 0000000000000008 by task test_progs/537 CPU: 1 UID: 0 PID: 537 Comm: test_progs Not tainted 7.0.0-rc4+ #46 PREEMPT Call Trace: <IRQ> dump_stack_lvl (lib/dump_stack.c:123) print_report (mm/kasan/report.c:487) kasan_report (mm/kasan/report.c:597) __kasan_check_read (mm/kasan/shadow.c:32) __udp_enqueue_schedule_skb (net/ipv4/udp.c:1719) udp_queue_rcv_one_skb (net/ipv4/udp.c:2370 net/ipv4/udp.c:2500) udp_queue_rcv_skb (net/ipv4/udp.c:2532) udp_unicast_rcv_skb (net/ipv4/udp.c:2684) __udp4_lib_rcv (net/ipv4/udp.c:2742) udp_rcv (net/ipv4/udp.c:2937) ip_protocol_deliver_rcu (net/ipv4/ip_input.c:209) ip_local_deliver_finish (./include/linux/rcupdate.h:879 net/ipv4/ip_input.c:242) ip_local_deliver (net/ipv4/ip_input.c:265) __netif_receive_skb_one_core (net/core/dev.c:6164 (discriminator 4)) __netif_receive_skb (net/core/dev.c:6280) Solution Patch 1: Add L4 protocol validation in bpf_sk_assign_tcp_reqsk(). Check ip_hdr(skb)->protocol (IPv4) and ipv6_hdr(skb)->nexthdr (IPv6) against IPPROTO_TCP, returning -EINVAL for non-TCP skbs. Patch 2: Add selftest that calls bpf_sk_assign_tcp_reqsk() on a UDP skb and verifies the kfunc rejects it. --- v1: https://lore.kernel.org/bpf/20260323105510.51990-1-jiayuan.chen@linux.dev/ v2: https://lore.kernel.org/bpf/20260326062657.88446-1-jiayuan.chen@linux.dev/ v3: https://lore.kernel.org/bpf/20260327133915.286037-1-jiayuan.chen@linux.dev/ v4: https://lore.kernel.org/bpf/20260330080746.319680-1-jiayuan.chen@linux.dev/ v5: https://lore.kernel.org/bpf/20260401110511.73355-1-jiayuan.chen@linux.dev/ Changes in v5: - use skb_header_pointer instead of pskb_may_pull. Changes in v5: - Add pskb_may_pull before accessing IP/IPv6 headers in kfunc - Use buf[] instead of buf[32], verify recv data with ASSERT_STREQ - Remove unnecessary variable initializations in selftest and BPF Changes in v4: - Check if assign_ret is EINVAL instead of checking if it is 0 Changes in v3: - Add IPv6 test coverage, reuse test_cases[] to iterate over both address families - Share TCP/UDP port to simplify BPF program, remove unnecessary global variables - Use connect_to_fd() + send()/recv() instead of manual sockaddr construction - Suggested by Kuniyuki Iwashima Changes in v2: - Add Reviewed-by tag from Kuniyuki Iwashima for patch 1 - Use UDP socket recv() instead of kern_sync_rcu() for synchronization in selftest Jiayuan Chen (2): bpf: tcp: Reject non-TCP skb in bpf_sk_assign_tcp_reqsk() selftests/bpf: Add protocol check test for bpf_sk_assign_tcp_reqsk() net/core/filter.c | 20 +++- .../bpf/prog_tests/tcp_custom_syncookie.c | 93 ++++++++++++++- .../bpf/progs/test_tcp_custom_syncookie.c | 109 ++++++++++++++++++ 3 files changed, 216 insertions(+), 6 deletions(-) -- 2.43.0 ^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH bpf v6 1/2] bpf: tcp: Reject non-TCP skb in bpf_sk_assign_tcp_reqsk() 2026-04-03 1:58 [PATCH bpf v6 0/2] bpf: tcp: Fix null-ptr-deref in arbitrary SYN Cookie Jiayuan Chen @ 2026-04-03 1:58 ` Jiayuan Chen 2026-04-03 2:26 ` Eric Dumazet 2026-04-06 19:53 ` Martin KaFai Lau 2026-04-03 1:58 ` [PATCH bpf v6 2/2] selftests/bpf: Add protocol check test for bpf_sk_assign_tcp_reqsk() Jiayuan Chen 1 sibling, 2 replies; 6+ messages in thread From: Jiayuan Chen @ 2026-04-03 1:58 UTC (permalink / raw) To: bpf Cc: Jiayuan Chen, Kuniyuki Iwashima, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi, Song Liu, Yonghong Song, Jiri Olsa, John Fastabend, Stanislav Fomichev, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman, Shuah Khan, netdev, linux-kernel, linux-kselftest bpf_sk_assign_tcp_reqsk() only validates skb->protocol (L3) but does not check the L4 protocol in the IP header. A BPF program can call this kfunc on a UDP skb with a valid TCP listener socket, which will succeed and attach a TCP reqsk to the UDP skb. When the UDP skb enters the UDP receive path, skb_steal_sock() returns the TCP listener from the reqsk. The UDP code then passes this TCP socket to udp_unicast_rcv_skb() -> __udp_enqueue_schedule_skb(), which casts it to udp_sock and accesses UDP-specific fields at invalid offsets, causing a null pointer dereference and kernel panic: BUG: KASAN: null-ptr-deref in __udp_enqueue_schedule_skb+0x19d/0x1df0 Read of size 4 at addr 0000000000000008 by task test_progs/537 CPU: 1 UID: 0 PID: 537 Comm: test_progs Not tainted 7.0.0-rc4+ #46 PREEMPT Call Trace: <IRQ> dump_stack_lvl (lib/dump_stack.c:123) print_report (mm/kasan/report.c:487) kasan_report (mm/kasan/report.c:597) __kasan_check_read (mm/kasan/shadow.c:32) __udp_enqueue_schedule_skb (net/ipv4/udp.c:1719) udp_queue_rcv_one_skb (net/ipv4/udp.c:2370 net/ipv4/udp.c:2500) udp_queue_rcv_skb (net/ipv4/udp.c:2532) udp_unicast_rcv_skb (net/ipv4/udp.c:2684) __udp4_lib_rcv (net/ipv4/udp.c:2742) udp_rcv (net/ipv4/udp.c:2937) ip_protocol_deliver_rcu (net/ipv4/ip_input.c:209) ip_local_deliver_finish (./include/linux/rcupdate.h:879 net/ipv4/ip_input.c:242) ip_local_deliver (net/ipv4/ip_input.c:265) __netif_receive_skb_one_core (net/core/dev.c:6164 (discriminator 4)) __netif_receive_skb (net/core/dev.c:6280) Fix this by checking the IP header's protocol field in bpf_sk_assign_tcp_reqsk() and rejecting non-TCP skbs with -EINVAL. Note that for IPv6, the nexthdr check does not walk extension headers. This is uncommon for TCP SYN packets in practice, and keeping it simple was agreed upon by Kuniyuki Iwashima. Fixes: e472f88891ab ("bpf: tcp: Support arbitrary SYN Cookie.") Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev> --- net/core/filter.c | 20 ++++++++++++++++++-- 1 file changed, 18 insertions(+), 2 deletions(-) diff --git a/net/core/filter.c b/net/core/filter.c index 78b548158fb05..523a63584cba0 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -12247,15 +12247,31 @@ __bpf_kfunc int bpf_sk_assign_tcp_reqsk(struct __sk_buff *s, struct sock *sk, return -ENETUNREACH; switch (skb->protocol) { - case htons(ETH_P_IP): + case htons(ETH_P_IP): { + struct iphdr *iph, _iph; + + iph = skb_header_pointer(skb, skb_network_offset(skb), + sizeof(*iph), &_iph); + if (!iph || iph->protocol != IPPROTO_TCP) + return -EINVAL; + ops = &tcp_request_sock_ops; min_mss = 536; break; + } #if IS_BUILTIN(CONFIG_IPV6) - case htons(ETH_P_IPV6): + case htons(ETH_P_IPV6): { + struct ipv6hdr *ip6h, _ip6h; + + ip6h = skb_header_pointer(skb, skb_network_offset(skb), + sizeof(*ip6h), &_ip6h); + if (!ip6h || ip6h->nexthdr != IPPROTO_TCP) + return -EINVAL; + ops = &tcp6_request_sock_ops; min_mss = IPV6_MIN_MTU - 60; break; + } #endif default: return -EINVAL; -- 2.43.0 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH bpf v6 1/2] bpf: tcp: Reject non-TCP skb in bpf_sk_assign_tcp_reqsk() 2026-04-03 1:58 ` [PATCH bpf v6 1/2] bpf: tcp: Reject non-TCP skb in bpf_sk_assign_tcp_reqsk() Jiayuan Chen @ 2026-04-03 2:26 ` Eric Dumazet 2026-04-06 19:53 ` Martin KaFai Lau 1 sibling, 0 replies; 6+ messages in thread From: Eric Dumazet @ 2026-04-03 2:26 UTC (permalink / raw) To: Jiayuan Chen Cc: bpf, Kuniyuki Iwashima, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi, Song Liu, Yonghong Song, Jiri Olsa, John Fastabend, Stanislav Fomichev, David S. Miller, Jakub Kicinski, Paolo Abeni, Simon Horman, Shuah Khan, netdev, linux-kernel, linux-kselftest On Thu, Apr 2, 2026 at 6:59 PM Jiayuan Chen <jiayuan.chen@linux.dev> wrote: > > bpf_sk_assign_tcp_reqsk() only validates skb->protocol (L3) but does not > check the L4 protocol in the IP header. A BPF program can call this kfunc > on a UDP skb with a valid TCP listener socket, which will succeed and > attach a TCP reqsk to the UDP skb. > > When the UDP skb enters the UDP receive path, skb_steal_sock() returns > the TCP listener from the reqsk. The UDP code then passes this TCP socket > to udp_unicast_rcv_skb() -> __udp_enqueue_schedule_skb(), which casts > it to udp_sock and accesses UDP-specific fields at invalid offsets, > causing a null pointer dereference and kernel panic: > > BUG: KASAN: null-ptr-deref in __udp_enqueue_schedule_skb+0x19d/0x1df0 > Read of size 4 at addr 0000000000000008 by task test_progs/537 > > Fix this by checking the IP header's protocol field in > bpf_sk_assign_tcp_reqsk() and rejecting non-TCP skbs with -EINVAL. > > Note that for IPv6, the nexthdr check does not walk extension headers. > This is uncommon for TCP SYN packets in practice, and keeping it simple > was agreed upon by Kuniyuki Iwashima. > > Fixes: e472f88891ab ("bpf: tcp: Support arbitrary SYN Cookie.") > Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> > Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev> Reviewed-by: Eric Dumazet <edumazet@google.com> ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH bpf v6 1/2] bpf: tcp: Reject non-TCP skb in bpf_sk_assign_tcp_reqsk() 2026-04-03 1:58 ` [PATCH bpf v6 1/2] bpf: tcp: Reject non-TCP skb in bpf_sk_assign_tcp_reqsk() Jiayuan Chen 2026-04-03 2:26 ` Eric Dumazet @ 2026-04-06 19:53 ` Martin KaFai Lau 2026-04-07 5:22 ` Jiayuan Chen 1 sibling, 1 reply; 6+ messages in thread From: Martin KaFai Lau @ 2026-04-06 19:53 UTC (permalink / raw) To: Jiayuan Chen Cc: bpf, Kuniyuki Iwashima, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Eduard Zingerman, Kumar Kartikeya Dwivedi, Song Liu, Yonghong Song, Jiri Olsa, John Fastabend, Stanislav Fomichev, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman, Shuah Khan, netdev, linux-kernel, linux-kselftest On Fri, Apr 03, 2026 at 09:58:27AM +0800, Jiayuan Chen wrote: > bpf_sk_assign_tcp_reqsk() only validates skb->protocol (L3) but does not > check the L4 protocol in the IP header. A BPF program can call this kfunc > on a UDP skb with a valid TCP listener socket, which will succeed and > attach a TCP reqsk to the UDP skb. > > When the UDP skb enters the UDP receive path, skb_steal_sock() returns > the TCP listener from the reqsk. The UDP code then passes this TCP socket > to udp_unicast_rcv_skb() -> __udp_enqueue_schedule_skb(), which casts > it to udp_sock and accesses UDP-specific fields at invalid offsets, > causing a null pointer dereference and kernel panic: > > BUG: KASAN: null-ptr-deref in __udp_enqueue_schedule_skb+0x19d/0x1df0 > Read of size 4 at addr 0000000000000008 by task test_progs/537 > > CPU: 1 UID: 0 PID: 537 Comm: test_progs Not tainted 7.0.0-rc4+ #46 PREEMPT > Call Trace: > <IRQ> > dump_stack_lvl (lib/dump_stack.c:123) > print_report (mm/kasan/report.c:487) > kasan_report (mm/kasan/report.c:597) > __kasan_check_read (mm/kasan/shadow.c:32) > __udp_enqueue_schedule_skb (net/ipv4/udp.c:1719) > udp_queue_rcv_one_skb (net/ipv4/udp.c:2370 net/ipv4/udp.c:2500) > udp_queue_rcv_skb (net/ipv4/udp.c:2532) > udp_unicast_rcv_skb (net/ipv4/udp.c:2684) > __udp4_lib_rcv (net/ipv4/udp.c:2742) > udp_rcv (net/ipv4/udp.c:2937) > ip_protocol_deliver_rcu (net/ipv4/ip_input.c:209) > ip_local_deliver_finish (./include/linux/rcupdate.h:879 net/ipv4/ip_input.c:242) > ip_local_deliver (net/ipv4/ip_input.c:265) > __netif_receive_skb_one_core (net/core/dev.c:6164 (discriminator 4)) > __netif_receive_skb (net/core/dev.c:6280) > > Fix this by checking the IP header's protocol field in > bpf_sk_assign_tcp_reqsk() and rejecting non-TCP skbs with -EINVAL. > > Note that for IPv6, the nexthdr check does not walk extension headers. > This is uncommon for TCP SYN packets in practice, and keeping it simple > was agreed upon by Kuniyuki Iwashima. sashiko has flagged a similar issue with larger scope. Please take a look. Thanks. https://sashiko.dev/#/patchset/20260403015851.148209-1-jiayuan.chen%40linux.dev ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH bpf v6 1/2] bpf: tcp: Reject non-TCP skb in bpf_sk_assign_tcp_reqsk() 2026-04-06 19:53 ` Martin KaFai Lau @ 2026-04-07 5:22 ` Jiayuan Chen 0 siblings, 0 replies; 6+ messages in thread From: Jiayuan Chen @ 2026-04-07 5:22 UTC (permalink / raw) To: Martin KaFai Lau, Eric Dumazet, Kuniyuki Iwashima Cc: bpf, Kuniyuki Iwashima, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Eduard Zingerman, Kumar Kartikeya Dwivedi, Song Liu, Yonghong Song, Jiri Olsa, John Fastabend, Stanislav Fomichev, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman, Shuah Khan, netdev, linux-kernel, linux-kselftest On 4/7/26 3:53 AM, Martin KaFai Lau wrote: > On Fri, Apr 03, 2026 at 09:58:27AM +0800, Jiayuan Chen wrote: >> bpf_sk_assign_tcp_reqsk() only validates skb->protocol (L3) but does not >> check the L4 protocol in the IP header. A BPF program can call this kfunc >> on a UDP skb with a valid TCP listener socket, which will succeed and >> attach a TCP reqsk to the UDP skb. >> >> When the UDP skb enters the UDP receive path, skb_steal_sock() returns >> the TCP listener from the reqsk. The UDP code then passes this TCP socket >> to udp_unicast_rcv_skb() -> __udp_enqueue_schedule_skb(), which casts >> it to udp_sock and accesses UDP-specific fields at invalid offsets, >> causing a null pointer dereference and kernel panic: >> >> BUG: KASAN: null-ptr-deref in __udp_enqueue_schedule_skb+0x19d/0x1df0 >> Read of size 4 at addr 0000000000000008 by task test_progs/537 >> >> CPU: 1 UID: 0 PID: 537 Comm: test_progs Not tainted 7.0.0-rc4+ #46 PREEMPT >> Call Trace: >> <IRQ> >> dump_stack_lvl (lib/dump_stack.c:123) >> print_report (mm/kasan/report.c:487) >> kasan_report (mm/kasan/report.c:597) >> __kasan_check_read (mm/kasan/shadow.c:32) >> __udp_enqueue_schedule_skb (net/ipv4/udp.c:1719) >> udp_queue_rcv_one_skb (net/ipv4/udp.c:2370 net/ipv4/udp.c:2500) >> udp_queue_rcv_skb (net/ipv4/udp.c:2532) >> udp_unicast_rcv_skb (net/ipv4/udp.c:2684) >> __udp4_lib_rcv (net/ipv4/udp.c:2742) >> udp_rcv (net/ipv4/udp.c:2937) >> ip_protocol_deliver_rcu (net/ipv4/ip_input.c:209) >> ip_local_deliver_finish (./include/linux/rcupdate.h:879 net/ipv4/ip_input.c:242) >> ip_local_deliver (net/ipv4/ip_input.c:265) >> __netif_receive_skb_one_core (net/core/dev.c:6164 (discriminator 4)) >> __netif_receive_skb (net/core/dev.c:6280) >> >> Fix this by checking the IP header's protocol field in >> bpf_sk_assign_tcp_reqsk() and rejecting non-TCP skbs with -EINVAL. >> >> Note that for IPv6, the nexthdr check does not walk extension headers. >> This is uncommon for TCP SYN packets in practice, and keeping it simple >> was agreed upon by Kuniyuki Iwashima. > sashiko has flagged a similar issue with larger scope. > Please take a look. Thanks. > > https://sashiko.dev/#/patchset/20260403015851.148209-1-jiayuan.chen%40linux.dev Thanks a lot Martin, sashiko actually dug into a deeper issue here. Eric and Kuniyuki, I think the AI review has a point. Since BPF can modify skb fields, the following sequence still bypasses the protocol check in bpf_sk_assign_tcp_reqsk(): // for a UDP skb iph->protocol = TCP bpf_sk_assign_tcp_reqsk() iph->protocol = UDP On top of that, bpf_sk_assign() already has the same problem — it doesn't validate L4 protocol at all. So I think we should add a check matching skb against sk in skb_steal_sock() instead of adding check in bpf helper. (Also, we probably need some MIB counters for this.) Below is my diff: //The core reason for passing protocol as a parameter rather than reading it from //the skb is that IPv6 extension headers make it non-trivial to get the actual nexthdr. //It's much simpler to just let the caller tell us what protocol it expects. diff --git a/include/net/inet6_hashtables.h b/include/net/inet6_hashtables.h index c16de5b7963fd..c1032bb9ea864 100644 --- a/include/net/inet6_hashtables.h +++ b/include/net/inet6_hashtables.h @@ -104,12 +104,13 @@ static inline struct sock *inet6_steal_sock(struct net *net, struct sk_buff *skb, int doff, const struct in6_addr *saddr, const __be16 sport, const struct in6_addr *daddr, const __be16 dport, - bool *refcounted, inet6_ehashfn_t *ehashfn) + bool *refcounted, inet6_ehashfn_t *ehashfn, + int protocol) { struct sock *sk, *reuse_sk; bool prefetched; - sk = skb_steal_sock(skb, refcounted, &prefetched); + sk = skb_steal_sock(skb, refcounted, &prefetched, protocol); if (!sk) return NULL; @@ -151,7 +152,7 @@ static inline struct sock *__inet6_lookup_skb(struct sk_buff *skb, int doff, struct sock *sk; sk = inet6_steal_sock(net, skb, doff, &ip6h->saddr, sport, &ip6h->daddr, dport, - refcounted, inet6_ehashfn); + refcounted, inet6_ehashfn, IPPROTO_TCP); if (IS_ERR(sk)) return NULL; if (sk) diff --git a/include/net/inet_hashtables.h b/include/net/inet_hashtables.h index 5a979dcab5383..0ad73135ece7d 100644 --- a/include/net/inet_hashtables.h +++ b/include/net/inet_hashtables.h @@ -433,12 +433,13 @@ static inline struct sock *inet_steal_sock(struct net *net, struct sk_buff *skb, int doff, const __be32 saddr, const __be16 sport, const __be32 daddr, const __be16 dport, - bool *refcounted, inet_ehashfn_t *ehashfn) + bool *refcounted, inet_ehashfn_t *ehashfn, + int protocol) { struct sock *sk, *reuse_sk; bool prefetched; - sk = skb_steal_sock(skb, refcounted, &prefetched); + sk = skb_steal_sock(skb, refcounted, &prefetched, protocol); if (!sk) return NULL; @@ -469,7 +470,7 @@ struct sock *inet_steal_sock(struct net *net, struct sk_buff *skb, int doff, return reuse_sk; } -static inline struct sock *__inet_lookup_skb(struct sk_buff *skb, +static inline struct sock *__tcp_lookup_skb(struct sk_buff *skb, int doff, const __be16 sport, const __be16 dport, @@ -481,7 +482,7 @@ static inline struct sock *__inet_lookup_skb(struct sk_buff *skb, struct sock *sk; sk = inet_steal_sock(net, skb, doff, iph->saddr, sport, iph->daddr, dport, - refcounted, inet_ehashfn); + refcounted, inet_ehashfn, IPPROTO_TCP); if (IS_ERR(sk)) return NULL; if (sk) diff --git a/include/net/request_sock.h b/include/net/request_sock.h index 5a9c826a7092d..80bd209b2323b 100644 --- a/include/net/request_sock.h +++ b/include/net/request_sock.h @@ -91,7 +91,8 @@ static inline struct sock *req_to_sk(struct request_sock *req) * @prefetched: is set to true if the socket was assigned from bpf */ static inline struct sock *skb_steal_sock(struct sk_buff *skb, - bool *refcounted, bool *prefetched) + bool *refcounted, bool *prefetched, + int protocol) { struct sock *sk = skb->sk; @@ -103,6 +104,24 @@ static inline struct sock *skb_steal_sock(struct sk_buff *skb, *prefetched = skb_sk_is_prefetched(skb); if (*prefetched) { + /* Validate that the stolen socket matches the expected L4 + * protocol. BPF (bpf_sk_assign) can mistakenly or maliciously + * assign a socket of the wrong type. If the protocols don't + * match, clean up the assignment and return NULL so the caller + * falls through to the normal hash table lookup. + * + * For full sockets, check sk_protocol directly. + * For request sockets (only created by bpf_sk_assign_tcp_reqsk), + * sk_protocol is not in sock_common and cannot be accessed. + * Request sockets are always TCP, so assume IPPROTO_TCP. + */ + if ((sk_fullsock(sk) ? sk->sk_protocol : IPPROTO_TCP) != protocol) { + skb_orphan(skb); + *prefetched = false; + *refcounted = false; + return NULL; + } + #if IS_ENABLED(CONFIG_SYN_COOKIES) if (sk->sk_state == TCP_NEW_SYN_RECV && inet_reqsk(sk)->syncookie) { struct request_sock *req = inet_reqsk(sk); diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index c7b2463c2e254..442a9379aefe6 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -2188,7 +2188,7 @@ int tcp_v4_rcv(struct sk_buff *skb) th = (const struct tcphdr *)skb->data; iph = ip_hdr(skb); lookup: - sk = __inet_lookup_skb(skb, __tcp_hdrlen(th), th->source, + sk = __tcp_lookup_skb(skb, __tcp_hdrlen(th), th->source, th->dest, sdif, &refcounted); if (!sk) goto no_tcp_socket; diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index b60fad393e182..61f7fbf5ffd9c 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -2728,7 +2728,7 @@ int __udp4_lib_rcv(struct sk_buff *skb, struct udp_table *udptable, goto csum_error; sk = inet_steal_sock(net, skb, sizeof(struct udphdr), saddr, uh->source, daddr, uh->dest, - &refcounted, udp_ehashfn); + &refcounted, udp_ehashfn, IPPROTO_UDP); if (IS_ERR(sk)) goto no_sk; diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c index 010b909275dd0..9c3c25c938ec8 100644 --- a/net/ipv6/udp.c +++ b/net/ipv6/udp.c @@ -1115,7 +1115,7 @@ int __udp6_lib_rcv(struct sk_buff *skb, struct udp_table *udptable, /* Check if the socket is already available, e.g. due to early demux */ sk = inet6_steal_sock(net, skb, sizeof(struct udphdr), saddr, uh->source, daddr, uh->dest, - &refcounted, udp6_ehashfn); + &refcounted, udp6_ehashfn, IPPROTO_UDP); if (IS_ERR(sk)) goto no_sk; --- This fix touches the net core path (skb_steal_sock / inet_steal_sock), so it probably makes more sense to review it on the net side rather than the bpf side. ^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH bpf v6 2/2] selftests/bpf: Add protocol check test for bpf_sk_assign_tcp_reqsk() 2026-04-03 1:58 [PATCH bpf v6 0/2] bpf: tcp: Fix null-ptr-deref in arbitrary SYN Cookie Jiayuan Chen 2026-04-03 1:58 ` [PATCH bpf v6 1/2] bpf: tcp: Reject non-TCP skb in bpf_sk_assign_tcp_reqsk() Jiayuan Chen @ 2026-04-03 1:58 ` Jiayuan Chen 1 sibling, 0 replies; 6+ messages in thread From: Jiayuan Chen @ 2026-04-03 1:58 UTC (permalink / raw) To: bpf Cc: Jiayuan Chen, Martin KaFai Lau, Daniel Borkmann, John Fastabend, Stanislav Fomichev, Alexei Starovoitov, Andrii Nakryiko, Eduard Zingerman, Kumar Kartikeya Dwivedi, Song Liu, Yonghong Song, Jiri Olsa, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman, Shuah Khan, Kuniyuki Iwashima, netdev, linux-kernel, linux-kselftest Add test_tcp_custom_syncookie_protocol_check to verify that bpf_sk_assign_tcp_reqsk() rejects non-TCP skbs. The test sends a UDP packet through TC ingress where a BPF program calls bpf_sk_assign_tcp_reqsk() on it and checks that the kfunc returns an error. A UDP server recv() is used as synchronization to ensure the BPF program has finished processing before checking the result. Without the fix in bpf_sk_assign_tcp_reqsk(), the kfunc succeeds and attaches a TCP reqsk to the UDP skb, which causes a null pointer dereference panic when the kernel processes it through the UDP receive path. Test result: ./test_progs -a tcp_custom_syncookie_protocol_check -v setup_netns:PASS:create netns 0 nsec setup_netns:PASS:ip 0 nsec write_sysctl:PASS:open sysctl 0 nsec write_sysctl:PASS:write sysctl 0 nsec setup_netns:PASS:write_sysctl 0 nsec test_tcp_custom_syncookie_protocol_check:PASS:open_and_load 0 nsec setup_tc:PASS:qdisc add dev lo clsact 0 nsec setup_tc:PASS:filter add dev lo ingress 0 nsec run_protocol_check:PASS:start tcp_server 0 nsec run_protocol_check:PASS:start udp_server 0 nsec run_protocol_check:PASS:connect udp_client 0 nsec run_protocol_check:PASS:send udp 0 nsec run_protocol_check:PASS:recv udp 0 nsec run_protocol_check:PASS:recv data 0 nsec run_protocol_check:PASS:udp_intercepted 0 nsec run_protocol_check:PASS:assign_ret 0 nsec #471/1 tcp_custom_syncookie_protocol_check/IPv4 TCP:OK run_protocol_check:PASS:start tcp_server 0 nsec run_protocol_check:PASS:start udp_server 0 nsec run_protocol_check:PASS:connect udp_client 0 nsec run_protocol_check:PASS:send udp 0 nsec run_protocol_check:PASS:recv udp 0 nsec run_protocol_check:PASS:recv data 0 nsec run_protocol_check:PASS:udp_intercepted 0 nsec run_protocol_check:PASS:assign_ret 0 nsec #471/2 tcp_custom_syncookie_protocol_check/IPv6 TCP:OK #471 tcp_custom_syncookie_protocol_check:OK Summary: 1/2 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev> --- .../bpf/prog_tests/tcp_custom_syncookie.c | 93 ++++++++++++++- .../bpf/progs/test_tcp_custom_syncookie.c | 109 ++++++++++++++++++ 2 files changed, 198 insertions(+), 4 deletions(-) diff --git a/tools/testing/selftests/bpf/prog_tests/tcp_custom_syncookie.c b/tools/testing/selftests/bpf/prog_tests/tcp_custom_syncookie.c index eaf441dc7e79b..017836581669f 100644 --- a/tools/testing/selftests/bpf/prog_tests/tcp_custom_syncookie.c +++ b/tools/testing/selftests/bpf/prog_tests/tcp_custom_syncookie.c @@ -5,6 +5,7 @@ #include <sched.h> #include <stdlib.h> #include <net/if.h> +#include <netinet/in.h> #include "test_progs.h" #include "cgroup_helpers.h" @@ -47,11 +48,10 @@ static int setup_netns(void) return -1; } -static int setup_tc(struct test_tcp_custom_syncookie *skel) +static int setup_tc(int prog_fd) { LIBBPF_OPTS(bpf_tc_hook, qdisc_lo, .attach_point = BPF_TC_INGRESS); - LIBBPF_OPTS(bpf_tc_opts, tc_attach, - .prog_fd = bpf_program__fd(skel->progs.tcp_custom_syncookie)); + LIBBPF_OPTS(bpf_tc_opts, tc_attach, .prog_fd = prog_fd); qdisc_lo.ifindex = if_nametoindex("lo"); if (!ASSERT_OK(bpf_tc_hook_create(&qdisc_lo), "qdisc add dev lo clsact")) @@ -127,7 +127,7 @@ void test_tcp_custom_syncookie(void) if (!ASSERT_OK_PTR(skel, "open_and_load")) return; - if (setup_tc(skel)) + if (setup_tc(bpf_program__fd(skel->progs.tcp_custom_syncookie))) goto destroy_skel; for (i = 0; i < ARRAY_SIZE(test_cases); i++) { @@ -145,6 +145,91 @@ void test_tcp_custom_syncookie(void) destroy_skel: system("tc qdisc del dev lo clsact"); + test_tcp_custom_syncookie__destroy(skel); +} + +/* Test: bpf_sk_assign_tcp_reqsk() should reject non-TCP skb. + * + * Send a UDP packet through TC ingress where a BPF program calls + * bpf_sk_assign_tcp_reqsk() on it. The kfunc should return an error + * because the skb carries UDP, not TCP. + * + * TCP and UDP servers share the same port. The BPF program intercepts + * the UDP packet, looks up the TCP listener via the dest port, and + * attempts to assign a TCP reqsk to the UDP skb. + */ +static void run_protocol_check(struct test_tcp_custom_syncookie *skel, + int family, const char *addr) +{ + int tcp_server, udp_server, udp_client; + char buf[] = "test"; + int port, ret; + + tcp_server = start_server(family, SOCK_STREAM, addr, 0, 0); + if (!ASSERT_NEQ(tcp_server, -1, "start tcp_server")) + return; + + port = ntohs(get_socket_local_port(tcp_server)); + + /* UDP server on same port for synchronization and port sharing */ + udp_server = start_server(family, SOCK_DGRAM, addr, port, 0); + if (!ASSERT_NEQ(udp_server, -1, "start udp_server")) + goto close_tcp; + + skel->bss->udp_intercepted = false; + skel->bss->assign_ret = 0; + + udp_client = connect_to_fd(udp_server, 0); + if (!ASSERT_NEQ(udp_client, -1, "connect udp_client")) + goto close_udp_server; + + ret = send(udp_client, buf, sizeof(buf), 0); + if (!ASSERT_EQ(ret, sizeof(buf), "send udp")) + goto close_udp_client; + + memset(buf, 0, sizeof(buf)); + + /* recv() ensures TC ingress BPF has processed the skb */ + ret = recv(udp_server, buf, sizeof(buf), 0); + if (!ASSERT_EQ(ret, sizeof(buf), "recv udp")) + goto close_udp_client; + + ASSERT_STREQ(buf, "test", "recv data"); + ASSERT_EQ(skel->bss->udp_intercepted, true, "udp_intercepted"); + ASSERT_EQ(skel->bss->assign_ret, -EINVAL, "assign_ret"); + +close_udp_client: + close(udp_client); +close_udp_server: + close(udp_server); +close_tcp: + close(tcp_server); +} + +void test_tcp_custom_syncookie_protocol_check(void) +{ + struct test_tcp_custom_syncookie *skel; + int i; + + if (setup_netns()) + return; + + skel = test_tcp_custom_syncookie__open_and_load(); + if (!ASSERT_OK_PTR(skel, "open_and_load")) + return; + + if (setup_tc(bpf_program__fd(skel->progs.tcp_custom_syncookie_badproto))) + goto destroy_skel; + + for (i = 0; i < ARRAY_SIZE(test_cases); i++) { + if (!test__start_subtest(test_cases[i].name)) + continue; + run_protocol_check(skel, test_cases[i].family, + test_cases[i].addr); + } + +destroy_skel: + system("tc qdisc del dev lo clsact"); test_tcp_custom_syncookie__destroy(skel); } diff --git a/tools/testing/selftests/bpf/progs/test_tcp_custom_syncookie.c b/tools/testing/selftests/bpf/progs/test_tcp_custom_syncookie.c index 7d5293de19523..7e87773a95adb 100644 --- a/tools/testing/selftests/bpf/progs/test_tcp_custom_syncookie.c +++ b/tools/testing/selftests/bpf/progs/test_tcp_custom_syncookie.c @@ -588,4 +588,113 @@ int tcp_custom_syncookie(struct __sk_buff *skb) return tcp_handle_ack(&ctx); } +/* Test: call bpf_sk_assign_tcp_reqsk() on a UDP skb. + * The kfunc should reject it because the skb is not TCP. + * + * TCP and UDP servers share the same port. The BPF program intercepts + * UDP packets, looks up the TCP listener on the same port, and tries + * to assign a TCP reqsk to the UDP skb. + */ +int assign_ret; +bool udp_intercepted; + +static int badproto_lookup_assign(struct __sk_buff *skb, + struct bpf_sock_tuple *tuple, u32 tuple_size) +{ + struct bpf_tcp_req_attrs attrs = {}; + struct bpf_sock *skc; + struct sock *sk; + + skc = bpf_skc_lookup_tcp(skb, tuple, tuple_size, -1, 0); + if (!skc) + return TC_ACT_OK; + + if (skc->state != TCP_LISTEN) { + bpf_sk_release(skc); + return TC_ACT_OK; + } + + sk = (struct sock *)bpf_skc_to_tcp_sock(skc); + if (!sk) { + bpf_sk_release(skc); + return TC_ACT_OK; + } + + attrs.mss = 1460; + attrs.wscale_ok = 1; + attrs.snd_wscale = 7; + attrs.rcv_wscale = 7; + attrs.sack_ok = 1; + + assign_ret = bpf_sk_assign_tcp_reqsk(skb, sk, &attrs, sizeof(attrs)); + + bpf_sk_release(skc); + return TC_ACT_OK; +} + +SEC("tc") +int tcp_custom_syncookie_badproto(struct __sk_buff *skb) +{ + void *data = (void *)(long)skb->data; + void *data_end = (void *)(long)skb->data_end; + struct bpf_sock_tuple tuple = {}; + struct ethhdr *eth; + struct iphdr *iph; + struct ipv6hdr *ip6h; + struct udphdr *udp; + + eth = (struct ethhdr *)data; + if (eth + 1 > data_end) + return TC_ACT_OK; + + switch (bpf_ntohs(eth->h_proto)) { + case ETH_P_IP: + iph = (struct iphdr *)(eth + 1); + if (iph + 1 > data_end) + return TC_ACT_OK; + + if (iph->protocol != IPPROTO_UDP) + return TC_ACT_OK; + + udp = (struct udphdr *)(iph + 1); + if (udp + 1 > data_end) + return TC_ACT_OK; + + udp_intercepted = true; + + tuple.ipv4.saddr = iph->saddr; + tuple.ipv4.daddr = iph->daddr; + tuple.ipv4.sport = udp->source; + tuple.ipv4.dport = udp->dest; + + return badproto_lookup_assign(skb, &tuple, + sizeof(tuple.ipv4)); + case ETH_P_IPV6: + ip6h = (struct ipv6hdr *)(eth + 1); + if (ip6h + 1 > data_end) + return TC_ACT_OK; + + if (ip6h->nexthdr != IPPROTO_UDP) + return TC_ACT_OK; + + udp = (struct udphdr *)(ip6h + 1); + if (udp + 1 > data_end) + return TC_ACT_OK; + + udp_intercepted = true; + + __builtin_memcpy(tuple.ipv6.saddr, &ip6h->saddr, + sizeof(tuple.ipv6.saddr)); + __builtin_memcpy(tuple.ipv6.daddr, &ip6h->daddr, + sizeof(tuple.ipv6.daddr)); + tuple.ipv6.sport = udp->source; + tuple.ipv6.dport = udp->dest; + + return badproto_lookup_assign(skb, &tuple, + sizeof(tuple.ipv6)); + default: + return TC_ACT_OK; + } +} + char _license[] SEC("license") = "GPL"; -- 2.43.0 ^ permalink raw reply related [flat|nested] 6+ messages in thread
end of thread, other threads:[~2026-04-07 5:22 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-04-03 1:58 [PATCH bpf v6 0/2] bpf: tcp: Fix null-ptr-deref in arbitrary SYN Cookie Jiayuan Chen 2026-04-03 1:58 ` [PATCH bpf v6 1/2] bpf: tcp: Reject non-TCP skb in bpf_sk_assign_tcp_reqsk() Jiayuan Chen 2026-04-03 2:26 ` Eric Dumazet 2026-04-06 19:53 ` Martin KaFai Lau 2026-04-07 5:22 ` Jiayuan Chen 2026-04-03 1:58 ` [PATCH bpf v6 2/2] selftests/bpf: Add protocol check test for bpf_sk_assign_tcp_reqsk() Jiayuan Chen
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox