[PATCH bpf v4 0/2] bpf: tcp: Fix null-ptr-deref in arbitrary SYN Cookie

public inbox for bpf@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH bpf v4 0/2] bpf: tcp: Fix null-ptr-deref in arbitrary SYN Cookie
@ 2026-03-30  8:07 Jiayuan Chen
  2026-03-30  8:07 ` [PATCH bpf v4 1/2] bpf: tcp: Reject non-TCP skb in bpf_sk_assign_tcp_reqsk() Jiayuan Chen
  2026-03-30  8:07 ` [PATCH bpf v4 2/2] selftests/bpf: Add protocol check test for bpf_sk_assign_tcp_reqsk() Jiayuan Chen
  0 siblings, 2 replies; 9+ messages in thread
From: Jiayuan Chen @ 2026-03-30  8:07 UTC (permalink / raw)
  To: bpf
  Cc: Jiayuan Chen, Martin KaFai Lau, Daniel Borkmann, John Fastabend,
	Stanislav Fomichev, Alexei Starovoitov, Andrii Nakryiko,
	Eduard Zingerman, Song Liu, Yonghong Song, KP Singh, Hao Luo,
	Jiri Olsa, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Shuah Khan, Jiayuan Chen,
	Kuniyuki Iwashima, netdev, linux-kernel, linux-kselftest

From: Jiayuan Chen <jiayuan.chen@shopee.com>

bpf_sk_assign_tcp_reqsk() does not validate the L4 protocol of the skb,
only checking skb->protocol (L3). A BPF program that calls this kfunc on
a non-TCP skb (e.g. UDP) will succeed, attaching a TCP reqsk to the skb.

When the skb enters the UDP receive path, skb_steal_sock() returns the
TCP listener socket from the reqsk. The UDP code then casts this TCP
socket to udp_sock and accesses UDP-specific fields at invalid offsets,
causing a null pointer dereference:

  BUG: KASAN: null-ptr-deref in __udp_enqueue_schedule_skb+0x19d/0x1df0
  Read of size 4 at addr 0000000000000008 by task test_progs/537

  CPU: 1 UID: 0 PID: 537 Comm: test_progs Not tainted 7.0.0-rc4+ #46 PREEMPT
  Call Trace:
   <IRQ>
   dump_stack_lvl (lib/dump_stack.c:123)
   print_report (mm/kasan/report.c:487)
   kasan_report (mm/kasan/report.c:597)
   __kasan_check_read (mm/kasan/shadow.c:32)
   __udp_enqueue_schedule_skb (net/ipv4/udp.c:1719)
   udp_queue_rcv_one_skb (net/ipv4/udp.c:2370 net/ipv4/udp.c:2500)
   udp_queue_rcv_skb (net/ipv4/udp.c:2532)
   udp_unicast_rcv_skb (net/ipv4/udp.c:2684)
   __udp4_lib_rcv (net/ipv4/udp.c:2742)
   udp_rcv (net/ipv4/udp.c:2937)
   ip_protocol_deliver_rcu (net/ipv4/ip_input.c:209)
   ip_local_deliver_finish (./include/linux/rcupdate.h:879 net/ipv4/ip_input.c:242)
   ip_local_deliver (net/ipv4/ip_input.c:265)
   __netif_receive_skb_one_core (net/core/dev.c:6164 (discriminator 4))
   __netif_receive_skb (net/core/dev.c:6280)

Solution

Patch 1: Add L4 protocol validation in bpf_sk_assign_tcp_reqsk(). Check
ip_hdr(skb)->protocol (IPv4) and ipv6_hdr(skb)->nexthdr (IPv6) against
IPPROTO_TCP, returning -EINVAL for non-TCP skbs.
Patch 2: Add selftest that calls bpf_sk_assign_tcp_reqsk() on a UDP skb
and verifies the kfunc rejects it.


---
v1: https://lore.kernel.org/bpf/20260323105510.51990-1-jiayuan.chen@linux.dev/
v2: https://lore.kernel.org/bpf/20260326062657.88446-1-jiayuan.chen@linux.dev/
v3: https://lore.kernel.org/bpf/20260327133915.286037-1-jiayuan.chen@linux.dev/

Changes in v4:
- Check if assign_ret is EINVAL instead of checking if it is 0. 
  Suggested by Martin KaFai Lau

Changes in v3:
- Add IPv6 test coverage, reuse test_cases[] to iterate over both
  address families
- Share TCP/UDP port to simplify BPF program, remove unnecessary
  global variables
- Use connect_to_fd() + send()/recv() instead of manual sockaddr
  construction
- Suggested by Kuniyuki Iwashima

Changes in v2:
- Add Reviewed-by tag from Kuniyuki Iwashima for patch 1
- Use UDP socket recv() instead of kern_sync_rcu() for synchronization
  in selftest

Jiayuan Chen (2):
  bpf: tcp: Reject non-TCP skb in bpf_sk_assign_tcp_reqsk()
  selftests/bpf: Add protocol check test for bpf_sk_assign_tcp_reqsk()

 net/core/filter.c                             |   6 +
 .../bpf/prog_tests/tcp_custom_syncookie.c     |  91 ++++++++++++++-
 .../bpf/progs/test_tcp_custom_syncookie.c     | 109 ++++++++++++++++++
 3 files changed, 202 insertions(+), 4 deletions(-)

-- 
2.43.0


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH bpf v4 1/2] bpf: tcp: Reject non-TCP skb in bpf_sk_assign_tcp_reqsk()
  2026-03-30  8:07 [PATCH bpf v4 0/2] bpf: tcp: Fix null-ptr-deref in arbitrary SYN Cookie Jiayuan Chen
@ 2026-03-30  8:07 ` Jiayuan Chen
  2026-03-30  9:00   ` bot+bpf-ci
  2026-03-30  8:07 ` [PATCH bpf v4 2/2] selftests/bpf: Add protocol check test for bpf_sk_assign_tcp_reqsk() Jiayuan Chen
  1 sibling, 1 reply; 9+ messages in thread
From: Jiayuan Chen @ 2026-03-30  8:07 UTC (permalink / raw)
  To: bpf
  Cc: Jiayuan Chen, Kuniyuki Iwashima, Martin KaFai Lau,
	Daniel Borkmann, John Fastabend, Stanislav Fomichev,
	Alexei Starovoitov, Andrii Nakryiko, Eduard Zingerman, Song Liu,
	Yonghong Song, KP Singh, Hao Luo, Jiri Olsa, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
	Shuah Khan, netdev, linux-kernel, linux-kselftest

bpf_sk_assign_tcp_reqsk() only validates skb->protocol (L3) but does not
check the L4 protocol in the IP header. A BPF program can call this kfunc
on a UDP skb with a valid TCP listener socket, which will succeed and
attach a TCP reqsk to the UDP skb.

When the UDP skb enters the UDP receive path, skb_steal_sock() returns
the TCP listener from the reqsk. The UDP code then passes this TCP socket
to udp_unicast_rcv_skb() -> __udp_enqueue_schedule_skb(), which casts
it to udp_sock and accesses UDP-specific fields at invalid offsets,
causing a null pointer dereference and kernel panic:

  BUG: KASAN: null-ptr-deref in __udp_enqueue_schedule_skb+0x19d/0x1df0
  Read of size 4 at addr 0000000000000008 by task test_progs/537

  CPU: 1 UID: 0 PID: 537 Comm: test_progs Not tainted 7.0.0-rc4+ #46 PREEMPT
  Call Trace:
   <IRQ>
   dump_stack_lvl (lib/dump_stack.c:123)
   print_report (mm/kasan/report.c:487)
   kasan_report (mm/kasan/report.c:597)
   __kasan_check_read (mm/kasan/shadow.c:32)
   __udp_enqueue_schedule_skb (net/ipv4/udp.c:1719)
   udp_queue_rcv_one_skb (net/ipv4/udp.c:2370 net/ipv4/udp.c:2500)
   udp_queue_rcv_skb (net/ipv4/udp.c:2532)
   udp_unicast_rcv_skb (net/ipv4/udp.c:2684)
   __udp4_lib_rcv (net/ipv4/udp.c:2742)
   udp_rcv (net/ipv4/udp.c:2937)
   ip_protocol_deliver_rcu (net/ipv4/ip_input.c:209)
   ip_local_deliver_finish (./include/linux/rcupdate.h:879 net/ipv4/ip_input.c:242)
   ip_local_deliver (net/ipv4/ip_input.c:265)
   __netif_receive_skb_one_core (net/core/dev.c:6164 (discriminator 4))
   __netif_receive_skb (net/core/dev.c:6280)

Fix this by checking the IP header's protocol field in
bpf_sk_assign_tcp_reqsk() and rejecting non-TCP skbs with -EINVAL.

Fixes: e472f88891ab ("bpf: tcp: Support arbitrary SYN Cookie.")
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
---
 net/core/filter.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/net/core/filter.c b/net/core/filter.c
index 78b548158fb0..fb975bcce804 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -12248,11 +12248,17 @@ __bpf_kfunc int bpf_sk_assign_tcp_reqsk(struct __sk_buff *s, struct sock *sk,
 
 	switch (skb->protocol) {
 	case htons(ETH_P_IP):
+		if (ip_hdr(skb)->protocol != IPPROTO_TCP)
+			return -EINVAL;
+
 		ops = &tcp_request_sock_ops;
 		min_mss = 536;
 		break;
 #if IS_BUILTIN(CONFIG_IPV6)
 	case htons(ETH_P_IPV6):
+		if (ipv6_hdr(skb)->nexthdr != IPPROTO_TCP)
+			return -EINVAL;
+
 		ops = &tcp6_request_sock_ops;
 		min_mss = IPV6_MIN_MTU - 60;
 		break;
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH bpf v4 2/2] selftests/bpf: Add protocol check test for bpf_sk_assign_tcp_reqsk()
  2026-03-30  8:07 [PATCH bpf v4 0/2] bpf: tcp: Fix null-ptr-deref in arbitrary SYN Cookie Jiayuan Chen
  2026-03-30  8:07 ` [PATCH bpf v4 1/2] bpf: tcp: Reject non-TCP skb in bpf_sk_assign_tcp_reqsk() Jiayuan Chen
@ 2026-03-30  8:07 ` Jiayuan Chen
  2026-03-31  1:50   ` Kuniyuki Iwashima
  1 sibling, 1 reply; 9+ messages in thread
From: Jiayuan Chen @ 2026-03-30  8:07 UTC (permalink / raw)
  To: bpf
  Cc: Jiayuan Chen, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
	Hao Luo, Jiri Olsa, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Shuah Khan, Kuniyuki Iwashima, netdev,
	linux-kernel, linux-kselftest

Add test_tcp_custom_syncookie_protocol_check to verify that
bpf_sk_assign_tcp_reqsk() rejects non-TCP skbs. The test sends a UDP
packet through TC ingress where a BPF program calls
bpf_sk_assign_tcp_reqsk() on it and checks that the kfunc returns an
error. A UDP server recv() is used as synchronization to ensure the
BPF program has finished processing before checking the result.

Without the fix in bpf_sk_assign_tcp_reqsk(), the kfunc succeeds and
attaches a TCP reqsk to the UDP skb, which causes a null pointer
dereference panic when the kernel processes it through the UDP receive
path.

Test result:

  ./test_progs -a tcp_custom_syncookie_protocol_check -v
  setup_netns:PASS:create netns 0 nsec
  setup_netns:PASS:ip 0 nsec
  write_sysctl:PASS:open sysctl 0 nsec
  write_sysctl:PASS:write sysctl 0 nsec
  setup_netns:PASS:write_sysctl 0 nsec
  test_tcp_custom_syncookie_protocol_check:PASS:open_and_load 0 nsec
  setup_tc:PASS:qdisc add dev lo clsact 0 nsec
  setup_tc:PASS:filter add dev lo ingress 0 nsec
  run_protocol_check:PASS:start tcp_server 0 nsec
  run_protocol_check:PASS:start udp_server 0 nsec
  run_protocol_check:PASS:connect udp_client 0 nsec
  run_protocol_check:PASS:send udp 0 nsec
  run_protocol_check:PASS:recv udp 0 nsec
  run_protocol_check:PASS:udp_intercepted 0 nsec
  run_protocol_check:PASS:assign_ret 0 nsec
  #471/1   tcp_custom_syncookie_protocol_check/IPv4 TCP:OK
  run_protocol_check:PASS:start tcp_server 0 nsec
  run_protocol_check:PASS:start udp_server 0 nsec
  run_protocol_check:PASS:connect udp_client 0 nsec
  run_protocol_check:PASS:send udp 0 nsec
  run_protocol_check:PASS:recv udp 0 nsec
  run_protocol_check:PASS:udp_intercepted 0 nsec
  run_protocol_check:PASS:assign_ret 0 nsec
  #471/2   tcp_custom_syncookie_protocol_check/IPv6 TCP:OK
  #471     tcp_custom_syncookie_protocol_check:OK
  Summary: 1/2 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
---
 .../bpf/prog_tests/tcp_custom_syncookie.c     |  91 ++++++++++++++-
 .../bpf/progs/test_tcp_custom_syncookie.c     | 109 ++++++++++++++++++
 2 files changed, 196 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/tcp_custom_syncookie.c b/tools/testing/selftests/bpf/prog_tests/tcp_custom_syncookie.c
index eaf441dc7e79..6e29402c4c59 100644
--- a/tools/testing/selftests/bpf/prog_tests/tcp_custom_syncookie.c
+++ b/tools/testing/selftests/bpf/prog_tests/tcp_custom_syncookie.c
@@ -5,6 +5,7 @@
 #include <sched.h>
 #include <stdlib.h>
 #include <net/if.h>
+#include <netinet/in.h>
 
 #include "test_progs.h"
 #include "cgroup_helpers.h"
@@ -47,11 +48,10 @@ static int setup_netns(void)
 	return -1;
 }
 
-static int setup_tc(struct test_tcp_custom_syncookie *skel)
+static int setup_tc(int prog_fd)
 {
 	LIBBPF_OPTS(bpf_tc_hook, qdisc_lo, .attach_point = BPF_TC_INGRESS);
-	LIBBPF_OPTS(bpf_tc_opts, tc_attach,
-		    .prog_fd = bpf_program__fd(skel->progs.tcp_custom_syncookie));
+	LIBBPF_OPTS(bpf_tc_opts, tc_attach, .prog_fd = prog_fd);
 
 	qdisc_lo.ifindex = if_nametoindex("lo");
 	if (!ASSERT_OK(bpf_tc_hook_create(&qdisc_lo), "qdisc add dev lo clsact"))
@@ -127,7 +127,7 @@ void test_tcp_custom_syncookie(void)
 	if (!ASSERT_OK_PTR(skel, "open_and_load"))
 		return;
 
-	if (setup_tc(skel))
+	if (setup_tc(bpf_program__fd(skel->progs.tcp_custom_syncookie)))
 		goto destroy_skel;
 
 	for (i = 0; i < ARRAY_SIZE(test_cases); i++) {
@@ -145,6 +145,89 @@ void test_tcp_custom_syncookie(void)
 
 destroy_skel:
 	system("tc qdisc del dev lo clsact");
+	test_tcp_custom_syncookie__destroy(skel);
+}
+
+/* Test: bpf_sk_assign_tcp_reqsk() should reject non-TCP skb.
+ *
+ * Send a UDP packet through TC ingress where a BPF program calls
+ * bpf_sk_assign_tcp_reqsk() on it. The kfunc should return an error
+ * because the skb carries UDP, not TCP.
+ *
+ * TCP and UDP servers share the same port. The BPF program intercepts
+ * the UDP packet, looks up the TCP listener via the dest port, and
+ * attempts to assign a TCP reqsk to the UDP skb.
+ */
+static void run_protocol_check(struct test_tcp_custom_syncookie *skel,
+			       int family, const char *addr)
+{
+	int tcp_server = -1, udp_server = -1, udp_client = -1;
+	char buf[32] = "test";
+	int port, ret;
+
+	tcp_server = start_server(family, SOCK_STREAM, addr, 0, 0);
+	if (!ASSERT_NEQ(tcp_server, -1, "start tcp_server"))
+		return;
+
+	port = ntohs(get_socket_local_port(tcp_server));
+
+	/* UDP server on same port for synchronization and port sharing */
+	udp_server = start_server(family, SOCK_DGRAM, addr, port, 0);
+	if (!ASSERT_NEQ(udp_server, -1, "start udp_server"))
+		goto close_tcp;
+
+	skel->bss->udp_intercepted = false;
+	skel->bss->assign_ret = 0;
+
+	udp_client = connect_to_fd(udp_server, 0);
+	if (!ASSERT_NEQ(udp_client, -1, "connect udp_client"))
+		goto close_udp_server;
 
+	ret = send(udp_client, buf, sizeof(buf), 0);
+	if (!ASSERT_EQ(ret, sizeof(buf), "send udp"))
+		goto close_udp_client;
+
+	/* recv() ensures TC ingress BPF has processed the skb */
+	ret = recv(udp_server, buf, sizeof(buf), 0);
+	if (!ASSERT_EQ(ret, sizeof(buf), "recv udp"))
+		goto close_udp_client;
+
+	ASSERT_EQ(skel->bss->udp_intercepted, true, "udp_intercepted");
+
+	ASSERT_EQ(skel->bss->assign_ret, -EINVAL, "assign_ret");
+
+close_udp_client:
+	close(udp_client);
+close_udp_server:
+	close(udp_server);
+close_tcp:
+	close(tcp_server);
+}
+
+void test_tcp_custom_syncookie_protocol_check(void)
+{
+	struct test_tcp_custom_syncookie *skel;
+	int i;
+
+	if (setup_netns())
+		return;
+
+	skel = test_tcp_custom_syncookie__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "open_and_load"))
+		return;
+
+	if (setup_tc(bpf_program__fd(skel->progs.tcp_custom_syncookie_badproto)))
+		goto destroy_skel;
+
+	for (i = 0; i < ARRAY_SIZE(test_cases); i++) {
+		if (!test__start_subtest(test_cases[i].name))
+			continue;
+
+		run_protocol_check(skel, test_cases[i].family,
+				   test_cases[i].addr);
+	}
+
+destroy_skel:
+	system("tc qdisc del dev lo clsact");
 	test_tcp_custom_syncookie__destroy(skel);
 }
diff --git a/tools/testing/selftests/bpf/progs/test_tcp_custom_syncookie.c b/tools/testing/selftests/bpf/progs/test_tcp_custom_syncookie.c
index 7d5293de1952..bd3fad3dd503 100644
--- a/tools/testing/selftests/bpf/progs/test_tcp_custom_syncookie.c
+++ b/tools/testing/selftests/bpf/progs/test_tcp_custom_syncookie.c
@@ -588,4 +588,113 @@ int tcp_custom_syncookie(struct __sk_buff *skb)
 	return tcp_handle_ack(&ctx);
 }
 
+/* Test: call bpf_sk_assign_tcp_reqsk() on a UDP skb.
+ * The kfunc should reject it because the skb is not TCP.
+ *
+ * TCP and UDP servers share the same port. The BPF program intercepts
+ * UDP packets, looks up the TCP listener on the same port, and tries
+ * to assign a TCP reqsk to the UDP skb.
+ */
+int assign_ret = 0;
+bool udp_intercepted = false;
+
+static int badproto_lookup_assign(struct __sk_buff *skb, struct udphdr *udp,
+				  struct bpf_sock_tuple *tuple, u32 tuple_size)
+{
+	struct bpf_tcp_req_attrs attrs = {};
+	struct bpf_sock *skc;
+	struct sock *sk;
+
+	skc = bpf_skc_lookup_tcp(skb, tuple, tuple_size, -1, 0);
+	if (!skc)
+		return TC_ACT_OK;
+
+	if (skc->state != TCP_LISTEN) {
+		bpf_sk_release(skc);
+		return TC_ACT_OK;
+	}
+
+	sk = (struct sock *)bpf_skc_to_tcp_sock(skc);
+	if (!sk) {
+		bpf_sk_release(skc);
+		return TC_ACT_OK;
+	}
+
+	attrs.mss = 1460;
+	attrs.wscale_ok = 1;
+	attrs.snd_wscale = 7;
+	attrs.rcv_wscale = 7;
+	attrs.sack_ok = 1;
+
+	assign_ret = bpf_sk_assign_tcp_reqsk(skb, sk, &attrs, sizeof(attrs));
+
+	bpf_sk_release(skc);
+	return TC_ACT_OK;
+}
+
+SEC("tc")
+int tcp_custom_syncookie_badproto(struct __sk_buff *skb)
+{
+	void *data = (void *)(long)skb->data;
+	void *data_end = (void *)(long)skb->data_end;
+	struct bpf_sock_tuple tuple = {};
+	struct ethhdr *eth;
+	struct iphdr *iph;
+	struct ipv6hdr *ip6h;
+	struct udphdr *udp;
+
+	eth = (struct ethhdr *)data;
+	if (eth + 1 > data_end)
+		return TC_ACT_OK;
+
+	switch (bpf_ntohs(eth->h_proto)) {
+	case ETH_P_IP:
+		iph = (struct iphdr *)(eth + 1);
+		if (iph + 1 > data_end)
+			return TC_ACT_OK;
+
+		if (iph->protocol != IPPROTO_UDP)
+			return TC_ACT_OK;
+
+		udp = (struct udphdr *)(iph + 1);
+		if (udp + 1 > data_end)
+			return TC_ACT_OK;
+
+		udp_intercepted = true;
+
+		tuple.ipv4.saddr = iph->saddr;
+		tuple.ipv4.daddr = iph->daddr;
+		tuple.ipv4.sport = udp->source;
+		tuple.ipv4.dport = udp->dest;
+
+		return badproto_lookup_assign(skb, udp, &tuple,
+					      sizeof(tuple.ipv4));
+	case ETH_P_IPV6:
+		ip6h = (struct ipv6hdr *)(eth + 1);
+		if (ip6h + 1 > data_end)
+			return TC_ACT_OK;
+
+		if (ip6h->nexthdr != IPPROTO_UDP)
+			return TC_ACT_OK;
+
+		udp = (struct udphdr *)(ip6h + 1);
+		if (udp + 1 > data_end)
+			return TC_ACT_OK;
+
+		udp_intercepted = true;
+
+		__builtin_memcpy(tuple.ipv6.saddr, &ip6h->saddr,
+				 sizeof(tuple.ipv6.saddr));
+		__builtin_memcpy(tuple.ipv6.daddr, &ip6h->daddr,
+				 sizeof(tuple.ipv6.daddr));
+		tuple.ipv6.sport = udp->source;
+		tuple.ipv6.dport = udp->dest;
+
+		return badproto_lookup_assign(skb, udp, &tuple,
+					      sizeof(tuple.ipv6));
+	default:
+		return TC_ACT_OK;
+	}
+}
+
 char _license[] SEC("license") = "GPL";
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH bpf v4 1/2] bpf: tcp: Reject non-TCP skb in bpf_sk_assign_tcp_reqsk()
  2026-03-30  8:07 ` [PATCH bpf v4 1/2] bpf: tcp: Reject non-TCP skb in bpf_sk_assign_tcp_reqsk() Jiayuan Chen
@ 2026-03-30  9:00   ` bot+bpf-ci
  2026-03-30  9:25     ` Jiayuan Chen
  0 siblings, 1 reply; 9+ messages in thread
From: bot+bpf-ci @ 2026-03-30  9:00 UTC (permalink / raw)
  To: jiayuan.chen, bpf
  Cc: jiayuan.chen, kuniyu, martin.lau, daniel, john.fastabend, sdf,
	ast, andrii, eddyz87, song, yonghong.song, kpsingh, haoluo, jolsa,
	davem, edumazet, kuba, pabeni, horms, shuah, netdev, linux-kernel,
	linux-kselftest, ast, andrii, daniel, martin.lau, eddyz87,
	yonghong.song, clm, ihor.solodrai

[-- Attachment #1: Type: text/plain, Size: 1404 bytes --]

> diff --git a/net/core/filter.c b/net/core/filter.c
> index 78b548158fb0..fb975bcce804 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -12248,11 +12248,17 @@ __bpf_kfunc int bpf_sk_assign_tcp_reqsk(struct __sk_buff *s, struct sock *sk,
>
>  	switch (skb->protocol) {

[ ... ]

> #if IS_BUILTIN(CONFIG_IPV6)
>  	case htons(ETH_P_IPV6):
> +		if (ipv6_hdr(skb)->nexthdr != IPPROTO_TCP)
> +			return -EINVAL;
> +
>  		ops = &tcp6_request_sock_ops;

For IPv6, ipv6_hdr(skb)->nexthdr gives the type of the header
immediately following the base IPv6 header. If extension headers
are present (hop-by-hop options, routing, etc.), nexthdr would be
the extension header type rather than IPPROTO_TCP, even when the
packet is a valid TCP segment.

Would it be worth using ipv6_find_hdr() here, similar to
bpf_update_srh_state() in the same file, to walk past any extension
headers? The IPv4 check above is fine since ip_hdr(skb)->protocol
always identifies the transport protocol directly.

In practice this is very unlikely to matter since TCP SYN packets
with IPv6 extension headers are essentially non-existent, but the
check as written would reject them.

---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md

CI run summary: https://github.com/kernel-patches/bpf/actions/runs/23735111188

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH bpf v4 1/2] bpf: tcp: Reject non-TCP skb in bpf_sk_assign_tcp_reqsk()
  2026-03-30  9:00   ` bot+bpf-ci
@ 2026-03-30  9:25     ` Jiayuan Chen
  2026-03-31  1:35       ` Kuniyuki Iwashima
  0 siblings, 1 reply; 9+ messages in thread
From: Jiayuan Chen @ 2026-03-30  9:25 UTC (permalink / raw)
  To: bot+bpf-ci, bpf, clm, Kuniyuki Iwashima
  Cc: kuniyu, martin.lau, daniel, john.fastabend, sdf, ast, andrii,
	eddyz87, song, yonghong.song, kpsingh, haoluo, jolsa, davem,
	edumazet, kuba, pabeni, horms, shuah, netdev, linux-kernel,
	linux-kselftest, martin.lau, clm, ihor.solodrai


On 3/30/26 5:00 PM, bot+bpf-ci@kernel.org wrote:
>> #if IS_BUILTIN(CONFIG_IPV6)
>>   	case htons(ETH_P_IPV6):
>> +		if (ipv6_hdr(skb)->nexthdr != IPPROTO_TCP)
>> +			return -EINVAL;
>> +
>>   		ops = &tcp6_request_sock_ops;
> For IPv6, ipv6_hdr(skb)->nexthdr gives the type of the header
> immediately following the base IPv6 header. If extension headers
> are present (hop-by-hop options, routing, etc.), nexthdr would be
> the extension header type rather than IPPROTO_TCP, even when the
> packet is a valid TCP segment.
>
> Would it be worth using ipv6_find_hdr() here, similar to
> bpf_update_srh_state() in the same file, to walk past any extension
> headers? The IPv4 check above is fine since ip_hdr(skb)->protocol
> always identifies the transport protocol directly.
>
> In practice this is very unlikely to matter since TCP SYN packets
> with IPv6 extension headers are essentially non-existent, but the
> check as written would reject them.
>
>
> ---
> AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
> See:https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
>
> CI run summary:https://github.com/kernel-patches/bpf/actions/runs/23735111188


Thanks for the analysis.

There are many places in the kernel that check nexthdr directly without
walking extension headers.

I'd prefer to keep it simple for now and leave ipv6_find_hdr() as a
potential future improvement if needed.

But happy to change if reviewersthink otherwise.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH bpf v4 1/2] bpf: tcp: Reject non-TCP skb in bpf_sk_assign_tcp_reqsk()
  2026-03-30  9:25     ` Jiayuan Chen
@ 2026-03-31  1:35       ` Kuniyuki Iwashima
  2026-03-31  3:18         ` Martin KaFai Lau
  0 siblings, 1 reply; 9+ messages in thread
From: Kuniyuki Iwashima @ 2026-03-31  1:35 UTC (permalink / raw)
  To: Jiayuan Chen
  Cc: bot+bpf-ci, bpf, clm, martin.lau, daniel, john.fastabend, sdf,
	ast, andrii, eddyz87, song, yonghong.song, kpsingh, haoluo, jolsa,
	davem, edumazet, kuba, pabeni, horms, shuah, netdev, linux-kernel,
	linux-kselftest, martin.lau, ihor.solodrai

On Mon, Mar 30, 2026 at 2:26 AM Jiayuan Chen <jiayuan.chen@linux.dev> wrote:
>
>
> On 3/30/26 5:00 PM, bot+bpf-ci@kernel.org wrote:
> >> #if IS_BUILTIN(CONFIG_IPV6)
> >>      case htons(ETH_P_IPV6):
> >> +            if (ipv6_hdr(skb)->nexthdr != IPPROTO_TCP)
> >> +                    return -EINVAL;
> >> +
> >>              ops = &tcp6_request_sock_ops;
> > For IPv6, ipv6_hdr(skb)->nexthdr gives the type of the header
> > immediately following the base IPv6 header. If extension headers
> > are present (hop-by-hop options, routing, etc.), nexthdr would be
> > the extension header type rather than IPPROTO_TCP, even when the
> > packet is a valid TCP segment.
> >
> > Would it be worth using ipv6_find_hdr() here, similar to
> > bpf_update_srh_state() in the same file, to walk past any extension
> > headers? The IPv4 check above is fine since ip_hdr(skb)->protocol
> > always identifies the transport protocol directly.
> >
> > In practice this is very unlikely to matter since TCP SYN packets
> > with IPv6 extension headers are essentially non-existent, but the
> > check as written would reject them.
> >
> >
> > ---
> > AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
> > See:https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
> >
> > CI run summary:https://github.com/kernel-patches/bpf/actions/runs/23735111188
>
>
> Thanks for the analysis.
>
> There are many places in the kernel that check nexthdr directly without
> walking extension headers.
>
> I'd prefer to keep it simple for now and leave ipv6_find_hdr() as a
> potential future improvement if needed.

+1.

Given this feature is to create a reqsk to process on this running
host, it does not make sense to support such ext options.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH bpf v4 2/2] selftests/bpf: Add protocol check test for bpf_sk_assign_tcp_reqsk()
  2026-03-30  8:07 ` [PATCH bpf v4 2/2] selftests/bpf: Add protocol check test for bpf_sk_assign_tcp_reqsk() Jiayuan Chen
@ 2026-03-31  1:50   ` Kuniyuki Iwashima
  0 siblings, 0 replies; 9+ messages in thread
From: Kuniyuki Iwashima @ 2026-03-31  1:50 UTC (permalink / raw)
  To: Jiayuan Chen
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Shuah Khan, netdev, linux-kernel, linux-kselftest

On Mon, Mar 30, 2026 at 1:08 AM Jiayuan Chen <jiayuan.chen@linux.dev> wrote:
>
> Add test_tcp_custom_syncookie_protocol_check to verify that
> bpf_sk_assign_tcp_reqsk() rejects non-TCP skbs. The test sends a UDP
> packet through TC ingress where a BPF program calls
> bpf_sk_assign_tcp_reqsk() on it and checks that the kfunc returns an
> error. A UDP server recv() is used as synchronization to ensure the
> BPF program has finished processing before checking the result.
>
> Without the fix in bpf_sk_assign_tcp_reqsk(), the kfunc succeeds and
> attaches a TCP reqsk to the UDP skb, which causes a null pointer
> dereference panic when the kernel processes it through the UDP receive
> path.
>
> Test result:
>
>   ./test_progs -a tcp_custom_syncookie_protocol_check -v
>   setup_netns:PASS:create netns 0 nsec
>   setup_netns:PASS:ip 0 nsec
>   write_sysctl:PASS:open sysctl 0 nsec
>   write_sysctl:PASS:write sysctl 0 nsec
>   setup_netns:PASS:write_sysctl 0 nsec
>   test_tcp_custom_syncookie_protocol_check:PASS:open_and_load 0 nsec
>   setup_tc:PASS:qdisc add dev lo clsact 0 nsec
>   setup_tc:PASS:filter add dev lo ingress 0 nsec
>   run_protocol_check:PASS:start tcp_server 0 nsec
>   run_protocol_check:PASS:start udp_server 0 nsec
>   run_protocol_check:PASS:connect udp_client 0 nsec
>   run_protocol_check:PASS:send udp 0 nsec
>   run_protocol_check:PASS:recv udp 0 nsec
>   run_protocol_check:PASS:udp_intercepted 0 nsec
>   run_protocol_check:PASS:assign_ret 0 nsec
>   #471/1   tcp_custom_syncookie_protocol_check/IPv4 TCP:OK
>   run_protocol_check:PASS:start tcp_server 0 nsec
>   run_protocol_check:PASS:start udp_server 0 nsec
>   run_protocol_check:PASS:connect udp_client 0 nsec
>   run_protocol_check:PASS:send udp 0 nsec
>   run_protocol_check:PASS:recv udp 0 nsec
>   run_protocol_check:PASS:udp_intercepted 0 nsec
>   run_protocol_check:PASS:assign_ret 0 nsec
>   #471/2   tcp_custom_syncookie_protocol_check/IPv6 TCP:OK
>   #471     tcp_custom_syncookie_protocol_check:OK
>   Summary: 1/2 PASSED, 0 SKIPPED, 0 FAILED
>
> Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
> ---
>  .../bpf/prog_tests/tcp_custom_syncookie.c     |  91 ++++++++++++++-
>  .../bpf/progs/test_tcp_custom_syncookie.c     | 109 ++++++++++++++++++
>  2 files changed, 196 insertions(+), 4 deletions(-)
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/tcp_custom_syncookie.c b/tools/testing/selftests/bpf/prog_tests/tcp_custom_syncookie.c
> index eaf441dc7e79..6e29402c4c59 100644
> --- a/tools/testing/selftests/bpf/prog_tests/tcp_custom_syncookie.c
> +++ b/tools/testing/selftests/bpf/prog_tests/tcp_custom_syncookie.c
> @@ -5,6 +5,7 @@
>  #include <sched.h>
>  #include <stdlib.h>
>  #include <net/if.h>
> +#include <netinet/in.h>
>
>  #include "test_progs.h"
>  #include "cgroup_helpers.h"
> @@ -47,11 +48,10 @@ static int setup_netns(void)
>         return -1;
>  }
>
> -static int setup_tc(struct test_tcp_custom_syncookie *skel)
> +static int setup_tc(int prog_fd)
>  {
>         LIBBPF_OPTS(bpf_tc_hook, qdisc_lo, .attach_point = BPF_TC_INGRESS);
> -       LIBBPF_OPTS(bpf_tc_opts, tc_attach,
> -                   .prog_fd = bpf_program__fd(skel->progs.tcp_custom_syncookie));
> +       LIBBPF_OPTS(bpf_tc_opts, tc_attach, .prog_fd = prog_fd);
>
>         qdisc_lo.ifindex = if_nametoindex("lo");
>         if (!ASSERT_OK(bpf_tc_hook_create(&qdisc_lo), "qdisc add dev lo clsact"))
> @@ -127,7 +127,7 @@ void test_tcp_custom_syncookie(void)
>         if (!ASSERT_OK_PTR(skel, "open_and_load"))
>                 return;
>
> -       if (setup_tc(skel))
> +       if (setup_tc(bpf_program__fd(skel->progs.tcp_custom_syncookie)))
>                 goto destroy_skel;
>
>         for (i = 0; i < ARRAY_SIZE(test_cases); i++) {
> @@ -145,6 +145,89 @@ void test_tcp_custom_syncookie(void)
>
>  destroy_skel:
>         system("tc qdisc del dev lo clsact");
> +       test_tcp_custom_syncookie__destroy(skel);
> +}
> +
> +/* Test: bpf_sk_assign_tcp_reqsk() should reject non-TCP skb.
> + *
> + * Send a UDP packet through TC ingress where a BPF program calls
> + * bpf_sk_assign_tcp_reqsk() on it. The kfunc should return an error
> + * because the skb carries UDP, not TCP.
> + *
> + * TCP and UDP servers share the same port. The BPF program intercepts
> + * the UDP packet, looks up the TCP listener via the dest port, and
> + * attempts to assign a TCP reqsk to the UDP skb.
> + */
> +static void run_protocol_check(struct test_tcp_custom_syncookie *skel,
> +                              int family, const char *addr)
> +{
> +       int tcp_server = -1, udp_server = -1, udp_client = -1;

nit: no need to init.

> +       char buf[32] = "test";

should be buf[] = "test" since you send data of sizeof(buf) below.


> +       int port, ret;
> +
> +       tcp_server = start_server(family, SOCK_STREAM, addr, 0, 0);
> +       if (!ASSERT_NEQ(tcp_server, -1, "start tcp_server"))
> +               return;
> +
> +       port = ntohs(get_socket_local_port(tcp_server));
> +
> +       /* UDP server on same port for synchronization and port sharing */
> +       udp_server = start_server(family, SOCK_DGRAM, addr, port, 0);
> +       if (!ASSERT_NEQ(udp_server, -1, "start udp_server"))
> +               goto close_tcp;
> +
> +       skel->bss->udp_intercepted = false;
> +       skel->bss->assign_ret = 0;
> +
> +       udp_client = connect_to_fd(udp_server, 0);
> +       if (!ASSERT_NEQ(udp_client, -1, "connect udp_client"))
> +               goto close_udp_server;
>
> +       ret = send(udp_client, buf, sizeof(buf), 0);
> +       if (!ASSERT_EQ(ret, sizeof(buf), "send udp"))
> +               goto close_udp_client;

memset(buf, 0, sizeof(buf)) here and

> +
> +       /* recv() ensures TC ingress BPF has processed the skb */
> +       ret = recv(udp_server, buf, sizeof(buf), 0);
> +       if (!ASSERT_EQ(ret, sizeof(buf), "recv udp"))

check ASSERT_STREQ() here ?

> +               goto close_udp_client;
> +
> +       ASSERT_EQ(skel->bss->udp_intercepted, true, "udp_intercepted");
> +
> +       ASSERT_EQ(skel->bss->assign_ret, -EINVAL, "assign_ret");
> +
> +close_udp_client:
> +       close(udp_client);
> +close_udp_server:
> +       close(udp_server);
> +close_tcp:
> +       close(tcp_server);
> +}
> +
> +void test_tcp_custom_syncookie_protocol_check(void)
> +{
> +       struct test_tcp_custom_syncookie *skel;
> +       int i;
> +
> +       if (setup_netns())
> +               return;
> +
> +       skel = test_tcp_custom_syncookie__open_and_load();
> +       if (!ASSERT_OK_PTR(skel, "open_and_load"))
> +               return;
> +
> +       if (setup_tc(bpf_program__fd(skel->progs.tcp_custom_syncookie_badproto)))
> +               goto destroy_skel;
> +
> +       for (i = 0; i < ARRAY_SIZE(test_cases); i++) {
> +               if (!test__start_subtest(test_cases[i].name))
> +                       continue;
> +
> +               run_protocol_check(skel, test_cases[i].family,
> +                                  test_cases[i].addr);
> +       }
> +
> +destroy_skel:
> +       system("tc qdisc del dev lo clsact");
>         test_tcp_custom_syncookie__destroy(skel);
>  }
> diff --git a/tools/testing/selftests/bpf/progs/test_tcp_custom_syncookie.c b/tools/testing/selftests/bpf/progs/test_tcp_custom_syncookie.c
> index 7d5293de1952..bd3fad3dd503 100644
> --- a/tools/testing/selftests/bpf/progs/test_tcp_custom_syncookie.c
> +++ b/tools/testing/selftests/bpf/progs/test_tcp_custom_syncookie.c
> @@ -588,4 +588,113 @@ int tcp_custom_syncookie(struct __sk_buff *skb)
>         return tcp_handle_ack(&ctx);
>  }
>
> +/* Test: call bpf_sk_assign_tcp_reqsk() on a UDP skb.
> + * The kfunc should reject it because the skb is not TCP.
> + *
> + * TCP and UDP servers share the same port. The BPF program intercepts
> + * UDP packets, looks up the TCP listener on the same port, and tries
> + * to assign a TCP reqsk to the UDP skb.
> + */
> +int assign_ret = 0;
> +bool udp_intercepted = false;

No need to init var on bss w/ 0 here.
(init in run_protocol_check() is necessary)


> +
> +static int badproto_lookup_assign(struct __sk_buff *skb, struct udphdr *udp,
> +                                 struct bpf_sock_tuple *tuple, u32 tuple_size)
> +{
> +       struct bpf_tcp_req_attrs attrs = {};
> +       struct bpf_sock *skc;
> +       struct sock *sk;
> +
> +       skc = bpf_skc_lookup_tcp(skb, tuple, tuple_size, -1, 0);
> +       if (!skc)
> +               return TC_ACT_OK;
> +
> +       if (skc->state != TCP_LISTEN) {
> +               bpf_sk_release(skc);
> +               return TC_ACT_OK;
> +       }
> +
> +       sk = (struct sock *)bpf_skc_to_tcp_sock(skc);
> +       if (!sk) {
> +               bpf_sk_release(skc);
> +               return TC_ACT_OK;
> +       }
> +
> +       attrs.mss = 1460;
> +       attrs.wscale_ok = 1;
> +       attrs.snd_wscale = 7;
> +       attrs.rcv_wscale = 7;
> +       attrs.sack_ok = 1;
> +
> +       assign_ret = bpf_sk_assign_tcp_reqsk(skb, sk, &attrs, sizeof(attrs));
> +
> +       bpf_sk_release(skc);
> +       return TC_ACT_OK;
> +}
> +
> +SEC("tc")
> +int tcp_custom_syncookie_badproto(struct __sk_buff *skb)
> +{
> +       void *data = (void *)(long)skb->data;
> +       void *data_end = (void *)(long)skb->data_end;
> +       struct bpf_sock_tuple tuple = {};
> +       struct ethhdr *eth;
> +       struct iphdr *iph;
> +       struct ipv6hdr *ip6h;
> +       struct udphdr *udp;
> +
> +       eth = (struct ethhdr *)data;
> +       if (eth + 1 > data_end)
> +               return TC_ACT_OK;
> +
> +       switch (bpf_ntohs(eth->h_proto)) {
> +       case ETH_P_IP:
> +               iph = (struct iphdr *)(eth + 1);
> +               if (iph + 1 > data_end)
> +                       return TC_ACT_OK;
> +
> +               if (iph->protocol != IPPROTO_UDP)
> +                       return TC_ACT_OK;
> +
> +               udp = (struct udphdr *)(iph + 1);
> +               if (udp + 1 > data_end)
> +                       return TC_ACT_OK;
> +
> +               udp_intercepted = true;
> +
> +               tuple.ipv4.saddr = iph->saddr;
> +               tuple.ipv4.daddr = iph->daddr;
> +               tuple.ipv4.sport = udp->source;
> +               tuple.ipv4.dport = udp->dest;
> +
> +               return badproto_lookup_assign(skb, udp, &tuple,
> +                                             sizeof(tuple.ipv4));
> +       case ETH_P_IPV6:
> +               ip6h = (struct ipv6hdr *)(eth + 1);
> +               if (ip6h + 1 > data_end)
> +                       return TC_ACT_OK;
> +
> +               if (ip6h->nexthdr != IPPROTO_UDP)
> +                       return TC_ACT_OK;
> +
> +               udp = (struct udphdr *)(ip6h + 1);
> +               if (udp + 1 > data_end)
> +                       return TC_ACT_OK;
> +
> +               udp_intercepted = true;
> +
> +               __builtin_memcpy(tuple.ipv6.saddr, &ip6h->saddr,
> +                                sizeof(tuple.ipv6.saddr));
> +               __builtin_memcpy(tuple.ipv6.daddr, &ip6h->daddr,
> +                                sizeof(tuple.ipv6.daddr));
> +               tuple.ipv6.sport = udp->source;
> +               tuple.ipv6.dport = udp->dest;
> +
> +               return badproto_lookup_assign(skb, udp, &tuple,
> +                                             sizeof(tuple.ipv6));
> +       default:
> +               return TC_ACT_OK;
> +       }
> +}
> +
>  char _license[] SEC("license") = "GPL";
> --
> 2.43.0
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH bpf v4 1/2] bpf: tcp: Reject non-TCP skb in bpf_sk_assign_tcp_reqsk()
  2026-03-31  1:35       ` Kuniyuki Iwashima
@ 2026-03-31  3:18         ` Martin KaFai Lau
  2026-03-31 22:04           ` Kuniyuki Iwashima
  0 siblings, 1 reply; 9+ messages in thread
From: Martin KaFai Lau @ 2026-03-31  3:18 UTC (permalink / raw)
  To: Kuniyuki Iwashima, Jiayuan Chen
  Cc: bot+bpf-ci, bpf, clm, daniel, john.fastabend, sdf, ast, andrii,
	eddyz87, song, yonghong.song, kpsingh, haoluo, jolsa, davem,
	edumazet, kuba, pabeni, horms, shuah, netdev, linux-kernel,
	linux-kselftest, martin.lau, ihor.solodrai

On 3/30/26 6:35 PM, Kuniyuki Iwashima wrote:
> On Mon, Mar 30, 2026 at 2:26 AM Jiayuan Chen <jiayuan.chen@linux.dev> wrote:
>>
>>
>> On 3/30/26 5:00 PM, bot+bpf-ci@kernel.org wrote:
>>>> #if IS_BUILTIN(CONFIG_IPV6)
>>>>       case htons(ETH_P_IPV6):
>>>> +            if (ipv6_hdr(skb)->nexthdr != IPPROTO_TCP)
>>>> +                    return -EINVAL;
>>>> +
>>>>               ops = &tcp6_request_sock_ops;
>>> For IPv6, ipv6_hdr(skb)->nexthdr gives the type of the header
>>> immediately following the base IPv6 header. If extension headers
>>> are present (hop-by-hop options, routing, etc.), nexthdr would be
>>> the extension header type rather than IPPROTO_TCP, even when the
>>> packet is a valid TCP segment.
>>>
>>> Would it be worth using ipv6_find_hdr() here, similar to
>>> bpf_update_srh_state() in the same file, to walk past any extension
>>> headers? The IPv4 check above is fine since ip_hdr(skb)->protocol
>>> always identifies the transport protocol directly.
>>>
>>> In practice this is very unlikely to matter since TCP SYN packets
>>> with IPv6 extension headers are essentially non-existent, but the
>>> check as written would reject them.
>>>
>>>
>>> ---
>>> AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
>>> See:https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
>>>
>>> CI run summary:https://github.com/kernel-patches/bpf/actions/runs/23735111188
>>
>>
>> Thanks for the analysis.
>>
>> There are many places in the kernel that check nexthdr directly without
>> walking extension headers.
>>
>> I'd prefer to keep it simple for now and leave ipv6_find_hdr() as a
>> potential future improvement if needed.
> 
> +1.
> 
> Given this feature is to create a reqsk to process on this running
> host, it does not make sense to support such ext options.

While at header reading, does it need a pskb_may_pull() before reading?

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH bpf v4 1/2] bpf: tcp: Reject non-TCP skb in bpf_sk_assign_tcp_reqsk()
  2026-03-31  3:18         ` Martin KaFai Lau
@ 2026-03-31 22:04           ` Kuniyuki Iwashima
  0 siblings, 0 replies; 9+ messages in thread
From: Kuniyuki Iwashima @ 2026-03-31 22:04 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: Jiayuan Chen, bot+bpf-ci, bpf, clm, daniel, john.fastabend, sdf,
	ast, andrii, eddyz87, song, yonghong.song, kpsingh, haoluo, jolsa,
	davem, edumazet, kuba, pabeni, horms, shuah, netdev, linux-kernel,
	linux-kselftest, martin.lau, ihor.solodrai

On Mon, Mar 30, 2026 at 8:19 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>
> On 3/30/26 6:35 PM, Kuniyuki Iwashima wrote:
> > On Mon, Mar 30, 2026 at 2:26 AM Jiayuan Chen <jiayuan.chen@linux.dev> wrote:
> >>
> >>
> >> On 3/30/26 5:00 PM, bot+bpf-ci@kernel.org wrote:
> >>>> #if IS_BUILTIN(CONFIG_IPV6)
> >>>>       case htons(ETH_P_IPV6):
> >>>> +            if (ipv6_hdr(skb)->nexthdr != IPPROTO_TCP)
> >>>> +                    return -EINVAL;
> >>>> +
> >>>>               ops = &tcp6_request_sock_ops;
> >>> For IPv6, ipv6_hdr(skb)->nexthdr gives the type of the header
> >>> immediately following the base IPv6 header. If extension headers
> >>> are present (hop-by-hop options, routing, etc.), nexthdr would be
> >>> the extension header type rather than IPPROTO_TCP, even when the
> >>> packet is a valid TCP segment.
> >>>
> >>> Would it be worth using ipv6_find_hdr() here, similar to
> >>> bpf_update_srh_state() in the same file, to walk past any extension
> >>> headers? The IPv4 check above is fine since ip_hdr(skb)->protocol
> >>> always identifies the transport protocol directly.
> >>>
> >>> In practice this is very unlikely to matter since TCP SYN packets
> >>> with IPv6 extension headers are essentially non-existent, but the
> >>> check as written would reject them.
> >>>
> >>>
> >>> ---
> >>> AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
> >>> See:https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
> >>>
> >>> CI run summary:https://github.com/kernel-patches/bpf/actions/runs/23735111188
> >>
> >>
> >> Thanks for the analysis.
> >>
> >> There are many places in the kernel that check nexthdr directly without
> >> walking extension headers.
> >>
> >> I'd prefer to keep it simple for now and leave ipv6_find_hdr() as a
> >> potential future improvement if needed.
> >
> > +1.
> >
> > Given this feature is to create a reqsk to process on this running
> > host, it does not make sense to support such ext options.
>
> While at header reading, does it need a pskb_may_pull() before reading?

Ah good point, now that we read skb->data, pskb_may_pull() is needed indeed.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2026-03-31 22:04 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-30  8:07 [PATCH bpf v4 0/2] bpf: tcp: Fix null-ptr-deref in arbitrary SYN Cookie Jiayuan Chen
2026-03-30  8:07 ` [PATCH bpf v4 1/2] bpf: tcp: Reject non-TCP skb in bpf_sk_assign_tcp_reqsk() Jiayuan Chen
2026-03-30  9:00   ` bot+bpf-ci
2026-03-30  9:25     ` Jiayuan Chen
2026-03-31  1:35       ` Kuniyuki Iwashima
2026-03-31  3:18         ` Martin KaFai Lau
2026-03-31 22:04           ` Kuniyuki Iwashima
2026-03-30  8:07 ` [PATCH bpf v4 2/2] selftests/bpf: Add protocol check test for bpf_sk_assign_tcp_reqsk() Jiayuan Chen
2026-03-31  1:50   ` Kuniyuki Iwashima

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox