Netdev List

* [PATCH net] net: thunderbolt: Fix frags[] overflow by bounding frame_count
From: Maoyi Xie @ 2026-06-16 17:38 UTC
  To: Mika Westerberg, Yehezkel Bernat, Andrew Lunn, Jakub Kicinski,
	Paolo Abeni
  Cc: David S. Miller, Eric Dumazet, netdev, linux-kernel

tbnet_poll() assembles a multi-frame ThunderboltIP packet into one skb. The
first frame goes into the skb linear area and every further frame is added as
a page fragment.

	skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags,
			page, hdr_size, frame_size,
			TBNET_RX_PAGE_SIZE - hdr_size);

A packet of frame_count frames therefore ends up with frame_count - 1
fragments. tbnet_check_frame() only bounds the peer supplied frame_count to
TBNET_RING_SIZE / 4 (64), which is far above MAX_SKB_FRAGS (17 by default). A
peer that sends a packet of 19 or more small frames pushes nr_frags past
MAX_SKB_FRAGS, so skb_add_rx_frag() writes past skb_shinfo()->frags[] and
corrupts memory after the shared info.

Tighten the start of packet bound to MAX_SKB_FRAGS + 1 so a packet can never
produce more fragments than frags[] can hold. This matches the recent skb
frags overflow fixes in other receive paths, for example f0813bcd2d9d ("net:
wwan: t7xx: fix potential skb->frags overflow in RX path") and 600dc40554dc
("net: usb: cdc-phonet: fix skb frags[] overflow in rx_complete()").
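
The arithmetic above can be checked with a small userspace model. This is
only a sketch and not driver code: the model_* names are hypothetical, and
the constants are the values quoted in this message (MAX_SKB_FRAGS of 17 by
default, an old bound of 64), hard-coded here as assumptions.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: default MAX_SKB_FRAGS, as quoted in the commit message. */
#define MODEL_MAX_SKB_FRAGS 17
/* Assumption: old bound TBNET_RING_SIZE / 4, quoted as 64. */
#define MODEL_OLD_BOUND 64

/* The first frame lands in the linear area, so it needs no fragment slot;
 * every further frame consumes one slot in frags[].
 */
static unsigned int model_frags_needed(unsigned int frame_count)
{
	assert(frame_count > 0);
	return frame_count - 1;
}

/* Start-of-packet check as it was before the patch. */
static bool model_old_check(unsigned int frame_count)
{
	return frame_count != 0 && frame_count <= MODEL_OLD_BOUND;
}

/* Start-of-packet check with the tightened bound. */
static bool model_new_check(unsigned int frame_count)
{
	return frame_count != 0 && frame_count <= MODEL_MAX_SKB_FRAGS + 1;
}

/* True when the packet would need more slots than frags[] provides. */
static bool model_overflows(unsigned int frame_count)
{
	return model_frags_needed(frame_count) > MODEL_MAX_SKB_FRAGS;
}
```

In this model a 19-frame packet passes the old check and needs 18 slots,
while the new check rejects it; an 18-frame packet is accepted and fills
frags[] exactly.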

Fixes: e69b6c02b4c3 ("net: Add support for networking over Thunderbolt cable")
Cc: stable@vger.kernel.org
Signed-off-by: Maoyi Xie <maoyixie.tju@gmail.com>
---
Mika preferred the bound in tbnet_check_frame() over the nr_frags <
MAX_SKB_FRAGS guard in tbnet_poll() that I first floated on the list, so this
rejects the oversized packet up front. Reproduced under KASAN with a harness
that mirrors the per-frame skb_add_rx_frag() loop.

 drivers/net/thunderbolt/main.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/net/thunderbolt/main.c b/drivers/net/thunderbolt/main.c
index 7aae5d915a1e..ac016890646c 100644
--- a/drivers/net/thunderbolt/main.c
+++ b/drivers/net/thunderbolt/main.c
@@ -787,8 +787,12 @@ static bool tbnet_check_frame(struct tbnet *net, const struct tbnet_frame *tf,
 		return true;
 	}

-	/* Start of packet, validate the frame header */
-	if (frame_count == 0 || frame_count > TBNET_RING_SIZE / 4) {
+	/* Start of packet, validate the frame header. tbnet_poll() puts the
+	 * first frame in the skb linear area and every further frame in a page
+	 * fragment, so a packet may not span more than MAX_SKB_FRAGS + 1 frames
+	 * without overflowing skb_shinfo()->frags[].
+	 */
+	if (frame_count == 0 || frame_count > MAX_SKB_FRAGS + 1) {
 		net->stats.rx_length_errors++;
 		return false;
 	}
-- 
2.34.1

* [PATCH net v2] tipc: free bearer discoverer via RCU to fix tipc_disc_rcv UAF
From: Samuel Page @ 2026-06-16 17:53 UTC
  To: Jon Maloy
  Cc: Tung Quang Nguyen, David S . Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, netdev, tipc-discussion, linux-kernel,
	Samuel Page, stable

bearer_disable() tears down a bearer's discovery object with
tipc_disc_delete(), which frees the struct tipc_discoverer with a plain,
synchronous kfree(). The discovery receive path, however, still reads
that object under RCU in softirq context:

  tipc_udp_recv()            // udp_media.c, rcu_dereference(ub->bearer)
    -> tipc_rcv()            // node.c
      -> tipc_disc_rcv()     // discover.c
        -> tipc_disc_addr_trial_msg(b->disc, ...)  // reads d->net etc.

tipc_udp_recv() only gates this path on test_bit(0, &b->up), which is a
TOCTOU check: an RX softirq that observes b->up == 1 before
bearer_disable() does clear_bit_unlock(0, &b->up) can still be executing
inside tipc_disc_rcv() when bearer_disable() reaches

	if (b->disc)
		tipc_disc_delete(b->disc);

and kfree()s the discoverer. The reader then dereferences freed memory
(d->net, inlined into tipc_disc_rcv()) in softirq context [0].

The bearer itself is freed RCU-safely (tipc_bearer_put() ->
kfree_rcu(b, rcu)) because the RX path runs under RCU, but the discoverer
hanging off b->disc is freed synchronously. The same b->disc is also
touched under rcu_read_lock() by
tipc_disc_add_dest()/tipc_disc_remove_dest().

Free the discoverer with the same RCU lifetime as its bearer. Add an
rcu_head to struct tipc_discoverer and defer the kfree_skb()/kfree() to
an RCU callback so any in-flight reader that already loaded b->disc
completes before the memory is released. The timer is still shut down
synchronously up front with timer_shutdown_sync() (which can sleep and
must not run from the RCU callback), and shutting it down before the
grace period prevents the periodic LINK_REQUEST timer from rearming or
re-entering the object.

This mirrors the existing TIPC pattern of pairing call_rcu() with a
cleanup callback (see tipc_node_free()/tipc_aead_free()).
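
For readers less familiar with the pattern, the deferred-free idea can be
sketched in plain userspace C. Everything prefixed mock_ is a hypothetical
stand-in written for this illustration, not a kernel API: the "grace
period" is just an explicit function call, and no real concurrency or RCU
semantics are modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for struct rcu_head (assumption, not the kernel type). */
struct mock_rcu_head {
	struct mock_rcu_head *next;
	void (*func)(struct mock_rcu_head *head);
};

#define mock_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static struct mock_rcu_head *mock_pending;
static int mock_freed_count;

/* Queue the callback; nothing is freed at this point. */
static void mock_call_rcu(struct mock_rcu_head *head,
			  void (*func)(struct mock_rcu_head *head))
{
	head->func = func;
	head->next = mock_pending;
	mock_pending = head;
}

/* Pretend every pre-existing reader has finished: run queued callbacks. */
static void mock_grace_period_elapsed(void)
{
	while (mock_pending) {
		struct mock_rcu_head *head = mock_pending;

		/* Unlink first; the callback frees the enclosing object. */
		mock_pending = head->next;
		head->func(head);
	}
}

struct mock_discoverer {
	struct mock_rcu_head rcu;
	int net_id;
};

/* Callback: recover the object from its embedded head, then free it. */
static void mock_disc_free_rcu(struct mock_rcu_head *rp)
{
	struct mock_discoverer *d;

	d = mock_container_of(rp, struct mock_discoverer, rcu);
	free(d);
	mock_freed_count++;
}

static struct mock_discoverer *mock_disc_create(int net_id)
{
	struct mock_discoverer *d = calloc(1, sizeof(*d));

	assert(d);
	d->net_id = net_id;
	return d;
}

/* Teardown defers the free instead of freeing synchronously. */
static void mock_disc_delete(struct mock_discoverer *d)
{
	mock_call_rcu(&d->rcu, mock_disc_free_rcu);
}
```

After mock_disc_delete() the object stays readable until
mock_grace_period_elapsed() runs, which is the property the patch relies on
for a reader that already loaded b->disc.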

[0]: (trailing page/memory-state dump trimmed)
BUG: KASAN: slab-use-after-free in tipc_disc_addr_trial_msg net/tipc/discover.c:149 [inline]
BUG: KASAN: slab-use-after-free in tipc_disc_rcv+0xe7c/0x103c net/tipc/discover.c:236
Read of size 8 at addr ffff000028f07428 by task ksoftirqd/0/15

CPU: 0 UID: 0 PID: 15 Comm: ksoftirqd/0 Not tainted 7.0.11 #3 PREEMPT
Hardware name: linux,dummy-virt (DT)
Call trace:
 show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:499 (C)
 __dump_stack lib/dump_stack.c:94 [inline]
 dump_stack_lvl+0xb4/0xd4 lib/dump_stack.c:120
 print_address_description mm/kasan/report.c:378 [inline]
 print_report+0x118/0x5d8 mm/kasan/report.c:482
 kasan_report+0xb0/0xf4 mm/kasan/report.c:595
 __asan_report_load8_noabort+0x20/0x2c mm/kasan/report_generic.c:381
 tipc_disc_addr_trial_msg net/tipc/discover.c:149 [inline]
 tipc_disc_rcv+0xe7c/0x103c net/tipc/discover.c:236
 tipc_rcv+0x1884/0x2b1c net/tipc/node.c:2126
 tipc_udp_recv+0x22c/0x684 net/tipc/udp_media.c:393
 udp_queue_rcv_one_skb+0x898/0x1798 net/ipv4/udp.c:2441
 udp_queue_rcv_skb+0x1b0/0xa44 net/ipv4/udp.c:2518
 udp_unicast_rcv_skb+0x13c/0x348 net/ipv4/udp.c:2678
 __udp4_lib_rcv+0x1aec/0x246c net/ipv4/udp.c:2754
 udp_rcv+0x78/0xa0 net/ipv4/udp.c:2936
 ip_protocol_deliver_rcu+0x68/0x410 net/ipv4/ip_input.c:207
 ip_local_deliver_finish+0x28c/0x4b4 net/ipv4/ip_input.c:241
 NF_HOOK include/linux/netfilter.h:318 [inline]
 NF_HOOK include/linux/netfilter.h:312 [inline]
 ip_local_deliver+0x29c/0x2ec net/ipv4/ip_input.c:262
 dst_input include/net/dst.h:480 [inline]
 ip_rcv_finish net/ipv4/ip_input.c:453 [inline]
 ip_rcv_finish net/ipv4/ip_input.c:439 [inline]
 NF_HOOK include/linux/netfilter.h:318 [inline]
 NF_HOOK include/linux/netfilter.h:312 [inline]
 ip_rcv+0x21c/0x258 net/ipv4/ip_input.c:573
 __netif_receive_skb_one_core+0x110/0x184 net/core/dev.c:6195
 __netif_receive_skb+0x2c/0x170 net/core/dev.c:6308
 process_backlog+0x178/0x488 net/core/dev.c:6659
 __napi_poll+0xa8/0x540 net/core/dev.c:7726
 napi_poll net/core/dev.c:7789 [inline]
 net_rx_action+0x360/0x964 net/core/dev.c:7946
 handle_softirqs+0x2f0/0x7b0 kernel/softirq.c:622
 run_ksoftirqd kernel/softirq.c:1063 [inline]
 run_ksoftirqd+0x6c/0x88 kernel/softirq.c:1055
 smpboot_thread_fn+0x65c/0x958 kernel/smpboot.c:160
 kthread+0x39c/0x444 kernel/kthread.c:436
 ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:860

Allocated by task 68873:
 kasan_save_stack+0x3c/0x64 mm/kasan/common.c:57
 kasan_save_track+0x20/0x3c mm/kasan/common.c:78
 kasan_save_alloc_info+0x40/0x54 mm/kasan/generic.c:570
 poison_kmalloc_redzone mm/kasan/common.c:398 [inline]
 __kasan_kmalloc+0xd4/0xd8 mm/kasan/common.c:415
 kasan_kmalloc include/linux/kasan.h:263 [inline]
 __kmalloc_cache_noprof+0x1b0/0x458 mm/slub.c:5385
 kmalloc_noprof include/linux/slab.h:950 [inline]
 tipc_disc_create+0xdc/0x5e0 net/tipc/discover.c:356
 tipc_enable_bearer+0x8b8/0xf94 net/tipc/bearer.c:348
 __tipc_nl_bearer_enable+0x2a8/0x398 net/tipc/bearer.c:1047
 tipc_nl_bearer_enable+0x2c/0x48 net/tipc/bearer.c:1056
 genl_family_rcv_msg_doit+0x1e4/0x2c0 net/netlink/genetlink.c:1114
 genl_family_rcv_msg net/netlink/genetlink.c:1194 [inline]
 genl_rcv_msg+0x4e8/0x750 net/netlink/genetlink.c:1209
 netlink_rcv_skb+0x204/0x3cc net/netlink/af_netlink.c:2550
 genl_rcv+0x3c/0x54 net/netlink/genetlink.c:1218
 netlink_unicast_kernel net/netlink/af_netlink.c:1318 [inline]
 netlink_unicast+0x638/0x930 net/netlink/af_netlink.c:1344
 netlink_sendmsg+0x798/0xc68 net/netlink/af_netlink.c:1894
 sock_sendmsg_nosec net/socket.c:727 [inline]
 __sock_sendmsg+0xe0/0x128 net/socket.c:742
 __sys_sendto+0x230/0x2f4 net/socket.c:2206
 __do_sys_sendto net/socket.c:2213 [inline]
 __se_sys_sendto net/socket.c:2209 [inline]
 __arm64_sys_sendto+0xc4/0x13c net/socket.c:2209
 __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
 invoke_syscall+0x84/0x2a8 arch/arm64/kernel/syscall.c:49
 el0_svc_common.constprop.0+0xe4/0x294 arch/arm64/kernel/syscall.c:132
 do_el0_svc+0x44/0x5c arch/arm64/kernel/syscall.c:151
 el0_svc+0x38/0xac arch/arm64/kernel/entry-common.c:724
 el0t_64_sync_handler+0xa0/0xe4 arch/arm64/kernel/entry-common.c:743
 el0t_64_sync+0x198/0x19c arch/arm64/kernel/entry.S:596

Freed by task 60072:
 kasan_save_stack+0x3c/0x64 mm/kasan/common.c:57
 kasan_save_track+0x20/0x3c mm/kasan/common.c:78
 kasan_save_free_info+0x4c/0x74 mm/kasan/generic.c:584
 poison_slab_object mm/kasan/common.c:253 [inline]
 __kasan_slab_free+0x88/0xb8 mm/kasan/common.c:285
 kasan_slab_free include/linux/kasan.h:235 [inline]
 slab_free_hook mm/slub.c:2685 [inline]
 slab_free mm/slub.c:6170 [inline]
 kfree+0x14c/0x458 mm/slub.c:6488
 tipc_disc_delete+0x50/0x68 net/tipc/discover.c:393
 bearer_disable+0x18c/0x278 net/tipc/bearer.c:418
 tipc_bearer_stop+0xe0/0x198 net/tipc/bearer.c:757
 tipc_net_stop+0x110/0x178 net/tipc/net.c:159
 tipc_exit_net+0x80/0x19c net/tipc/core.c:112
 ops_exit_list net/core/net_namespace.c:199 [inline]
 ops_undo_list+0x244/0x694 net/core/net_namespace.c:252
 cleanup_net+0x3a0/0x830 net/core/net_namespace.c:702
 process_one_work+0x628/0xd38 kernel/workqueue.c:3289
 process_scheduled_works kernel/workqueue.c:3372 [inline]
 worker_thread+0x7a8/0xac0 kernel/workqueue.c:3453
 kthread+0x39c/0x444 kernel/kthread.c:436
 ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:860

Fixes: 25b0b9c4e835 ("tipc: handle collisions of 32-bit node address hash values")
Cc: stable@vger.kernel.org
Assisted-by: Bynario AI
Signed-off-by: Samuel Page <sam@bynar.io>
---
v2:
 - Wrap the over-80-column container_of() line in tipc_disc_free_rcu()
   to fix the coding-style issue raised in review.

v1: https://lore.kernel.org/netdev/20260615144233.1730935-1-sam@bynar.io/

 net/tipc/discover.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/net/tipc/discover.c b/net/tipc/discover.c
index 3e54d2df5683..761b625bba5a 100644
--- a/net/tipc/discover.c
+++ b/net/tipc/discover.c
@@ -49,6 +49,7 @@
 
 /**
  * struct tipc_discoverer - information about an ongoing link setup request
+ * @rcu: RCU head used to free the structure after a grace period
  * @bearer_id: identity of bearer issuing requests
  * @net: network namespace instance
  * @dest: destination address for request messages
@@ -60,6 +61,7 @@
  * @timer_intv: current interval between requests (in ms)
  */
 struct tipc_discoverer {
+	struct rcu_head rcu;
 	u32 bearer_id;
 	struct tipc_media_addr dest;
 	struct net *net;
@@ -382,6 +384,18 @@ int tipc_disc_create(struct net *net, struct tipc_bearer *b,
 	return 0;
 }
 
+/* RCU callback: free the discoverer only after any concurrent
+ * tipc_disc_rcv() softirq reader of bearer->disc has finished.
+ */
+static void tipc_disc_free_rcu(struct rcu_head *rp)
+{
+	struct tipc_discoverer *d;
+
+	d = container_of(rp, struct tipc_discoverer, rcu);
+	kfree_skb(d->skb);
+	kfree(d);
+}
+
 /**
  * tipc_disc_delete - destroy object sending periodic link setup requests
  * @d: ptr to link dest structure
@@ -389,8 +403,7 @@ int tipc_disc_create(struct net *net, struct tipc_bearer *b,
 void tipc_disc_delete(struct tipc_discoverer *d)
 {
 	timer_shutdown_sync(&d->timer);
-	kfree_skb(d->skb);
-	kfree(d);
+	call_rcu(&d->rcu, tipc_disc_free_rcu);
 }
 
 /**

base-commit: 47186409c092cd7dd70350999186c700233e854d
-- 
2.54.0


* Re: [syzbot] [net?] KASAN: slab-use-after-free Read in fib_rules_lookup
From: Kuniyuki Iwashima @ 2026-06-16 17:59 UTC
  To: kuniyu
  Cc: davem, dsahern, edumazet, horms, idosch, kuba, linux-kernel,
	netdev, pabeni, syzbot+965506b59a2de0b6905c, syzkaller-bugs
In-Reply-To: <CAAVpQUB8W6nXOq-OQfSArKC_xzFbQ=dg62Ee3R=0nuX0sW0fMg@mail.gmail.com>

From: Kuniyuki Iwashima <kuniyu@google.com>
Date: Tue, 16 Jun 2026 10:06:55 -0700
> On Tue, Jun 16, 2026 at 8:55 AM Eric Dumazet <edumazet@google.com> wrote:
> >
> > On Tue, Jun 16, 2026 at 8:31 AM Ido Schimmel <idosch@nvidia.com> wrote:
> > >
> > > On Tue, Jun 16, 2026 at 07:05:24AM -0700, syzbot wrote:
> > > > Hello,
> > > >
> > > > syzbot found the following issue on:
> > > >
> > > > HEAD commit:    72dfa4700f78 net: dsa: sja1105: fix lastused timestamp in ..
> > >
> > > This includes commit 759923cf03b0 ("ipv4: fib: Convert
> > > fib_net_exit_batch() to ->exit_rtnl().") that moved ip_fib_net_exit()
> > > (and therefore fib4_rules_exit()) earlier in the netns dismantle path.
> > >
> > > Kuniyuki, can you please take a look?
> > >
> > > You can use this to reproduce:
> > >
> > > #!/bin/bash
> > >
> > > while true; do
> > >         ip netns add ns1
> > >         ip -n ns1 link set dev lo up
> > >         ip -n ns1 address add 192.0.2.1/24 dev lo
> > >         ip -n ns1 link add name dummy1 up type dummy
> > >         ip -n ns1 address add 198.51.100.1/24 dev dummy1
> > >         ip -n ns1 rule add ipproto tcp sport 12345 table 12345
> > >         ip -n ns1 fou add port 5555 ipproto 47 local 192.0.2.1 peer 198.51.100.2 peer_port 54321
> > >         ip netns del ns1
> > > done
> > >
> >
> > Oh right.
> >
> > While looking at this syzbot report I also found an old issue.
> >
> > https://lore.kernel.org/netdev/20260616141317.407791-1-edumazet@google.com/T/#u
> >
> > I guess adding some delays in enqueue_to_backlog() could trigger a
> > similar bug even if we revert Kuniyuki's patch.
> 
> I'll look into it, thank you both !

I'll move fib4_rules_exit() to ->exit().

fib_unmerge() requires RTNL, but it is not needed in ->delete()
in the first place since it's already called in ->configure().

---8<---
diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
index c7d1f31650d7..42212970d735 100644
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -1612,10 +1612,6 @@ static void ip_fib_net_exit(struct net *net)
 			fib_free_table(tb);
 		}
 	}
-
-#ifdef CONFIG_IP_MULTIPLE_TABLES
-	fib4_rules_exit(net);
-#endif
 }
 
 static int __net_init fib_net_init(struct net *net)
@@ -1652,6 +1648,9 @@ static int __net_init fib_net_init(struct net *net)
 	ip_fib_net_exit(net);
 	rtnl_net_unlock(net);
 
+#ifdef CONFIG_IP_MULTIPLE_TABLES
+	fib4_rules_exit(net);
+#endif
 	kfree(net->ipv4.fib_table_hash);
 	fib4_notifier_exit(net);
 	goto out;
@@ -1671,6 +1670,9 @@ static void __net_exit fib_net_exit_rtnl(struct net *net,
 
 static void __net_exit fib_net_exit(struct net *net)
 {
+#ifdef CONFIG_IP_MULTIPLE_TABLES
+	fib4_rules_exit(net);
+#endif
 	kfree(net->ipv4.fib_table_hash);
 	fib4_notifier_exit(net);
 	fib4_semantics_exit(net);
diff --git a/net/ipv4/fib_rules.c b/net/ipv4/fib_rules.c
index 51f0193092f0..0bf6204468c5 100644
--- a/net/ipv4/fib_rules.c
+++ b/net/ipv4/fib_rules.c
@@ -352,12 +352,6 @@ static int fib4_rule_configure(struct fib_rule *rule, struct sk_buff *skb,
 static int fib4_rule_delete(struct fib_rule *rule)
 {
 	struct net *net = rule->fr_net;
-	int err;
-
-	/* split local/main if they are not already split */
-	err = fib_unmerge(net);
-	if (err)
-		goto errout;
 
 #ifdef CONFIG_IP_ROUTE_CLASSID
 	if (((struct fib4_rule *)rule)->tclassid)
@@ -368,8 +362,8 @@ static int fib4_rule_delete(struct fib_rule *rule)
 	if (net->ipv4.fib_rules_require_fldissect &&
 	    fib_rule_requires_fldissect(rule))
 		net->ipv4.fib_rules_require_fldissect--;
-errout:
-	return err;
+
+	return 0;
 }
 
 static int fib4_rule_compare(struct fib_rule *rule, struct fib_rule_hdr *frh,
---8<---



> 
> >
> >
> >
> >
> > > Thanks
> > >
> > > > git tree:       net-next
> > > > console output: https://syzkaller.appspot.com/x/log.txt?x=15794bd2580000
> > > > kernel config:  https://syzkaller.appspot.com/x/.config?x=a0842261b62cdea8
> > > > dashboard link: https://syzkaller.appspot.com/bug?extid=965506b59a2de0b6905c
> > > > compiler:       Debian clang version 22.1.6 (++20260514074242+fc4aad7b5db3-1~exp1~20260514074407.73), Debian LLD 22.1.6
> > > >
> > > > Unfortunately, I don't have any reproducer for this issue yet.
> > > >
> > > > Downloadable assets:
> > > > disk image: https://storage.googleapis.com/syzbot-assets/d4e16f50a97c/disk-72dfa470.raw.xz
> > > > vmlinux: https://storage.googleapis.com/syzbot-assets/6cd4a736e796/vmlinux-72dfa470.xz
> > > > kernel image: https://storage.googleapis.com/syzbot-assets/548b0011c8e8/bzImage-72dfa470.xz
> > > >
> > > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > > Reported-by: syzbot+965506b59a2de0b6905c@syzkaller.appspotmail.com
> > > >
> > > > bond0 (unregistering): Released all slaves
> > > > bond1 (unregistering): Released all slaves
> > > > bond2 (unregistering): (slave dummy0): Releasing active interface
> > > > bond2 (unregistering): Released all slaves
> > > > ==================================================================
> > > > BUG: KASAN: slab-use-after-free in fib_rules_lookup+0x15e/0xeb0 net/core/fib_rules.c:321
> > > > Read of size 8 at addr ffff88804ec4c680 by task kworker/u8:21/12641
> > > >
> > > > CPU: 0 UID: 0 PID: 12641 Comm: kworker/u8:21 Not tainted syzkaller #0 PREEMPT(full)
> > > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/09/2026
> > > > Workqueue: netns cleanup_net
> > > > Call Trace:
> > > >  <TASK>
> > > >  dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
> > > >  print_address_description+0x55/0x1e0 mm/kasan/report.c:378
> > > >  print_report+0x58/0x70 mm/kasan/report.c:482
> > > >  kasan_report+0x117/0x150 mm/kasan/report.c:595
> > > >  fib_rules_lookup+0x15e/0xeb0 net/core/fib_rules.c:321
> > > >  __fib_lookup+0x106/0x210 net/ipv4/fib_rules.c:96
> > > >  ip_route_output_key_hash_rcu+0x294/0x2720 net/ipv4/route.c:2811
> > > >  ip_route_output_key_hash+0x18d/0x2a0 net/ipv4/route.c:2702
> > > >  __ip_route_output_key include/net/route.h:169 [inline]
> > > >  ip_route_output_flow+0x2a/0x150 net/ipv4/route.c:2929
> > > >  ip4_datagram_release_cb+0x89d/0xbe0 net/ipv4/datagram.c:118
> > > >  release_sock+0x206/0x260 net/core/sock.c:3861
> > > >  inet_shutdown+0x2b1/0x390 net/ipv4/af_inet.c:950
> > > >  udp_tunnel_sock_release+0x6d/0x80 net/ipv4/udp_tunnel_core.c:197
> > > >  fou_release net/ipv4/fou_core.c:562 [inline]
> > > >  fou_exit_net+0x17d/0x1f0 net/ipv4/fou_core.c:1230
> > > >  ops_exit_list net/core/net_namespace.c:199 [inline]
> > > >  ops_undo_list+0x43d/0x8d0 net/core/net_namespace.c:252
> > > >  cleanup_net+0x572/0x810 net/core/net_namespace.c:702
> > > >  process_one_work kernel/workqueue.c:3314 [inline]
> > > >  process_scheduled_works+0xa8e/0x14e0 kernel/workqueue.c:3397
> > > >  worker_thread+0xa47/0xfb0 kernel/workqueue.c:3478
> > > >  kthread+0x389/0x470 kernel/kthread.c:436
> > > >  ret_from_fork+0x514/0xb70 arch/x86/kernel/process.c:158
> > > >  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
> > > >  </TASK>
> > > >
> > > > Allocated by task 19121:
> > > >  kasan_save_stack mm/kasan/common.c:57 [inline]
> > > >  kasan_save_track+0x3e/0x80 mm/kasan/common.c:78
> > > >  poison_kmalloc_redzone mm/kasan/common.c:398 [inline]
> > > >  __kasan_kmalloc+0x93/0xb0 mm/kasan/common.c:415
> > > >  kasan_kmalloc include/linux/kasan.h:263 [inline]
> > > >  __do_kmalloc_node mm/slub.c:5296 [inline]
> > > >  __kmalloc_node_track_caller_noprof+0x4d7/0x7b0 mm/slub.c:5408
> > > >  kmemdup_noprof+0x2b/0x70 mm/util.c:138
> > > >  kmemdup_noprof include/linux/fortify-string.h:763 [inline]
> > > >  fib_rules_register+0x2f/0x400 net/core/fib_rules.c:170
> > > >  fib4_rules_init+0x21/0x160 net/ipv4/fib_rules.c:508
> > > >  ip_fib_net_init net/ipv4/fib_frontend.c:1578 [inline]
> > > >  fib_net_init+0x17a/0x3e0 net/ipv4/fib_frontend.c:1628
> > > >  ops_init+0x35d/0x5d0 net/core/net_namespace.c:137
> > > >  setup_net+0x118/0x350 net/core/net_namespace.c:446
> > > >  copy_net_ns+0x4f9/0x720 net/core/net_namespace.c:579
> > > >  create_new_namespaces+0x3f0/0x6b0 kernel/nsproxy.c:132
> > > >  unshare_nsproxy_namespaces+0x149/0x190 kernel/nsproxy.c:234
> > > >  ksys_unshare+0x57d/0xa00 kernel/fork.c:3242
> > > >  __do_sys_unshare kernel/fork.c:3316 [inline]
> > > >  __se_sys_unshare kernel/fork.c:3314 [inline]
> > > >  __x64_sys_unshare+0x38/0x50 kernel/fork.c:3314
> > > >  do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
> > > >  do_syscall_64+0x174/0x580 arch/x86/entry/syscall_64.c:94
> > > >  entry_SYSCALL_64_after_hwframe+0x77/0x7f
> > > >
> > > > Freed by task 12641:
> > > >  kasan_save_stack mm/kasan/common.c:57 [inline]
> > > >  kasan_save_track+0x3e/0x80 mm/kasan/common.c:78
> > > >  kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:584
> > > >  poison_slab_object mm/kasan/common.c:253 [inline]
> > > >  __kasan_slab_free+0x5c/0x80 mm/kasan/common.c:285
> > > >  kasan_slab_free include/linux/kasan.h:235 [inline]
> > > >  slab_free_hook mm/slub.c:2689 [inline]
> > > >  __rcu_free_sheaf_prepare+0x12d/0x2a0 mm/slub.c:2940
> > > >  rcu_free_sheaf+0x31/0x200 mm/slub.c:5850
> > > >  rcu_do_batch kernel/rcu/tree.c:2617 [inline]
> > > >  rcu_core+0x78b/0x10a0 kernel/rcu/tree.c:2869
> > > >  handle_softirqs+0x225/0x840 kernel/softirq.c:622
> > > >  do_softirq+0x76/0xd0 kernel/softirq.c:523
> > > >  __local_bh_enable_ip+0xf8/0x130 kernel/softirq.c:450
> > > >  unregister_netdevice_many_notify+0x1874/0x2150 net/core/dev.c:12445
> > > >  ops_exit_rtnl_list net/core/net_namespace.c:187 [inline]
> > > >  ops_undo_list+0x391/0x8d0 net/core/net_namespace.c:248
> > > >  cleanup_net+0x572/0x810 net/core/net_namespace.c:702
> > > >  process_one_work kernel/workqueue.c:3314 [inline]
> > > >  process_scheduled_works+0xa8e/0x14e0 kernel/workqueue.c:3397
> > > >  worker_thread+0xa47/0xfb0 kernel/workqueue.c:3478
> > > >  kthread+0x389/0x470 kernel/kthread.c:436
> > > >  ret_from_fork+0x514/0xb70 arch/x86/kernel/process.c:158
> > > >  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
> > > >
> > > > The buggy address belongs to the object at ffff88804ec4c600
> > > >  which belongs to the cache kmalloc-192 of size 192
> > > > The buggy address is located 128 bytes inside of
> > > >  freed 192-byte region [ffff88804ec4c600, ffff88804ec4c6c0)
> > > >
> > > > The buggy address belongs to the physical page:
> > > > page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x4ec4c
> > > > flags: 0xfff00000000000(node=0|zone=1|lastcpupid=0x7ff)
> > > > page_type: f5(slab)
> > > > raw: 00fff00000000000 ffff88813fe163c0 dead000000000100 dead000000000122
> > > > raw: 0000000000000000 0000000800100010 00000000f5000000 0000000000000000
> > > > page dumped because: kasan: bad access detected
> > > > page_owner tracks the page as allocated
> > > > page last allocated via order 0, migratetype Unmovable, gfp_mask 0xd2cc0(GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC), pid 13856, tgid 13853 (syz.3.2144), ts 351172300879, free_ts 351133053454
> > > >  set_page_owner include/linux/page_owner.h:32 [inline]
> > > >  post_alloc_hook+0x22d/0x280 mm/page_alloc.c:1853
> > > >  prep_new_page mm/page_alloc.c:1861 [inline]
> > > >  get_page_from_freelist+0x24ae/0x2530 mm/page_alloc.c:3941
> > > >  __alloc_frozen_pages_noprof+0x18d/0x380 mm/page_alloc.c:5221
> > > >  alloc_slab_page mm/slub.c:3278 [inline]
> > > >  allocate_slab+0x77/0x660 mm/slub.c:3467
> > > >  new_slab mm/slub.c:3525 [inline]
> > > >  refill_objects+0x336/0x3d0 mm/slub.c:7272
> > > >  refill_sheaf mm/slub.c:2816 [inline]
> > > >  __pcs_replace_empty_main+0x320/0x720 mm/slub.c:4652
> > > >  alloc_from_pcs mm/slub.c:4750 [inline]
> > > >  slab_alloc_node mm/slub.c:4884 [inline]
> > > >  __do_kmalloc_node mm/slub.c:5295 [inline]
> > > >  __kmalloc_noprof+0x464/0x750 mm/slub.c:5308
> > > >  kmalloc_noprof include/linux/slab.h:954 [inline]
> > > >  kzalloc_noprof include/linux/slab.h:1188 [inline]
> > > >  new_dir fs/proc/proc_sysctl.c:966 [inline]
> > > >  get_subdir fs/proc/proc_sysctl.c:1010 [inline]
> > > >  sysctl_mkdir_p fs/proc/proc_sysctl.c:1320 [inline]
> > > >  __register_sysctl_table+0xc02/0x1370 fs/proc/proc_sysctl.c:1395
> > > >  neigh_sysctl_register+0x9b1/0xa90 net/core/neighbour.c:3915
> > > >  addrconf_sysctl_register+0xb3/0x1c0 net/ipv6/addrconf.c:7396
> > > >  ipv6_add_dev+0xd26/0x13a0 net/ipv6/addrconf.c:460
> > > >  addrconf_notify+0x771/0x1050 net/ipv6/addrconf.c:3679
> > > >  notifier_call_chain+0x1a5/0x3d0 kernel/notifier.c:85
> > > >  call_netdevice_notifiers_extack net/core/dev.c:2288 [inline]
> > > >  call_netdevice_notifiers net/core/dev.c:2302 [inline]
> > > >  register_netdevice+0x18db/0x1f00 net/core/dev.c:11474
> > > >  macsec_newlink+0x706/0x1200 drivers/net/macsec.c:4218
> > > >  rtnl_newlink_create+0x310/0xb00 net/core/rtnetlink.c:3905
> > > > page last free pid 12657 tgid 12657 stack trace:
> > > >  reset_page_owner include/linux/page_owner.h:25 [inline]
> > > >  __free_pages_prepare mm/page_alloc.c:1397 [inline]
> > > >  __free_frozen_pages+0xc0d/0xd20 mm/page_alloc.c:2938
> > > >  __tlb_remove_table_free mm/mmu_gather.c:228 [inline]
> > > >  tlb_remove_table_rcu+0x85/0x100 mm/mmu_gather.c:291
> > > >  rcu_do_batch kernel/rcu/tree.c:2617 [inline]
> > > >  rcu_core+0x78b/0x10a0 kernel/rcu/tree.c:2869
> > > >  handle_softirqs+0x225/0x840 kernel/softirq.c:622
> > > >  __do_softirq kernel/softirq.c:656 [inline]
> > > >  invoke_softirq kernel/softirq.c:496 [inline]
> > > >  __irq_exit_rcu+0xca/0x220 kernel/softirq.c:735
> > > >  irq_exit_rcu+0x9/0x30 kernel/softirq.c:752
> > > >  instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1061 [inline]
> > > >  sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1061
> > > >  asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:697
> > > >
> > > > Memory state around the buggy address:
> > > >  ffff88804ec4c580: 00 00 00 fc fc fc fc fc fc fc fc fc fc fc fc fc
> > > >  ffff88804ec4c600: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> > > > >ffff88804ec4c680: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
> > > >                    ^
> > > >  ffff88804ec4c700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > > >  ffff88804ec4c780: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
> > > > ==================================================================
> > > >
> > > >
> > > > ---
> > > > This report is generated by a bot. It may contain errors.
> > > > See https://goo.gl/tpsmEJ for more information about syzbot.
> > > > syzbot engineers can be reached at syzkaller@googlegroups.com.
> > > >
> > > > syzbot will keep track of this issue. See:
> > > > https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> > > >
> > > > If the report is already addressed, let syzbot know by replying with:
> > > > #syz fix: exact-commit-title
> > > >
> > > > If you want to overwrite report's subsystems, reply with:
> > > > #syz set subsystems: new-subsystem
> > > > (See the list of subsystem names on the web dashboard)
> > > >
> > > > If the report is a duplicate of another one, reply with:
> > > > #syz dup: exact-subject-of-another-report
> > > >
> > > > If you want to undo deduplication, reply with:
> > > > #syz undup
> 

* [PATCH nf-next v3 1/4] netfilter: nf_nat_ftp: replace u_int16_t with u16
From: Carlos Grillet @ 2026-06-16 18:29 UTC
  To: Pablo Neira Ayuso, Florian Westphal, Phil Sutter
  Cc: netfilter-devel, coreteam, netdev, linux-kernel
In-Reply-To: <20260616182948.96865-1-carlos@carlosgrillet.me>

Use preferred kernel integer type u16 instead of the POSIX u_int16_t
variant.

No functional change.

Signed-off-by: Carlos Grillet <carlos@carlosgrillet.me>
---
 net/netfilter/nf_nat_ftp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/netfilter/nf_nat_ftp.c b/net/netfilter/nf_nat_ftp.c
index c92a436d9c48..ab714629e2b1 100644
--- a/net/netfilter/nf_nat_ftp.c
+++ b/net/netfilter/nf_nat_ftp.c
@@ -69,7 +69,7 @@ static unsigned int nf_nat_ftp(struct sk_buff *skb,
 			       struct nf_conntrack_expect *exp)
 {
 	union nf_inet_addr newaddr;
-	u_int16_t port;
+	u16 port;
 	int dir = CTINFO2DIR(ctinfo);
 	struct nf_conn *ct = exp->master;
 	char buffer[sizeof("|1||65535|") + INET6_ADDRSTRLEN];
-- 
2.54.0


* [PATCH nf-next v3 3/4] netfilter: nf_sockopt: replace u_int8_t with u8
From: Carlos Grillet @ 2026-06-16 18:29 UTC
  To: Pablo Neira Ayuso, Florian Westphal, Phil Sutter
  Cc: netfilter-devel, coreteam, linux-kernel, netdev
In-Reply-To: <20260616182948.96865-1-carlos@carlosgrillet.me>

Replace POSIX u_int8_t with preferred kernel type u8, update prototype
and struct definition.

No functional changes.

Signed-off-by: Carlos Grillet <carlos@carlosgrillet.me>
---
 include/linux/netfilter.h  | 6 +++---
 net/netfilter/nf_sockopt.c | 8 ++++----
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h
index efbbfa770d66..91b68bdba3f5 100644
--- a/include/linux/netfilter.h
+++ b/include/linux/netfilter.h
@@ -181,7 +181,7 @@ static inline void nf_hook_state_init(struct nf_hook_state *p,
 struct nf_sockopt_ops {
 	struct list_head list;
 
-	u_int8_t pf;
+	u8 pf;
 
 	/* Non-inclusive ranges: use 0/0/NULL to never get called. */
 	int set_optmin;
@@ -357,9 +357,9 @@ NF_HOOK_LIST(uint8_t pf, unsigned int hook, struct net *net, struct sock *sk,
 }
 
 /* Call setsockopt() */
-int nf_setsockopt(struct sock *sk, u_int8_t pf, int optval, sockptr_t opt,
+int nf_setsockopt(struct sock *sk, u8 pf, int optval, sockptr_t opt,
 		  unsigned int len);
-int nf_getsockopt(struct sock *sk, u_int8_t pf, int optval, char __user *opt,
+int nf_getsockopt(struct sock *sk, u8 pf, int optval, char __user *opt,
 		  int *len);
 
 struct flowi;
diff --git a/net/netfilter/nf_sockopt.c b/net/netfilter/nf_sockopt.c
index 34afcd03b6f6..19a1d028158c 100644
--- a/net/netfilter/nf_sockopt.c
+++ b/net/netfilter/nf_sockopt.c
@@ -59,8 +59,8 @@ void nf_unregister_sockopt(struct nf_sockopt_ops *reg)
 }
 EXPORT_SYMBOL(nf_unregister_sockopt);
 
-static struct nf_sockopt_ops *nf_sockopt_find(struct sock *sk, u_int8_t pf,
-		int val, int get)
+static struct nf_sockopt_ops *nf_sockopt_find(struct sock *sk, u8 pf,
+					      int val, int get)
 {
 	struct nf_sockopt_ops *ops;
 
@@ -89,7 +89,7 @@ static struct nf_sockopt_ops *nf_sockopt_find(struct sock *sk, u_int8_t pf,
 	return ops;
 }
 
-int nf_setsockopt(struct sock *sk, u_int8_t pf, int val, sockptr_t opt,
+int nf_setsockopt(struct sock *sk, u8 pf, int val, sockptr_t opt,
 		  unsigned int len)
 {
 	struct nf_sockopt_ops *ops;
@@ -104,7 +104,7 @@ int nf_setsockopt(struct sock *sk, u_int8_t pf, int val, sockptr_t opt,
 }
 EXPORT_SYMBOL(nf_setsockopt);
 
-int nf_getsockopt(struct sock *sk, u_int8_t pf, int val, char __user *opt,
+int nf_getsockopt(struct sock *sk, u8 pf, int val, char __user *opt,
 		  int *len)
 {
 	struct nf_sockopt_ops *ops;
-- 
2.54.0


* [PATCH nf-next v3 4/4] netfilter: xt_TCPOPTSTRIP: replace u_int8_t and u_int16_t with u8 and u16
From: Carlos Grillet @ 2026-06-16 18:29 UTC
  To: Pablo Neira Ayuso, Florian Westphal, Phil Sutter
  Cc: netfilter-devel, coreteam, netdev, linux-kernel
In-Reply-To: <20260616182948.96865-1-carlos@carlosgrillet.me>

Replace POSIX u_int8_t/u_int16_t with preferred kernel types u8/u16.

No functional changes.

Signed-off-by: Carlos Grillet <carlos@carlosgrillet.me>
---
 net/netfilter/xt_TCPOPTSTRIP.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/netfilter/xt_TCPOPTSTRIP.c b/net/netfilter/xt_TCPOPTSTRIP.c
index 93f064306901..265d21697847 100644
--- a/net/netfilter/xt_TCPOPTSTRIP.c
+++ b/net/netfilter/xt_TCPOPTSTRIP.c
@@ -16,7 +16,7 @@
 #include <linux/netfilter/x_tables.h>
 #include <linux/netfilter/xt_TCPOPTSTRIP.h>
 
-static inline unsigned int optlen(const u_int8_t *opt, unsigned int offset)
+static inline unsigned int optlen(const u8 *opt, unsigned int offset)
 {
 	/* Beware zero-length options: make finite progress */
 	if (opt[offset] <= TCPOPT_NOP || opt[offset+1] == 0)
@@ -33,8 +33,8 @@ tcpoptstrip_mangle_packet(struct sk_buff *skb,
 	const struct xt_tcpoptstrip_target_info *info = par->targinfo;
 	struct tcphdr *tcph, _th;
 	unsigned int optl, i, j;
-	u_int16_t n, o;
-	u_int8_t *opt;
+	u16 n, o;
+	u8 *opt;
 	int tcp_hdrlen;
 
 	/* This is a fragment, no TCP header is available */
@@ -97,7 +97,7 @@ tcpoptstrip_tg6(struct sk_buff *skb, const struct xt_action_param *par)
 {
 	struct ipv6hdr *ipv6h = ipv6_hdr(skb);
 	int tcphoff;
-	u_int8_t nexthdr;
+	u8 nexthdr;
 	__be16 frag_off;
 
 	nexthdr = ipv6h->nexthdr;
-- 
2.54.0


^ permalink raw reply related

* [PATCH nf-next v3 2/4] netfilter: nf_nat_irc: replace u_int16_t with u16
From: Carlos Grillet @ 2026-06-16 18:29 UTC (permalink / raw)
  To: Pablo Neira Ayuso, Florian Westphal, Phil Sutter
  Cc: netfilter-devel, coreteam, netdev, linux-kernel
In-Reply-To: <20260616182948.96865-1-carlos@carlosgrillet.me>

Replace POSIX u_int16_t with the preferred kernel type u16.

No functional changes.

Signed-off-by: Carlos Grillet <carlos@carlosgrillet.me>
---
 net/netfilter/nf_nat_irc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/netfilter/nf_nat_irc.c b/net/netfilter/nf_nat_irc.c
index 19c4fcc60c50..14b79cb0171b 100644
--- a/net/netfilter/nf_nat_irc.c
+++ b/net/netfilter/nf_nat_irc.c
@@ -39,7 +39,7 @@ static unsigned int help(struct sk_buff *skb,
 	char buffer[sizeof("4294967296 65635")];
 	struct nf_conn *ct = exp->master;
 	union nf_inet_addr newaddr;
-	u_int16_t port;
+	u16 port;
 
 	/* Reply comes from server. */
 	newaddr = ct->tuplehash[IP_CT_DIR_REPLY].tuple.dst.u3;
-- 
2.54.0


^ permalink raw reply related

* [PATCH nf-next v3 0/4] netfilter: replace u_int*_t with kernel int types
From: Carlos Grillet @ 2026-06-16 18:29 UTC (permalink / raw)
  To: Pablo Neira Ayuso, Florian Westphal, Phil Sutter
  Cc: netfilter-devel, coreteam, linux-kernel, netdev

Hi all! This is my first patch series of many, I hope :)
I'd like to start contributing by helping out with janitor work,
standardizing code and cleaning up.

This patch series replaces POSIX u_int8_t/u_int16_t with the preferred
kernel types u8/u16 across several netfilter files.

u_int*_t appears in many other files, but I wanted to keep this series
small, unless advised otherwise.

No functional changes.

Changes in v3:
- dropping changes to nf_log and xt_DSCP (need deeper understanding of the
  subsystem before converting these correctly)
- link to v2: https://lore.kernel.org/all/20260615133835.51273-1-carlos@carlosgrillet.me

Changes in v2:
- addresses sashiko comments https://sashiko.dev/#/patchset/32368
  - nf_sockopt: update function prototypes and struct definitions
  - nf_log: update the corresponding function declarations and the
    nf_logfn typedef
- link to v1: https://lore.kernel.org/all/20260612125146.75672-1-carlos@carlosgrillet.me

Carlos Grillet (4):
  netfilter: nf_nat_ftp: replace u_int16_t with u16
  netfilter: nf_nat_irc: replace u_int16_t with u16
  netfilter: nf_sockopt: replace u_int8_t with u8
  netfilter: xt_TCPOPTSTRIP: replace u_int8_t and u_int16_t with u8 and u16

 include/linux/netfilter.h      | 6 +++---
 net/netfilter/nf_nat_ftp.c     | 2 +-
 net/netfilter/nf_nat_irc.c     | 2 +-
 net/netfilter/nf_sockopt.c     | 8 ++++----
 net/netfilter/xt_TCPOPTSTRIP.c | 8 ++++----
 5 files changed, 13 insertions(+), 13 deletions(-)

-- 
2.54.0


^ permalink raw reply

* Re: [PATCH bpf] bpf, sockmap: fix lock inversion between stab->lock and sk_callback_lock
From: Sechang Lim @ 2026-06-16 18:40 UTC (permalink / raw)
  To: Jiayuan Chen
  Cc: John Fastabend, Jakub Sitnicki, Alexei Starovoitov,
	Daniel Borkmann, Eric Dumazet, Kuniyuki Iwashima, Paolo Abeni,
	Willem de Bruijn, David S . Miller, Jakub Kicinski, Simon Horman,
	netdev, bpf, linux-kernel
In-Reply-To: <575a878e-6d37-4337-a821-4883d3dd3a63@linux.dev>

On Tue, Jun 16, 2026 at 06:17:48PM +0800, Jiayuan Chen wrote:
>
>On 6/16/26 5:11 PM, Sechang Lim wrote:
>>sock_map_update_common() and __sock_map_delete() hold stab->lock and call
>>sock_map_unref() -> sock_map_del_link() under it. sock_map_del_link() takes
>>sk_callback_lock for write to stop the strparser and verdict, giving the
>>lock order stab->lock -> sk_callback_lock.
>>
>>The opposite order comes from an SK_SKB stream parser. On RX,
>>sk_psock_strp_data_ready() holds sk_callback_lock for read while running
>>the parser. The verdict redirects the skb to egress, where a sched_cls
>
>
>The commit message is wrong. A verdict does not redirect to egress
>synchronously — sk_psock_skb_redirect() only queues the skb and
>schedule_delayed_work()s sk_psock_backlog, so egress runs in workqueue
>context, not under sk_callback_lock.
>

Thanks, you're right. It's the inline ACK, not the redirect. Sorry for
the misleading changelog; I'll fix it in v2.

>
>>program calls bpf_map_delete_elem() on a sockmap, which takes stab->lock:
>>
>>   WARNING: possible circular locking dependency detected
>>   7.1.0-rc6 Not tainted
>>   ------------------------------------------------------
>>   syz.9.8824 is trying to acquire lock:
>>   (&stab->lock){+.-.}-{3:3}, at: __sock_map_delete net/core/sock_map.c:421
>>   but task is already holding lock:
>>   (clock-AF_INET){++.-}-{3:3}, at: sk_psock_strp_data_ready net/core/skmsg.c:1173
>>
>>   -> #1 (clock-AF_INET){++.-}-{3:3}:
>>          _raw_write_lock_bh
>>          sock_map_del_link net/core/sock_map.c:167
>>          sock_map_unref net/core/sock_map.c:184
>>          sock_map_update_common net/core/sock_map.c:509
>>          sock_map_update_elem_sys net/core/sock_map.c:588
>>          map_update_elem kernel/bpf/syscall.c:1805
>>
>>   -> #0 (&stab->lock){+.-.}-{3:3}:
>>          _raw_spin_lock_bh
>>          __sock_map_delete net/core/sock_map.c:421
>>          sock_map_delete_elem net/core/sock_map.c:452
>>          bpf_prog_06044d24140080b6
>>          tcx_run net/core/dev.c:4451
>>          sch_handle_egress net/core/dev.c:4541
>>          __dev_queue_xmit net/core/dev.c:4808
>>          ...
>>          tcp_bpf_strp_read_sock net/ipv4/tcp_bpf.c:701
>
>
>I guess it is an ACK. What is the actual purpose of a sched_cls program
>calling sockmap delete on the TX path of an ACK? If there is no real use
>case for it, this is just broken BPF usage, not a kernel bug worth this
>change.
>
>

I don't have a real use case for that exact program. But the verifier
allows sockmap delete from tc, and it deadlocks when the strparser's
socket is concurrently removed from the same map. The fix only moves
sock_map_unref() out from under stab->lock.
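
The reordering described above can be sketched as a toy userspace model.
This is not the actual patch: the struct names, the flag-based "locks" and
the violation counter are all hypothetical, chosen only so the lock order
is observable. The point is that the entry is detached while the table
lock is held and the unref (which needs the inner lock) runs after the
table lock is dropped.

```c
#include <assert.h>
#include <stddef.h>

#define NSLOTS 4

/* Toy model: "locks" are plain flags so the ordering can be checked. */
struct entry {
	int refs;
};

struct table {
	int lock_held;          /* stands in for stab->lock */
	int order_violations;   /* inner lock taken while lock_held was set */
	struct entry *slots[NSLOTS];
};

/* Stands in for sock_map_unref(): it needs the inner (callback) lock. */
static void entry_unref(struct table *t, struct entry *e)
{
	if (t->lock_held)
		t->order_violations++;  /* would create table lock -> inner lock */
	e->refs--;
}

/* Detach the entry under the table lock, drop the reference afterwards. */
static int table_delete(struct table *t, size_t idx)
{
	struct entry *old;

	if (idx >= NSLOTS)
		return -1;

	t->lock_held = 1;
	old = t->slots[idx];
	t->slots[idx] = NULL;
	t->lock_held = 0;

	if (!old)
		return -1;

	entry_unref(t, old);    /* runs with the table lock released */
	return 0;
}
```

With this shape the table lock is never held when the inner lock is
taken, so the table lock -> inner lock edge does not appear in this path.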

Best,
Sechang

^ permalink raw reply

* [net PATCH v2] octeontx2-pf: mcs: Fix mcs resources free on PF shutdown
From: Subbaraya Sundeep @ 2026-06-16 19:00 UTC (permalink / raw)
  To: andrew+netdev, davem, edumazet, kuba, pabeni, sgoutham, gakula,
	bbhushan2, rkannoth
  Cc: netdev, linux-kernel, Subbaraya Sundeep
In-Reply-To: <1781636420-19816-1-git-send-email-sbhatta@marvell.com>

From: Geetha sowjanya <gakula@marvell.com>

On PF shutdown, the driver currently frees MCS hardware resources
even when no MCS resources are allocated to it.
Check the MCS resource status and send the mailbox message to
free the resources only if they are allocated.

Fixes: c54ffc73601c ("octeontx2-pf: mcs: Introduce MACSEC hardware offloading")
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
---
v2 changes:
 Fixed AI review so that pfvf->macsec_cfg is freed correctly

 .../net/ethernet/marvell/octeontx2/nic/cn10k_macsec.c    | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/cn10k_macsec.c b/drivers/net/ethernet/marvell/octeontx2/nic/cn10k_macsec.c
index 2cc1bdfd9b2e..4d3a7f4be962 100644
--- a/drivers/net/ethernet/marvell/octeontx2/nic/cn10k_macsec.c
+++ b/drivers/net/ethernet/marvell/octeontx2/nic/cn10k_macsec.c
@@ -1776,11 +1776,16 @@ int cn10k_mcs_init(struct otx2_nic *pfvf)
 
 void cn10k_mcs_free(struct otx2_nic *pfvf)
 {
+	struct cn10k_mcs_cfg *cfg = pfvf->macsec_cfg;
+
 	if (!test_bit(CN10K_HW_MACSEC, &pfvf->hw.cap_flag))
 		return;
 
-	cn10k_mcs_free_rsrc(pfvf, MCS_TX, MCS_RSRC_TYPE_SECY, 0, true);
-	cn10k_mcs_free_rsrc(pfvf, MCS_RX, MCS_RSRC_TYPE_SECY, 0, true);
+	if (!list_empty(&cfg->txsc_list)) {
+		cn10k_mcs_free_rsrc(pfvf, MCS_TX, MCS_RSRC_TYPE_SECY, 0, true);
+		cn10k_mcs_free_rsrc(pfvf, MCS_RX, MCS_RSRC_TYPE_SECY, 0, true);
+	}
+
 	kfree(pfvf->macsec_cfg);
 	pfvf->macsec_cfg = NULL;
 }
-- 
2.48.1


^ permalink raw reply related

* [net PATCH v2] octeontx2-af: mcs: Fix unsupported secy stats read
From: Subbaraya Sundeep @ 2026-06-16 19:00 UTC (permalink / raw)
  To: andrew+netdev, davem, edumazet, kuba, pabeni, sgoutham, gakula,
	bbhushan2, rkannoth
  Cc: netdev, linux-kernel, Subbaraya Sundeep

From: Geetha sowjanya <gakula@marvell.com>

The secy control stats counter does not exist on the CNF10KB platform.
Skip reading the corresponding register on CNF10KB silicon while
fetching secy stats.

Fixes: 9312150af8da ("octeontx2-af: cn10k: mcs: Support for stats collection")
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
---
v2 changes:
 Addressed AI review: debugfs now also skips the
 secy control stats counter

 drivers/net/ethernet/marvell/octeontx2/af/mcs.c         | 6 +++---
 drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c | 3 ++-
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/marvell/octeontx2/af/mcs.c b/drivers/net/ethernet/marvell/octeontx2/af/mcs.c
index c1775bd01c2b..a07e0b3d8d00 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/mcs.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/mcs.c
@@ -120,13 +120,13 @@ void mcs_get_rx_secy_stats(struct mcs *mcs, struct mcs_secy_stats *stats, int id
 	reg = MCSX_CSE_RX_MEM_SLAVE_INPKTSSECYUNTAGGEDX(id);
 	stats->pkt_untaged_cnt = mcs_reg_read(mcs, reg);
 
-	reg = MCSX_CSE_RX_MEM_SLAVE_INPKTSSECYCTLX(id);
-	stats->pkt_ctl_cnt = mcs_reg_read(mcs, reg);
-
 	if (mcs->hw->mcs_blks > 1) {
 		reg = MCSX_CSE_RX_MEM_SLAVE_INPKTSSECYNOTAGX(id);
 		stats->pkt_notag_cnt = mcs_reg_read(mcs, reg);
+		return;
 	}
+	reg = MCSX_CSE_RX_MEM_SLAVE_INPKTSSECYCTLX(id);
+	stats->pkt_ctl_cnt = mcs_reg_read(mcs, reg);
 }
 
 void mcs_get_flowid_stats(struct mcs *mcs, struct mcs_flowid_stats *stats,
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c
index fa461489acdd..ca2704b188a5 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c
@@ -482,10 +482,11 @@ static int rvu_dbg_mcs_rx_secy_stats_display(struct seq_file *filp, void *unused
 		seq_printf(filp, "secy%d: Tagged ctrl pkts: %lld\n", secy_id,
 			   stats.pkt_tagged_ctl_cnt);
 		seq_printf(filp, "secy%d: Untaged pkts: %lld\n", secy_id, stats.pkt_untaged_cnt);
-		seq_printf(filp, "secy%d: Ctrl pkts: %lld\n", secy_id, stats.pkt_ctl_cnt);
 		if (mcs->hw->mcs_blks > 1)
 			seq_printf(filp, "secy%d: pkts notag: %lld\n", secy_id,
 				   stats.pkt_notag_cnt);
+		else
+			seq_printf(filp, "secy%d: Ctrl pkts: %lld\n", secy_id, stats.pkt_ctl_cnt);
 	}
 	mutex_unlock(&mcs->stats_lock);
 	return 0;
-- 
2.48.1


^ permalink raw reply related

* [PATCH] octeontx2-pf: Clear stats of all resources when freeing resources
From: Subbaraya Sundeep @ 2026-06-16 19:00 UTC (permalink / raw)
  To: andrew+netdev, davem, edumazet, kuba, pabeni, sgoutham, gakula,
	bbhushan2, rkannoth
  Cc: netdev, linux-kernel, Subbaraya Sundeep
In-Reply-To: <1781636420-19816-1-git-send-email-sbhatta@marvell.com>

When all MCS resources mapped to a PF are being freed, clear the
stats of all those resources too.

Fixes: 815debbbf7b5 ("octeontx2-pf: mcs: Clear stats before freeing resource")
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
---
 drivers/net/ethernet/marvell/octeontx2/nic/cn10k_macsec.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/cn10k_macsec.c b/drivers/net/ethernet/marvell/octeontx2/nic/cn10k_macsec.c
index 4d3a7f4be962..9524d38f1582 100644
--- a/drivers/net/ethernet/marvell/octeontx2/nic/cn10k_macsec.c
+++ b/drivers/net/ethernet/marvell/octeontx2/nic/cn10k_macsec.c
@@ -182,6 +182,7 @@ static void cn10k_mcs_free_rsrc(struct otx2_nic *pfvf, enum mcs_direction dir,
 	clear_req->id = hw_rsrc_id;
 	clear_req->type = type;
 	clear_req->dir = dir;
+	clear_req->all = all;
 
 	req = otx2_mbox_alloc_msg_mcs_free_resources(mbox);
 	if (!req)
-- 
2.48.1


^ permalink raw reply related

* [PATCH 6.1] net: gro: don't merge zcopy skbs
From: Alexander Martyniuk @ 2026-06-16 22:00 UTC (permalink / raw)
  To: stable, Greg Kroah-Hartman
  Cc: Alexander Martyniuk, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Sasha Levin, Sabrina Dubroca,
	Hyunwoo Kim, Pavel Begunkov, netdev, linux-kernel, lvc-project,
	Huzaifa Sidhpurwala, Willem de Bruijn

From: Sabrina Dubroca <sd@queasysnail.net>

commit 4db79a322db8c97f7b73b8a347395ef4d685eb40 upstream.

skb_gro_receive() can currently copy frags between the source and GRO
skb, without checking the zerocopy status, and in particular the
SKBFL_MANAGED_FRAG_REFS flag.

When SKBFL_MANAGED_FRAG_REFS is set, the skb doesn't hold a reference
on the pages in shinfo->frags. Appending those frags to another skb's
frags without fixing up the page refcount can lead to UAF.

When either the last skb in the GRO chain (the one we would append
frags to) or the source skb is zerocopy, don't merge the skbs.

Fixes: 753f1ca4e1e5 ("net: introduce managed frags infrastructure")
Reported-by: Huzaifa Sidhpurwala <huzaifas@redhat.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/c3b7f906bbfcbdfd7b4fa9d6c18a438870df85be.1779307748.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Alexander Martyniuk <alexevgmart@gmail.com>
---
Backport fix for CVE-2026-46323
 net/core/gro.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/core/gro.c b/net/core/gro.c
index ea6571c01faa..c5a9733d929a 100644
--- a/net/core/gro.c
+++ b/net/core/gro.c
@@ -171,6 +171,9 @@ int skb_gro_receive(struct sk_buff *p, struct sk_buff *skb)
 	if (p->pp_recycle != skb->pp_recycle)
 		return -ETOOMANYREFS;
 
+	if (skb_zcopy(p) || skb_zcopy(skb))
+		return -ETOOMANYREFS;
+
 	/* pairs with WRITE_ONCE() in netif_set_gro_max_size() */
 	gro_max_size = READ_ONCE(p->dev->gro_max_size);
 
-- 
2.30.2


^ permalink raw reply related

* [PATCH v1 net-next] ipv4: fib_rule: Move fib4_rules_exit() to ->exit().
From: Kuniyuki Iwashima @ 2026-06-16 19:13 UTC (permalink / raw)
  To: David Ahern, Ido Schimmel, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni
  Cc: Simon Horman, Kuniyuki Iwashima, Kuniyuki Iwashima, netdev,
	syzbot+965506b59a2de0b6905c

syzbot reported use-after-free of net->ipv4.rules_ops. [0]

It can be reproduced with these commands:

  while true; do
  	ip netns add ns1
  	ip -n ns1 link set dev lo up
  	ip -n ns1 address add 192.0.2.1/24 dev lo
  	ip -n ns1 link add name dummy1 up type dummy
  	ip -n ns1 address add 198.51.100.1/24 dev dummy1
  	ip -n ns1 rule add ipproto tcp sport 12345 table 12345
  	ip -n ns1 fou add port 5555 ipproto 47 local 192.0.2.1 peer 198.51.100.2 peer_port 54321
  	ip netns del ns1
  done

The cited commit moved fib4_rules_exit() earlier to ->exit_rtnl(),
but the kernel socket destroyed in ->exit() could eventually reach
__fib_lookup().

I left fib4_rules_exit() in ->exit_rtnl() because fib4_rule_delete()
calls fib_unmerge(), which requires RTNL.

However, when ->delete() is called, ->configure() has already been
called, thus fib_unmerge() in ->delete() has no effect.

Let's remove fib_unmerge() in fib4_rule_delete() and move
fib4_rules_exit() to ->exit().

Many thanks to Ido Schimmel for providing the nice repro very quickly.

Note that we can make fib_rules_ops.delete() return void once
net-next opens.

[0]:
BUG: KASAN: slab-use-after-free in fib_rules_lookup+0x15e/0xeb0 net/core/fib_rules.c:321
Read of size 8 at addr ffff88804ec4c680 by task kworker/u8:21/12641

CPU: 0 UID: 0 PID: 12641 Comm: kworker/u8:21 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/09/2026
Workqueue: netns cleanup_net
Call Trace:
 <TASK>
 dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
 print_address_description+0x55/0x1e0 mm/kasan/report.c:378
 print_report+0x58/0x70 mm/kasan/report.c:482
 kasan_report+0x117/0x150 mm/kasan/report.c:595
 fib_rules_lookup+0x15e/0xeb0 net/core/fib_rules.c:321
 __fib_lookup+0x106/0x210 net/ipv4/fib_rules.c:96
 ip_route_output_key_hash_rcu+0x294/0x2720 net/ipv4/route.c:2811
 ip_route_output_key_hash+0x18d/0x2a0 net/ipv4/route.c:2702
 __ip_route_output_key include/net/route.h:169 [inline]
 ip_route_output_flow+0x2a/0x150 net/ipv4/route.c:2929
 ip4_datagram_release_cb+0x89d/0xbe0 net/ipv4/datagram.c:118
 release_sock+0x206/0x260 net/core/sock.c:3861
 inet_shutdown+0x2b1/0x390 net/ipv4/af_inet.c:950
 udp_tunnel_sock_release+0x6d/0x80 net/ipv4/udp_tunnel_core.c:197
 fou_release net/ipv4/fou_core.c:562 [inline]
 fou_exit_net+0x17d/0x1f0 net/ipv4/fou_core.c:1230
 ops_exit_list net/core/net_namespace.c:199 [inline]
 ops_undo_list+0x43d/0x8d0 net/core/net_namespace.c:252
 cleanup_net+0x572/0x810 net/core/net_namespace.c:702
 process_one_work kernel/workqueue.c:3314 [inline]
 process_scheduled_works+0xa8e/0x14e0 kernel/workqueue.c:3397
 worker_thread+0xa47/0xfb0 kernel/workqueue.c:3478
 kthread+0x389/0x470 kernel/kthread.c:436
 ret_from_fork+0x514/0xb70 arch/x86/kernel/process.c:158
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
 </TASK>

Fixes: 759923cf03b0 ("ipv4: fib: Convert fib_net_exit_batch() to ->exit_rtnl().")
Reported-by: syzbot+965506b59a2de0b6905c@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/6a315824.b0403584.28d0ff.0000.GAE@google.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
 net/ipv4/fib_frontend.c | 10 ++++++----
 net/ipv4/fib_rules.c    | 11 ++---------
 2 files changed, 8 insertions(+), 13 deletions(-)

diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
index c7d1f31650d7..42212970d735 100644
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -1612,10 +1612,6 @@ static void ip_fib_net_exit(struct net *net)
 			fib_free_table(tb);
 		}
 	}
-
-#ifdef CONFIG_IP_MULTIPLE_TABLES
-	fib4_rules_exit(net);
-#endif
 }
 
 static int __net_init fib_net_init(struct net *net)
@@ -1652,6 +1648,9 @@ static int __net_init fib_net_init(struct net *net)
 	ip_fib_net_exit(net);
 	rtnl_net_unlock(net);
 
+#ifdef CONFIG_IP_MULTIPLE_TABLES
+	fib4_rules_exit(net);
+#endif
 	kfree(net->ipv4.fib_table_hash);
 	fib4_notifier_exit(net);
 	goto out;
@@ -1671,6 +1670,9 @@ static void __net_exit fib_net_exit_rtnl(struct net *net,
 
 static void __net_exit fib_net_exit(struct net *net)
 {
+#ifdef CONFIG_IP_MULTIPLE_TABLES
+	fib4_rules_exit(net);
+#endif
 	kfree(net->ipv4.fib_table_hash);
 	fib4_notifier_exit(net);
 	fib4_semantics_exit(net);
diff --git a/net/ipv4/fib_rules.c b/net/ipv4/fib_rules.c
index 51f0193092f0..e068a5bace73 100644
--- a/net/ipv4/fib_rules.c
+++ b/net/ipv4/fib_rules.c
@@ -352,24 +352,17 @@ static int fib4_rule_configure(struct fib_rule *rule, struct sk_buff *skb,
 static int fib4_rule_delete(struct fib_rule *rule)
 {
 	struct net *net = rule->fr_net;
-	int err;
-
-	/* split local/main if they are not already split */
-	err = fib_unmerge(net);
-	if (err)
-		goto errout;
 
 #ifdef CONFIG_IP_ROUTE_CLASSID
 	if (((struct fib4_rule *)rule)->tclassid)
 		atomic_dec(&net->ipv4.fib_num_tclassid_users);
 #endif
-	net->ipv4.fib_has_custom_rules = true;
 
 	if (net->ipv4.fib_rules_require_fldissect &&
 	    fib_rule_requires_fldissect(rule))
 		net->ipv4.fib_rules_require_fldissect--;
-errout:
-	return err;
+
+	return 0;
 }
 
 static int fib4_rule_compare(struct fib_rule *rule, struct fib_rule_hdr *frh,
-- 
2.54.0.1136.gdb2ca164c4-goog


^ permalink raw reply related

* Re: [Bug] incompatibility between 'e1000e' and Aruba AOS-CX switches (too small inter-packet gap)
From: Andrew Lunn @ 2026-06-16 19:34 UTC (permalink / raw)
  To: Philippe Andersson; +Cc: netdev, Ludovic Calmant, Fabian Noël
In-Reply-To: <457d1617-bd7f-44c5-a9af-7ba8aa9250f4@iba-group.com>

> A support ticket has already been opened with Aruba, but it's unclear at
> this stage that the problem is on their side.

How easy is it to reproduce? Can you run a git bisect from the last
known good kernel version to the first known bad version?

      Andrew

^ permalink raw reply

* Re: [PATCH v27 3/5] cxl/sfc: Initialize dpa without a mailbox
From: Dan Williams (nvidia) @ 2026-06-16 19:35 UTC (permalink / raw)
  To: Alejandro Lucero Palau, Dan Williams (nvidia),
	alejandro.lucero-palau, linux-cxl, netdev, edward.cree, davem,
	kuba, pabeni, edumazet, dave.jiang
  Cc: Dan Williams, Ben Cheatham, Jonathan Cameron
In-Reply-To: <17b68fb1-768e-49f6-884d-49e0952621b8@amd.com>

Alejandro Lucero Palau wrote:
> 
> On 6/10/26 00:24, Dan Williams (nvidia) wrote:
> > alejandro.lucero-palau@ wrote:
> >> From: Alejandro Lucero <alucerop@amd.com>
> >>
> >> Type3 relies on mailbox CXL_MBOX_OP_IDENTIFY command for initializing
> >> memdev state params which end up being used for DPA initialization.
> >>
> >> Allow a Type2 driver to initialize DPA simply by giving the size of its
> >> volatile hardware partition.
> >>
> >> Move related functions to memdev.
> > The code movement is not strictly necessary. Just add cxl_set_capacity()
> > and we can consider a move later if mbox.o and memdev.o are ever not
> > both included in cxl_core.o by default.
> 
> 
> I think it is the right thing to do as the new function uses add_part() 
> (moved) and the other add_part() client is the other function moved, 
> cxl_mem_dpa_fetch().
> 
> Note cxl_mem_get_partition() used by cxl_mem_dpa_fetch() is the one 
> working with mbox commands and it remains in the same place inside 
> core/mbox.c and the only cxl_mem_dpa_fetch() client is cxl/pci.c
> 
> 
> This was reviewed and accepted so no reason for not doing it ...

Sure, I am ok to let it go as is.

^ permalink raw reply

* Re: [PATCH bpf-next 1/2] bpf: Guard conntrack opts error writes
From: Alexei Starovoitov @ 2026-06-16 19:36 UTC (permalink / raw)
  To: Yiyang Chen, bpf, netfilter-devel
  Cc: pablo, fw, phil, davem, edumazet, kuba, pabeni, horms, andrii,
	eddyz87, ast, daniel, memxor, martin.lau, song, yonghong.song,
	jolsa, emil, shuah, kartikey406, coreteam, netdev, linux-kernel,
	linux-kselftest
In-Reply-To: <70aeec0ab762aebe65129cf6052e132c7329edc2.1781586477.git.chenyy23@mails.tsinghua.edu.cn>

On Mon Jun 15, 2026 at 10:42 PM PDT, Yiyang Chen wrote:
> The conntrack lookup and allocation kfuncs take an opts pointer
> together with an opts__sz argument. The verifier checks only the memory
> range described by opts__sz, but the wrappers unconditionally write
> opts->error whenever the internal lookup or allocation helper returns an
> error.
>
> For an invalid size smaller than the end of opts->error, that write can
> land outside the verifier-checked range. Keep returning NULL for invalid
> arguments, but only report the error through opts->error when the
> supplied size includes the field.
>
> This preserves error reporting for the supported 12-byte and 16-byte
> layouts, and for other invalid sizes that still include opts->error.
>
> Fixes: b4c2b9593a1c ("net/netfilter: Add unstable CT lookup helpers for XDP and TC-BPF")
> Fixes: d7e79c97c00c ("net: netfilter: Add kfuncs to allocate and insert CT")
> Signed-off-by: Yiyang Chen <chenyy23@mails.tsinghua.edu.cn>
> ---
>  net/netfilter/nf_conntrack_bpf.c | 17 +++++++++++++----
>  1 file changed, 13 insertions(+), 4 deletions(-)
>
> diff --git a/net/netfilter/nf_conntrack_bpf.c b/net/netfilter/nf_conntrack_bpf.c
> index 40c261cd0af38..3c182024ec509 100644
> --- a/net/netfilter/nf_conntrack_bpf.c
> +++ b/net/netfilter/nf_conntrack_bpf.c
> @@ -65,6 +65,11 @@ enum {
>  	NF_BPF_CT_OPTS_SZ = 16,
>  };
>  
> +static bool bpf_ct_opts_has_error(u32 opts_len)
> +{
> +	return opts_len >= offsetofend(struct bpf_ct_opts, error);
> +}
> +
>  static int bpf_nf_ct_tuple_parse(struct bpf_sock_tuple *bpf_tuple,
>  				 u32 tuple_len, u8 protonum, u8 dir,
>  				 struct nf_conntrack_tuple *tuple)
> @@ -298,7 +303,8 @@ bpf_xdp_ct_alloc(struct xdp_md *xdp_ctx, struct bpf_sock_tuple *bpf_tuple,
>  	nfct = __bpf_nf_ct_alloc_entry(dev_net(ctx->rxq->dev), bpf_tuple, tuple__sz,
>  				       opts, opts__sz, 10);
>  	if (IS_ERR(nfct)) {
> -		opts->error = PTR_ERR(nfct);
> +		if (bpf_ct_opts_has_error(opts__sz))
> +			opts->error = PTR_ERR(nfct);

LLMs have no taste.

Above two lines could have been one helper
   bpf_ct_opts_set_error(opts, opts__sz, PTR_ERR(nfct));

Or we can do a step further and simplify the code more.
Turn this:
   if (IS_ERR(nfct)) {
           opts->error = PTR_ERR(nfct);
           return NULL;
   }
   return (struct nf_conn___init *)nfct;
into:
   return (struct nf_conn___init *)bpf_ct_opts_result(opts, opts__sz, nfct);

static void *bpf_ct_opts_result(struct bpf_ct_opts *opts, u32 opts__sz, void *ret)
{
  if (!IS_ERR(ret))
    return ret;
  if (opts__sz >= offsetofend(struct bpf_ct_opts, error))
    opts->error = PTR_ERR(ret);
  return NULL;
}
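
For illustration only, the helper above can be exercised in userspace.
This is a sketch under assumptions: struct ct_opts is a hypothetical
stand-in with netns_id and error as its first two fields, and ERR_PTR(),
PTR_ERR(), IS_ERR() and offsetofend() are minimal local re-implementations
rather than the kernel headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095
#define offsetofend(type, member) \
	(offsetof(type, member) + sizeof(((type *)0)->member))

static inline void *ERR_PTR(intptr_t err) { return (void *)err; }
static inline intptr_t PTR_ERR(const void *p) { return (intptr_t)p; }
static inline int IS_ERR(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical layout: only the position of 'error' matters here. */
struct ct_opts {
	int32_t netns_id;
	int32_t error;
	uint8_t l4proto;
	uint8_t dir;
	uint8_t reserved[2];
};

/* Pass a valid pointer through; otherwise report the error only when
 * the caller-supplied size covers opts->error, and return NULL. */
static void *ct_opts_result(struct ct_opts *opts, uint32_t opts_sz, void *ret)
{
	if (!IS_ERR(ret))
		return ret;
	if (opts_sz >= offsetofend(struct ct_opts, error))
		opts->error = (int32_t)PTR_ERR(ret);
	return NULL;
}
```

A caller that passes a size smaller than the end of the error field still
gets NULL back, but nothing is written to opts.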

These kinds of small improvements should be obvious to any human developer.
Please do NOT send us patches straight out of LLM.
Review it first and think how to improve it.

pw-bot: cr

^ permalink raw reply

* Re: [PATCH net-next v6 1/2] dinghai: add ZTE network driver support
From: Andrew Lunn @ 2026-06-16 19:39 UTC (permalink / raw)
  To: han.junyang
  Cc: andrew+netdev, davem, edumazet, kuba, pabeni, horms, linux-kernel,
	netdev, ran.ming, han.chengfei, zhang.yanze
In-Reply-To: <20260616213057452I2KLm3mVgWYl_SUTy_YYS@zte.com.cn>

> +++ b/drivers/net/ethernet/zte/dinghai/en_pf.h
> +static inline void *dh_core_alloc_priv(struct dh_core_dev *dh_dev,
> +				       size_t size)
> +{
> +	void *priv = kzalloc(size, GFP_KERNEL);
> +
> +	if (priv)
> +		dh_dev->priv = priv;
> +	return priv;
> +}
> +
> +static inline void dh_core_free_priv(struct dh_core_dev *dh_dev)
> +{
> +	kfree(dh_dev->priv);
> +}

It is unusual for these to be inline functions in a header. Why is
this?

	Andrew

^ permalink raw reply

* Re: [PATCH net-next v6 2/2] dinghai: add hardware register access and PCI capability scanning
From: Andrew Lunn @ 2026-06-16 19:49 UTC (permalink / raw)
  To: han.junyang
  Cc: andrew+netdev, davem, edumazet, kuba, pabeni, horms, linux-kernel,
	netdev, ran.ming, han.chengfei, zhang.yanze
In-Reply-To: <20260616213550502kLzSZF2DiQyd9Dl0Dv0Gz@zte.com.cn>

> +int zxdh_pf_common_cfg_init(struct dh_core_dev *dh_dev)
> +{
> +	struct zxdh_pf_device *pf_dev = dh_dev->priv;
> +	struct pci_dev *pdev = dh_dev->pdev;
> +	int common;
> +
> +	/* check for a common config: if not, use legacy mode (bar 0). */
> +	common = zxdh_pf_pci_find_capability(pdev, ZXDH_PCI_CAP_COMMON_CFG,
> +					     IORESOURCE_IO | IORESOURCE_MEM,
> +					     &pf_dev->modern_bars);
> +	if (common == 0) {
> +		dev_err(dh_dev->device,
> +			"missing capabilities %i, leaving for legacy driver\n",
> +			common);

That looks doubly odd. Normally you would use !common. Also, you know
common is 0, so why use "%i" when it could just be '0'?
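
A minimal userspace sketch of the shape being asked for, using
hypothetical stand-ins throughout: find_capability() and the -ENODEV
return value are assumptions for illustration, not the driver's code. The
check uses !common and the message does not format a value already known
to be zero.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in: returns a non-zero offset, or 0 if absent. */
static int find_capability(const int *caps, int ncaps, int wanted)
{
	for (int i = 0; i < ncaps; i++)
		if (caps[i] == wanted)
			return i + 1;   /* offsets are non-zero when found */
	return 0;
}

static int common_cfg_init(const int *caps, int ncaps, int wanted)
{
	int common = find_capability(caps, ncaps, wanted);

	/* Idiomatic zero test, and no "%i" for a value known to be 0. */
	if (!common) {
		fprintf(stderr,
			"missing common config capability, leaving for legacy driver\n");
		return -ENODEV;   /* assumed error code for this sketch */
	}

	return 0;
}
```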

> +int zxdh_pf_notify_cfg_init(struct dh_core_dev *dh_dev)
> +{
> +	struct zxdh_pf_device *pf_dev = dh_dev->priv;
> +	struct pci_dev *pdev = dh_dev->pdev;
> +	u32 notify_length;
> +	u32 notify_offset;
> +	int notify;
> +
> +	/* If common is there, these should be too... */
> +	notify = zxdh_pf_pci_find_capability(pdev, ZXDH_PCI_CAP_NOTIFY_CFG,
> +					     IORESOURCE_IO | IORESOURCE_MEM,
> +					     &pf_dev->modern_bars);
> +	if (notify == 0) {
> +		dev_err(dh_dev->device, "missing capabilities %i\n", notify);
> +		return -EINVAL;
> +	}
> +

Same again.

    Andrew

---
pw-bot: cr

^ permalink raw reply

* Re: [PATCH v27 4/5] sfc: obtain and map cxl range using devm_cxl_probe_mem
From: Dan Williams (nvidia) @ 2026-06-16 19:51 UTC (permalink / raw)
  To: Alejandro Lucero Palau, Dan Williams (nvidia),
	alejandro.lucero-palau, linux-cxl, netdev, edward.cree, davem,
	kuba, pabeni, edumazet, dave.jiang
In-Reply-To: <50d8e423-8248-4e26-901b-010d14d22e67@amd.com>

Alejandro Lucero Palau wrote:
> 
> On 6/10/26 14:56, Alejandro Lucero Palau wrote:
> >
> > On 6/10/26 07:10, Alejandro Lucero Palau wrote:
> >>
> >> On 6/10/26 00:30, Dan Williams (nvidia) wrote:
> >>> alejandro.lucero-palau@ wrote:
> >>>> From: Alejandro Lucero <alucerop@amd.com>
> >>>>
> >>>> Use core API for safely obtain the CXL range linked to an HDM 
> >>>> committed
> >>>> by the BIOS. Map such a range for being used as the ctpio buffer.
> >>>>
> >>>> A potential user space action through sysfs unbinding or core cxl
> >>>> modules remove will trigger sfc driver device detachment, with that 
> >>>> case
> >>>> not racing with this mapping as this is done during driver probe and
> >>>> therefore protected with device lock against those user space actions.
> >>>>
> >>>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> >>>> ---
> >>>>   drivers/net/ethernet/sfc/efx.c     |  1 +
> >>>>   drivers/net/ethernet/sfc/efx_cxl.c | 24 ++++++++++++++++++++++++
> >>>>   drivers/net/ethernet/sfc/efx_cxl.h |  3 +++
> >>>>   3 files changed, 28 insertions(+)
> >>>>
> >>>> diff --git a/drivers/net/ethernet/sfc/efx.c 
> >>>> b/drivers/net/ethernet/sfc/efx.c
> >>>> index 90ccbe310386..578054c21e79 100644
> >>>> --- a/drivers/net/ethernet/sfc/efx.c
> >>>> +++ b/drivers/net/ethernet/sfc/efx.c
> >>>> @@ -984,6 +984,7 @@ static void efx_pci_remove(struct pci_dev 
> >>>> *pci_dev)
> >>>>       efx_fini_io(efx);
> >>>>         probe_data = container_of(efx, struct efx_probe_data, efx);
> >>>> +    efx_cxl_exit(probe_data);
> >>>>         pci_dbg(efx->pci_dev, "shutdown successful\n");
> >>>>   diff --git a/drivers/net/ethernet/sfc/efx_cxl.c 
> >>>> b/drivers/net/ethernet/sfc/efx_cxl.c
> >>>> index 4d55c08cf2a1..d5766a40e2cf 100644
> >>>> --- a/drivers/net/ethernet/sfc/efx_cxl.c
> >>>> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
> >>>> @@ -18,6 +18,7 @@ int efx_cxl_init(struct efx_probe_data *probe_data)
> >>>>   {
> >>>>       struct efx_nic *efx = &probe_data->efx;
> >>>>       struct pci_dev *pci_dev = efx->pci_dev;
> >>>> +    struct range cxl_pio_range;
> >>>>       struct efx_cxl *cxl;
> >>>>       u16 dvsec;
> >>>>       int rc;
> >>>> @@ -75,9 +76,32 @@ int efx_cxl_init(struct efx_probe_data *probe_data)
> >>>>           return -ENODEV;
> >>>>       }
> >>>>   +    cxl->cxlmd = devm_cxl_probe_mem(&cxl->cxlds, &cxl_pio_range);
> >>>> +    if (IS_ERR(cxl->cxlmd)) {
> >>>> +        pci_err(pci_dev, "CXL accel memdev creation failed\n");
> >>>> +        return PTR_ERR(cxl->cxlmd);
> >>>> +    }
> >>>> +
> >>>> +    cxl->ctpio_cxl = ioremap_wc(cxl_pio_range.start,
> >>>> +                    range_len(&cxl_pio_range));
> >>>> +    if (!cxl->ctpio_cxl) {
> >>>> +        pci_err(pci_dev, "CXL ioremap region (%pra) failed\n",
> >>>> +            &cxl_pio_range);
> >>>> +        return -ENOMEM;
> >>> Dave caught the iounmap leak, but another concern is since you want to
> >>> continue operation if efx_cxl_init() fails then you probably also want
> >>> to release the successful attachment to the CXL domain if this happens.
> >>
> >>
> >> I will do that.
> >>
> >
> > Looking at this issue, I think an error when creating the memdev or 
> > during the region attach triggers the memdev removal, but ...
> >
> >
> >>
> >>> Minor since something else is likely to fail if ioremap is not 
> >>> reliable.
> >
> >
> > .. if we want to specifically do that with an unlikely (but possible) 
> > ioremap error something else needs to be exported like 
> > cxl_memdev_unregister(). Are you happy with that approach?
> >
> 
> I have just tested with this:
> 
> +void cxl_memdev_remove(void *_cxlmd)
> +{
> +       struct cxl_memdev *cxlmd = _cxlmd;
> +       struct device *dev = &cxlmd->dev;
> +
> +       devm_remove_action_nowarn(cxlmd->cxlds->dev, cxl_memdev_unregister,
> +                                 cxlmd);
> +
> +       cdev_device_del(&cxlmd->cdev, dev);
> +       cxl_memdev_shutdown(dev);
> +       put_device(dev);
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_memdev_remove, "CXL");
> 
> 
> only called if the ioremap fails.
> 
> 
> Please, let me know if you like this approach before sending another 
> version.

A devres group can automatically clean up after devm_cxl_probe_mem()
in the error path with no new exports needed from the CXL core.
Something like:

        void *group = devres_open_group(cxl->cxlds.dev, NULL, GFP_KERNEL);
        int rc = 0;

        if (!group)
                return -ENOMEM;
        
        cxl->cxlmd = devm_cxl_probe_mem(&cxl->cxlds, &cxl_pio_range);
        if (IS_ERR(cxl->cxlmd)) {
                pci_err(pci_dev, "CXL accel memdev creation failed\n");
                rc = PTR_ERR(cxl->cxlmd);
                goto out;
        }

        cxl->ctpio_cxl =
                ioremap_wc(cxl_pio_range.start, range_len(&cxl_pio_range));
        if (!cxl->ctpio_cxl) {
                pci_err(pci_dev, "CXL ioremap region (%pra) failed\n",
                        &cxl_pio_range);
                rc = -ENOMEM;
        }

out:
        if (rc)
                devres_release_group(group);
        else
                devres_remove_group(group);
        return rc;
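
For readers unfamiliar with devres groups, the behaviour relied on above can
be pictured with a small userspace model. This is only a sketch under stated
assumptions: every model_* name is hypothetical and none of it is the kernel's
devres implementation. It shows that releasing a group undoes, newest first,
just the actions registered after the group was opened, while removing a group
leaves those actions in place for normal driver detach.

```c
#include <stddef.h>

#define MODEL_MAX_ACTIONS 16

typedef void (*model_action_fn)(void *data);

/* Hypothetical per-device list of cleanup actions (not struct device). */
struct model_devres {
	model_action_fn fn[MODEL_MAX_ACTIONS];
	void *data[MODEL_MAX_ACTIONS];
	size_t count;
};

/* Open a group: remember how many actions existed at this point. */
static size_t model_open_group(const struct model_devres *dr)
{
	return dr->count;
}

/* Register a cleanup action, as a devm_*() helper would do internally. */
static int model_add_action(struct model_devres *dr, model_action_fn fn,
			    void *data)
{
	if (dr->count == MODEL_MAX_ACTIONS)
		return -1;
	dr->fn[dr->count] = fn;
	dr->data[dr->count] = data;
	dr->count++;
	return 0;
}

/* Error path: run and drop everything registered since the mark. */
static void model_release_group(struct model_devres *dr, size_t mark)
{
	while (dr->count > mark) {
		dr->count--;
		dr->fn[dr->count](dr->data[dr->count]);
	}
}

/* Success path: forget the mark; the actions stay registered. */
static void model_remove_group(struct model_devres *dr, size_t mark)
{
	(void)dr;
	(void)mark;
}

/* Example action: count how many times cleanup ran. */
static void model_count_release(void *data)
{
	(*(int *)data)++;
}
```

In this model the ioremap failure case corresponds to calling
model_release_group() with the mark taken before the memdev was probed, so
only the memdev-related actions are undone and earlier ones survive.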

^ permalink raw reply

* Landlock: LANDLOCK_ACCESS_NET_CONNECT_TCP bypass via TCP Fast Open
From: Bryam Vargas @ 2026-06-16 20:16 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Günther Noack, Matthieu Buffet, Paul Moore, Eric Dumazet,
	Neal Cardwell, linux-security-module, netdev, linux-kernel

Hello Mickaël, and Landlock folks,

A task confined by a Landlock ruleset that handles
LANDLOCK_ACCESS_NET_CONNECT_TCP and is denied connecting to a given port can
still establish a TCP connection to that port by using TCP Fast Open, i.e.
sendto(fd, ..., MSG_FASTOPEN, &dst, dstlen) on a fresh stream socket. The
network-egress confinement for TCP connect is silently bypassed.

Affected
--------
Any kernel with CONFIG_SECURITY_LANDLOCK=y and Landlock enabled that supports
the TCP network access rights (Landlock ABI >= 4, since Linux 6.7). Confirmed by
source inspection on mainline (v7.1-rc7) and reproduced on Linux 7.0.11
(Landlock ABI 8). No CONFIG beyond Landlock + IPv4/IPv6 TCP; TCP Fast Open client
is enabled by the per-netns default (net.ipv4.tcp_fastopen has TFO_CLIENT_ENABLE
set), so no sysctl change and no setsockopt are required.

Root cause
----------
LANDLOCK_ACCESS_NET_CONNECT_TCP is enforced only by the socket_connect LSM hook
(hook_socket_connect -> current_check_access_socket). security_socket_connect()
has exactly one call site in the tree, net/socket.c (the connect(2) syscall).

TCP Fast Open performs an implicit connect inside sendmsg:

  tcp_sendmsg_locked()            net/ipv4/tcp.c  (MSG_FASTOPEN branch)
   -> tcp_sendmsg_fastopen()      net/ipv4/tcp.c
   -> __inet_stream_connect(..., is_sendmsg=1)  net/ipv4/af_inet.c
   -> sk->sk_prot->connect()      net/ipv4/af_inet.c  -> tcp_v4_connect()

This path establishes the connection to the address taken from msg_name but
never calls security_socket_connect(). The only LSM hook fired on the sendmsg
path is security_socket_sendmsg(), and Landlock registers no socket_sendmsg
hook, so LANDLOCK_ACCESS_NET_CONNECT_TCP is never re-checked. __inet_stream_connect()
itself carries no LSM hook (only the cgroup-BPF pre_connect, a different
mechanism).

Notably the kernel already mediates the analogous AF_UNIX implicit-connect on the
send path via the unix_may_send hook, which Landlock does register
(hook_unix_may_send) -- so the sendmsg-implies-connect pattern is recognized, but
the TCP Fast Open case has no equivalent coverage. The MPTCP fast-open path
(mptcp_sendmsg_fastopen -> __inet_stream_connect) is a second producer of the
same unmediated connect (by source inspection; not separately reproduced).

Reproducer
----------
A self-contained, fully unprivileged PoC is available on request. It forks an
unconfined TFO-capable loopback listener, then in a child applies a Landlock
ruleset handling LANDLOCK_ACCESS_NET_CONNECT_TCP with no allow rule
(landlock_create_ruleset() with handled_access_net =
LANDLOCK_ACCESS_NET_CONNECT_TCP, no landlock_add_rule(), then
landlock_restrict_self(); every TCP connect is denied) and tries the forbidden
port two ways:

  (1) connect(fd, &dst)                 -> -EACCES   (Landlock enforces CONNECT_TCP)
  (2) sendto(fd2, buf, len, MSG_FASTOPEN, &dst, dstlen)
                                        -> succeeds; the listener accepts the
                                           connection and reads the payload.

Observed on Linux 7.0.11 (Landlock ABI 8):

  [1] connect(2)            -> ret=-1 errno=13 (Permission denied)
  [2] sendto(MSG_FASTOPEN)  -> ret=14 errno=0 (OK/queued)
  [+] listener ACCEPTED the confined child's connection; payload="..."

connect(2) to the port is denied while sendto(MSG_FASTOPEN) reaches the identical
port and delivers data.

Impact
------
A sandbox that uses LANDLOCK_ACCESS_NET_CONNECT_TCP to restrict outbound TCP
(e.g. to keep a confined component from reaching an internal service or a
metadata endpoint) can be escaped by an unprivileged, self-confined task with no
CAP and no namespace transition -- for any destination port, since the
implicit-connect path never consults the connect hook regardless of address (the
run above shows one port). It is an integrity bypass of the network-confinement
property; no memory safety is involved.
I score it CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:C/C:N/I:H/A:N (6.5 Medium) -- the
confined task escapes the policy authority that defined its sandbox, a scope
change; 5.5 if you treat the Landlock boundary as the same authority (S:U).

Note on the in-flight UDP series
--------------------------------
The "landlock: Add UDP access control support" series (v5, Matthieu Buffet,
https://lore.kernel.org/r/20260611162107.49278-3-matthieu@buffet.re) adds a
socket_sendmsg hook, hook_socket_sendmsg(), but it returns 0 for non-UDP
sockets:

    if (sk_is_udp(sock->sk))
            access_request = LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP;
    else
            return 0;

so a TCP socket using MSG_FASTOPEN still bypasses LANDLOCK_ACCESS_NET_CONNECT_TCP
even after that series lands. It may be most convenient to fix this there.

Suggested direction
-------------------
Re-check LANDLOCK_ACCESS_NET_CONNECT_TCP on the implicit-connect path: either have
the socket_sendmsg hook evaluate CONNECT_TCP for stream sockets when the call
performs an implicit connect (mirroring the AF_UNIX unix_may_send handling), or
place the check inside __inet_stream_connect() so a single chokepoint covers
connect(2), TCP Fast Open, and the MPTCP fast-open sibling.
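
The first option can be pictured with a small userspace model of the hook's
decision. This is a sketch only, not a patch: every MODEL_* constant and
model_* name is hypothetical, the values are not the kernel's, and the
"MSG_FASTOPEN plus a destination address" test is an assumed simplification of
the real implicit-connect condition.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the real flag and access bits. */
#define MODEL_MSG_FASTOPEN            0x1
#define MODEL_ACCESS_NONE             0x0
#define MODEL_ACCESS_CONNECT_TCP      0x2
#define MODEL_ACCESS_CONNECT_SEND_UDP 0x4

enum model_sk_kind { MODEL_SK_UDP, MODEL_SK_TCP, MODEL_SK_OTHER };

/*
 * Which access right a sendmsg-time check would evaluate.
 * MODEL_ACCESS_NONE means the hook returns 0 without a check.
 */
static unsigned int model_sendmsg_access(enum model_sk_kind kind,
					 int msg_flags, bool has_dest)
{
	if (kind == MODEL_SK_UDP)
		return MODEL_ACCESS_CONNECT_SEND_UDP;

	/* Assumed condition for "this send performs an implicit connect". */
	if (kind == MODEL_SK_TCP && (msg_flags & MODEL_MSG_FASTOPEN) &&
	    has_dest)
		return MODEL_ACCESS_CONNECT_TCP;

	return MODEL_ACCESS_NONE;
}
```

An ordinary send on an already connected TCP socket still maps to no check in
this model, so only the implicit-connect case gains a CONNECT_TCP evaluation.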

I am happy to send a patch for this if you would like me to.

Best regards,

Bryam Vargas
Independent security researcher, HEXLAB S.A.S., Cali, Colombia
hexlabsecurity@proton.me


^ permalink raw reply

* Re: [PATCH net-next 2/2] udp: convert udp_lib_getsockopt to sockopt_t
From: David Laight @ 2026-06-16 20:16 UTC (permalink / raw)
  To: Breno Leitao
  Cc: Stanislav Fomichev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Willem de Bruijn, Shuah Khan, netdev,
	linux-kernel, linux-kselftest, kernel-team
In-Reply-To: <ajF4Odi_L28LdIXC@gmail.com>

On Tue, 16 Jun 2026 09:22:52 -0700
Breno Leitao <leitao@debian.org> wrote:

> On Fri, Jun 12, 2026 at 07:10:15PM -0700, Stanislav Fomichev wrote:
> > On 06/12, Breno Leitao wrote:  
> 
> > >  int udp_lib_getsockopt(struct sock *sk, int level, int optname,
> > > -		       char __user *optval, int __user *optlen)
> > > +		       sockopt_t *opt)
> > >  {
> > >  	struct udp_sock *up = udp_sk(sk);
> > >  	int val, len;
> > >  
> > > -	if (get_user(len, optlen))
> > > -		return -EFAULT;  
> > 
> > [..]
> >   
> > > -	if (len < 0)
> > > -		return -EINVAL;  
> > 
> > I see this part now in sockopt_init_user, but you mention that it's a
> > transitional helper. When we drop it, will we lose this <0 check?
> > Maybe keep `if ((int)opt->optlen < 0)` here for backwards
> > compatibility?  
> 
> Good idea. I will do it and respin (once net-next reopens).

The best place for the negative length check is in the syscall wrapper code.
Pass an unsigned length through to all the protocol code.
No need to require every function to do the test.

Note that the length check was actually broken in many protocols
going way back well before git.
There has pretty much always been an unsigned min() check that converted
negative values to small(ish) positive ones before the check for it being
negative.
(That predates min() being a #define.)
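
A userspace sketch of that ordering problem, assuming a 4-byte int option.
The function names are hypothetical and this is not taken from any protocol's
getsockopt code; it only shows why a negative-length test placed after an
unsigned clamp can never fire.

```c
#include <errno.h>

/*
 * Legacy-style ordering: clamp as unsigned first, test for negative after.
 * A negative user length turns into a huge unsigned value, gets clamped to
 * sizeof(int), and the later "< 0" test is dead code.
 */
static int model_legacy_optlen(int user_len)
{
	unsigned int len = user_len;	/* -1 becomes UINT_MAX */

	if (len > sizeof(int))
		len = sizeof(int);
	if ((int)len < 0)		/* can no longer be true */
		return -EINVAL;
	return (int)len;
}

/* Check-first ordering: the negative length is rejected before any clamp. */
static int model_checked_optlen(int user_len)
{
	if (user_len < 0)
		return -EINVAL;
	if ((unsigned int)user_len > sizeof(int))
		return (int)sizeof(int);
	return user_len;
}
```

With these definitions a caller passing -1 gets a 4-byte result from the
legacy ordering and -EINVAL from the check-first ordering.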

The recent change to actually reject optlen < 0 might have broken
some applications that passed uninitialised stack that was always negative!

-- David

> 
> Thanks for the review,
> --breno
> 

^ permalink raw reply

* Re: [PATCH] ice: retry reading NVM if admin queue returns EBUSY
From: kernel test robot @ 2026-06-16 20:18 UTC (permalink / raw)
  To: Robert Malz, anthony.l.nguyen, przemyslaw.kitszel
  Cc: oe-kbuild-all, intel-wired-lan, netdev
In-Reply-To: <20260616104521.1545053-1-robert.malz@canonical.com>

Hi Robert,

kernel test robot noticed the following build errors:

[auto build test ERROR on tnguy-next-queue/dev-queue]
[also build test ERROR on tnguy-net-queue/dev-queue linus/master v7.1 next-20260616]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Robert-Malz/ice-retry-reading-NVM-if-admin-queue-returns-EBUSY/20260616-185349
base:   https://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue.git dev-queue
patch link:    https://lore.kernel.org/r/20260616104521.1545053-1-robert.malz%40canonical.com
patch subject: [PATCH] ice: retry reading NVM if admin queue returns EBUSY
config: x86_64-rhel-9.4 (https://download.01.org/0day-ci/archive/20260616/202606162237.EIrFZKip-lkp@intel.com/config)
compiler: gcc-14 (Debian 14.2.0-19) 14.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260616/202606162237.EIrFZKip-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202606162237.EIrFZKip-lkp@intel.com/

All errors (new ones prefixed by >>):

   drivers/net/ethernet/intel/ice/ice_nvm.c: In function 'ice_read_flat_nvm':
>> drivers/net/ethernet/intel/ice/ice_nvm.c:101:58: error: 'ICE_AQ_RC_EBUSY' undeclared (first use in this function); did you mean 'LIBIE_AQ_RC_EBUSY'?
     101 |                         if (hw->adminq.sq_last_status != ICE_AQ_RC_EBUSY ||
         |                                                          ^~~~~~~~~~~~~~~
         |                                                          LIBIE_AQ_RC_EBUSY
   drivers/net/ethernet/intel/ice/ice_nvm.c:101:58: note: each undeclared identifier is reported only once for each function it appears in


vim +101 drivers/net/ethernet/intel/ice/ice_nvm.c

    48	
    49	/**
    50	 * ice_read_flat_nvm - Read portion of NVM by flat offset
    51	 * @hw: pointer to the HW struct
    52	 * @offset: offset from beginning of NVM
    53	 * @length: (in) number of bytes to read; (out) number of bytes actually read
    54	 * @data: buffer to return data in (sized to fit the specified length)
    55	 * @read_shadow_ram: if true, read from shadow RAM instead of NVM
    56	 *
    57	 * Reads a portion of the NVM, as a flat memory space. This function correctly
    58	 * breaks read requests across Shadow RAM sectors and ensures that no single
    59	 * read request exceeds the maximum 4KB read for a single AdminQ command.
    60	 *
    61	 * Returns a status code on failure. Note that the data pointer may be
    62	 * partially updated if some reads succeed before a failure.
    63	 */
    64	int
    65	ice_read_flat_nvm(struct ice_hw *hw, u32 offset, u32 *length, u8 *data,
    66			  bool read_shadow_ram)
    67	{
    68		u32 inlen = *length;
    69		u32 bytes_read = 0;
    70		int retry_cnt = 0;
    71		bool last_cmd;
    72		int status;
    73	
    74		*length = 0;
    75	
    76		/* Verify the length of the read if this is for the Shadow RAM */
    77		if (read_shadow_ram && ((offset + inlen) > (hw->flash.sr_words * 2u))) {
    78			ice_debug(hw, ICE_DBG_NVM, "NVM error: requested offset is beyond Shadow RAM limit\n");
    79			return -EINVAL;
    80		}
    81	
    82		do {
    83			u32 read_size, sector_offset;
    84	
    85			/* ice_aq_read_nvm cannot read more than 4KB at a time.
    86			 * Additionally, a read from the Shadow RAM may not cross over
    87			 * a sector boundary. Conveniently, the sector size is also
    88			 * 4KB.
    89			 */
    90			sector_offset = offset % ICE_AQ_MAX_BUF_LEN;
    91			read_size = min_t(u32, ICE_AQ_MAX_BUF_LEN - sector_offset,
    92					  inlen - bytes_read);
    93	
    94			last_cmd = !(bytes_read + read_size < inlen);
    95	
    96			status = ice_aq_read_nvm(hw, ICE_AQC_NVM_START_POINT,
    97						 offset, read_size,
    98						 data + bytes_read, last_cmd,
    99						 read_shadow_ram, NULL);
   100			if (status) {
 > 101				if (hw->adminq.sq_last_status != ICE_AQ_RC_EBUSY ||
   102				    retry_cnt > ICE_SQ_SEND_MAX_EXECUTE)
   103					break;
   104				ice_debug(hw, ICE_DBG_NVM,
   105					  "NVM read EBUSY error, retry %d\n",
   106					  retry_cnt + 1);
   107				last_cmd = false;
   108				ice_release_nvm(hw);
   109				msleep(ICE_SQ_SEND_DELAY_TIME_MS);
   110				status = ice_acquire_nvm(hw, ICE_RES_READ);
   111				if (status)
   112					break;
   113				retry_cnt++;
   114			} else {
   115				bytes_read += read_size;
   116				offset += read_size;
   117				retry_cnt = 0;
   118			}
   119		} while (!last_cmd);
   120	
   121		*length = bytes_read;
   122		return status;
   123	}
   124	

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply

* [PATCH] net: faraday: ftmac100: convert to devm resource management
From: Jack Lee @ 2026-06-16 20:32 UTC (permalink / raw)
  To: davem, kuba
  Cc: andrew+netdev, edumazet, pabeni, netdev, linux-kernel, Jack Lee

Replace manual resource management with device-managed alternatives:
- alloc_etherdev() -> devm_alloc_etherdev()
- request_mem_region() + ioremap() -> devm_platform_ioremap_resource()

This simplifies error handling by removing manual cleanup in error
paths and the remove function, and eliminates the risk of resource
leaks.

Signed-off-by: Jack Lee <skunkolee@gmail.com>
---
 drivers/net/ethernet/faraday/ftmac100.c | 47 +++++--------------------
 1 file changed, 9 insertions(+), 38 deletions(-)

diff --git a/drivers/net/ethernet/faraday/ftmac100.c b/drivers/net/ethernet/faraday/ftmac100.c
index 5803a382f0ba..adb318925f44 100644
--- a/drivers/net/ethernet/faraday/ftmac100.c
+++ b/drivers/net/ethernet/faraday/ftmac100.c
@@ -49,7 +49,6 @@ struct ftmac100_descs {
 };
 
 struct ftmac100 {
-	struct resource *res;
 	void __iomem *base;
 	int irq;
 
@@ -1137,11 +1136,9 @@ static int ftmac100_probe(struct platform_device *pdev)
 		return irq;
 
 	/* setup net_device */
-	netdev = alloc_etherdev(sizeof(*priv));
-	if (!netdev) {
-		err = -ENOMEM;
-		goto err_alloc_etherdev;
-	}
+	netdev = devm_alloc_etherdev(&pdev->dev, sizeof(*priv));
+	if (!netdev)
+		return -ENOMEM;
 
 	SET_NETDEV_DEV(netdev, &pdev->dev);
 	netdev->ethtool_ops = &ftmac100_ethtool_ops;
@@ -1150,7 +1147,7 @@ static int ftmac100_probe(struct platform_device *pdev)
 
 	err = platform_get_ethdev_address(&pdev->dev, netdev);
 	if (err == -EPROBE_DEFER)
-		goto defer_get_mac;
+		return err;
 
 	platform_set_drvdata(pdev, netdev);
 
@@ -1165,20 +1162,9 @@ static int ftmac100_probe(struct platform_device *pdev)
 	netif_napi_add(netdev, &priv->napi, ftmac100_poll);
 
 	/* map io memory */
-	priv->res = request_mem_region(res->start, resource_size(res),
-				       dev_name(&pdev->dev));
-	if (!priv->res) {
-		dev_err(&pdev->dev, "Could not reserve memory region\n");
-		err = -ENOMEM;
-		goto err_req_mem;
-	}
-
-	priv->base = ioremap(res->start, resource_size(res));
-	if (!priv->base) {
-		dev_err(&pdev->dev, "Failed to ioremap ethernet registers\n");
-		err = -EIO;
-		goto err_ioremap;
-	}
+	priv->base = devm_platform_ioremap_resource(pdev, 0);
+	if (IS_ERR(priv->base))
+		return PTR_ERR(priv->base);
 
 	priv->irq = irq;
 
@@ -1208,32 +1194,17 @@ static int ftmac100_probe(struct platform_device *pdev)
 	return 0;
 
 err_register_netdev:
-	iounmap(priv->base);
-err_ioremap:
-	release_resource(priv->res);
-err_req_mem:
 	netif_napi_del(&priv->napi);
-defer_get_mac:
-	free_netdev(netdev);
-err_alloc_etherdev:
 	return err;
 }
 
 static void ftmac100_remove(struct platform_device *pdev)
 {
-	struct net_device *netdev;
-	struct ftmac100 *priv;
-
-	netdev = platform_get_drvdata(pdev);
-	priv = netdev_priv(netdev);
+	struct net_device *netdev = platform_get_drvdata(pdev);
+	struct ftmac100 *priv = netdev_priv(netdev);
 
 	unregister_netdev(netdev);
-
-	iounmap(priv->base);
-	release_resource(priv->res);
-
 	netif_napi_del(&priv->napi);
-	free_netdev(netdev);
 }
 
 static const struct of_device_id ftmac100_of_ids[] = {
-- 
2.54.0


^ permalink raw reply related

* Re: [PATCH net v2 1/2] iov_iter: export iov_iter_restore
From: Jens Axboe @ 2026-06-16 20:47 UTC (permalink / raw)
  To: Octavian Purdila, netdev
  Cc: Alexander Viro, Andrew Morton, Arseniy Krasnov, David S. Miller,
	Eric Dumazet, Eugenio Pérez, Jakub Kicinski, Jason Wang, kvm,
	linux-block, linux-fsdevel, linux-kernel, Michael S. Tsirkin,
	Paolo Abeni, Simon Horman, Stefan Hajnoczi, Stefano Garzarella,
	virtualization, Xuan Zhuo
In-Reply-To: <20260613000953.467473-2-tavip@google.com>

On 6/12/26 6:09 PM, Octavian Purdila wrote:
> Export iov_iter_restore so that it can be used by modules.
> 
> This is needed by the virtio vsock transport (which can be built as a
> module) to restore the msg_iter state when transmission fails.
> 
> Signed-off-by: Octavian Purdila <tavip@google.com>
> ---
>  lib/iov_iter.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/lib/iov_iter.c b/lib/iov_iter.c
> index 243662af1af73..067e745f9ef53 100644
> --- a/lib/iov_iter.c
> +++ b/lib/iov_iter.c
> @@ -1491,6 +1491,7 @@ void iov_iter_restore(struct iov_iter *i, struct iov_iter_state *state)
>  		i->__iov -= state->nr_segs - i->nr_segs;
>  	i->nr_segs = state->nr_segs;
>  }
> +EXPORT_SYMBOL(iov_iter_restore);

I don't have a problem exporting this to modules, but any new export
should be _GPL. So please change it to that.

-- 
Jens Axboe

^ permalink raw reply
