Netdev List

* [PATCH net] tipc: fix UAF in cleanup_bearer() due to premature dst_cache_destroy()
From: Eric Dumazet @ 2026-06-22 17:10 UTC
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: Simon Horman, netdev, eric.dumazet, Eric Dumazet,
	syzbot+e14bc5d4942756023b77, Xin Long, Jon Maloy

TIPC UDP media bearer teardown calls dst_cache_destroy() on its
replicast caches before calling synchronize_net() to wait for
concurrent RCU readers (transmitters) to finish:

static void cleanup_bearer(struct work_struct *work)
{
...
	list_for_each_entry_safe(rcast, tmp, &ub->rcast.list, list) {
		dst_cache_destroy(&rcast->dst_cache);
		list_del_rcu(&rcast->list);
		kfree_rcu(rcast, rcu);
	}
...
	dst_cache_destroy(&ub->rcast.dst_cache);
	udp_tunnel_sock_release(ub->sk);
	synchronize_net();
...
}

This is highly buggy because dst_cache_destroy() immediately frees the
per-CPU cache memory (free_percpu()) and releases the cached dst
entries without any synchronization.

If a concurrent transmitter (e.g., tipc_udp_xmit()) is running on another
CPU under RCU protection, it can call dst_cache_get() concurrently,
leading to:
1. Use-After-Free on the per-CPU cache pointer itself (crash).
2. "rcuref - imbalanced put()" warning if it attempts to release a
   dst that was concurrently released by dst_cache_destroy().

Furthermore, calling kfree(ub) immediately after synchronize_net() without
closing the socket first (or waiting after closing it) leaves a window
where a concurrent receiver (tipc_udp_recv()) could start after
synchronize_net(), access ub, and suffer a UAF when kfree(ub) runs.

To fix this, we must defer dst_cache_destroy() and kfree(ub) until after
we have ensured that no more readers can see the bearer/socket and all
existing readers have finished:

1. Move the rcast entries from the public list to a private list
   and delete them using list_del_rcu() (stops new transmit readers).
2. Release the bearer socket using udp_tunnel_sock_release() (stops
   new receive readers).
3. Call synchronize_net() to wait for all outstanding RCU readers
   (both transmit and receive) to finish.
4. Now that it is safe, call dst_cache_destroy() on all isolated
   rcast entries and the main bearer cache, and free the memory.

Fixes: e9c1a793210f ("tipc: add dst_cache support for udp media")
Reported-by: syzbot+e14bc5d4942756023b77@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/6a396a66.52ae72c2.136ac7.0003.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Xin Long <lucien.xin@gmail.com>
Cc: Jon Maloy <jon.maloy@ericsson.com>
---
 net/tipc/udp_media.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/net/tipc/udp_media.c b/net/tipc/udp_media.c
index 988b8a7f953ad6da860e6190f1f244650f121dce..befaf7137caf642462b7203a2429a60386e64db8 100644
--- a/net/tipc/udp_media.c
+++ b/net/tipc/udp_media.c
@@ -808,21 +808,26 @@ static void cleanup_bearer(struct work_struct *work)
 {
 	struct udp_bearer *ub = container_of(work, struct udp_bearer, work);
 	struct udp_replicast *rcast, *tmp;
+	LIST_HEAD(private_list);
 	struct tipc_net *tn;

 	list_for_each_entry_safe(rcast, tmp, &ub->rcast.list, list) {
-		dst_cache_destroy(&rcast->dst_cache);
 		list_del_rcu(&rcast->list);
-		kfree_rcu(rcast, rcu);
+		list_add(&rcast->list, &private_list);
 	}

 	tn = tipc_net(sock_net(ub->sk));

-	dst_cache_destroy(&ub->rcast.dst_cache);
 	udp_tunnel_sock_release(ub->sk);

-	/* Note: could use a call_rcu() to avoid another synchronize_net() */
 	synchronize_net();
+
+	list_for_each_entry_safe(rcast, tmp, &private_list, list) {
+		dst_cache_destroy(&rcast->dst_cache);
+		kfree(rcast);
+	}
+
+	dst_cache_destroy(&ub->rcast.dst_cache);
 	atomic_dec(&tn->wq_count);
 	kfree(ub);
 }
-- 
2.55.0.rc0.799.gd6f94ed593-goog
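
The teardown ordering described in the commit message above (unpublish, wait for readers, then destroy) can be illustrated outside the kernel. What follows is a single-threaded userspace sketch under stated assumptions, not the TIPC code: `struct entry`, `struct bearer`, `bearer_init()`, `unpublish_all()`, `wait_for_readers()` and `destroy_all()` are hypothetical names, `wait_for_readers()` merely stands in for synchronize_net(), and the socket release step is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a replicast entry and its dst cache. */
struct entry {
	struct entry *next;
	int *cache;			/* stands in for the per-CPU dst cache */
};

struct bearer {
	struct entry *public_list;	/* what transmitters can reach */
	int readers_in_flight;		/* readers that started before unpublish */
};

/* Build a bearer with n published entries. */
void bearer_init(struct bearer *b, int n)
{
	b->public_list = NULL;
	b->readers_in_flight = 0;
	while (n-- > 0) {
		struct entry *e = calloc(1, sizeof(*e));

		assert(e != NULL);
		e->cache = calloc(1, sizeof(*e->cache));
		assert(e->cache != NULL);
		e->next = b->public_list;
		b->public_list = e;
	}
}

/* Unlink every entry from the public list and hand back a private list.
 * New readers can no longer find the entries; the caches stay untouched.
 */
struct entry *unpublish_all(struct bearer *b)
{
	struct entry *priv = b->public_list;

	b->public_list = NULL;
	return priv;
}

/* Stand-in for synchronize_net(): in this model it only drains the counter. */
void wait_for_readers(struct bearer *b)
{
	b->readers_in_flight = 0;
}

/* Free caches and entries. Only legal once no reader is in flight.
 * Returns the number of entries freed.
 */
int destroy_all(struct bearer *b, struct entry *priv)
{
	int freed = 0;

	assert(b->readers_in_flight == 0);
	while (priv) {
		struct entry *next = priv->next;

		free(priv->cache);
		free(priv);
		priv = next;
		freed++;
	}
	return freed;
}
```

In this model, calling destroy_all() before wait_for_readers() trips the assertion whenever a reader is still counted as in flight, which is the ordering mistake the patch removes.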


* Re: [PATCH] net: sungem: fix probe error cleanup
From: Simon Horman @ 2026-06-22 17:02 UTC
  To: Ruoyu Wang
  Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, netdev, linux-kernel
In-Reply-To: <20260620155326.80582-1-ruoyuw560@gmail.com>

On Sat, Jun 20, 2026 at 11:53:26PM +0800, Ruoyu Wang wrote:
> gem_init_one() calls gem_remove_one() when register_netdev() fails.
> That path unregisters and frees resources owned by the net_device,
> then probe continues into its own cleanup labels and touches the same
> state again.
> 
> Clear the driver data and remove the NAPI instance on this error path,
> then let the existing probe cleanup labels release the resources once.

Hi Ruoyu,

I think it would be useful to explain how this problem was found,
naming any publicly available tools that were used.

And to explain what testing has occurred.

> 

A fixes tag should go here (no blank line between it and your
Signed-off-by line).
> Signed-off-by: Ruoyu Wang <ruoyuw560@gmail.com>

...

-- 
pw-bot: changes-requested


* [syzbot] [net?] WARNING in rcuref_put_slowpath
From: syzbot @ 2026-06-22 17:01 UTC
  To: davem, edumazet, horms, kuba, linux-kernel, netdev, pabeni,
	syzkaller-bugs

Hello,

syzbot found the following issue on:

HEAD commit:    4fa3f5fabb30 Add linux-next specific files for 20260616
git tree:       linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=1427abd2580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=6c414e1864e61ef6
dashboard link: https://syzkaller.appspot.com/bug?extid=e14bc5d4942756023b77
compiler:       Debian clang version 22.1.6 (++20260514074242+fc4aad7b5db3-1~exp1~20260514074407.73), Debian LLD 22.1.6

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/bf5b803a695d/disk-4fa3f5fa.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/47871e7c589e/vmlinux-4fa3f5fa.xz
kernel image: https://storage.googleapis.com/syzbot-assets/53cd9ef32a2b/bzImage-4fa3f5fa.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+e14bc5d4942756023b77@syzkaller.appspotmail.com

------------[ cut here ]------------
rcuref - imbalanced put()
WARNING: lib/rcuref.c:266 at rcuref_put_slowpath+0x16e/0x1d0 lib/rcuref.c:266, CPU#0: ktimers/0/16
Modules linked in:
CPU: 0 UID: 0 PID: 16 Comm: ktimers/0 Not tainted syzkaller #0 PREEMPT_{RT,(full)} 
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/09/2026
RIP: 0010:rcuref_put_slowpath+0x16e/0x1d0 lib/rcuref.c:266
Code: c1 e8 03 42 0f b6 04 38 84 c0 75 48 c7 03 00 00 00 a0 31 c0 e9 6d ff ff ff e8 fe 13 80 06 e8 49 d7 14 fd 48 8d 3d 62 f3 e6 0a <67> 48 0f b9 3a 48 89 df be 04 00 00 00 e8 40 b3 80 fd 48 89 d8 48
RSP: 0018:ffffc90000157560 EFLAGS: 00010246
RAX: ffffffff84b04017 RBX: ffff888025328740 RCX: ffff88801d680000
RDX: 0000000000000100 RSI: 0000000000000000 RDI: ffffffff8f973380
RBP: ffffc900001575e8 R08: 0000000000000000 R09: 0000000000000100
R10: dffffc0000000000 R11: ffffed1004a650e9 R12: 1ffff9200002aeac
R13: dffffc0000000000 R14: 00000000dfffffff R15: dffffc0000000000
FS:  0000000000000000(0000) GS:ffff888125ed3000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f67d5f2c558 CR3: 0000000026bb4000 CR4: 00000000003526f0
Call Trace:
 <TASK>
 __rcuref_put include/linux/rcuref.h:117 [inline]
 rcuref_put+0x15b/0x170 include/linux/rcuref.h:173
 dst_release+0x31/0x1b0 net/core/dst.c:251
 dst_cache_per_cpu_get+0x25a/0x2d0 net/core/dst_cache.c:57
 dst_cache_get+0x10d/0x1e0 net/core/dst_cache.c:75
 tipc_udp_xmit+0xcf/0xb30 net/tipc/udp_media.c:177
 tipc_bearer_xmit_skb+0x2b3/0x400 net/tipc/bearer.c:576
 tipc_disc_timeout+0x642/0x790 net/tipc/discover.c:338
 call_timer_fn+0x192/0x5e0 kernel/time/timer.c:1748
 expire_timers kernel/time/timer.c:1799 [inline]
 __run_timers kernel/time/timer.c:2374 [inline]
 __run_timer_base+0x67b/0x9b0 kernel/time/timer.c:2386
 run_timer_base kernel/time/timer.c:2395 [inline]
 run_timer_softirq+0xb7/0x170 kernel/time/timer.c:2405
 handle_softirqs+0x1d9/0x6c0 kernel/softirq.c:626
 __do_softirq kernel/softirq.c:660 [inline]
 run_ktimerd+0x69/0x100 kernel/softirq.c:1155
 smpboot_thread_fn+0x57c/0xa80 kernel/smpboot.c:160
 kthread+0x388/0x470 kernel/kthread.c:436
 ret_from_fork+0x514/0xb70 arch/x86/kernel/process.c:158
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
 </TASK>
----------------
Code disassembly (best guess):
   0:	c1 e8 03             	shr    $0x3,%eax
   3:	42 0f b6 04 38       	movzbl (%rax,%r15,1),%eax
   8:	84 c0                	test   %al,%al
   a:	75 48                	jne    0x54
   c:	c7 03 00 00 00 a0    	movl   $0xa0000000,(%rbx)
  12:	31 c0                	xor    %eax,%eax
  14:	e9 6d ff ff ff       	jmp    0xffffff86
  19:	e8 fe 13 80 06       	call   0x680141c
  1e:	e8 49 d7 14 fd       	call   0xfd14d76c
  23:	48 8d 3d 62 f3 e6 0a 	lea    0xae6f362(%rip),%rdi        # 0xae6f38c
* 2a:	67 48 0f b9 3a       	ud1    (%edx),%rdi <-- trapping instruction
  2f:	48 89 df             	mov    %rbx,%rdi
  32:	be 04 00 00 00       	mov    $0x4,%esi
  37:	e8 40 b3 80 fd       	call   0xfd80b37c
  3c:	48 89 d8             	mov    %rbx,%rax
  3f:	48                   	rex.W


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup


* Re: [PATCH net] sctp: fix err_chunk memory leaks in INIT handling
From: Simon Horman @ 2026-06-22 16:54 UTC
  To: Xin Long
  Cc: network dev, linux-sctp, davem, kuba, Eric Dumazet, Paolo Abeni,
	Marcelo Ricardo Leitner
In-Reply-To: <0656704f1b0158287c98aec09ba36c83e4a537ab.1781970534.git.lucien.xin@gmail.com>

On Sat, Jun 20, 2026 at 11:48:54AM -0400, Xin Long wrote:
> When sctp_verify_init() encounters unrecognized parameters, it allocates an
> err_chunk to report them. However, this chunk is leaked in several code
> paths:
> 
> 1. In sctp_sf_do_5_1B_init(), if security_sctp_assoc_request() fails after
>    sctp_verify_init() has populated err_chunk, the function returns
>    immediately without freeing it.
> 
> 2. In sctp_sf_do_unexpected_init(), the same leak occurs on the
>    security_sctp_assoc_request() failure path.
> 
> 3. In sctp_sf_do_unexpected_init(), on the success path after copying
>    unrecognized parameters to the INIT-ACK, the function returns without
>    freeing err_chunk, unlike sctp_sf_do_5_1B_init() which properly frees
>    it.
> 
> Fix all three leaks by adding sctp_chunk_free(err_chunk) calls before
> returning in the error paths and on the success path in
> sctp_sf_do_unexpected_init().
> 
> Fixes: c081d53f97a1 ("security: pass asoc to sctp_assoc_request and sctp_sk_clone")
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Reported-by: Sashiko <sashiko-bot@kernel.org>
> Signed-off-by: Xin Long <lucien.xin@gmail.com>

Reviewed-by: Simon Horman <horms@kernel.org>



* Re: [PATCH net v2 2/2] sctp: add INIT verification after cookie unpacking
From: Simon Horman @ 2026-06-22 16:51 UTC
  To: lucien.xin
  Cc: Simon Horman, netdev, linux-sctp, davem, kuba, edumazet, pabeni,
	marcelo.leitner
In-Reply-To: <c34fee6105f9051a52f8e774fe4c35f0e6ba0afc.1781968162.git.lucien.xin@gmail.com>

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://netdev-ai.bots.linux.dev/sashiko/
---
sctp: add INIT verification after cookie unpacking

This commit adds sctp_verify_init() checks in the COOKIE-ECHO processing
paths sctp_sf_do_5_1D_ce() and sctp_sf_do_5_2_4_dupcook() when
cookie_auth_enable is disabled, and updates sctp_verify_init() to bound
the parameter walk against peer_init length rather than chunk->chunk_end.
On verification failure the new association is freed and an ABORT is
generated via sctp_abort_on_init_err().

> diff --git a/net/sctp/sm_statefuns.c b/net/sctp/sm_statefuns.c
> index 8c636f045e45..6967e889d1bd 100644
> --- a/net/sctp/sm_statefuns.c
> +++ b/net/sctp/sm_statefuns.c
> @@ -650,11 +650,12 @@ enum sctp_disposition sctp_sf_do_5_1D_ce(struct net *net,
>  					 struct sctp_cmd_seq *commands)
>  {
>  	struct sctp_ulpevent *ev, *ai_ev = NULL, *auth_ev = NULL;
> +	struct sctp_chunk *err_chk_p = NULL;
>  	struct sctp_association *new_asoc;
>  	struct sctp_init_chunk *peer_init;
>  	struct sctp_chunk *chunk = arg;
> -	struct sctp_chunk *err_chk_p;
>  	struct sctp_chunk *repl;
> +	enum sctp_cid cid;
>  	struct sock *sk;
>  	int error = 0;
>  
> @@ -728,6 +729,18 @@ enum sctp_disposition sctp_sf_do_5_1D_ce(struct net *net,
>  		}
>  	}
>  
> +	peer_init = (struct sctp_init_chunk *)(chunk->subh.cookie_hdr + 1);
> +	cid = peer_init->chunk_hdr.type;
> +	if (!sctp_sk(sk)->cookie_auth_enable &&
> +	    !sctp_verify_init(net, ep, asoc, cid, peer_init, chunk,
> +			      &err_chk_p)) {
> +		sctp_association_free(new_asoc);
> +		return sctp_abort_on_init_err(net, ep, asoc, cid, arg, commands,
> +					      err_chk_p);
> +	}

[Critical]
When cookie_auth_enable is disabled the cookie HMAC is not validated by
sctp_unpack_cookie(), so every byte of the embedded INIT (including
peer_init->chunk_hdr.type) is attacker-controlled.  Is it safe to feed
that byte directly into both sctp_verify_init() and
sctp_abort_on_init_err() as cid?

Inside the COOKIE-ECHO body the embedded chunk is by RFC always an INIT.
Could the cid argument be hard-coded to SCTP_CID_INIT (or
peer_init->chunk_hdr.type be validated against SCTP_CID_INIT before use)
instead of trusting the wire byte?

Two attacker-reachable consequences look possible if cid is forged to
SCTP_CID_INIT_ACK (2) and the STATE_COOKIE parameter is omitted:

sctp_verify_init() then returns 0 via this branch with *errp populated:

	if ((SCTP_CID_INIT_ACK == cid) && !has_cookie)
		return sctp_process_missing_param(asoc, SCTP_PARAM_STATE_COOKIE,
						  chunk, errp);

control transfers to sctp_abort_on_init_err(net, ep, asoc, cid, ...) with
cid == SCTP_CID_INIT_ACK and a non-NULL err_chunk.  After sending the
ABORT packet, that helper falls through to its out: label because
cid == SCTP_CID_INIT_ACK:

	if (cid != SCTP_CID_INIT_ACK) {
		if (!packet)
			return SCTP_DISPOSITION_NOMEM;
		return SCTP_DISPOSITION_CONSUME;
	}

out:
	...
	return sctp_stop_t1_and_abort(net, commands, error, ECONNREFUSED,
				      asoc, chunk->transport);

In the sctp_sf_do_5_1D_ce() context, asoc is typically NULL (the function
itself opens with if (asoc && !sctp_vtag_verify(chunk, asoc))).  Can the
SCTP_CMD_NEW_STATE / SCTP_CMD_TIMER_STOP / SCTP_CMD_INIT_FAILED commands
queued by sctp_stop_t1_and_abort() against a NULL asoc be dispatched
without the sideeffect interpreter dereferencing asoc?

> @@ -2205,6 +2219,19 @@ enum sctp_disposition sctp_sf_do_5_2_4_dupcook(
>  	switch (action) {
>  	case 'A': /* Association restart. */
>  	case 'B': /* Collision case B. */
> +		peer_init = (struct sctp_init_chunk *)
> +				(chunk->subh.cookie_hdr + 1);
> +		cid = peer_init->chunk_hdr.type;
> +		if (!sctp_sk(ep->base.sk)->cookie_auth_enable &&
> +		    !sctp_verify_init(net, ep, asoc, cid, peer_init, chunk,
> +				      &err_chk_p)) {
> +			sctp_association_free(new_asoc);
> +			return sctp_abort_on_init_err(net, ep, asoc, cid, arg,
> +						      commands, err_chk_p);
> +		}
> +		if (err_chk_p)
> +			sctp_chunk_free(err_chk_p);
> +		fallthrough;

[Critical]
The same attacker-controlled cid is propagated here as well.  In
sctp_sf_do_5_2_4_dupcook() asoc is the existing established association.
With a forged cid of SCTP_CID_INIT_ACK and missing STATE_COOKIE, can a
single unauthenticated COOKIE-ECHO drive sctp_abort_on_init_err() into
its out: path and have sctp_stop_t1_and_abort() tear down the established
asoc and report ECONNREFUSED to the application?

Would constraining the verification to the well-defined INIT case (for
example, hard-coding SCTP_CID_INIT or rejecting any cid !=
SCTP_CID_INIT before calling sctp_verify_init()/sctp_abort_on_init_err())
avoid both of these paths?
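
The suggestion above, checking the embedded chunk type before it is used as cid, can be sketched in isolation. This is a minimal userspace illustration, not kernel code: `struct chunk_hdr` and `embedded_chunk_is_init()` are hypothetical names, and the numeric values are the chunk types assigned by RFC 9260 (INIT is 1, INIT ACK is 2).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Chunk type values as assigned by RFC 9260. */
enum { CID_INIT = 1, CID_INIT_ACK = 2 };

/* Hypothetical view of the 4-byte SCTP chunk header. */
struct chunk_hdr {
	uint8_t type;
	uint8_t flags;
	uint16_t length;	/* network byte order on the wire */
};

/* Return true only if the chunk embedded in the cookie claims to be an INIT.
 * A caller would reject the packet otherwise, and pass the constant
 * CID_INIT onward instead of the byte read from the wire.
 */
bool embedded_chunk_is_init(const struct chunk_hdr *hdr)
{
	assert(hdr != NULL);
	return hdr->type == CID_INIT;
}
```

With a check of this kind in front, a forged type byte never reaches code that treats INIT ACK differently from INIT.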


* Re: [PATCH net v2 0/2] net: ethernet: sunplus: spl2sw: fix of_node refcount leaks
From: 呂芳騰 @ 2026-06-22 16:45 UTC
  To: Jakub Kicinski
  Cc: Shitalkumar Gandhi, Andrew Lunn, David S. Miller, Eric Dumazet,
	Paolo Abeni, Simon Horman, netdev, linux-kernel,
	Shitalkumar Gandhi
In-Reply-To: <20260621132239.79000dae@kernel.org>

Hi Jakub,

You are right. I will submit a patch to mark the SUNPLUS ETHERNET DRIVER
as Orphaned, as I no longer have the hardware or the capacity to maintain it.

I contacted the managers at Sunplus, but they have declined to take over
the maintenance position.

Thank you for your understanding, and sorry for any inconvenience caused.

Best regards,
Wells Lu

On Mon, Jun 22, 2026 at 4:22 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Sun, 21 Jun 2026 12:38:06 +0800 呂芳騰 wrote:
> > I'm sorry that I can't test the fix.
> > I've left Sunplus and don't have the relevant hardware.
>
> That makes things harder.. but you don't necessarily need HW to review
> most of the patches. If you don't intend to serve as a maintainer of
> the sunplus driver please send a patch to MAINTAINERS and step down.
> Right now you are listed but don't seem to be fulfilling the duties.
> Or please review the patches to the best of your ability without
> testing.


* Re: [PATCH iproute2-next] "ip help" wrong output, exit code.
From: David Laight @ 2026-06-22 16:44 UTC
  To: Stephen Hemminger; +Cc: Dmitri Seletski, netdev
In-Reply-To: <20260622075700.27806286@phoenix.local>

On Mon, 22 Jun 2026 07:57:00 -0700
Stephen Hemminger <stephen@networkplumber.org> wrote:

> On Sun, 21 Jun 2026 22:48:59 +0100
> Dmitri Seletski <drjoms@gmail.com> wrote:
> 
> > From 0805e07105cd15c5b94271a4706e50e3c65dbde5 Mon Sep 17 00:00:00 2001
> > From: Dmitri Seletski <drjoms@gmail.com>
> > Date: Sun, 21 Jun 2026 22:12:43 +0100
> > Subject: [PATCH iproute2-next]  "ip help" wrong output, exit code.
> > 
> > Changed output of "ip help" from standard error to standard output. And 
> > Exit is now 0 instead of -1. "ip help|grep bridge" - now gives bridge 
> > syntax instead of flooding user with everything from "ip help".
> > ---
> > ip/ip.c | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/ip/ip.c b/ip/ip.c
> > index e4b71bde..4627b61c 100644
> > --- a/ip/ip.c
> > +++ b/ip/ip.c
> > @@ -56,7 +56,7 @@ static void usage(void) __attribute__((noreturn));
> > 
> > static void usage(void)
> > {
> > -fprintf(stderr,
> > +fprintf(stdout,
> > "Usage: ip [ OPTIONS ] OBJECT { COMMAND | help }\n"
> > "       ip [ -force ] -batch filename\n"
> > "where  OBJECT := { address | addrlabel | fou | help | ila | ioam | l2tp 
> > | link |\n"
> > @@ -72,7 +72,7 @@ static void usage(void)
> > "                    -o[neline] | -t[imestamp] | -ts[hort] | -b[atch] 
> > [filename] |\n"
> > "                    -rc[vbuf] [size] | -n[etns] name | -N[umeric] | 
> > -a[ll] |\n"
> > "                    -c[olor]}\n");
> > -exit(-1);
> > +exit(0);
> > }  
> 
> Your mailer damages white space.
> 

The output also needs to depend on whether there is a 'usage' error or
whether 'help' was requested.
The code is correct for the former - except it should do exit(1).

	David
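
The distinction drawn in the reply above (requested help versus a usage error) can be sketched as follows. This is an illustration only, not iproute2 code: `usage_stream()`, `usage_status()` and `print_usage()` are hypothetical helpers, and the usage text is shortened to a single line.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Where the usage text goes: stdout when the user asked for help,
 * stderr when the command line was wrong.
 */
FILE *usage_stream(bool help_requested)
{
	return help_requested ? stdout : stderr;
}

/* Exit status to pair with it: 0 for requested help, 1 for a usage error. */
int usage_status(bool help_requested)
{
	return help_requested ? 0 : 1;
}

/* Print a shortened usage text and return the status the caller
 * should pass to exit().
 */
int print_usage(bool help_requested)
{
	FILE *out = usage_stream(help_requested);

	assert(out != NULL);
	fprintf(out, "Usage: ip [ OPTIONS ] OBJECT { COMMAND | help }\n");
	return usage_status(help_requested);
}
```

A caller would write `exit(print_usage(true))` for "ip help" and `exit(print_usage(false))` after a parse error, so that piping the help text into grep works while scripts still see a non-zero status on bad input.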




* Re: [PATCH net v3] net: airoha: fix BQL underflow in shared QDMA TX ring
From: Simon Horman @ 2026-06-22 16:39 UTC
  To: Lorenzo Bianconi
  Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Wayen Yan, linux-arm-kernel, linux-mediatek, netdev
In-Reply-To: <20260620-airoha-bql-fixes-v3-1-76b95374e63e@kernel.org>

On Sat, Jun 20, 2026 at 05:04:51PM +0200, Lorenzo Bianconi wrote:
> When multiple netdevs share a QDMA TX ring and one device is stopped,
> netdev_tx_reset_subqueue() zeroes that device's BQL counters while its
> pending skbs remain in the shared HW TX ring. When NAPI later completes
> those skbs via netdev_tx_completed_queue(), the already-zeroed
> dql->num_queued counter underflows.
> 
> Fix the issue:
> - Remove netdev_tx_reset_subqueue() from airoha_dev_stop() so pending
>   skbs are completed naturally by NAPI with proper BQL accounting.
> - Rework airoha_qdma_tx_cleanup() to disable TX DMA, flush BQL
>   counters, DMA-unmap and free all pending skbs while skb->dev
>   references are still valid. Use a per-queue flushing flag checked
>   under q->lock in airoha_dev_xmit() to prevent races between teardown
>   and transmit. Call airoha_qdma_stop_napi() before
>   airoha_qdma_tx_cleanup() at the call sites.
> - Move DMA engine start into probe. Split DMA teardown so TX DMA is
>   disabled in airoha_qdma_tx_cleanup() and RX DMA in
>   airoha_qdma_cleanup().
> - Remove qdma->users counter since DMA lifetime is now tied to
>   probe/cleanup rather than per-netdev open/stop.
> 
> Fixes: a9c2ca61fec7 ("net: airoha: Support multiple net_devices for a single FE GDM port")
> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> ---
> Changes in v3:
> - Remove airoha_qdma users counter.
> - Drop DEV_STATE_FLUSH bit and add per-queue flushing bool to avoid any
>   race between airoha_qdma_tx_flush() and airoha_dev_xmit().
> - Refactor airoha_qdma_cleanup_tx_queue().
> - Rename airoha_qdma_cleanup_tx_queue() to airoha_qdma_tx_cleanup().
> - Link to v2: https://lore.kernel.org/r/20260619-airoha-bql-fixes-v2-1-4351d6a24484@kernel.org
> 

Thanks for the updates.

Reviewed-by: Simon Horman <horms@kernel.org>
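
The underflow described in the quoted commit message can be reproduced with a toy model of byte accounting. This is a simplified sketch, not the kernel's struct dql or its API: `struct bql_model`, `model_queue()`, `model_reset()` and `model_complete()` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified model of per-queue byte accounting. */
struct bql_model {
	unsigned int num_queued;	/* bytes handed to the hardware */
	unsigned int num_completed;	/* bytes reported as sent */
};

/* Account for bytes placed on the TX ring. */
void model_queue(struct bql_model *q, unsigned int bytes)
{
	assert(q != NULL);
	q->num_queued += bytes;
}

/* Zero the counters, as a reset on device stop would. */
void model_reset(struct bql_model *q)
{
	assert(q != NULL);
	q->num_queued = 0;
	q->num_completed = 0;
}

/* A completion is balanced only if that many bytes are still in flight.
 * Returns false when accepting it would underflow the accounting.
 */
bool model_complete(struct bql_model *q, unsigned int bytes)
{
	unsigned int in_flight;

	assert(q != NULL);
	in_flight = q->num_queued - q->num_completed;
	if (bytes > in_flight)
		return false;
	q->num_completed += bytes;
	return true;
}
```

Queueing 1500 bytes, resetting the counters while those bytes are still on the ring, and then completing them makes model_complete() return false, which corresponds to the case the patch avoids by no longer resetting the counters in the stop path.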



* Re: building ynl afaics requires updating the UAPI headers first
From: Thorsten Leemhuis @ 2026-06-22 16:33 UTC
  To: Jakub Kicinski; +Cc: Donald Hunter, netdev, Riana Tauro
In-Reply-To: <20260622090534.2af8ae9e@kernel.org>

On 6/22/26 18:05, Jakub Kicinski wrote:
> On Fri, 19 Jun 2026 09:28:47 +0200 Thorsten Leemhuis wrote:
>> On 6/19/26 02:06, Jakub Kicinski wrote:
>>> On Thu, 18 Jun 2026 15:39:46 +0200 Thorsten Leemhuis wrote:  
>>>> DRM_RAS_CMD_CLEAR_ERROR_COUNTER was introduced to mainline yesterday as
>>>> ee18d39a087792 ("drm/drm_ras: Add clear-error-counter netlink command to
>>>> drm_ras") [v7.1-post].
>>>>
>>>> I finally looked closer today and noticed how to prevent this: update
>>>> the kernel's UAPI files (e.g. the stuff that lives in /usr/include/) on
>>>> the builder. Thing is: that's basically impossible to do from a srpm, as
>>>> those should not change the build environment and can't even when
>>>> working as non-root.
>>>> [...]  
>>>
>>> Can't repro for some reason, but we probably need something like 
>>> commit 46e9b0224475abc to add the explicit include rule.  
>>
>> Thx for the pointer. So I guess you mean something like the below,
>> which did the trick for me. Will submit this properly, unless
>> someone points out something stupid in it.
>
> [...]
>
> Looks good, did you send it?

No, because the funny thing is: now I fail to reproduce it myself. And I
don't know why, as 24h earlier when you had written "Can't repro for
some reason" I had once more checked that I could trigger this by
downgrading Fedora's kernel-headers package to a version from some weeks
ago. Not sure what changed since then.

Want me to send it nevertheless?

Ciao, Thorsten


* Re: [PATCH net-next RESEND v3 0/2] udp: fix FOU/GUE over multicast
From: Kuniyuki Iwashima @ 2026-06-22 16:20 UTC
  To: Anton Danilov
  Cc: netdev, Willem de Bruijn, davem, David Ahern, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Simon Horman, Shuah Khan,
	linux-kselftest
In-Reply-To: <cover.1782067871.git.littlesmilingcloud@gmail.com>

On Sun, Jun 21, 2026 at 12:04 PM Anton Danilov
<littlesmilingcloud@gmail.com> wrote:
>
> This is a resend of the v3 series originally posted on 2026-05-05,
> which did not receive review feedback during the previous net-next
> window.

Probably due to RFC, which means it is not official yet, and the series
is also dropped from patchwork.


> No changes since v3; rebased cleanly on current net-next
> (after the net-next-7.2 merge).

net-next is closed, please repost next Monday, following the process doc:

https://docs.kernel.org/7.0/process/maintainer-netdev.html#resending-after-review
---8<---
The new version of patches should be posted as a separate thread,
not as a reply to the previous posting.
---8<---


* [PATCH] tools: ynl: build archives with $(AR)
From: Greg Thelen @ 2026-06-22 16:16 UTC
  To: Donald Hunter, Jakub Kicinski, David S. Miller
  Cc: netdev, linux-kernel, Greg Thelen

Use $(AR) to allow the build system to override the archiver tool (e.g.,
when cross-compiling for a different architecture) by setting the AR
environment variable.

GNU Make defaults AR to ar, so this change will not break existing build
environments that do not explicitly set AR.

Fixes: 07c3cc51a085 ("tools: net: package libynl for use in selftests")
Fixes: 86878f14d71a ("tools: ynl: user space helpers")
Signed-off-by: Greg Thelen <gthelen@google.com>
---
 tools/net/ynl/Makefile           | 2 +-
 tools/net/ynl/generated/Makefile | 2 +-
 tools/net/ynl/lib/Makefile       | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/net/ynl/Makefile b/tools/net/ynl/Makefile
index d514a48dae27..3cefe4ed96cb 100644
--- a/tools/net/ynl/Makefile
+++ b/tools/net/ynl/Makefile
@@ -22,7 +22,7 @@ tests: | lib generated libynl.a
 ynltool: | lib generated libynl.a
 libynl.a: | lib generated
 	@echo -e "\tAR $@"
-	@ar rcs $@ lib/ynl.o generated/*-user.o
+	@$(AR) rcs $@ lib/ynl.o generated/*-user.o
 
 $(SUBDIRS):
 	@if [ -f "$@/Makefile" ] ; then \
diff --git a/tools/net/ynl/generated/Makefile b/tools/net/ynl/generated/Makefile
index 86e1e4a959a7..ea4128f612d6 100644
--- a/tools/net/ynl/generated/Makefile
+++ b/tools/net/ynl/generated/Makefile
@@ -37,7 +37,7 @@ all: protos.a $(HDRS) $(SRCS) $(KHDRS) $(KSRCS) $(UAPI) $(RSTS)
 
 protos.a: $(OBJS)
 	@echo -e "\tAR $@"
-	@ar rcs $@ $(OBJS)
+	@$(AR) rcs $@ $(OBJS)
 
 %-user.h: $(SPECS_DIR)/%.yaml $(TOOL)
 	@echo -e "\tGEN $@"
diff --git a/tools/net/ynl/lib/Makefile b/tools/net/ynl/lib/Makefile
index 4b2b98704ff9..9b98c0599600 100644
--- a/tools/net/ynl/lib/Makefile
+++ b/tools/net/ynl/lib/Makefile
@@ -15,7 +15,7 @@ all: ynl.a
 
 ynl.a: $(OBJS)
 	@echo -e "\tAR $@"
-	@ar rcs $@ $(OBJS)
+	@$(AR) rcs $@ $(OBJS)
 
 clean:
 	rm -f *.o *.d *~
-- 
2.55.0.rc0.738.g0c8ab3ebcc-goog



* Re: [PATCH net v2 2/2] net: airoha: fix netif_set_real_num_tx_queues for sparse QoS channels
From: Simon Horman @ 2026-06-22 16:16 UTC
  To: Lorenzo Bianconi
  Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Wayen Yan, linux-arm-kernel, linux-mediatek, netdev
In-Reply-To: <ajk0kS76nahtop8g@lore-desk>

On Mon, Jun 22, 2026 at 03:11:45PM +0200, Lorenzo Bianconi wrote:
> > On Fri, Jun 19, 2026 at 01:37:14PM +0200, Lorenzo Bianconi wrote:
> > > airoha_tc_htb_alloc_leaf_queue() assigns queue IDs based on the channel
> > > index (opt->qid = AIROHA_NUM_TX_RING + channel), but updates
> > > real_num_tx_queues with a simple increment (num_tx_queues + 1). When QoS
> > > channels are allocated sparsely (e.g., channels 0 and 3 without 1 and
> > > 2), the returned qid can exceed real_num_tx_queues, causing out-of-bounds
> > > accesses in the networking stack.
> > > For example, allocating channel 0 then channel 3 results in
> > > real_num_tx_queues = 34 but qid = 35, which is out of range [0, 34).
> > > Fix this by computing real_num_tx_queues based on the highest active
> > > channel index rather than using a simple counter, in both the allocation
> > > and deletion paths.
> > > 
> > > Fixes: ef1ca9271313b ("net: airoha: Add sched HTB offload support")
> > > Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> > 
> > Thanks for the update since v1.
> > 
> > Reviewed-by: Simon Horman <horms@kernel.org>
> 
> Hi Simon,
> 
> thx for the review.
> 
> > 
> > FTR, there is an AI-generated review of this patch on sashiko.dev.
> > I do not think that should impede the progress of this patch but
> > you may want to consider it in the context of follow-up.
> 
> Even if it is not introduced by this patch, I do not think what is reported
> by Sashiko is a real issue since airoha_eth driver implements
> ndo_select_queue() callback and the selected queue is always in the range
> [0, AIROHA_NUM_TX_RING[. HTB queues (in the range
> [AIROHA_NUM_TX_RING, AIROHA_NUM_TX_RING + AIROHA_NUM_QOS_CHANNELS[) are just 
> 'offloaded' and never used in the TC sw path. Agree?

Thanks Lorenzo,

I've looked over this more closely with the above in mind and I agree.
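
The fix described in the quoted commit message, sizing real_num_tx_queues from the highest active channel rather than from a counter, can be sketched as below. This is a userspace illustration, not the driver code: `NUM_TX_RING` is an assumed value inferred from the example in the commit message (channel 3 maps to qid 35), and the `active_channels` bitmap and both helpers are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value, inferred from the commit message example
 * (channel 3 maps to qid 35).
 */
#define NUM_TX_RING 32u

/* Queue ID for a QoS channel, as described in the commit message. */
unsigned int channel_to_qid(unsigned int channel)
{
	assert(channel < 32);
	return NUM_TX_RING + channel;
}

/* Number of real TX queues needed so that the qid of every active
 * channel is in range: one past the highest set bit of the bitmap.
 */
unsigned int real_num_tx_queues(uint32_t active_channels)
{
	unsigned int highest = 0;

	if (!active_channels)
		return NUM_TX_RING;
	while (active_channels >>= 1)
		highest++;
	return NUM_TX_RING + highest + 1;
}
```

With channels 0 and 3 active (bitmap 0x9) this yields 36, so qid 35 is in range, whereas counting allocations gives 34 as in the example from the commit message.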


* Re: [PATCH bpf-next] bpf, unix: Guard sk_msg-dependent code behind CONFIG_NET_SOCK_MSG
From: Kuniyuki Iwashima @ 2026-06-22 16:11 UTC
  To: jakub
  Cc: ast, bpf, daniel, jiayuan.chen, john.fastabend, kernel-team, kuba,
	netdev
In-Reply-To: <20260622-bpf-sk_msg-split-unix-v1-1-d7e0cb7bb03b@cloudflare.com>

From: Jakub Sitnicki <jakub@cloudflare.com>
Date: Mon, 22 Jun 2026 14:58:34 +0200
> Prepare to decouple BPF_SYSCALL config option from NET_SOCK_MSG.
> 
> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> ---
>  net/unix/unix_bpf.c | 6 ++++++

AFAIU, everything in this file is for BPF_SYSCALL && NET_SOCK_MSG,
or am I missing something ?

I feel that it would be cleaner to add a new Kconfig that depends
on BPF_SYSCALL and NET_SOCK_MSG, change Makefile obj-$(CONFIG_XXX),
and guard .psock_update_sk_prot in af_unix.c


>  1 file changed, 6 insertions(+)
> 
> diff --git a/net/unix/unix_bpf.c b/net/unix/unix_bpf.c
> index f86ff19e9764..5289a04b4993 100644
> --- a/net/unix/unix_bpf.c
> +++ b/net/unix/unix_bpf.c
> @@ -7,6 +7,7 @@
>  
>  #include "af_unix.h"
>  
> +#ifdef CONFIG_NET_SOCK_MSG
>  #define unix_sk_has_data(__sk, __psock)					\
>  		({	!skb_queue_empty(&__sk->sk_receive_queue) ||	\
>  			!skb_queue_empty(&__psock->ingress_skb) ||	\
> @@ -94,6 +95,7 @@ static int unix_bpf_recvmsg(struct sock *sk, struct msghdr *msg,
>  	sk_psock_put(sk, psock);
>  	return copied;
>  }
> +#endif /* CONFIG_NET_SOCK_MSG */
>  
>  static struct proto *unix_dgram_prot_saved __read_mostly;
>  static DEFINE_SPINLOCK(unix_dgram_prot_lock);
> @@ -107,8 +109,10 @@ static void unix_dgram_bpf_rebuild_protos(struct proto *prot, const struct proto
>  {
>  	*prot        = *base;
>  	prot->close  = sock_map_close;
> +#ifdef CONFIG_NET_SOCK_MSG
>  	prot->recvmsg = unix_bpf_recvmsg;
>  	prot->sock_is_readable = sk_msg_is_readable;
> +#endif
>  }
>  
>  static void unix_stream_bpf_rebuild_protos(struct proto *prot,
> @@ -116,8 +120,10 @@ static void unix_stream_bpf_rebuild_protos(struct proto *prot,
>  {
>  	*prot        = *base;
>  	prot->close  = sock_map_close;
> +#ifdef CONFIG_NET_SOCK_MSG
>  	prot->recvmsg = unix_bpf_recvmsg;
>  	prot->sock_is_readable = sk_msg_is_readable;
> +#endif
>  	prot->unhash  = sock_map_unhash;
>  }
>  
> 


* Re: building ynl afaics requires updating the UAPI headers first
From: Jakub Kicinski @ 2026-06-22 16:05 UTC
  To: Thorsten Leemhuis; +Cc: Donald Hunter, netdev, Riana Tauro
In-Reply-To: <cec7685e-df12-4501-b851-9c8f7b8b06cb@leemhuis.info>

On Fri, 19 Jun 2026 09:28:47 +0200 Thorsten Leemhuis wrote:
> On 6/19/26 02:06, Jakub Kicinski wrote:
> > On Thu, 18 Jun 2026 15:39:46 +0200 Thorsten Leemhuis wrote:  
> >> DRM_RAS_CMD_CLEAR_ERROR_COUNTER was introduced to mainline yesterday as
> >> ee18d39a087792 ("drm/drm_ras: Add clear-error-counter netlink command to
> >> drm_ras") [v7.1-post].
> >>
> >> I finally looked closer today and noticed how to prevent this: update
> >> the kernel's UAPI files (e.g. the stuff that lives in /usr/include/) on
> >> the builder. Thing is: that's basically impossible to do from a srpm, as
> >> those should not change the build environment and can't even when
> >> working as non-root.
> >> [...]  
> > 
> > Can't repro for some reason, but we probably need something like 
> > commit 46e9b0224475abc to add the explicit include rule.  
> 
> Thx for the pointer. So I guess you mean something like the below,
> which did the trick for me. Will submit this properly, unless
> someone points out something stupid in it.
> 
> Ciao, Thorsten
> 
> diff --git a/tools/net/ynl/Makefile.deps b/tools/net/ynl/Makefile.deps
> index cc53b2f21c444..43d06ecbae93d 100644
> --- a/tools/net/ynl/Makefile.deps
> +++ b/tools/net/ynl/Makefile.deps
> @@ -14,10 +14,12 @@ UAPI_PATH:=../../../../include/uapi/
>  
>  get_hdr_inc=-D$(1) -include $(UAPI_PATH)/linux/$(2)
>  get_hdr_inc2=-D$(1) -D$(2) -include $(UAPI_PATH)/linux/$(3)
> +get_hdr_inc_drm=-D$(1) -include $(UAPI_PATH)/drm/$(2)
>  
>  CFLAGS_dev-energymodel:=$(call get_hdr_inc,_LINUX_DEV_ENERGYMODEL_H,dev_energymodel.h)
>  CFLAGS_devlink:=$(call get_hdr_inc,_LINUX_DEVLINK_H_,devlink.h)
>  CFLAGS_dpll:=$(call get_hdr_inc,_LINUX_DPLL_H,dpll.h)
> +CFLAGS_drm_ras:=$(call get_hdr_inc_drm,_LINUX_DRM_RAS_H,drm_ras.h)
>  CFLAGS_ethtool:=$(call get_hdr_inc,_LINUX_TYPELIMITS_H,typelimits.h) \
>         $(call get_hdr_inc,_LINUX_ETHTOOL_H,ethtool.h) \
>         $(call get_hdr_inc,_LINUX_ETHTOOL_NETLINK_H_,ethtool_netlink.h) \

Looks good, did you send it? Please send it to networking if possible.
I don't have much experience with DRM trees but I don't want this patch
to get stuck somewhere and keep causing issues.

^ permalink raw reply

* Re: [RFC net-next 08/15] ipxlat: add translation engine and dispatch core
From: Ralf Lici @ 2026-06-22 15:56 UTC (permalink / raw)
  To: Beniamino Galvani
  Cc: Toke Høiland-Jørgensen, netdev, Daniel Gröber,
	Antonio Quartulli, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, linux-kernel
In-Reply-To: <ajjzF3W-0xDC133f@tp>

On Mon, 22 Jun 2026 10:32:23 +0200, Beniamino Galvani <bgalvani@redhat.com> wrote:
> Hi,
>

Hi Beniamino,

> speaking as a maintainer of NetworkManager, I would also like to see
> this feature in the kernel!
>
> In NetworkManager currently we are using a BPF program [1] to
> implement the CLAT, but that approach comes with limitations: for
> example, we can't fragment v4->v6 packets if needed, and it's not
> possible to recompute checksums in certain cases (e.g. for v4->v6 UDP
> packets with zero checksum, and for fragmented ICMP). systemd-networkd
> is also adding CLAT support via BPF [2], with a fallback to userspace
> for the cases that can't be handled in kernel.
>
> It would be very useful to have a native in-kernel CLAT that solves
> the limitations of BPF-based solutions, and can be used by different
> tools without having to re-implement everything from scratch.
>

Thanks, this is really useful context.

CLAT is exactly the kind of consumer ipxlat aims to serve, and the gaps
you hit in BPF line up directly with paths ipxlat already handles. I'll
cite this as motivation in the next cover letter, if that's alright.

While reading the BPF programs, two things stood out that would help
shape v2. On addressing, both implementations use a single NAT64/PLAT
prefix for destinations plus an explicit local_v4<>local_v6 mapping for
the host itself. ipxlat today maps both source and destination through
one RFC 6052 prefix, so this suggests v2 should probably support
explicit 1:1 address mappings (EAM, RFC 7757) alongside prefix
embedding. Is a single local mapping enough for your case, or do you
foresee needing several?

On the consumer side, is there anything in how NM models a connection
that would make a particular kernel model awkward to drive, e.g. needing
to attach to an already-managed interface, or conversely being able to
create and own a dedicated device? We're still settling the
kernel-facing model for v2, so consumer input here is genuinely
valuable.

Thanks,

-- 
Ralf Lici
Mandelbit Srl

^ permalink raw reply

* Re: AppArmor: TCP Fast Open bypasses connect mediation (last unaddressed LSM)
From: Ryan Lee @ 2026-06-22 16:01 UTC (permalink / raw)
  To: Bryam Vargas
  Cc: John Johansen, linux-security-module, apparmor, Paul Moore,
	James Morris, Serge E . Hallyn, Mickael Salaun, Stephen Smalley,
	Matthieu Buffet, Mikhail Ivanov, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, netdev, linux-kernel
In-Reply-To: <20260619011138.264578-1-hexlabsecurity@proton.me>

On Thu, Jun 18, 2026 at 6:12 PM Bryam Vargas <hexlabsecurity@proton.me> wrote:
>
> Hello John, and LSM folks,
>
> I have been working on the Landlock TCP Fast Open connect bypass [1]. Stephen
> Smalley's SELinux fix for the same issue [3] -- "Similar to Landlock, SELinux was
> not updated when TCP Fast Open support was introduced ..." -- made me go back and
> check the rest of the connect-mediating LSMs, since I had only been looking at
> Landlock. With Landlock [2], SELinux [3], and now TOMOYO [4] all getting fixes,
> AppArmor is the last one with the same gap and no fix yet.
>
> Root cause (shared with the others)
> -----------------------------------
> security_socket_connect() has a single call site, net/socket.c (the connect(2)
> syscall). TCP Fast Open performs an implicit connect inside sendmsg:
>
>   tcp_sendmsg -> tcp_sendmsg_fastopen -> __inet_stream_connect(..., is_sendmsg=1)
>               -> sk->sk_prot->connect()                 net/ipv4/{tcp.c,af_inet.c}
>
> This never calls security_socket_connect(); the only LSM hook on the path is
> security_socket_sendmsg(). mptcp_sendmsg_fastopen reaches the same code and is a
> second producer.
>
> AppArmor
> --------
> apparmor_socket_connect() requests AA_MAY_CONNECT; apparmor_socket_sendmsg() (via
> aa_sock_msg_perm) requests AA_MAY_SEND. These are distinct bits, and apparmor_parser
> compiles them independently: "network send inet stream," yields accept mask 0x02
> while "network connect inet stream," yields 0x40. So an egress-restriction profile
> that grants send but not connect is bypassed by MSG_FASTOPEN.
>
> Reproduced on 6.12.88 with apparmor active. Under a profile granting the inet/inet6
> stream lifecycle except connect:
>
>   aa-exec -p egress_restricted -- ./probe
>   [TCP ] connect(2)=EACCES(blocked)  sendto(MSG_FASTOPEN)=OK(reached)  => connection established
>   [TCP6] connect(2)=EACCES(blocked)  sendto(MSG_FASTOPEN)=OK(reached)  => connection established
>
> (The coarse "network inet stream," idiom grants connect anyway, so this only bites the
> fine-grained "allow send, deny connect" policy that the asymmetry is meant to serve.)
>
> Fix
> ---
> Same shape as the TOMOYO [4] and SELinux [3] fixes: in apparmor_socket_sendmsg (or
> aa_sock_msg_perm), when MSG_FASTOPEN is set and msg_name carries a destination on a
> not-yet-connected stream socket, additionally require aa_sk_perm(OP_CONNECT,
> AA_MAY_CONNECT, sk). I am happy to send that patch and the reproducer.

We would appreciate having the patch and the reproducer to look over.
Ideally, the reproducer could be integrated as a regression test into
the upstream repo at
https://gitlab.com/apparmor/apparmor/-/tree/master/tests/regression/apparmor?ref_type=heads,
but we can also assist with that step.
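
For reference while reviewing, the fix shape described in the quoted
"Fix" paragraph above can be sketched in plain userspace C. This is only
an illustration under stated assumptions: every sketch_/SKETCH_ name is
hypothetical, the permission bits reuse the 0x02 (send) and 0x40
(connect) masks quoted in the report, and none of this is the actual
AppArmor code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for AA_MAY_SEND / AA_MAY_CONNECT, using the masks from the report */
#define SKETCH_MAY_SEND     0x02u
#define SKETCH_MAY_CONNECT  0x40u
/* Stand-in for MSG_FASTOPEN so the sketch builds without socket headers */
#define SKETCH_MSG_FASTOPEN 0x20000000u

/* Hypothetical, reduced view of the socket state the check cares about */
struct sketch_sock {
	bool is_stream;
	bool connected;
};

/* Permissions a sendmsg call must hold: send always, plus connect when
 * the call would perform an implicit Fast Open connect. */
static unsigned int sketch_sendmsg_required(const struct sketch_sock *sk,
					    unsigned int msg_flags,
					    const void *msg_name)
{
	unsigned int need = SKETCH_MAY_SEND;

	if ((msg_flags & SKETCH_MSG_FASTOPEN) && msg_name &&
	    sk->is_stream && !sk->connected)
		need |= SKETCH_MAY_CONNECT;

	return need;
}

/* True only if the profile grants every required permission bit */
static bool sketch_sendmsg_allowed(unsigned int profile_mask,
				   const struct sketch_sock *sk,
				   unsigned int msg_flags,
				   const void *msg_name)
{
	unsigned int need = sketch_sendmsg_required(sk, msg_flags, msg_name);

	return (profile_mask & need) == need;
}
```

With a send-only profile, a Fast Open send to a destination on an
unconnected stream socket is refused in this sketch, while an ordinary
send on an already-connected socket is still allowed.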
>
> (A single core check in __inet_stream_connect(), gated on is_sendmsg, would have
> covered all five LSMs and both the TCP and MPTCP producers in one place -- the kernel
> already mediates the analogous implicit-connect-on-send for AF_UNIX via
> security_unix_may_send and for SCTP via security_sctp_bind_connect. But since the
> other four LSMs are taking per-hook fixes, AppArmor matching them is the consistent
> move; mentioning the core option only in case it is preferred.)
>
> [1] Landlock: LANDLOCK_ACCESS_NET_CONNECT_TCP bypass via TCP Fast Open (report)
>     https://lore.kernel.org/r/20260616201615.275032-1-hexlabsecurity@proton.me
> [2] landlock: fix TCP Fast Open connection bypass (Matthieu Buffet)
>     https://lore.kernel.org/r/20260617180526.15627-2-matthieu@buffet.re
> [3] selinux: check connect-related permissions on TCP Fast Open (Stephen Smalley)
>     https://lore.kernel.org/r/20260618175513.112443-2-stephen.smalley.work@gmail.com
> [4] tomoyo: Enforce connect policy in TCP Fast Open (Matthieu Buffet)
>     https://lore.kernel.org/r/20260619002207.61104-1-matthieu@buffet.re
>
> Thanks,
> Bryam Vargas
>
>

^ permalink raw reply

* Re: [RFC net-next 00/17] MPTCP KTLS support
From: Jakub Kicinski @ 2026-06-22 16:00 UTC (permalink / raw)
  To: Geliang Tang
  Cc: Matthieu Baerts, Mat Martineau, David S. Miller, Eric Dumazet,
	Paolo Abeni, Simon Horman, Neal Cardwell, Kuniyuki Iwashima,
	John Fastabend, Sabrina Dubroca, Hannes Reinecke, Geliang Tang,
	netdev, mptcp, Gang Yan, Zqiang
In-Reply-To: <cover.1782123118.git.tanggeliang@kylinos.cn>

On Mon, 22 Jun 2026 18:43:20 +0800 Geliang Tang wrote:
> Subject: [RFC net-next 00/17] MPTCP KTLS support

Please no. We have a ton of unfixed bugs and may have to revert some of
the features we dropped back in. I'd prefer to avoid large new bug
surfaces until we reach an LTS release.

^ permalink raw reply

* Re: [Intel-wired-lan] [PATCH net] igc: Fix RX HW timestamp reporting when NET_RX_BUSY_POLL is disabled
From: Paul Menzel @ 2026-06-22 15:59 UTC (permalink / raw)
  To: Ding Meng, Florian Bezdeka
  Cc: anthony.l.nguyen, przemyslaw.kitszel, andrew+netdev, davem,
	edumazet, kuba, pabeni, jan.kiszka, intel-wired-lan, linux-kernel,
	netdev, wq.wang
In-Reply-To: <20260622041718.6106-1-meng.ding@siemens.com>

Dear Ding,


Thank you for your patch.

Am 22.06.26 um 06:13 schrieb Ding Meng via Intel-wired-lan:
> When CONFIG_NET_RX_BUSY_POLL is deactivated, fetching RX HW timestamps
> from the NIC no longer works as expected.

Maybe paste some logs/errors, so people with the same issue can find it 
more easily.

> This occurs because disabling CONFIG_NET_RX_BUSY_POLL disables the
> SKB NAPI mapping in __skb_mark_napi_id(). Consequently, get_timestamp()
> fails to perform its driver lookup, and the igc driver's struct
> net_device_ops::ndo_get_tstamp is never invoked.
> 
> Instead, get_timestamp() falls back to using skb_hwtstamps(skb)->hwtstamp,
> a field that the driver has not populated.
> 
> Fix this by populating the hwtstamp field with the correct timestamp
> in the default timer when CONFIG_NET_RX_BUSY_POLL is disabled.

Maybe detail why the adapter needs to be passed now.

Also, please describe a test case to check the change.

> Fixes: 069b142f5819 ("igc: Add support for PTP .getcyclesx64()")
> Co-developed-by: Florian Bezdeka <florian.bezdeka@siemens.com>
> Signed-off-by: Florian Bezdeka <florian.bezdeka@siemens.com>
> Signed-off-by: Ding Meng <meng.ding@siemens.com>
> ---
>   drivers/net/ethernet/intel/igc/igc_main.c | 38 ++++++++++++++++-------
>   1 file changed, 26 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
> index 8ac16808023..1da8d7aa76d 100644
> --- a/drivers/net/ethernet/intel/igc/igc_main.c
> +++ b/drivers/net/ethernet/intel/igc/igc_main.c
> @@ -1992,7 +1992,26 @@ static struct sk_buff *igc_build_skb(struct igc_ring *rx_ring,
>   	return skb;
>   }
>   
> -static struct sk_buff *igc_construct_skb(struct igc_ring *rx_ring,
> +static void igc_construct_skb_timestamps(struct igc_adapter *adapter,
> +					 struct sk_buff *skb,
> +					 struct igc_xdp_buff *ctx)
> +{
> +	if (!ctx->rx_ts)
> +		return;
> +#ifdef CONFIG_NET_RX_BUSY_POLL

Is there a way to do this in C instead of the pre-processor? That way 
all the code gets build-tested. (Is there a config with 
NET_RX_BUSY_POLL disabled?)
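
One possible shape, as a userspace sketch only: the kernel's
IS_ENABLED() turns the config symbol into a compile-time 0 or 1, so a
plain if/else keeps both branches visible to the compiler. The macro
chain below is adapted from include/linux/kconfig.h and reduced to the
built-in case; every sketch_/SKETCH_ name is a hypothetical stand-in
for the igc and skb structures, and whether both branches really build
in the driver under either config is an assumption to verify.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Adapted from the IS_ENABLED() machinery in include/linux/kconfig.h,
 * reduced to the built-in (=y) case: expands to 1 if the option is
 * defined to 1, and to 0 if it is not defined at all.
 */
#define SKETCH_ARG_PLACEHOLDER_1 0,
#define SKETCH_TAKE_SECOND(ignored, val, ...) val
#define SKETCH_IS_DEFINED3(arg1_or_junk) SKETCH_TAKE_SECOND(arg1_or_junk 1, 0, 0)
#define SKETCH_IS_DEFINED2(val) SKETCH_IS_DEFINED3(SKETCH_ARG_PLACEHOLDER_##val)
#define SKETCH_IS_DEFINED(x) SKETCH_IS_DEFINED2(x)
#define IS_ENABLED(option) SKETCH_IS_DEFINED(option)

/* Stand-in for SKBTX_HW_TSTAMP_NETDEV */
#define SKETCH_HW_TSTAMP_NETDEV 0x1u

/* Hypothetical, flattened view of the skb fields the patch touches */
struct sketch_skb {
	unsigned int tx_flags;
	void *netdev_data;
	uint64_t hwtstamp;
};

/* Stand-in for igc_ptp_rx_pktstamp(); the real helper converts raw timer bytes */
static uint64_t sketch_rx_pktstamp(const uint64_t *raw)
{
	return *raw;
}

static void sketch_construct_skb_timestamps(struct sketch_skb *skb,
					    uint64_t *rx_ts)
{
	if (!rx_ts)
		return;

	/* Both branches are parsed and type-checked; the dead one is dropped */
	if (IS_ENABLED(CONFIG_NET_RX_BUSY_POLL)) {
		skb->tx_flags |= SKETCH_HW_TSTAMP_NETDEV;
		skb->netdev_data = rx_ts;
	} else {
		skb->hwtstamp = sketch_rx_pktstamp(rx_ts);
	}
}
```

Since CONFIG_NET_RX_BUSY_POLL is not defined in this sketch, the else
branch is the one that runs; defining it to 1 before the macros selects
the other branch without touching the function body.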

> +	skb_shinfo(skb)->tx_flags |= SKBTX_HW_TSTAMP_NETDEV;
> +	skb_hwtstamps(skb)->netdev_data = ctx->rx_ts;
> +#else
> +	struct igc_inline_rx_tstamps *tstamps;
> +
> +	tstamps = ctx->rx_ts;
> +	skb_hwtstamps(skb)->hwtstamp = igc_ptp_rx_pktstamp(adapter,
> +							   tstamps->timer0);
> +#endif
> +}
> +
> +static struct sk_buff *igc_construct_skb(struct igc_adapter *adapter,
> +					 struct igc_ring *rx_ring,
>   					 struct igc_rx_buffer *rx_buffer,
>   					 struct igc_xdp_buff *ctx)
>   {
> @@ -2013,10 +2032,7 @@ static struct sk_buff *igc_construct_skb(struct igc_ring *rx_ring,
>   	if (unlikely(!skb))
>   		return NULL;
>   
> -	if (ctx->rx_ts) {
> -		skb_shinfo(skb)->tx_flags |= SKBTX_HW_TSTAMP_NETDEV;
> -		skb_hwtstamps(skb)->netdev_data = ctx->rx_ts;
> -	}
> +	igc_construct_skb_timestamps(adapter, skb, ctx);
>   
>   	/* Determine available headroom for copy */
>   	headlen = size;
> @@ -2686,7 +2702,7 @@ static int igc_clean_rx_irq(struct igc_q_vector *q_vector, const int budget)
>   		else if (ring_uses_build_skb(rx_ring))
>   			skb = igc_build_skb(rx_ring, rx_buffer, &ctx.xdp);
>   		else
> -			skb = igc_construct_skb(rx_ring, rx_buffer, &ctx);
> +			skb = igc_construct_skb(adapter, rx_ring, rx_buffer, &ctx);
>   
>   		/* exit if we failed to retrieve a buffer */
>   		if (!xdp_res && !skb) {
> @@ -2738,7 +2754,8 @@ static int igc_clean_rx_irq(struct igc_q_vector *q_vector, const int budget)
>   	return total_packets;
>   }
>   
> -static struct sk_buff *igc_construct_skb_zc(struct igc_ring *ring,
> +static struct sk_buff *igc_construct_skb_zc(struct igc_adapter *adapter,
> +					    struct igc_ring *ring,
>   					    struct igc_xdp_buff *ctx)
>   {
>   	struct xdp_buff *xdp = &ctx->xdp;
> @@ -2760,10 +2777,7 @@ static struct sk_buff *igc_construct_skb_zc(struct igc_ring *ring,
>   		__skb_pull(skb, metasize);
>   	}
>   
> -	if (ctx->rx_ts) {
> -		skb_shinfo(skb)->tx_flags |= SKBTX_HW_TSTAMP_NETDEV;
> -		skb_hwtstamps(skb)->netdev_data = ctx->rx_ts;
> -	}
> +	igc_construct_skb_timestamps(adapter, skb, ctx);
>   
>   	return skb;
>   }
> @@ -2775,7 +2789,7 @@ static void igc_dispatch_skb_zc(struct igc_q_vector *q_vector,
>   	struct igc_ring *ring = q_vector->rx.ring;
>   	struct sk_buff *skb;
>   
> -	skb = igc_construct_skb_zc(ring, ctx);
> +	skb = igc_construct_skb_zc(q_vector->adapter, ring, ctx);
>   	if (!skb) {
>   		ring->rx_stats.alloc_failed++;
>   		set_bit(IGC_RING_FLAG_RX_ALLOC_FAILED, &ring->flags);

Otherwise this looks good.


Kind regards,

Paul

^ permalink raw reply

* Re: [PATCH net-next 0/6] tc: introduce FRER action (IEEE 802.1CB)
From: Jakub Kicinski @ 2026-06-22 15:59 UTC (permalink / raw)
  To: Xiaoliang Yang
  Cc: netdev, linux-kernel, linux-kselftest, davem, edumazet, pabeni,
	jhs, jiri, horms, shuah, vladimir.oltean, vinicius.gomes, fejes
In-Reply-To: <20260622092118.6846-1-xiaoliang.yang_1@nxp.com>

On Mon, 22 Jun 2026 17:21:12 +0800 Xiaoliang Yang wrote:
> This series introduces a new TC action implementing
> Frame Replication and Elimination for Reliability (FRER)
> as defined in IEEE 802.1CB.

## Form letter - net-next-closed

We have already submitted our pull request with net-next material for v7.2,
and therefore net-next is closed for new drivers, features, code refactoring
and optimizations. We are currently accepting bug fixes only.

Please repost when net-next reopens after June 29th.

RFC patches sent for review only are obviously welcome at any time.

See: https://www.kernel.org/doc/html/next/process/maintainer-netdev.html#development-cycle
-- 
pw-bot: defer
pv-bot: closed

^ permalink raw reply

* Re: [PATCH net] net: au1000: move free_irq out of the close-time spinlocked section
From: Jakub Kicinski @ 2026-06-22 15:56 UTC (permalink / raw)
  To: Runyu Xiao
  Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Paolo Abeni, netdev,
	linux-kernel, stable
In-Reply-To: <20260619151816.1144289-1-runyu.xiao@seu.edu.cn>

On Fri, 19 Jun 2026 23:18:16 +0800 Runyu Xiao wrote:
> au1000_close() calls free_irq() while aup->lock is still held with
> spin_lock_irqsave(). free_irq() can sleep because it takes the IRQ
> descriptor request mutex, so it does not belong inside the close-time
> spinlocked section.
> 
> This was found by our static analysis tool and then confirmed by manual
> review of the in-tree au1000_close() .ndo_stop path. The reviewed path
> keeps aup->lock held across the MAC reset, queue stop and
> free_irq(dev->irq, dev).
> 
> A directed runtime validation kept that ndo_stop carrier and the same
> free_irq(dev->irq, dev) operation under the driver lock. Lockdep reported
> "BUG: sleeping function called from invalid context" and "Invalid wait
> context" while free_irq() was taking desc->request_mutex, with
> au1000_close() and free_irq() on the stack.
> 
> Drop aup->lock before freeing the IRQ. The protected close-time work still
> stops the device and queue before IRQ teardown, but the sleepable IRQ core
> path now runs outside the spinlocked section.

Do you really think that this bug matters if nobody fixed it on
a 20+ year old platform?

Please do not point your AI scanning tools at old code!
The patch is valid I guess but we have heaps of bugs like this
that _nobody cares about in practice_! You're wasting everyone's
time.

^ permalink raw reply

* Re: [PATCH] net/mlx5: Fix L3 tunnel entropy refcount leak
From: Jakub Kicinski @ 2026-06-22 15:52 UTC (permalink / raw)
  To: Tariq Toukan
  Cc: lirongqing, Saeed Mahameed, Leon Romanovsky, Mark Bloch,
	Andrew Lunn, David S . Miller, Eric Dumazet, Paolo Abeni, netdev,
	linux-rdma, linux-kernel
In-Reply-To: <cd725cc0-9be1-4d12-bc9f-95ecf789613b@nvidia.com>

On Mon, 22 Jun 2026 09:49:17 +0300 Tariq Toukan wrote:
> Reviewed-by: Tariq Toukan <tariqt@nvidia.com>

To be clear -- you have to take this via your tree now.
Our UIs don't even show patches older than a week.

^ permalink raw reply

* [PATCH net] eth: fbnic: fix ordering of heartbeat vs ownership
From: Jakub Kicinski @ 2026-06-22 15:47 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, andrew+netdev, horms, Jakub Kicinski,
	Alexander Duyck

When requesting ownership of the NIC (MAC/PHY control), we set up
the heartbeat to look stale:

  /* Initialize heartbeat, set last response to 1 second in the past
   * so that we will trigger a timeout if the firmware doesn't respond
   */
  fbd->last_heartbeat_response = req_time - HZ;
  fbd->last_heartbeat_request = req_time;

The response handler then sets:

  fbd->last_heartbeat_response = jiffies;

for which we wait via:

  fbnic_fw_init_heartbeat() -> fbnic_fw_heartbeat_current()

The scheme is a bit odd, but it should work in principle.

Fix the ordering of operations. We have to set up the stale heartbeat
before we send the message. Otherwise if the response is very fast
we will override it. This triggers on QEMU if we run on the core
that handles the IRQ, and results in ndo_open failing with ETIMEDOUT.

The change in ordering doesn't impact releasing the ownership.
Both ndo_stop and heartbeat check are under rtnl_lock.
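
A minimal userspace sketch of the race, for illustration only. It
assumes a reply that arrives before the send call returns, and it
reduces the "is the heartbeat current" test to "response is not older
than request", which is a simplification and not the driver's actual
check. All sketch_ names are hypothetical.

```c
#include <stdbool.h>

#define SKETCH_HZ 1000L

/* Hypothetical reduction of the two heartbeat fields in struct fbnic_dev */
struct sketch_dev {
	long last_heartbeat_request;
	long last_heartbeat_response;
};

/* Stand-in for jiffies, frozen for the duration of the sketch */
static const long sketch_jiffies = 5000;

/* Models a firmware reply that lands before the sender returns */
static void sketch_send_with_instant_reply(struct sketch_dev *d)
{
	d->last_heartbeat_response = sketch_jiffies;
}

/* Models the "make the heartbeat look stale" initialization */
static void sketch_mark_stale(struct sketch_dev *d, long req_time)
{
	d->last_heartbeat_response = req_time - SKETCH_HZ;
	d->last_heartbeat_request = req_time;
}

/* Simplified assumption: current means response is not older than request */
static bool sketch_heartbeat_current(const struct sketch_dev *d)
{
	return d->last_heartbeat_response >= d->last_heartbeat_request;
}

/* Old order: send first, then mark stale, which overwrites the fast reply */
static bool sketch_old_order(void)
{
	struct sketch_dev d = {0};

	sketch_send_with_instant_reply(&d);
	sketch_mark_stale(&d, sketch_jiffies);
	return sketch_heartbeat_current(&d);
}

/* New order: mark stale first, so the reply is the last write */
static bool sketch_new_order(void)
{
	struct sketch_dev d = {0};

	sketch_mark_stale(&d, sketch_jiffies);
	sketch_send_with_instant_reply(&d);
	return sketch_heartbeat_current(&d);
}
```

In the old order the fast reply is lost and the heartbeat never looks
current, which matches the ETIMEDOUT symptom; in the new order the
reply survives.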

Fixes: 20d2e88cc746 ("eth: fbnic: Add initial messaging to notify FW of our presence")
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 drivers/net/ethernet/meta/fbnic/fbnic_fw.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_fw.c b/drivers/net/ethernet/meta/fbnic/fbnic_fw.c
index 0c6812fcf185..283d25fae79e 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_fw.c
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_fw.c
@@ -526,15 +526,10 @@ int fbnic_fw_xmit_ownership_msg(struct fbnic_dev *fbd, bool take_ownership)
 			goto free_message;
 	}
 
-	err = fbnic_mbx_map_tlv_msg(fbd, msg);
-	if (err)
-		goto free_message;
-
 	/* Initialize heartbeat, set last response to 1 second in the past
 	 * so that we will trigger a timeout if the firmware doesn't respond
 	 */
 	fbd->last_heartbeat_response = req_time - HZ;
-
 	fbd->last_heartbeat_request = req_time;
 
 	/* Set prev_firmware_time to 0 to avoid triggering firmware crash
@@ -542,6 +537,10 @@ int fbnic_fw_xmit_ownership_msg(struct fbnic_dev *fbd, bool take_ownership)
 	 */
 	fbd->prev_firmware_time = 0;
 
+	err = fbnic_mbx_map_tlv_msg(fbd, msg);
+	if (err)
+		goto free_message;
+
 	/* Set heartbeat detection based on if we are taking ownership */
 	fbd->fw_heartbeat_enabled = take_ownership;
 
-- 
2.54.0


^ permalink raw reply related

* RE: Ethtool : PRBS feature
From: Das, Shubham @ 2026-06-22 15:38 UTC (permalink / raw)
  To: Maxime Chevallier, Andrew Lunn
  Cc: Alexander H Duyck, lee@trager.us, netdev@vger.kernel.org,
	mkubecek@suse.cz, D H, Siddaraju, Chintalapalle, Balaji,
	Lindberg, Magnus, niklas.damberg@ericsson.com
In-Reply-To: <be5c474b-c969-49af-8235-825580ee945c@bootlin.com>

Hi Maxime,

> Can you elaborate on what you have in mind for now? What would the
> "ethtool --phy-test" command look like in terms of its behaviour and parameters?

We are trying to converge on a userspace uAPI for PRBS/BERT functionality that can work across
different hardware models (PHY-managed, MAC/NIC-offloaded, or firmware-based implementations),
without exposing those differences to userspace.

Based on the functionality we currently have, we proposed the commands below in the first email:

PRBS Transmitter/Checker Pattern Configuration:
ethtool --phy-test eth1 tx-prbs prbs7
ethtool --phy-test eth2 rx-prbs prbs7

BERT Test:
ethtool --phy-test eth2 bert start
ethtool --phy-test eth2 bert stop

BERT Test Counter Read/ PRBS Lock Status:
ethtool --phy-test eth2 stats

BERT Clear stats - Symbol and Error counter:
ethtool --phy-test eth2 clear-stats

TX Error Injection:
ethtool --phy-test eth1 inject-error 1
ethtool --phy-test eth1 inject-error 1e-3

Disable PRBS Pattern (TX/RX):
ethtool --phy-test eth1 tx-prbs off
ethtool --phy-test eth2 rx-prbs off

The approach would be to add a generic ethtool netlink API for PHY/SerDes and allow drivers to implement the operations directly.
Conceptually:
       ethtool ⇒ ethtool netlink ⇒ driver-specific implementation

We would appreciate your input on whether a command-based model is suitable for a uAPI, and how we should design
it to accommodate different implementation models, such as PHY-based, phylib-based, and MAC/firmware-offloaded PRBS.
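
To make the layering concrete, here is a rough userspace sketch of how
one callback table could hide the backend from the core. Everything in
it is hypothetical (the sketch_ names, the ops layout, the -1 error
stand-in); it is not an existing ethtool or phylib interface, only an
illustration of the "driver-specific implementation" step.

```c
#include <stddef.h>

/* Hypothetical pattern selector; only two values for brevity */
enum sketch_prbs_pattern {
	SKETCH_PRBS_OFF,
	SKETCH_PRBS7,
};

/* Hypothetical ops table a PHY, phylib, or MAC/firmware backend would fill in */
struct sketch_phy_test_ops {
	int (*set_tx_prbs)(void *priv, enum sketch_prbs_pattern pattern);
};

/* Example backend: a driver that forwards the request to firmware */
struct sketch_fw_backend {
	enum sketch_prbs_pattern tx_pattern;
	int fw_requests;
};

static int sketch_fw_set_tx_prbs(void *priv, enum sketch_prbs_pattern pattern)
{
	struct sketch_fw_backend *fw = priv;

	/* A real driver would issue a firmware command here */
	fw->tx_pattern = pattern;
	fw->fw_requests++;
	return 0;
}

static const struct sketch_phy_test_ops sketch_fw_ops = {
	.set_tx_prbs = sketch_fw_set_tx_prbs,
};

/* Core-side dispatch: userspace never learns which backend answered.
 * Returns -1 as a stand-in for "operation not supported". */
static int sketch_phy_test_set_tx(const struct sketch_phy_test_ops *ops,
				  void *priv,
				  enum sketch_prbs_pattern pattern)
{
	if (!ops || !ops->set_tx_prbs)
		return -1;

	return ops->set_tx_prbs(priv, pattern);
}
```

A register-based PHY backend would provide the same callback with MDIO
accesses instead of a firmware request, so a command such as
"tx-prbs prbs7" would map to the same dispatch call in both cases.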

- Shubham D

> -----Original Message-----
> From: Maxime Chevallier <maxime.chevallier@bootlin.com>
> Sent: 20 June 2026 20:09
> To: Das, Shubham <shubham.das@intel.com>; Andrew Lunn <andrew@lunn.ch>
> Cc: Alexander H Duyck <alexander.duyck@gmail.com>; lee@trager.us;
> netdev@vger.kernel.org; mkubecek@suse.cz; D H, Siddaraju
> <siddaraju.dh@intel.com>; Chintalapalle, Balaji
> <balaji.chintalapalle@intel.com>; Lindberg, Magnus
> <magnus.k.lindberg@ericsson.com>; niklas.damberg@ericsson.com
> Subject: Re: Ethtool : PRBS feature
> 
> Hi,
> 
> On 6/20/26 15:48, Das, Shubham wrote:
> >> Can you change the firmware to expose the 802.3 registers for PRBS?
> >> You can then write a library which both phylib and your driver can use.
> >
> > Andrew,
> >
> > No, exposing the PRBS registers to drivers is not possible in our design (the
> > registers are buried deep within the Accelerator/NIC/PHY/Analog IP hierarchy).
> >
> > Additionally, the PHY PRBS registers are not in accordance with the IEEE Clause
> > 45 definitions. For instance, the PRBS registers are paged and 32-bit wide.
> >
> > Given these constraints, we think ethtool --phy-test is a reasonable starting
> > point for exposing the long-established Ethernet PRBS functionality to Linux
> > userspace, as it aligns well with the driver-owned NIC architecture model. If you
> > think a more generic layered approach would be preferable, we would appreciate
> > guidance on the expected architecture. That would help us better understand the
> > implementation complexity, required effort, and delivery timelines.
> 
> Can you elaborate on what you have in mind for now? What would the
> "ethtool --phy-test" command look like in terms of its behaviour and parameters?
> 
> This feature is interesting for multiple people, each having different hardware
> designs and constraints. It's good to consider an iterative approach to build this,
> however we need to have in mind that this is uAPI, so once we commit to a design
> choice, we have to live with it.
> 
> We do have flexibility on the kernel side of the API. We can implement PRBS in
> generic PHY, phylib, some MAC driver that talks to a firmware, etc. and hide away
> these implementation details to userspace, but we need to make sure the uAPI
> we come up with allows us to support all of that.
> 
> Let's figure this out together, if you already have some ideas in mind we can use
> that as a starting point for the discussion :)
> 
> Maxime
> 
> >
> > Thanks,
> > Shubham D
> >
> >> -----Original Message-----
> >> From: Andrew Lunn <andrew@lunn.ch>
> >> Sent: 20 June 2026 00:07
> >> To: Das, Shubham <shubham.das@intel.com>
> >> Cc: Alexander H Duyck <alexander.duyck@gmail.com>; lee@trager.us;
> >> netdev@vger.kernel.org; mkubecek@suse.cz; D H, Siddaraju
> >> <siddaraju.dh@intel.com>; Chintalapalle, Balaji
> >> <balaji.chintalapalle@intel.com>; Lindberg, Magnus
> >> <magnus.k.lindberg@ericsson.com>; niklas.damberg@ericsson.com
> >> Subject: Re: Ethtool : PRBS feature
> >>
> >>> The host driver does not directly access any registers but requests
> >>> the PHY FW to manage PRBS on behalf of it.
> >>
> >> Maybe a dumb question. Why?
> >>
> >> Can you change the firmware to expose the 802.3 registers for PRBS?
> >> You can then write a library which both phylib and your driver can use.
> >>
> >> 	Andrew
> >


^ permalink raw reply

* Re: [PATCH] net: wwan: t7xx: destroy DMA pool on CLDMA late init failure
From: Loic Poulain @ 2026-06-22 15:38 UTC (permalink / raw)
  To: Haoxiang Li
  Cc: chandrashekar.devegowda, haijun.liu, ricardo.martinez,
	ryazanov.s.a, johannes, andrew+netdev, davem, edumazet, kuba,
	pabeni, ilpo.jarvinen, netdev, linux-kernel, stable
In-Reply-To: <20260621031714.3605022-1-haoxiang_li2024@163.com>

On Sun, Jun 21, 2026 at 5:18 AM Haoxiang Li <haoxiang_li2024@163.com> wrote:
>
> t7xx_cldma_late_init() creates md_ctrl->gpd_dmapool before
> initializing the TX and RX rings. If any ring initialization
> fails, the error path frees the already initialized rings but
> leaves the DMA pool allocated.
>
> Destroy md_ctrl->gpd_dmapool on the late-init failure path
> to avoid leaking the DMA pool.
>
> Fixes: 39d439047f1d ("net: wwan: t7xx: Add control DMA interface")
> Cc: stable@vger.kernel.org
> Signed-off-by: Haoxiang Li <haoxiang_li2024@163.com>

Reviewed-by: Loic Poulain <loic.poulain@oss.qualcomm.com>

> ---
>  drivers/net/wwan/t7xx/t7xx_hif_cldma.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/net/wwan/t7xx/t7xx_hif_cldma.c b/drivers/net/wwan/t7xx/t7xx_hif_cldma.c
> index e10cb4f9104e..2917cee9b802 100644
> --- a/drivers/net/wwan/t7xx/t7xx_hif_cldma.c
> +++ b/drivers/net/wwan/t7xx/t7xx_hif_cldma.c
> @@ -1063,6 +1063,9 @@ static int t7xx_cldma_late_init(struct cldma_ctrl *md_ctrl)
>         while (i--)
>                 t7xx_cldma_ring_free(md_ctrl, &md_ctrl->tx_ring[i], DMA_TO_DEVICE);
>
> +       dma_pool_destroy(md_ctrl->gpd_dmapool);
> +       md_ctrl->gpd_dmapool = NULL;
> +
>         return ret;
>  }
>
> --
> 2.25.1
>

^ permalink raw reply

* Re: [PATCH net v3 0/6] ipv6: fix error handling in disable_ipv6 sysctl
From: Ido Schimmel @ 2026-06-22 15:35 UTC (permalink / raw)
  To: Fernando Fernandez Mancera
  Cc: netdev, nicolas.dichtel, stephen, horms, pabeni, kuba, edumazet,
	davem, dsahern
In-Reply-To: <20260622130857.5115-1-fmancera@suse.de>

On Mon, Jun 22, 2026 at 03:08:51PM +0200, Fernando Fernandez Mancera wrote:
> While working on a different IPv6 patch series I have spotted multiple
> minor bugs around sysctl error handling and notifications. In general,
> they are not serious issues.

For the series:

Reviewed-by: Ido Schimmel <idosch@nvidia.com>

^ permalink raw reply
