Netdev List
* [PATCH net v4] openvswitch: cap upcall PID array size and pre-size vport replies
From: Weiming Shi @ 2026-04-15 12:51 UTC (permalink / raw)
  To: Aaron Conole, Eelco Chaudron, Ilya Maximets, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: Simon Horman, Pravin B Shelar, Alex Wang, Thomas Graf, netdev,
	dev, Xiang Mei, Weiming Shi

The vport netlink reply helpers allocate a fixed-size skb with
nlmsg_new(NLMSG_DEFAULT_SIZE, ...) but serialize the full upcall PID
array via ovs_vport_get_upcall_portids().  Since
ovs_vport_set_upcall_portids() accepts any non-zero multiple of
sizeof(u32) with no upper bound, a CAP_NET_ADMIN user can install a PID
array large enough to overflow the reply buffer, causing nla_put() to
fail with -EMSGSIZE and hitting BUG_ON(err < 0).  On systems with
unprivileged user namespaces enabled (e.g., Ubuntu default), this is
reachable via unshare -Urn since OVS vport mutation operations use
GENL_UNS_ADMIN_PERM.

 kernel BUG at net/openvswitch/datapath.c:2414!
 Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI
 CPU: 1 UID: 0 PID: 65 Comm: poc Not tainted 7.0.0-rc7-00195-geb216e422044 #1
 RIP: 0010:ovs_vport_cmd_set+0x34c/0x400
 Call Trace:
  <TASK>
  genl_family_rcv_msg_doit (net/netlink/genetlink.c:1116)
  genl_rcv_msg (net/netlink/genetlink.c:1194)
  netlink_rcv_skb (net/netlink/af_netlink.c:2550)
  genl_rcv (net/netlink/genetlink.c:1219)
  netlink_unicast (net/netlink/af_netlink.c:1344)
  netlink_sendmsg (net/netlink/af_netlink.c:1894)
  __sys_sendto (net/socket.c:2206)
  __x64_sys_sendto (net/socket.c:2209)
  do_syscall_64 (arch/x86/entry/syscall_64.c:63)
  entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
  </TASK>
 Kernel panic - not syncing: Fatal exception

Reject attempts to set more PIDs than nr_cpu_ids in
ovs_vport_set_upcall_portids(), and pre-compute the worst-case reply
size in ovs_vport_cmd_msg_size() based on that bound, similar to the
existing ovs_dp_cmd_msg_size().  nr_cpu_ids matches the cap already
used by the per-CPU dispatch configuration on the datapath side
(ovs_dp_cmd_fill_info() serialises at most nr_cpu_ids PIDs), so the
two sides stay consistent.

Fixes: 5cd667b0a456 ("openvswitch: Allow each vport to have an array of 'port_id's.")
Reported-by: Xiang Mei <xmei5@asu.edu>
Signed-off-by: Weiming Shi <bestswngs@gmail.com>
---
v4 (per Ilya):
- Use nr_cpu_ids instead of num_possible_cpus() for consistency with
  the per-CPU dispatch on the datapath side.
- Annotate ovs_vport_cmd_msg_size() per-attribute; split nested sums.
v3: Cap at num_possible_cpus(); add ovs_vport_cmd_msg_size(); keep
    BUG_ON(); fix Fixes tag.
v2: Dynamically size reply skb; drop WARN_ON_ONCE, return plain errors.
---
 net/openvswitch/datapath.c | 33 +++++++++++++++++++++++++++++++--
 net/openvswitch/vport.c    |  3 +++
 2 files changed, 34 insertions(+), 2 deletions(-)

diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
index e209099218b4..35e67e51b0d2 100644
--- a/net/openvswitch/datapath.c
+++ b/net/openvswitch/datapath.c
@@ -2184,9 +2184,38 @@ static int ovs_vport_cmd_fill_info(struct vport *vport, struct sk_buff *skb,
 	return err;
 }
 
+static size_t ovs_vport_cmd_msg_size(void)
+{
+	size_t msgsize = NLMSG_ALIGN(sizeof(struct ovs_header));
+
+	msgsize += nla_total_size(sizeof(u32)); /* OVS_VPORT_ATTR_PORT_NO */
+	msgsize += nla_total_size(sizeof(u32)); /* OVS_VPORT_ATTR_TYPE */
+	msgsize += nla_total_size(IFNAMSIZ);    /* OVS_VPORT_ATTR_NAME */
+	msgsize += nla_total_size(sizeof(u32)); /* OVS_VPORT_ATTR_IFINDEX */
+	msgsize += nla_total_size(sizeof(s32)); /* OVS_VPORT_ATTR_NETNSID */
+	/* OVS_VPORT_ATTR_STATS */
+	msgsize += nla_total_size_64bit(sizeof(struct ovs_vport_stats));
+	/* OVS_VPORT_ATTR_UPCALL_STATS(OVS_VPORT_UPCALL_ATTR_SUCCESS +
+	 *                             OVS_VPORT_UPCALL_ATTR_FAIL)
+	 */
+	msgsize += nla_total_size(nla_total_size_64bit(sizeof(u64)) +
+				  nla_total_size_64bit(sizeof(u64)));
+	/* OVS_VPORT_ATTR_UPCALL_PID (capped at nr_cpu_ids by
+	 * ovs_vport_set_upcall_portids())
+	 */
+	msgsize += nla_total_size(nr_cpu_ids * sizeof(u32));
+	/* OVS_VPORT_ATTR_OPTIONS(OVS_TUNNEL_ATTR_DST_PORT +
+	 *                        OVS_TUNNEL_ATTR_EXTENSION(OVS_VXLAN_EXT_GBP))
+	 */
+	msgsize += nla_total_size(nla_total_size(sizeof(u16)) +
+				  nla_total_size(nla_total_size(0)));
+
+	return msgsize;
+}
+
 static struct sk_buff *ovs_vport_cmd_alloc_info(void)
 {
-	return nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
+	return genlmsg_new(ovs_vport_cmd_msg_size(), GFP_KERNEL);
 }
 
 /* Called with ovs_mutex, only via ovs_dp_notify_wq(). */
@@ -2196,7 +2225,7 @@ struct sk_buff *ovs_vport_cmd_build_info(struct vport *vport, struct net *net,
 	struct sk_buff *skb;
 	int retval;
 
-	skb = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
+	skb = ovs_vport_cmd_alloc_info();
 	if (!skb)
 		return ERR_PTR(-ENOMEM);
 
diff --git a/net/openvswitch/vport.c b/net/openvswitch/vport.c
index 23f629e94a36..56b2e2d1a749 100644
--- a/net/openvswitch/vport.c
+++ b/net/openvswitch/vport.c
@@ -406,6 +406,9 @@ int ovs_vport_set_upcall_portids(struct vport *vport, const struct nlattr *ids)
 	if (!nla_len(ids) || nla_len(ids) % sizeof(u32))
 		return -EINVAL;
 
+	if (nla_len(ids) / sizeof(u32) > nr_cpu_ids)
+		return -EINVAL;
+
 	old = ovsl_dereference(vport->upcall_portids);
 
 	vport_portids = kmalloc(sizeof(*vport_portids) + nla_len(ids),
-- 
2.43.0



* Re: [PATCH iwl-net 1/5] iavf: fix null pointer dereference in iavf_detect_recover_hung
From: Simon Horman @ 2026-04-15 12:48 UTC (permalink / raw)
  To: aleksandr.loktionov
  Cc: Simon Horman, intel-wired-lan, kiran.patil, sylwesterx.dziedziuch,
	netdev, anthony.l.nguyen, avinash.dayanand
In-Reply-To: <20260413073035.4082204-2-aleksandr.loktionov@intel.com>

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.

For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
---
iavf: fix null pointer dereference in iavf_detect_recover_hung

During a concurrent reset, q_vectors are freed and re-allocated while
the watchdog task may still be iterating rings in
iavf_detect_recover_hung(). Dereferencing a NULL q_vector inside
iavf_force_wb() results in a crash. Guard against this by skipping
rings whose q_vector is NULL.

> Fixes: 9c6c12595b73 ("i40e: Detection and recovery of TX queue hung logic moved to service_task from tx_timeout")

The Fixes tag points to commit 9c6c12595b73, but that commit only modified
the i40e driver and did not introduce the iavf_detect_recover_hung function.

The bug was actually introduced by commit 07d44190a389 ("i40e/i40evf: Detect
and recover hung queue scenario"), which added iavf_detect_recover_hung with
the code pattern 'if (tx_ring && tx_ring->desc)' that calls
iavf_force_wb(vsi, tx_ring->q_vector) without checking if q_vector is NULL.

Should the Fixes tag be:

    Fixes: 07d44190a389 ("i40e/i40evf: Detect and recover hung queue scenario")

> diff --git a/drivers/net/ethernet/intel/iavf/iavf_txrx.c b/drivers/net/ethernet/intel/iavf/iavf_txrx.c
> index 363c42bf3dcf..e7e7fc9a5c68 100644
> --- a/drivers/net/ethernet/intel/iavf/iavf_txrx.c
> +++ b/drivers/net/ethernet/intel/iavf/iavf_txrx.c

[ ... ]

> @@ -195,8 +194,11 @@ void iavf_detect_recover_hung(struct iavf_vsi *vsi)
>  		return;
>
>  	for (i = 0; i < vsi->back->num_active_queues; i++) {
> -		tx_ring = &vsi->back->tx_rings[i];
> -		if (tx_ring && tx_ring->desc) {
> +		struct iavf_ring *tx_ring = &vsi->back->tx_rings[i];
> +
> +		if (!tx_ring || !tx_ring->q_vector)
> +			continue;

The check `if (!tx_ring || ...)` tests whether tx_ring is NULL, but tx_ring
is assigned as `&vsi->back->tx_rings[i]`, which is the address of an array
element. This address is always non-NULL by C language rules, as noted in
the commit message itself: "the compiler can never observe since an
array-element address is always non-NULL."

This creates misleading logic. Other similar code in the driver correctly
omits such checks (for example, in iavf_virtchnl.c and elsewhere in
iavf_txrx.c).

Should this be simplified to just check the q_vector:

    if (!tx_ring->q_vector)
        continue;

> +		if (tx_ring->desc) {
>  			/* If packet counter has not changed the queue is
>  			 * likely stalled, so force an interrupt for this
>  			 * queue.

[ ... ]
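
The simplified check discussed above can be illustrated with a small
userspace mock. The mock_* types and the counting helper are hypothetical
stand-ins for the driver structures, not iavf code; the sketch only shows
that the element address needs no NULL test while q_vector does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's ring and vector structures. */
struct mock_q_vector { int id; };
struct mock_ring {
	void *desc;                     /* descriptor memory, NULL if unset */
	struct mock_q_vector *q_vector; /* NULL while a reset is in flight */
};

/* Count the rings that would receive a forced writeback. */
static size_t mock_count_kickable(struct mock_ring *rings, size_t n)
{
	size_t i, kicked = 0;

	assert(rings != NULL);
	for (i = 0; i < n; i++) {
		/* Address of an array element: never NULL, so no check. */
		struct mock_ring *ring = &rings[i];

		if (!ring->q_vector)
			continue;
		if (ring->desc)
			kicked++;
	}
	return kicked;
}
```

Given three rings where one lacks a q_vector and another lacks descriptors,
only the remaining ring is counted.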


* Re: [PATCH net-next] net: stmmac: enable RPS and RBU interrupts
From: Russell King (Oracle) @ 2026-04-15 12:43 UTC (permalink / raw)
  To: Sam Edwards
  Cc: Jakub Kicinski, Andrew Lunn, Alexandre Torgue, Andrew Lunn,
	David S. Miller, Eric Dumazet,
	moderated list:BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE,
	linux-stm32, Linux Network Development Mailing List, Paolo Abeni
In-Reply-To: <CAH5Ym4jA8w9=UxMT4vKJpnXkuDHtkFtMcg4u2sy_0S+8wgy-9w@mail.gmail.com>

On Tue, Apr 14, 2026 at 07:12:34PM -0700, Sam Edwards wrote:
> On Tue, Apr 14, 2026 at 6:19 PM Russell King (Oracle)
> <linux@armlinux.org.uk> wrote:
> > Okay, just a quick note to say that nvidia's 5.10.216-tegra kernel
> > survives iperf3 -c -R to the imx6.
> 
> Hi Russell,
> 
> Aw, you beat me to it! I was about to report that 5.10.104-tegra is
> unaffected. And my iperf3 server is a multi-GbE amd64 machine.
> 
> > Dumping the registers and comparing, and then forcing the RQS and TQS
> > values to 0x23 (+1 = 36, *256 = 9216 bytes) and 0x8f (+1 = 144,
> > *256 = 36864 bytes) respectively seems to solve the problem. Under
> > net-next, these both end up being 0xff (+1 = 256, *256 = 65536 bytes.)
> > Suspiciously, 36 * 4 = 144, and I also see that this kernel programs
> > all four of the MTL receive operation mode registers, but only the
> > first MTL transmit operation mode register. However, DMA channels 1-3
> > aren't initialised.
> 
> Wow, great! I wonder if the problem is that the MTL FIFOs are smaller
> than that, so when the DMA suffers a momentary hiccup, the FIFOs are
> allowed to overflow, putting the hardware in a bad state.
> 
> Though I suspect this is only half of the problem: do you still see
> RBUs? Everything you've shared so far suggests the DMA failures are
> _not_ because the rx ring is drying up.

Yes. Note that RBUs will happen not because of DMA failures, but if
the kernel fails to keep up with the packet rate. RBU means "we read
the next descriptor, and it wasn't owned by hardware".

> > Looking back at 5.10, I don't see any code that would account for these
> > values being programmed for TQS and RQS, it looks like the calculations
> > are basically the same as we have today.
> 
> Note that Nvidia have their own "nvethernet" driver for their vendor
> kernel, which appears to pick the FIFO sizes from hardcoded tables in
> its eqos_configure_mtl_queue() [1] function.

That has:

	const nveu32_t rx_fifo_sz[2U][OSI_EQOS_MAX_NUM_QUEUES] = {
		{ FIFO_SZ(9U), FIFO_SZ(9U), FIFO_SZ(9U), FIFO_SZ(9U),
		  FIFO_SZ(1U), FIFO_SZ(1U), FIFO_SZ(1U), FIFO_SZ(1U) },
		{ FIFO_SZ(36U), FIFO_SZ(2U), FIFO_SZ(2U), FIFO_SZ(2U),
		  FIFO_SZ(2U), FIFO_SZ(2U), FIFO_SZ(2U), FIFO_SZ(16U) },
	};
	const nveu32_t tx_fifo_sz[2U][OSI_EQOS_MAX_NUM_QUEUES] = {
		{ FIFO_SZ(9U), FIFO_SZ(9U), FIFO_SZ(9U), FIFO_SZ(9U),
		  FIFO_SZ(1U), FIFO_SZ(1U), FIFO_SZ(1U), FIFO_SZ(1U) },
		{ FIFO_SZ(8U), FIFO_SZ(8U), FIFO_SZ(8U), FIFO_SZ(8U),
		  FIFO_SZ(8U), FIFO_SZ(8U), FIFO_SZ(8U), FIFO_SZ(8U) },
	};

where each of those values is the RQS/TQS value to use in KiB:

#define FIFO_SZ(x)		((((x) * 1024U) / 256U) - 1U)

This doesn't correspond with the values I'm seeing programmed into
the hardware under the 5.10.216-tegra kernel. I'm seeing TQS = 143
(36KiB), and RQS = 35 (9KiB). Yes, these values exist in the tables
above from a quick look, but they're not in the right place!

For example, tx_fifo_sz[] doesn't contain an entry for 36KiB.
rx_fifo_sz[0][0..3] looks plausible.

It's certainly not a case of misreading the register values, this is
what devmem2 said:

Value at address 0x02490d00: 0x008f000a
Value at address 0x02490d30: 0x02379eb0

where TQS is bits 24:16 of the register at offset 0xd00 - which is
0x8f, and RQS is bits 29:20 of the register at 0xd30, which is
0x23.
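
The decoding above can be double-checked with a few lines of userspace C.
The helper names are made up for illustration; the bit positions and the
"(value + 1) * 256 bytes" encoding are taken from this thread, and
kib_to_qs() repeats the arithmetic of the quoted FIFO_SZ() macro.

```c
#include <assert.h>
#include <stdint.h>

/* Extract bits [hi:lo] from a 32-bit register value. */
static uint32_t field(uint32_t reg, unsigned int hi, unsigned int lo)
{
	assert(hi >= lo && hi < 32);
	uint32_t width = hi - lo + 1;
	uint32_t mask = width == 32 ? 0xffffffffu : ((1u << width) - 1u);

	return (reg >> lo) & mask;
}

/* TQS/RQS encode the queue size as (bytes / 256) - 1. */
static uint32_t qs_to_bytes(uint32_t qs)
{
	return (qs + 1u) * 256u;
}

/* Same arithmetic as the quoted FIFO_SZ() macro: KiB to register value. */
static uint32_t kib_to_qs(uint32_t kib)
{
	return (kib * 1024u) / 256u - 1u;
}
```

For example, field(0x008f000a, 24, 16) gives 0x8f and
field(0x02379eb0, 29, 20) gives 0x23, which qs_to_bytes() turns into
36864 and 9216 bytes respectively.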

Now, as for FIFO sizes, if we sum up all the entries, then we
get:

SUM(rx_fifo_size[0][]) = 60KiB
SUM(rx_fifo_size[1][]) = 64KiB
SUM(tx_fifo_size[0][]) = 60KiB
SUM(tx_fifo_size[1][]) = 64KiB

From what I gather in core_local.h, l_mac_ver contains one of three
values - 0 = Legacy EQOS, 1 = Orin EQOS, 2 = Orin MGBE, and which
set of values is selected by bit 0 of that. Decoding this further,
Legacy EQOS is IP version v5.0, Orin EQOS is v5.3, and Orin MGBE
is v3.1 and v4.0.

So, I wonder whether there's something in "Legacy EQOS" that consumes
4KiB of FIFO that isn't documented in iMX8M (IP v5.1).

Is anyone aware of public SoC documentation that covers the v5.0 IP
version?

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!


* [syzbot ci] Re: [PATCH net] tipc: fix UAF race in tipc_mon_peer_up/down/remove_peer vs bearer teardown
From: syzbot ci @ 2026-04-15 12:39 UTC (permalink / raw)
  To: jmaloy, kai.aizen.dev, kuba, netdev, pabeni, stable,
	tipc-discussion, ying.xue
  Cc: syzbot, syzkaller-bugs
In-Reply-To: <20260415061211.45530-1-95986478+SnailSploit@users.noreply.github.com>

syzbot ci has tested the following series

[v1] [PATCH net] tipc: fix UAF race in tipc_mon_peer_up/down/remove_peer vs bearer teardown
https://lore.kernel.org/all/20260415061211.45530-1-95986478+SnailSploit@users.noreply.github.com
* [PATCH] [PATCH net] tipc: fix UAF race in tipc_mon_peer_up/down/remove_peer vs bearer teardown

and found the following issue:
WARNING: suspicious RCU usage in tipc_mon_delete

Full report is available here:
https://ci.syzbot.org/series/6267bc07-4172-4821-b3e5-dac381479d9d

***

WARNING: suspicious RCU usage in tipc_mon_delete

tree:      net-next
URL:       https://kernel.googlesource.com/pub/scm/linux/kernel/git/netdev/net-next.git
base:      35c2c39832e569449b9192fa1afbbc4c66227af7
arch:      amd64
compiler:  Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
config:    https://ci.syzbot.org/builds/a29dabe7-96d8-4072-bc2c-d798a349301e/config
syz repro: https://ci.syzbot.org/findings/f144d75a-7c29-41a1-988e-09892a89baa1/syz_repro

tipc: Disabling bearer <eth:syzkaller0>
=============================
WARNING: suspicious RCU usage
syzkaller #0 Not tainted
-----------------------------
net/tipc/monitor.c:108 suspicious rcu_dereference_check() usage!

other info that might help us debug this:


rcu_scheduler_active = 2, debug_locks = 1
1 lock held by syz.2.19/5962:
 #0: ffffffff8fbcba48 (rtnl_mutex){+.+.}-{4:4}, at: tun_detach drivers/net/tun.c:634 [inline]
 #0: ffffffff8fbcba48 (rtnl_mutex){+.+.}-{4:4}, at: tun_chr_close+0x3e/0x1c0 drivers/net/tun.c:3438

stack backtrace:
CPU: 1 UID: 0 PID: 5962 Comm: syz.2.19 Not tainted syzkaller #0 PREEMPT(full) 
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
 lockdep_rcu_suspicious+0x13f/0x1d0 kernel/locking/lockdep.c:6876
 tipc_monitor_rcu_bh+0xf5/0x110 net/tipc/monitor.c:108
 get_self net/tipc/monitor.c:209 [inline]
 tipc_mon_delete+0x10b/0x4d0 net/tipc/monitor.c:704
 tipc_l2_device_event+0x370/0x680 net/tipc/bearer.c:-1
 notifier_call_chain+0x1be/0x400 kernel/notifier.c:85
 call_netdevice_notifiers_extack net/core/dev.c:2287 [inline]
 call_netdevice_notifiers net/core/dev.c:2301 [inline]
 unregister_netdevice_many_notify+0x17a5/0x22c0 net/core/dev.c:12464
 unregister_netdevice_many net/core/dev.c:12527 [inline]
 unregister_netdevice_queue+0x31f/0x360 net/core/dev.c:12337
 unregister_netdevice include/linux/netdevice.h:3427 [inline]
 __tun_detach+0x6d9/0x15d0 drivers/net/tun.c:621
 tun_detach drivers/net/tun.c:637 [inline]
 tun_chr_close+0x10a/0x1c0 drivers/net/tun.c:3438
 __fput+0x44f/0xa70 fs/file_table.c:469
 task_work_run+0x1d9/0x270 kernel/task_work.c:233
 resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
 __exit_to_user_mode_loop kernel/entry/common.c:67 [inline]
 exit_to_user_mode_loop+0xed/0x480 kernel/entry/common.c:98
 __exit_to_user_mode_prepare include/linux/irq-entry-common.h:226 [inline]
 syscall_exit_to_user_mode_prepare include/linux/irq-entry-common.h:256 [inline]
 syscall_exit_to_user_mode include/linux/entry-common.h:325 [inline]
 do_syscall_64+0x32d/0xf80 arch/x86/entry/syscall_64.c:100
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f7b26d9c819
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffec30cee78 EFLAGS: 00000246 ORIG_RAX: 00000000000001b4
RAX: 0000000000000000 RBX: 00007ffec30cef60 RCX: 00007f7b26d9c819
RDX: 0000000000000000 RSI: 000000000000001e RDI: 0000000000000003
RBP: 0000000000011900 R08: 0000000000000001 R09: 0000000000000000
R10: 0000001b2e520000 R11: 0000000000000246 R12: 00007ffec30cefa0
R13: 00007f7b27015fac R14: 000000000001193b R15: 00007f7b27015fa0
 </TASK>


***

If these findings have caused you to resend the series or submit a
separate fix, please add the following tag to your commit message:
  Tested-by: syzbot@syzkaller.appspotmail.com

---
This report is generated by a bot. It may contain errors.
syzbot ci engineers can be reached at syzkaller@googlegroups.com.

To test a patch for this bug, please reply with `#syz test`
(should be on a separate line).

The patch should be attached to the email.
Note: arguments like custom git repos and branches are not supported.


* RE: [PATCH] net/sched: sch_dualpi2: fix NULL pointer dereference in dualpi2_change()
From: Chia-Yu Chang (Nokia) @ 2026-04-15 12:37 UTC (permalink / raw)
  To: Simon Horman, veritas501
  Cc: jhs@mojatatu.com, jiri@resnulli.us, davem@davemloft.net,
	edumazet@google.com, kuba@kernel.org,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
	pabeni@redhat.com
In-Reply-To: <20260415123403.GF772670@horms.kernel.org>

> -----Original Message-----
> From: Simon Horman <horms@kernel.org> 
> Sent: Wednesday, April 15, 2026 2:34 PM
> To: veritas501 <hxzene@gmail.com>
> Cc: Chia-Yu Chang (Nokia) <chia-yu.chang@nokia-bell-labs.com>; jhs@mojatatu.com; jiri@resnulli.us; davem@davemloft.net; edumazet@google.com; kuba@kernel.org; linux-kernel@vger.kernel.org; netdev@vger.kernel.org; pabeni@redhat.com
> Subject: Re: [PATCH] net/sched: sch_dualpi2: fix NULL pointer dereference in dualpi2_change()
> 
> On Wed, Apr 15, 2026 at 10:31:58AM +0800, veritas501 wrote:
> > From: "Kito Xu (veritas501)" <hxzene@gmail.com>
> >
> > Hi Simon,
> >
> > Thanks for the review and for pointing out the alternative patch from 
> > Chia-Yu. I agree that the more comprehensive fix is the better choice 
> > for this code path.
> >
> > Since I independently discovered and reported this issue, would it be 
> > possible to add a Reported-by tag to Chia-Yu's patch?
> >
> >     Reported-by: "Kito Xu (veritas501)" <hxzene@gmail.com>
> 
> That sounds reasonable to me.
> But it might be best to bring up in a response to Chia-Yu's patch.
> (I can't make it happen myself.)
> 
> >
> > Either way, thanks for handling this!
> >
> > Best regards,
> > Kito

Hi Kito and Simon,

Sure, I will add the proposed tag to the other patch in v2 (let's wait for more feedback before submitting v2).
Thanks!

Chia-Yu


* Re: [PATCH net v2 3/4] net: mana: Don't overwrite port probe error with add_adev result
From: Simon Horman @ 2026-04-15 12:37 UTC (permalink / raw)
  To: Erni Sri Satya Vennela
  Cc: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
	edumazet, kuba, pabeni, ssengar, dipayanroy, gargaditya,
	shirazsaleem, kees, kotaranov, leon, shacharr, stephen,
	linux-hyperv, netdev, linux-kernel
In-Reply-To: <ad84k4qaMd9wnYwd@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net>

On Wed, Apr 15, 2026 at 12:04:51AM -0700, Erni Sri Satya Vennela wrote:
> On Tue, Apr 14, 2026 at 04:35:03PM +0100, Simon Horman wrote:
> > From: 'Simon Horman' <horms@kernel.org>
> > 
> > This is an AI-generated review of your patch. The human sending this
> > email has considered the AI review valid, or at least plausible.
> > Full review at: https://sashiko.dev
> > ---
> > net: mana: Don't overwrite port probe error with add_adev result
> > 
> > This commit ensures that if mana_probe_port() or mana_attach() fails, the
> > error is not overwritten by the result of add_adev(). By preserving the
> > error, it triggers mana_remove() cleanup instead of leaving ports partially
> > initialized.
> > 
> > > diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
> > > index 9d095a6fb56c7..4d42cbfb3f15b 100644
> > > --- a/drivers/net/ethernet/microsoft/mana/mana_en.c
> > > +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
> > > @@ -3634,10 +3634,9 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
> > >  	if (!resuming) {
> > >  		for (i = 0; i < ac->num_ports; i++) {
> > >  			err = mana_probe_port(ac, i, &ac->ports[i]);
> > > -			/* we log the port for which the probe failed and stop
> > > -			 * probes for subsequent ports.
> > > -			 * Note that we keep running ports, for which the probes
> > > -			 * were successful, unless add_adev fails too
> > > +			/* Log the port for which the probe failed, stop probing
> > > +			 * subsequent ports, and skip add_adev.
> > > +			 * Already-probed ports remain functional.
> > >  			 */
> > 
> > Are these comments accurate regarding ports remaining functional?
> > 
> > Since this patch preserves the error code, execution falls through to the end
> > of mana_probe() where it checks if err is set and calls mana_remove().
> > mana_remove() will iterate over the successfully allocated ports, completely
> > detach them, unregister their net_devices, and destroy event queues.
> > 
> > It appears no ports remain functional because the device is completely torn
> > down.
> >
> I will update the patch with the updated comment message. 

Thanks.

> > [ ... ]
> > 
> > > @@ -3651,10 +3650,9 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
> > >  			enable_work(&apc->queue_reset_work);
> > >  			err = mana_attach(ac->ports[i]);
> > >  			rtnl_unlock();
> > > -			/* we log the port for which the attach failed and stop
> > > -			 * attach for subsequent ports
> > > -			 * Note that we keep running ports, for which the attach
> > > -			 * were successful, unless add_adev fails too
> > > +			/* Log the port for which the attach failed, stop
> > > +			 * attaching subsequent ports, and skip add_adev.
> > > +			 * Already-attached ports remain functional.
> > >  			 */
> > >  			if (err) {
> > >  				dev_err(dev, "Attach Failed for port %d\n", i);
> > > @@ -3663,7 +3661,8 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
> > 
> > Does propagating this error during PM resume introduce a NULL pointer
> > dereference?
> > 
> > If PM resume fails here (resuming == true), the preserved error code causes
> > mana_probe() to call mana_remove(), which tears down the device and sets
> > gd->gdma_context = NULL.
> > 
> > However, a failed resume callback does not automatically unbind the driver.
> > When the device is eventually unbound later, mana_remove() is invoked again:
> > 
> > drivers/net/ethernet/microsoft/mana/mana_en.c:mana_remove() {
> >     struct gdma_context *gc = gd->gdma_context;
> >     struct device *dev = gc->dev;
> >     ...
> > }
> > 
> > In the second invocation, gd->gdma_context is NULL, meaning gc is NULL,
> > which could lead to a kernel panic when dereferencing gc->dev.
> 
> Thank you for pointing it out, Simon.
> Since this is a pre-existing bug, I will create a different patch for
> this change and make it as part of this patchset.

Likewise, thanks.

FTR, if it is a pre-existing bug then I don't think it needs
to block progress of your patchset. Even if fixing things
sooner rather than later is a good maxim.
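
For illustration only, the double-teardown hazard described in the review
can be modelled with a userspace mock. The mock_* names are hypothetical,
and the early return is just one possible guard, assumed here to make the
second call harmless; it is not the fix that will be posted for the driver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's context structures. */
struct mock_gdma_context { int dev; };
struct mock_gdma_dev { struct mock_gdma_context *gdma_context; };

/* Returns 1 if a teardown ran, 0 if there was nothing left to tear down. */
static int mock_remove(struct mock_gdma_dev *gd)
{
	struct mock_gdma_context *gc;

	assert(gd != NULL);
	gc = gd->gdma_context;
	if (!gc)
		return 0; /* already torn down: do not touch gc->dev */

	/* A real teardown would use gc here before dropping the pointer. */
	gd->gdma_context = NULL;
	return 1;
}
```

Calling mock_remove() twice on the same device performs the teardown once
and turns the second call into a no-op instead of a NULL dereference.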


* Re: [PATCH] net/sched: sch_dualpi2: fix NULL pointer dereference in dualpi2_change()
From: Simon Horman @ 2026-04-15 12:34 UTC (permalink / raw)
  To: veritas501
  Cc: chia-yu.chang, jhs, jiri, davem, edumazet, kuba, linux-kernel,
	netdev, pabeni
In-Reply-To: <20260414160000.1@hxzene.gmail.com>

On Wed, Apr 15, 2026 at 10:31:58AM +0800, veritas501 wrote:
> From: "Kito Xu (veritas501)" <hxzene@gmail.com>
> 
> Hi Simon,
> 
> Thanks for the review and for pointing out the alternative patch
> from Chia-Yu. I agree that the more comprehensive fix is the better
> choice for this code path.
> 
> Since I independently discovered and reported this issue, would it
> be possible to add a Reported-by tag to Chia-Yu's patch?
> 
>     Reported-by: "Kito Xu (veritas501)" <hxzene@gmail.com>

That sounds reasonable to me.
But it might be best to bring up in a response to Chia-Yu's patch.
(I can't make it happen myself.)

> 
> Either way, thanks for handling this!
> 
> Best regards,
> Kito


* Re: [PATCH v2] Bluetooth: Add Broadcom channel priority commands
From: Sasha Finkelstein @ 2026-04-15 12:33 UTC (permalink / raw)
  To: Luiz Augusto von Dentz
  Cc: Sven Peter, Janne Grunau, Neal Gompa, Marcel Holtmann,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, linux-kernel, asahi, linux-arm-kernel,
	linux-bluetooth, netdev
In-Reply-To: <CABBYNZJAEqwfTuVqbFAnx97HBSjcwn3Hb+y+r4r2C=MMPxFoDg@mail.gmail.com>

On Tue, 14 Apr 2026 at 16:00, Luiz Augusto von Dentz
<luiz.dentz@gmail.com> wrote:
> > +       if (sock)
> > +               set_bit(SOCK_CUSTOM_SOCKOPT, &sock->flags);
>
> This is more complicated than it needs to be. I'd just add a new
> callback, `hdev->set_priority(handle, skb->priority)`, so the driver
> is called whenever it needs to elevate a connection's priority, that
> said there could be cases where a connection needs its priority set
> momentarily to transmit A2DP, followed by OBEX packets that are best
> effort. Therefore, `hci_conn` will probably need to track the priority
> so it can detect when it needs changing on a per skb basis.

I have tested per-skb priorities, and unfortunately, this does not work.
If something tries to send a low-priority packet (for example - a volume
adjustment), a priority drop causes the same kind of dropout that is
caused by scans. It appears that the only way to make this hardware work
is to set the entire hci connection as high priority for as long as it
is being used to transmit audio.


* Re: [PATCH net-next v3 00/12] BIG TCP for UDP tunnels
From: Alice Mikityanska @ 2026-04-15 12:14 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Alice Mikityanska, Daniel Borkmann, David S. Miller, Eric Dumazet,
	Paolo Abeni, Xin Long, Willem de Bruijn, David Ahern,
	Nikolay Aleksandrov, Shuah Khan, Stanislav Fomichev, Andrew Lunn,
	Simon Horman, Florian Westphal, netdev
In-Reply-To: <20260413155552.5cd00bc0@kernel.org>

On Tue, 14 Apr 2026 at 01:55, Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Fri, 10 Apr 2026 18:09:31 +0300 Alice Mikityanska wrote:
> > This series is a follow-up to "BIG TCP without HBH in IPv6", and it adds
> > support for BIG TCP IPv4/IPv6 workloads in vxlan and geneve. Now that
> > IPv6 BIG TCP doesn't require stripping the HBH in all various
> > combinations in tunneled traffic, adding BIG TCP becomes feasible.
>
> No longer applies, sorry :(

That's a pity :(. I see that the only conflict is because udplite
parts have been removed from net/netfilter/nf_conntrack_proto_udp.c,
so I just need to drop my change that touches udplite.

> We'll have to revisit after the merge window.

OK, I'll resubmit after the merge window. I'd appreciate it if I can
still collect review comments in the meantime.

> --
> pw-bot: cr


* Re: [PATCH net v3 2/3] vsock/test: fix MSG_PEEK handling in recv_buf()
From: Stefano Garzarella @ 2026-04-15 11:54 UTC (permalink / raw)
  To: Luigi Leonardi
  Cc: Stefan Hajnoczi, Michael S. Tsirkin, Jason Wang, Xuan Zhuo,
	Eugenio Pérez, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Arseniy Krasnov, kvm, virtualization,
	netdev, linux-kernel
In-Reply-To: <ad9uYrUjgCkW1D_k@sgarzare-redhat>

On Wed, Apr 15, 2026 at 01:31:11PM +0200, Stefano Garzarella wrote:
>On Tue, Apr 14, 2026 at 06:10:22PM +0200, Luigi Leonardi wrote:
>>`recv_buf` does not handle the MSG_PEEK flag correctly: it keeps calling
>>`recv` until all requested bytes are available or an error occurs.
>>
>>The problem is how it calculates the amount of bytes read: MSG_PEEK
>>doesn't consume any bytes, will re-read the same bytes from the buffer
>>head, so, summing the return value every time is wrong.
>>
>>Moreover, MSG_PEEK doesn't consume the bytes in the buffer, so if the
>>requested amount is more than the bytes available, the loop will never
>>terminate, because `recv` will never return EOF. For this reason we need
>>to compare the amount of read bytes with the number of bytes expected.
>>
>>Add a check, and if the MSG_PEEK flag is present, update the counter of
>>read bytes differently, and break if we read the expected amount.
>
>nit: "..., update the counter for bytes read only after all expected
>bytes have been read and break out of the loop; otherwise, try again
>after a short delay to avoid consuming too many CPU cycles."
>
>>
>>This allows us to simplify the `test_stream_credit_update_test`, by
>>reusing `recv_buf`, like some other tests already do.
>>
>>This also fixes callers that pass MSG_PEEK to recv_buf().
>
>nit: this is implicit from the first part of the description.
>
>>
>>Suggested-by: Stefano Garzarella <sgarzare@redhat.com>
>>Signed-off-by: Luigi Leonardi <leonardi@redhat.com>
>>---
>>tools/testing/vsock/util.c       | 15 +++++++++++++++
>>tools/testing/vsock/vsock_test.c | 13 +------------
>>2 files changed, 16 insertions(+), 12 deletions(-)
>>
>>diff --git a/tools/testing/vsock/util.c b/tools/testing/vsock/util.c
>>index 1fe1338c79cd..2c9ee3210090 100644
>>--- a/tools/testing/vsock/util.c
>>+++ b/tools/testing/vsock/util.c
>>@@ -381,7 +381,13 @@ void send_buf(int fd, const void *buf, size_t len, int flags,
>>	}
>>}
>>
>>+#define RECV_PEEK_RETRY_USEC 10
>
>10 usec IMO are a bit low, it could be the same order of the syscalls 
>involved in the loop, I'd go to some milliseconds like we do for 
>SEND_SLEEP_USEC.
>
>>+
>>/* Receive bytes in a buffer and check the return value.
>>+ *
>>+ * MSG_PEEK note: MSG_PEEK doesn't consume bytes from the buffer, so partial
>>+ * reads cannot be summed. Instead, the function retries until recv() returns
>>+ * exactly expected_ret bytes in a single call.
>
>I'd replace with something like this:
>
>   * When MSG_PEEK is set, recv() is retried until it returns exactly
>   * expected_ret bytes. The function returns on error, EOF, or timeout
>   * as usual.
>
>Thanks,
>Stefano
>
>> *
>> * expected_ret:
>> *  <0 Negative errno (for testing errors)
>>@@ -403,6 +409,15 @@ void recv_buf(int fd, void *buf, size_t len, int flags, ssize_t expected_ret)
>>		if (ret <= 0)
>>			break;
>>
>>+		if (flags & MSG_PEEK) {
>>+			if (ret == expected_ret) {

On second thought, I think it would be more appropriate to check for
`ret >= expected_ret` here, because all subsequent recv() will
definitely return more bytes, so there’s no point in continuing the
loop... and anyway, we’ll check the result later, so just that change
should be fine.

And of course I'd update the comment on top in this way:

    * When MSG_PEEK is set, recv() is retried until it returns at least
    * expected_ret bytes. The function returns on error, EOF, or timeout
    * as usual.

Thanks,
Stefano
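
To make the "at least expected_ret" rule concrete, here is a minimal
userspace sketch of a peek-and-retry loop. It is not the vsock test
suite's recv_buf(): the helper name peek_at_least(), its parameters, the
bounded retry count, and the use of MSG_DONTWAIT are all assumptions made
for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical helper (not part of tools/testing/vsock): peek until at
 * least 'expected' bytes are queued, retrying up to 'max_tries' times.
 * Returns the peeked byte count, 0 on EOF, or -1 with errno set.
 */
static ssize_t peek_at_least(int fd, void *buf, size_t len, size_t expected,
			     long retry_usec, unsigned int max_tries)
{
	struct timespec delay = {
		.tv_sec = retry_usec / 1000000L,
		.tv_nsec = (retry_usec % 1000000L) * 1000L,
	};
	unsigned int i;

	for (i = 0; i < max_tries; i++) {
		/* MSG_PEEK leaves the data queued, so counts are never summed. */
		ssize_t ret = recv(fd, buf, len, MSG_PEEK | MSG_DONTWAIT);

		if (ret == 0)
			return 0;		/* EOF */
		if (ret < 0 && errno != EAGAIN && errno != EWOULDBLOCK &&
		    errno != EINTR)
			return -1;		/* real error */
		if (ret > 0 && (size_t)ret >= expected)
			return ret;		/* enough data is queued */

		/* Not enough data yet: back off briefly before retrying. */
		nanosleep(&delay, NULL);
	}

	errno = ETIMEDOUT;
	return -1;
}
```

A caller would pass a retry delay in the millisecond range, in line with
the comment on RECV_PEEK_RETRY_USEC, and still compare the returned count
against the expected value afterwards.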

>>+				nread = ret;
>>+				break;
>>+			}
>>+			timeout_usleep(RECV_PEEK_RETRY_USEC);
>>+			continue;
>>+		}
>>+
>>		nread += ret;
>>	} while (nread < len);
>>	timeout_end();
>>diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c
>>index 5bd20ccd9335..bdb0754965df 100644
>>--- a/tools/testing/vsock/vsock_test.c
>>+++ b/tools/testing/vsock/vsock_test.c
>>@@ -1500,18 +1500,7 @@ static void test_stream_credit_update_test(const struct test_opts *opts,
>>	}
>>
>>	/* Wait until there will be 128KB of data in rx queue. */
>>-	while (1) {
>>-		ssize_t res;
>>-
>>-		res = recv(fd, buf, buf_size, MSG_PEEK);
>>-		if (res == buf_size)
>>-			break;
>>-
>>-		if (res <= 0) {
>>-			fprintf(stderr, "unexpected 'recv()' return: %zi\n", res);
>>-			exit(EXIT_FAILURE);
>>-		}
>>-	}
>>+	recv_buf(fd, buf, buf_size, MSG_PEEK, buf_size);
>>
>>	/* There is 128KB of data in the socket's rx queue, dequeue first
>>	 * 64KB, credit update is sent if 'low_rx_bytes_test' == true.
>>
>>-- 
>>2.53.0
>>


^ permalink raw reply

* Re: [PATCH net-next v2 5/5] selftests: net: add veth BQL stress test
From: Breno Leitao @ 2026-04-15 11:47 UTC (permalink / raw)
  To: hawk
  Cc: netdev, kernel-team, Jonas Köppeler, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
	Shuah Khan, linux-kernel, linux-kselftest
In-Reply-To: <20260413094442.1376022-6-hawk@kernel.org>

On Mon, Apr 13, 2026 at 11:44:38AM +0200, hawk@kernel.org wrote:
> From: Jesper Dangaard Brouer <hawk@kernel.org>
> 
> Add a selftest that exercises veth's BQL (Byte Queue Limits) code path
> under sustained UDP load. The test creates a veth pair with GRO enabled
> (activating the NAPI path and BQL), attaches a qdisc, optionally loads
> iptables rules in the consumer namespace to slow NAPI processing, and
> floods UDP packets for a configurable duration.
> 
> The test serves two purposes: benchmarking BQL's latency impact under
> configurable load (iptables rules, qdisc type and parameters), and
> detecting kernel BUG/Oops from DQL accounting mismatches. It monitors
> dmesg throughout the run and reports PASS/FAIL via kselftest (lib.sh).
> 
> Diagnostic output is printed every 5 seconds:
>   - BQL sysfs inflight/limit and watchdog tx_timeout counter
>   - qdisc stats: packets, drops, requeues, backlog, qlen, overlimits
>   - consumer PPS and NAPI-64 cycle time (shows fq_codel target impact)
>   - sink PPS (per-period delta), latency min/avg/max (stddev at exit)
>   - ping RTT to measure latency under load
> 
> Generating enough traffic to fill the 256-entry ptr_ring requires care:
> the UDP sendto() path charges each SKB to sk_wmem_alloc, and the SKB
> stays charged (via sock_wfree destructor) until the consumer NAPI thread
> finishes processing it -- including any iptables rules in the receive
> path. With the default sk_sndbuf (~208KB from wmem_default), only ~93
> packets can be in-flight before sendto(MSG_DONTWAIT) returns EAGAIN.
> Since 93 < 256 ring entries, the ring never fills and no backpressure
> occurs. The test raises wmem_max via sysctl and sets SO_SNDBUF=1MB on
> the flood socket to remove this bottleneck. An earlier multi-namespace
> routing approach avoided this limit because ip_forward creates new SKBs
> detached from the sender's socket.
> 
> The --bql-disable option (sets limit_min=1GB) enables A/B comparison.
> Typical results with --nrules 6000 --qdisc-opts 'target 2ms interval 20ms':
> 
>   fq_codel + BQL disabled:  ping RTT ~10.8ms, 15% loss, 400KB in ptr_ring
>   fq_codel + BQL enabled:   ping RTT ~0.6ms,   0% loss, 4KB in ptr_ring
> 
> Both cases show identical consumer speed (~20Kpps) and fq_codel drops
> (~255K), proving the improvement comes purely from where packets buffer.
> 
> BQL moves buffering from the ptr_ring into the qdisc, where AQM
> (fq_codel/CAKE) can act on it -- eliminating the "dark buffer" that
> hides congestion from the scheduler.
> 
> The --qdisc-replace mode cycles through sfq/pfifo/fq_codel/noqueue
> under active traffic to verify that stale BQL state (STACK_XOFF) is
> properly handled during live qdisc transitions.
> 
> A companion wrapper (veth_bql_test_virtme.sh) launches the test inside
> a virtme-ng VM, with .config validation to prevent silent stalls.
> 
> Usage:
>   sudo ./veth_bql_test.sh [--duration 300] [--nrules 100]
>                           [--qdisc sfq] [--qdisc-opts '...']
>                           [--bql-disable] [--normal-napi]
>                           [--qdisc-replace]
> 
> Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org>
> Tested-by: Jonas Köppeler <j.koeppeler@tu-berlin.de>

Tested-by: Breno Leitao <leitao@debian.org>

> diff --git a/tools/testing/selftests/net/config b/tools/testing/selftests/net/config
> index 2a390cae41bf..7b1f41421145 100644
> --- a/tools/testing/selftests/net/config
> +++ b/tools/testing/selftests/net/config
> @@ -97,6 +97,7 @@ CONFIG_NET_PKTGEN=m
>  CONFIG_NET_SCH_ETF=m
>  CONFIG_NET_SCH_FQ=m
>  CONFIG_NET_SCH_FQ_CODEL=m
> +CONFIG_NET_SCH_SFQ=m

nit: This breaks the alphabetical ordering of the config file.

^ permalink raw reply

* Re: [PATCH net v3 3/3] vsock/test: add MSG_PEEK after partial recv test
From: Stefano Garzarella @ 2026-04-15 11:40 UTC (permalink / raw)
  To: Luigi Leonardi
  Cc: Stefan Hajnoczi, Michael S. Tsirkin, Jason Wang, Xuan Zhuo,
	Eugenio Pérez, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Arseniy Krasnov, kvm, virtualization,
	netdev, linux-kernel
In-Reply-To: <20260414-fix_peek-v3-3-e7daead49f83@redhat.com>

On Tue, Apr 14, 2026 at 06:10:23PM +0200, Luigi Leonardi wrote:
>Add a test that verifies MSG_PEEK works correctly after a partial
>recv().
>
>This is to test a bug that was present in the
>`virtio_transport_stream_do_peek()` when computing the number of bytes to
>copy: After a partial read, the peek function didn't take into
>consideration the number of bytes that were already read. So peeking the
>whole buffer would cause an out-of-bounds read, that resulted in a -EFAULT.
>
>This test does exactly this: do a partial recv on a buffer, then try to
>peek the whole buffer content.

nit: I think it's better to also mention that we are re-using
test_stream_msg_peek_client() for this test.

>
>Signed-off-by: Luigi Leonardi <leonardi@redhat.com>
>---
> tools/testing/vsock/vsock_test.c | 37 +++++++++++++++++++++++++++++++++++++
> 1 file changed, 37 insertions(+)
>
>diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c
>index bdb0754965df..ab387a13f0ae 100644
>--- a/tools/testing/vsock/vsock_test.c
>+++ b/tools/testing/vsock/vsock_test.c
>@@ -346,6 +346,38 @@ static void test_stream_msg_peek_server(const struct test_opts *opts)
> 	return test_msg_peek_server(opts, false);
> }
>
>+static void test_stream_peek_after_recv_server(const struct test_opts *opts)
>+{
>+	unsigned char buf_normal[MSG_PEEK_BUF_LEN];
>+	unsigned char buf_peek[MSG_PEEK_BUF_LEN];
>+	int fd;
>+
>+	fd = vsock_stream_accept(VMADDR_CID_ANY, opts->peer_port, NULL);
>+	if (fd < 0) {
>+		perror("accept");
>+		exit(EXIT_FAILURE);
>+	}
>+
>+	control_writeln("SRVREADY");
>+
>+	/* Partial recv to advance offset within the skb */
>+	recv_buf(fd, buf_normal, 1, 0, 1);
>+
>+	/* Ask more bytes than available */

nit:	/* Peek with a buffer larger than the remaining data */

>+	recv_buf(fd, buf_peek, sizeof(buf_peek), MSG_PEEK, sizeof(buf_peek) - 1);
>+
>+	/* Recv rest of the data */

nit:	/* Consume the remaining data */

>+	recv_buf(fd, buf_normal, sizeof(buf_normal) - 1, 0, sizeof(buf_normal) - 1);
>+
>+	/* Compare full peek and normal read. */
>+	if (memcmp(buf_peek, buf_normal, sizeof(buf_peek) - 1)) {
>+		fprintf(stderr, "Full peek data mismatch\n");
>+		exit(EXIT_FAILURE);
>+	}
>+
>+	close(fd);
>+}
>+
> #define SOCK_BUF_SIZE (2 * 1024 * 1024)
> #define SOCK_BUF_SIZE_SMALL (64 * 1024)
> #define MAX_MSG_PAGES 4
>@@ -2509,6 +2541,11 @@ static struct test_case test_cases[] = {
> 		.run_client = test_stream_tx_credit_bounds_client,
> 		.run_server = test_stream_tx_credit_bounds_server,
> 	},
>+	{
>+		.name = "SOCK_STREAM MSG_PEEK after partial recv",
>+		.run_client = test_stream_msg_peek_client,
>+		.run_server = test_stream_peek_after_recv_server,

I left just minor comments, the test LGTM:

Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
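
For readers who want to try the sequence outside the vsock harness, the
flow of the new test (partial recv, peek with an oversized buffer, consume
the rest, compare) can be sketched as below. This is only an illustration
under stated assumptions: the helper peek_after_partial_recv() is
hypothetical, the 64-byte buffers are arbitrary, and it is meant to be run
on any connected stream socket where all data is already queued (for
example an AF_UNIX socketpair), not on the virtio transport the fix
targets.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: 'total' bytes must already be queued on 'fd'.
 * Returns 0 if the peeked bytes match a later normal read, 1 on a data
 * mismatch, and -1 if any recv() returns an unexpected length.
 */
static int peek_after_partial_recv(int fd, size_t total)
{
	unsigned char buf_peek[64];
	unsigned char buf_normal[64];
	unsigned char first;

	if (total < 2 || total > sizeof(buf_peek))
		return -1;

	/* Partial recv: consume one byte to advance the read offset. */
	if (recv(fd, &first, 1, 0) != 1)
		return -1;

	/* Peek with a buffer larger than the remaining data. */
	if (recv(fd, buf_peek, total, MSG_PEEK) != (ssize_t)(total - 1))
		return -1;

	/* Consume the remaining data. */
	if (recv(fd, buf_normal, total - 1, 0) != (ssize_t)(total - 1))
		return -1;

	/* The peek must have seen exactly what the normal read returns. */
	return memcmp(buf_peek, buf_normal, total - 1) ? 1 : 0;
}
```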


^ permalink raw reply

* Re: [PATCH net-next v2 05/14] libie: add bookkeeping support for control queue messages
From: Larysa Zaremba @ 2026-04-15 11:40 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: Tony Nguyen, davem, kuba, edumazet, andrew+netdev, netdev,
	Phani R Burra, przemyslaw.kitszel, aleksander.lobakin,
	sridhar.samudrala, anjali.singhai, michal.swiatkowski,
	maciej.fijalkowski, emil.s.tantilov, madhu.chittim, joshua.a.hay,
	jacob.e.keller, jayaprakash.shanmugam, jiri, horms, corbet,
	richardcochran, linux-doc, Bharath R, Samuel Salin,
	Aleksandr Loktionov
In-Reply-To: <b559c877-7712-4ed7-adb4-d2b667e16e74@redhat.com>

On Thu, Apr 09, 2026 at 11:07:02AM +0200, Paolo Abeni wrote:
> On 4/3/26 9:49 PM, Tony Nguyen wrote:
> > +static bool
> > +libie_ctlq_xn_process_recv(struct libie_ctlq_xn_recv_params *params,
> > +			   struct libie_ctlq_msg *ctlq_msg)
> > +{
> > +	struct libie_ctlq_xn_manager *xnm = params->xnm;
> > +	struct libie_ctlq_xn *xn;
> > +	u16 msg_cookie, xn_index;
> > +	struct kvec *response;
> > +	int status;
> > +	u16 data;
> > +
> > +	data = ctlq_msg->sw_cookie;
> > +	xn_index = FIELD_GET(LIBIE_CTLQ_XN_INDEX_M, data);
> > +	msg_cookie = FIELD_GET(LIBIE_CTLQ_XN_COOKIE_M, data);
> > +	status = ctlq_msg->chnl_retval ? -EFAULT : 0;
> > +
> > +	xn = &xnm->ring[xn_index];
> > +	if (ctlq_msg->chnl_opcode != xn->virtchnl_opcode ||
> > +	    msg_cookie != xn->cookie)
> > +		return false;
> > +
> > +	spin_lock(&xn->xn_lock);
> 
> Sashiko says:
> 
> ---
> Because the cookie and opcode are checked before acquiring the lock, is
> it possible for the transaction to time out, be returned to the free
> list, and get reallocated for a new message before the lock is acquired?
> If that happens, could the old delayed response falsely complete the
> newly allocated transaction since the identifiers are not re-verified
> inside the lock?
> ---
> 

Yes, there is a race condition risk that is easy to fix.
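
For illustration only, one common way to close this kind of window is to
compare the identifiers while holding the lock. The sketch below is
userspace C with a pthread mutex standing in for the spinlock; struct
sketch_xn, its fields, and sketch_complete_xn() are hypothetical and are
not the libie structures or the fix that will be posted.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified transaction slot; not the libie layout. */
struct sketch_xn {
	pthread_mutex_t lock;
	uint16_t cookie;
	uint32_t opcode;
	bool in_use;
	int status;
};

/* Complete the slot only if the response still matches it. The cookie
 * and opcode are compared under the lock, so a slot that timed out and
 * was reallocated in the meantime cannot be completed by a stale reply.
 */
static bool sketch_complete_xn(struct sketch_xn *xn, uint16_t msg_cookie,
			       uint32_t msg_opcode, int status)
{
	bool matched = false;

	pthread_mutex_lock(&xn->lock);
	if (xn->in_use && xn->cookie == msg_cookie &&
	    xn->opcode == msg_opcode) {
		xn->status = status;
		xn->in_use = false;
		matched = true;
	}
	pthread_mutex_unlock(&xn->lock);

	return matched;
}
```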

> > +/**
> > + * libie_xn_check_async_timeout - Check for asynchronous message timeouts
> > + * @xnm: Xn transaction manager
> > + *
> > + * Call the corresponding callback to notify the caller about the timeout.
> > + */
> > +static void libie_xn_check_async_timeout(struct libie_ctlq_xn_manager *xnm)
> > +{
> > +	u32 idx;
> > +
> > +	for_each_clear_bit(idx, xnm->free_xns_bm, LIBIE_CTLQ_MAX_XN_ENTRIES) {
> 
> Sashiko says:
> 
> ---
> This iterates over the bitmap without holding the lock. Concurrently,
> other paths modify this bitmap using non-atomic bitwise operations like
> __clear_bit() and __set_bit() under the lock. Will this cause torn reads
> or data races that might lead the timeout handler to skip valid
> transactions or examine invalid ones?
> ---
>

This should only create false negatives, which is not a problem: the timeout
is much longer than the period at which libie_xn_check_async_timeout() is
called.

> 
> > +		params->ctlq_msg->sw_cookie = cookie;
> > +		params->ctlq_msg->send_mem = *dma_mem;
> > +		params->ctlq_msg->data_len = buf_len;
> > +		params->ctlq_msg->chnl_opcode = params->chnl_opcode;
> > +		ret = libie_ctlq_send(params->ctlq, params->ctlq_msg, 1);
> > +	}
> > +
> > +	if (ret && !libie_cp_can_send_onstack(buf_len))
> > +		libie_cp_unmap_dma_mem(dev, dma_mem);
> 
> Sashiko says:
> 
> ---
> When libie_ctlq_send() fails here, the DMA memory is unmapped and the
> buffer is freed by the caller. However, the software tracking ring at
> tx_msg[next_to_use] still contains the populated send_mem details and a
> non-zero data_len.
> 
> During driver teardown, libie_ctlq_xn_send_clean() is invoked with
> params->force = true, which processes the ring without checking the
> hardware completion bit. Could this cause the cleanup routine to process
> the failed slot again, resulting in a double-free and double-unmap?
> ---

Yes, I think that in trying to avoid unnecessary copying, I shot myself in the 
foot, will fix.

> 
> There are more remarks on the following patch, please have a look.
>

A few of the AI's comments will also result in fixes to stable.

> Also, it would be very helpful if you could help triaging such
> (overwhelming amount of) feedback on future submissions, explicitly
> commenting on the ML. Sashiko tends to be quite noisy on device driver code.
> 
> Thanks,
> 
> Paolo
> 

^ permalink raw reply

* Re: [PATCH] macvlan: fix macvlan_get_size() not reserving space for IFLA_MACVLAN_BC_CUTOFF
From: Eric Dumazet @ 2026-04-15 11:37 UTC (permalink / raw)
  To: Dudu Lu; +Cc: netdev, andrew+netdev, davem, kuba, pabeni
In-Reply-To: <20260413085349.73977-1-phx0fer@gmail.com>

On Mon, Apr 13, 2026 at 1:53 AM Dudu Lu <phx0fer@gmail.com> wrote:
>
> macvlan_get_size() does not account for IFLA_MACVLAN_BC_CUTOFF, but
> macvlan_fill_info() conditionally includes it when port->bc_cutoff != 1.
> This causes nla_put_s32() to fail with -EMSGSIZE when the netlink skb
> runs out of space, triggering a WARN_ON in rtnetlink and preventing the
> interface from being dumped.
>
> The bug can be reproduced with:
>
>   ip link add macvlan0 link eth0 type macvlan mode bridge
>   ip link set macvlan0 type macvlan bc_cutoff 0

Was this generated by an LLM?

AFAIK, the iproute2 command would look like this:

 ip link set macvlan0 type macvlan bclim 0

>   ip -d link show macvlan0   # fails with -EMSGSIZE
>
> The bc_cutoff feature was added in commit 954d1fa1ac93 ("macvlan: Add
> netlink attribute for broadcast cutoff"), which added the nla_put_s32()
> call in macvlan_fill_info() but missed adding the corresponding
> nla_total_size(4) in macvlan_get_size(). A follow-up commit
> 55cef78c244d ("macvlan: add forgotten nla_policy for
> IFLA_MACVLAN_BC_CUTOFF") fixed the missing nla_policy entry but still
> did not fix the size calculation.
>
> Fixes: 954d1fa1ac93 ("macvlan: Add netlink attribute for broadcast cutoff")
> Signed-off-by: Dudu Lu <phx0fer@gmail.com>
> ---
>  drivers/net/macvlan.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
> index a71f058eceef..80f87599a503 100644
> --- a/drivers/net/macvlan.c
> +++ b/drivers/net/macvlan.c
> @@ -1681,6 +1681,7 @@ static size_t macvlan_get_size(const struct net_device *dev)
>                 + macvlan_get_size_mac(vlan) /* IFLA_MACVLAN_MACADDR */
>                 + nla_total_size(4) /* IFLA_MACVLAN_BC_QUEUE_LEN */
>                 + nla_total_size(4) /* IFLA_MACVLAN_BC_QUEUE_LEN_USED */
> +               + nla_total_size(4) /* IFLA_MACVLAN_BC_CUTOFF */
>                 );
>  }
>

Note that skbs have more tailroom than requested because of kmalloc()
power-of-two rounding, so the bug does not show up in practice. I mention
this in case someone tries the repro and sees nothing wrong.
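
As a quick illustration of the size accounting involved, the sketch below
reimplements the attribute sizing arithmetic in userspace: a 4-byte
attribute header plus the payload, rounded up to a 4-byte boundary. The
sketch_* names are local stand-ins written for this example, not the
kernel helpers themselves, and the reserved/emitted counts in the usage
note are made-up numbers.

```c
#include <stddef.h>

#define SKETCH_NLA_ALIGNTO	4
#define SKETCH_NLA_HDRLEN	4	/* u16 length + u16 type */

/* Round 'len' up to the netlink attribute alignment. */
static size_t sketch_nla_align(size_t len)
{
	return (len + SKETCH_NLA_ALIGNTO - 1) &
	       ~(size_t)(SKETCH_NLA_ALIGNTO - 1);
}

/* Total space one attribute with 'payload' bytes takes in the message. */
static size_t sketch_nla_total_size(size_t payload)
{
	return sketch_nla_align(SKETCH_NLA_HDRLEN + payload);
}

/* Bytes missing from the size estimate when 'emitted' 32-bit attributes
 * are written but only 'reserved' of them were accounted for.
 */
static size_t sketch_shortfall(size_t reserved, size_t emitted)
{
	if (emitted <= reserved)
		return 0;
	return (emitted - reserved) * sketch_nla_total_size(4);
}
```

With these definitions a single forgotten s32 attribute leaves the
estimate 8 bytes short, e.g. sketch_shortfall(2, 3) evaluates to 8, which
is small enough to be absorbed by allocator rounding in most cases.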

Reviewed-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply

* Re: [PATCH net v3 2/3] vsock/test: fix MSG_PEEK handling in recv_buf()
From: Stefano Garzarella @ 2026-04-15 11:31 UTC (permalink / raw)
  To: Luigi Leonardi
  Cc: Stefan Hajnoczi, Michael S. Tsirkin, Jason Wang, Xuan Zhuo,
	Eugenio Pérez, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Arseniy Krasnov, kvm, virtualization,
	netdev, linux-kernel
In-Reply-To: <20260414-fix_peek-v3-2-e7daead49f83@redhat.com>

On Tue, Apr 14, 2026 at 06:10:22PM +0200, Luigi Leonardi wrote:
>`recv_buf` does not handle the MSG_PEEK flag correctly: it keeps calling
>`recv` until all requested bytes are available or an error occurs.
>
>The problem is how it calculates the amount of bytes read: MSG_PEEK
>doesn't consume any bytes and re-reads the same bytes from the buffer
>head, so summing the return value every time is wrong.
>
>Moreover, MSG_PEEK doesn't consume the bytes in the buffer, so if the
>requested amount is more than the bytes available, the loop will never
>terminate, because `recv` will never return EOF. For this reason we need
>to compare the amount of read bytes with the number of bytes expected.
>
>Add a check, and if the MSG_PEEK flag is present, update the counter of
>read bytes differently, and break if we read the expected amount.

nit: "..., update the counter for bytes read only after all expected
bytes have been read and break out of the loop; otherwise, try again
after a short delay to avoid consuming too many CPU cycles."

>
>This allows us to simplify the `test_stream_credit_update_test`, by
>reusing `recv_buf`, like some other tests already do.
>
>This also fixes callers that pass MSG_PEEK to recv_buf().

nit: this is implicit from the first part of the description.

>
>Suggested-by: Stefano Garzarella <sgarzare@redhat.com>
>Signed-off-by: Luigi Leonardi <leonardi@redhat.com>
>---
> tools/testing/vsock/util.c       | 15 +++++++++++++++
> tools/testing/vsock/vsock_test.c | 13 +------------
> 2 files changed, 16 insertions(+), 12 deletions(-)
>
>diff --git a/tools/testing/vsock/util.c b/tools/testing/vsock/util.c
>index 1fe1338c79cd..2c9ee3210090 100644
>--- a/tools/testing/vsock/util.c
>+++ b/tools/testing/vsock/util.c
>@@ -381,7 +381,13 @@ void send_buf(int fd, const void *buf, size_t len, int flags,
> 	}
> }
>
>+#define RECV_PEEK_RETRY_USEC 10

10 usec seems a bit low IMO: it could be on the same order as the syscalls
involved in the loop. I'd go to some milliseconds, like we do for
SEND_SLEEP_USEC.

>+
> /* Receive bytes in a buffer and check the return value.
>+ *
>+ * MSG_PEEK note: MSG_PEEK doesn't consume bytes from the buffer, so partial
>+ * reads cannot be summed. Instead, the function retries until recv() returns
>+ * exactly expected_ret bytes in a single call.

I'd replace with something like this:

    * When MSG_PEEK is set, recv() is retried until it returns exactly
    * expected_ret bytes. The function returns on error, EOF, or timeout
    * as usual.

Thanks,
Stefano

>  *
>  * expected_ret:
>  *  <0 Negative errno (for testing errors)
>@@ -403,6 +409,15 @@ void recv_buf(int fd, void *buf, size_t len, int flags, ssize_t expected_ret)
> 		if (ret <= 0)
> 			break;
>
>+		if (flags & MSG_PEEK) {
>+			if (ret == expected_ret) {
>+				nread = ret;
>+				break;
>+			}
>+			timeout_usleep(RECV_PEEK_RETRY_USEC);
>+			continue;
>+		}
>+
> 		nread += ret;
> 	} while (nread < len);
> 	timeout_end();
>diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c
>index 5bd20ccd9335..bdb0754965df 100644
>--- a/tools/testing/vsock/vsock_test.c
>+++ b/tools/testing/vsock/vsock_test.c
>@@ -1500,18 +1500,7 @@ static void test_stream_credit_update_test(const struct test_opts *opts,
> 	}
>
> 	/* Wait until there will be 128KB of data in rx queue. */
>-	while (1) {
>-		ssize_t res;
>-
>-		res = recv(fd, buf, buf_size, MSG_PEEK);
>-		if (res == buf_size)
>-			break;
>-
>-		if (res <= 0) {
>-			fprintf(stderr, "unexpected 'recv()' return: %zi\n", res);
>-			exit(EXIT_FAILURE);
>-		}
>-	}
>+	recv_buf(fd, buf, buf_size, MSG_PEEK, buf_size);
>
> 	/* There is 128KB of data in the socket's rx queue, dequeue first
> 	 * 64KB, credit update is sent if 'low_rx_bytes_test' == true.
>
>-- 
>2.53.0
>


^ permalink raw reply

* [PATCH iwl-next] ice: add SBQ posted writes with non-posted support for CGU
From: Przemyslaw Korba @ 2026-04-15 11:27 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: netdev, anthony.l.nguyen, przemyslaw.kitszel, Przemyslaw Korba,
	Aleksandr Loktionov, Arkadiusz Kubalewski

From: Karol Kolacinski <karol.kolacinski@intel.com>

Sideband queue (SBQ) is a HW queue with very short completion time. All
SBQ writes were posted by default, which means that the driver did not
have to wait for completion from the neighbor device, because there was
none. This introduced unnecessary delays, because only those delays
"ensured" that the command was "completed", which was a potential race
condition.

Add the possibility to perform non-posted writes where it's necessary to
wait for completion, instead of relying on fake completion from the FW,
where only the delays are guarding the writes.

Flush the SBQ by reading address 0 from the PHY 0 before issuing SYNC
command to ensure that writes to all PHYs were completed and skip SBQ
message completion if it's posted.
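
As a reading aid, the opcode handling this patch introduces in
ice_sbq_rw_reg() can be modelled in plain userspace C as below. The
opcode values mirror the enum in the diff; everything else (the sketch_*
names, the plan structure, and the message and data lengths used in any
example call) is hypothetical and only illustrates the decision logic.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Values mirror enum ice_sbq_msg_opcode from the diff. */
enum sketch_sbq_opcode {
	SKETCH_SBQ_RD		= 0x00,
	SKETCH_SBQ_WR_P		= 0x01,	/* posted write */
	SKETCH_SBQ_WR_NP	= 0x02,	/* non-posted write */
};

/* Hypothetical summary of how one message would be sent. */
struct sketch_sbq_plan {
	size_t msg_len;		/* bytes placed in the request */
	bool carries_data;	/* request includes the data word */
	bool posted;		/* no completion is waited for */
};

static int sketch_sbq_prepare(int opcode, size_t full_len, size_t data_len,
			      struct sketch_sbq_plan *plan)
{
	switch (opcode) {
	case SKETCH_SBQ_WR_P:
	case SKETCH_SBQ_WR_NP:
		plan->msg_len = full_len;
		plan->carries_data = true;
		break;
	case SKETCH_SBQ_RD:
		/* Read data comes back in the completion, so the request
		 * is shortened by the size of the data field.
		 */
		plan->msg_len = full_len - data_len;
		plan->carries_data = false;
		break;
	default:
		return -EINVAL;
	}

	/* Only posted writes skip waiting for the completion. */
	plan->posted = (opcode == SKETCH_SBQ_WR_P);
	return 0;
}
```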

To check that the delays are gone, compare the time spent in
ice_sq_send_cmd: posted writes should return immediately after the wr32.
This can be done, for example, by adjusting the PHC time by less than
2 seconds with phc_ctl on an E830 device, so that this new mechanism is
used. Without it, the command below will fail.

Reproduction steps:
phc_ctl eth13 adj 1
phc_ctl[4478170.994]: adjusted clock by 1.000000 seconds

Check the trace to compare timings:
echo ice_sbq_send_cmd > /sys/kernel/debug/tracing/set_ftrace_filter
echo function_graph > /sys/kernel/debug/tracing/current_tracer
cat /sys/kernel/debug/tracing/trace

Tested on:
  - Intel E830 NIC (FW version 1.00)
  - Kernel 6.19.0+

Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com>
Signed-off-by: Przemyslaw Korba <przemyslaw.korba@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_common.c  | 18 ++++--
 drivers/net/ethernet/intel/ice/ice_ptp_hw.c  | 64 ++++++++++++--------
 drivers/net/ethernet/intel/ice/ice_sbq_cmd.h |  5 +-
 3 files changed, 53 insertions(+), 34 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c
index f84990996530..2cd3d6d450a9 100644
--- a/drivers/net/ethernet/intel/ice/ice_common.c
+++ b/drivers/net/ethernet/intel/ice/ice_common.c
@@ -1777,23 +1777,29 @@ int ice_sbq_rw_reg(struct ice_hw *hw, struct ice_sbq_msg_input *in, u16 flags)
 	msg.msg_addr_low = cpu_to_le16(in->msg_addr_low);
 	msg.msg_addr_high = cpu_to_le32(in->msg_addr_high);
 
-	if (in->opcode)
+	switch (in->opcode) {
+	case ice_sbq_msg_wr_p:
+	case ice_sbq_msg_wr_np:
 		msg.data = cpu_to_le32(in->data);
-	else
+		break;
+	case ice_sbq_msg_rd:
 		/* data read comes back in completion, so shorten the struct by
 		 * sizeof(msg.data)
 		 */
 		msg_len -= sizeof(msg.data);
+		break;
+	default:
+		return -EINVAL;
+	}
 
-	if (in->opcode == ice_sbq_msg_wr)
-		cd.posted = 1;
+	cd.posted = in->opcode == ice_sbq_msg_wr_p;
 
 	desc.flags = cpu_to_le16(flags);
 	desc.opcode = cpu_to_le16(ice_sbq_opc_neigh_dev_req);
 	desc.param0.cmd_len = cpu_to_le16(msg_len);
 	status = ice_sbq_send_cmd(hw, &desc, &msg, msg_len, &cd);
 
-	if (!status && !in->opcode)
+	if (!status && in->opcode == ice_sbq_msg_rd)
 		in->data = le32_to_cpu
 			(((struct ice_sbq_msg_cmpl *)&msg)->data);
 	return status;
@@ -6701,7 +6707,7 @@ int ice_write_cgu_reg(struct ice_hw *hw, u32 addr, u32 val)
 {
 	struct ice_sbq_msg_input cgu_msg = {
 		.dest_dev = ice_get_dest_cgu(hw),
-		.opcode = ice_sbq_msg_wr,
+		.opcode = ice_sbq_msg_wr_np,
 		.msg_addr_low = addr,
 		.data = val
 	};
diff --git a/drivers/net/ethernet/intel/ice/ice_ptp_hw.c b/drivers/net/ethernet/intel/ice/ice_ptp_hw.c
index 690f9d874443..0f202d4dae7c 100644
--- a/drivers/net/ethernet/intel/ice/ice_ptp_hw.c
+++ b/drivers/net/ethernet/intel/ice/ice_ptp_hw.c
@@ -368,6 +368,16 @@ void ice_ptp_src_cmd(struct ice_hw *hw, enum ice_ptp_tmr_cmd cmd)
 static void ice_ptp_exec_tmr_cmd(struct ice_hw *hw)
 {
 	struct ice_pf *pf = container_of(hw, struct ice_pf, hw);
+	struct ice_sbq_msg_input msg = {
+		.dest_dev = ice_sbq_dev_phy_0,
+		.opcode = ice_sbq_msg_rd,
+	};
+	int err;
+
+	/* Flush SBQ by reading address 0 on PHY 0 */
+	err = ice_sbq_rw_reg(hw, &msg, LIBIE_AQ_FLAG_RD);
+	if (err)
+		dev_warn(ice_hw_to_dev(hw), "Failed to flush SBQ: %d\n", err);
 
 	if (!ice_is_primary(hw))
 		hw = ice_get_primary_hw(pf);
@@ -433,7 +443,7 @@ static int ice_write_phy_eth56g(struct ice_hw *hw, u8 port, u32 addr, u32 val)
 {
 	struct ice_sbq_msg_input msg = {
 		.dest_dev = ice_ptp_get_dest_dev_e825(hw, port),
-		.opcode = ice_sbq_msg_wr,
+		.opcode = ice_sbq_msg_wr_p,
 		.msg_addr_low = lower_16_bits(addr),
 		.msg_addr_high = upper_16_bits(addr),
 		.data = val
@@ -2358,11 +2368,12 @@ static bool ice_is_40b_phy_reg_e82x(u16 low_addr, u16 *high_addr)
 static int
 ice_read_phy_reg_e82x(struct ice_hw *hw, u8 port, u16 offset, u32 *val)
 {
-	struct ice_sbq_msg_input msg = {0};
+	struct ice_sbq_msg_input msg = {
+		.opcode = ice_sbq_msg_rd,
+	};
 	int err;
 
 	ice_fill_phy_msg_e82x(hw, &msg, port, offset);
-	msg.opcode = ice_sbq_msg_rd;
 
 	err = ice_sbq_rw_reg(hw, &msg, LIBIE_AQ_FLAG_RD);
 	if (err) {
@@ -2435,12 +2446,13 @@ ice_read_64b_phy_reg_e82x(struct ice_hw *hw, u8 port, u16 low_addr, u64 *val)
 static int
 ice_write_phy_reg_e82x(struct ice_hw *hw, u8 port, u16 offset, u32 val)
 {
-	struct ice_sbq_msg_input msg = {0};
+	struct ice_sbq_msg_input msg = {
+		.opcode = ice_sbq_msg_wr_p,
+		.data = val
+	};
 	int err;
 
 	ice_fill_phy_msg_e82x(hw, &msg, port, offset);
-	msg.opcode = ice_sbq_msg_wr;
-	msg.data = val;
 
 	err = ice_sbq_rw_reg(hw, &msg, LIBIE_AQ_FLAG_RD);
 	if (err) {
@@ -2594,15 +2606,15 @@ static int ice_fill_quad_msg_e82x(struct ice_hw *hw,
 int
 ice_read_quad_reg_e82x(struct ice_hw *hw, u8 quad, u16 offset, u32 *val)
 {
-	struct ice_sbq_msg_input msg = {0};
+	struct ice_sbq_msg_input msg = {
+		.opcode = ice_sbq_msg_rd,
+	};
 	int err;
 
 	err = ice_fill_quad_msg_e82x(hw, &msg, quad, offset);
 	if (err)
 		return err;
 
-	msg.opcode = ice_sbq_msg_rd;
-
 	err = ice_sbq_rw_reg(hw, &msg, LIBIE_AQ_FLAG_RD);
 	if (err) {
 		ice_debug(hw, ICE_DBG_PTP, "Failed to send message to PHY, err %d\n",
@@ -2628,16 +2640,16 @@ ice_read_quad_reg_e82x(struct ice_hw *hw, u8 quad, u16 offset, u32 *val)
 int
 ice_write_quad_reg_e82x(struct ice_hw *hw, u8 quad, u16 offset, u32 val)
 {
-	struct ice_sbq_msg_input msg = {0};
+	struct ice_sbq_msg_input msg = {
+		.opcode = ice_sbq_msg_wr_p,
+		.data = val
+	};
 	int err;
 
 	err = ice_fill_quad_msg_e82x(hw, &msg, quad, offset);
 	if (err)
 		return err;
 
-	msg.opcode = ice_sbq_msg_wr;
-	msg.data = val;
-
 	err = ice_sbq_rw_reg(hw, &msg, LIBIE_AQ_FLAG_RD);
 	if (err) {
 		ice_debug(hw, ICE_DBG_PTP, "Failed to send message to PHY, err %d\n",
@@ -4275,14 +4287,14 @@ static void ice_ptp_init_phy_e82x(struct ice_ptp_hw *ptp)
  */
 static int ice_read_phy_reg_e810(struct ice_hw *hw, u32 addr, u32 *val)
 {
-	struct ice_sbq_msg_input msg = {0};
+	struct ice_sbq_msg_input msg = {
+		.dest_dev = ice_sbq_dev_phy_0,
+		.opcode = ice_sbq_msg_rd,
+		.msg_addr_low = lower_16_bits(addr),
+		.msg_addr_high = upper_16_bits(addr),
+	};
 	int err;
 
-	msg.msg_addr_low = lower_16_bits(addr);
-	msg.msg_addr_high = upper_16_bits(addr);
-	msg.opcode = ice_sbq_msg_rd;
-	msg.dest_dev = ice_sbq_dev_phy_0;
-
 	err = ice_sbq_rw_reg(hw, &msg, LIBIE_AQ_FLAG_RD);
 	if (err) {
 		ice_debug(hw, ICE_DBG_PTP, "Failed to send message to PHY, err %d\n",
@@ -4305,15 +4317,15 @@ static int ice_read_phy_reg_e810(struct ice_hw *hw, u32 addr, u32 *val)
  */
 static int ice_write_phy_reg_e810(struct ice_hw *hw, u32 addr, u32 val)
 {
-	struct ice_sbq_msg_input msg = {0};
+	struct ice_sbq_msg_input msg = {
+		.dest_dev = ice_sbq_dev_phy_0,
+		.opcode = ice_sbq_msg_wr_p,
+		.msg_addr_low = lower_16_bits(addr),
+		.msg_addr_high = upper_16_bits(addr),
+		.data = val
+	};
 	int err;
 
-	msg.msg_addr_low = lower_16_bits(addr);
-	msg.msg_addr_high = upper_16_bits(addr);
-	msg.opcode = ice_sbq_msg_wr;
-	msg.dest_dev = ice_sbq_dev_phy_0;
-	msg.data = val;
-
 	err = ice_sbq_rw_reg(hw, &msg, LIBIE_AQ_FLAG_RD);
 	if (err) {
 		ice_debug(hw, ICE_DBG_PTP, "Failed to send message to PHY, err %d\n",
diff --git a/drivers/net/ethernet/intel/ice/ice_sbq_cmd.h b/drivers/net/ethernet/intel/ice/ice_sbq_cmd.h
index 21bb861febbf..86a143ebf089 100644
--- a/drivers/net/ethernet/intel/ice/ice_sbq_cmd.h
+++ b/drivers/net/ethernet/intel/ice/ice_sbq_cmd.h
@@ -54,8 +54,9 @@ enum ice_sbq_dev_id {
 };
 
 enum ice_sbq_msg_opcode {
-	ice_sbq_msg_rd	= 0x00,
-	ice_sbq_msg_wr	= 0x01
+	ice_sbq_msg_rd		= 0x00,
+	ice_sbq_msg_wr_p	= 0x01,
+	ice_sbq_msg_wr_np	= 0x02,
 };
 
 #define ICE_SBQ_MSG_FLAGS	0x40

base-commit: 0851f49814a8899a9769619b50baaeef59f9ece4
-- 
2.43.0


^ permalink raw reply related

* Re: [PATCH] macvlan: fix macvlan_get_size() not reserving space for IFLA_MACVLAN_BC_CUTOFF
From: Vadim Fedorenko @ 2026-04-15 11:11 UTC (permalink / raw)
  To: Dudu Lu, netdev; +Cc: andrew+netdev, davem, edumazet, kuba, pabeni
In-Reply-To: <20260413085349.73977-1-phx0fer@gmail.com>

On 13/04/2026 09:53, Dudu Lu wrote:
> macvlan_get_size() does not account for IFLA_MACVLAN_BC_CUTOFF, but
> macvlan_fill_info() conditionally includes it when port->bc_cutoff != 1.
> This causes nla_put_s32() to fail with -EMSGSIZE when the netlink skb
> runs out of space, triggering a WARN_ON in rtnetlink and preventing the
> interface from being dumped.
> 
> The bug can be reproduced with:
> 
>    ip link add macvlan0 link eth0 type macvlan mode bridge
>    ip link set macvlan0 type macvlan bc_cutoff 0
>    ip -d link show macvlan0   # fails with -EMSGSIZE
> 
> The bc_cutoff feature was added in commit 954d1fa1ac93 ("macvlan: Add
> netlink attribute for broadcast cutoff"), which added the nla_put_s32()
> call in macvlan_fill_info() but missed adding the corresponding
> nla_total_size(4) in macvlan_get_size(). A follow-up commit
> 55cef78c244d ("macvlan: add forgotten nla_policy for
> IFLA_MACVLAN_BC_CUTOFF") fixed the missing nla_policy entry but still
> did not fix the size calculation.
> 
> Fixes: 954d1fa1ac93 ("macvlan: Add netlink attribute for broadcast cutoff")
> Signed-off-by: Dudu Lu <phx0fer@gmail.com>
> ---
>   drivers/net/macvlan.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
> index a71f058eceef..80f87599a503 100644
> --- a/drivers/net/macvlan.c
> +++ b/drivers/net/macvlan.c
> @@ -1681,6 +1681,7 @@ static size_t macvlan_get_size(const struct net_device *dev)
>   		+ macvlan_get_size_mac(vlan) /* IFLA_MACVLAN_MACADDR */
>   		+ nla_total_size(4) /* IFLA_MACVLAN_BC_QUEUE_LEN */
>   		+ nla_total_size(4) /* IFLA_MACVLAN_BC_QUEUE_LEN_USED */
> +		+ nla_total_size(4) /* IFLA_MACVLAN_BC_CUTOFF */
>   		);
>   }

Please indicate the target tree in the subject of future submissions. As
this patch fixes the issue, it will go to the net tree.

Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>

^ permalink raw reply

* Re: [PATCH v2] net: wwan: t7xx: validate port_count against message length in t7xx_port_enum_msg_handler
From: kernel test robot @ 2026-04-15 11:09 UTC (permalink / raw)
  To: Pavitra Jha, pabeni
  Cc: llvm, oe-kbuild-all, w, chandrashekar.devegowda, linux-wwan,
	netdev, stable, Pavitra Jha
In-Reply-To: <20260414153201.1633720-1-jhapavitra98@gmail.com>

Hi Pavitra,

kernel test robot noticed the following build warnings:

[auto build test WARNING on net/main]
[also build test WARNING on net-next/main linus/master v7.0 next-20260414]
[cannot apply to horms-ipvs/master]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Pavitra-Jha/net-wwan-t7xx-validate-port_count-against-message-length-in-t7xx_port_enum_msg_handler/20260415-014321
base:   net/main
patch link:    https://lore.kernel.org/r/20260414153201.1633720-1-jhapavitra98%40gmail.com
patch subject: [PATCH v2] net: wwan: t7xx: validate port_count against message length in t7xx_port_enum_msg_handler
config: loongarch-randconfig-002-20260415 (https://download.01.org/0day-ci/archive/20260415/202604151900.1tnLdQi7-lkp@intel.com/config)
compiler: clang version 23.0.0git (https://github.com/llvm/llvm-project 5bac06718f502014fade905512f1d26d578a18f3)
rustc: rustc 1.88.0 (6b00bc388 2025-06-23)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260415/202604151900.1tnLdQi7-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202604151900.1tnLdQi7-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> Warning: drivers/net/wwan/t7xx/t7xx_port_ctrl_msg.c:127 function parameter 'msg_len' not described in 't7xx_port_enum_msg_handler'

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* [net-next v1 2/3] net: motorcomm: phy: set drive strength in 8531s RGMII case
From: Minda Chen @ 2026-04-15  9:26 UTC (permalink / raw)
  To: Frank, Andrew Lunn, Heiner Kallweit, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev
  Cc: linux-kernel, Minda Chen
In-Reply-To: <20260415092654.64907-1-minda.chen@starfivetech.com>

Set the RXD and RX CLK pin drive strength when the 8531s is used in
RGMII mode.

Signed-off-by: Minda Chen <minda.chen@starfivetech.com>
---
 drivers/net/phy/motorcomm.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/net/phy/motorcomm.c b/drivers/net/phy/motorcomm.c
index 35aff1519b4b..f3129419f7c9 100644
--- a/drivers/net/phy/motorcomm.c
+++ b/drivers/net/phy/motorcomm.c
@@ -1714,6 +1714,11 @@ static int yt8521_config_init(struct phy_device *phydev)
 		if (ret < 0)
 			goto err_restore_page;
 	}
+
+	if (phydev->drv->phy_id == PHY_ID_YT8531S &&
+	    phydev->interface != PHY_INTERFACE_MODE_SGMII)
+		ret = yt8531_set_ds(phydev, true);
+
 err_restore_page:
 	return phy_restore_page(phydev, old_page, ret);
 }
-- 
2.17.1



* Re: [PATCH v3 net] vsock: fix buffer size clamping order
From: Stefano Garzarella @ 2026-04-15 10:42 UTC (permalink / raw)
  To: Michal Luczaj
  Cc: Norbert Szetei, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, virtualization, netdev, linux-kernel
In-Reply-To: <e965ff22-37b2-406d-b885-e2736e84c0f3@rbox.co>

On Tue, Apr 14, 2026 at 04:22:04PM +0200, Michal Luczaj wrote:
>On 4/9/26 18:34, Norbert Szetei wrote:
>> In vsock_update_buffer_size(), the buffer size was being clamped to the
>> maximum first, and then to the minimum. If a user sets a minimum buffer
>> size larger than the maximum, the minimum check overrides the maximum
>> check, inverting the constraint.
>>
>> This breaks the intended socket memory boundaries by allowing the
>> vsk->buffer_size to grow beyond the configured vsk->buffer_max_size.
>>
>> Fix this by checking the minimum first, and then the maximum. This
>> ensures the buffer size never exceeds the buffer_max_size.
>
>Something may be missing. After adding another ioctl to your reproducer, I
>still see crashes.
>
>     SYSCHK(setsockopt(fd, AF_VSOCK, SO_VM_SOCKETS_BUFFER_MIN_SIZE, &min,
>                       sizeof(min)));
>+    SYSCHK(setsockopt(fd, AF_VSOCK, SO_VM_SOCKETS_BUFFER_MAX_SIZE, &min,
>+                      sizeof(min)));
> }
>
>[*] Setting buffer_min_size to 0x400000000.
>[socket][0] sending...
>
>refcount_t: saturated; leaking memory.
>WARNING: lib/refcount.c:22 at refcount_warn_saturate+0x7d/0xb0, CPU#2:
>a.out/1478
>...
>refcount_t: underflow; use-after-free.
>WARNING: lib/refcount.c:28 at refcount_warn_saturate+0x50/0xb0, CPU#12:
>kworker/12:0/80
>Workqueue: vsock-loopback vsock_loopback_work
>...
>

Yeah, I pointed out the same thing during the bug discussion
(https://lore.kernel.org/netdev/acuKUpZQq6z1DY_n@sgarzare-redhat/) and
suggested adding a sysctl or reusing net.core.wmem_max/rmem_max
(https://lore.kernel.org/netdev/adYKERRYwzMIhZAl@sgarzare-redhat/).

Thanks,
Stefano
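
For readers following the thread, the ordering problem can be sketched in
plain userspace C. This is only an illustration under assumptions:
clamp_max_then_min() and clamp_min_then_max() are hypothetical names, not
kernel functions, and the real vsock_update_buffer_size() works on
struct vsock_sock fields rather than bare integers.

```c
#include <stdint.h>

/* Hypothetical helper modelling the old order: clamp to max, then to min.
 * If min > max, the second check wins and the result exceeds max.
 */
static uint64_t clamp_max_then_min(uint64_t val, uint64_t min, uint64_t max)
{
	if (val > max)
		val = max;
	if (val < min)
		val = min;
	return val;
}

/* Hypothetical helper modelling the proposed order: min first, then max,
 * so the result never exceeds max.
 */
static uint64_t clamp_min_then_max(uint64_t val, uint64_t min, uint64_t max)
{
	if (val < min)
		val = min;
	if (val > max)
		val = max;
	return val;
}
```

With val = 100, min = 500 and max = 200, the first helper returns 500 and
the second returns 200. Neither order helps when the caller can also raise
the maximum itself, which is the case the reproducer above exercises.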



* Re: [PATCH net] hv_sock: Report EOF instead of -EIO for FIN
From: Stefano Garzarella @ 2026-04-15 10:38 UTC (permalink / raw)
  To: Dexuan Cui
  Cc: kys, haiyangz, wei.liu, longli, davem, edumazet, kuba, pabeni,
	horms, niuxuewei.nxw, linux-hyperv, virtualization, netdev,
	linux-kernel, stable, Ben Hillis, Mitchell Levy
In-Reply-To: <20260414234316.711578-1-decui@microsoft.com>

On Tue, Apr 14, 2026 at 04:43:16PM -0700, Dexuan Cui wrote:
>Commit f0c5827d07cb unluckily causes a regression for the FIN packet,
>and the final read syscall gets an error rather than 0.
>
>Ideally, we would want to fix hvs_channel_readable_payload() so that it
>could return 0 in the FIN scenario, but it's not good for the hv_sock
>driver to use the VMBus ringbuffer's cached priv_read_index, which is
>internal data in the VMBus driver.
>
>Fix the regression in hv_sock by returning 0 rather than -EIO.
>
>Fixes: f0c5827d07cb ("hv_sock: Return the readable bytes in hvs_stream_has_data()")
>Cc: stable@vger.kernel.org
>Reported-by: Ben Hillis <Ben.Hillis@microsoft.com>
>Reported-by: Mitchell Levy <levymitchell0@gmail.com>
>Signed-off-by: Dexuan Cui <decui@microsoft.com>
>---
> net/vmw_vsock/hyperv_transport.c | 18 ++++++++++++++++--
> 1 file changed, 16 insertions(+), 2 deletions(-)
>
>diff --git a/net/vmw_vsock/hyperv_transport.c b/net/vmw_vsock/hyperv_transport.c
>index 069386a74557..63d3549125be 100644
>--- a/net/vmw_vsock/hyperv_transport.c
>+++ b/net/vmw_vsock/hyperv_transport.c
>@@ -703,8 +703,22 @@ static s64 hvs_stream_has_data(struct vsock_sock *vsk)
> 	switch (hvs_channel_readable_payload(hvs->chan)) {
> 	case 1:
> 		need_refill = !hvs->recv_desc;
>-		if (!need_refill)
>-			return -EIO;
>+		if (!need_refill) {

Can we drop `need_refill` entirely and just check `hvs->recv_desc` here?

Mainly because now the comment we are adding is confusing me about what 
`need_refill` means.

The rest LGTM.

Thanks,
Stefano

>+			/* Here hvs->recv_data_len is 0, so hvs->recv_desc must
>+			 * be NULL unless it points to the 0-byte-payload FIN
>+			 * packet: see hvs_update_recv_data().
>+			 *
>+			 * Here all the payload has been dequeued, but
>+			 * hvs_channel_readable_payload() still returns 1,
>+			 * because the VMBus ringbuffer's read_index is not
>+			 * updated for the FIN packet: hvs_stream_dequeue() ->
>+			 * hv_pkt_iter_next() updates the cached priv_read_index
>+			 * but has no opportunity to update the read_index in
>+			 * hv_pkt_iter_close() as hvs_stream_has_data() returns
>+			 * 0 for the FIN packet, so it won't get dequeued.
>+			 */
>+			return 0;
>+		}
>
> 		hvs->recv_desc = hv_pkt_iter_first(hvs->chan);
> 		if (!hvs->recv_desc)
>-- 
>2.49.0
>
>



* Re: [PATCH] rose: Fix rose_find_socket() returning without sock_hold()
From: kernel test robot @ 2026-04-15 10:36 UTC (permalink / raw)
  To: Dudu Lu, netdev; +Cc: oe-kbuild-all, davem, edumazet, kuba, pabeni, Dudu Lu
In-Reply-To: <20260413090420.79932-1-phx0fer@gmail.com>

Hi Dudu,

kernel test robot noticed the following build errors:

[auto build test ERROR on net/main]
[also build test ERROR on net-next/main linus/master horms-ipvs/master v7.0 next-20260414]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Dudu-Lu/rose-Fix-rose_find_socket-returning-without-sock_hold/20260414-194608
base:   net/main
patch link:    https://lore.kernel.org/r/20260413090420.79932-1-phx0fer%40gmail.com
patch subject: [PATCH] rose: Fix rose_find_socket() returning without sock_hold()
config: i386-randconfig-141-20260415 (https://download.01.org/0day-ci/archive/20260415/202604151819.celyrwKo-lkp@intel.com/config)
compiler: gcc-14 (Debian 14.2.0-19) 14.2.0
smatch: v0.5.0-9007-gcf3ea02b
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260415/202604151819.celyrwKo-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202604151819.celyrwKo-lkp@intel.com/

All errors (new ones prefixed by >>):

>> net/rose/af_rose.c:1:9: error: expected identifier or '(' before 'if'
       1 |         if (s)
         |         ^~


vim +1 net/rose/af_rose.c

   > 1		if (s)
     2			sock_hold(s);// SPDX-License-Identifier: GPL-2.0-or-later
     3	/*
     4	 *
     5	 * Copyright (C) Jonathan Naylor G4KLX (g4klx@g4klx.demon.co.uk)
     6	 * Copyright (C) Alan Cox GW4PTS (alan@lxorguk.ukuu.org.uk)
     7	 * Copyright (C) Terry Dawson VK2KTJ (terry@animats.net)
     8	 * Copyright (C) Tomi Manninen OH2BNS (oh2bns@sral.fi)
     9	 */
    10	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* Re: [PATCH v2] vsock/virtio: fix accept queue count leak on transport mismatch
From: Michael S. Tsirkin @ 2026-04-15 10:28 UTC (permalink / raw)
  To: Dudu Lu; +Cc: netdev, stefanha, sgarzare, jasowang
In-Reply-To: <20260413131409.19022-1-phx0fer@gmail.com>

On Mon, Apr 13, 2026 at 09:14:09PM +0800, Dudu Lu wrote:
> virtio_transport_recv_listen() calls sk_acceptq_added() before
> vsock_assign_transport(). If vsock_assign_transport() fails or
> selects a different transport, the error path returns without
> calling sk_acceptq_removed(), permanently incrementing
> sk_ack_backlog.
> 
> After approximately backlog+1 such failures, sk_acceptq_is_full()
> returns true, causing the listener to reject all new connections.
> 
> Fix by moving sk_acceptq_added() to after the transport validation,
> matching the pattern used by vmci_transport and hyperv_transport.
> 
> Fixes: c0cfa2d8a788 ("vsock: add multi-transports support")
> Signed-off-by: Dudu Lu <phx0fer@gmail.com>

Acked-by: Michael S. Tsirkin <mst@redhat.com>

> ---
>  net/vmw_vsock/virtio_transport_common.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
> index 8a9fb23c6e85..e01d983488e5 100644
> --- a/net/vmw_vsock/virtio_transport_common.c
> +++ b/net/vmw_vsock/virtio_transport_common.c
> @@ -1560,8 +1560,6 @@ virtio_transport_recv_listen(struct sock *sk, struct sk_buff *skb,
>  		return -ENOMEM;
>  	}
>  
> -	sk_acceptq_added(sk);
> -
>  	lock_sock_nested(child, SINGLE_DEPTH_NESTING);
>  
>  	child->sk_state = TCP_ESTABLISHED;
> @@ -1583,6 +1581,7 @@ virtio_transport_recv_listen(struct sock *sk, struct sk_buff *skb,
>  		return ret;
>  	}
>  
> +	sk_acceptq_added(sk);
>  	if (virtio_transport_space_update(child, skb))
>  		child->sk_write_space(child);
>  
> -- 
> 2.39.3 (Apple Git-145)



* Re: [PATCH v2] vsock/virtio: fix accept queue count leak on transport mismatch
From: Stefano Garzarella @ 2026-04-15 10:27 UTC (permalink / raw)
  To: Dudu Lu; +Cc: netdev, stefanha, mst, jasowang
In-Reply-To: <20260413131409.19022-1-phx0fer@gmail.com>

On Mon, Apr 13, 2026 at 09:14:09PM +0800, Dudu Lu wrote:
>virtio_transport_recv_listen() calls sk_acceptq_added() before
>vsock_assign_transport(). If vsock_assign_transport() fails or
>selects a different transport, the error path returns without
>calling sk_acceptq_removed(), permanently incrementing
>sk_ack_backlog.
>
>After approximately backlog+1 such failures, sk_acceptq_is_full()
>returns true, causing the listener to reject all new connections.
>
>Fix by moving sk_acceptq_added() to after the transport validation,
>matching the pattern used by vmci_transport and hyperv_transport.
>
>Fixes: c0cfa2d8a788 ("vsock: add multi-transports support")
>Signed-off-by: Dudu Lu <phx0fer@gmail.com>
>---
> net/vmw_vsock/virtio_transport_common.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)

Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
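
The counter leak described in the commit message can be modelled with a
small userspace sketch. Everything here is a toy stand-in: struct
toy_listener and the toy_* functions are hypothetical and only mimic the
ordering of sk_acceptq_added() relative to the transport check, not the
real virtio_transport_recv_listen() logic.

```c
#include <stdbool.h>

/* Toy stand-in for a listening socket; not the kernel's struct sock. */
struct toy_listener {
	unsigned int ack_backlog;     /* models sk_ack_backlog */
	unsigned int max_ack_backlog; /* models the listen() backlog */
};

/* Models sk_acceptq_is_full(): full once the count exceeds the backlog. */
static bool toy_acceptq_is_full(const struct toy_listener *l)
{
	return l->ack_backlog > l->max_ack_backlog;
}

/* Old ordering: count the child first, then validate the transport.
 * A failed validation returns without undoing the increment.
 */
static int toy_recv_listen_old(struct toy_listener *l, bool transport_ok)
{
	l->ack_backlog++;
	if (!transport_ok)
		return -1;
	return 0;
}

/* Patched ordering: validate first, count only on success. */
static int toy_recv_listen_new(struct toy_listener *l, bool transport_ok)
{
	if (!transport_ok)
		return -1;
	l->ack_backlog++;
	return 0;
}
```

With a backlog of 2, three failed calls to toy_recv_listen_old() leave the
listener reporting full, while the same three calls to
toy_recv_listen_new() leave the counter at zero.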



* [PATCH] net: ipv4: igmp: add sysctl option to ignore inbound llm_reports
From: Steffen Trumtrar @ 2026-04-15 10:26 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Jonathan Corbet, Shuah Khan, David Ahern
  Cc: netdev, linux-doc, linux-kernel, Steffen Trumtrar

Add a new sysctl option 'igmp_link_local_mcast_reports_drop' that allows
dropping inbound IGMP reports for link-local multicast groups in the
224.0.0.X range. This prevents the local system from processing such
reports, so the kernel keeps sending its own outbound IGMP reports.

Signed-off-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>
---
 Documentation/networking/ip-sysctl.rst                       | 12 ++++++++++++
 .../networking/net_cachelines/netns_ipv4_sysctl.rst          |  1 +
 include/net/netns/ipv4.h                                     |  1 +
 net/ipv4/af_inet.c                                           |  1 +
 net/ipv4/igmp.c                                              |  2 ++
 net/ipv4/sysctl_net_ipv4.c                                   |  7 +++++++
 6 files changed, 24 insertions(+)

diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
index 6921d8594b849..2da4cd6ac7202 100644
--- a/Documentation/networking/ip-sysctl.rst
+++ b/Documentation/networking/ip-sysctl.rst
@@ -2306,6 +2306,18 @@ igmp_link_local_mcast_reports - BOOLEAN
 
 	Default TRUE
 
+igmp_link_local_mcast_reports_drop - BOOLEAN
+	Drop inbound IGMP reports for link local multicast groups in
+	the 224.0.0.X range. When enabled, IGMP membership reports for
+	link local multicast addresses are silently dropped without
+	processing.
+	Normally, when the kernel hears an inbound IGMP report for a group,
+	it stops sending its own report for that group. With this option
+	enabled, inbound reports are dropped before processing, so the kernel
+	keeps sending its own reports even when other hosts report too.
+
+	Default FALSE
+
 Alexey Kuznetsov.
 kuznet@ms2.inr.ac.ru
 
diff --git a/Documentation/networking/net_cachelines/netns_ipv4_sysctl.rst b/Documentation/networking/net_cachelines/netns_ipv4_sysctl.rst
index beaf1880a19bf..703afe2ba063b 100644
--- a/Documentation/networking/net_cachelines/netns_ipv4_sysctl.rst
+++ b/Documentation/networking/net_cachelines/netns_ipv4_sysctl.rst
@@ -140,6 +140,7 @@ int                             sysctl_udp_rmem_min
 u8                              sysctl_fib_notify_on_flag_change
 u8                              sysctl_udp_l3mdev_accept
 u8                              sysctl_igmp_llm_reports
+u8                              sysctl_igmp_llm_reports_drop
 int                             sysctl_igmp_max_memberships
 int                             sysctl_igmp_max_msf
 int                             sysctl_igmp_qrv
diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
index 8e971c7bf1646..1453f825ffd4d 100644
--- a/include/net/netns/ipv4.h
+++ b/include/net/netns/ipv4.h
@@ -258,6 +258,7 @@ struct netns_ipv4 {
 	u8 sysctl_igmp_llm_reports;
 	int sysctl_igmp_max_memberships;
 	int sysctl_igmp_max_msf;
+	u8 sysctl_igmp_llm_reports_drop;
 	int sysctl_igmp_qrv;
 
 	struct ping_group_range ping_group_range;
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index c7731e300a442..b8f96a5d8afdc 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -1825,6 +1825,7 @@ static __net_init int inet_init_net(struct net *net)
 	net->ipv4.sysctl_igmp_max_msf = 10;
 	/* IGMP reports for link-local multicast groups are enabled by default */
 	net->ipv4.sysctl_igmp_llm_reports = 1;
+	net->ipv4.sysctl_igmp_llm_reports_drop = 0;
 	net->ipv4.sysctl_igmp_qrv = 2;
 
 	net->ipv4.sysctl_fib_notify_on_flag_change = 0;
diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
index a674fb44ec25b..3a4932e4108bd 100644
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -931,6 +931,8 @@ static bool igmp_heard_report(struct in_device *in_dev, __be32 group)
 	if (ipv4_is_local_multicast(group) &&
 	    !READ_ONCE(net->ipv4.sysctl_igmp_llm_reports))
 		return false;
+	if (READ_ONCE(net->ipv4.sysctl_igmp_llm_reports_drop))
+		return true;
 
 	rcu_read_lock();
 	for_each_pmc_rcu(in_dev, im) {
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 5654cc9c8a0b9..24dde84d289e4 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -948,6 +948,13 @@ static struct ctl_table ipv4_net_table[] = {
 		.mode		= 0644,
 		.proc_handler	= proc_dou8vec_minmax,
 	},
+	{
+		.procname	= "igmp_link_local_mcast_reports_drop",
+		.data		= &init_net.ipv4.sysctl_igmp_llm_reports_drop,
+		.maxlen		= sizeof(u8),
+		.mode		= 0644,
+		.proc_handler	= proc_dou8vec_minmax,
+	},
 	{
 		.procname	= "igmp_max_memberships",
 		.data		= &init_net.ipv4.sysctl_igmp_max_memberships,

---
base-commit: 028ef9c96e96197026887c0f092424679298aae8
change-id: 20260415-v7-0-topic-igmp-llm-drop-e4c13dbf17cc

Best regards,
--  
Steffen Trumtrar <s.trumtrar@pengutronix.de>
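
As a reading aid for the "224.0.0.X range" mentioned in the commit message,
here is a minimal userspace sketch of that range test. The function name
is_link_local_mcast() is hypothetical and takes a host-byte-order address
for simplicity; the kernel's ipv4_is_local_multicast(), seen in the diff
context above, operates on __be32 values instead.

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace sketch: true if a host-byte-order IPv4 address lies in
 * 224.0.0.0/24, i.e. its top 24 bits are 224.0.0.
 */
static bool is_link_local_mcast(uint32_t addr_host_order)
{
	return (addr_host_order & 0xffffff00u) == 0xe0000000u;
}
```

For example, 224.0.0.251 (0xe00000fb) falls inside the range, while
224.0.1.0 (0xe0000100) does not.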



