Netdev List
* Re: [PATCH] RDMA/nldev: add mutual exclusion in nldev_dellink()
From: Zhu Yanjun @ 2026-05-07 14:11 UTC (permalink / raw)
  To: Edward Adam Davis, yanjun.zhu@linux.dev
  Cc: akpm, arjan, davem, dsahern, edumazet, hdanton, horms, jgg, kuba,
	kuni1840, kuniyu, leon, linux-kernel, linux-rdma, netdev, pabeni,
	syzbot+d8f76778263ab65c2b21, syzkaller-bugs, zyjzyj2000
In-Reply-To: <tencent_D175A964A3A32452D77DB76B66C2B3730305@qq.com>


On 2026/5/7 6:40, Edward Adam Davis wrote:
> On Thu, 7 May 2026 06:25:54 -0700, Zhu Yanjun wrote:
>>> We must serialize calls to nldev_dellink() or risk a crash as syzbot
>>> reported:
>>>
>>> KASAN: null-ptr-deref in range [0x0000000000000020-0x0000000000000027]
>>> Call Trace:
>>>    udp_tunnel_sock_release+0x6d/0x80 net/ipv4/udp_tunnel_core.c:197
>>>    rxe_release_udp_tunnel drivers/infiniband/sw/rxe/rxe_net.c:294 [inline]
>>>    rxe_sock_put drivers/infiniband/sw/rxe/rxe_net.c:639 [inline]
>>>    rxe_net_del+0xfb/0x290 drivers/infiniband/sw/rxe/rxe_net.c:660
>>>    rxe_dellink+0x15/0x20 drivers/infiniband/sw/rxe/rxe.c:254
>>>
>>> Fixes: a60e3f3d6fba ("RDMA/nldev: Add dellink function pointer")
>>> Reported-by: syzbot+d8f76778263ab65c2b21@syzkaller.appspotmail.com
>>> Closes: https://syzkaller.appspot.com/bug?extid=d8f76778263ab65c2b21
>>> Tested-by: syzbot+d8f76778263ab65c2b21@syzkaller.appspotmail.com
>>> Signed-off-by: Edward Adam Davis <eadavis@qq.com>
>> Thanks a lot. This looks like a good solution. Since the issue is
>> reproducible, have you sent this commit to syzbot for verification?
> The patch has been verified by syzbot.

Thanks a lot.

Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>

Zhu Yanjun

>
> BR,
> Edward
>
-- 
Best Regards,
Yanjun.Zhu



* Re: [PATCH net-next 08/12] dt-bindings: net: toshiba,tc965x-dwmac: add TC956x Ethernet bridge
From: Bjorn Andersson @ 2026-05-07 14:12 UTC (permalink / raw)
  To: Alex Elder
  Cc: andrew+netdev, davem, edumazet, kuba, pabeni, maxime.chevallier,
	rmk+kernel, konradybcio, robh, krzk+dt, conor+dt, linusw, brgl,
	arnd, gregkh, Daniel Thompson, mohd.anwar, a0987203069,
	alexandre.torgue, ast, boon.khai.ng, chenchuangyu, chenhuacai,
	daniel, hawk, hkallweit1, inochiama, john.fastabend, julianbraha,
	livelycarpet87, matthew.gerlach, mcoquelin.stm32, me,
	prabhakar.mahadev-lad.rj, richardcochran, rohan.g.thomas, sdf,
	siyanteng, weishangjuan, wens, netdev, bpf, linux-arm-msm,
	devicetree, linux-gpio, linux-stm32, linux-arm-kernel,
	linux-kernel
In-Reply-To: <20260501155421.3329862-9-elder@riscstar.com>

On Fri, May 01, 2026 at 10:54:16AM -0500, Alex Elder wrote:
> diff --git a/Documentation/devicetree/bindings/net/toshiba,tc956x-dwmac.yaml b/Documentation/devicetree/bindings/net/toshiba,tc956x-dwmac.yaml
[..]
> +
> +  gpio-controller: true

I don't have any concern with the use of a proper gpio driver to model
the implementation, but if I understand correctly this relationship
between gpio controller and gpio consumer is strictly internal to "the
PCI device".

Is this connection variable or is the link merely expressed in
DeviceTree to mitigate the fact that you choose to implement the
responsibilities of the two parts split into two device drivers?

Are there other consumers of these TC956x gpios which would lead a
board designer (and hence dts author) to ever reference this
gpio-controller in a different way?

Regards,
Bjorn


* Re: [PATCH net-next 10/12] net: stmmac: tc956x: add TC956x/QPS615 support
From: Andrew Lunn @ 2026-05-07 14:14 UTC (permalink / raw)
  To: Xilin Wu
  Cc: Alex Elder, andrew+netdev, davem, edumazet, kuba, pabeni,
	maxime.chevallier, rmk+kernel, andersson, konradybcio, robh,
	krzk+dt, conor+dt, linusw, brgl, arnd, gregkh, Daniel Thompson,
	mohd.anwar, a0987203069, alexandre.torgue, ast, boon.khai.ng,
	chenchuangyu, chenhuacai, daniel, hawk, hkallweit1, inochiama,
	john.fastabend, julianbraha, livelycarpet87, matthew.gerlach,
	mcoquelin.stm32, me, prabhakar.mahadev-lad.rj, richardcochran,
	rohan.g.thomas, sdf, siyanteng, weishangjuan, wens, netdev, bpf,
	linux-arm-msm, devicetree, linux-gpio, linux-stm32,
	linux-arm-kernel, linux-kernel
In-Reply-To: <3A5C0389E7C0D241+21a4f16b-1af8-46ac-8831-0c1b49694df0@radxa.com>

> Hi Alex,
> 
> Do you think a shutdown callback like this is required? It looks like the
> driver sometimes does a MDIO MMIO read when the PCIe link is down, causing
> the board to reset due to SoC side PCIe NoC timeout.
> 
> After this change, the board can always shut down gracefully.
> 
> 
> diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-tc956x.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-tc956x.c
> index 4e8b4a185583..34b8e3fe1b51 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/dwmac-tc956x.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-tc956x.c
> @@ -767,6 +767,17 @@ static void tc956x_dwmac_remove(struct auxiliary_device *adev)
>         tc956x_mac_disable(td);
>  }
> 
> +static void tc956x_dwmac_shutdown(struct auxiliary_device *adev)
> +{
> +       struct device *dev = &adev->dev;
> +       int ret;
> +
> +       ret = stmmac_suspend(dev);

It seems odd to do a suspend in shutdown.

But let's backtrack. Why is the PCIe link down?

	Andrew



* Re: [PATCH net-next 08/12] dt-bindings: net: toshiba,tc965x-dwmac: add TC956x Ethernet bridge
From: Andrew Lunn @ 2026-05-07 14:19 UTC (permalink / raw)
  To: Bjorn Andersson
  Cc: Alex Elder, andrew+netdev, davem, edumazet, kuba, pabeni,
	maxime.chevallier, rmk+kernel, konradybcio, robh, krzk+dt,
	conor+dt, linusw, brgl, arnd, gregkh, Daniel Thompson, mohd.anwar,
	a0987203069, alexandre.torgue, ast, boon.khai.ng, chenchuangyu,
	chenhuacai, daniel, hawk, hkallweit1, inochiama, john.fastabend,
	julianbraha, livelycarpet87, matthew.gerlach, mcoquelin.stm32, me,
	prabhakar.mahadev-lad.rj, richardcochran, rohan.g.thomas, sdf,
	siyanteng, weishangjuan, wens, netdev, bpf, linux-arm-msm,
	devicetree, linux-gpio, linux-stm32, linux-arm-kernel,
	linux-kernel
In-Reply-To: <afycOwz5TpkegkZd@baldur>

> Are there other consumers of these TC956x gpios which would lead a
> board designer (and hence dts author) to ever reference this
> gpio-controller in a different way?

This Ethernet device could drive an SFP cage. Such cages typically
have a number of pins connected to GPIOs, so you can tell when there is
a module in the cage, enable the transmit laser, know if light is
entering the module from the link peer, etc.

    sfp2: sfp {
      compatible = "sff,sfp";
      i2c-bus = <&sfp_i2c>;
      los-gpios = <&cps_gpio1 28 GPIO_ACTIVE_HIGH>;
      mod-def0-gpios = <&cps_gpio1 27 GPIO_ACTIVE_LOW>;
      pinctrl-names = "default";
      pinctrl-0 = <&cps_sfpp0_pins>;
      tx-disable-gpios = <&cps_gpio1 29 GPIO_ACTIVE_HIGH>;
      tx-fault-gpios = <&cps_gpio1 26 GPIO_ACTIVE_HIGH>;
    };

	Andrew


* [PATCH net] net: xgene: fix mdio_np leak in xgene_mdiobus_register()
From: Shitalkumar Gandhi @ 2026-05-07 14:20 UTC (permalink / raw)
  To: Iyappan Subramanian, Keyur Chudgar, Quan Nguyen
  Cc: Jakub Kicinski, David S . Miller, Eric Dumazet, Paolo Abeni,
	Andrew Lunn, Simon Horman, netdev, linux-kernel,
	Shitalkumar Gandhi

The for_each_child_of_node() loop captures mdio_np via break,
holding the refcount. of_mdiobus_register() does not consume the
reference, so it leaks on success.

Put it after registration.

Fixes: e6ad767305eb ("drivers: net: Add APM X-Gene SoC ethernet driver support.")
Signed-off-by: Shitalkumar Gandhi <shitalkumar.gandhi@cambiumnetworks.com>
---
 drivers/net/ethernet/apm/xgene/xgene_enet_hw.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c b/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c
index b854b6b42d77..2926e1e59941 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c
@@ -910,7 +910,9 @@ static int xgene_mdiobus_register(struct xgene_enet_pdata *pdata,
 			return -ENXIO;
 		}
 
-		return of_mdiobus_register(mdio, mdio_np);
+		ret = of_mdiobus_register(mdio, mdio_np);
+		of_node_put(mdio_np);
+		return ret;
 	}
 
 	/* Mask out all PHYs from auto probing. */
-- 
2.25.1
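
The refcount pattern described in the commit message above can be modelled
in user space. This is a minimal sketch, not kernel code: struct node,
node_get(), node_put(), find_mdio_child() and fake_register() are
hypothetical stand-ins for struct device_node, of_node_get(), of_node_put(),
the for_each_child_of_node() loop and of_mdiobus_register(). It assumes, as
the patch description states, that registration does not consume the
caller's reference.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device_node: only the refcount matters. */
struct node {
	int refcount;
	int is_mdio;
};

static struct node *node_get(struct node *n)
{
	if (n)
		n->refcount++;
	return n;
}

static void node_put(struct node *n)
{
	if (n)
		n->refcount--;
}

/*
 * Models the iterator contract: the node handed to the loop body holds a
 * reference, which is dropped when the loop advances. Breaking out of the
 * loop leaves that reference with the caller.
 */
static struct node *find_mdio_child(struct node *children, size_t count)
{
	for (size_t i = 0; i < count; i++) {
		struct node *child = node_get(&children[i]);

		if (child->is_mdio)
			return child;	/* early exit: reference is kept */
		node_put(child);	/* advance: reference is dropped */
	}
	return NULL;
}

/* Assumed behaviour: uses the node but leaves the caller's reference alone. */
static int fake_register(struct node *n)
{
	return n ? 0 : -1;
}

/* Shape of the code before the patch: the reference is never dropped. */
static int register_leaky(struct node *children, size_t count)
{
	struct node *mdio_np = find_mdio_child(children, count);

	if (!mdio_np)
		return -1;
	return fake_register(mdio_np);
}

/* Shape of the code after the patch: put the node once registration is done. */
static int register_fixed(struct node *children, size_t count)
{
	struct node *mdio_np = find_mdio_child(children, count);
	int ret;

	if (!mdio_np)
		return -1;
	ret = fake_register(mdio_np);
	node_put(mdio_np);
	return ret;
}
```

With every node starting at a refcount of 1, register_leaky() leaves the
matching child at 2, while register_fixed() returns it to 1.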



* Re: [PATCH net-next v2 1/2] net: Consistently define pci_device_ids using named initializers
From: Uwe Kleine-König (The Capable Hub) @ 2026-05-07 14:23 UTC (permalink / raw)
  To: Marc Kleine-Budde
  Cc: Michael Grzeschik, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Vincent Mailhol, Krzysztof Halasa,
	Johannes Berg, Steffen Klassert, David Dillow, Ion Badulescu,
	Mark Einon, Rasesh Mody, GR-Linux-NIC-Dev, Manish Chopra,
	Potnuri Bharat Teja, Denis Kirjanov, Jijie Shao, Jian Shen,
	Cai Huoqing, Fan Gong, Tony Nguyen, Przemek Kitszel, Tariq Toukan,
	Saeed Mahameed, Leon Romanovsky, Mark Bloch, Ido Schimmel,
	Petr Machata, Yibo Dong, Heiner Kallweit, nic_swsd, Jiri Pirko,
	Francois Romieu, Daniele Venzano, Samuel Chessman, Jiawen Wu,
	Mengyuan Lou, Kevin Curtis, Arend van Spriel, Stanislav Yakovlev,
	Richard Cochran, Kees Cook, Aleksandr Loktionov, Thomas Gleixner,
	Jacob Keller, Thomas Fourier, Ingo Molnar, Kory Maincent,
	Zilin Guan, Vadim Fedorenko, Marco Crivellari, Bjorn Helgaas,
	David Arinzon, Yeounsu Moon, Denis Benato, Yonglong Liu,
	Andy Shevchenko, Randy Dunlap, Yicong Hui, MD Danish Anwar,
	Nathan Chancellor, Ethan Nelson-Moore, Larysa Zaremba, Ian Lin,
	Colin Ian King, Double Lo, Markus Schneider-Pargmann,
	Simon Horman, netdev, linux-kernel, linux-can, linux-parisc,
	intel-wired-lan, linux-rdma, oss-drivers, linux-wireless,
	brcm80211, brcm80211-dev-list.pdl
In-Reply-To: <20260507-healthy-gainful-fox-500552-mkl@pengutronix.de>

Hello Marc,

On Thu, May 07, 2026 at 12:55:45PM +0200, Marc Kleine-Budde wrote:
> > +	}, {
> >  		/* ASEM Dual CAN raw -new model */
> > -		ASEM_RAW_CAN_VENDOR_ID, ASEM_RAW_CAN_DEVICE_ID,
> > -		ASEM_RAW_CAN_SUB_VENDOR_ID, ASEM_RAW_CAN_SUB_DEVICE_ID_BIS,
> > -		0, 0,
> > -		(kernel_ulong_t)&plx_pci_card_info_asem_dual_can
> > +		PCI_DEVICE_SUB(ASEM_RAW_CAN_VENDOR_ID, ASEM_RAW_CAN_DEVICE_ID,
> > +			       ASEM_RAW_CAN_SUB_VENDOR_ID, ASEM_RAW_CAN_SUB_DEVICE_ID_BIS),
> > +		.driver_data = (kernel_ulong_t)&plx_pci_card_info_asem_dual_can,
> >  	},
> > -	{ 0,}
> > +	{ }
> 
> Nitpick: can you convert the terminating entry to follow the same style
> as the rest of the driver:
> 
> diff --git a/drivers/net/can/sja1000/plx_pci.c b/drivers/net/can/sja1000/plx_pci.c
> index a03553b80a5d..d69ff0ccfd94 100644
> --- a/drivers/net/can/sja1000/plx_pci.c
> +++ b/drivers/net/can/sja1000/plx_pci.c
> @@ -353,8 +353,8 @@ static const struct pci_device_id plx_pci_tbl[] = {
>                  PCI_DEVICE_SUB(ASEM_RAW_CAN_VENDOR_ID, ASEM_RAW_CAN_DEVICE_ID,
>                                 ASEM_RAW_CAN_SUB_VENDOR_ID, ASEM_RAW_CAN_SUB_DEVICE_ID_BIS),
>                  .driver_data = (kernel_ulong_t)&plx_pci_card_info_asem_dual_can,
> -        },
> -        { }
> +        }, {
> +        }
>  };
>  MODULE_DEVICE_TABLE(pci, plx_pci_tbl);

After the conversation in the v1 thread it was unclear to me if you
stand by your opinion, so I kept the format as it was. I interpret your
repetition of the nitpick as request to rework the can drivers for the
next revision (if that happens).

Best regards
Uwe


* Re: [PATCH net v2] eth: fbnic: fix double-free of PCS on phylink creation failure
From: Jakub Kicinski @ 2026-05-07 14:24 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: Bobby Eshleman, Alexander Duyck, kernel-team, Andrew Lunn,
	David S. Miller, Eric Dumazet, Russell King, netdev, linux-kernel,
	Bobby Eshleman
In-Reply-To: <6cec0c03-5bdc-4131-9899-bc5c77fba198@redhat.com>

On Thu, 7 May 2026 12:34:24 +0200 Paolo Abeni wrote:
> > Clearing fbd->netdev to NULL avoids UAF in init_failure_mode where
> > callers guard by checking !fbd->netdev, such as fbnic_mdio_read_pmd().
> > These callers remain active even after a failed probe, so fbd->netdev
> > still needs to be cleared.
> > 
> > Fixes: d0fe7104c795 ("fbnic: Replace use of internal PCS w/ Designware XPCS")
> > Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com>  
> 
> Note that sashiko-gemini spotted a pre-existing issue:
> 
> https://sashiko.dev/#/patchset/20260504-fbnic-pcs-fix-v2-1-de45192821d9%40meta.com
> 
> does not block this patch but could deserve a follow-up.

fbd is a devlink priv, not netdev priv, touching it after free_netdev()
is perfectly fine. I wish Gemini tried a *little* harder instead of
guessing :| Sorry for not commenting earlier.


* Re: [PATCH net-next v5 1/5] veth: fix OOB txq access in veth_poll() with asymmetric queue counts
From: Paolo Abeni @ 2026-05-07 14:25 UTC (permalink / raw)
  To: hawk, netdev
  Cc: kernel-team, Sashiko, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Alexei Starovoitov, Daniel Borkmann,
	John Fastabend, Stanislav Fomichev, Toshiaki Makita, linux-kernel,
	bpf
In-Reply-To: <20260505132159.241305-2-hawk@kernel.org>

On 5/5/26 3:21 PM, hawk@kernel.org wrote:
> From: Jesper Dangaard Brouer <hawk@kernel.org>
> 
> XDP redirect into a veth device (via bpf_redirect()) calls
> veth_xdp_xmit(), which enqueues frames into the peer's ptr_ring using
>   smp_processor_id() % peer->real_num_rx_queues
> as the ring index.  With an asymmetric veth pair where the peer has
> fewer TX queues than RX queues, that index can exceed
> peer->real_num_tx_queues.
> 
> veth_poll() then resolves peer_txq for the ring via:
> 
>   peer_txq = peer_dev ? netdev_get_tx_queue(peer_dev, queue_idx) : NULL;
> 
> where queue_idx = rq->xdp_rxq.queue_index.  When queue_idx exceeds
> peer_dev->real_num_tx_queues this is an out-of-bounds (OOB) access
> into the peer's netdev_queue array, triggering DEBUG_NET_WARN_ON_ONCE
> in netdev_get_tx_queue().
> 
> The normal ndo_start_xmit path is not affected: the stack clamps
> skb->queue_mapping via netdev_cap_txqueue() before invoking
> ndo_start_xmit, so rxq in veth_xmit() never exceeds real_num_tx_queues.
> 
> Fix veth_poll() by clamping: only dereference peer_txq when queue_idx is
> within bounds, otherwise set it to NULL.  The out-of-range rings are fed
> exclusively via XDP redirect (veth_xdp_xmit), never via ndo_start_xmit
> (veth_xmit), so the peer txq was never stopped and there is nothing to
> wake; NULL is the correct fallback.
> 
> Reported-by: Sashiko <sashiko-bot@kernel.org>
> Closes: https://lore.kernel.org/all/20260502071828.616C3C19425@smtp.kernel.org/
> Fixes: dc82a33297fc ("veth: apply qdisc backpressure on full ptr_ring to reduce TX drops")
> Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org>

This looks fairly uncontroversial, but it's IMHO net material. Let me
apply it there.

/P
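
The clamping described in the quoted commit message can be illustrated with
a small user-space model. This is a sketch under stated assumptions, not the
veth code: struct txq, xdp_ring_index() and peer_txq_for_ring() are
hypothetical names, and the queue counts are passed in explicitly rather
than read from a net_device.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct netdev_queue. */
struct txq {
	int id;
};

/* Ring selection as described for the XDP redirect path: cpu % RX rings. */
static unsigned int xdp_ring_index(unsigned int cpu, unsigned int num_rx_rings)
{
	return cpu % num_rx_rings;
}

/*
 * Bounds-checked lookup: only index the TX queue array when queue_idx is
 * below real_num_tx_queues, otherwise report "no queue to wake" as NULL.
 */
static struct txq *peer_txq_for_ring(struct txq *txqs,
				     unsigned int real_num_tx_queues,
				     unsigned int queue_idx)
{
	if (!txqs || queue_idx >= real_num_tx_queues)
		return NULL;
	return &txqs[queue_idx];
}
```

With four RX rings and two TX queues, a frame redirected on CPU 3 lands in
ring 3; the lookup for that ring yields NULL instead of reading past the
two-entry queue array.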



* Re: [PATCH net v2] eth: fbnic: fix double-free of PCS on phylink creation failure
From: Jakub Kicinski @ 2026-05-07 14:29 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: Bobby Eshleman, Alexander Duyck, kernel-team, Andrew Lunn,
	David S. Miller, Eric Dumazet, Russell King, netdev, linux-kernel,
	Bobby Eshleman
In-Reply-To: <20260507072453.5eec7051@kernel.org>

On Thu, 7 May 2026 07:24:53 -0700 Jakub Kicinski wrote:
> On Thu, 7 May 2026 12:34:24 +0200 Paolo Abeni wrote:
> > > Clearing fbd->netdev to NULL avoids UAF in init_failure_mode where
> > > callers guard by checking !fbd->netdev, such as fbnic_mdio_read_pmd().
> > > > These callers remain active even after a failed probe, so fbd->netdev
> > > still needs to be cleared.
> > > 
> > > Fixes: d0fe7104c795 ("fbnic: Replace use of internal PCS w/ Designware XPCS")
> > > Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com>    
> > 
> > Note that sashiko-gemini spotted a pre-existing issue:
> > 
> > https://sashiko.dev/#/patchset/20260504-fbnic-pcs-fix-v2-1-de45192821d9%40meta.com
> > 
> > does not block this patch but could deserve a follow-up.  
> 
> fbd is a devlink priv, not netdev priv, touching it after free_netdev()
> is perfectly fine. I wish Gemini tried a *little* harder instead of
> guessing :| Sorry for not commenting earlier.

Ugh, not enough coffee. It's complaining about MDIO reads, I think
that's valid.


* Re: [PATCH net-next v5 0/5] veth: add Byte Queue Limits (BQL) support
From: patchwork-bot+netdevbpf @ 2026-05-07 14:30 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: netdev, kernel-team, davem, edumazet, kuba, pabeni, horms, carges,
	mfreemon, toke, j.koeppeler, leitao, simon.schippers, ast, daniel,
	john.fastabend, sdf, bpf
In-Reply-To: <20260505132159.241305-1-hawk@kernel.org>

Hello:

This series was applied to netdev/net.git (main)
by Paolo Abeni <pabeni@redhat.com>:

On Tue,  5 May 2026 15:21:52 +0200 you wrote:
> From: Jesper Dangaard Brouer <hawk@kernel.org>
> 
> This series adds BQL (Byte Queue Limits) to the veth driver, reducing
> latency by dynamically limiting in-flight packets in the ptr_ring and
> moving buffering into the qdisc where AQM algorithms can act on it.
> 
> Problem:
>   veth's 256-entry ptr_ring acts as a "dark buffer" -- packets queued
>   there are invisible to the qdisc's AQM.  Under load, the ring fills
>   completely (DRV_XOFF backpressure), adding up to 256 packets of
>   unmanaged latency before the qdisc even sees congestion.
> 
> [...]

Here is the summary with links:
  - [net-next,v5,1/5] veth: fix OOB txq access in veth_poll() with asymmetric queue counts
    https://git.kernel.org/netdev/net/c/08f566e8f83b
  - [net-next,v5,2/5] net: add dev->bql flag to allow BQL sysfs for IFF_NO_QUEUE devices
    (no matching commit)
  - [net-next,v5,3/5] veth: implement Byte Queue Limits (BQL) for latency reduction
    (no matching commit)
  - [net-next,v5,4/5] veth: add tx_timeout watchdog as BQL safety net
    (no matching commit)
  - [net-next,v5,5/5] net: sched: add timeout count to NETDEV WATCHDOG message
    (no matching commit)

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




* Re: [PATCH ethtool-next 0/5] ethtool: Add 'pages on|off' option for module EEROM hex dump
From: Michal Kubecek @ 2026-05-07 14:32 UTC (permalink / raw)
  To: Danielle Ratson; +Cc: netdev@vger.kernel.org, Ido Schimmel, Petr Machata
In-Reply-To: <DM6PR12MB4516FAE7C203C652E10BE801D83C2@DM6PR12MB4516.namprd12.prod.outlook.com>

On Thu, May 07, 2026 at 12:37:25PM GMT, Danielle Ratson wrote:
> The reason for keeping the switch is backward compatibility: today's
> ethtool -m hex on produces a single flat hex dump, while the
> page-organized output has a different shape - per-page headers, blank
> lines and repeated column headers interleaved between hex rows.
> Scripts that parse today's output could break.
[...]
> Since the format changes shape and not just size, we kept it opt-in via pages on so existing consumers are unaffected.
> Are you OK with this even though it might break some existing usage?

Good point, there may indeed be existing scripts that would not handle
a changed (default) output format. While I would personally prefer the
binary dump for machine processing, people often parse text outputs that
were never meant to be used that way.

I suppose your approach is a safer choice then.

Michal


* Re: [PATCH net] vsock/virtio: fix potential unbounded skb queue
From: Jakub Kicinski @ 2026-05-07 14:33 UTC (permalink / raw)
  To: Stefano Garzarella
  Cc: Michael S. Tsirkin, Eric Dumazet, Arseniy Krasnov, Bobby Eshleman,
	Stefan Hajnoczi, David S . Miller, Paolo Abeni, Simon Horman,
	netdev, eric.dumazet, Arseniy Krasnov, Jason Wang, Xuan Zhuo,
	Eugenio Pérez, kvm, virtualization
In-Reply-To: <afyMCyBvZpzWrLtO@sgarzare-redhat>

On Thu, 7 May 2026 14:59:13 +0200 Stefano Garzarella wrote:
> >well if you want to support pathological cases such as 1 byte messages
> >that would mean like 100x reduction no?
> 
> Yep, but since this patch is already merged, IMHO that is better than 
> losing data in those pathological cases.

We can revert if you think that the risk of regression is high..
Please LMK soon, we can do it before patch reaches Linus.


* Re: [PATCH net-next v5 3/5] veth: implement Byte Queue Limits (BQL) for latency reduction
From: Paolo Abeni @ 2026-05-07 14:34 UTC (permalink / raw)
  To: Simon Schippers, hawk, netdev
  Cc: kernel-team, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Alexei Starovoitov, Daniel Borkmann,
	John Fastabend, Stanislav Fomichev, linux-kernel, bpf
In-Reply-To: <ee275aa6-af27-4dac-9afa-da88abde312b@schippers-hamm.de>

On 5/7/26 8:54 AM, Simon Schippers wrote:
> On 5/5/26 15:21, hawk@kernel.org wrote:
>> @@ -928,9 +968,13 @@ static int veth_xdp_rcv(struct veth_rq *rq, int budget,
>>  			}
>>  		} else {
>>  			/* ndo_start_xmit */
>> -			struct sk_buff *skb = ptr;
>> +			bool bql_charged = veth_ptr_is_bql(ptr);
>> +			struct sk_buff *skb = veth_ptr_to_skb(ptr);
>>  
>>  			stats->xdp_bytes += skb->len;
>> +			if (peer_txq && bql_charged)
>> +				netdev_tx_completed_queue(peer_txq, 1, VETH_BQL_UNIT);
> 
> In the discussion with Jonas [1], I left a comment explaining why I think
> this doesn’t work.
> 
> I still think first that adding an option to modify the hard-coded
> VETH_RING_SIZE is the way to go.
> 
> Thanks!
> 
> [1] Link: https://lore.kernel.org/netdev/e8cdba04-aa9a-45c6-9807-8274b62920df@tu-dortmund.de/

In the above discussion a 20% regression is reported, which IMHO can't
be ignored. Still, the throughput figures in the data are extremely low;
is something possibly off? I would expect a few Mpps with pktgen on
top of veth, while the reported data is ~20-30Kpps.

/P



* Re: [PATCH net-next v2 2/3] net: sched: tbf: pass all params to offload users
From: Jakub Kicinski @ 2026-05-07 14:37 UTC (permalink / raw)
  To: David Yang
  Cc: netdev, andrew, olteanv, davem, edumazet, pabeni, jhs, jiri,
	horms, linux-kernel
In-Reply-To: <CAAXyoMP3RNFWG+n39ZtCYb3Aq66KvpEjMhKr_2zU85mah0czrg@mail.gmail.com>

On Thu, 7 May 2026 11:11:58 +0800 David Yang wrote:
> > >  struct tc_tbf_qopt_offload_replace_params {
> > > +     u32             limit;
> > > +     u32             max_size;
> > > +     s64             buffer;
> > > +     s64             mtu;  
> >
> > The buffer and mtu fields are stored in tbf_sched_data in nanoseconds
> > (see tbf_change() in net/sched/sch_tbf.c where they are derived via
> > PSCHED_TICKS2NS(qopt->buffer) and psched_l2t_ns()), but they are
> > exposed here as bare s64 buffer / s64 mtu right next to max_size
> > which is a byte count.
> >
> > Would it be worth renaming these to buffer_ns / mtu_ns, or adding
> > kerneldoc to describe their unit?
> >
> > A driver author reading this struct and seeing mtu adjacent to
> > max_size might reasonably assume mtu is a byte MTU and program
> > hardware accordingly.  
> 
> These are carbon copies of struct tbf_sched_data, I see no reason to
> rename just here.

Driver API has broader exposure and more potential for
misunderstandings. AI's naming suggestion makes sense to me.
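
To illustrate why the unit matters, here is a minimal sketch of what a
driver would have to do with a time-based value before programming a
byte-based hardware limit. It assumes, as the review comment above
describes, that buffer and mtu are durations in nanoseconds at the
configured rate; ns_to_bytes() is a hypothetical helper, not a kernel API.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Hypothetical helper: number of bytes that can be sent in time_ns at
 * rate_bytes_per_sec. Whole seconds and the remainder are converted
 * separately so the intermediate product stays within 64 bits for rates
 * below roughly 18 GB/s.
 */
static uint64_t ns_to_bytes(uint64_t time_ns, uint64_t rate_bytes_per_sec)
{
	uint64_t secs = time_ns / NSEC_PER_SEC;
	uint64_t rem = time_ns % NSEC_PER_SEC;

	return secs * rate_bytes_per_sec +
	       rem * rate_bytes_per_sec / NSEC_PER_SEC;
}
```

At 1 Gbit/s (125000000 bytes per second), a value of 12000 ns corresponds to
1500 bytes. A driver that read the same field as a byte count would program
12000 instead, which is the kind of misreading a _ns suffix or kerneldoc
would help avoid.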


* Re: [PATCH net-next v3 2/2] net: openvswitch: decouple flow_table from ovs_mutex
From: David Laight @ 2026-05-07 14:37 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: Adrian Moreno, netdev, aconole, Eelco Chaudron, Ilya Maximets,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Simon Horman,
	open list:OPENVSWITCH, open list
In-Reply-To: <b1820863-fd30-45b5-b03e-28225ea80780@redhat.com>

On Thu, 7 May 2026 13:31:24 +0200
Paolo Abeni <pabeni@redhat.com> wrote:

> On 5/5/26 10:42 AM, Adrian Moreno wrote:
..
> > +/* Must be called with flow_table->lock held. */
> >  int ovs_flow_tbl_flush(struct flow_table *flow_table)
> >  {
> >  	struct table_instance *old_ti, *new_ti;
> >  	struct table_instance *old_ufid_ti, *new_ufid_ti;
> >  
> > +	ASSERT_OVS_TBL(flow_table);  
> 
> Minor nit: adding the assert and the comment is redundant. I think the
> assert alone would be better. There are other similar later occurrences.

There is no point adding an ASSERT() for a pointer being NULL.
The NULL pointer dereference does the same job and can be easier to
debug because all the registers are still live.

-- David

> 
> /P
> 
> 



* Re: [PATCH net-next 08/12] dt-bindings: net: toshiba,tc965x-dwmac: add TC956x Ethernet bridge
From: Daniel Thompson @ 2026-05-07 14:47 UTC (permalink / raw)
  To: Krzysztof Kozlowski
  Cc: Alex Elder, andrew+netdev, davem, edumazet, kuba, pabeni,
	maxime.chevallier, rmk+kernel, andersson, konradybcio, robh,
	krzk+dt, conor+dt, linusw, brgl, arnd, gregkh, mohd.anwar,
	a0987203069, alexandre.torgue, ast, boon.khai.ng, chenchuangyu,
	chenhuacai, daniel, hawk, hkallweit1, inochiama, john.fastabend,
	julianbraha, livelycarpet87, matthew.gerlach, mcoquelin.stm32, me,
	prabhakar.mahadev-lad.rj, richardcochran, rohan.g.thomas, sdf,
	siyanteng, weishangjuan, wens, netdev, bpf, linux-arm-msm,
	devicetree, linux-gpio, linux-stm32, linux-arm-kernel,
	linux-kernel
In-Reply-To: <20260504-fascinating-teal-tarsier-b116c8@quoll>

On Mon, May 04, 2026 at 01:00:07PM +0200, Krzysztof Kozlowski wrote:
> On Fri, May 01, 2026 at 10:54:16AM -0500, Alex Elder wrote:
> > From: Daniel Thompson <daniel@riscstar.com>
> >
> > Add devicetree bindings for the Toshiba TC956x family of Ethernet-AVB/TSN
> > bridges.
> >
> > Signed-off-by: Daniel Thompson <daniel@riscstar.com>
> > Signed-off-by: Alex Elder <elder@riscstar.com>

Alex already replied to most of your comments but on this one
specifically...


> > ---
> >  .../bindings/net/toshiba,tc956x-dwmac.yaml    | 111 ++++++++++++++++++
> >  1 file changed, 111 insertions(+)
> >  create mode 100644 Documentation/devicetree/bindings/net/toshiba,tc956x-dwmac.yaml
> >
> > diff --git a/Documentation/devicetree/bindings/net/toshiba,tc956x-dwmac.yaml b/Documentation/devicetree/bindings/net/toshiba,tc956x-dwmac.yaml
> > new file mode 100644
> > index 0000000000000..d95d22a3761da
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/net/toshiba,tc956x-dwmac.yaml
> > @@ -0,0 +1,111 @@
> > <snip>
> > +examples:
> > +  - |
> > +    pcie {
> > +      #address-cells = <3>;
> > +      #size-cells = <2>;
> > +
> > +      tc956x_emac0: pci@0,0 {
> > +        compatible = "pci1179,0220";
> > +        reg = <0x50000 0x0 0x0 0x0 0x0>;
> > +        #address-cells = <3>;
> > +        #size-cells = <2>;
> > +        device_type = "pci";
> > +        ranges;
> > +
> > +        gpio-controller;
> > +        #gpio-cells = <2>;
> > +
> > +        phy-mode = "10gbase-r";
> > +        phy-handle = <&tc956x_emac0_phy>;
> > +
> > +        mdio {
> > +          compatible = "snps,dwmac-mdio";
> > +          #address-cells = <1>;
> > +          #size-cells = <0>;
> > +
> > +          tc956x_emac0_phy: ethernet-phy@1c {
> > +            compatible = "ethernet-phy-id311c.1c12";
> > +            reg = <0x1c>;
> > +          };
> > +        };
> > +      };
>
> Keep only one example, unless you have different properties (not their
> values, but their presence),

At some point I simplified the example by stripping out excess
properties from each ethernet-phy. In the process it looks like I
removed too much and eliminated the reason I thought it important to
include both PCI functions in the example!

Each ethernet-phy will typically describe a reset gpio but we expect
only eMAC0 to act as a gpio-controller. For that reason I wanted to
show that. You can see part of that in the current example because
tc956x_emac1 is not a gpio-controller.

In other words tc956x_emac**1**_phy will, in the real world, include a
reset-gpios property that references tc956x_emac**0**. For example:

    reset-gpios = <&tc956x_emac0 1 GPIO_ACTIVE_LOW>;


So... is it better to strip back the example to describe only a
single PCI function or should I add back the reset-gpios that I
accidentally removed?


Daniel.


* Re: [PATCH net-next v3 1/2] dpll: add fractional frequency offset to pin-parent-device
From: Jakub Kicinski @ 2026-05-07 14:47 UTC (permalink / raw)
  To: Ivan Vecera
  Cc: Jiri Pirko, netdev, Andrew Lunn, Arkadiusz Kubalewski,
	David S. Miller, Donald Hunter, Eric Dumazet, Jonathan Corbet,
	Leon Romanovsky, Mark Bloch, Michal Schmidt, Paolo Abeni,
	Pasi Vaananen, Petr Oros, Prathosh Satish, Saeed Mahameed,
	Shuah Khan, Simon Horman, Tariq Toukan, Vadim Fedorenko,
	linux-doc, linux-kernel, linux-rdma
In-Reply-To: <541f767d-222b-4dfa-a95a-19a5ed7a46bf@redhat.com>

On Thu, 7 May 2026 08:12:01 +0200 Ivan Vecera wrote:
> >> @@ -299,6 +299,10 @@ zl3073x_dpll_input_pin_ffo_get(const struct dpll_pin *dpll_pin, void *pin_priv,
> >>   {
> >>   	struct zl3073x_dpll_pin *pin = pin_priv;
> >>   
> >> +	/* Only rx vs tx symbol rate FFO is supported */
> >> +	if (dpll)
> >> +		return -ENODATA;
> >> +
> >>   	*ffo = pin->freq_offset;  
> > 
> > It's easy for driver authors to forget this sort of validation.
> > We should fail close, so it's better to have some "capability"
> > bits or something for the driver to opt into getting given format
> > of the call.  
> 
> Regarding the fail-close concern — I agree that relying on drivers
> to check dpll==NULL is fragile. A capability bit alone wouldn't help
> though, since the driver still needs to distinguish which FFO context
> is being requested.
> 
> I can think of two approaches:
> 1. An explicit bool parameter (e.g. `bool per_parent`) instead of
>     overloading the dpll pointer for context distinction.
> 2. Separate callbacks for each FFO context (e.g. ffo_get for the
>     top-level and ffo_parent_get for the per-parent).
> 
> Do you have a preference, or something else in mind?

Take a look at the fields at the beginning of struct ethtool_ops.
If we had two bits in the ops struct for the driver to declare / opt in
to each context, the core could avoid calling the driver if it doesn't
support a context.
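
A rough user-space sketch of that opt-in idea follows. Everything in it is
hypothetical: struct pin_ops, the two supports_* bits, core_ffo_get() and the
choice of -EOPNOTSUPP are illustrative only and do not reflect the dpll API
or any agreed design.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical ops struct with capability bits placed at the beginning. */
struct pin_ops {
	unsigned int supports_ffo_pin:1;	/* FFO without a parent device */
	unsigned int supports_ffo_parent:1;	/* FFO relative to a parent */
	int (*ffo_get)(const void *parent, long long *ffo);
};

/*
 * Core-side wrapper: fail closed. The driver callback runs only for a
 * context it explicitly declared, so a missing check inside the driver
 * cannot produce a wrong answer.
 */
static int core_ffo_get(const struct pin_ops *ops, const void *parent,
			long long *ffo)
{
	if (!ops->ffo_get)
		return -EOPNOTSUPP;
	if (parent ? !ops->supports_ffo_parent : !ops->supports_ffo_pin)
		return -EOPNOTSUPP;
	return ops->ffo_get(parent, ffo);
}

/* Example driver callback that does no context validation of its own. */
static int demo_ffo_get(const void *parent, long long *ffo)
{
	(void)parent;
	*ffo = 42;
	return 0;
}
```

A driver that sets only supports_ffo_pin is never called with a non-NULL
parent, so the validation lives in one place in the core rather than in each
driver.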


* Re: [PATCH net-next v2 1/2] net: Consistently define pci_device_ids using named initializers
From: Marc Kleine-Budde @ 2026-05-07 14:48 UTC (permalink / raw)
  To: Uwe Kleine-König (The Capable Hub)
  Cc: Michael Grzeschik, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Vincent Mailhol, Krzysztof Halasa,
	Johannes Berg, Steffen Klassert, David Dillow, Ion Badulescu,
	Mark Einon, Rasesh Mody, GR-Linux-NIC-Dev, Manish Chopra,
	Potnuri Bharat Teja, Denis Kirjanov, Jijie Shao, Jian Shen,
	Cai Huoqing, Fan Gong, Tony Nguyen, Przemek Kitszel, Tariq Toukan,
	Saeed Mahameed, Leon Romanovsky, Mark Bloch, Ido Schimmel,
	Petr Machata, Yibo Dong, Heiner Kallweit, nic_swsd, Jiri Pirko,
	Francois Romieu, Daniele Venzano, Samuel Chessman, Jiawen Wu,
	Mengyuan Lou, Kevin Curtis, Arend van Spriel, Stanislav Yakovlev,
	Richard Cochran, Kees Cook, Aleksandr Loktionov, Thomas Gleixner,
	Jacob Keller, Thomas Fourier, Ingo Molnar, Kory Maincent,
	Zilin Guan, Vadim Fedorenko, Marco Crivellari, Bjorn Helgaas,
	David Arinzon, Yeounsu Moon, Denis Benato, Yonglong Liu,
	Andy Shevchenko, Randy Dunlap, Yicong Hui, MD Danish Anwar,
	Nathan Chancellor, Ethan Nelson-Moore, Larysa Zaremba, Ian Lin,
	Colin Ian King, Double Lo, Markus Schneider-Pargmann,
	Simon Horman, netdev, linux-kernel, linux-can, linux-parisc,
	intel-wired-lan, linux-rdma, oss-drivers, linux-wireless,
	brcm80211, brcm80211-dev-list.pdl
In-Reply-To: <afyfa4E4rNbkMYTk@monoceros>

On 07.05.2026 16:23:43, Uwe Kleine-König (The Capable Hub) wrote:
> Hello Marc,
>
> On Thu, May 07, 2026 at 12:55:45PM +0200, Marc Kleine-Budde wrote:
> > > +	}, {
> > >  		/* ASEM Dual CAN raw -new model */
> > > -		ASEM_RAW_CAN_VENDOR_ID, ASEM_RAW_CAN_DEVICE_ID,
> > > -		ASEM_RAW_CAN_SUB_VENDOR_ID, ASEM_RAW_CAN_SUB_DEVICE_ID_BIS,
> > > -		0, 0,
> > > -		(kernel_ulong_t)&plx_pci_card_info_asem_dual_can
> > > +		PCI_DEVICE_SUB(ASEM_RAW_CAN_VENDOR_ID, ASEM_RAW_CAN_DEVICE_ID,
> > > +			       ASEM_RAW_CAN_SUB_VENDOR_ID, ASEM_RAW_CAN_SUB_DEVICE_ID_BIS),
> > > +		.driver_data = (kernel_ulong_t)&plx_pci_card_info_asem_dual_can,
> > >  	},
> > > -	{ 0,}
> > > +	{ }
> >
> > Nitpick: can you convert the terminating entry to follow the same style
> > as the rest of the driver:
> >
> > diff --git a/drivers/net/can/sja1000/plx_pci.c b/drivers/net/can/sja1000/plx_pci.c
> > index a03553b80a5d..d69ff0ccfd94 100644
> > --- a/drivers/net/can/sja1000/plx_pci.c
> > +++ b/drivers/net/can/sja1000/plx_pci.c
> > @@ -353,8 +353,8 @@ static const struct pci_device_id plx_pci_tbl[] = {
> >                  PCI_DEVICE_SUB(ASEM_RAW_CAN_VENDOR_ID, ASEM_RAW_CAN_DEVICE_ID,
> >                                 ASEM_RAW_CAN_SUB_VENDOR_ID, ASEM_RAW_CAN_SUB_DEVICE_ID_BIS),
> >                  .driver_data = (kernel_ulong_t)&plx_pci_card_info_asem_dual_can,
> > -        },
> > -        { }
> > +        }, {
> > +        }
> >  };
> >  MODULE_DEVICE_TABLE(pci, plx_pci_tbl);
>
> After the conversation in the v1 thread it was unclear to me if you
> stand by your opinion, so I kept the format as it was. I interpret your
> repetition of the nitpick as a request to rework the CAN drivers for the
> next revision (if that happens).

Doh - Yes, right, we discussed this already. Keep it as is and add my:

Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> # for drivers/net/can

regards,
Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde          |
Embedded Linux                   | https://www.pengutronix.de |
Vertretung Nürnberg              | Phone: +49-5121-206917-129 |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-9   |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply

* Re: [PATCH] devlink/param: replace deprecated strcpy() with strscpy()
From: Jakub Kicinski @ 2026-05-07 14:52 UTC (permalink / raw)
  To: David Laight
  Cc: Álvaro Costa, jiri, David S. Miller, Eric Dumazet,
	Paolo Abeni, Simon Horman, open list:DEVLINK, open list
In-Reply-To: <20260507090445.3448d536@pumpkin>

On Thu, 7 May 2026 09:04:45 +0100 David Laight wrote:
> >  	case DEVLINK_PARAM_TYPE_STRING:
> > -		len = strnlen(nla_data(param_data), nla_len(param_data));
> > -		if (len == nla_len(param_data) ||
> > -		    len >= __DEVLINK_PARAM_MAX_STRING_VALUE)
> > +		len = strscpy(value->vstr, nla_data(param_data));
> > +		if (len < 0)
> >  			return -EINVAL;
> > -		strcpy(value->vstr, nla_data(param_data));  
> 
> The only sensible thing here is to replace the strcpy() with:
> 		memcpy(value->vstr, nla_data(param_data), len + 1);

That'd probably be a good move, to avoid getting the broken strscpy()
conversion submissions. Care to send a patch?
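
For illustration, a userspace sketch of the pattern suggested above: keep
the existing strnlen() length validation and copy with memcpy() once the
length is known to fit. DEMO_MAX_STRING_VALUE and demo_copy_string() are
hypothetical stand-ins for __DEVLINK_PARAM_MAX_STRING_VALUE and the
devlink code path; the payload pointer and length play the role of
nla_data() and nla_len().

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for __DEVLINK_PARAM_MAX_STRING_VALUE (value is arbitrary). */
#define DEMO_MAX_STRING_VALUE 32

/* Copy a possibly unterminated attribute payload into a fixed buffer. */
static int demo_copy_string(char dst[DEMO_MAX_STRING_VALUE],
			    const void *data, size_t data_len)
{
	size_t len = strnlen(data, data_len);

	/* No NUL inside the payload, or string too long for dst. */
	if (len == data_len || len >= DEMO_MAX_STRING_VALUE)
		return -EINVAL;

	/* len is already validated: copy the string plus its NUL. */
	memcpy(dst, data, len + 1);
	return 0;
}
```

Because the bounds are checked before the copy, memcpy() never reads past
the payload and never writes past dst, which is the property the original
strnlen() checks were there to guarantee.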

^ permalink raw reply

* Re: [PATCH net] vsock/virtio: fix potential unbounded skb queue
From: Michael S. Tsirkin @ 2026-05-07 14:52 UTC (permalink / raw)
  To: Stefano Garzarella
  Cc: Eric Dumazet, Arseniy Krasnov, Bobby Eshleman, Stefan Hajnoczi,
	David S . Miller, Jakub Kicinski, Paolo Abeni, Simon Horman,
	netdev, eric.dumazet, Arseniy Krasnov, Jason Wang, Xuan Zhuo,
	Eugenio Pérez, kvm, virtualization
In-Reply-To: <afyMCyBvZpzWrLtO@sgarzare-redhat>

On Thu, May 07, 2026 at 02:59:13PM +0200, Stefano Garzarella wrote:
> On Thu, May 07, 2026 at 07:45:10AM -0400, Michael S. Tsirkin wrote:
> > On Thu, May 07, 2026 at 11:09:47AM +0200, Stefano Garzarella wrote:
> > > On Wed, May 06, 2026 at 11:37:45AM -0400, Michael S. Tsirkin wrote:
> > > > On Tue, May 05, 2026 at 06:11:13PM +0200, Stefano Garzarella wrote:
> > > > > On Tue, May 05, 2026 at 07:14:36AM -0700, Eric Dumazet wrote:
> > > > > > On Tue, May 5, 2026 at 6:52 AM Stefano Garzarella <sgarzare@redhat.com> wrote:
> > > > > > >
> > > > > > > On Thu, Apr 30, 2026 at 12:26:52PM +0000, Eric Dumazet wrote:
> > > > > > > >virtio_transport_inc_rx_pkt() checks vvs->rx_bytes + len > vvs->buf_alloc.
> > > > > > > >
> > > > > > > >virtio_transport_recv_enqueue() skips coalescing for packets
> > > > > > > >with VIRTIO_VSOCK_SEQ_EOM.
> > > > > > > >
> > > > > > > >If fed with packets with len == 0 and VIRTIO_VSOCK_SEQ_EOM,
> > > > > > > >a very large number of packets can be queued
> > > > > > > >because vvs->rx_bytes stays at 0.
> > > > > > > >
> > > > > > > >Fix this by estimating the skb metadata size:
> > > > > > > >
> > > > > > > >       (Number of skbs in the queue) * SKB_TRUESIZE(0)
> > > > > > > >
> > > > > > > >Fixes: 077706165717 ("virtio/vsock: don't use skbuff state to account credit")
> > > > > > > >Signed-off-by: Eric Dumazet <edumazet@google.com>
> > > > > > > >Cc: Arseniy Krasnov <AVKrasnov@sberdevices.ru>
> > > > > > > >Cc: Stefan Hajnoczi <stefanha@redhat.com>
> > > > > > > >Cc: Stefano Garzarella <sgarzare@redhat.com>
> > > > > > > >Cc: "Michael S. Tsirkin" <mst@redhat.com>
> > > > > > > >Cc: Jason Wang <jasowang@redhat.com>
> > > > > > > >Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > > > > > >Cc: "Eugenio Pérez" <eperezma@redhat.com>
> > > > > > > >Cc: kvm@vger.kernel.org
> > > > > > > >Cc: virtualization@lists.linux.dev
> > > > > > > >---
> > > > > > > > net/vmw_vsock/virtio_transport_common.c | 4 +++-
> > > > > > > > 1 file changed, 3 insertions(+), 1 deletion(-)
> > > > > > > >
> > > > > > > >diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
> > > > > > > >index 416d533f493d7b07e9c77c43f741d28cfcd0953e..9b8014516f4fb1130ae184635fbba4dfee58bd64 100644
> > > > > > > >--- a/net/vmw_vsock/virtio_transport_common.c
> > > > > > > >+++ b/net/vmw_vsock/virtio_transport_common.c
> > > > > > > >@@ -447,7 +447,9 @@ static int virtio_transport_send_pkt_info(struct vsock_sock *vsk,
> > > > > > > > static bool virtio_transport_inc_rx_pkt(struct virtio_vsock_sock *vvs,
> > > > > > > >                                       u32 len)
> > > > > > > > {
> > > > > > > >-      if (vvs->buf_used + len > vvs->buf_alloc)
> > > > > > > >+      u64 skb_overhead = (skb_queue_len(&vvs->rx_queue) + 1) * SKB_TRUESIZE(0);
> > > > > > > >+
> > > > > > > >+      if (skb_overhead + vvs->buf_used + len > vvs->buf_alloc)
> > > > > > > >               return false;
> > > > > > >
> > > > > > > I'm not sure about this fix, I mean that maybe this is incomplete.
> > > > > > > In virtio-vsock, there is a credit mechanism between the two peers:
> > > > > > > https://docs.oasis-open.org/virtio/virtio/v1.3/csd01/virtio-v1.3-csd01.html#x1-4850003
> > > > > > >
> > > > > > > This takes only the payload into account, so it’s true that this problem
> > > > > > > exists; however, perhaps we should also inform the other peer of a lower
> > > > > > > credit balance, otherwise the other peer will believe it has much more
> > > > > > > credit than it actually does, send a large payload, and then the packet
> > > > > > > will be discarded and the data lost (there are no retransmissions,
> > > > > > > etc.).
> > > > > >
> > > > > > I dunno, perhaps revert 077706165717 ("virtio/vsock: don't use skbuff
> > > > > > state to account credit")
> > > > > > and find a better fix then?
> > > > >
> > > > > IIRC the same issue was there before the commit fixed by that one (commit
> > > > > 71dc9ec9ac7d ("virtio/vsock: replace virtio_vsock_pkt with sk_buff")), so
> > > > > not sure about reverting it TBH.
> > > > >
> > > > > CCing Arseniy and Bobby.
> > > > >
> > > > > >
> > > > > > There is always a discrepancy between skb->len and skb->truesize.
> > > > > > > You will not be able to announce a 1MB window, and accept one million
> > > > > > > skbs of 1 byte each.
> > > > > >
> > > > > > This kind of contract is broken.
> > > > > >
> > > > >
> > > > > Yep, I agree, but before we start discarding data (and losing it), IMHO we
> > > > > should at least inform the other peer that we're out of space.
> > > > >
> > > > > > @Stefan, @Michael, do you think we can do something in the spec to avoid
> > > > > > this issue and in some way also take the metadata into account in the
> > > > > > credit? I mean to avoid flooding with 1-byte packets.
> > > > >
> > > > > Thanks,
> > > > > Stefano
> > > >
> > > > Why do we need the metadata? Just don't keep it around if you begin
> > > > running low on memory.
> > > 
> > > I don't think removing the skbuffs will be easy; we added them for ebpf,
> > > zero-copy, and seqpacket as well.
> > 
> > You do not need to remove them completely.
> > 
> > > For now, we're already doing something:
> > > merging the skbuffs if they don't have EOM set.
> > 
> > 
> > Right that's good. You could go further and merge with EOM too
> > if you stick the info about message boundaries somewhere else.
> 
> This adds a lot of complexity IMO, but we can try.
> 
> Do you have something in mind?

I'll send something shortly just to give you an idea.


> > 
> > > As a quick fix, I'm thinking of reducing the `buf_alloc` value to account
> > > for the overhead and notifying the other peer, at least until we find a
> > > better solution.
> > > 
> > > Stefano
> > 
> > well if you want to support pathological cases such as 1 byte messages
> > that would mean like 100x reduction no?
> > 
> 
> Yep, but since this patch is already merged, IMHO that is better than losing
> data in those pathological cases.
> 
> Thanks,
> Stefano
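
For reference, the accounting from the patch quoted earlier in this thread
can be modelled in userspace roughly as below. This is only a sketch:
struct demo_vvs, demo_inc_rx_pkt() and DEMO_SKB_OVERHEAD are hypothetical,
the overhead constant is an arbitrary stand-in for SKB_TRUESIZE(0), and the
queue length is tracked in the struct instead of via skb_queue_len().

```c
#include <stdbool.h>
#include <stdint.h>

/* Arbitrary stand-in for SKB_TRUESIZE(0); the real value differs. */
#define DEMO_SKB_OVERHEAD 512u

struct demo_vvs {
	uint32_t buf_alloc;	/* receive buffer size advertised to the peer */
	uint32_t buf_used;	/* payload bytes currently queued */
	uint32_t queue_len;	/* number of queued packets */
};

/* Accept a packet only if payload plus estimated metadata fits. */
static bool demo_inc_rx_pkt(struct demo_vvs *vvs, uint32_t len)
{
	uint64_t overhead = ((uint64_t)vvs->queue_len + 1) * DEMO_SKB_OVERHEAD;

	if (overhead + vvs->buf_used + len > vvs->buf_alloc)
		return false;

	vvs->buf_used += len;
	vvs->queue_len++;
	return true;
}

/* Count how many zero-length packets get queued before rejection. */
static uint32_t demo_flood_zero_len(struct demo_vvs *vvs)
{
	uint32_t accepted = 0;

	while (demo_inc_rx_pkt(vvs, 0))
		accepted++;
	return accepted;
}
```

In this model a flood of zero-length packets is bounded by roughly
buf_alloc / DEMO_SKB_OVERHEAD entries, but the peer still sees the full
buf_alloc as available credit, which is the mismatch discussed above.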





^ permalink raw reply

* Re: [PATCH net-next v5 3/5] veth: implement Byte Queue Limits (BQL) for latency reduction
From: Simon Schippers @ 2026-05-07 14:46 UTC (permalink / raw)
  To: Paolo Abeni, hawk, netdev
  Cc: kernel-team, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Alexei Starovoitov, Daniel Borkmann,
	John Fastabend, Stanislav Fomichev, linux-kernel, bpf
In-Reply-To: <8f2f7f2e-6aa2-4e5b-b52d-0025b2525579@redhat.com>



On 5/7/26 16:34, Paolo Abeni wrote:
> On 5/7/26 8:54 AM, Simon Schippers wrote:
>> On 5/5/26 15:21, hawk@kernel.org wrote:
>>> @@ -928,9 +968,13 @@ static int veth_xdp_rcv(struct veth_rq *rq, int budget,
>>>  			}
>>>  		} else {
>>>  			/* ndo_start_xmit */
>>> -			struct sk_buff *skb = ptr;
>>> +			bool bql_charged = veth_ptr_is_bql(ptr);
>>> +			struct sk_buff *skb = veth_ptr_to_skb(ptr);
>>>  
>>>  			stats->xdp_bytes += skb->len;
>>> +			if (peer_txq && bql_charged)
>>> +				netdev_tx_completed_queue(peer_txq, 1, VETH_BQL_UNIT);
>>
>> In the discussion with Jonas [1], I left a comment explaining why I think
>> this doesn’t work.
>>
>> I still think that first adding an option to modify the hard-coded
>> VETH_RING_SIZE is the way to go.
>>
>> Thanks!
>>
>> [1] Link: https://lore.kernel.org/netdev/e8cdba04-aa9a-45c6-9807-8274b62920df@tu-dortmund.de/
> In the above discussion a 20% regression is reported, which IMHO can't
> be ignored. Still the tput figures in the data are extremely low,
> something is possibly off?!? I would expect a few Mpps with pktgen on
> top of veth, while the reported data is ~20-30Kpps.
> 
> /P
> 

The ~20-30Kpps occur when thousands of iptables rules are applied and
a UDP userspace application is sending.

And there is a 20% pktgen regression (no iptables rules applied).

I am pretty sure the reason is that the BQL limit is stuck at 2
packets (because the completed queue is always called with 1 packet
and not in an interrupt/timer with multiple packets...).


^ permalink raw reply

* Re: [PATCH net-next v2 4/4] net: phy: Introduce Airoha AN8801/R Gigabit Ethernet PHY driver
From: Louis-Alexis Eyraud @ 2026-05-07 14:52 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	AngeloGioacchino Del Regno, Heiner Kallweit, Russell King,
	kevin-kw.huang, macpaul.lin, matthias.bgg, kernel, netdev,
	devicetree, linux-arm-kernel, linux-mediatek, linux-kernel
In-Reply-To: <3688a285-7f98-4afa-80ad-697094cd7b97@lunn.ch>

Hi Andrew,

On Thu, 2026-03-26 at 13:47 +0100, Andrew Lunn wrote:
> > +static int an8801r_led_blink_set(struct phy_device *phydev, u8 index,
> > +				 unsigned long *delay_on,
> > +				 unsigned long *delay_off)
> > +{
> 
> ...
> 
> > +	ret = phy_modify_mmd(phydev, MDIO_MMD_VEND2, LED_ON_CTRL(index),
> > +			     LED_ON_EN, blink ? LED_ON_EN : 0);
> > +	if (ret)
> > +		return ret;
> > +
> > +	return 0;
> 
> Just
> 
> 
> 	return phy_modify_mmd(phydev, MDIO_MMD_VEND2, LED_ON_CTRL(index),
> 			      LED_ON_EN, blink ? LED_ON_EN : 0);
> 
> > +		if (!led_trigger)
> > +			continue;
> > +
> > +		ret = an8801r_led_hw_control_set(phydev, led_id, led_trigger);
> > +		if (ret)
> > +			return ret;
> > +	}
> > +	return 0;
> > +}
> 
> 
> Please take a look at all your functions. Can the last error check be
> removed and just use return ret, etc.
I'll fix this in the next version.

> 
> > +static int an8801r_of_init_leds(struct phy_device *phydev, u8 *led_cfg)
> > +{
> > +	struct device *dev = &phydev->mdio.dev;
> > +	struct device_node *np = dev->of_node;
> > +	struct device_node *leds;
> > +	u32 function_enum_idx;
> > +	int ret;
> > +
> > +	if (!np)
> > +		return 0;
> > +
> > +	/* If devicetree is present, leds configuration is required */
> > +	leds = of_get_child_by_name(np, "leds");
> > +	if (!leds)
> > +		return 0;
> > +
> > +	for_each_available_child_of_node_scoped(leds, led) {
> > +		u32 led_idx;
> > +
> > +		ret = of_property_read_u32(led, "reg", &led_idx);
> > +		if (ret)
> > +			goto out;
> > +
> > +		if (led_idx >= AN8801R_NUM_LEDS) {
> > +			ret = -EINVAL;
> > +			goto out;
> > +		}
> > +
> > +		ret = of_property_read_u32(led, "function-enumerator",
> > +					   &function_enum_idx);
> > +		if (ret)
> > +			function_enum_idx = AN8801R_LED_FN_NONE;
> > +
> 
> What is this doing? Is this documented in the binding?
The `function-enumerator` property is only documented in the led common
dt-binding file. The an8801 dt-bindings inherits this property from the
ethernet-phy dt-bindings.

We aimed to have this PHY have its led behaviour (how many to enable
and what their role shall be) configurable using devicetree and not to
rely on a default configuration, hard-coded in the driver (like the
air_en8811h driver did) and also make use of the led hardware
offloading (for functions like 100/1000, activity blinking, and others)
that this PHY is capable of.

From the available property list for the led node, this one seems to be
appropriate to distinguish between the possible LAN functions, meaning
that a specific LED has either a link or an RX/TX activity role. That is
why we used it, but we could be wrong.

The an8801 dt-bindings (in patch 1) are missing the possible values and
should be improved in that regard. I'll fix them in the next version if
this implementation seems acceptable to you.
> 
> > +		if (function_enum_idx >= AN8801R_LED_FN_MAX) {
> > +			ret = -EINVAL;
> > +			goto out;
> > +		}
> > +
> > +		led_cfg[led_idx] = function_enum_idx;
> > +	}
> > +out:
> > +	of_node_put(leds);
> > +	return ret;
> > +}
> 
> > +static int an8801r_read_status(struct phy_device *phydev)
> > +{
> > +	int prev_speed, ret;
> > +	u32 val;
> > +
> > +	prev_speed = phydev->speed;
> > +
> > +	ret = genphy_read_status(phydev);
> > +	if (ret)
> > +		return ret;
> > +
> > +	if (phydev->link && prev_speed != phydev->speed) {
> > +		val = phydev->speed == SPEED_1000 ?
> > +		      AN8801_BPBUS_LINK_MODE_1000 : 0;
> > +
> > +		return an8801_buckpbus_reg_rmw(phydev,
> > +					       AN8801_BPBUS_REG_LINK_MODE,
> > +					       AN8801_BPBUS_LINK_MODE_1000,
> > +					       val);
> > +	};
> 
> This is unusual. What is it doing? Please add a comment.
This call ensures that the PHY switches to the expected 1 Gbps speed
when available.
I'll confirm it and add a comment in v3.

Best regards,
Louis-Alexis
> 
> 	Andrew

^ permalink raw reply

* Re: [PATCH net-next v3 0/3] ipv4: Flush the FIB once on multiple nexthop removal
From: David Ahern @ 2026-05-07 14:57 UTC (permalink / raw)
  To: Cosmin Ratiu, netdev
  Cc: Ido Schimmel, Kuniyuki Iwashima, David S . Miller, Eric Dumazet,
	Jakub Kicinski, Simon Horman, Paolo Abeni
In-Reply-To: <20260507075606.322405-1-cratiu@nvidia.com>

On 5/7/26 1:56 AM, Cosmin Ratiu wrote:
> This series optimizes multiple nexthop removal performance from having
> to do a FIB flush for each nexthop being removed to only doing a single
> FIB flush after all nexthops are removed.
> 
> This dramatically improves performance in scenarios where there are
> many nexthops and many ipv4 routes. Please see individual patches for
> more details and for a test scenario.
> 
> V2 -> V3: https://lore.kernel.org/netdev/8fea4084-c9ec-472a-b8ab-ecc87e537216@kernel.org/T/#t
> - Split the patch into 3 (Ido Schimmel, David Ahern)
> - Used WARN_ON_ONCE instead of WARN_ON (Ido Schimmel)
> 
> V1 -> V2:
> - Fixes xmas tree in a couple places (Kuniyuki Iwashima)
> - Added __must_check to remove_nexthop_from_groups() (Kuniyuki Iwashima)
> 
> Cosmin Ratiu (3):
>   ipv4: Provide a FIB flushing signal from nexthop removal functions
>   ipv4: Flush the FIB once on multiple nexthop removal
>   ipv4: Add __must_check to nexthop removal functions
> 
>  net/ipv4/nexthop.c | 88 +++++++++++++++++++++++++++++-----------------
>  1 file changed, 56 insertions(+), 32 deletions(-)
> 

Much easier to follow. Thank you.

For the set:
Reviewed-by: David Ahern <dsahern@kernel.org>
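
The structure described in the cover letter (removal functions report
whether a flush is needed, and the caller flushes once at the end) can be
sketched in userspace as below. All demo_* names are hypothetical, the
"in use" flag is an invented stand-in for "routes still reference this
nexthop", and warn_unused_result plays the role of the kernel's
__must_check; this is not the code in net/ipv4/nexthop.c.

```c
#include <stdbool.h>
#include <stddef.h>

/* Counts how often the (expensive) flush ran, for demonstration only. */
static unsigned int demo_flush_count;

static void demo_fib_flush(void)
{
	demo_flush_count++;
}

/* Returns true when the caller must flush the FIB afterwards.
 * The attribute makes ignoring that signal a compiler warning.
 */
__attribute__((warn_unused_result))
static bool demo_remove_nexthop(bool *in_use)
{
	bool need_flush = *in_use;	/* model: only used nexthops need a flush */

	*in_use = false;
	return need_flush;
}

/* Remove a batch of nexthops and flush at most once at the end. */
static void demo_remove_many(bool *in_use, size_t n)
{
	bool need_flush = false;

	for (size_t i = 0; i < n; i++)
		need_flush |= demo_remove_nexthop(&in_use[i]);

	if (need_flush)
		demo_fib_flush();
}
```

The cost of the flush is then paid once per batch rather than once per
nexthop, which is where the speedup with many nexthops and many routes
would come from.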


^ permalink raw reply

* Re: [PATCH v2 0/2] mfd: rsmu: fixes and new IC support
From: Lee Jones @ 2026-05-07 15:02 UTC (permalink / raw)
  To: Lee Jones, Richard Cochran, Min Li, Matthew Bystrin; +Cc: linux-kernel, netdev
In-Reply-To: <20260429072047.1111427-1-dev.mbstr@gmail.com>

On Wed, 29 Apr 2026 10:20:45 +0300, Matthew Bystrin wrote:
> First patch fixes Renesas 8A34002 SPI driver.
> 
> In my setup 8A34002 is connected to VisionFive2 (via SPI or I2C). I've
> discovered that upstream driver does not work:
> 
> [    4.728771] 8a3400x-phc 8a3400x-phc.0.auto: 4.8.7, Id: 0x4002  HW Rev: 5  OTP Config Select: 0
> [    4.737389] 8a3400x-phc 8a3400x-phc.0.auto: requesting firmware 'idtcm.bin'
> [    4.744462] 8a3400x-phc 8a3400x-phc.0.auto: Direct firmware load for idtcm.bin failed with error -2
> [    4.753547] 8a3400x-phc 8a3400x-phc.0.auto: Failed at line 1273 in idtcm_load_firmware!
> [    4.761576] 8a3400x-phc 8a3400x-phc.0.auto: loading firmware failed with -2
> [    4.769411] 8a3400x-phc 8a3400x-phc.0.auto: No wait state: DPLL_SYS_STATE 0
> [    4.776374] 8a3400x-phc 8a3400x-phc.0.auto: Continuing while SYS APLL/DPLL is not locked
> [    4.785206] 8a3400x-phc 8a3400x-phc.0.auto: Unsupported MANUAL_REFERENCE: 0x00
> [    4.796930] 8a3400x-phc 8a3400x-phc.0.auto: PLL2 registered as ptp0
> 
> [...]

Applied, thanks!

[1/2] mfd: rsmu: fix page register setup
      commit: d65ff90be2c1a09e9e477e6f7493fae21b31e59c
[2/2] mfd: rsmu: add 8a34002 support
      commit: c8e94555a15fbe0925a9cc424118ca10e3b5531a

--
Lee Jones [李琼斯]


^ permalink raw reply

* [PATCH net v2] qed: fix division by zero in qed_init_wfq_param when all vports are configured
From: Evgenii Burenchev @ 2026-05-07 14:55 UTC (permalink / raw)
  To: stable, Greg Kroah-Hartman
  Cc: Evgenii Burenchev, andrew+netdev, davem, edumazet, kuba, pabeni,
	kees, horms, bhelgaas, darinzon, Yuval.Mintz, manish.chopra,
	netdev, linux-kernel, lvc-project

In qed_init_wfq_param(), variable non_requested_count can become zero
when the number of vports with the configured flag set (including the
current vport being configured) equals total num_vports. This happens
when configuring the last unconfigured vport or when re-configuring
an already configured vport.

The function then calculates left_rate_per_vp = total_left_rate /
non_requested_count, which causes division by zero.

Fix this by skipping the division when non_requested_count is zero.
In that case, there is no remaining bandwidth to distribute, so just
record the configuration for the current vport and return success.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: bcd197c81f63 ("qed: Add vport WFQ configuration APIs")
Signed-off-by: Evgenii Burenchev <evg28bur@yandex.ru>
---
Changes in v2:
- Return success instead of -EINVAL when non_requested_count is zero
- Add Fixes tag
- Clarify commit message: explain both scenarios that lead to non_requested_count == 0
---
 drivers/net/ethernet/qlogic/qed/qed_dev.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_dev.c b/drivers/net/ethernet/qlogic/qed/qed_dev.c
index 42c6dcfb1f0f..dd75c47758e1 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_dev.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_dev.c
@@ -5103,6 +5103,13 @@ static int qed_init_wfq_param(struct qed_hwfn *p_hwfn,
 		return -EINVAL;
 	}
 
+	/* All vports are already or become configured, nothing to distribute */
+	if (non_requested_count == 0) {
+		p_hwfn->qm_info.wfq_data[vport_id].min_speed = req_rate;
+		p_hwfn->qm_info.wfq_data[vport_id].configured = true;
+		return 0;
+	}
+
 	total_left_rate	= min_pf_rate - total_req_min_rate;
 
 	left_rate_per_vp = total_left_rate / non_requested_count;
-- 
2.43.0
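
A simplified userspace model of the guard added by the patch above. It is
only a sketch: struct demo_wfq and demo_init_wfq() are hypothetical, and
the way the requested total and non_requested are computed here is an
assumption made for illustration, not a copy of qed_init_wfq_param().

```c
#include <errno.h>
#include <stdint.h>

struct demo_wfq {
	uint32_t min_speed;
	int configured;
};

/* Record req_rate for vport_id and share the remainder of min_pf_rate
 * equally among the vports that are still unconfigured.
 */
static int demo_init_wfq(struct demo_wfq *wfq, uint32_t num_vports,
			 uint32_t vport_id, uint32_t req_rate,
			 uint32_t min_pf_rate)
{
	uint32_t total_req = req_rate, non_requested, left_per_vp, i;
	uint32_t req_count = 1;	/* the vport being configured */

	for (i = 0; i < num_vports; i++) {
		if (i != vport_id && wfq[i].configured) {
			req_count++;
			total_req += wfq[i].min_speed;
		}
	}
	non_requested = num_vports - req_count;

	if (total_req > min_pf_rate)
		return -EINVAL;

	wfq[vport_id].min_speed = req_rate;
	wfq[vport_id].configured = 1;

	/* Every vport is configured: nothing left to share, and dividing
	 * by non_requested would be a division by zero.
	 */
	if (non_requested == 0)
		return 0;

	left_per_vp = (min_pf_rate - total_req) / non_requested;
	for (i = 0; i < num_vports; i++)
		if (!wfq[i].configured)
			wfq[i].min_speed = left_per_vp;
	return 0;
}
```

Both cases named in the commit message (configuring the last unconfigured
vport, and re-configuring an already configured one when all others are
configured) reach the early return in this model.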


^ permalink raw reply related

