Netdev List


* Re: [PATCH v2 net-next 00/15] ip6mr: No RTNL for RTNL_FAMILY_IP6MR rtnetlink.
From: Kuniyuki Iwashima @ 2026-04-12 20:50 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: David S . Miller, David Ahern, Eric Dumazet, Paolo Abeni,
	Simon Horman, Kuniyuki Iwashima, netdev
In-Reply-To: <20260412075856.68f37eb6@kernel.org>

On Sun, Apr 12, 2026 at 7:58 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Fri, 10 Apr 2026 21:16:56 +0000 Kuniyuki Iwashima wrote:
> > This series is the IPv6 version of
> >
> >   https://lore.kernel.org/netdev/20260228221800.1082070-1-kuniyu@google.com/
> >
> > and removes RTNL from ip6mr rtnetlink handlers.
> >
> > After this series, there are a few RTNL left in net/ipv6/ip6mr.c
> > and such users will be converted to per-netns RTNL in another
> > series.
> >
> > Patch 1 extends the ipmr selftest to exercise most of the RTNL
> >  paths in net/ipv6/ip6mr.c
> >
> > Patch 2 - 6 converts RTM_GETROUTE handlers to RCU.
> >
> > Patch 7 removes struct fib_dump_filter.rtnl_held.
> >
> > Patch 8 - 9 use RCU for mr_table for CONFIG_IP_MROUTE_MULTIPLE_TABLES=n
> >  and CONFIG_IPV6_MROUTE_MULTIPLE_TABLES=n for ->exit_rtnl().
> >
> > Patch 10 - 12 converts ->exit_batch() to ->exit_rtnl() to
> >  save one RTNL in cleanup_net().
> >
> > Patch 13 - 14 removes unnecessary RTNL during setup_net()
> >  failure.
> >
> > Patch 15 drops RTNL for MRT6_(ADD|DEL)_MFC(_PROXY)?.
>
> Hitting a bunch of:
>
>   SKIP      no netlink MFC interface
>
> on the new test here. Do we need to add something to .../config ?

No, I used SKIP() intentionally because only IPv4 has the MFC
netlink interface and IPv6 does not have the corresponding one.

Should I just return 0 in this case instead of SKIP() ?

^ permalink raw reply

* Re: [PATCH net v3] ppp: require CAP_NET_ADMIN in target netns for unattached ioctls
From: patchwork-bot+netdevbpf @ 2026-04-12 20:50 UTC (permalink / raw)
  To: 하태구 <hataegu0826@gmail.com>
  Cc: andrew+netdev, davem, edumazet, kuba, pabeni, dqfext, kees,
	kuniyu, bigeasy, gorcunov, linux-ppp, netdev, linux-kernel,
	qingfang.deng, gnault, jaco, richardbgobert, ericwouds,
	teknoraver
In-Reply-To: <20260409071117.4354-1-hataegu0826@gmail.com>

Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Thu,  9 Apr 2026 16:11:15 +0900 you wrote:
> /dev/ppp open is currently authorized against file->f_cred->user_ns,
> while unattached administrative ioctls operate on current->nsproxy->net_ns.
> 
> As a result, a local unprivileged user can create a new user namespace
> with CLONE_NEWUSER, gain CAP_NET_ADMIN only in that new user namespace,
> and still issue PPPIOCNEWUNIT, PPPIOCATTACH, or PPPIOCATTCHAN against
> an inherited network namespace.
> 
> [...]

Here is the summary with links:
  - [net,v3] ppp: require CAP_NET_ADMIN in target netns for unattached ioctls
    https://git.kernel.org/netdev/net/c/2bb6379416fd

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply

* Re: [PATCH net v2 0/2] net/rds: Fix use-after-free in RDS/IB for non-init namespaces
From: patchwork-bot+netdevbpf @ 2026-04-12 20:50 UTC (permalink / raw)
  To: Allison Henderson
  Cc: netdev, pabeni, edumazet, rds-devel, kuba, horms, linux-rdma
In-Reply-To: <20260408080420.540032-1-achender@kernel.org>

Hello:

This series was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Wed,  8 Apr 2026 01:04:18 -0700 you wrote:
> This series fixes syzbot bug da8e060735ae02c8f3d1
> https://syzkaller.appspot.com/bug?extid=da8e060735ae02c8f3d1
> 
> The report finds a use-after-free bug where ib connections access an
> invalid network namespace after it has been freed.  The stack is:
> 
>     rds_rdma_cm_event_handler_cmn
>       rds_conn_path_drop
>         rds_destroy_pending
>           check_net()  <-- use-after-free
> 
> [...]

Here is the summary with links:
  - [net,v2,1/2] net/rds: Optimize rds_ib_laddr_check
    https://git.kernel.org/netdev/net/c/236f718ac885
  - [net,v2,2/2] net/rds: Restrict use of RDS/IB to the initial network namespace
    https://git.kernel.org/netdev/net/c/ebf71dd4aff4

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply

* Re: [PATCH net-next v5 1/2] net: hsr: require valid EOT supervision TLV
From: Luka Gejak @ 2026-04-12 20:46 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, edumazet, pabeni, netdev, fmaurer, horms, luka.gejak
In-Reply-To: <20260412133157.3b335e1b@kernel.org>

On April 12, 2026 10:31:57 PM GMT+02:00, Jakub Kicinski <kuba@kernel.org> wrote:
>On Sun, 12 Apr 2026 22:13:35 +0200 Luka Gejak wrote:
>> Regarding the TLV loop: I actually implemented a TLV walker in v4 [1] 
>> for this exact reason, but I moved to strict sequential parsing in v5 
>> based on reviewer's feedback to keep the implementation simple. Could 
>> you please check if the approach used in v4 is what you had in mind? 
>> If so, I will rebase that logic onto the memory safety fixes 
>> (pskb_may_pull) from v5 and submit it as v6.
>
>That's not really what I had in mind. I was thinking of a loop which
>just skips the TLVs in order, leaving the parsing of known TLVs as is.
>But I've never used HSR; maybe this sort of strict validation is somehow
>okay in HSR deployments.
>
>Please just undo the comment tweaks then.

So keep other changes as is and only undo comment changes?

^ permalink raw reply

* Re: [PATCH net-next v18 00/15] Begin upstreaming Homa transport protocol
From: Jakub Kicinski @ 2026-04-12 20:45 UTC (permalink / raw)
  To: John Ousterhout; +Cc: netdev, pabeni, edumazet, horms
In-Reply-To: <20260410200310.1915-1-ouster@cs.stanford.edu>

On Fri, 10 Apr 2026 13:02:54 -0700 John Ousterhout wrote:
> This patch series begins the process of upstreaming the Homa transport
> protocol. Homa is an alternative to TCP for use in datacenter
> environments. It provides 10-100x reductions in tail latency for short
> messages relative to TCP. Its benefits are greatest for mixed workloads
> containing both short and long messages running under high network loads.
> Homa is not API-compatible with TCP: it is connectionless and message-
> oriented (but still reliable and flow-controlled). Homa's new API not
> only contributes to its performance gains, but it also eliminates the
> massive amount of connection state required by TCP for highly connected
> datacenter workloads (Homa uses ~ 1 socket per application, whereas
> TCP requires a separate socket for each peer).

make coccicheck says:

net/homa/homa_peer.c:213:21-22: WARNING opportunity for swap()
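
For context, this coccicheck warning usually points at an open-coded
exchange of two values through a temporary. The sketch below is a
hypothetical userspace illustration, not the code at homa_peer.c:213.
The swap() macro defined here is only a stand-in for the kernel's
swap() helper so that the example compiles outside the tree, and both
sort functions are made-up names.

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's swap() helper (an assumption made
 * so this compiles outside the kernel tree; relies on GNU __typeof__).
 */
#define swap(a, b) \
	do { __typeof__(a) __tmp = (a); (a) = (b); (b) = __tmp; } while (0)

/* Hypothetical example: the three-statement exchange through "tmp" is
 * the pattern the warning flags.
 */
static void sort_desc_open_coded(int *v, size_t n)
{
	for (size_t i = 0; i + 1 < n; i++)
		for (size_t j = 0; j + 1 < n - i; j++)
			if (v[j] < v[j + 1]) {
				int tmp = v[j];

				v[j] = v[j + 1];
				v[j + 1] = tmp;
			}
}

/* Same bubble sort with the exchange expressed through swap(). */
static void sort_desc_swap(int *v, size_t n)
{
	for (size_t i = 0; i + 1 < n; i++)
		for (size_t j = 0; j + 1 < n - i; j++)
			if (v[j] < v[j + 1])
				swap(v[j], v[j + 1]);
}
```

Both functions behave identically; the second form is simply shorter
and is what the warning suggests.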

^ permalink raw reply

* Re: [PATCH v2] nfc: hci: fix OOB heap read on short HCP frames
From: Jakub Kicinski @ 2026-04-12 20:42 UTC (permalink / raw)
  To: Ashutosh Desai; +Cc: netdev, Eric Dumazet, davem, pabeni, horms, linux-kernel
In-Reply-To: <20260409150825.2217133-1-ashutoshdesai993@gmail.com>

On Thu,  9 Apr 2026 15:08:25 +0000 Ashutosh Desai wrote:
> Suggested-by: Eric Dumazet <edumazet@google.com>

As Eric mentioned elsewhere - he did not suggest any of this,
merely reviewed your submission.

> +++ b/net/nfc/hci/core.c
> @@ -134,6 +134,10 @@ static void nfc_hci_msg_rx_work(struct work_struct *work)
>  	u8 instruction;
>  
>  	while ((skb = skb_dequeue(&hdev->msg_rx_queue)) != NULL) {
> +		if (!pskb_may_pull(skb, NFC_HCI_HCP_HEADER_LEN)) {
> +			kfree_skb(skb);
> +			continue;

How did a broken packet get enqueued in the first place?

^ permalink raw reply

* RE: [PATCH net] net: ethernet: ravb: Do not check URAM suspension when WoL is active
From: Sai Krishna Gajula @ 2026-04-12 20:38 UTC (permalink / raw)
  To: Niklas Söderlund, Paul Barker, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Yoshihiro Shimoda,
	Geert Uytterhoeven, netdev@vger.kernel.org,
	linux-renesas-soc@vger.kernel.org
In-Reply-To: <20260412173213.3179426-1-niklas.soderlund+renesas@ragnatech.se>

> -----Original Message-----
> From: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
> Sent: Sunday, April 12, 2026 11:02 PM
> To: Paul Barker <paul@pbarker.dev>; Andrew Lunn
> <andrew+netdev@lunn.ch>; David S. Miller <davem@davemloft.net>; Eric
> Dumazet <edumazet@google.com>; Jakub Kicinski <kuba@kernel.org>; Paolo
> Abeni <pabeni@redhat.com>; Yoshihiro Shimoda
> <yoshihiro.shimoda.uh@renesas.com>; Geert Uytterhoeven <geert@linux-
> m68k.org>; netdev@vger.kernel.org; linux-renesas-soc@vger.kernel.org
> Cc: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
> Subject: [PATCH net] net: ethernet: ravb: Do not check URAM
> suspension when WoL is active
> 
> When updating the driver to match the latest datasheet to suspend access
> to URAM when suspending DMA transfers, a corner case was missed: URAM
> access will not be suspended if WoL is enabled. This led to the error
> message (correctly) being triggered as URAM access is not suspended
> even though it's requested as part of stopping DMA.
> 
> Avoid checking if URAM access is suspended and printing the error message if
> WoL is enabled when we suspend the system, as we know it will not be.
> 
> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
> Closes: https://lore.kernel.org/all/CAMuHMdWnjV%3DHGE1o08zLhUfTgOSene5fYx1J5GG10mB%2BToq8qg@mail.gmail.com/
> Fixes: 353d8e7989b6 ("net: ethernet: ravb: Suspend and resume the
> transmission flow")
> Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
> ---
>  drivers/net/ethernet/renesas/ravb_main.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ethernet/renesas/ravb_main.c
> b/drivers/net/ethernet/renesas/ravb_main.c
> index 1dbfadb2a881..5f88733094d0 100644
> --- a/drivers/net/ethernet/renesas/ravb_main.c
> +++ b/drivers/net/ethernet/renesas/ravb_main.c
> @@ -1108,9 +1108,12 @@ static int ravb_stop_dma(struct net_device *ndev)
> 
>  	/* Request for transmission suspension */
>  	ravb_modify(ndev, CCC, CCC_DTSR, CCC_DTSR);
> -	error = ravb_wait(ndev, CSR, CSR_DTS, CSR_DTS);
> -	if (error)
> -		netdev_err(ndev, "failed to stop AXI BUS\n");
> +	/* Access to URAM will not be suspended if WoL is enabled. */
> +	if (!priv->wol_enabled) {
> +		error = ravb_wait(ndev, CSR, CSR_DTS, CSR_DTS);
> +		if (error)
> +			netdev_err(ndev, "failed to stop AXI BUS\n");
> +	}
> 
>  	/* Stop AVB-DMAC process */
>  	return ravb_set_opmode(ndev, CCC_OPC_CONFIG);
> --
> 2.53.0
> 
Reviewed-by: Sai Krishna <saikrishnag@marvell.com>

^ permalink raw reply

* Re: [PATCH net-next v5 1/2] net: hsr: require valid EOT supervision TLV
From: Jakub Kicinski @ 2026-04-12 20:31 UTC (permalink / raw)
  To: Luka Gejak; +Cc: davem, edumazet, pabeni, netdev, fmaurer, horms
In-Reply-To: <DDF9CC02-6FC1-44F0-B95D-967151BF0592@linux.dev>

On Sun, 12 Apr 2026 22:13:35 +0200 Luka Gejak wrote:
> Regarding the TLV loop: I actually implemented a TLV walker in v4 [1] 
> for this exact reason, but I moved to strict sequential parsing in v5 
> based on reviewer's feedback to keep the implementation simple. Could 
> you please check if the approach used in v4 is what you had in mind? 
> If so, I will rebase that logic onto the memory safety fixes 
> (pskb_may_pull) from v5 and submit it as v6.

That's not really what I had in mind. I was thinking of a loop which
just skips the TLVs in order, leaving the parsing of known TLVs as is.
But I've never used HSR; maybe this sort of strict validation is somehow
okay in HSR deployments.

Please just undo the comment tweaks then.
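
A loop that skips TLVs in order until EOT could look roughly like the
userspace sketch below. It is not the v4 walker and not proposed patch
code: it works on a plain byte buffer instead of an skb, the function
name is made up, and the one-byte type / one-byte length header and the
EOT value of 0 are assumptions for illustration. In the kernel each
bounds check would presumably be a pskb_may_pull() call instead.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TLV_EOT		0	/* assumed value of the end-of-TLVs type */
#define TLV_HDR_LEN	2	/* assumed header: 1 byte type, 1 byte length */

/* Return true if buf holds a chain of TLVs terminated by a zero-length
 * EOT TLV. Unknown TLV types are skipped rather than rejected.
 */
static bool tlv_chain_ends_with_eot(const uint8_t *buf, size_t len)
{
	size_t off = 0;		/* invariant: off <= len */

	while (len - off >= TLV_HDR_LEN) {
		uint8_t type = buf[off];
		uint8_t tlen = buf[off + 1];

		if (type == TLV_EOT)
			return tlen == 0;

		off += TLV_HDR_LEN;
		if (tlen > len - off)
			return false;	/* value runs past the buffer */
		off += tlen;
	}

	return false;	/* ran out of data before reaching EOT */
}
```

Parsing of the known TLVs would stay as is; a loop like this would only
replace the final "must be EOT" check.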

^ permalink raw reply

* Re: [PATCH v2 net 0/2] net: hamradio: fix missing input validation in bpqether and scc
From: patchwork-bot+netdevbpf @ 2026-04-12 20:30 UTC (permalink / raw)
  To: Mashiro Chen
  Cc: netdev, andrew+netdev, davem, edumazet, kuba, pabeni, jreuter,
	linux-hams, linux-kernel
In-Reply-To: <20260409024927.24397-1-mashiro.chen@mailbox.org>

Hello:

This series was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Thu,  9 Apr 2026 10:49:25 +0800 you wrote:
> This series fixes two missing input validation bugs in the hamradio
> drivers. Both patches were reviewed by Joerg Reuter (hamradio
> maintainer).
> 
> v2 changes:
> - bpqether: no code change; add Acked-by: Joerg Reuter
> - scc: drop the upper bound of 4096 per reviewer feedback;
>   only enforce the minimum of 16
> 
> [...]

Here is the summary with links:
  - [v2,net,1/2] net: hamradio: bpqether: validate frame length in bpq_rcv()
    https://git.kernel.org/netdev/net/c/6183bd8723a3
  - [v2,net,2/2] net: hamradio: scc: validate bufsize in SIOCSCCSMEM ioctl
    https://git.kernel.org/netdev/net/c/8263e484d662

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply

* Re: [PATCH net-next 0/5] net: enetc: improve statistics for v1 and add statistics for v4
From: patchwork-bot+netdevbpf @ 2026-04-12 20:20 UTC (permalink / raw)
  To: Wei Fang
  Cc: claudiu.manoil, vladimir.oltean, xiaoning.wang, andrew+netdev,
	davem, edumazet, kuba, pabeni, netdev, linux-kernel, imx
In-Reply-To: <20260408055849.1314033-1-wei.fang@nxp.com>

Hello:

This series was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Wed,  8 Apr 2026 13:58:44 +0800 you wrote:
> For ENETC v1, some standardized statistics were redundantly included in
> the unstructured statistics, so remove these duplicated entries.
> Previously, the unstructured statistics only contained eMAC data and
> did not include pMAC data; add pMAC statistics to ensure completeness.
> 
> For ENETC v4, the driver previously reported MAC statistics only for the
> internal ENETC (Pseudo MAC). Extend the implementation to provide
> additional statistics for both the internal ENETC and the standalone
> ENETC.
> 
> [...]

Here is the summary with links:
  - [net-next,1/5] net: enetc: add support for the standardized counters
    https://git.kernel.org/netdev/net-next/c/c6c223fd06ed
  - [net-next,2/5] net: enetc: show RX drop counters only for assigned RX rings
    https://git.kernel.org/netdev/net-next/c/c571d309d4cf
  - [net-next,3/5] net: enetc: remove standardized counters from enetc_pm_counters
    https://git.kernel.org/netdev/net-next/c/6d78c37a73e0
  - [net-next,4/5] net: enetc: add unstructured pMAC counters for ENETC v1
    https://git.kernel.org/netdev/net-next/c/dbc30b154e33
  - [net-next,5/5] net: enetc: add unstructured counters for ENETC v4
    https://git.kernel.org/netdev/net-next/c/98a4f3d34132

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply

* Re: [PATCH net] net: rose: reject truncated CLEAR_REQUEST frames in state machines
From: patchwork-bot+netdevbpf @ 2026-04-12 20:20 UTC (permalink / raw)
  To: Mashiro Chen
  Cc: netdev, davem, edumazet, kuba, pabeni, horms, linux-hams,
	linux-kernel, stable
In-Reply-To: <20260408172551.281486-1-mashiro.chen@mailbox.org>

Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Thu,  9 Apr 2026 01:25:51 +0800 you wrote:
> All five ROSE state machines (states 1-5) handle ROSE_CLEAR_REQUEST
> by reading the cause and diagnostic bytes directly from skb->data[3]
> and skb->data[4] without verifying that the frame is long enough:
> 
>   rose_disconnect(sk, ..., skb->data[3], skb->data[4]);
> 
> The entry-point check in rose_route_frame() only enforces
> ROSE_MIN_LEN (3 bytes), so a remote peer on a ROSE network can
> send a syntactically valid but truncated CLEAR_REQUEST (3 or 4
> bytes) while a connection is open in any state.  Processing such a
> frame causes a one- or two-byte out-of-bounds read past the skb
> data, leaking uninitialized heap content as the cause/diagnostic
> values returned to user space via getsockopt(ROSE_GETCAUSE).
> 
> [...]
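
The class of bug described above can be illustrated with a small
userspace sketch: a frame that passes a 3-byte minimum check is still
too short to read bytes 3 and 4. This is not the applied patch. The
function, the struct and the CLEAR_REQUEST_MIN_LEN name are hypothetical;
only the 3-byte ROSE_MIN_LEN and the data[3]/data[4] offsets come from
the commit message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ROSE_MIN_LEN		3	/* entry-point minimum, per the commit message */
#define CLEAR_REQUEST_MIN_LEN	5	/* hypothetical name: covers data[3] and data[4] */

struct clear_info {
	uint8_t cause;
	uint8_t diagnostic;
};

/* Extract cause and diagnostic only if the frame is long enough to
 * contain them; a 3- or 4-byte frame is rejected instead of read.
 */
static bool parse_clear_request(const uint8_t *data, size_t len,
				struct clear_info *out)
{
	if (len < CLEAR_REQUEST_MIN_LEN)
		return false;

	out->cause = data[3];
	out->diagnostic = data[4];
	return true;
}
```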

Here is the summary with links:
  - [net] net: rose: reject truncated CLEAR_REQUEST frames in state machines
    https://git.kernel.org/netdev/net/c/2835750dd647

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply

* Re: [PATCH v2 net] net: ax25: fix integer overflow in ax25_rx_fragment()
From: Jakub Kicinski @ 2026-04-12 20:17 UTC (permalink / raw)
  To: Mashiro Chen
  Cc: netdev, davem, edumazet, pabeni, horms, jreuter, linux-hams,
	linux-kernel, stable
In-Reply-To: <20260409025026.24575-1-mashiro.chen@mailbox.org>

On Thu,  9 Apr 2026 10:50:26 +0800 Mashiro Chen wrote:
> Fix mirrors the identical bug fixed in NET/ROM (nr_in.c): check for
> overflow before adding skb->len to fraglen, and abort fragment
> reassembly cleanly if the limit would be exceeded.

Same problem as reported by Simon on the netrom patch applies here.

nit: I don't think you need to cast ax25->fraglen to unsigned int
in the comparison, since it's added with skb->len it should get
auto-promoted to unsigned int.
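
To illustrate the check-before-add idea and the promotion point, here
is a userspace sketch. It is not the patch: the function name is made
up, fraglen is modelled as an unsigned short, and USHRT_MAX as the
limit is an assumption for illustration. The comparison is written as a
subtraction so that the check itself cannot wrap.

```c
#include <limits.h>
#include <stdbool.h>

/* Add len to *fraglen only if the result still fits. No explicit cast
 * is needed in the comparison: *fraglen is promoted to int, the
 * subtraction yields a non-negative int, and that value is converted
 * to unsigned int when compared against len.
 */
static bool fraglen_add(unsigned short *fraglen, unsigned int len)
{
	if (len > USHRT_MAX - *fraglen)
		return false;	/* would overflow: caller aborts reassembly */

	*fraglen = (unsigned short)(*fraglen + len);
	return true;
}
```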
-- 
pw-bot: cr

^ permalink raw reply

* Re: [PATCH net-next v5 1/2] net: hsr: require valid EOT supervision TLV
From: Luka Gejak @ 2026-04-12 20:13 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, edumazet, pabeni, netdev, fmaurer, horms, luka.gejak
In-Reply-To: <20260412124558.190c725f@kernel.org>

On April 12, 2026 9:45:58 PM GMT+02:00, Jakub Kicinski <kuba@kernel.org> wrote:
>On Tue,  7 Apr 2026 18:25:01 +0200 luka.gejak@linux.dev wrote:
>> From: Luka Gejak <luka.gejak@linux.dev>
>> 
>> Supervision frames are only valid if terminated with a zero-length EOT
>> TLV. The current check fails to reject non-EOT entries as the terminal
>> TLV, potentially allowing malformed supervision traffic.
>> 
>> Fix this by strictly requiring the terminal TLV to be HSR_TLV_EOT
>> with a length of zero, and properly linearizing the TLV header before
>> access.
>
>> diff --git a/net/hsr/hsr_forward.c b/net/hsr/hsr_forward.c
>> index 0aca859c88cb..eb89cc44eac0 100644
>> --- a/net/hsr/hsr_forward.c
>> +++ b/net/hsr/hsr_forward.c
>> @@ -82,35 +82,33 @@ static bool is_supervision_frame(struct hsr_priv *hsr, struct sk_buff *skb)
>>  	    hsr_sup_tag->tlv.HSR_TLV_length != sizeof(struct hsr_sup_payload))
>>  		return false;
>>  
>> -	/* Get next tlv */
>> +	/* Get next TLV */
>
>The capitalization of the comments makes the real changes in this patch
>harder to spot and follow. Please don't tweak the comments unless you're
>really changing / improving them.
>
>>  	total_length += hsr_sup_tag->tlv.HSR_TLV_length;
>> -	if (!pskb_may_pull(skb, total_length))
>> +	if (!pskb_may_pull(skb, total_length + sizeof(struct hsr_sup_tlv)))
>>  		return false;
>>  	skb_pull(skb, total_length);
>>  	hsr_sup_tlv = (struct hsr_sup_tlv *)skb->data;
>>  	skb_push(skb, total_length);
>>  
>> -	/* if this is a redbox supervision frame we need to verify
>> -	 * that more data is available
>> -	 */
>> +	/* If this is a RedBox supervision frame, verify additional data */
>>  	if (hsr_sup_tlv->HSR_TLV_type == PRP_TLV_REDBOX_MAC) {
>> -		/* tlv length must be a length of a mac address */
>> +		/* TLV length must be the size of a MAC address */
>>  		if (hsr_sup_tlv->HSR_TLV_length != sizeof(struct hsr_sup_payload))
>>  			return false;
>>  
>> -		/* make sure another tlv follows */
>> +		/* Make sure another TLV follows */
>>  		total_length += sizeof(struct hsr_sup_tlv) + hsr_sup_tlv->HSR_TLV_length;
>> -		if (!pskb_may_pull(skb, total_length))
>> +		if (!pskb_may_pull(skb, total_length + sizeof(struct hsr_sup_tlv)))
>>  			return false;
>>  
>> -		/* get next tlv */
>> +		/* Get next TLV */
>>  		skb_pull(skb, total_length);
>>  		hsr_sup_tlv = (struct hsr_sup_tlv *)skb->data;
>>  		skb_push(skb, total_length);
>>  	}
>>  
>> -	/* end of tlvs must follow at the end */
>> -	if (hsr_sup_tlv->HSR_TLV_type == HSR_TLV_EOT &&
>> +	/* Supervision frame must end with EOT TLV */
>> +	if (hsr_sup_tlv->HSR_TLV_type != HSR_TLV_EOT ||
>>  	    hsr_sup_tlv->HSR_TLV_length != 0)
>>  		return false;
>
>Aren't there more optional TLVs that we don't support?
>You mentioned making sure that the final TLV is a zero-length EOT
>but this check is far stricter.
>
>Should we replace this check with a loop which skips over TLVs
>until EOT is reached?

Hi Jakub,
Thank you for the feedback.
I apologize for the noise in the comments. I will revert all cosmetic 
capitalization changes in v6 to keep the diff focused on the logic.
Regarding the TLV loop: I actually implemented a TLV walker in v4 [1] 
for this exact reason, but I moved to strict sequential parsing in v5 
based on reviewer's feedback to keep the implementation simple. Could 
you please check if the approach used in v4 is what you had in mind? 
If so, I will rebase that logic onto the memory safety fixes 
(pskb_may_pull) from v5 and submit it as v6.

[1] https://lore.kernel.org/netdev/20260401092324.52266-2-luka.gejak@linux.dev/

Best regards,
Luka Gejak

^ permalink raw reply

* Re: [PATCH net-next v2] r8169: Use napi_schedule_irqoff()
From: Heiner Kallweit @ 2026-04-12 20:12 UTC (permalink / raw)
  To: Matt Vollrath, netdev; +Cc: edumazet, pabeni, kuba, andrew+netdev, nic_swsd
In-Reply-To: <b6c325ea-8865-4aea-addd-3be2fe178244@gmail.com>

On 12.04.2026 15:51, Matt Vollrath wrote:
> On 4/12/26 07:30, Heiner Kallweit wrote:
>> On 12.04.2026 03:40, Matt Vollrath wrote:
>>> diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
>>> index 791277e750ba..4c0ad0de3410 100644
>>> --- a/drivers/net/ethernet/realtek/r8169_main.c
>>> +++ b/drivers/net/ethernet/realtek/r8169_main.c
>>> @@ -4873,7 +4873,7 @@ static irqreturn_t rtl8169_interrupt(int irq, void *dev_instance)
>>>           phy_mac_interrupt(tp->phydev);
>>>         rtl_irq_disable(tp);
>>> -    napi_schedule(&tp->napi);
>>> +    napi_schedule_irqoff(&tp->napi);
>>>   out:
>>>       rtl_ack_events(tp, status);
>>>   
>>
>> Not using napi_schedule_irqoff() here is intentional,
>> see 2734a24e6e5d18522fbf599135c59b82ec9b2c9e.
> 
> It looks like forced threading was fixed after your fix
> to mitigate the issue of forced threading not masking
> interrupts.
> 
> see 81e2073c175b887398e5bca6c004efa89983f58d
> 
> If I understand correctly, this should make
> napi_schedule_irqoff() safe in any interrupt handler.
> 

I think 8380c81d5c4fced6f4397795a5ae65758272bbfd needs to be
mentioned too, because only with this change your patch is safe
under PREEMPT_RT. Best extend the commit message based on our
discussion. With that one:
Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>


^ permalink raw reply

* Re: [PATCH net-next v6 0/2] net: mana: add ethtool private flag for full-page RX buffers
From: Jakub Kicinski @ 2026-04-12 19:59 UTC (permalink / raw)
  To: Dipayaan Roy
  Cc: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet,
	pabeni, leon, longli, kotaranov, horms, shradhagupta, ssengar,
	ernis, shirazsaleem, linux-hyperv, netdev, linux-kernel,
	linux-rdma, stephen, jacob.e.keller, leitao, kees, john.fastabend,
	hawk, bpf, daniel, ast, sdf, dipayanroy
In-Reply-To: <20260409183509.0b24dea6@kernel.org>

On Thu, 9 Apr 2026 18:35:09 -0700 Jakub Kicinski wrote:
> On Tue,  7 Apr 2026 12:59:17 -0700 Dipayaan Roy wrote:
> > This behavior is observed on a single platform; other platforms
> > perform better with page_pool fragments, indicating this is not a
> > page_pool issue but platform-specific.  
> 
> Well, someone has to run some experiments and confirm other ARM
> platforms are not impacted, with data. I was hoping to do it myself
> but doesn't look like that will happen in time for the merge window :(

Please repost with the perf analysis on other commercially available
ARM platforms. Something like:

  This is a workaround applicable to only some platforms. Modifying
  driver X to use a similar workaround on [Ampere Max|nVidia
  Grace|Amazon Graviton 3|..] the performance for split pages is
  y% higher than when using single pages.
-- 
pw-bot: cr

^ permalink raw reply

* Re: [PATCH 2/4] tools: ynl-gen-c: optionally emit structs and helpers
From: Jakub Kicinski @ 2026-04-12 19:55 UTC (permalink / raw)
  To: Christoph Böhmwalder
  Cc: Jens Axboe, drbd-dev, linux-kernel, Lars Ellenberg,
	Philipp Reisner, linux-block, Donald Hunter, Eric Dumazet, netdev
In-Reply-To: <20260407173356.873887-3-christoph.boehmwalder@linbit.com>

On Tue,  7 Apr 2026 19:33:54 +0200 Christoph Böhmwalder wrote:
> The new flags in the genetlink-legacy spec that are required for
> existing consumers to keep working are:
> 
>   "default": a literal value or C define that sets the default value
>   for an attribute, consumed by set_defaults().
> 
>   "required": if true, from_attrs() returns an error when this
>   attribute is missing from the request message.
> 
>   "nla-policy-type": can be used to override the NLA type used in
>   policy arrays. This is needed when the semantic type differs from
>   the wire type for backward compatibility: genl_magic maps s32 fields
>   to NLA_U32/nla_get_u32, and existing userspace might depend on this
>   encoding. The immediate motivation is DRBD, whose genl spec
>   definition predates the addition of signed types in genl. However,
>   this is a generic issue that potentially affects multiple families:
>   for example, nftables has NFTA_HOOK_PRIORITY as s32 in the spec but
>   NLA_U32 in the actual kernel policy.

The series doesn't apply for me (neither to Linus's tree nor 
to networking trees), so I didn't experiment with this code.

Are the new code gen additions purely for the kernel?
Can we just commit the code they output and leave the YNL itself be?
Every single legacy family has some weird quirks; the point of YNL
is to get rid of them, not support them all.

^ permalink raw reply

* Re: [PATCH net-next v5 1/2] net: hsr: require valid EOT supervision TLV
From: Jakub Kicinski @ 2026-04-12 19:45 UTC (permalink / raw)
  To: luka.gejak; +Cc: davem, edumazet, pabeni, netdev, fmaurer, horms
In-Reply-To: <20260407162502.19462-2-luka.gejak@linux.dev>

On Tue,  7 Apr 2026 18:25:01 +0200 luka.gejak@linux.dev wrote:
> From: Luka Gejak <luka.gejak@linux.dev>
> 
> Supervision frames are only valid if terminated with a zero-length EOT
> TLV. The current check fails to reject non-EOT entries as the terminal
> TLV, potentially allowing malformed supervision traffic.
> 
> Fix this by strictly requiring the terminal TLV to be HSR_TLV_EOT
> with a length of zero, and properly linearizing the TLV header before
> access.

> diff --git a/net/hsr/hsr_forward.c b/net/hsr/hsr_forward.c
> index 0aca859c88cb..eb89cc44eac0 100644
> --- a/net/hsr/hsr_forward.c
> +++ b/net/hsr/hsr_forward.c
> @@ -82,35 +82,33 @@ static bool is_supervision_frame(struct hsr_priv *hsr, struct sk_buff *skb)
>  	    hsr_sup_tag->tlv.HSR_TLV_length != sizeof(struct hsr_sup_payload))
>  		return false;
>  
> -	/* Get next tlv */
> +	/* Get next TLV */

The capitalization of the comments makes the real changes in this patch
harder to spot and follow. Please don't tweak the comments unless you're
really changing / improving them.

>  	total_length += hsr_sup_tag->tlv.HSR_TLV_length;
> -	if (!pskb_may_pull(skb, total_length))
> +	if (!pskb_may_pull(skb, total_length + sizeof(struct hsr_sup_tlv)))
>  		return false;
>  	skb_pull(skb, total_length);
>  	hsr_sup_tlv = (struct hsr_sup_tlv *)skb->data;
>  	skb_push(skb, total_length);
>  
> -	/* if this is a redbox supervision frame we need to verify
> -	 * that more data is available
> -	 */
> +	/* If this is a RedBox supervision frame, verify additional data */
>  	if (hsr_sup_tlv->HSR_TLV_type == PRP_TLV_REDBOX_MAC) {
> -		/* tlv length must be a length of a mac address */
> +		/* TLV length must be the size of a MAC address */
>  		if (hsr_sup_tlv->HSR_TLV_length != sizeof(struct hsr_sup_payload))
>  			return false;
>  
> -		/* make sure another tlv follows */
> +		/* Make sure another TLV follows */
>  		total_length += sizeof(struct hsr_sup_tlv) + hsr_sup_tlv->HSR_TLV_length;
> -		if (!pskb_may_pull(skb, total_length))
> +		if (!pskb_may_pull(skb, total_length + sizeof(struct hsr_sup_tlv)))
>  			return false;
>  
> -		/* get next tlv */
> +		/* Get next TLV */
>  		skb_pull(skb, total_length);
>  		hsr_sup_tlv = (struct hsr_sup_tlv *)skb->data;
>  		skb_push(skb, total_length);
>  	}
>  
> -	/* end of tlvs must follow at the end */
> -	if (hsr_sup_tlv->HSR_TLV_type == HSR_TLV_EOT &&
> +	/* Supervision frame must end with EOT TLV */
> +	if (hsr_sup_tlv->HSR_TLV_type != HSR_TLV_EOT ||
>  	    hsr_sup_tlv->HSR_TLV_length != 0)
>  		return false;

Aren't there more optional TLVs that we don't support?
You mentioned making sure that the final TLV is a zero-length EOT
but this check is far stricter.

Should we replace this check with a loop which skips over TLVs
until EOT is reached?
-- 
pw-bot: cr

^ permalink raw reply

* Re: [PATCH bpf-next v2 2/3] bpf: Use kmalloc_nolock() universally in local storage
From: Slava Imameev @ 2026-04-12 19:40 UTC (permalink / raw)
  To: alexei.starovoitov
  Cc: ameryhung, andrii, ast, bot+bpf-ci, bpf, clm, daniel, eddyz87,
	ihor.solodrai, kernel-team, martin.lau, memxor, netdev,
	yonghong.song
In-Reply-To: <CAADnVQKeFF--bgnZZSU12UY0muuwYA=7EdzLyOi837oZs+bXTA@mail.gmail.com>

On Fri, 10 Apr 2026 21:39:00 -0700 Alexei Starovoitov wrote:
> >
> >
> > This allows value sizes up to ~65KB. Before this patch, socket and
> > inode storage used bpf_map_kzalloc() (backed by regular kmalloc)
> > which could handle those large sizes. After this patch, any
> > elem_size above KMALLOC_MAX_CACHE_SIZE will silently fail: the map
> > creation succeeds via bpf_local_storage_map_alloc_check() but every
> > element allocation returns NULL.
> >
> > Should BPF_LOCAL_STORAGE_MAX_VALUE_SIZE be updated to use
> > KMALLOC_MAX_CACHE_SIZE instead of KMALLOC_MAX_SIZE now that all
> > storage types go through kmalloc_nolock()?
> >
> > Slava Imameev raised the same concern for task storage in
> > https://lore.kernel.org/bpf/20260410014341.47043-1-slava.imameev@crowdstrike.com/
> 
> Right. Let's update it, but I don't think it's a regression.
> On a loaded system kmalloc_large() rarely succeeds for order 2+.
> That's why kmalloc_nolock() doesn't attempt to bridge that gap.
> One or two contiguous physical pages is the best one can expect.
> In early bpf days we picked KMALLOC_MAX_SIZE assuming that
> it's a realistic max for kmalloc().
> It turned out to be wishful thinking.
> kmalloc_large concept should really be removed.
> It deceives users into thinking that it's usable.

Do you think it would be viable to extend task storage to
support larger allocations, restoring support for 64KB values
or perhaps a smaller limit such as 32KB, using vmalloc or
bpf_mem_cache_alloc, with the obvious restrictions that vmalloc
imposes? Perhaps we could use bpf_mem_cache_alloc as the primary
mechanism with vmalloc as a fallback when the caller context permits?

We've found task storage allocations larger than 8KB quite
valuable for scenarios involving processing multiple file paths.
Currently, without large task storage support, we're forced to
preallocate maps with 12KB+ values and significantly
over-provision the number of entries to reduce the probability
of free entry depletion. This approach places unnecessary burden
on the memory subsystem since much of this pre-allocated memory
remains unused.

Even if task storage allocation fails due to lack of contiguous
physical memory and vmalloc is not possible, there's an option to
maintain an emergency preallocated map of much smaller size
compared to when this map serves as the primary mechanism.

With larger task storage allocations, we've implemented a simple
memory allocator that operates over task storage. For example, a
16KB task storage can accommodate multiple allocations, one large
and a couple of small ones, which has substantially reduced our
memory footprint compared to the current map-based approach. We've also
experimented successfully with 32KB arenas for workloads
requiring even larger working sets.
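
A minimal sketch of the kind of allocator described above, written as
plain userspace C rather than BPF. It assumes a simple bump allocator
over one fixed 16KB buffer; the names struct arena, arena_alloc() and
arena_reset() are hypothetical and are not taken from the poster's
implementation, which is not shown in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical arena standing in for one 16KB task-storage value. */
#define ARENA_SIZE (16 * 1024)

struct arena {
	size_t used;             /* bytes handed out so far */
	uint8_t buf[ARENA_SIZE]; /* backing storage */
};

/* Round n up to a multiple of 8 so offsets stay 8-byte multiples. */
static size_t align8(size_t n)
{
	return (n + 7) & ~(size_t)7;
}

/* Bump-allocate len bytes; return NULL when the arena has no room. */
static void *arena_alloc(struct arena *a, size_t len)
{
	size_t need = align8(len);
	void *p;

	assert(a->used <= ARENA_SIZE); /* invariant of this sketch */

	/* need < len catches wrap-around in align8() for huge len. */
	if (need < len || need > ARENA_SIZE - a->used)
		return NULL;

	p = a->buf + a->used;
	a->used += need;
	return p;
}

/* Release everything at once, e.g. when one event has been processed. */
static void arena_reset(struct arena *a)
{
	a->used = 0;
}
```

With this layout a 12KB path buffer and a few small scratch buffers
share one 16KB value, and a request that does not fit fails cleanly
so the caller can fall back to a preallocated map.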

^ permalink raw reply

* Re: [PATCH net-next] gre: Count GRE packet drops
From: patchwork-bot+netdevbpf @ 2026-04-12 19:40 UTC (permalink / raw)
  To: Gal Pressman
  Cc: davem, edumazet, kuba, pabeni, andrew+netdev, netdev, dsahern,
	horms, dtatulea, noren
In-Reply-To: <20260409090945.1542440-1-gal@nvidia.com>

Hello:

This patch was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Thu, 9 Apr 2026 12:09:45 +0300 you wrote:
> GRE is silently dropping packets without updating statistics.
> 
> In case of drop, increment rx_dropped counter to provide visibility into
> packet loss. For the case where no GRE protocol handler is registered,
> use rx_nohandler.
> 
> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
> Reviewed-by: Nimrod Oren <noren@nvidia.com>
> Signed-off-by: Gal Pressman <gal@nvidia.com>
> 
> [...]

Here is the summary with links:
  - [net-next] gre: Count GRE packet drops
    https://git.kernel.org/netdev/net-next/c/8632175ccb0c

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply

* Re: [PATCH net-next 0/3] net: phy: add support for disabling autonomous EEE
From: patchwork-bot+netdevbpf @ 2026-04-12 19:40 UTC (permalink / raw)
  To: Nicolai Buchwitz
  Cc: andrew, hkallweit1, linux, davem, edumazet, kuba, pabeni,
	florian.fainelli, bcm-kernel-feedback-list, netdev, linux-kernel
In-Reply-To: <20260406-devel-autonomous-eee-v1-0-b335e7143711@tipi-net.de>

Hello:

This series was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Mon, 06 Apr 2026 09:13:06 +0200 you wrote:
> Some PHYs implement autonomous EEE where the PHY manages EEE
> independently, preventing the MAC from controlling LPI signaling.
> This conflicts with MACs that implement their own LPI control.
> 
> This series adds a .disable_autonomous_eee callback to struct phy_driver
> and calls it from phy_support_eee(). When a MAC indicates it supports
> EEE, the PHY's autonomous EEE is automatically disabled. The setting is
> persisted across suspend/resume by re-applying it in phy_init_hw() after
> soft reset, following the same pattern suggested by Russell King for PHY
> tunables [1].
> 
> [...]

Here is the summary with links:
  - [net-next,1/3] net: phy: add support for disabling PHY-autonomous EEE
    https://git.kernel.org/netdev/net-next/c/7ef629b45801
  - [net-next,2/3] net: phy: broadcom: implement .disable_autonomous_eee for BCM54xx
    https://git.kernel.org/netdev/net-next/c/bcb3e89fc0ec
  - [net-next,3/3] net: phy: realtek: convert RTL8211F to .disable_autonomous_eee
    https://git.kernel.org/netdev/net-next/c/bb14e3b63c63

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply

* Re: [PATCH v2 0/3] bpf: fix sock_ops rtt_min OOB read and related guard issues
From: patchwork-bot+netdevbpf @ 2026-04-12 19:40 UTC (permalink / raw)
  To: Werner Kasselman
  Cc: martin.lau, ast, daniel, andrii, john.fastabend, brakmo, eddyz87,
	song, yonghong.song, kpsingh, sdf, haoluo, jolsa, davem, edumazet,
	kuba, pabeni, horms, bpf, netdev, linux-kernel
In-Reply-To: <20260412030306.3469543-1-werner@verivus.com>

Hello:

This series was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Sun, 12 Apr 2026 03:03:08 +0000 you wrote:
> Patch 3 fixes an out-of-bounds read in sock_ops_convert_ctx_access()
> for the rtt_min context field. It is the only tcp_sock-backed field
> that bypasses the is_locked_tcp_sock guard, so on request_sock-backed
> sock_ops callbacks the converted BPF load reads past the end of a
> tcp_request_sock.
> 
> Patches 1 and 2 are groundwork. Patch 1 fixes a pre-existing info
> leak in SOCK_OPS_GET_FIELD() and SOCK_OPS_GET_SK() where dst_reg is
> left holding the context pointer on the guard-failure branch when
> dst_reg == src_reg, instead of being zeroed. Patch 2 extracts
> SOCK_OPS_LOAD_TCP_SOCK_FIELD() from SOCK_OPS_GET_FIELD() so the
> rtt_min sub-field access in patch 3 can reuse it.
> 
> [...]

Here is the summary with links:
  - [v2,1/3] bpf: zero dst_reg on sock_ops field guard failure when dst == src
    https://git.kernel.org/netdev/net/c/10f86a2a5c91
  - [v2,2/3] bpf: extract SOCK_OPS_LOAD_TCP_SOCK_FIELD from SOCK_OPS_GET_FIELD
    (no matching commit)
  - [v2,3/3] bpf: guard sock_ops rtt_min against non-locked tcp_sock
    (no matching commit)

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply

* Re: [PATCH bpf v3 0/2] bpf: Fix SOCK_OPS_GET_SK same-register OOB read in sock_ops and add selftest
From: patchwork-bot+netdevbpf @ 2026-04-12 19:40 UTC (permalink / raw)
  To: Jiayuan Chen
  Cc: bpf, werner, martin.lau, daniel, john.fastabend, sdf, ast, andrii,
	eddyz87, memxor, song, yonghong.song, jolsa, davem, edumazet,
	kuba, pabeni, horms, shuah, sun.jian.kdev, linux-kernel, netdev,
	linux-kselftest
In-Reply-To: <20260407022720.162151-1-jiayuan.chen@linux.dev>

Hello:

This series was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Tue,  7 Apr 2026 10:26:26 +0800 you wrote:
> When a BPF sock_ops program accesses ctx fields with dst_reg == src_reg,
> the SOCK_OPS_GET_SK() and SOCK_OPS_GET_FIELD() macros fail to zero the
> destination register in the !fullsock / !locked_tcp_sock path, leading to
> OOB read (GET_SK) and kernel pointer leak (GET_FIELD).
> 
> Patch 1: Fix both macros by adding BPF_MOV64_IMM(si->dst_reg, 0) in the
> !fullsock landing pad.
> Patch 2: Add selftests covering same-register and different-register cases
> for both GET_SK and GET_FIELD.
> 
> [...]

Here is the summary with links:
  - [bpf,v3,1/2] bpf: Fix same-register dst/src OOB read and pointer leak in sock_ops
    https://git.kernel.org/netdev/net/c/10f86a2a5c91
  - [bpf,v3,2/2] selftests/bpf: Add tests for sock_ops ctx access with same src/dst register
    https://git.kernel.org/netdev/net/c/04013c3ca022

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply

* Re: [PATCH v2 net] openvswitch: fix vport netlink reply size for large upcall PID arrays
From: Ilya Maximets @ 2026-04-12 19:33 UTC (permalink / raw)
  To: Weiming Shi, Aaron Conole, Eelco Chaudron
  Cc: i.maximets, David S . Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Flavio Leitner, Mark Gray, netdev,
	Xiang Mei, sunichi, ovs dev
In-Reply-To: <20260411141448.1479933-3-bestswngs@gmail.com>

On 4/11/26 4:14 PM, Weiming Shi wrote:
> The vport netlink reply helpers allocate a fixed-size skb with
> nlmsg_new(NLMSG_DEFAULT_SIZE, ...) but serialize the full upcall PID
> array via ovs_vport_get_upcall_portids(). Since
> ovs_vport_set_upcall_portids() accepts any non-zero multiple of
> sizeof(u32) with no upper bound, a CAP_NET_ADMIN user can install a PID
> array large enough to overflow the reply buffer. On systems with
> unprivileged user namespaces enabled (e.g., Ubuntu default), this is
> reachable via unshare -Urn since all OVS vport genl operations use
> GENL_UNS_ADMIN_PERM.
> 
> When the subsequent nla_put() fails with -EMSGSIZE, five BUG_ON(err < 0)
> sites fire and panic the kernel:
> 
>  kernel BUG at net/openvswitch/datapath.c:2414!
>  Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI
>  CPU: 1 UID: 0 PID: 65 Comm: poc Not tainted 7.0.0-rc7-00195-geb216e422044 #1
>  RIP: 0010:ovs_vport_cmd_set+0x34c/0x400
>  Call Trace:
>   <TASK>
>   genl_family_rcv_msg_doit (net/netlink/genetlink.c:1116)
>   genl_rcv_msg (net/netlink/genetlink.c:1194)
>   netlink_rcv_skb (net/netlink/af_netlink.c:2550)
>   genl_rcv (net/netlink/genetlink.c:1219)
>   netlink_unicast (net/netlink/af_netlink.c:1344)
>   netlink_sendmsg (net/netlink/af_netlink.c:1894)
>   __sys_sendto (net/socket.c:2206)
>   __x64_sys_sendto (net/socket.c:2209)
>   do_syscall_64 (arch/x86/entry/syscall_64.c:63)
>   entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
>   </TASK>
>  Kernel panic - not syncing: Fatal exception
> 
> Fix this by dynamically sizing the reply skb to account for the actual
> PID array length, and replace the BUG_ON() calls with graceful error
> returns.

Hi, Weiming.  Thanks for working on this!  The earlier attempt to fix this
problem was here:
  https://lore.kernel.org/netdev/20260323071435.1945543-1-sunyiqixm@gmail.com/
CC: sunichi, maybe you can cooperate on the fix somehow.

A few problems with the solution here:

- You're locking to count the number of pids, then unlocking and then
  re-locking to fill the output.  This is racy and may still cause the
  inability to put all the data into allocated space, since the number can
  theoretically change.

- Failing the del command is very unfriendly to the userspace and also it
  becomes impossible for the user to delete the port without clearing the
  upcall pids first, which they will not know to do.

- Failing the new command as done in this change is very confusing as the
  port is actually created while the user gets the error.

- NLMSG_DEFAULT_SIZE + portids_size is not the actual size of the message,
  nothing in the code guarantees that the rest fits into NLMSG_DEFAULT_SIZE.

- The number of pids being unbounded is a problem in itself.

So, we need to approach this issue differently.  As I suggested in the thread
linked above, we should limit the maximum number of pids to the number of
CPUs, as it is done for the per-cpu dispatch socket array.  There is no point
in more sockets than CPUs and the userspace never creates that many.  This way
we can:
- Fail the attempts to set up more pids than CPUs in the first place.
- Know beforehand the maximum size to allocate - no need to check the actual
  number and re-lock.  Create a function similar to ovs_dp_cmd_msg_size()
  to allocate based on what can be included in the worst case, which shouldn't
  be a lot.

And all the BUG_ON() calls must remain BUG_ON()s as it is a bug if we're not
counting correctly.
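
The approach suggested above can be sketched roughly as follows. This
is a userspace model, not kernel code: attr_total_size() only mimics
the arithmetic of a 4-byte attribute header with 4-byte alignment, and
check_upcall_pids_len(), vport_reply_max_size(), fill_pids() and the
fixed_part parameter are hypothetical names used for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ATTR_HDRLEN 4u /* models the 4-byte netlink attribute header */

/* Pad a length up to the 4-byte attribute alignment. */
static size_t attr_align(size_t len)
{
	return (len + 3) & ~(size_t)3;
}

/* Header plus payload, padded: the space one attribute occupies. */
static size_t attr_total_size(size_t payload)
{
	return attr_align(ATTR_HDRLEN + payload);
}

/* Reject pid arrays that are empty, not a multiple of sizeof(u32),
 * or that hold more entries than there are CPUs.
 */
static int check_upcall_pids_len(size_t len_bytes, unsigned int nr_cpus)
{
	if (len_bytes == 0 || len_bytes % sizeof(uint32_t))
		return -1;
	if (len_bytes / sizeof(uint32_t) > nr_cpus)
		return -1;
	return 0;
}

/* Worst-case reply size: everything else (fixed_part, an assumption
 * of this sketch) plus a pid array with one entry per CPU.
 */
static size_t vport_reply_max_size(size_t fixed_part, unsigned int nr_cpus)
{
	return fixed_part +
	       attr_total_size((size_t)nr_cpus * sizeof(uint32_t));
}

/* Model of filling the pid attribute. Because the size was computed
 * for the worst case, running out of room here would be a bug, which
 * is the role BUG_ON() plays in the real code.
 */
static size_t fill_pids(size_t room, size_t n_pids)
{
	size_t need = attr_total_size(n_pids * sizeof(uint32_t));

	assert(need <= room);
	return need;
}
```

Since the bound is known before any lock is taken, there is no need
to count the pids first and then re-lock to fill the reply.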

> Fixes: b83d23a2a38b ("openvswitch: Introduce per-cpu upcall dispatch")

This is not the right commit.  The change you're making has nothing to do with
the per-cpu dispatch.  The actual commit you're looking for is much older:
  5cd667b0a456 ("openvswitch: Allow each vport to have an array of 'port_id's.")

A couple more notes:

- Wait at least 24 hours between versions.  Otherwise, people do not have
  enough time to look at your patch.

- Please, CC all maintainers including the ovs-dev list.  It is moderated for
  new senders (the only reliable way to keep it out of spam) but we approve
  fast, so your next emails should go right through.

Best regards, Ilya Maximets.

^ permalink raw reply

* Re: [RFC net-next v5 0/3] Add RSS and LRO support
From: Frank Wunderlich @ 2026-04-12 19:24 UTC (permalink / raw)
  To: Jakub Kicinski, Frank Wunderlich
  Cc: linux, nbd, sean.wang, lorenzo, andrew+netdev, davem, edumazet,
	pabeni, matthias.bgg, angelogioacchino.delregno, linux, daniel,
	netdev, linux-kernel, linux-arm-kernel, linux-mediatek
In-Reply-To: <20260412075402.2eda04d3@kernel.org>

On 12 April 2026 at 16:54, "Jakub Kicinski" <kuba@kernel.org> wrote:
> 
> On Sun, 12 Apr 2026 11:57:47 +0000 Frank Wunderlich wrote:
> 
> > 
> > Some time has passed without a single comment, so I am just sending a friendly reminder ;)
> > 
> You have a lot of people in the To:
> Could you clarify who you expect to action these patches?
> Patches are an RFC and I suppose ain't nobody got much comments?

Hi,

IMHO 11 people in "To" is not many, but I was told that no comments can mean
"OK so far", so I will send v6 soon, rebased on current net-next.

regards Frank

^ permalink raw reply

* Re: [PATCH net-next v6 3/7] net: bcmgenet: add basic XDP support (PASS/DROP)
From: Jakub Kicinski @ 2026-04-12 19:22 UTC (permalink / raw)
  To: Nicolai Buchwitz
  Cc: netdev, Justin Chen, Simon Horman, Mohsin Bashir, Doug Berger,
	Florian Fainelli, Broadcom internal kernel review list,
	Andrew Lunn, Eric Dumazet, Paolo Abeni, David S. Miller,
	Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
	John Fastabend, Stanislav Fomichev, linux-kernel, bpf
In-Reply-To: <20260406083536.839517-4-nb@tipi-net.de>

On Mon,  6 Apr 2026 10:35:27 +0200 Nicolai Buchwitz wrote:
> Add XDP program attachment via ndo_bpf and execute XDP programs in the
> RX path. XDP_PASS builds an SKB from the xdp_buff (handling
> xdp_adjust_head/tail), XDP_DROP returns the page to page_pool without
> SKB allocation.
> 
> XDP_TX and XDP_REDIRECT are not yet supported and return XDP_ABORTED.
> 
> Advertise NETDEV_XDP_ACT_BASIC in xdp_features.


> -		skb_mark_for_recycle(skb);
> -
> -		/* Reserve the RSB + pad, then set the data length */
> -		skb_reserve(skb, GENET_RSB_PAD);
> -		__skb_put(skb, len - GENET_RSB_PAD);
> +		{

Floating code blocks are considered poor coding style in the kernel.
Why not push the variables up into the outer scope or make this
a helper?

> +			struct xdp_buff xdp;
> +			unsigned int xdp_act;
> +			int pkt_len;
> +
> +			pkt_len = len - GENET_RSB_PAD;
> +			if (priv->crc_fwd_en)
> +				pkt_len -= ETH_FCS_LEN;
> +
> +			/* Save rx_csum before XDP runs - an XDP program
> +			 * could overwrite the RSB via bpf_xdp_adjust_head.
> +			 */
> +			if (dev->features & NETIF_F_RXCSUM)
> +				rx_csum = (__force __be16)(status->rx_csum
> +							   & 0xffff);

FWIW this could be before the block

> +			xdp_init_buff(&xdp, PAGE_SIZE, &ring->xdp_rxq);
> +			xdp_prepare_buff(&xdp, page_address(rx_page),
> +					 GENET_RX_HEADROOM, pkt_len, true);
> +
> +			if (xdp_prog) {
> +				xdp_act = bcmgenet_run_xdp(ring, xdp_prog,
> +							   &xdp, rx_page);

Since you pass the xdp_prog in you can save yourself the indentation by
making bcmgenet_run_xdp() return PASS when no program is set.
bcmgenet_run_xdp() has one caller, it's going to get inlined.
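
A rough userspace sketch of that shape, assuming stand-in types: the
verdict enum, struct fake_buff, run_prog(), rx_one() and drop_all()
are all hypothetical and only model the control flow, not the driver
or the XDP API.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the XDP verdicts relevant here. */
enum verdict { VERDICT_ABORTED, VERDICT_DROP, VERDICT_PASS };

struct fake_buff {
	int len;
};

typedef enum verdict (*fake_prog_fn)(struct fake_buff *);

/* Helper returns PASS itself when no program is attached, so the
 * caller needs no "if (prog)" wrapper and loses one indent level.
 */
static enum verdict run_prog(fake_prog_fn prog, struct fake_buff *buf)
{
	if (!prog)
		return VERDICT_PASS;
	return prog(buf);
}

/* Model of one RX iteration: returns 1 when the packet would go on
 * to SKB construction, 0 when the program consumed it.
 */
static int rx_one(fake_prog_fn prog, struct fake_buff *buf)
{
	assert(buf != NULL);

	if (run_prog(prog, buf) != VERDICT_PASS)
		return 0;
	return 1;
}

/* Example program that drops every packet. */
static enum verdict drop_all(struct fake_buff *buf)
{
	(void)buf;
	return VERDICT_DROP;
}
```

The caller's flow is the same with or without a program attached,
which is what removes the extra nesting in the RX loop.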

> +				if (xdp_act != XDP_PASS)
> +					goto next;
> +			}
>  
> -		if (priv->crc_fwd_en) {
> -			skb_trim(skb, skb->len - ETH_FCS_LEN);
> +			skb = bcmgenet_xdp_build_skb(ring, &xdp);
> +			if (unlikely(!skb)) {
> +				BCMGENET_STATS64_INC(stats, dropped);
> +				page_pool_put_full_page(ring->page_pool,
> +							rx_page, true);
> +				goto next;
> +			}
>  		}
>  
>  		/* Set up checksum offload */
>  		if (dev->features & NETIF_F_RXCSUM) {
> -			rx_csum = (__force __be16)(status->rx_csum & 0xffff);
>  			if (rx_csum) {
>  				skb->csum = (__force __wsum)ntohs(rx_csum);
>  				skb->ip_summed = CHECKSUM_COMPLETE;
> @@ -3744,6 +3810,37 @@ static int bcmgenet_change_carrier(struct net_device *dev, bool new_carrier)
>  	return 0;
>  }
>  
> +static int bcmgenet_xdp_setup(struct net_device *dev,
> +			      struct netdev_bpf *xdp)
> +{
> +	struct bcmgenet_priv *priv = netdev_priv(dev);
> +	struct bpf_prog *old_prog;
> +	struct bpf_prog *prog = xdp->prog;
> +
> +	if (prog && dev->mtu > PAGE_SIZE - GENET_RX_HEADROOM -
> +	    SKB_DATA_ALIGN(sizeof(struct skb_shared_info))) {

I'm confused by this check, it appears that the max page size this
driver can Rx in the first place is 2kB (RX_BUF_LENGTH). And max_mtu 
is 1.5kB.

If GENET_RX_HEADROOM + SKB_DATA_ALIGN(sizeof(struct skb_shared_info))
is larger than 2kB the Rx path will break completely whether XDP was
attached or not.

This check seems to be cargo culting what other drivers do?

^ permalink raw reply
