Netdev List


* Re: [PATCH v2] net: add sock_open() with flags for socket creation
From: David Laight @ 2026-06-21 22:02 UTC (permalink / raw)
  To: Andrew Lunn; +Cc: Alex Goltsev, davem, netdev, linux-kernel, Al Viro
In-Reply-To: <b057ea36-9810-40c1-9154-859eb9e0da5e@lunn.ch>

On Sun, 21 Jun 2026 15:59:30 +0200
Andrew Lunn <andrew@lunn.ch> wrote:

> On Sun, Jun 21, 2026 at 01:57:46PM +0100, David Laight wrote:
> > On Sun, 21 Jun 2026 14:05:40 +0300
> > Alex Goltsev <sasha.goltsev777@gmail.com> wrote:
> >   
> > > From a9316957e594708dfb4258ad968fe88666c9b736 Mon Sep 17 00:00:00 2001
> > > From: 0-x-0-0 <sasha.goltsev777@gmail.com>
> > > Date: Sun, 21 Jun 2026 13:24:29 +0300
> > > Subject: [PATCH v2] net: add sock_open() with flags for socket creation  
> > 
> > A) There is no info here.
> > B) You've not said why this is of any use.
> > C) It isn't a bug fix so would go into net-next
> > D) net-next is closed.  
> 
> E) The patch has had all its whitespace corrupted.

F) Patch sent as a reply to version 1.

> 
> Please "submit" the patch to yourself and ensure you can cleanly apply
> it.
> 
>    Andrew
> 


^ permalink raw reply

* Re: [PATCH] octeontx2-pf: Clear stats of all resources when freeing resources
From: patchwork-bot+netdevbpf @ 2026-06-21 22:01 UTC (permalink / raw)
  To: Subbaraya Sundeep
  Cc: andrew+netdev, davem, edumazet, kuba, pabeni, sgoutham, gakula,
	bbhushan2, rkannoth, netdev, linux-kernel
In-Reply-To: <1781636420-19816-2-git-send-email-sbhatta@marvell.com>

Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Wed, 17 Jun 2026 00:30:19 +0530 you wrote:
> When all MCS resources mapped to a PF are being freed then clear
> stats of all those resources too.
> 
> Fixes: 815debbbf7b5 ("octeontx2-pf: mcs: Clear stats before freeing resource")
> Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
> ---
>  drivers/net/ethernet/marvell/octeontx2/nic/cn10k_macsec.c | 1 +
>  1 file changed, 1 insertion(+)

Here is the summary with links:
  - octeontx2-pf: Clear stats of all resources when freeing resources
    https://git.kernel.org/netdev/net/c/fd4460721fb4

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply

* Re: [net PATCH v2] octeontx2-pf: mcs: Fix mcs resources free on PF shutdown
From: patchwork-bot+netdevbpf @ 2026-06-21 22:01 UTC (permalink / raw)
  To: Subbaraya Sundeep
  Cc: andrew+netdev, davem, edumazet, kuba, pabeni, sgoutham, gakula,
	bbhushan2, rkannoth, netdev, linux-kernel
In-Reply-To: <1781636420-19816-3-git-send-email-sbhatta@marvell.com>

Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Wed, 17 Jun 2026 00:30:20 +0530 you wrote:
> From: Geetha sowjanya <gakula@marvell.com>
> 
> On PF shutdown, the current driver frees mcs hardware
> resources even though mcs resources are not allocated to it.
> This patch checks the mcs resources status and sends the mailbox
> message to free them only if resources are allocated.
> 
> [...]

Here is the summary with links:
  - [net,v2] octeontx2-pf: mcs: Fix mcs resources free on PF shutdown
    https://git.kernel.org/netdev/net/c/450d0e90b103

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply

* Re: [net PATCH v2] octeontx2-af: mcs: Fix unsupported secy stats read
From: patchwork-bot+netdevbpf @ 2026-06-21 22:01 UTC (permalink / raw)
  To: Subbaraya Sundeep
  Cc: andrew+netdev, davem, edumazet, kuba, pabeni, sgoutham, gakula,
	bbhushan2, rkannoth, netdev, linux-kernel
In-Reply-To: <1781636420-19816-1-git-send-email-sbhatta@marvell.com>

Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Wed, 17 Jun 2026 00:30:18 +0530 you wrote:
> From: Geetha sowjanya <gakula@marvell.com>
> 
> Secy control stats counter doesn't exist for CNF10KB platform.
> Skip reading this respective register for CNF10KB silicon while
> fetching secy stats.
> 
> Fixes: 9312150af8da ("octeontx2-af: cn10k: mcs: Support for stats collection")
> Signed-off-by: Geetha sowjanya <gakula@marvell.com>
> Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
> 
> [...]

Here is the summary with links:
  - [net,v2] octeontx2-af: mcs: Fix unsupported secy stats read
    https://git.kernel.org/netdev/net/c/d4b7440f7316

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply

* Re: pull-request: ieee802154-next 2026-06-20
From: patchwork-bot+netdevbpf @ 2026-06-21 22:00 UTC (permalink / raw)
  To: Stefan Schmidt
  Cc: davem, kuba, pabeni, linux-wpan, alex.aring, miquel.raynal,
	netdev
In-Reply-To: <20260620174903.1010671-1-stefan@datenfreihafen.org>

Hello:

This pull request was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Sat, 20 Jun 2026 19:49:03 +0200 you wrote:
> Hello Dave, Jakub, Paolo.
> 
> An overdue pull request for ieee802154, catching up on all the AI found issues
> at last.
> 
> Shitalkumar Gandhi fixed problems in the ca8210 driver for cases where we could
> have a leak or a pointer truncation.
> 
> [...]

Here is the summary with links:
  - pull-request: ieee802154-next 2026-06-20
    https://git.kernel.org/netdev/net/c/617fb6fa9c34

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply

* Re: [PATCH net 01/15] batman-adv: gw: don't deselect gateway with active hardif
From: patchwork-bot+netdevbpf @ 2026-06-21 22:00 UTC (permalink / raw)
  To: Simon Wunderlich
  Cc: netdev, davem, edumazet, kuba, pabeni, horms, b.a.t.m.a.n, sven,
	stable, neocturne
In-Reply-To: <20260619070045.438101-2-sw@simonwunderlich.de>

Hello:

This series was applied to netdev/net.git (main)
by Sven Eckelmann <sven@narfation.org>:

On Fri, 19 Jun 2026 09:00:31 +0200 you wrote:
> From: Sven Eckelmann <sven@narfation.org>
> 
> The batadv_hardif_cnt() was previously checking if there is a
> batadv_hard_iface->mesh_iface which has the same mesh_iface. And since
> batadv_hardif_disable_interface() was resetting the
> batadv_hard_iface->mesh_iface after this check, it had to verify whether
> *1* interface was still part of the mesh_iface before it started the
> gateway deselection.
> 
> [...]

Here is the summary with links:
  - [net,01/15] batman-adv: gw: don't deselect gateway with active hardif
    https://git.kernel.org/netdev/net/c/df97a7107b16
  - [net,02/15] batman-adv: ensure bcast is writable before modifying TTL
    https://git.kernel.org/netdev/net/c/4cd6d3a4b96a
  - [net,03/15] batman-adv: fix (m|b)cast csum after decrementing TTL
    https://git.kernel.org/netdev/net/c/e728bbdf3266
  - [net,04/15] batman-adv: frag: ensure fragment is writable before modifying TTL
    https://git.kernel.org/netdev/net/c/b7293c6e8c15
  - [net,05/15] batman-adv: frag: avoid underflow of TTL
    https://git.kernel.org/netdev/net/c/493d9d2528e1
  - [net,06/15] batman-adv: v: prevent OGM aggregation on disabled hardif
    https://git.kernel.org/netdev/net/c/d11c00b95b2a
  - [net,07/15] batman-adv: tp_meter: restrict number of unacked list entries
    https://git.kernel.org/netdev/net/c/e7c775110e18
  - [net,08/15] batman-adv: tp_meter: annotate last_recv_time access with READ/WRITE_ONCE
    https://git.kernel.org/netdev/net/c/d67c728f07fc
  - [net,09/15] batman-adv: tp_meter: prevent parallel modifications of last_recv
    https://git.kernel.org/netdev/net/c/6dde0cfcb36e
  - [net,10/15] batman-adv: tp_meter: handle overlapping packets
    https://git.kernel.org/netdev/net/c/cbde75c38b21
  - [net,11/15] batman-adv: tt: don't merge change entries with different VIDs
    https://git.kernel.org/netdev/net/c/f08e06c2d5c3
  - [net,12/15] batman-adv: tt: track roam count per VID
    https://git.kernel.org/netdev/net/c/12407d5f61c2
  - [net,13/15] batman-adv: dat: prevent false sharing between VLANs
    https://git.kernel.org/netdev/net/c/20d7658b7416
  - [net,14/15] batman-adv: tvlv: enforce 2-byte alignment
    https://git.kernel.org/netdev/net/c/32a679925552
  - [net,15/15] batman-adv: tvlv: avoid race of cifsnotfound handler state
    https://git.kernel.org/netdev/net/c/edb557b2ba38

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply

* Re: [PATCH net v2] amt: don't read the IP source address from a reallocated skb header
From: Jakub Kicinski @ 2026-06-21 22:00 UTC (permalink / raw)
  To: Michael Bommarito
  Cc: Taehee Yoo, David S . Miller, Paolo Abeni, Eric Dumazet,
	Andrew Lunn, netdev, linux-kernel
In-Reply-To: <20260617123443.3586930-1-michael.bommarito@gmail.com>

On Wed, 17 Jun 2026 08:34:43 -0400 Michael Bommarito wrote:
> amt_update_handler() caches iph = ip_hdr(skb) and then calls
> pskb_may_pull(). pskb_may_pull() can reallocate the skb head: the new
> head is allocated and the old one is freed. The cached iph is not
> refreshed, so the following tunnel lookup reads iph->saddr from the
> freed head. On an AMT relay this lookup runs for every incoming
> membership update, before the update's nonce and response MAC are
> validated.
> 
> The sibling handlers amt_multicast_data_handler() and
> amt_membership_query_handler() re-read ip_hdr() after the pull and are
> not affected; only amt_update_handler() keeps the pre-pull pointer.

Sashikos point out a bunch more of these in AMT:
https://sashiko.dev/#/patchset/20260617123443.3586930-1-michael.bommarito@gmail.com
https://netdev-ai.bots.linux.dev/sashiko/#/patchset/20260617123443.3586930-1-michael.bommarito@gmail.com

Let's fix them all with one patch?
-- 
pw-bot: cr
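
For readers following along, the stale-pointer pattern described in the
quoted commit message can be modelled in userspace. This is only a rough
sketch: toy_buf, toy_iphdr, toy_ip_hdr() and toy_may_pull() are hypothetical
stand-ins for the skb, struct iphdr, ip_hdr() and the reallocating case of
pskb_may_pull(), not kernel APIs. The point is the ordering: the header
pointer is derived only after the pull.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an skb: one heap buffer holding the headers. */
struct toy_buf {
	uint8_t *head;
	size_t len;
};

/* Hypothetical stand-in for struct iphdr: only the field we need. */
struct toy_iphdr {
	uint32_t saddr;
};

/* Models ip_hdr(skb): derive the header pointer from the current head. */
static struct toy_iphdr *toy_ip_hdr(const struct toy_buf *b)
{
	return (struct toy_iphdr *)b->head;
}

/*
 * Models the reallocating case of pskb_may_pull(): the data always moves to
 * a new head and the old head is freed, so any cached pointer goes stale.
 * Returns 1 on success, 0 on failure (buffer left untouched).
 */
static int toy_may_pull(struct toy_buf *b, size_t need)
{
	size_t new_len = need > b->len ? need : b->len;
	uint8_t *new_head;

	assert(b != NULL && b->head != NULL);
	new_head = calloc(1, new_len);
	if (!new_head)
		return 0;
	memcpy(new_head, b->head, b->len);
	free(b->head);
	b->head = new_head;
	b->len = new_len;
	return 1;
}

/* Safe ordering: pull first, then take the header pointer. */
static int toy_read_saddr(struct toy_buf *b, size_t need, uint32_t *saddr)
{
	struct toy_iphdr *iph;

	if (!toy_may_pull(b, need))
		return -1;
	iph = toy_ip_hdr(b);	/* re-read: the old head is gone */
	*saddr = iph->saddr;
	return 0;
}
```

Caching the result of toy_ip_hdr() before toy_may_pull() and using it
afterwards would read freed memory in this model, which is the shape of the
bug under discussion.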

^ permalink raw reply

* [PATCH iproute2-next] "ip help" wrong output, exit code.
From: Dmitri Seletski @ 2026-06-21 21:56 UTC (permalink / raw)
  To: netdev

Change the output of "ip help" from standard error to standard output, and
make it exit with 0 instead of -1. "ip help | grep bridge" now shows only the
bridge syntax instead of flooding the user with everything from "ip help".
---
ip/ip.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/ip/ip.c b/ip/ip.c
index e4b71bde..4627b61c 100644
--- a/ip/ip.c
+++ b/ip/ip.c
@@ -56,7 +56,7 @@ static void usage(void) __attribute__((noreturn));

static void usage(void)
{
-fprintf(stderr,
+fprintf(stdout,
"Usage: ip [ OPTIONS ] OBJECT { COMMAND | help }\n"
"       ip [ -force ] -batch filename\n"
"where  OBJECT := { address | addrlabel | fou | help | ila | ioam | l2tp | link |\n"
@@ -72,7 +72,7 @@ static void usage(void)
"                    -o[neline] | -t[imestamp] | -ts[hort] | -b[atch] [filename] |\n"
"                    -rc[vbuf] [size] | -n[etns] name | -N[umeric] | -a[ll] |\n"
"                    -c[olor]}\n");
-exit(-1);
+exit(0);
}

static int do_help(int argc, char **argv)
-- 
2.53.0



^ permalink raw reply related

* Re: "ip help" output is an error
From: Dmitri Seletski @ 2026-06-21 21:51 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev
In-Reply-To: <20260621082105.1196ef72@phoenix.local>

I have never done C or a github submit before; I hope I did it the right way.

Regards

Dmitri Seletski

On 6/21/26 16:21, Stephen Hemminger wrote:
> On Sat, 20 Jun 2026 10:36:31 +0100
> Dmitri Seletski <drjoms@gmail.com> wrote:
>
>> Hello iproute2 maintainers,
>>
>> I am reporting an inconsistency regarding the exit status of the ip help
>> command.
>>
>> Current Behavior:
>> When running ip help, the command prints the help documentation to
>> stdout, but exits with a non-zero status (error). This causes issues in
>> shell scripts that rely on exit codes for control flow.
>>
>> Steps to reproduce:
>> bash
>>
>> # This returns "FAIL" because the exit code is non-zero
>> if ip help > /dev/null; then
>>       echo "SUCCESS"
>> else
>>       echo "FAIL"
>> fi
>>
>> Expected Behavior:
>> Since the command successfully performs the requested task (displaying
>> help information) and does not encounter a system error, it should
>> return an exit code of 0.
>>
>> Context:
>> This behavior breaks standard Bash logic for automation. For example:
>> ip help && echo "This will not execute"
>>
>> "ip help |grep br" - this will bring no result.
>>
>> Current version tested: iproute2-6.19.0
>>
>> Thank you for your time and for maintaining this tool.
>>
>> Regards,
>> Dmitri Seletski
>>
>>
> Yes, iproute2 doesn't do a great job of handling error codes
> with usage vs help. It's a bug and no one has bothered to fix it.
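
One possible way to separate the "usage vs help" cases mentioned above is
sketched below. It is not the iproute2 code: print_usage() and
handle_object() are hypothetical helpers, and the usage string is shortened.
The idea is that requested help goes to stdout with status 0, while usage
printed because of a mistake goes to stderr with a non-zero status.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Illustrative usage text only; the real "ip" usage string is much longer. */
static const char usage_text[] =
	"Usage: ip [ OPTIONS ] OBJECT { COMMAND | help }\n";

/*
 * Hypothetical helper: print usage to the given stream and return the exit
 * status that matches why it was printed.
 */
static int print_usage(FILE *out, int requested)
{
	assert(out != NULL);
	fputs(usage_text, out);
	return requested ? 0 : 1;
}

/* Hypothetical dispatcher: "help" is a success, anything unknown is not. */
static int handle_object(const char *object, FILE *out, FILE *err)
{
	if (strcmp(object, "help") == 0)
		return print_usage(out, 1);	/* requested: stdout, status 0 */
	return print_usage(err, 0);		/* misuse: stderr, non-zero */
}
```

A caller would pass stdout and stderr and hand the return value to exit().
With that split, "ip help | grep br" style pipelines see the text, and
scripts can still detect a mistyped object by its exit status.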

^ permalink raw reply

* Re: [PATCH net] octeontx2-af: npc: cn20k: fix NPC defrag
From: patchwork-bot+netdevbpf @ 2026-06-21 21:50 UTC (permalink / raw)
  To: Ratheesh Kannoth
  Cc: kuba, linux-kernel, netdev, andrew+netdev, davem, edumazet,
	pabeni, sgoutham
In-Reply-To: <20260617102149.1309913-1-rkannoth@marvell.com>

Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Wed, 17 Jun 2026 15:51:49 +0530 you wrote:
> npc_defrag_alloc_free_slots() always passed NPC_MCAM_KEY_X2 into
> __npc_subbank_alloc(), which must match sb->key_type, so defrag never
> allocated replacement slots on X4 banks. Pass the subbank key type for
> bank 0, and only extend the search into bank 1 for X2 (X4 MCAM indices
> are confined to b0b..b0t).
> 
> Fixes: 645c6e3c1999 ("octeontx2-af: npc: cn20k: virtual index support")
> Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
> 
> [...]

Here is the summary with links:
  - [net] octeontx2-af: npc: cn20k: fix NPC defrag
    https://git.kernel.org/netdev/net/c/48b67c0e8af6

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply

* Re: [net PATCH v3] octeontx2-af: Validate NIX maximum LFs correctly
From: Jakub Kicinski @ 2026-06-21 21:49 UTC (permalink / raw)
  To: Subbaraya Sundeep
  Cc: andrew+netdev, davem, edumazet, pabeni, sgoutham, gakula,
	bbhushan2, rkannoth, netdev, linux-kernel
In-Reply-To: <1781710853-23420-1-git-send-email-sbhatta@marvell.com>

On Wed, 17 Jun 2026 21:10:53 +0530 Subbaraya Sundeep wrote:
> NIX maximum number of LFs can be set via devlink command
> but that can be done before assigning any LFs to a PF/VF.
> The condition used to check whether any LFs are assigned is
> incorrect. This patch fixes that condition.

Does not apply, please rebase & repost.
-- 
pw-bot: cr

^ permalink raw reply

* Re: [PATCH net] octeontx2-af: fix CGX debugfs RVU AF PCI reference leaks
From: Jakub Kicinski @ 2026-06-21 21:44 UTC (permalink / raw)
  To: Ratheesh Kannoth
  Cc: davem, hkelam, lcherian, linux-kernel, netdev, pabeni, sgoutham,
	andrew+netdev, edumazet, Yuho Choi
In-Reply-To: <20260617104525.1321395-1-rkannoth@marvell.com>

On Wed, 17 Jun 2026 16:15:25 +0530 Ratheesh Kannoth wrote:
> +		{
> +			struct rvu_cgx_lmac_dbgfs_ctx *ctx;
> +
> +			ctx = devm_kzalloc(rvu->dev, sizeof(*ctx), GFP_KERNEL);
> +			if (!ctx)
> +				continue;

In addition to Simon's nit - please don't create floating code blocks,
just add the var decl at the start of the function.
-- 
pw-bot: cr

^ permalink raw reply

* Re: [PATCH net v3 1/2] net: macb: give reasons for Tx SKB kfree
From: Jakub Kicinski @ 2026-06-21 21:40 UTC (permalink / raw)
  To: Théo Lebrun
  Cc: Nicolas Ferre, Claudiu Beznea, Andrew Lunn, David S. Miller,
	Eric Dumazet, Paolo Abeni, Haavard Skinnemoen, Jeff Garzik,
	Conor Dooley, Paolo Valerio, Nicolai Buchwitz, netdev,
	linux-kernel, Vladimir Kondratiev, Gregory CLEMENT,
	Benoît Monin, Tawfik Bayouk, Thomas Petazzoni,
	Maxime Chevallier, stable
In-Reply-To: <20260617-macb-drop-tx-v3-1-d4c7e57d890b@bootlin.com>

On Wed, 17 Jun 2026 11:17:29 +0200 Théo Lebrun wrote:
> Fixes: 89e5785fc8a6 ("[PATCH] Atmel MACB ethernet driver")
> Cc: stable@vger.kernel.org

Interesting, did AI suggest this? It's fairly uncommon for drivers
to care about drop reasons, packet loss on egress ports is pretty
clearly attributed by tx_drops.

I don't think this belongs in net, net-next would be fine, if you think
it's necessary. Sashiko seems to point out a few more clear cut bugs.
-- 
pw-bot: cr

^ permalink raw reply

* Re: [PATCH net] net: dst_metadata: fix false-positive memcpy overflow in tun_dst_unclone
From: patchwork-bot+netdevbpf @ 2026-06-21 21:40 UTC (permalink / raw)
  To: Ilya Maximets
  Cc: netdev, davem, edumazet, kuba, pabeni, horms, kees, gustavoars,
	nathan, nick.desaulniers+lkml, morbo, justinstitt, linux-kernel,
	linux-hardening, llvm, write
In-Reply-To: <20260616100332.1308294-1-i.maximets@ovn.org>

Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Tue, 16 Jun 2026 12:03:29 +0200 you wrote:
> kmalloc_flex() in metadata_dst_alloc() sets __counted_by for the
> structure to the options_len, which is then initialized to zero.
> Later, we're initializing the structure by copying the tunnel info
> together with the options, and this triggers a warning for a potential
> memcpy overflow, since the compiler estimates that the options can't
> fit into the structure, even though the memory for them is actually
> allocated.
> 
> [...]
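
The ordering issue described in the quoted text can be illustrated in
userspace. This is a sketch under assumptions, not the applied fix:
toy_info, toy_info_alloc() and toy_info_clone() are hypothetical, and the
counted_by attribute is only enabled where the compiler supports it. The
allocation reserves room for the options but leaves the count at zero, so
the count is set before the copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Fall back to a no-op where the counted_by attribute is unavailable. */
#if defined(__has_attribute)
# if __has_attribute(counted_by)
#  define TOY_COUNTED_BY(m) __attribute__((counted_by(m)))
# endif
#endif
#ifndef TOY_COUNTED_BY
# define TOY_COUNTED_BY(m)
#endif

/* Hypothetical miniature of a tunnel-info struct with trailing options. */
struct toy_info {
	size_t options_len;
	unsigned char options[] TOY_COUNTED_BY(options_len);
};

/* Memory for the options is reserved, but the count starts at zero. */
static struct toy_info *toy_info_alloc(size_t optslen)
{
	return calloc(1, sizeof(struct toy_info) + optslen);
}

static struct toy_info *toy_info_clone(const struct toy_info *src)
{
	struct toy_info *dst;

	assert(src != NULL);
	dst = toy_info_alloc(src->options_len);
	if (!dst)
		return NULL;
	/* Set the count first so the bounds the compiler sees match the allocation. */
	dst->options_len = src->options_len;
	memcpy(dst->options, src->options, src->options_len);
	return dst;
}
```

Copying the options while options_len is still zero is what a bounds-aware
memcpy() would flag, even though the memory behind the array is really
there.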

Here is the summary with links:
  - [net] net: dst_metadata: fix false-positive memcpy overflow in tun_dst_unclone
    https://git.kernel.org/netdev/net/c/4c6d43db2a4d

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply

* Re: [PATCH net v3] tipc: fix use-after-free of the discoverer in tipc_disc_rcv()
From: patchwork-bot+netdevbpf @ 2026-06-21 21:40 UTC (permalink / raw)
  To: Weiming Shi
  Cc: jmaloy, davem, edumazet, kuba, pabeni, horms, ying.xue, netdev,
	tipc-discussion, linux-kernel, xmei5
In-Reply-To: <20260617135744.3383175-3-bestswngs@gmail.com>

Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Wed, 17 Jun 2026 21:57:45 +0800 you wrote:
> bearer_disable() frees b->disc with tipc_disc_delete()'s plain kfree(),
> but tipc_disc_rcv() still dereferences b->disc in RX softirq under
> rcu_read_lock() (tipc_udp_recv -> tipc_rcv -> tipc_disc_rcv).
> 
> L2 bearers are safe thanks to the synchronize_net() in
> tipc_disable_l2_media(), but the UDP bearer defers that call to the
> cleanup_bearer() workqueue, so the discoverer is freed with no grace
> period:
> 
> [...]

Here is the summary with links:
  - [net,v3] tipc: fix use-after-free of the discoverer in tipc_disc_rcv()
    https://git.kernel.org/netdev/net/c/1579342d7113

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply

* Re: [PATCH net] net: ethernet: mtk_ppe: Fix rhashtable leak in mtk_ppe_init error paths
From: patchwork-bot+netdevbpf @ 2026-06-21 21:40 UTC (permalink / raw)
  To: Wayen Yan
  Cc: netdev, lorenzo, horms, pabeni, kuba, edumazet, andrew+netdev,
	angelogioacchino.delregno, matthias.bgg, linux-arm-kernel,
	linux-mediatek
In-Reply-To: <178167550101.2217645.14579307712717502425@gmail.com>

Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Wed, 17 Jun 2026 13:48:13 +0800 you wrote:
> In mtk_ppe_init(), when accounting is enabled, the error paths for
> dmam_alloc_coherent(mib) and devm_kzalloc(acct) failures return NULL
> directly, bypassing the err_free_l2_flows label that destroys the
> rhashtable initialized earlier.
> 
> While this leak only occurs during probe (not runtime) and the leaked
> memory is minimal (an empty rhash table), fixing it ensures proper
> error path cleanup consistency.
> 
> [...]
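
The error-path shape described above can be sketched in plain C. All names
here (toy_ctx, toy_init(), toy_live_tables) are hypothetical. In the real
driver the later allocations are device-managed, so only the table needs
explicit cleanup; the sketch frees everything by hand to stay
self-contained. Failures after the table is initialized jump to a label
that destroys it instead of returning NULL directly.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical context with three resources set up in order. */
struct toy_ctx {
	void *table;	/* stands in for the rhashtable */
	void *mib;	/* stands in for the DMA counter buffer */
	void *acct;	/* stands in for the accounting array */
};

/* Counts live tables so a leak on an error path is observable. */
static int toy_live_tables;

static void *toy_table_init(void)
{
	void *t = malloc(16);

	if (t)
		toy_live_tables++;
	return t;
}

static void toy_table_destroy(void *t)
{
	free(t);
	toy_live_tables--;
}

/* fail_at: 0 = no failure, 1 = fail the mib step, 2 = fail the acct step. */
static struct toy_ctx *toy_init(int fail_at)
{
	struct toy_ctx *ctx = calloc(1, sizeof(*ctx));

	if (!ctx)
		return NULL;

	ctx->table = toy_table_init();
	if (!ctx->table)
		goto err_free_ctx;

	ctx->mib = fail_at == 1 ? NULL : malloc(32);
	if (!ctx->mib)
		goto err_destroy_table;	/* not a bare "return NULL" */

	ctx->acct = fail_at == 2 ? NULL : malloc(32);
	if (!ctx->acct)
		goto err_free_mib;

	return ctx;

err_free_mib:
	free(ctx->mib);
err_destroy_table:
	toy_table_destroy(ctx->table);
err_free_ctx:
	free(ctx);
	return NULL;
}

static void toy_exit(struct toy_ctx *ctx)
{
	assert(ctx != NULL);
	free(ctx->acct);
	free(ctx->mib);
	toy_table_destroy(ctx->table);
	free(ctx);
}
```

Keeping the labels in reverse order of setup means every later failure
unwinds everything that was set up before it.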

Here is the summary with links:
  - [net] net: ethernet: mtk_ppe: Fix rhashtable leak in mtk_ppe_init error paths
    https://git.kernel.org/netdev/net/c/41782770be56

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply

* Re: [PATCH net-next v2 0/4] net: pse-pd: decouple controller lookup from MDIO probe
From: Jakub Kicinski @ 2026-06-21 21:32 UTC (permalink / raw)
  To: Carlo Szelinsky
  Cc: Oleksij Rempel, Kory Maincent, Andrew Lunn, Heiner Kallweit,
	Russell King, David S . Miller, Eric Dumazet, Paolo Abeni,
	Corey Leavitt, Jonas Jelonek, netdev, linux-kernel
In-Reply-To: <20260620112440.1734404-1-github@szelinsky.de>

On Sat, 20 Jun 2026 13:24:36 +0200 Carlo Szelinsky wrote:
> This is v2 of Corey's RFC [1]. Corey is busy at the moment, so I'm picking
> it up to unblock everyone. The design is unchanged. The main thing v2
> fixes is the SFP deadlock Jonas reported, plus a couple of smaller points
> from the review.

net-next is closed during the merge window. We can merge the first
patch, tho, if you repost it separately for net, since it's a fix.
-- 
pw-bot: defer

^ permalink raw reply

* Re: [PATCH net v2] net: marvell: prestera: initialize err in prestera_port_sfp_bind
From: patchwork-bot+netdevbpf @ 2026-06-21 21:30 UTC (permalink / raw)
  To: Ruoyu Wang
  Cc: taras.chornyi, andrew+netdev, davem, edumazet, kuba, pabeni,
	linux, oleksandr.mazur, yevhen.orlov, netdev, linux-kernel
In-Reply-To: <20260617193228.1653582-1-ruoyuw560@gmail.com>

Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Thu, 18 Jun 2026 03:32:28 +0800 you wrote:
> prestera_port_sfp_bind() returns err after walking the ports node. If no
> child node matches the port's front-panel id, err is never assigned.
> 
> Initialize err to 0 because absence of a matching optional port device
> tree node is not an error. In that case no phylink is created and port
> creation should continue with port->phy_link left NULL. Errors from
> malformed matched nodes and phylink_create() still propagate.
> 
> [...]
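
A small userspace model of the behaviour described in the quoted message,
assuming a flat array in place of the device tree walk. toy_node and
toy_sfp_bind() are hypothetical. The return value starts at 0 so that "no
matching node" is a success, while a malformed matching node still reports
an error.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a child node under the ports node. */
struct toy_node {
	int fp_id;	/* front-panel id the node describes */
	int valid;	/* 0 models a malformed node */
};

/*
 * Walk the nodes looking for fp_id. *bound is set to 1 only when a valid
 * matching node was found. Returns 0 or a negative errno value.
 */
static int toy_sfp_bind(const struct toy_node *nodes, size_t n, int fp_id,
			int *bound)
{
	int err = 0;	/* a missing optional node is not an error */
	size_t i;

	assert(bound != NULL);
	*bound = 0;
	for (i = 0; i < n; i++) {
		if (nodes[i].fp_id != fp_id)
			continue;
		if (!nodes[i].valid) {
			err = -EINVAL;	/* malformed matched node */
			break;
		}
		*bound = 1;
		break;
	}
	return err;
}
```

Without the initializer, the no-match path would return whatever happened
to be in err.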

Here is the summary with links:
  - [net,v2] net: marvell: prestera: initialize err in prestera_port_sfp_bind
    https://git.kernel.org/netdev/net/c/62b01f72d93c

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply

* Re: [PATCH net-next v3] net: dsa: Fix skb ownership in taggers
From: Jakub Kicinski @ 2026-06-21 21:27 UTC (permalink / raw)
  To: linusw
  Cc: Jakub Kicinski, andrew, olteanv, davem, edumazet, pabeni, horms,
	florian.fainelli, jonas.gorski, hauke, kurt, woojung.huh,
	UNGLinuxDriver, chester.a.unal, daniel, matthias.bgg,
	angelogioacchino.delregno, wei.fang, xiaoning.wang, clement.leger,
	george.mccollister, mmyangfl, netdev, sashiko-bot
In-Reply-To: <20260617-dsa-fix-free-skb-v3-1-cdd4e0778a39@kernel.org>

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
---
net: dsa: Fix skb ownership in taggers

This patch moves SKB drop responsibility from dsa_user_xmit() and
dsa_switch_rcv() into the per-tagger ->xmit() and ->rcv() callbacks, so
that taggers consume the skb on their own error paths and the call sites
no longer free a returned NULL skb.

Closes: https://lore.kernel.org/r/20260610153952.1685895-1-kuba@kernel.org/

> diff --git a/net/dsa/tag.c b/net/dsa/tag.c
> index 79ad105902d9..cfc8f5a0cbd9 100644
> --- a/net/dsa/tag.c
> +++ b/net/dsa/tag.c
> @@ -84,10 +84,8 @@ static int dsa_switch_rcv(struct sk_buff *skb, struct net_device *dev,
>  		nskb = cpu_dp->rcv(skb, dev);
>  	}
>  
> -	if (!nskb) {
> -		kfree_skb(skb);
> +	if (!nskb)
>  		return 0;
> -	}
>  
>  	skb = nskb;
>  	skb_push(skb, ETH_HLEN);

[High]
Does the new contract that taggers consume the skb on error cover the
inline METADATA_HW_PORT_MUX branch in dsa_switch_rcv() as well?  That
branch is not a tagger callback, it is open-coded in dsa_switch_rcv():

	if (md_dst && md_dst->type == METADATA_HW_PORT_MUX) {
		unsigned int port = md_dst->u.port_info.port_id;

		skb_dst_drop(skb);
		if (!skb_has_extensions(skb))
			skb->slow_gro = 0;

		skb->dev = dsa_conduit_find_user(dev, 0, port);
		if (likely(skb->dev)) {
			dsa_default_offload_fwd_mark(skb);
			nskb = skb;
		}
	} else {
		nskb = cpu_dp->rcv(skb, dev);
	}

	if (!nskb)
		return 0;

When dsa_conduit_find_user() returns NULL (for example a port_id that
does not match any registered DSA user, such as the CPU port index, an
invalid id from the conduit driver, or a port being torn down), nskb
stays NULL and skb is not freed anywhere, so the new

	if (!nskb)
		return 0;

path looks like it leaks the skb on every such packet.

Pre-patch, the removed kfree_skb(skb) at the !nskb site covered this
case.  Should the metadata-mux branch free the skb itself when
skb->dev is NULL, or should the unconditional kfree_skb(skb) at the
!nskb site be kept for this path?

[ ... ]
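
To make the question above concrete, here is a userspace sketch of the
first option (the open-coded branch frees the buffer itself). It is not the
DSA code: toy_skb, toy_switch_rcv() and the other helpers are hypothetical,
and a live-buffer counter stands in for leak detection.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an skb; only the port id matters here. */
struct toy_skb {
	int port;
};

static int toy_skbs_alive;

static struct toy_skb *toy_skb_alloc(int port)
{
	struct toy_skb *skb = malloc(sizeof(*skb));

	if (skb) {
		skb->port = port;
		toy_skbs_alive++;
	}
	return skb;
}

static void toy_skb_free(struct toy_skb *skb)
{
	free(skb);
	toy_skbs_alive--;
}

/* Hypothetical tagger ->rcv(): consumes the skb itself on error. */
static struct toy_skb *toy_tagger_rcv(struct toy_skb *skb)
{
	if (skb->port < 0) {
		toy_skb_free(skb);
		return NULL;
	}
	return skb;
}

/* Hypothetical lookup: only ports 0..3 have a user interface. */
static int toy_find_user(int port)
{
	return port >= 0 && port < 4;
}

/* Returns 1 if the packet was delivered, 0 if it was dropped. */
static int toy_switch_rcv(struct toy_skb *skb, int hw_port_mux)
{
	struct toy_skb *nskb = NULL;

	assert(skb != NULL);
	if (hw_port_mux) {
		if (toy_find_user(skb->port))
			nskb = skb;
		else
			toy_skb_free(skb);	/* open-coded branch drops it itself */
	} else {
		nskb = toy_tagger_rcv(skb);
	}

	if (!nskb)
		return 0;	/* already consumed on every error path */

	toy_skb_free(nskb);	/* stands in for handing the packet up the stack */
	return 1;
}
```

Dropping the toy_skb_free() call in the mux branch leaves toy_skbs_alive
at 1 after a failed lookup, which mirrors the leak the review asks about.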
-- 
pw-bot: cr

^ permalink raw reply

* Re: [PATCH net v2 0/2] net: ethernet: sunplus: spl2sw: fix of_node refcount leaks
From: Jakub Kicinski @ 2026-06-21 20:22 UTC (permalink / raw)
  To: 呂芳騰
  Cc: Shitalkumar Gandhi, Andrew Lunn, David S. Miller, Eric Dumazet,
	Paolo Abeni, Simon Horman, netdev, linux-kernel,
	Shitalkumar Gandhi
In-Reply-To: <CAFnkrs=kE3thiFLaOULGv3n_KgR-r5T4vB7hxJsZL4iAihO31g@mail.gmail.com>

On Sun, 21 Jun 2026 12:38:06 +0800 呂芳騰 wrote:
> I'm sorry that I can't test the fix.
> I've left Sunplus and don't have the relevant hardware.

That makes things harder.. but you don't necessarily need HW to review
most of the patches. If you don't intend to serve as a maintainer of
the sunplus driver please send a patch to MAINTAINERS and step down.
Right now you are listed but don't seem to be fulfilling the duties.
Or please review the patches to the best of your ability without
testing.

^ permalink raw reply

* [PATCH net] selftests: drv-net: so_txtime: relax variance bounds
From: Willem de Bruijn @ 2026-06-21 20:01 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuba, edumazet, pabeni, horms, Willem de Bruijn

From: Willem de Bruijn <willemb@google.com>

The net-next-hw spinners on netdev.bots.linux.dev observe failing
so-txtime-py tests. A review of stdout shows most failures to be
due to exceeding the 4ms grace period. All I saw were within 8ms.
So increase to that.

Double the bounds from 4 to 8ms. This is still small enough to
differentiate the delays programmed by the test, 10 and 20ms.

Fixes: 5c6baef3885c ("selftests: drv-net: convert so_txtime to drv-net")
Reported-by: Jakub Kicinski <kuba@kernel.org>
Closes: https://lore.kernel.org/netdev/20260610170651.1b644001@kernel.org/
Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 tools/testing/selftests/drivers/net/so_txtime.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/drivers/net/so_txtime.c b/tools/testing/selftests/drivers/net/so_txtime.c
index 75f3beef13d9..55a386f3d1b9 100644
--- a/tools/testing/selftests/drivers/net/so_txtime.c
+++ b/tools/testing/selftests/drivers/net/so_txtime.c
@@ -37,7 +37,7 @@
 
 static int	cfg_clockid	= CLOCK_TAI;
 static uint16_t	cfg_port	= 8000;
-static int	cfg_variance_us	= 4000;
+static int	cfg_variance_us	= 8000;
 static bool	cfg_machine_slow;
 static uint64_t	cfg_start_time_ns;
 static int	cfg_mark;
-- 
2.55.0.rc0.799.gd6f94ed593-goog


^ permalink raw reply related

* Re: [PATCH nf-next v3] netfilter: TCPMSS: handle packets with unaligned MSS option
From: Pablo Neira Ayuso @ 2026-06-21 19:46 UTC (permalink / raw)
  To: Kacper Kokot
  Cc: netfilter-devel, kadlec, fmancera, fw, david.laight.linux,
	Phil Sutter, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, coreteam, netdev, linux-kernel
In-Reply-To: <20260621184934.75832-1-kacper.kokot.44@gmail.com>

On Sun, Jun 21, 2026 at 07:49:33PM +0100, Kacper Kokot wrote:
> RFC 9293 permits TCP options to begin on any octet boundary. Padding
> to a word boundary with NOPs is a sender convention, not a requirement,
> and robust receivers must handle unaligned options (MUST-64).
> 
> The xt_TCPMSS target's incremental checksum update assumes the MSS
> option is word-aligned. When it's not, the modified bytes straddle
> two checksum words and the resulting checksum is incorrect. The mangled
> packet may then fail checksum validation and be dropped downstream.
> That said, all mainstream stacks emit a word-aligned MSS, this change is
> motivated by spec conformance rather than a bug observed in the wild.
> 
> Extend the checksum update to handle unaligned MSS options. When the
> changed word is unaligned, the modified bytes b' and c' straddle two
> checksum words w1 and w2:
> 
>     | w1     | w2     |
> OLD |  a  b  |  c  d  |
> NEW |  a  b' |  c' d  |
> 
> The two-step update C' = C - w1 + w1' - w2 + w2' reduces algebraically
> to a single word incremental checksum update with byteswapped operands:
> 
>     C' = C - w1 - w2 + w1' + w2'
>        = C - (a * 2^8 + b)  - (c * 2^8 + d)
>            + (a * 2^8 + b') + (c' * 2^8 + d)
>        = C + 2^8 * (a - a + c' - c) + (b' - b + d - d)
>        = C + 2^8 * (c' - c) + (b' - b)
>        = C - (2^8 * c + b) + (2^8 * c' + b')
> 
> So the unaligned case adds no extra checksum operations.
> 
> Signed-off-by: Kacper Kokot <kacper.kokot.44@gmail.com>
> ---
> v3:
>  - Reframe as enhancement, not a fix (Pablo/Fernando)
>  - Rename subject to xt_TCPMSS, drop "fix" wording
>  - Reword commit message: packet may fail checksum validation and be
>    dropped downstream (Pablo)
>  - Target nf-next (Fernando)
>  - Use __be16 for csum_oldmss/csum_newmss (sparse warning from
>    kernel test robot)
>  - Reorder local variable declarations to reverse xmas tree (Fernando)
> 
> v2:
>  - Use get_unaligned_be16 (Fernando's suggestion)
>  - Fix alignment check expression (David)
>  - Mention it's a theoretical bug in the commit message
>  - Drop cc stable, the bug is only theoretical
> 
> diff --git a/net/netfilter/xt_TCPMSS.c b/net/netfilter/xt_TCPMSS.c
> index 80e1634bc51f..037add799d41 100644
> --- a/net/netfilter/xt_TCPMSS.c
> +++ b/net/netfilter/xt_TCPMSS.c
> @@ -116,9 +116,10 @@ tcpmss_mangle_packet(struct sk_buff *skb,
>  	opt = (u_int8_t *)tcph;
>  	for (i = sizeof(struct tcphdr); i <= tcp_hdrlen - TCPOLEN_MSS; i += optlen(opt, i)) {
>  		if (opt[i] == TCPOPT_MSS && opt[i+1] == TCPOLEN_MSS) {
> +			__be16 csum_oldmss, csum_newmss;
>  			u_int16_t oldmss;
>  
> -			oldmss = (opt[i+2] << 8) | opt[i+3];
> +			oldmss = get_unaligned_be16(&opt[i + 2]);
>  
>  			/* Never increase MSS, even when setting it, as
>  			 * doing so results in problems for hosts that rely
> @@ -130,8 +131,25 @@ tcpmss_mangle_packet(struct sk_buff *skb,
>  			opt[i+2] = (newmss & 0xff00) >> 8;
>  			opt[i+3] = newmss & 0x00ff;
>  
> +			csum_oldmss = htons(oldmss);
> +			csum_newmss = htons(newmss);
> +
> +			if (((char *)&opt[i + 2] - (char *)tcph) & 0x1) {
> +				/* MSS option is unaligned: the modified bytes
> +				 * straddle two checksum words. Byteswapping
> +				 * the operands lets a single incremental
> +				 * update produce the correct checksum delta
> +				 * (see commit message for the derivation).
> +				 */
> +				csum_oldmss = htons(swab16(oldmss));
> +				csum_newmss = htons(swab16(newmss));
> +			} else {
> +				csum_oldmss = htons(oldmss);
> +				csum_newmss = htons(newmss);
> +			}

After seeing this unaligned in other areas in the Netfilter tree, I am
not sure it is worth to add workarounds everywhere in this codebase to
deal with updates that span two 16-bits words for such a hypothetical
case like this.

By now, patches that call get_unaligned_be16() for correctness are OK
IMO. This is to deal with arches which cannot cope with unaligned
access. This will corrupt such rare packet but that it addresses the
unaligned splats.

If we start seeing real stacks which provide real unaligned access
like this, maybe by then we can revisit.

So I am leaning towards a small patches to introduce
get_unaligned_be16() and document that this corrupts packets with such
a rare unaligned TCP option.

IIRC, x86_64 has an inet checksum function that can deal with 1-byte
words, although other arches cannot do that and still need to operate
on 16-bit words. Given Linux is multi-arch, this all needs to stick to
16-bit word arithmetic when mangling packets.

Maybe in the future the checksum functions in every arch are updated
to deal with 1-byte word updates too, and maybe real stacks pop up
with such rare packets. But by then these ugly workarounds won't be
needed at all.
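
For reference, a minimal userspace sketch of the 16-bit word arithmetic
under discussion. It is not kernel code: all helper names are made up
for illustration, values are handled as host-order numbers of
big-endian words (so there is no htons()), and the checksum covers only
a small option buffer. It assumes the RFC 1624 incremental update and
shows why byteswapping both operands gives the right delta when the
rewritten field starts at an odd offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into a 16-bit one's-complement sum. */
static uint16_t fold(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Full checksum over big-endian 16-bit words; len is assumed even. */
static uint16_t csum_full(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	return (uint16_t)~fold(sum);
}

/* Incremental update, RFC 1624 eqn. 3: HC' = ~(~HC + ~m + m'). */
static uint16_t csum_replace16(uint16_t check, uint16_t old, uint16_t new_)
{
	uint32_t sum = (uint16_t)~check;

	sum += (uint16_t)~old;
	sum += new_;
	return (uint16_t)~fold(sum);
}

static uint16_t swap16(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}

/*
 * Rewrite the big-endian 16-bit field at byte offset off and return the
 * patched checksum. At an odd offset the high byte lands in the low half
 * of one checksum word and the low byte in the high half of the next, so
 * the field contributes swap16(value) to the sum.
 */
static uint16_t mangle16(uint8_t *buf, size_t off, uint16_t check,
			 uint16_t newval, int swap_if_odd)
{
	uint16_t oldval = (uint16_t)((buf[off] << 8) | buf[off + 1]);

	buf[off] = (uint8_t)(newval >> 8);
	buf[off + 1] = (uint8_t)(newval & 0xff);

	if ((off & 1) && swap_if_odd)
		return csum_replace16(check, swap16(oldval), swap16(newval));
	return csum_replace16(check, oldval, newval);
}

/* Returns 1 if the incremental result matches a full recomputation. */
static int check_offset(size_t off, int swap_if_odd)
{
	/* MSS 1460 at even offset 2; NOP, then MSS 1460 at odd offset 7. */
	uint8_t buf[12] = { 0x02, 0x04, 0x05, 0xb4, 0x01, 0x02,
			    0x04, 0x05, 0xb4, 0x01, 0x01, 0x00 };
	uint16_t check = csum_full(buf, sizeof(buf));

	check = mangle16(buf, off, check, 1400, swap_if_odd);
	return check == csum_full(buf, sizeof(buf));
}

/* Small self-check tying the three cases together. */
static void demo(void)
{
	assert(check_offset(2, 1));	/* aligned: plain update is fine */
	assert(check_offset(7, 1));	/* odd offset, swapped operands */
	assert(!check_offset(7, 0));	/* odd offset, plain update is wrong */
}
```

Calling demo() exercises the aligned case, the odd-offset case with
swapped operands, and the odd-offset case without the swap, which is
the one that leaves a bad checksum behind.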

> +
>  			inet_proto_csum_replace2(&tcph->check, skb,
> -						 htons(oldmss), htons(newmss),
> +						 csum_oldmss, csum_newmss,
>  						 false);
>  			return 0;
>  		}
> -- 
> 2.43.0
> 
> 

^ permalink raw reply

* Re: [REGRESSION 6.16] r8169 RTL8168h/8111h fails to probe — "Unable to change power state from D3cold to D0" — bisected to 4d4c10f763d7
From: Mario Limonciello @ 2026-06-21 19:24 UTC (permalink / raw)
  To: Thorsten Leemhuis, Josh Perry, bhelgaas
  Cc: hkallweit1, nic_swsd, rafael, linux-pci, netdev, regressions
In-Reply-To: <e8acc151-19f3-4823-83a1-e0906dd9f0f0@leemhuis.info>



On 6/17/26 01:32, Thorsten Leemhuis wrote:
> On 6/12/26 03:07, Josh Perry wrote:
>> #regzbot introduced: 4d4c10f763d7
>>
>> Since v6.16 one of two onboard RTL8168h/8111h NICs on this board fails
>> to probe on boot; the device drops to D3cold and the driver can't bring
>> it back:
> 
> FWIW, that commit is 4d4c10f763d780 ("PCI: Explicitly put devices into
> D0 when initializing") [v6.16-rc1] from Mario, who is already CCed, but
> looks like might be on holiday or something due to inactivity on the
> lists in the recent days. So it might take a few days before this moves on.
> 
> Josh, this is not my area of expertise, but there are two things I guess
> might be helpful:
> 
> * retry with 7.1
> * upload "dmesg" and "sudo lspci -vvv" output from working and broken
> kernels somewhere (like bugzilla.kernel.org).

Yes; please retry with mainline.  We already had multiple regressions 
from that commit fixed, so if you bisected down to this commit then it's 
plausible that there is already a fix.

> 
> Ciao, Thorsten
> 
>>    r8169 0000:02:00.0 eth0: RTL8168h/8111h, 00:2b:67:48:40:01, XID 541,
>> IRQ 137
>>    r8169 0000:04:00.0: Unable to change power state from D3cold to D0,
>> device inaccessible
>>    r8169 0000:04:00.0: Mem-Wr-Inval unavailable
>>    r8169 0000:04:00.0: error -EIO: PCI read failed
>>    r8169 0000:04:00.0: probe with driver r8169 failed with error -5
>>
>> The board has two identical RTL8168h NICs (both XID 541): 0000:02:00.0
>> and 0000:04:00.0. Only 04:00.0 fails — its sibling 02:00.0, on a
>> different root port, probes and works normally on the very same kernel
>> and boot. The failing NIC then does not appear (no enp4s0), taking the
>> machine's WAN offline. This strongly suggests the problem is port/
>> topology-specific rather than device- or driver-specific: the upstream
>> port behind 04:00.0 is placed in D3cold and the endpoint cannot be
>> resumed to D0.
>>
>> Hardware: RTL8168h/8111h, XID 541, PCI 04:00.0 (onboard 1GbE).
>> Platform: Lenovo ThinkCentre M90n-1 (11AHS0B200), BIOS M2AKT49A
>> (2026-03-25, latest available). Firmware is current, so this is not a
>> platform-firmware issue.
>>
>> Bisection: v6.15 good, v6.16 bad (verified by booting both). I then
>> reverted 4d4c10f763d7 ("PCI: Explicitly put devices into D0 when
>> initializing") together with its follow-up 907a7a2e5bf4 ("PCI/PM: Set up
>> runtime PM even for devices without PCI PM") on top of 6.16.7: the NIC
>> probes and links at 1Gbps/Full normally, with no workaround:
>>
>>    r8169 0000:04:00.0 eth1: RTL8168h/8111h, 00:2b:67:48:40:02, XID 541,
>> IRQ 138
>>    r8169 0000:04:00.0 enp4s0: Link is Up - 1Gbps/Full - flow control rx/tx
>>
>> Workaround: booting an unmodified v6.16+ kernel with pcie_port_pm=off
>> also restores the NIC, which is consistent with the upstream port being
>> placed in D3cold and the device failing to resume to D0 after the
>> explicit-D0 init change.
>>
>> The follow-up 907a7a2e5bf4 does not fix this resume case: v6.18.33 is
>> still affected (retested today on current firmware).
>>
>> Happy to test patches or provide full dmesg / lspci.
>>
> 


^ permalink raw reply

* [PATCH net-next RESEND v3 2/2] selftests: net: add FOU multicast encapsulation resubmit test
From: Anton Danilov @ 2026-06-21 19:04 UTC (permalink / raw)
  To: netdev
  Cc: Willem de Bruijn, davem, David Ahern, Eric Dumazet,
	Kuniyuki Iwashima, Jakub Kicinski, Paolo Abeni, Simon Horman,
	Shuah Khan, linux-kselftest
In-Reply-To: <cover.1782067871.git.littlesmilingcloud@gmail.com>

Add a selftest to verify that FOU-encapsulated packets addressed to a
multicast destination are correctly resubmitted to the inner protocol
handler (GRE) via the UDP multicast delivery path.

The test creates two network namespaces connected by a veth pair with
a FOU/GRETAP tunnel using a multicast remote address (239.0.0.1).
Ping is sent through the tunnel and received packets are counted on
the receiver's tunnel interface.

A static neighbor entry is configured on the sender because ARP
replies from the receiver cannot traverse the unidirectional multicast
tunnel back to the sender.

The early demux optimization (net.ipv4.ip_early_demux) is disabled on
the receiver to force packets through __udp4_lib_mcast_deliver(),
which is the code path being tested.

Signed-off-by: Anton Danilov <littlesmilingcloud@gmail.com>
Assisted-by: Claude:claude-opus-4-6
---
 tools/testing/selftests/net/Makefile          |   1 +
 .../testing/selftests/net/fou_mcast_encap.sh  | 112 ++++++++++++++++++
 2 files changed, 113 insertions(+)
 create mode 100755 tools/testing/selftests/net/fou_mcast_encap.sh

diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile
index 708d960ae07d..7e9ae937cffa 100644
--- a/tools/testing/selftests/net/Makefile
+++ b/tools/testing/selftests/net/Makefile
@@ -39,6 +39,7 @@ TEST_PROGS := \
 	fib_rule_tests.sh \
 	fib_tests.sh \
 	fin_ack_lat.sh \
+	fou_mcast_encap.sh \
 	fq_band_pktlimit.sh \
 	gre_gso.sh \
 	gre_ipv6_lladdr.sh \
diff --git a/tools/testing/selftests/net/fou_mcast_encap.sh b/tools/testing/selftests/net/fou_mcast_encap.sh
new file mode 100755
index 000000000000..8db9633f4c28
--- /dev/null
+++ b/tools/testing/selftests/net/fou_mcast_encap.sh
@@ -0,0 +1,112 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+#
+# Test that UDP encapsulation (FOU) correctly handles packet resubmit
+# when packets are delivered via the multicast UDP delivery path.
+#
+# When a FOU-encapsulated packet arrives with a multicast destination IP,
+# __udp4_lib_mcast_deliver() must resubmit it to the inner protocol
+# handler (e.g., GRE) rather than consuming it. This test verifies that
+# by creating a FOU/GRETAP tunnel with a multicast remote address and
+# sending ping through it.
+#
+# The early demux optimization can mask this issue by routing packets via
+# the unicast path (udp_unicast_rcv_skb), so we disable it to force
+# packets through __udp4_lib_mcast_deliver().
+
+source lib.sh
+
+NSENDER=""
+NRECV=""
+
+cleanup() {
+	cleanup_all_ns
+}
+
+trap cleanup EXIT
+
+setup() {
+	setup_ns NSENDER NRECV
+
+	ip link add veth_s type veth peer name veth_r
+	ip link set veth_s netns "$NSENDER"
+	ip link set veth_r netns "$NRECV"
+
+	ip -n "$NSENDER" addr add 10.0.0.1/24 dev veth_s
+	ip -n "$NSENDER" link set veth_s up
+
+	ip -n "$NRECV" addr add 10.0.0.2/24 dev veth_r
+	ip -n "$NRECV" link set veth_r up
+
+	# Disable early demux to force multicast delivery path
+	ip netns exec "$NRECV" sysctl -wq net.ipv4.ip_early_demux=0
+
+	# Join multicast group on receiver
+	ip -n "$NRECV" addr add 239.0.0.1/32 dev veth_r autojoin
+
+	# Multicast routes
+	ip -n "$NRECV" route add 239.0.0.0/8 dev veth_r
+	ip -n "$NSENDER" route add 239.0.0.0/8 dev veth_s
+
+	# Sender: GRETAP with FOU encap (no FOU listener needed on TX side)
+	ip -n "$NSENDER" link add eoudp0 type gretap \
+		remote 239.0.0.1 local 10.0.0.1 \
+		encap fou encap-sport 4797 encap-dport 4797 \
+		key 239.0.0.1
+	ip -n "$NSENDER" link set eoudp0 up
+	ip -n "$NSENDER" addr add 192.168.99.1/24 dev eoudp0
+
+	# Receiver: FOU listener + GRETAP
+	ip netns exec "$NRECV" ip fou add port 4797 ipproto 47
+	ip -n "$NRECV" link add eoudp0 type gretap \
+		remote 239.0.0.1 local 10.0.0.2 \
+		encap fou encap-sport 4797 encap-dport 4797 \
+		key 239.0.0.1
+	ip -n "$NRECV" link set eoudp0 up
+	ip -n "$NRECV" addr add 192.168.99.2/24 dev eoudp0
+
+	# Static neigh entry on sender: ARP replies cannot traverse the
+	# multicast tunnel back, so pre-populate the neighbor cache.
+	local recv_mac
+	recv_mac=$(ip -n "$NRECV" link show eoudp0 | awk '/ether/{print $2}')
+	ip -n "$NSENDER" neigh add 192.168.99.2 lladdr "$recv_mac" dev eoudp0
+}
+
+get_rx_packets() {
+	ip -n "$NRECV" -s link show eoudp0 | awk '/RX:/{getline; print $2}'
+}
+
+test_fou_mcast_encap() {
+	local count=100
+	local rx_before
+	local rx_after
+	local rx_delta
+
+	# Warmup: let any initial broadcast/ARP traffic settle
+	ip netns exec "$NSENDER" ping -c 1 -W 1 192.168.99.2 >/dev/null 2>&1
+	sleep 1
+
+	rx_before=$(get_rx_packets)
+	ip netns exec "$NSENDER" ping -c $count -W 1 192.168.99.2 >/dev/null 2>&1
+	sleep 1
+	rx_after=$(get_rx_packets)
+
+	rx_delta=$((rx_after - rx_before))
+
+	if [ "$rx_delta" -ge "$count" ]; then
+		echo "PASS: received $rx_delta/$count packets via multicast FOU/GRETAP"
+		return "$ksft_pass"
+	elif [ "$rx_delta" -gt 0 ]; then
+		echo "FAIL: only $rx_delta/$count packets received (partial delivery)"
+		return "$ksft_fail"
+	else
+		echo "FAIL: 0/$count packets received (multicast encap resubmit broken)"
+		return "$ksft_fail"
+	fi
+}
+
+echo "TEST: FOU/GRETAP multicast encapsulation resubmit"
+
+setup
+test_fou_mcast_encap
+exit $?
-- 
2.47.3


^ permalink raw reply related

* [PATCH net-next RESEND v3 1/2] udp: fix encapsulation packet resubmit in multicast deliver
From: Anton Danilov @ 2026-06-21 19:04 UTC (permalink / raw)
  To: netdev
  Cc: Willem de Bruijn, davem, David Ahern, Eric Dumazet,
	Kuniyuki Iwashima, Jakub Kicinski, Paolo Abeni, Simon Horman,
	Shuah Khan, linux-kselftest
In-Reply-To: <cover.1782067871.git.littlesmilingcloud@gmail.com>

When a UDP encapsulation socket (e.g., FOU) receives a multicast
packet, __udp4_lib_mcast_deliver() and __udp6_lib_mcast_deliver()
call consume_skb() when udp_queue_rcv_skb() returns a positive value.
A positive return value from udp_queue_rcv_skb() indicates that the
encap_rcv handler (e.g., fou_udp_recv) has consumed the UDP header
and wants the packet to be resubmitted to the IP protocol handler
for further processing (e.g., as a GRE packet).

The unicast path in udp_unicast_rcv_skb() handles this correctly by
returning -ret, which propagates up to ip_protocol_deliver_rcu() for
resubmission. However, the multicast path destroys the packet via
consume_skb() instead of resubmitting it, causing silent packet loss.

This affects any UDP encapsulation (FOU, GUE) combined with multicast
destination addresses.

Fix this by returning -ret instead of calling consume_skb() when the
return value is positive, matching the behavior of the unicast path.
This avoids growing the call stack compared to calling
ip_protocol_deliver_rcu() directly.

Signed-off-by: Anton Danilov <littlesmilingcloud@gmail.com>
Assisted-by: Claude:claude-opus-4-6
---
 net/ipv4/udp.c | 6 ++++--
 net/ipv6/udp.c | 6 ++++--
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 70f6cbd4ef73..b0910659391e 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -2475,6 +2475,7 @@ static int __udp4_lib_mcast_deliver(struct net *net, struct sk_buff *skb,
 	struct udp_hslot *hslot;
 	struct sk_buff *nskb;
 	bool use_hash2;
+	int ret;

 	hash2_any = 0;
 	hash2 = 0;
@@ -2519,8 +2520,9 @@ static int __udp4_lib_mcast_deliver(struct net *net, struct sk_buff *skb,
 	}

 	if (first) {
-		if (udp_queue_rcv_skb(first, skb) > 0)
-			consume_skb(skb);
+		ret = udp_queue_rcv_skb(first, skb);
+		if (ret > 0)
+			return -ret;
 	} else {
 		kfree_skb(skb);
 		__UDP_INC_STATS(net, UDP_MIB_IGNOREDMULTI);
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 15e032194ecc..ff2e389e286b 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -949,6 +949,7 @@ static int __udp6_lib_mcast_deliver(struct net *net, struct sk_buff *skb,
 	struct udp_hslot *hslot;
 	struct sk_buff *nskb;
 	bool use_hash2;
+	int ret;

 	hash2_any = 0;
 	hash2 = 0;
@@ -998,8 +999,9 @@ static int __udp6_lib_mcast_deliver(struct net *net, struct sk_buff *skb,
 	}

 	if (first) {
-		if (udpv6_queue_rcv_skb(first, skb) > 0)
-			consume_skb(skb);
+		ret = udpv6_queue_rcv_skb(first, skb);
+		if (ret > 0)
+			return -ret;
 	} else {
 		kfree_skb(skb);
 		__UDP6_INC_STATS(net, UDP_MIB_IGNOREDMULTI);
-- 
2.47.3
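
The return-value convention described in the commit message above can
be modelled with a toy sketch. Everything here is hypothetical and
heavily simplified (no sk_buff, no sockets; the toy_* names do not
exist in the kernel). It only illustrates the idea that a positive
return from the queue function means "resubmit", and that a negative
return from the handler makes the dispatcher, loosely modelled on
ip_protocol_deliver_rcu(), hand the packet to protocol -ret.

```c
#include <assert.h>

#define PROTO_UDP 17
#define PROTO_GRE 47

/* Counts packets that reached the inner (GRE) handler. */
static int gre_delivered;

/*
 * Stand-in for the queue function when an encap_rcv handler has
 * stripped the UDP header: a positive return names the IP protocol
 * the packet should be resubmitted to.
 */
static int toy_udp_queue_rcv(void)
{
	return PROTO_GRE;
}

/* Old multicast behaviour: a positive return just drops the packet. */
static int toy_mcast_deliver_old(void)
{
	if (toy_udp_queue_rcv() > 0)
		return 0;	/* consume_skb() in the real code */
	return 0;
}

/* Patched behaviour: propagate -ret so the caller can resubmit. */
static int toy_mcast_deliver_new(void)
{
	int ret = toy_udp_queue_rcv();

	if (ret > 0)
		return -ret;
	return 0;
}

/* Toy dispatcher: a negative handler return means "retry as -ret". */
static void toy_ip_deliver(int protocol, int (*udp_handler)(void))
{
	for (;;) {
		int ret;

		if (protocol == PROTO_GRE) {
			gre_delivered++;
			return;
		}
		if (protocol != PROTO_UDP)
			return;

		ret = udp_handler();
		if (ret >= 0)
			return;
		protocol = -ret;
	}
}

/* Deliver one UDP packet and report how many reached GRE. */
static int toy_run(int (*udp_handler)(void))
{
	gre_delivered = 0;
	toy_ip_deliver(PROTO_UDP, udp_handler);
	return gre_delivered;
}

/* Small self-check: old path loses the packet, new path delivers it. */
static void toy_demo(void)
{
	assert(toy_run(toy_mcast_deliver_old) == 0);
	assert(toy_run(toy_mcast_deliver_new) == 1);
}
```

With the old handler the inner protocol never sees the packet; with
the patched one it is delivered exactly once.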

^ permalink raw reply related
