Netdev List
* [PATCH net 0/2] octeontx2-af: Bug fixes for KPU profile and VF RX mode
From: nshettyj @ 2026-06-19  9:07 UTC
  To: netdev, linux-kernel
  Cc: sgoutham, lcherian, gakula, hkelam, sbhatta, andrew+netdev, davem,
	edumazet, kuba, pabeni, Sunil.Goutham, naveenm, hkalra,
	Nitin Shetty J

From: Nitin Shetty J <nshettyj@marvell.com>

This patch series contains two standalone bug fixes for the Octeontx2 
administrative function (AF) driver targeting the net branch.

The first patch resolves an issue where a VF changing its interface
state could inadvertently delete the RX promiscuous and all-multicast
MCAM rules belonging to the host PF.

The second patch addresses a spurious firmware loading warning by
switching to the non-warning variant of the firmware request API when
falling back to alternative loading methods.

Harman Kalra (2):
  octeontx2-af: fix VF bringup affecting PF promiscuous state
  octeontx2-af: suppress kpu profile loading warning

 drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c | 4 ++--
 drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

-- 
2.48.1



* [PATCH net 1/2] octeontx2-af: fix VF bringup affecting PF promiscuous state
From: nshettyj @ 2026-06-19  9:07 UTC
  To: netdev, linux-kernel
  Cc: sgoutham, lcherian, gakula, hkelam, sbhatta, andrew+netdev, davem,
	edumazet, kuba, pabeni, Sunil.Goutham, naveenm, hkalra,
	Nitin Shetty J
In-Reply-To: <20260619090746.1829416-1-nshettyj@marvell.com>

From: Harman Kalra <hkalra@marvell.com>

Mbox handling of nix_set_rx_mode for a VF with promiscuous and
all_multi flags set to false causes deletion of the PF's promiscuous
and allmulti MCAM rules. This occurs because the APIs that
enable/disable these rules operate only on the PF, even when the
mbox request is made via a VF interface.

Guard both rvu_npc_enable_allmulti_entry() and
rvu_npc_enable_promisc_entry() disable paths with an is_vf() check so
that a VF bringing up or tearing down its interface cannot inadvertently
clear the PF's MCAM rules.

Fixes: 967db3529eca ("octeontx2-af: add support for multicast/promisc packet replication feature")
Signed-off-by: Harman Kalra <hkalra@marvell.com>
Signed-off-by: Nitin Shetty J <nshettyj@marvell.com>
---
 drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c
index f977734ae712..f4c066aff371 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c
@@ -4548,7 +4548,7 @@ int rvu_mbox_handler_nix_set_rx_mode(struct rvu *rvu, struct nix_rx_mode *req,
 		rvu_npc_install_allmulti_entry(rvu, pcifunc, nixlf,
 					       pfvf->rx_chan_base);
 	} else {
-		if (!nix_rx_multicast)
+		if (!nix_rx_multicast && !is_vf(pcifunc))
 			rvu_npc_enable_allmulti_entry(rvu, pcifunc, nixlf, false);
 	}
 
@@ -4558,7 +4558,7 @@ int rvu_mbox_handler_nix_set_rx_mode(struct rvu *rvu, struct nix_rx_mode *req,
 					      pfvf->rx_chan_base,
 					      pfvf->rx_chan_cnt);
 	else
-		if (!nix_rx_multicast)
+		if (!nix_rx_multicast && !is_vf(pcifunc))
 			rvu_npc_enable_promisc_entry(rvu, pcifunc, nixlf, false);
 
 	return 0;
-- 
2.48.1
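
The guard described in the commit message can be illustrated with a small userspace sketch. This is not the driver code: the struct `pf_rules`, the function `rx_mode_disable_paths()` and the `from_vf` parameter (standing in for is_vf(pcifunc)) are hypothetical, and the `nix_rx_multicast` condition from the diff is left out for brevity.

```c
#include <stdbool.h>

/* Hypothetical model of the PF-owned MCAM rule state. */
struct pf_rules {
	bool promisc_enabled;
	bool allmulti_enabled;
};

/*
 * Models only the disable paths of the rx-mode handler after the patch.
 * A request with a flag cleared disables the matching rule only when
 * the request does not come from a VF.
 */
static void rx_mode_disable_paths(struct pf_rules *pf, bool from_vf,
				  bool want_promisc, bool want_allmulti)
{
	if (!want_allmulti && !from_vf)
		pf->allmulti_enabled = false;

	if (!want_promisc && !from_vf)
		pf->promisc_enabled = false;
}
```

In this model a VF request with both flags false leaves the PF state untouched, while the same request from the PF clears both rules.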



* [PATCH net 2/2] octeontx2-af: suppress kpu profile loading warning
From: nshettyj @ 2026-06-19  9:07 UTC
  To: netdev, linux-kernel
  Cc: sgoutham, lcherian, gakula, hkelam, sbhatta, andrew+netdev, davem,
	edumazet, kuba, pabeni, Sunil.Goutham, naveenm, hkalra,
	Nitin Shetty J
In-Reply-To: <20260619090746.1829416-1-nshettyj@marvell.com>

From: Harman Kalra <hkalra@marvell.com>

There are three ways in which a KPU profile can be loaded
(in high to low priority order):
1. profile image integrated in kernel image
2. firmware database method
3. default profile

In most cases the profile is loaded using the 2nd method, so the
preceding attempt at method 1 triggers a spurious warning from the Linux
firmware subsystem because no firmware is integrated into the kernel image.

Replace request_firmware_direct() with firmware_request_nowarn() to
suppress such warnings when no image is integrated into the kernel image.

Fixes: cf2437626502 ("octeontx2-af: suppress external profile loading warning")
Signed-off-by: Harman Kalra <hkalra@marvell.com>
Signed-off-by: Nitin Shetty J <nshettyj@marvell.com>
---
 drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c
index 4994385a822b..a2de7f1c6c22 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c
@@ -1966,7 +1966,7 @@ void npc_load_kpu_profile(struct rvu *rvu)
 	 * Firmware database method.
 	 * Default KPU profile.
 	 */
-	if (!request_firmware_direct(&fw, kpu_profile, rvu->dev)) {
+	if (!firmware_request_nowarn(&fw, kpu_profile, rvu->dev)) {
 		dev_info(rvu->dev, "Loading KPU profile from firmware: %s\n",
 			 kpu_profile);
 		rvu->kpu_fwdata = kzalloc(fw->size, GFP_KERNEL);
-- 
2.48.1
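
The priority order listed in the commit message can be sketched as a simple fallback chain. This is a userspace illustration only: the enum, the function `pick_kpu_source()` and its two flag parameters are hypothetical and do not mirror the driver's data structures.

```c
/* Hypothetical labels for the three sources named in the commit message. */
enum kpu_source {
	KPU_FROM_KERNEL_IMAGE,	/* method 1 */
	KPU_FROM_FW_DATABASE,	/* method 2 */
	KPU_DEFAULT,		/* method 3 */
};

/*
 * Tries the sources in high to low priority order. A miss at method 1
 * is an expected outcome here, not an error, which is why a warning on
 * that step would be noise.
 */
static enum kpu_source pick_kpu_source(int in_kernel_image, int in_fw_database)
{
	if (in_kernel_image)
		return KPU_FROM_KERNEL_IMAGE;
	if (in_fw_database)
		return KPU_FROM_FW_DATABASE;
	return KPU_DEFAULT;
}
```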



* Re: [PATCH bpf v3 2/2] selftests/bpf: Cover partial copy of non-linear test_run output
From: sun jian @ 2026-06-19  9:08 UTC
  To: Paul Chaignon
  Cc: bot+bpf-ci, bpf, netdev, linux-kselftest, linux-kernel, ast,
	daniel, andrii, martin.lau, eddyz87, memxor, song, yonghong.song,
	jolsa, davem, edumazet, kuba, pabeni, horms, shuah, hawk,
	john.fastabend, sdf, toke, lorenzo, martin.lau, clm,
	ihor.solodrai
In-Reply-To: <ajT3x2H0swzg4yhm@mail.gmail.com>

On Fri, Jun 19, 2026 at 4:03 PM Paul Chaignon <paul.chaignon@gmail.com> wrote:
>
> On Thu, Jun 18, 2026 at 06:45:18PM +0800, sun jian wrote:
> > On Thu, Jun 18, 2026 at 4:44 PM Paul Chaignon <paul.chaignon@gmail.com> wrote:
> > >
> > > On Wed, Jun 17, 2026 at 10:19:52PM +0800, sun jian wrote:
> > > > On Wed, Jun 17, 2026 at 6:31 PM <bot+bpf-ci@kernel.org> wrote:
>
> [...]
>
> > > > I tried reusing pkt_v4 and the existing TC program, but they do not fit
> > > > the skb case this test is trying to cover.
> > > >
> > > > For skb test_run, IPv4/IPv6 inputs with a too-short L3 header in the
> > > > linear area are rejected before bpf_test_finish(). With pkt_v4 and a
> > > > linear area of ETH_HLEN, the test fails with -EINVAL before reaching the
> > > > partial copy-out path. If the linear area is increased enough to pass the
> > > > IPv4 check, pkt_v4 is too small to both trigger the old
> > > > copy_size - frag_size path and verify that the copied prefix spans the
> > > > linear data and the first fragment. pkt_v6 has the same issue: after
> > > > making the IPv6 header linear, only 20 bytes remain in frags.
> > > >
> > > > The existing test_pkt_access program has its own packet-access coverage
> > > > goals and is not just a pass-through carrier. With such a short linear
> > > > area or small packet fixture, it can fail before the test hits
> > > > bpf_test_finish()'s partial copy-out path. A pass-through TC program is
> > > > therefore a better fit, because it keeps the test focused on the
> > > > bpf_test_finish() copy-out semantics.
> > >
> > > If we're keeping tc_pass_prog() then can't we use pkt_v4 and get rid of
> > > init_pkt?
> > >
> >
> > pkt_v4 is too small to construct a meaningful nonlinear skb with a stable
> > linear/frag split while still exercising the partial copy-out boundary in
> > bpf_test_finish().
> >
> > With pkt_v4, we either do not reach a fragmented layout, or lose control over
> > the linear/frag boundary needed to exercise the regression path.
>
> I think I'm missing something. Why can't we use pkt_v4 with
> tc_pass_prog() and a linear area of ETH_HLEN? That would leave 42 bytes
> of non-linear area, so a SHORT_OUT_LEN of 30 should work to trigger the
> bug, no?
>

I should have made that part clearer. With a linear area of ETH_HLEN, the
skb test_run path fails before reaching bpf_test_finish(). After
eth_type_trans(), skb->data is advanced past the Ethernet header, so the
remaining linear L3 area is 0. The short IPv4 input check then rejects the
packet with -EINVAL.

To pass that check with pkt_v4, the linear area has to include the IPv4
header as well, i.e. ETH_HLEN + sizeof(struct iphdr). That leaves only about
20 bytes in frags, so a SHORT_OUT_LEN of 30 no longer triggers the old
copy_size - frag_size < 0 path.
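
As a quick check of that arithmetic, here is a minimal sketch. The helper name is hypothetical, and the byte counts (42, about 20, and 30) are the ones quoted in this thread rather than values read from the test sources.

```c
#include <stdbool.h>

/*
 * True when the requested output length is smaller than the bytes held
 * in fragments, i.e. when copy_size - frag_size would be negative.
 */
static bool hits_negative_linear_len(int copy_size, int frag_size)
{
	return copy_size - frag_size < 0;
}
```

With 42 bytes in frags and an output length of 30 the condition holds; with about 20 bytes in frags it does not.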

> >
> > This test uses a 9000B packet so it does not depend on small-packet
> > allocation details. Smaller packets might work depending on allocation
> > state, but 9000B reliably gives us a non-linear skb with page frags and a
> > stable linear/frag boundary for the copy-out regression.
> >
> > init_pkt() is needed to ensure deterministic byte content across both linear
> > and fragmented regions so that the memcmp-based validation is stable.
> >
> > Thanks,
> > Sun Jian
> >
> >
> > > >
> > > > For XDP, this object does not have an existing xdp.frags pass-through
> > > > program, so the small XDP frags program is needed to cover the other
> > > > caller of the shared bpf_test_finish() path.
> > > >
> > > > Thanks,
> > > > Sun Jian


* Re: [PATCH net] octeontx2-af: npc: cn20k: Fix subbank free list indexing for search order
From: Simon Horman @ 2026-06-19  9:13 UTC
  To: rkannoth
  Cc: Simon Horman, kuba, linux-kernel, netdev, andrew+netdev, davem,
	edumazet, pabeni, sgoutham
In-Reply-To: <20260618035926.1490794-1-rkannoth@marvell.com>

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://netdev-ai.bots.linux.dev/sashiko/
---
octeontx2-af: npc: cn20k: Fix subbank free list indexing for search order

This change re-keys the xa_sb_free free list using each subbank's slot
in subbank_srch_order[] rather than using subbank_srch_order[sb->idx],
moving the population into a new helper called from npc_priv_init().
A few comments inline.

> diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
> index 354c4e881c6a..d38e848add93 100644
> --- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
> +++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
> @@ -3423,6 +3423,36 @@ static int npc_create_srch_order(int cnt)
>  	return 0;
>  }
>  
> +static int npc_subbanks_srch_oder_init(struct rvu *rvu)
                                ^^^^

[Low]
Is "oder" intended here? The neighbouring identifiers
npc_create_srch_order() and subbank_srch_order use "order", so this
appears to be a typo that would be easier to fix before the symbol
lands.

> +{
> +	struct npc_subbank *sb;
> +	int sb_idx;
> +	int i, j;
> +	int rc;
> +
> +	for (i = 0; i < npc_priv->num_subbanks; i++) {
> +		sb_idx = subbank_srch_order[i];
> +		sb = &npc_priv->sb[sb_idx];
> +		sb->arr_idx = i;
> +
> +		dev_dbg(rvu->dev, "%s: sb->idx=%u sb->arr_idx=%u\n",
> +			__func__, sb->idx, sb->arr_idx);
> +
> +		rc = xa_err(xa_store(&npc_priv->xa_sb_free, sb->arr_idx,
> +				     xa_mk_value(sb->idx), GFP_KERNEL));
> +		if (rc) {
> +			dev_err(rvu->dev,
> +				"%s: xa_store(xa_sb_free) failed at slot %d (sb=%d): %d\n",
> +				__func__, i, sb_idx, rc);
> +			for (j = 0; j < i; j++)
> +				xa_erase(&npc_priv->xa_sb_free, j);
> +			return rc;
> +		}
> +	}
> +
> +	return 0;
> +}
> +

[ ... ]

> @@ -4727,6 +4748,10 @@ static int npc_priv_init(struct rvu *rvu)
>  	for (i = 0, sb = npc_priv->sb; i < num_subbanks; i++, sb++)
>  		npc_subbank_init(rvu, sb, i);
>  
> +	ret = npc_subbanks_srch_oder_init(rvu);
> +	if (ret)
> +		goto fail2;
> +

[Medium]
Can this leak subbank_srch_order on the new failure path?

npc_create_srch_order() allocates subbank_srch_order via kcalloc().
Before this patch, the only way to reach fail2 was when
npc_create_srch_order() itself failed and left subbank_srch_order
NULL, so fail2 not freeing it was safe.

With this new goto fail2, subbank_srch_order has already been
allocated, but the fail2 label only frees npc_priv->sb and npc_priv:

fail2:
	xa_destroy(&npc_priv->xa_sb_used);
	[...]
	kfree(npc_priv->sb);
	npc_priv->sb = NULL;
fail1:
	kfree(npc_priv);
	npc_priv = NULL;
	return ret;

A subsequent re-init would unconditionally reassign
subbank_srch_order = kcalloc(...) in npc_create_srch_order(),
permanently leaking the previous buffer.

Should this path go to a new label that also kfree()s
subbank_srch_order, or should fail2 free it?

>  	/* Get number of pcifuncs in the system */
>  	npc_priv->pf_cnt = npc_pcifunc_map_create(rvu);
>  	npc_priv->xa_pf2idx_map = kcalloc(npc_priv->pf_cnt,

[High]
Can npc_priv_init() return 0 on a real failure here?

After npc_subbanks_srch_oder_init() returns successfully, ret is 0.
If the kcalloc() for npc_priv->xa_pf2idx_map fails right after, the
code does goto fail3 which falls through fail2 and fail1 to
return ret, propagating 0 to the caller:

fail3:
	kfree(subbank_srch_order);
	subbank_srch_order = NULL;

fail2:
	[...]
fail1:
	kfree(npc_priv);
	npc_priv = NULL;
	return ret;

The caller npc_cn20k_init() then sees err == 0 and continues into
npc_setup_mcam_section(), which dereferences npc_priv (now NULL).

Pre-patch the function ended in return -ENOMEM, so the fail3 path
returned -ENOMEM unconditionally. Should ret be reset to -ENOMEM
before goto fail3, or should a separate variable be used for the
new helper's return value?
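
The pattern in question can be reproduced in a small userspace sketch. The names `helper_ok()`, `init_buggy()` and `init_fixed()` are hypothetical, as is the `alloc_fails` switch that simulates a failed allocation. The fixed variant shows only one of the two options raised above.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper that succeeds, leaving ret == 0 in the caller. */
static int helper_ok(void)
{
	return 0;
}

/*
 * Buggy shape: a later allocation failure jumps to the shared exit
 * label while ret still holds 0 from the helper.
 */
static int init_buggy(int alloc_fails)
{
	void *map;
	int ret;

	ret = helper_ok();
	if (ret)
		goto fail;

	map = alloc_fails ? NULL : malloc(16);
	if (!map)
		goto fail;	/* ret is still 0 here */

	free(map);		/* the sketch does not keep the buffer */
	return 0;
fail:
	return ret;
}

/* One possible fix: set the error code before the jump. */
static int init_fixed(int alloc_fails)
{
	void *map;
	int ret;

	ret = helper_ok();
	if (ret)
		goto fail;

	map = alloc_fails ? NULL : malloc(16);
	if (!map) {
		ret = -ENOMEM;
		goto fail;
	}

	free(map);
	return 0;
fail:
	return ret;
}
```

With a simulated allocation failure, init_buggy() reports success while init_fixed() returns -ENOMEM.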

> @@ -4760,7 +4785,7 @@ static int npc_priv_init(struct rvu *rvu)
>  fail1:
>  	kfree(npc_priv);
>  	npc_priv = NULL;
> -	return -ENOMEM;
> +	return ret;
>  }


* Re: [PATCH net] ipv6: Fix null-ptr-deref in fib6_nh_mtu_change().
From: Fernando Fernandez Mancera @ 2026-06-19  9:13 UTC
  To: xmei5, dsahern, idosch, netdev
  Cc: davem, edumazet, pabeni, kuba, horms, bestswngs
In-Reply-To: <20260619045334.2427073-1-xmei5@asu.edu>

On 6/19/26 6:53 AM, xmei5@asu.edu wrote:
> From: Xiang Mei <xmei5@asu.edu>
> 
> fib6_nh_mtu_change() re-fetches idev via __in6_dev_get(arg->dev) and
> dereferences idev->cnf.mtu6 without a NULL check. addrconf_ifdown()
> clears dev->ip6_ptr with RCU_INIT_POINTER() after rt6_disable_ip() has
> released tb6_lock, so the RA-driven MTU walk can observe a NULL idev and
> oops. The caller rt6_mtu_change_route() guards its own __in6_dev_get(),
> but this re-fetch is unguarded; nexthop-backed routes survive
> addrconf_ifdown()'s flush, so the walk still reaches it after ip6_ptr is
> nulled.
> 
> Return 0 when idev is NULL, matching rt6_mtu_change_route() and the
> fib6_mtu() fix in commit 5ad509c1fdad ("ipv6: Fix null-ptr-deref in
> fib6_mtu().").
> 
>    Oops: general protection fault, ... KASAN: null-ptr-deref in range
>          [0x00000000000002a8-0x00000000000002af]
>    RIP: 0010:fib6_nh_mtu_change+0x203/0x990
>     rt6_mtu_change_route+0x141/0x1d0
>     __fib6_clean_all+0xd0/0x160
>     rt6_mtu_change+0xb4/0x100
>     ndisc_router_discovery+0x24b5/0x2cb0
>     icmpv6_rcv+0x12e9/0x1710
>     ipv6_rcv+0x39b/0x410
> 
> Fixes: c0b220cf7d80 ("ipv6: Refactor exception functions")
> Reported-by: Weiming Shi <bestswngs@gmail.com>
> Assisted-by: Claude:claude-opus-4-8
> Signed-off-by: Xiang Mei <xmei5@asu.edu>

Reviewed-by: Fernando Fernandez Mancera <fmancera@suse.de>

Thanks!


* Re: [PATCH net] net: usb: lan78xx: restore VLAN filter table after device reset
From: Sven Schuchmann @ 2026-06-19  9:18 UTC
  To: Nicolai Buchwitz
  Cc: Thangaraj Samynathan, Rengarajan Sundararajan,
	UNGLinuxDriver@microchip.com, Woojung.Huh@microchip.com,
	Andrew Lunn, David S . Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, netdev@vger.kernel.org, linux-usb@vger.kernel.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <6d638042ad07a8bfb09ff3ee923e7b3f@tipi-net.de>

Hello Nicolai,

My first observation is that calling lan78xx_write_vlan_table()
at the end of lan78xx_start_rx_path() fixes the problem. I was able
to do over 200 connects/disconnects without any problem.

On 19.6.2026 10:30, Nicolai Buchwitz wrote:
> [...]
> 
> Can you please try with the following changes to lan78xx_mac_link_down()?
> 
>         ret = lan78xx_stop_rx_path(dev);
>         if (ret < 0)
>                 goto link_down_fail;
> 
> -       /* MAC reset seems to not affect MAC configuration, no idea if it is
> -        * really needed, but it was done in previous driver version. So, leave
> -        * it here.
> -        */
> -       ret = lan78xx_mac_reset(dev);
> -       if (ret < 0)
> -               goto link_down_fail;
> -
>         return;

Actually, with the call to lan78xx_mac_reset() removed from
lan78xx_mac_link_down(), the problem still persists.
(Note: lan78xx_mac_reset() is then never called at all!)

See over here:

[Fri Jun 19 11:06:37 2026] Connect LAN Cable
[Fri Jun 19 11:06:38 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: lan78xx_mac_link_up()
[Fri Jun 19 11:06:38 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: lan78xx_configure_flowcontrol()
[Fri Jun 19 11:06:38 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: lan78xx_configure_usb()
[Fri Jun 19 11:06:38 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: start tx path
[Fri Jun 19 11:06:38 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: start rx path
[Fri Jun 19 11:06:38 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: VLAN TABLE 0: 0x05 0x05 - Ok
[Fri Jun 19 11:06:38 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: Link is Up - 100Mbps/Full - flow control off
[Fri Jun 19 11:06:38 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: lan78xx_mac_link_up()
[Fri Jun 19 11:06:38 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: lan78xx_configure_flowcontrol()
[Fri Jun 19 11:06:38 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: lan78xx_configure_usb()
[Fri Jun 19 11:06:38 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: start tx path
[Fri Jun 19 11:06:38 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: start rx path
[Fri Jun 19 11:06:38 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: VLAN TABLE 0: 0x05 0x05 - Ok
[Fri Jun 19 11:06:38 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: Link is Up - 100Mbps/Full - flow control off
[Fri Jun 19 11:06:41 2026] Disconnect LAN Cable
[Fri Jun 19 11:06:41 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: lan78xx_mac_link_down()
[Fri Jun 19 11:06:41 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: stop tx path
[Fri Jun 19 11:06:41 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: stop rx path
[Fri Jun 19 11:06:41 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: Link is Down
[Fri Jun 19 11:06:42 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: lan78xx_mac_link_down()
[Fri Jun 19 11:06:42 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: stop tx path
[Fri Jun 19 11:06:42 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: stop rx path
[Fri Jun 19 11:06:42 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: Link is Down
[Fri Jun 19 11:06:43 2026] Connect LAN Cable
[Fri Jun 19 11:06:43 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: lan78xx_mac_link_up()
[Fri Jun 19 11:06:43 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: lan78xx_configure_flowcontrol()
[Fri Jun 19 11:06:43 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: lan78xx_configure_usb()
[Fri Jun 19 11:06:43 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: start tx path
[Fri Jun 19 11:06:43 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: start rx path
[Fri Jun 19 11:06:43 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: VLAN TABLE 0: 0x05 0x05 - Ok
[Fri Jun 19 11:06:43 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: Link is Up - 100Mbps/Full - flow control off
[Fri Jun 19 11:06:43 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: lan78xx_mac_link_up()
[Fri Jun 19 11:06:43 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: lan78xx_configure_flowcontrol()
[Fri Jun 19 11:06:43 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: lan78xx_configure_usb()
[Fri Jun 19 11:06:43 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: start tx path
[Fri Jun 19 11:06:43 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: start rx path
[Fri Jun 19 11:06:43 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: VLAN TABLE 0: 0x05 0x05 - Ok
[Fri Jun 19 11:06:43 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: Link is Up - 100Mbps/Full - flow control off
[Fri Jun 19 11:06:45 2026] Disconnect LAN Cable
[Fri Jun 19 11:06:46 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: lan78xx_mac_link_down()
[Fri Jun 19 11:06:46 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: stop tx path
[Fri Jun 19 11:06:46 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: stop rx path
[Fri Jun 19 11:06:46 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: Link is Down
[Fri Jun 19 11:06:46 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: lan78xx_mac_link_down()
[Fri Jun 19 11:06:46 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: stop tx path
[Fri Jun 19 11:06:46 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: stop rx path
[Fri Jun 19 11:06:46 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: Link is Down
[Fri Jun 19 11:06:47 2026] Connect LAN Cable
[Fri Jun 19 11:06:47 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: lan78xx_mac_link_up()
[Fri Jun 19 11:06:47 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: lan78xx_configure_flowcontrol()
[Fri Jun 19 11:06:47 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: lan78xx_configure_usb()
[Fri Jun 19 11:06:47 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: start tx path
[Fri Jun 19 11:06:47 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: start rx path
[Fri Jun 19 11:06:47 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: VLAN TABLE 0: 0x05 0x05 - Ok
[Fri Jun 19 11:06:47 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: Link is Up - 100Mbps/Full - flow control off
[Fri Jun 19 11:06:48 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: lan78xx_mac_link_up()
[Fri Jun 19 11:06:48 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: lan78xx_configure_flowcontrol()
[Fri Jun 19 11:06:48 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: lan78xx_configure_usb()
[Fri Jun 19 11:06:48 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: start tx path
[Fri Jun 19 11:06:48 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: start rx path
[Fri Jun 19 11:06:48 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: VLAN TABLE 0: 0x05 0x05 - Ok
[Fri Jun 19 11:06:48 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: Link is Up - 100Mbps/Full - flow control off
[Fri Jun 19 11:06:49 2026] Disconnect LAN Cable
[Fri Jun 19 11:06:50 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: lan78xx_mac_link_down()
[Fri Jun 19 11:06:50 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: stop tx path
[Fri Jun 19 11:06:50 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: stop rx path
[Fri Jun 19 11:06:50 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: Link is Down
[Fri Jun 19 11:06:50 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: lan78xx_mac_link_down()
[Fri Jun 19 11:06:50 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: stop tx path
[Fri Jun 19 11:06:50 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: stop rx path
[Fri Jun 19 11:06:50 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: Link is Down
[Fri Jun 19 11:06:52 2026] Connect LAN Cable
[Fri Jun 19 11:06:52 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: lan78xx_mac_link_up()
[Fri Jun 19 11:06:52 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: lan78xx_configure_flowcontrol()
[Fri Jun 19 11:06:52 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: lan78xx_configure_usb()
[Fri Jun 19 11:06:52 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: start tx path
[Fri Jun 19 11:06:52 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: start rx path
[Fri Jun 19 11:06:52 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: VLAN TABLE 0: 0x05 0x05 - Ok
[Fri Jun 19 11:06:52 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: Link is Up - 100Mbps/Full - flow control off
[Fri Jun 19 11:06:52 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: lan78xx_mac_link_up()
[Fri Jun 19 11:06:52 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: lan78xx_configure_flowcontrol()
[Fri Jun 19 11:06:52 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: lan78xx_configure_usb()
[Fri Jun 19 11:06:52 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: start tx path
[Fri Jun 19 11:06:52 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: start rx path
[Fri Jun 19 11:06:52 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: VLAN TABLE 0: 0x05 0x05 - Ok
[Fri Jun 19 11:06:52 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: Link is Up - 100Mbps/Full - flow control off
[Fri Jun 19 11:06:54 2026] Disconnect LAN Cable
[Fri Jun 19 11:06:54 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: lan78xx_mac_link_down()
[Fri Jun 19 11:06:54 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: stop tx path
[Fri Jun 19 11:06:54 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: stop rx path
[Fri Jun 19 11:06:54 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: Link is Down
[Fri Jun 19 11:06:54 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: lan78xx_mac_link_down()
[Fri Jun 19 11:06:54 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: stop tx path
[Fri Jun 19 11:06:54 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: stop rx path
[Fri Jun 19 11:06:54 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: Link is Down
[Fri Jun 19 11:06:56 2026] Connect LAN Cable
[Fri Jun 19 11:06:56 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: lan78xx_mac_link_up()
[Fri Jun 19 11:06:56 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: lan78xx_configure_flowcontrol()
[Fri Jun 19 11:06:56 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: lan78xx_configure_usb()
[Fri Jun 19 11:06:56 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: start tx path
[Fri Jun 19 11:06:56 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: start rx path
[Fri Jun 19 11:06:56 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: VLAN TABLE 0: 0x05 0x05 - Ok
[Fri Jun 19 11:06:56 2026] lan78xx 1-1.2:1.0 100BASE-T1-2: Link is Up - 100Mbps/Full - flow control off
[Fri Jun 19 11:06:56 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: lan78xx_mac_link_up()
[Fri Jun 19 11:06:56 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: lan78xx_configure_flowcontrol()
[Fri Jun 19 11:06:56 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: lan78xx_configure_usb()
[Fri Jun 19 11:06:56 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: start tx path
[Fri Jun 19 11:06:56 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: start rx path
[Fri Jun 19 11:06:56 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: VLAN TABLE 0: 0x05 0x00 - ERROR
[Fri Jun 19 11:06:56 2026] lan78xx 1-1.1:1.0 100BASE-T1-1: Link is Up - 100Mbps/Full - flow control off

Best Regards,

   Sven


* [PATCH 5.10] netdevsim: Fix memory leak of nsim_dev->fa_cookie
From: Mikhail Dmitrichenko @ 2026-06-19  9:15 UTC
  To: stable, Greg Kroah-Hartman
  Cc: Mikhail Dmitrichenko, Jakub Kicinski, David S. Miller, Jiri Pirko,
	Ido Schimmel, netdev, linux-kernel, Andrew Lunn, Eric Dumazet,
	Paolo Abeni, Jiri Pirko, lvc-project, Wang Yufen

From: Wang Yufen <wangyufen@huawei.com>

commit 064bc7312bd09a48798418663090be0c776183db upstream.

kmemleak reports this issue:

unreferenced object 0xffff8881bac872d0 (size 8):
  comm "sh", pid 58603, jiffies 4481524462 (age 68.065s)
  hex dump (first 8 bytes):
    04 00 00 00 de ad be ef                          ........
  backtrace:
    [<00000000c80b8577>] __kmalloc+0x49/0x150
    [<000000005292b8c6>] nsim_dev_trap_fa_cookie_write+0xc1/0x210 [netdevsim]
    [<0000000093d78e77>] full_proxy_write+0xf3/0x180
    [<000000005a662c16>] vfs_write+0x1c5/0xaf0
    [<000000007aabf84a>] ksys_write+0xed/0x1c0
    [<000000005f1d2e47>] do_syscall_64+0x3b/0x90
    [<000000006001c6ec>] entry_SYSCALL_64_after_hwframe+0x63/0xcd

The issue occurs in the following scenarios:

nsim_dev_trap_fa_cookie_write()
  kmalloc() fa_cookie
  nsim_dev->fa_cookie = fa_cookie
..
nsim_drv_remove()

The fa_cookie allocated in nsim_dev_trap_fa_cookie_write() is not freed. To
fix, add kfree(nsim_dev->fa_cookie) to nsim_drv_remove().

Fixes: d3cbb907ae57 ("netdevsim: add ACL trap reporting cookie as a metadata")
Signed-off-by: Wang Yufen <wangyufen@huawei.com>
Cc: Jiri Pirko <jiri@mellanox.com>
Link: https://lore.kernel.org/r/1668504625-14698-1-git-send-email-wangyufen@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
[ The context change is due to the commit 5e388f3dc38c
("netdevsim: move vfconfig to nsim_dev") in v5.16
which is irrelevant to the logic of this patch. ]
Signed-off-by: Mikhail Dmitrichenko <mdmitrichenko@astralinux.ru>
---
Backport fix for CVE-2022-49803
 drivers/net/netdevsim/dev.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/netdevsim/dev.c b/drivers/net/netdevsim/dev.c
index c8834ea84732..a106365ce485 100644
--- a/drivers/net/netdevsim/dev.c
+++ b/drivers/net/netdevsim/dev.c
@@ -1173,6 +1173,7 @@ void nsim_dev_remove(struct nsim_bus_dev *nsim_bus_dev)
 				  ARRAY_SIZE(nsim_devlink_params));
 	devlink_unregister(devlink);
 	devlink_resources_unregister(devlink, NULL);
+	kfree(nsim_dev->fa_cookie);
 	devlink_free(devlink);
 }
 
-- 
2.47.3
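
The ownership rule behind this fix can be shown with a userspace sketch. Everything here is hypothetical: `struct sim_dev`, `sim_cookie_write()` and `sim_dev_remove()` only model the allocate-on-write, free-on-remove pairing, and the sketch assumes the write path releases any previously stored cookie.

```c
#include <stdlib.h>
#include <string.h>

struct sim_dev {
	unsigned char *fa_cookie;	/* owned by the device */
};

/* Models the write path: allocate a new cookie and store it. */
static int sim_cookie_write(struct sim_dev *dev, const void *buf, size_t len)
{
	unsigned char *cookie = malloc(len);

	if (!cookie)
		return -1;
	memcpy(cookie, buf, len);
	free(dev->fa_cookie);	/* assumed: the old cookie is released */
	dev->fa_cookie = cookie;
	return 0;
}

/* Models the remove path: without this free the last cookie leaks. */
static void sim_dev_remove(struct sim_dev *dev)
{
	free(dev->fa_cookie);
	dev->fa_cookie = NULL;
}
```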


* Re: [PATCH net 1/6] ipv6: fix error handling in disable_ipv6 sysctl
From: Nicolas Dichtel @ 2026-06-19  9:33 UTC
  To: Fernando Fernandez Mancera, netdev
  Cc: shemminger, dforster, gospo, ddutt, brian.haley, horms, pabeni,
	kuba, edumazet, davem, idosch, dsahern
In-Reply-To: <20260618162225.4588-2-fmancera@suse.de>

On 18/06/2026 at 18:22, Fernando Fernandez Mancera wrote:
> When writing to the disable_ipv6 sysctl, if proc_dointvec() fails to
> parse the input, it returns a negative error code. The current
> implementation is overwriting that error for write operations.
> 
> This results in a silent failure, it returns a successful write although
> the configuration was not modified at all. When modifying the "all"
> variant it can also modify the configuration of existing interfaces to
> the wrong value.
> 
> Fix this by checking the return value of proc_dointvec() and returning
> early on failure.
> 
> Fixes: 56d417b12e57 ("IPv6: Add 'autoconf' and 'disable_ipv6' module parameters")
> Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Reviewed-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
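
The failure mode described in the commit message can be sketched in userspace. The names are hypothetical: `parse_int()` stands in for proc_dointvec(), `apply()` for the per-sysctl update step, and the two handlers only model the control flow, not the real sysctl handler signature.

```c
#include <errno.h>
#include <stdlib.h>

/* Stand-in parser: 0 on success, -EINVAL when input is not an integer. */
static int parse_int(const char *s, int *out)
{
	char *end;
	long v = strtol(s, &end, 10);

	if (end == s || *end != '\0')
		return -EINVAL;
	*out = (int)v;
	return 0;
}

/* Stand-in for the step that applies the new value. */
static int apply(int *setting, int val)
{
	*setting = val;
	return 0;
}

/* Buggy shape: the parse error is overwritten on the write path. */
static int handler_buggy(const char *input, int *setting)
{
	int val = *setting;
	int ret = parse_int(input, &val);

	ret = apply(setting, val);
	return ret;
}

/* Fixed shape: return early when parsing failed. */
static int handler_fixed(const char *input, int *setting)
{
	int val = *setting;
	int ret = parse_int(input, &val);

	if (ret)
		return ret;
	return apply(setting, val);
}
```

For the input "abc", handler_buggy() reports success while handler_fixed() returns -EINVAL.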


* Re: [PATCH net 2/6] ipv6: fix error handling in ignore_routes_with_linkdown sysctl
From: Nicolas Dichtel @ 2026-06-19  9:34 UTC
  To: Fernando Fernandez Mancera, netdev
  Cc: shemminger, dforster, gospo, ddutt, brian.haley, horms, pabeni,
	kuba, edumazet, davem, idosch, dsahern
In-Reply-To: <20260618162225.4588-3-fmancera@suse.de>

On 18/06/2026 at 18:22, Fernando Fernandez Mancera wrote:
> When writing to the ignore_routes_with_linkdown sysctl, if
> proc_dointvec() fails to parse the input, it returns a negative error
> code. The current implementation is overwriting that error for write
> operations.
> 
> This results in a silent failure, it returns a successful write although
> the configuration was not modified at all. When modifying the "all"
> variant it can also modify the configuration of existing interfaces to
> the wrong value.
> 
> Fix this by checking the return value of proc_dointvec() and returning
> early on failure.
> 
> Fixes: 35103d11173b ("net: ipv6 sysctl option to ignore routes when nexthop link is down")
Reviewed-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>


* Re: [PATCH net 1/2] net: airoha: Fix off-by-one in airoha_tc_remove_htb_queue()
From: Simon Horman @ 2026-06-19  9:34 UTC
  To: Lorenzo Bianconi
  Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Wayen Yan, linux-arm-kernel, linux-mediatek, netdev
In-Reply-To: <20260618-airoha-qos-fixes-v1-1-37192652157f@kernel.org>

On Thu, Jun 18, 2026 at 08:00:29AM +0200, Lorenzo Bianconi wrote:
> airoha_tc_htb_alloc_leaf_queue() computes the HTB QoS channel index
> as opt->classid % AIROHA_NUM_QOS_CHANNELS and stores it in qos_sq_bmap.
> However, airoha_tc_remove_htb_queue() clears the HTB configuration
> using queue + 1 as the channel index, causing an off-by-one error.
> Use queue directly as the QoS channel index to match the allocation
> logic.
> 
> Fixes: ef1ca9271313b ("net: airoha: Add sched HTB offload support")
> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>

Reviewed-by: Simon Horman <horms@kernel.org>
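
The mismatch between the allocation and removal index can be modelled in a few lines. This is a sketch under assumptions: `NUM_QOS_CHANNELS` is set to 4 for illustration, and `struct qos_state` with its per-channel `rate[]` array is a hypothetical stand-in for the HTB configuration.

```c
#define NUM_QOS_CHANNELS 4	/* assumed value for the sketch */

/* Hypothetical per-channel configuration; 0 means "not configured". */
struct qos_state {
	unsigned int rate[NUM_QOS_CHANNELS];
};

/* Allocation: the channel index is classid modulo the channel count. */
static int alloc_leaf(struct qos_state *s, unsigned int classid,
		      unsigned int rate)
{
	int channel = classid % NUM_QOS_CHANNELS;

	s->rate[channel] = rate;
	return channel;
}

/* Buggy removal: clears the neighbouring channel (queue + 1). */
static void remove_leaf_buggy(struct qos_state *s, int queue)
{
	int channel = queue + 1;

	if (channel < NUM_QOS_CHANNELS)	/* bounds check for the sketch only */
		s->rate[channel] = 0;
}

/* Fixed removal: uses the same index as the allocation. */
static void remove_leaf_fixed(struct qos_state *s, int queue)
{
	s->rate[queue] = 0;
}
```

After alloc_leaf() with classid 6 the configured channel is 2. The buggy removal touches channel 3 and leaves channel 2 configured; the fixed removal clears channel 2.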



* Re: [PATCH net 3/6] ipv6: fix error handling in forwarding sysctl
From: Nicolas Dichtel @ 2026-06-19  9:34 UTC
  To: Fernando Fernandez Mancera, netdev
  Cc: shemminger, dforster, gospo, ddutt, brian.haley, horms, pabeni,
	kuba, edumazet, davem, idosch, dsahern
In-Reply-To: <20260618162225.4588-4-fmancera@suse.de>

On 18/06/2026 at 18:22, Fernando Fernandez Mancera wrote:
> When writing to the forwarding sysctl, if proc_dointvec() fails to parse
> the input, it returns a negative error code. The current implementation
> is overwriting that error for write operations.
> 
> This results in a silent failure, it returns a successful write although
> the configuration was not modified at all. When modifying the "all"
> variant it can also modify the configuration of existing interfaces to
> the wrong value.
> 
> Fix this by checking the return value of proc_dointvec() and returning
> early on failure.
> 
> Fixes: b325fddb7f86 ("ipv6: Fix sysctl unregistration deadlock")
The bug predates the git history.
Maybe use this tag instead:
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
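
For readers following the thread, the failure mode in the quoted commit
message can be sketched in userspace C. This is only an illustration under
stated assumptions: parse_int() is a hypothetical stand-in for
proc_dointvec(), and handler_buggy()/handler_fixed() are invented names,
not the functions in net/ipv6/addrconf.c.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical stand-in for proc_dointvec(): parse buf into *val.
 * Returns 0 on success or -EINVAL on malformed input; *val is left
 * untouched on failure.
 */
static int parse_int(const char *buf, int *val)
{
	char *end;
	long v;

	errno = 0;
	v = strtol(buf, &end, 10);
	if (errno || end == buf || *end != '\0' || v < INT_MIN || v > INT_MAX)
		return -EINVAL;
	*val = (int)v;
	return 0;
}

/* Buggy shape: the parser's error is overwritten on the write path,
 * so the caller sees success although nothing was modified.
 */
static int handler_buggy(const char *buf, int *val)
{
	int ret = parse_int(buf, val);

	ret = 0; /* the error from parse_int() is lost here */
	return ret;
}

/* Fixed shape: check the return value and bail out early. */
static int handler_fixed(const char *buf, int *val)
{
	int ret = parse_int(buf, val);

	if (ret)
		return ret;
	/* ... side effects of the new value would be applied here ... */
	return 0;
}
```

With malformed input such as "abc", the buggy shape reports success while
the value stays unchanged, whereas the fixed shape returns -EINVAL.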


^ permalink raw reply

* Re: [PATCH net 2/2] net: airoha: fix netif_set_real_num_tx_queues for sparse QoS channels
From: Simon Horman @ 2026-06-19  9:35 UTC (permalink / raw)
  To: Lorenzo Bianconi
  Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Wayen Yan, linux-arm-kernel, linux-mediatek, netdev
In-Reply-To: <20260618-airoha-qos-fixes-v1-2-37192652157f@kernel.org>

On Thu, Jun 18, 2026 at 08:00:30AM +0200, Lorenzo Bianconi wrote:
> airoha_tc_htb_alloc_leaf_queue() assigns queue IDs based on the channel
> index (opt->qid = AIROHA_NUM_TX_RING + channel), but updates
> real_num_tx_queues with a simple increment (num_tx_queues + 1). When QoS
> channels are allocated sparsely (e.g., channels 0 and 3 without 1 and
> 2), the returned qid can exceed real_num_tx_queues, causing out-of-bounds
> accesses in the networking stack.
> For example, allocating channel 0 then channel 3 results in
> real_num_tx_queues = 34 but qid = 35, which is out of range [0, 34).
> Fix this by computing real_num_tx_queues based on the highest active
> channel index rather than using a simple counter, in both the allocation
> and deletion paths.
> 
> Fixes: ef1ca9271313b ("net: airoha: Add sched HTB offload support")
> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> ---
>  drivers/net/ethernet/airoha/airoha_eth.c | 15 ++++++++++++---
>  1 file changed, 12 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ethernet/airoha/airoha_eth.c b/drivers/net/ethernet/airoha/airoha_eth.c

...

> @@ -2806,7 +2806,10 @@ static int airoha_tc_htb_alloc_leaf_queue(struct net_device *netdev,
>  	if (err)
>  		goto error;
>  
> -	err = netif_set_real_num_tx_queues(netdev, num_tx_queues + 1);
> +	if (num_tx_queues <= netdev->real_num_tx_queues)
> +		goto set_qos_sq_bmap;
> +
> +	err = netif_set_real_num_tx_queues(netdev, num_tx_queues);
>  	if (err) {
>  		airoha_qdma_set_tx_rate_limit(netdev, channel, 0,
>  					      opt->quantum);
> @@ -2815,6 +2818,7 @@ static int airoha_tc_htb_alloc_leaf_queue(struct net_device *netdev,
>  		goto error;
>  	}
>  
> +set_qos_sq_bmap:

I would prefer if this could be achieved without a goto.

>  	set_bit(channel, dev->qos_sq_bmap);
>  	opt->qid = AIROHA_NUM_TX_RING + channel;
>  

...
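
The arithmetic in the commit message can be checked with a small userspace
sketch. It assumes AIROHA_NUM_TX_RING is 32 (implied by the 34/35 figures
quoted above) and uses 4 QoS channels purely for illustration; struct
fake_dev and all function names are hypothetical, not driver code. The
allocation helper also shows one way to grow the queue count without a
goto, which may or may not match what the next revision does.

```c
#include <assert.h>

#define NUM_TX_RING      32 /* assumption: inferred from the 34/35 example */
#define NUM_QOS_CHANNELS 4  /* assumption for this sketch */

/* Hypothetical stand-in for the relevant device state. */
struct fake_dev {
	unsigned int qos_bmap;           /* one bit per active QoS channel */
	unsigned int real_num_tx_queues;
};

/* Old scheme: the queue count grows by one per allocated channel. */
static unsigned int count_based(unsigned int bmap)
{
	unsigned int n = NUM_TX_RING;

	for (; bmap; bmap &= bmap - 1) /* clear lowest set bit */
		n++;
	return n;
}

/* New scheme: the queue count must cover the highest active channel. */
static unsigned int highest_based(unsigned int bmap)
{
	unsigned int n = NUM_TX_RING;
	unsigned int ch;

	for (ch = 0; ch < NUM_QOS_CHANNELS; ch++)
		if (bmap & (1u << ch))
			n = NUM_TX_RING + ch + 1;
	return n;
}

/* Allocate a channel and return its qid; grow the count only if needed. */
static unsigned int alloc_channel(struct fake_dev *dev, unsigned int channel)
{
	unsigned int needed;

	assert(channel < NUM_QOS_CHANNELS);
	dev->qos_bmap |= 1u << channel;
	needed = highest_based(dev->qos_bmap);
	if (needed > dev->real_num_tx_queues)
		dev->real_num_tx_queues = needed;
	return NUM_TX_RING + channel;
}
```

Allocating channels 0 and 3 yields qid 35 for the second one; the
count-based value is 34, so qid 35 falls outside [0, 34), while the
highest-based value is 36 and covers it.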

^ permalink raw reply

* Re: [PATCH net 4/6] ipv6: fix error handling in disable_policy sysctl
From: Nicolas Dichtel @ 2026-06-19  9:35 UTC (permalink / raw)
  To: Fernando Fernandez Mancera, netdev
  Cc: shemminger, dforster, gospo, ddutt, brian.haley, horms, pabeni,
	kuba, edumazet, davem, idosch, dsahern
In-Reply-To: <20260618162225.4588-5-fmancera@suse.de>

On 18/06/2026 at 18:22, Fernando Fernandez Mancera wrote:
> When writing to the disable_policy sysctl, if proc_dointvec() fails to
> parse the input, it returns a negative error code. The current
> implementation is resetting the position argument even if an error
> occurred during proc_dointvec() and not only during sysctl restart.
> 
> Fix this by checking the return value of proc_dointvec() and returning
> early on failure.
> 
> Fixes: df789fe75206 ("ipv6: Provide ipv6 version of "disable_policy" sysctl")
> Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Reviewed-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>

^ permalink raw reply

* Re: [PATCH net] octeontx2-af: npc: cn20k: Fix subbank free list indexing for search order
From: Ratheesh Kannoth @ 2026-06-19  9:45 UTC (permalink / raw)
  To: Simon Horman
  Cc: kuba, linux-kernel, netdev, andrew+netdev, davem, edumazet,
	pabeni, sgoutham
In-Reply-To: <20260619091341.918165-1-horms@kernel.org>

On 2026-06-19 at 14:43:41, Simon Horman (horms@kernel.org) wrote:
> This is an AI-generated review of your patch. The human sending this
> email has considered the AI review valid, or at least plausible.
> Full review at: https://netdev-ai.bots.linux.dev/sashiko/
> ---
> octeontx2-af: npc: cn20k: Fix subbank free list indexing for search order
>
> This change re-keys the xa_sb_free free list using each subbank's slot
> in subbank_srch_order[] rather than using subbank_srch_order[sb->idx],
> moving the population into a new helper called from npc_priv_init().
> A few comments inline.
>
> > diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
> > index 354c4e881c6a..d38e848add93 100644
> > --- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
> > +++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
> > @@ -3423,6 +3423,36 @@ static int npc_create_srch_order(int cnt)
> >  	return 0;
> >  }
> >
> > +static int npc_subbanks_srch_oder_init(struct rvu *rvu)
>                                 ^^^^
>
> [Low]
> Is "oder" intended here? The neighbouring identifiers
> npc_create_srch_order() and subbank_srch_order use "order", so this
> appears to be a typo that would be easier to fix before the symbol
> lands.
>
> > +{
> > +	struct npc_subbank *sb;
> > +	int sb_idx;
> > +	int i, j;
> > +	int rc;
> > +
> > +	for (i = 0; i < npc_priv->num_subbanks; i++) {
> > +		sb_idx = subbank_srch_order[i];
> > +		sb = &npc_priv->sb[sb_idx];
> > +		sb->arr_idx = i;
> > +
> > +		dev_dbg(rvu->dev, "%s: sb->idx=%u sb->arr_idx=%u\n",
> > +			__func__, sb->idx, sb->arr_idx);
> > +
> > +		rc = xa_err(xa_store(&npc_priv->xa_sb_free, sb->arr_idx,
> > +				     xa_mk_value(sb->idx), GFP_KERNEL));
> > +		if (rc) {
> > +			dev_err(rvu->dev,
> > +				"%s: xa_store(xa_sb_free) failed at slot %d (sb=%d): %d\n",
> > +				__func__, i, sb_idx, rc);
> > +			for (j = 0; j < i; j++)
> > +				xa_erase(&npc_priv->xa_sb_free, j);
> > +			return rc;
> > +		}
> > +	}
> > +
> > +	return 0;
> > +}
> > +
>
> [ ... ]
>
> > @@ -4727,6 +4748,10 @@ static int npc_priv_init(struct rvu *rvu)
> >  	for (i = 0, sb = npc_priv->sb; i < num_subbanks; i++, sb++)
> >  		npc_subbank_init(rvu, sb, i);
> >
> > +	ret = npc_subbanks_srch_oder_init(rvu);
> > +	if (ret)
> > +		goto fail2;
> > +
>
> [Medium]
> Can this leak subbank_srch_order on the new failure path?
>
> npc_create_srch_order() allocates subbank_srch_order via kcalloc().
> Before this patch, the only way to reach fail2 was when
> npc_create_srch_order() itself failed and left subbank_srch_order
> NULL, so fail2 not freeing it was safe.
>
> With this new goto fail2, subbank_srch_order has already been
> allocated, but the fail2 label only frees npc_priv->sb and npc_priv:
>
> fail2:
> 	xa_destroy(&npc_priv->xa_sb_used);
> 	[...]
> 	kfree(npc_priv->sb);
> 	npc_priv->sb = NULL;
> fail1:
> 	kfree(npc_priv);
> 	npc_priv = NULL;
> 	return ret;
>
> A subsequent re-init would unconditionally reassign
> subbank_srch_order = kcalloc(...) in npc_create_srch_order(),
> permanently leaking the previous buffer.
>
> Should this path go to a new label that also kfree()s
> subbank_srch_order, or should fail2 free it?
>
> >  	/* Get number of pcifuncs in the system */
> >  	npc_priv->pf_cnt = npc_pcifunc_map_create(rvu);
> >  	npc_priv->xa_pf2idx_map = kcalloc(npc_priv->pf_cnt,
>
> [High]
> Can npc_priv_init() return 0 on a real failure here?
>
> After npc_subbanks_srch_oder_init() returns successfully, ret is 0.
> If the kcalloc() for npc_priv->xa_pf2idx_map fails right after, the
> code does goto fail3 which falls through fail2 and fail1 to
> return ret, propagating 0 to the caller:
>
> fail3:
> 	kfree(subbank_srch_order);
> 	subbank_srch_order = NULL;
>
> fail2:
> 	[...]
> fail1:
> 	kfree(npc_priv);
> 	npc_priv = NULL;
> 	return ret;
>
> The caller npc_cn20k_init() then sees err == 0 and continues into
> npc_setup_mcam_section(), which dereferences npc_priv (now NULL).
>
> Pre-patch the function ended in return -ENOMEM, so the fail3 path
> returned -ENOMEM unconditionally. Should ret be reset to -ENOMEM
> before goto fail3, or should a separate variable be used for the
> new helper's return value?
>
> > @@ -4760,7 +4785,7 @@ static int npc_priv_init(struct rvu *rvu)
> >  fail1:
> >  	kfree(npc_priv);
> >  	npc_priv = NULL;
> > -	return -ENOMEM;
> > +	return ret;
> >  }

ACK for all.

pw-bot: changes-requested
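
For reference, the [High] item above (a shared ret left at 0 by a
successful helper and then returned from a later failure path) can be
reproduced in a few lines of userspace C. helper_init() and both init
functions are hypothetical names for this sketch; the alloc_fails flag
simulates the kcalloc() failure.

```c
#include <errno.h>

/* Hypothetical helper that succeeds, which leaves ret at 0. */
static int helper_init(void)
{
	return 0;
}

/* Shape flagged by the review: a later failure reuses the stale ret. */
static int init_stale_ret(int alloc_fails)
{
	int ret = -ENOMEM;

	ret = helper_init();
	if (ret)
		goto fail;

	if (alloc_fails)
		goto fail; /* ret is still 0 here */

	return 0;
fail:
	return ret;
}

/* One possible fix: set ret explicitly before jumping to the label. */
static int init_explicit_ret(int alloc_fails)
{
	int ret;

	ret = helper_init();
	if (ret)
		goto fail;

	if (alloc_fails) {
		ret = -ENOMEM;
		goto fail;
	}

	return 0;
fail:
	return ret;
}
```

With a simulated allocation failure, the first function returns 0 and the
second returns -ENOMEM.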

^ permalink raw reply

* [PATCH net v2] octeontx2-af: npc: cn20k: Fix subbank free list indexing for search order
From: Ratheesh Kannoth @ 2026-06-19  9:51 UTC (permalink / raw)
  To: kuba, linux-kernel, netdev, rkannoth
  Cc: andrew+netdev, davem, edumazet, pabeni, sgoutham

subbank_srch_order[i] is the physical subbank at search-order slot i,
so each subbank's arr_idx must be i (its slot), not
subbank_srch_order[sb->idx].  The old logic mis-keyed xa_sb_free
and broke allocation traversal order.

Populate arr_idx and xa_sb_free in a single pass over the search
order after subbank structs are initialized.

Fixes: 7ac9d4c4075c ("octeontx2-af: npc: cn20k: add subbank search order control")
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>

---
v1 -> v2: Addressed Simon's comments
	https://lore.kernel.org/netdev/20260619091341.918165-1-horms@kernel.org/
---
 .../ethernet/marvell/octeontx2/af/cn20k/npc.c | 51 ++++++++++++++-----
 1 file changed, 39 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
index 354c4e881c6a..51fe82f1343f 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
@@ -3423,6 +3423,36 @@ static int npc_create_srch_order(int cnt)
 	return 0;
 }
 
+static int npc_subbanks_srch_order_init(struct rvu *rvu)
+{
+	struct npc_subbank *sb;
+	int sb_idx;
+	int i, j;
+	int rc;
+
+	for (i = 0; i < npc_priv->num_subbanks; i++) {
+		sb_idx = subbank_srch_order[i];
+		sb = &npc_priv->sb[sb_idx];
+		sb->arr_idx = i;
+
+		dev_dbg(rvu->dev, "%s: sb->idx=%u sb->arr_idx=%u\n",
+			__func__, sb->idx, sb->arr_idx);
+
+		rc = xa_err(xa_store(&npc_priv->xa_sb_free, sb->arr_idx,
+				     xa_mk_value(sb->idx), GFP_KERNEL));
+		if (rc) {
+			dev_err(rvu->dev,
+				"%s: xa_store(xa_sb_free) failed at slot %d (sb=%d): %d\n",
+				__func__, i, sb_idx, rc);
+			for (j = 0; j < i; j++)
+				xa_erase(&npc_priv->xa_sb_free, j);
+			return rc;
+		}
+	}
+
+	return 0;
+}
+
 static void npc_subbank_init(struct rvu *rvu, struct npc_subbank *sb, int idx)
 {
 	mutex_init(&sb->lock);
@@ -3435,16 +3465,6 @@ static void npc_subbank_init(struct rvu *rvu, struct npc_subbank *sb, int idx)
 
 	sb->flags = NPC_SUBBANK_FLAG_FREE;
 	sb->idx = idx;
-	sb->arr_idx = subbank_srch_order[idx];
-
-	dev_dbg(rvu->dev, "%s: sb->idx=%u sb->arr_idx=%u\n",
-		__func__, sb->idx, sb->arr_idx);
-
-	/* Keep first and last subbank at end of free array; so that
-	 * it will be used at last
-	 */
-	xa_store(&npc_priv->xa_sb_free, sb->arr_idx,
-		 xa_mk_value(sb->idx), GFP_KERNEL);
 }
 
 static int npc_pcifunc_map_create(struct rvu *rvu)
@@ -4635,6 +4655,7 @@ static int npc_priv_init(struct rvu *rvu)
 	int num_subbanks, subbank_depth;
 	u64 npc_const1, npc_const2 = 0;
 	struct npc_subbank *sb;
+	int ret = -ENOMEM;
 	u64 cfg;
 	int i;
 
@@ -4727,13 +4748,19 @@ static int npc_priv_init(struct rvu *rvu)
 	for (i = 0, sb = npc_priv->sb; i < num_subbanks; i++, sb++)
 		npc_subbank_init(rvu, sb, i);
 
+	ret = npc_subbanks_srch_order_init(rvu);
+	if (ret)
+		goto fail3;
+
 	/* Get number of pcifuncs in the system */
 	npc_priv->pf_cnt = npc_pcifunc_map_create(rvu);
 	npc_priv->xa_pf2idx_map = kcalloc(npc_priv->pf_cnt,
 					  sizeof(struct xarray),
 					  GFP_KERNEL);
-	if (!npc_priv->xa_pf2idx_map)
+	if (!npc_priv->xa_pf2idx_map) {
+		ret = -ENOMEM;
 		goto fail3;
+	}
 
 	for (i = 0; i < npc_priv->pf_cnt; i++)
 		xa_init_flags(&npc_priv->xa_pf2idx_map[i], XA_FLAGS_ALLOC);
@@ -4760,7 +4787,7 @@ static int npc_priv_init(struct rvu *rvu)
 fail1:
 	kfree(npc_priv);
 	npc_priv = NULL;
-	return -ENOMEM;
+	return ret;
 }
 
 void npc_cn20k_deinit(struct rvu *rvu)
-- 
2.43.0


^ permalink raw reply related

* Re: AW: AW: [PATCH net] net: usb: lan78xx: restore VLAN filter table after device reset
From: Nicolai Buchwitz @ 2026-06-19  9:53 UTC (permalink / raw)
  To: Sven Schuchmann
  Cc: Thangaraj Samynathan, Rengarajan Sundararajan, UNGLinuxDriver,
	Woojung.Huh, Andrew Lunn, David S . Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, netdev, linux-usb, linux-kernel
In-Reply-To: <BEZP281MB2245422BB9FBFF4AFE081451D9E22@BEZP281MB2245.DEUP281.PROD.OUTLOOK.COM>

Hi Sven

On 19.6.2026 11:18, Sven Schuchmann wrote:
> Hello Nicolai,
> 
> my first observation is that calling lan78xx_write_vlan_table()
> at the end of lan78xx_start_rx_path() fixes the problem. I was able
> to do over 200 connect/disconnects without any problem.

Thanks, that's the right direction. For the final patch I'd move it
to lan78xx_mac_link_up(), which is IMHO a bit "cleaner":

[...]
  static void lan78xx_rx_urb_submit_all(struct lan78xx_net *dev);
+static int lan78xx_write_vlan_table(struct lan78xx_net *dev);
[...]
static void lan78xx_mac_link_up(struct phylink_config *config,
[...]
  	if (ret < 0)
  		goto link_up_fail;

+	ret = lan78xx_write_vlan_table(dev);
+	if (ret < 0)
+		goto link_up_fail;
+
  	netif_start_queue(net);
[...]

Could you give this version a quick test and confirm? Then I'll add
your Tested-by.

> [...]

Thanks
Nicolai

^ permalink raw reply

* Re: [PATCH net] netpoll: run NAPI poll in softirq context to avoid rq->lock self-deadlock
From: Petr Mladek @ 2026-06-19  9:53 UTC (permalink / raw)
  To: John Ogness
  Cc: Breno Leitao, Peter Zijlstra, Jakub Kicinski,
	Sebastian Andrzej Siewior, Sergey Senozhatsky, Vlad Poenaru,
	Thomas Gleixner, netdev, David S . Miller, Eric Dumazet,
	Paolo Abeni, Simon Horman, Clark Williams, Steven Rostedt,
	linux-rt-devel, linux-kernel, stable, Frederic Weisbecker,
	Ingo Molnar, Vincent Guittot, Dietmar Eggemann, K Prateek Nayak
In-Reply-To: <87tsr1m6y5.fsf@jogness.linutronix.de>

On Wed 2026-06-17 19:13:30, John Ogness wrote:
> On 2026-06-17, Breno Leitao <leitao@debian.org> wrote:
> > On Wed, Jun 17, 2026 at 01:19:58PM +0200, Peter Zijlstra wrote:
> >> But anything using locking is not ->write_atomic() and should be driven
> >> from a kthread, no?
> >
> > Good point. If that's the case, netconsole might not ever be able to drop
> > CON_NBCON_ATOMIC_UNSAFE for any network-based console driver at all. 
> 
> It depends on what it needs to synchronize against. For example, the
> UART consoles cannot write if the port lock is taken by another
> context. And the port lock is the sole lock for writing to the UART. To
> deal with this, we added wrappers [0] for acquiring/releasing the port
> lock. The wrappers acquire the nbcon hardware after taking the port
> lock.
>
> The write_atomic() implementations for UART consoles do not take the
> port lock. Only the nbcon hardware is acquired (which can be done from
> any context). This automatically provides the synchronization based on
> the port lock.
> 
> > As far as I can tell, there isn't a network driver today whose transmit
> > path is completely lockless, so, even if we make netpoll lockless.
> >
> > It's unlikely any NIC will ever achieve this, given that NIC TX
> > fundamentally relies on a shared DMA ring and doorbell register, which
> > inherently cannot be made lockless.
> >
> > So, is it correct to state that CON_NBCON_ATOMIC_UNSAFE will be part of
> > netconsole forever-ish?
> 
> Is there some lock that can be taken to synchronize all writing of
> packets to the network? If yes, the netconsole can use a similar
> solution.

We need to be careful here. If more locks depend on the nbcon
ownership, then it might become a kind of big kernel lock.

It might suffer from lock contention.

Another complication is that it is supposed to be a tail lock.

Finally, it might create tricky lockdep dependencies. But nbcon
context locking is not tracked by lockdep, so it is not easy to be sure.

More details:

I always forget the details. But it seems that sleeping is allowed
in the nbcon context, see cant_migrate() in nbcon_device_try_acquire().
Which might break when someone tries to take it in atomic context.

AFAIK, the motivation was to allow using the normal (sleeping)
spin locks for serial console synchronization in RT. The nested nbcon
context locking should not disable the preemption when called
in NBCON_PRIO_NORMAL context.

It would still be possible to take the nbcon context in atomic context
when called in NBCON_PRIO_EMERGENCY or _PANIC context because
nbcon_context_try_acquire() is able to take over the ownership
even from a sleeping NBCON_PRIO_NORMAL context.

But we need to make sure that outer locks behave the same.
In practice, they must be normal spin_locks. We could probably
add some lockdep annotation to catch potential problems.

Sigh, I hope that I have got it right. I seem to be a bit lost
this week.

> [0] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/linux/serial_core.h?h=v7.1#n715

Best Regards,
Petr

^ permalink raw reply

* [PATCH v6.1 0/3] Fix CVE-2026-23272
From: Shivani Agarwal @ 2026-06-19  9:28 UTC (permalink / raw)
  To: stable, gregkh
  Cc: pablo, fw, phil, davem, edumazet, kuba, pabeni, horms,
	netfilter-devel, coreteam, netdev, linux-kernel, ajay.kaher,
	alexey.makhalov, vamsi-krishna.brahmajosyula, yin.ding,
	tapas.kundu, Shivani Agarwal

To fix CVE-2026-23272, commit def602e498a4 is required; however,
it depends on commits d4b7f29eb85c and 8d738c1869f6. Therefore,
both dependencies have also been backported to v6.1.
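
As a rough guide to what the series changes in the element counting, the
two schemes can be compared in userspace with C11 atomics. This is a sketch
under assumptions, not the kernel code: add_unless() and inc_return() are
hypothetical approximations of the kernel's atomic_add_unless() and
atomic_inc_return(), and the set is reduced to a bare counter.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Approximation of atomic_add_unless(): add a to *v unless *v == u.
 * Returns true if the addition happened.
 */
static bool add_unless(atomic_uint *v, unsigned int a, unsigned int u)
{
	unsigned int cur = atomic_load(v);

	while (cur != u) {
		/* On failure, cur is reloaded with the current value. */
		if (atomic_compare_exchange_weak(v, &cur, cur + a))
			return true;
	}
	return false;
}

/* Approximation of atomic_inc_return(): increment, return new value. */
static unsigned int inc_return(atomic_uint *v)
{
	return atomic_fetch_add(v, 1) + 1;
}

/* Earlier scheme: refuse the increment once the limit is reached. */
static bool reserve_checked(atomic_uint *nelems, unsigned int max)
{
	return add_unless(nelems, 1, max);
}

/* Later scheme: always increment and report whether the set is now
 * over its limit; the caller must decrement again when unwinding.
 */
static bool reserve_unconditional(atomic_uint *nelems, unsigned int max)
{
	return inc_return(nelems) > max; /* true means "set full" */
}
```

In the unconditional scheme the counter briefly exceeds the limit, so the
error path has to undo the increment, which is what the atomic_dec() under
err_set_size in patch 3/3 does.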

Florian Westphal (1):
  netfilter: nf_tables: always increment set element count

Pablo Neira Ayuso (2):
  netfilter: nf_tables: fix set size with rbtree backend
  netfilter: nf_tables: unconditionally bump set->nelems before
    insertion

 include/net/netfilter/nf_tables.h |  6 +++
 net/netfilter/nf_tables_api.c     | 72 ++++++++++++++++++++++++++-----
 net/netfilter/nft_set_rbtree.c    | 43 ++++++++++++++++++
 3 files changed, 110 insertions(+), 11 deletions(-)

-- 
2.53.0


^ permalink raw reply

* [PATCH v6.1 1/3] netfilter: nf_tables: always increment set element count
From: Shivani Agarwal @ 2026-06-19  9:28 UTC (permalink / raw)
  To: stable, gregkh
  Cc: pablo, fw, phil, davem, edumazet, kuba, pabeni, horms,
	netfilter-devel, coreteam, netdev, linux-kernel, ajay.kaher,
	alexey.makhalov, vamsi-krishna.brahmajosyula, yin.ding,
	tapas.kundu, Shivani Agarwal
In-Reply-To: <20260619092850.1274076-1-shivani.agarwal@broadcom.com>

From: Florian Westphal <fw@strlen.de>

[ Upstream commit d4b7f29eb85c93893bc27388b37709efbc3c9a0e ]

At this time, set->nelems counter only increments when the set has
a maximum size.

All set elements decrement the counter unconditionally, this is
confusing.

Increment the counter unconditionally to make this symmetrical.
This would also allow changing the set maximum size after set creation
in a later patch.

Signed-off-by: Florian Westphal <fw@strlen.de>
[ Shivani: Modified to apply on 6.1.y ]
Signed-off-by: Shivani Agarwal <shivani.agarwal@broadcom.com>
---
 net/netfilter/nf_tables_api.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 0c4224282..ec4bfe53b 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -6670,10 +6670,13 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set,
 		goto err_element_clash;
 	}
 
-	if (!(flags & NFT_SET_ELEM_CATCHALL) && set->size &&
-	    !atomic_add_unless(&set->nelems, 1, set->size + set->ndeact)) {
-		err = -ENFILE;
-		goto err_set_full;
+	if (!(flags & NFT_SET_ELEM_CATCHALL)) {
+		unsigned int max = set->size ? set->size + set->ndeact : UINT_MAX;
+
+		if (!atomic_add_unless(&set->nelems, 1, max)) {
+			err = -ENFILE;
+			goto err_set_full;
+		}
 	}
 
 	nft_trans_elem(trans) = elem;
-- 
2.53.0


^ permalink raw reply related

* [PATCH v6.1 2/3] netfilter: nf_tables: fix set size with rbtree backend
From: Shivani Agarwal @ 2026-06-19  9:28 UTC (permalink / raw)
  To: stable, gregkh
  Cc: pablo, fw, phil, davem, edumazet, kuba, pabeni, horms,
	netfilter-devel, coreteam, netdev, linux-kernel, ajay.kaher,
	alexey.makhalov, vamsi-krishna.brahmajosyula, yin.ding,
	tapas.kundu, Sasha Levin, Shivani Agarwal
In-Reply-To: <20260619092850.1274076-1-shivani.agarwal@broadcom.com>

From: Pablo Neira Ayuso <pablo@netfilter.org>

[ Upstream commit 8d738c1869f611955d91d8d0fd0012d9ef207201 ]

The existing rbtree implementation uses singleton elements to represent
ranges, however, userspace provides a set size according to the number
of ranges in the set.

Adjust provided userspace set size to the number of singleton elements
in the kernel by multiplying the range by two.

Check if the no-match all-zero element is already in the set, in such
case release one slot in the set size.

Fixes: 0ed6389c483d ("netfilter: nf_tables: rename set implementations")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Shivani: Modified to apply on 6.1.y ]
Signed-off-by: Shivani Agarwal <shivani.agarwal@broadcom.com>
---
 include/net/netfilter/nf_tables.h |  6 ++++
 net/netfilter/nf_tables_api.c     | 49 +++++++++++++++++++++++++++++--
 net/netfilter/nft_set_rbtree.c    | 43 +++++++++++++++++++++++++++
 3 files changed, 96 insertions(+), 2 deletions(-)

diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
index dafa0a32e..3329c2eae 100644
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -422,6 +422,9 @@ struct nft_set_ext;
  *	@remove: remove element from set
  *	@walk: iterate over all set elements
  *	@get: get set elements
+ *	@ksize: kernel set size
+ *	@usize: userspace set size
+ *	@adjust_maxsize: delta to adjust maximum set size
  *	@privsize: function to return size of set private data
  *	@init: initialize private data of new set instance
  *	@destroy: destroy private data of set instance
@@ -470,6 +473,9 @@ struct nft_set_ops {
 					       const struct nft_set *set,
 					       const struct nft_set_elem *elem,
 					       unsigned int flags);
+	u32				(*ksize)(u32 size);
+	u32				(*usize)(u32 size);
+	u32				(*adjust_maxsize)(const struct nft_set *set);
 	void				(*commit)(struct nft_set *set);
 	void				(*abort)(const struct nft_set *set);
 	u64				(*privsize)(const struct nlattr * const nla[],
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index ec4bfe53b..15bfdf07c 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -4264,6 +4264,14 @@ static int nf_tables_fill_set_concat(struct sk_buff *skb,
 	return 0;
 }
 
+static u32 nft_set_userspace_size(const struct nft_set_ops *ops, u32 size)
+{
+	if (ops->usize)
+		return ops->usize(size);
+
+	return size;
+}
+
 static int nf_tables_fill_set(struct sk_buff *skb, const struct nft_ctx *ctx,
 			      const struct nft_set *set, u16 event, u16 flags)
 {
@@ -4328,7 +4336,8 @@ static int nf_tables_fill_set(struct sk_buff *skb, const struct nft_ctx *ctx,
 	if (!nest)
 		goto nla_put_failure;
 	if (set->size &&
-	    nla_put_be32(skb, NFTA_SET_DESC_SIZE, htonl(set->size)))
+	    nla_put_be32(skb, NFTA_SET_DESC_SIZE,
+			 htonl(nft_set_userspace_size(set->ops, set->size))))
 		goto nla_put_failure;
 
 	if (set->field_count > 1 &&
@@ -4698,6 +4707,15 @@ static bool nft_set_is_same(const struct nft_set *set,
 	return true;
 }
 
+static u32 nft_set_kernel_size(const struct nft_set_ops *ops,
+			       const struct nft_set_desc *desc)
+{
+	if (ops->ksize)
+		return ops->ksize(desc->size);
+
+	return desc->size;
+}
+
 static int nf_tables_newset(struct sk_buff *skb, const struct nfnl_info *info,
 			    const struct nlattr * const nla[])
 {
@@ -4880,6 +4898,9 @@ static int nf_tables_newset(struct sk_buff *skb, const struct nfnl_info *info,
 		if (err < 0)
 			return err;
 
+		if (desc.size)
+			desc.size = nft_set_kernel_size(set->ops, &desc);
+
 		err = 0;
 		if (!nft_set_is_same(set, &desc, exprs, num_exprs, flags)) {
 			NL_SET_BAD_ATTR(extack, nla[NFTA_SET_NAME]);
@@ -4902,6 +4923,9 @@ static int nf_tables_newset(struct sk_buff *skb, const struct nfnl_info *info,
 	if (IS_ERR(ops))
 		return PTR_ERR(ops);
 
+	if (desc.size)
+		desc.size = nft_set_kernel_size(ops, &desc);
+
 	udlen = 0;
 	if (nla[NFTA_SET_USERDATA])
 		udlen = nla_len(nla[NFTA_SET_USERDATA]);
@@ -6327,6 +6351,27 @@ static bool nft_setelem_valid_key_end(const struct nft_set *set,
 	return true;
 }
 
+static u32 nft_set_maxsize(const struct nft_set *set)
+{
+	u32 maxsize, delta;
+
+	if (!set->size)
+		return UINT_MAX;
+
+	if (set->ops->adjust_maxsize)
+		delta = set->ops->adjust_maxsize(set);
+	else
+		delta = 0;
+
+	if (check_add_overflow(set->size, set->ndeact, &maxsize))
+		return UINT_MAX;
+
+	if (check_add_overflow(maxsize, delta, &maxsize))
+		return UINT_MAX;
+
+	return maxsize;
+}
+
 static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set,
 			    const struct nlattr *attr, u32 nlmsg_flags)
 {
@@ -6671,7 +6716,7 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set,
 	}
 
 	if (!(flags & NFT_SET_ELEM_CATCHALL)) {
-		unsigned int max = set->size ? set->size + set->ndeact : UINT_MAX;
+		unsigned int max = nft_set_maxsize(set);
 
 		if (!atomic_add_unless(&set->nelems, 1, max)) {
 			err = -ENFILE;
diff --git a/net/netfilter/nft_set_rbtree.c b/net/netfilter/nft_set_rbtree.c
index 426becaad..26e1d994f 100644
--- a/net/netfilter/nft_set_rbtree.c
+++ b/net/netfilter/nft_set_rbtree.c
@@ -775,6 +775,46 @@ static bool nft_rbtree_estimate(const struct nft_set_desc *desc, u32 features,
 	return true;
 }
 
+/* rbtree stores ranges as singleton elements, each range is composed of two
+ * elements ...
+ */
+static u32 nft_rbtree_ksize(u32 size)
+{
+	return size * 2;
+}
+
+/* ... hide this detail to userspace. */
+static u32 nft_rbtree_usize(u32 size)
+{
+	if (!size)
+		return 0;
+
+	return size / 2;
+}
+
+static u32 nft_rbtree_adjust_maxsize(const struct nft_set *set)
+{
+	struct nft_rbtree *priv = nft_set_priv(set);
+	struct nft_rbtree_elem *rbe;
+	struct rb_node *node;
+	const void *key;
+
+	node = rb_last(&priv->root);
+	if (!node)
+		return 0;
+
+	rbe = rb_entry(node, struct nft_rbtree_elem, node);
+	if (!nft_rbtree_interval_end(rbe))
+		return 0;
+
+	key = nft_set_ext_key(&rbe->ext);
+	if (memchr(key, 1, set->klen))
+		return 0;
+
+	/* this is the all-zero no-match element. */
+	return 1;
+}
+
 const struct nft_set_type nft_set_rbtree_type = {
 	.features	= NFT_SET_INTERVAL | NFT_SET_MAP | NFT_SET_OBJECT | NFT_SET_TIMEOUT,
 	.ops		= {
@@ -791,5 +831,8 @@ const struct nft_set_type nft_set_rbtree_type = {
 		.lookup		= nft_rbtree_lookup,
 		.walk		= nft_rbtree_walk,
 		.get		= nft_rbtree_get,
+		.ksize		= nft_rbtree_ksize,
+		.usize		= nft_rbtree_usize,
+		.adjust_maxsize = nft_rbtree_adjust_maxsize,
 	},
 };
-- 
2.53.0


^ permalink raw reply related

* [PATCH v6.1 3/3] netfilter: nf_tables: unconditionally bump set->nelems before insertion
From: Shivani Agarwal @ 2026-06-19  9:28 UTC (permalink / raw)
  To: stable, gregkh
  Cc: pablo, fw, phil, davem, edumazet, kuba, pabeni, horms,
	netfilter-devel, coreteam, netdev, linux-kernel, ajay.kaher,
	alexey.makhalov, vamsi-krishna.brahmajosyula, yin.ding,
	tapas.kundu, Inseo An, Li hongliang, Sasha Levin, Shivani Agarwal
In-Reply-To: <20260619092850.1274076-1-shivani.agarwal@broadcom.com>

From: Pablo Neira Ayuso <pablo@netfilter.org>

[ Upstream commit def602e498a4f951da95c95b1b8ce8ae68aa733a ]

In case that the set is full, a new element gets published then removed
without waiting for the RCU grace period, while RCU reader can be
walking over it already.

To address this issue, add the element transaction even if set is full,
but toggle the set_full flag to report -ENFILE so the abort path safely
unwinds the set to its previous state.

As for element updates, decrement set->nelems to restore it.

A simpler fix is to call synchronize_rcu() in the error path.
However, with a large batch adding elements to already maxed-out set,
this could cause noticeable slowdown of such batches.

Fixes: 35d0ac9070ef ("netfilter: nf_tables: fix set->nelems counting with no NLM_F_EXCL")
Reported-by: Inseo An <y0un9sa@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
[ Minor conflict resolved. ]
Signed-off-by: Li hongliang <1468888505@139.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Shivani: Modified to apply on 6.1.y ]
Signed-off-by: Shivani Agarwal <shivani.agarwal@broadcom.com>
---
 net/netfilter/nf_tables_api.c | 28 +++++++++++++++-------------
 1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 15bfdf07c..196ac4e76 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -6388,6 +6388,7 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set,
 	struct nft_data_desc desc;
 	enum nft_registers dreg;
 	struct nft_trans *trans;
+	bool set_full = false;
 	u64 timeout;
 	u64 expiration;
 	int err, i;
@@ -6680,10 +6681,18 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set,
 	if (err < 0)
 		goto err_elem_free;
 
+	if (!(flags & NFT_SET_ELEM_CATCHALL)) {
+		unsigned int max = nft_set_maxsize(set), nelems;
+
+		nelems = atomic_inc_return(&set->nelems);
+		if (nelems > max)
+			set_full = true;
+	}
+
 	trans = nft_trans_elem_alloc(ctx, NFT_MSG_NEWSETELEM, set);
 	if (trans == NULL) {
 		err = -ENOMEM;
-		goto err_elem_free;
+		goto err_set_size;
 	}
 
 	ext->genmask = nft_genmask_cur(ctx->net);
@@ -6715,23 +6724,16 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set,
 		goto err_element_clash;
 	}
 
-	if (!(flags & NFT_SET_ELEM_CATCHALL)) {
-		unsigned int max = nft_set_maxsize(set);
-
-		if (!atomic_add_unless(&set->nelems, 1, max)) {
-			err = -ENFILE;
-			goto err_set_full;
-		}
-	}
-
 	nft_trans_elem(trans) = elem;
 	nft_trans_commit_list_add_tail(ctx->net, trans);
-	return 0;
 
-err_set_full:
-	nft_setelem_remove(ctx->net, set, &elem);
+	return set_full ? -ENFILE : 0;
+
 err_element_clash:
 	kfree(trans);
+err_set_size:
+	if (!(flags & NFT_SET_ELEM_CATCHALL))
+		atomic_dec(&set->nelems);
 err_elem_free:
 	nf_tables_set_elem_destroy(ctx, set, elem.priv);
 err_parse_data:
-- 
2.53.0


^ permalink raw reply related

* Re: [PATCH net 5/6] ipv6: reset value and position for proxy_ndp sysctl restart
From: Nicolas Dichtel @ 2026-06-19  9:58 UTC (permalink / raw)
  To: Fernando Fernandez Mancera, netdev
  Cc: shemminger, dforster, gospo, ddutt, brian.haley, horms, pabeni,
	kuba, edumazet, davem, idosch, dsahern
In-Reply-To: <20260618162225.4588-6-fmancera@suse.de>

On 18/06/2026 at 18:22, Fernando Fernandez Mancera wrote:
> When handling proxy_ndp, if rtnl_net_trylock() fails, the operation is
> retried but as the value was already modified by the initial
> proc_dointvec() call, the restarted syscall will read the newly modified
> value as the 'old' state.
> 
> Fix this by restoring the original value and position pointer before
> restarting the syscall.
Wouldn't it be better to call rtnl_net_trylock() at the beginning of the
function? That would avoid flapping the sysctl value.
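
The difference between the two orderings can be sketched in userspace C.
Everything here is hypothetical: struct state, try_lock() and the RESTART
constant merely stand in for the sysctl value, rtnl_net_trylock() and the
result of restart_syscall().

```c
#include <stdbool.h>

#define RESTART (-1) /* hypothetical stand-in for restart_syscall() */

struct state {
	int val;             /* the sysctl value */
	bool lock_available; /* simulates whether the trylock succeeds */
};

static bool try_lock(const struct state *s)
{
	return s->lock_available;
}

/* Write first, lock second: a failed trylock leaves the new value
 * visible until the restarted call runs.
 */
static int write_then_lock(struct state *s, int new_val)
{
	int old = s->val;

	s->val = new_val;
	if (old != new_val && !try_lock(s))
		return RESTART;
	return 0;
}

/* Lock first, write second: a failed trylock leaves the value alone,
 * so nothing has to be restored before restarting.
 */
static int lock_then_write(struct state *s, int new_val)
{
	if (!try_lock(s))
		return RESTART;
	s->val = new_val;
	return 0;
}
```

One trade-off of the lock-first shape in this sketch is that the lock is
attempted even when the written value equals the old one.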

> 
> Fixes: c92d5491a6d9 ("netconf: add support for IPv6 proxy_ndp")
> Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
> ---
>  net/ipv6/addrconf.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
> index 8ff015975e27..1cfb223476bd 100644
> --- a/net/ipv6/addrconf.c
> +++ b/net/ipv6/addrconf.c
> @@ -6483,8 +6483,9 @@ static int addrconf_sysctl_proxy_ndp(const struct ctl_table *ctl, int write,
>  		void *buffer, size_t *lenp, loff_t *ppos)
>  {
>  	int *valp = ctl->data;
> -	int ret;
> +	loff_t pos = *ppos;
>  	int old, new;
> +	int ret;
>  
>  	old = *valp;
>  	ret = proc_dointvec(ctl, write, buffer, lenp, ppos);
> @@ -6493,8 +6494,12 @@ static int addrconf_sysctl_proxy_ndp(const struct ctl_table *ctl, int write,
>  	if (write && old != new) {
>  		struct net *net = ctl->extra2;
>  
> -		if (!rtnl_net_trylock(net))
> +		if (!rtnl_net_trylock(net)) {
> +			/* Restore the original values before restarting */
> +			*valp = old;
> +			*ppos = pos;
>  			return restart_syscall();
> +		}
>  
>  		if (valp == &net->ipv6.devconf_dflt->proxy_ndp) {
>  			inet6_netconf_notify_devconf(net, RTM_NEWNETCONF,



* Re: [PATCH net 6/6] ipv6: fix missing notification for ignore_routes_with_linkdown
From: Nicolas Dichtel @ 2026-06-19 10:02 UTC (permalink / raw)
  To: Fernando Fernandez Mancera, netdev
  Cc: shemminger, dforster, gospo, ddutt, brian.haley, horms, pabeni,
	kuba, edumazet, davem, idosch, dsahern
In-Reply-To: <20260618162225.4588-7-fmancera@suse.de>

On 18/06/2026 at 18:22, Fernando Fernandez Mancera wrote:
> When changing the ignore_routes_with_linkdown sysctl for a specific
> interface, the RTM_NEWNETCONF netlink notification was not being emitted
> to userspace. Fix this by emitting the notification when needed.
> 
> In addition, fix bogus return value for successful "all" and specific
> interface write operation leading to a wrong reset of the position
> pointer.
> 
> Fixes: 35103d11173b ("net: ipv6 sysctl option to ignore routes when nexthop link is down")
> Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Reviewed-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>


* Re: [PATCH net 5/6] ipv6: reset value and position for proxy_ndp sysctl restart
From: Fernando Fernandez Mancera @ 2026-06-19 10:09 UTC (permalink / raw)
  To: nicolas.dichtel, netdev
  Cc: shemminger, dforster, gospo, ddutt, brian.haley, horms, pabeni,
	kuba, edumazet, davem, idosch, dsahern
In-Reply-To: <491465c3-0d1a-42f6-86fe-6c31812e23c9@6wind.com>

On 6/19/26 11:58 AM, Nicolas Dichtel wrote:
> On 18/06/2026 at 18:22, Fernando Fernandez Mancera wrote:
>> When handling proxy_ndp, if rtnl_net_trylock() fails, the operation is
>> retried but as the value was already modified by the initial
>> proc_dointvec() call, the restarted syscall will read the newly modified
>> value as the 'old' state.
>>
>> Fix this by restoring the original value and position pointer before
>> restarting the syscall.
> Is it not better to call rtnl_net_trylock() at the beginning of the function?
> It avoids flapping the sysctl value.
> 

IMHO it is not better if we want to reduce the time we hold the RTNL 
lock. I think the idea is that if the user passes an invalid value, 
we don't need to take the lock at all.

That is the general pattern I see around the sysctl code (IPv4 and 
IPv6). Given the current efforts to reduce the usage of RTNL I think 
this approach would be better.

In any case, it is not a blocker for me so if we all agree that your 
suggestion is better I don't mind taking that path.

Thanks for all the reviews!

>>
>> Fixes: c92d5491a6d9 ("netconf: add support for IPv6 proxy_ndp")
>> Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
>> ---
>>   net/ipv6/addrconf.c | 9 +++++++--
>>   1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
>> index 8ff015975e27..1cfb223476bd 100644
>> --- a/net/ipv6/addrconf.c
>> +++ b/net/ipv6/addrconf.c
>> @@ -6483,8 +6483,9 @@ static int addrconf_sysctl_proxy_ndp(const struct ctl_table *ctl, int write,
>>   		void *buffer, size_t *lenp, loff_t *ppos)
>>   {
>>   	int *valp = ctl->data;
>> -	int ret;
>> +	loff_t pos = *ppos;
>>   	int old, new;
>> +	int ret;
>>   
>>   	old = *valp;
>>   	ret = proc_dointvec(ctl, write, buffer, lenp, ppos);
>> @@ -6493,8 +6494,12 @@ static int addrconf_sysctl_proxy_ndp(const struct ctl_table *ctl, int write,
>>   	if (write && old != new) {
>>   		struct net *net = ctl->extra2;
>>   
>> -		if (!rtnl_net_trylock(net))
>> +		if (!rtnl_net_trylock(net)) {
>> +			/* Restore the original values before restarting */
>> +			*valp = old;
>> +			*ppos = pos;
>>   			return restart_syscall();
>> +		}
>>   
>>   		if (valp == &net->ipv6.devconf_dflt->proxy_ndp) {
>>   			inet6_netconf_notify_devconf(net, RTM_NEWNETCONF,
> 
> 



