* Re: [PATCH net-next] net: phy: call phy_init_hw() in phy resume path
From: Russell King (Oracle) @ 2026-04-11 16:46 UTC (permalink / raw)
To: Andrew Lunn
Cc: Biju Das, biju.das.au, Heiner Kallweit, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Ovidiu Panait,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
Geert Uytterhoeven, Prabhakar Mahadev Lad,
linux-renesas-soc@vger.kernel.org
In-Reply-To: <dedab35c-39f4-469b-9227-cb8925d83b8e@lunn.ch>
On Sat, Apr 11, 2026 at 03:50:13PM +0200, Andrew Lunn wrote:
> > So, I question whether any of the functions in this driver actually
> > have a valid reason to take phydev->lock - looks to me like a not
> > very well written driver.
> >
> > In cases like this, I don't think we should make things more
> > difficult in the core just because we have a lockdep splat when that
> > can be avoided by killing off unnecessary locking.
>
> Agreed. This patchset should clean up these locks.
>
> We also need to look at lan937x_dsp_workaround(). I also don't see
> what that mutex lock/unlock is protecting. Accessing bank registers
> needs to be protected, so doing one additional access within that
> should not need additional protection.
Looking at access_ereg(), shouldn't it be taking the MDIO bus lock
and using the __phy_* accessors anyway because it's writing various
registers which determine what is being read via the
LAN87XX_EXT_REG_RD_DATA register or the value written via the
LAN87XX_EXT_REG_WR_DATA register?
Also, as it has access_ereg_modify_changed(), that entire sequence
needs to take the MDIO bus lock to safely do the read-modify-write.
Then there's lan87xx_config_rgmii_delay(), which is a large open-coded
read-modify-write of the PHYACC_ATTR_BANK_MISC, LAN87XX_CTRL_1
register.
To me, this looks like a racy driver, and it also looks like it's using
the wrong lock to try and protect hardware accesses.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
^ permalink raw reply
* Re: [PATCH net] net: phy: qcom: at803x: Use the correct bit to disable extended next page
From: Maxime Chevallier @ 2026-04-11 16:54 UTC (permalink / raw)
To: Andrew Lunn
Cc: Jakub Kicinski, davem, Eric Dumazet, Paolo Abeni, Simon Horman,
Russell King, thomas.petazzoni, netdev, linux-kernel,
linux-arm-msm
In-Reply-To: <4b386a4a-9743-4e79-8d3d-3576bb9de492@lunn.ch>
Hi Andrew,
On 11/04/2026 16:10, Andrew Lunn wrote:
> On Fri, Apr 10, 2026 at 07:10:20PM +0200, Maxime Chevallier wrote:
>> As noted in the blamed commit, the AR8035 and other PHYs from this
>> family advertise the Extended Next Page support by default, which may be
>> understood by some partners as this PHY being multi-gig capable.
>>
>> The fix is to disable XNP advertising, which is done by setting bit 12
>> of the Auto-Negotiation Advertisement Register (MII_ADVERTISE).
>>
>> The blamed commit incorrectly uses MDIO_AN_CTRL1_XNP, which is bit 13 as per
>> 802.3 : 45.2.7.1 AN control register (Register 7.0)
>>
>> BIT 12 in MII_ADVERTISE is wrapped by ADVERTISE_RESV, used by some
>> drivers such as the aquantia one. 802.3 Clause 28 defines bit 12 as
>> Extended Next Page ability, at least in recent versions of the standard.
>
>> Let's add a define for it and use it in the at803x driver.
>
> I agree with this, it defines the C22 4.12 bit. And this is what the
> at803x driver is using it for.
>
>> static void at803x_link_change_notify(struct phy_device *phydev)
>> diff --git a/include/uapi/linux/mii.h b/include/uapi/linux/mii.h
>> index 39f7c44baf53..61d6edad4b94 100644
>> --- a/include/uapi/linux/mii.h
>> +++ b/include/uapi/linux/mii.h
>> @@ -82,7 +82,8 @@
>> #define ADVERTISE_100BASE4 0x0200 /* Try for 100mbps 4k packets */
>> #define ADVERTISE_PAUSE_CAP 0x0400 /* Try for pause */
>> #define ADVERTISE_PAUSE_ASYM 0x0800 /* Try for asymetric pause */
>> -#define ADVERTISE_RESV 0x1000 /* Unused... */
>> +#define ADVERTISE_XNP 0x1000 /* Extended Next Page */
>> +#define ADVERTISE_RESV ADVERTISE_XNP /* Used to be reserved */
>
> Should we keep ADVERTISE_RESV?
>
> 45.2.7.6 AN advertisement register
>
> If the Auto-Negotiation advertisement register (register 4) is
> present, (see 28.2.4.1.3), then this register is a copy of the
> Auto-Negotiation advertisement register (register 4). In this case,
> reads to the AN advertisement register (7.16) report the value of
> the Auto-Negotiation advertisement register (register 4); writes to
> the AN advertisement register (7.16) cause a write to occur to the
> Auto-Negotiation advertisement register.
>
> So MDIO_MMD_AN:MDIO_AN_ADVERTISE is a straight copy of MII_ADVERTISE.
>
> ef4_mdio_write(efx, MDIO_MMD_AN, MDIO_AN_ADVERTISE, reg);
> ret = phy_write_mmd(phydev, MDIO_MMD_AN, MDIO_AN_ADVERTISE, adv);
>
> So ADVERTISE_XNP is just as valid in the other two drivers using
> ADVERTISE_RESV. I think we should change those as well to
> ADVERTISE_XNP and remove ADVERTISE_RESV?
>
> Andrew
I agree with that, yes, and I've considered converting these drivers once
we have net merged into net-next, should this patch be applied :)
That said, ADVERTISE_RESV is in uapi; is it even possible to remove it?
I think the best we can hope for is to no longer have any in-tree users
of ADVERTISE_RESV :(
Maxime
^ permalink raw reply
* [RFC PATCH] bpf: cpumap: report queue_index to xdp_rxq_info
From: Jose A. Perez de Azpillaga @ 2026-04-11 17:51 UTC (permalink / raw)
To: bpf
Cc: Madalin Bucur, Andrew Lunn, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
Simon Horman, Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman,
Kumar Kartikeya Dwivedi, Song Liu, Yonghong Song, Jiri Olsa,
netdev, linux-kernel
When a packet is redirected to a CPU map entry,
cpu_map_bpf_prog_run_xdp() reconstructs a minimal xdp_rxq_info from
xdp_frame fields (dev_rx and mem_type) before re-running the BPF program
on the target CPU. However, queue_index was never preserved across the
CPU boundary, so BPF programs running in cpumap context always observe
ctx->rx_queue_index == 0, regardless of which hardware queue originally
received the packet.
Fix this by storing the originating queue_index in struct xdp_frame,
following the same pattern already established for dev_rx and mem_type.
The field is populated from rxq->queue_index in
xdp_convert_buff_to_frame() during NAPI context, when the rxq_info is
still valid, and restored into the reconstructed rxq_info in
cpu_map_bpf_prog_run_xdp().
Also use xdpf->queue_index in __xdp_build_skb_from_frame() to call
skb_record_rx_queue(), which was previously listed as missing
information in that function's comment.
Also propagate queue_index in dpaa_a050385_wa_xdpf(), which manually
constructs a new xdp_frame from an uninitialized page. Without this,
queue_index would contain stale data from the page allocator.
Signed-off-by: Jose A. Perez de Azpillaga <azpijr@gmail.com>
---
Note: this patch was only compiled, not tested.
drivers/net/ethernet/freescale/dpaa/dpaa_eth.c | 1 +
include/net/xdp.h | 4 +++-
kernel/bpf/cpumap.c | 2 +-
net/core/xdp.c | 2 +-
4 files changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
index 3edc8d142dd5..00e36b0ac74d 100644
--- a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
+++ b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
@@ -2281,6 +2281,7 @@ static int dpaa_a050385_wa_xdpf(struct dpaa_priv *priv,
new_xdpf->headroom = priv->tx_headroom;
new_xdpf->frame_sz = DPAA_BP_RAW_SIZE;
new_xdpf->mem_type = MEM_TYPE_PAGE_ORDER0;
+ new_xdpf->queue_index = xdpf->queue_index;
/* Release the initial buffer */
xdp_return_frame_rx_napi(xdpf);
diff --git a/include/net/xdp.h b/include/net/xdp.h
index aa742f413c35..6db10e6a8864 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -297,10 +297,11 @@ struct xdp_frame {
u32 headroom;
u32 metasize; /* uses lower 8-bits */
/* Lifetime of xdp_rxq_info is limited to NAPI/enqueue time,
- * while mem_type is valid on remote CPU.
+ * while mem_type and queue_index are valid on remote CPU.
*/
enum xdp_mem_type mem_type:32;
struct net_device *dev_rx; /* used by cpumap */
+ u32 queue_index; /* used by cpumap */
u32 frame_sz;
u32 flags; /* supported values defined in xdp_buff_flags */
};
@@ -441,6 +442,7 @@ struct xdp_frame *xdp_convert_buff_to_frame(struct xdp_buff *xdp)
/* rxq only valid until napi_schedule ends, convert to xdp_mem_type */
xdp_frame->mem_type = xdp->rxq->mem.type;
+ xdp_frame->queue_index = xdp->rxq->queue_index;
return xdp_frame;
}
diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c
index 5e59ab896f05..448da572de9a 100644
--- a/kernel/bpf/cpumap.c
+++ b/kernel/bpf/cpumap.c
@@ -197,7 +197,7 @@ static int cpu_map_bpf_prog_run_xdp(struct bpf_cpu_map_entry *rcpu,
rxq.dev = xdpf->dev_rx;
rxq.mem.type = xdpf->mem_type;
- /* TODO: report queue_index to xdp_rxq_info */
+ rxq.queue_index = xdpf->queue_index;
xdp_convert_frame_to_buff(xdpf, &xdp);
diff --git a/net/core/xdp.c b/net/core/xdp.c
index 9890a30584ba..326e3057ed7f 100644
--- a/net/core/xdp.c
+++ b/net/core/xdp.c
@@ -829,11 +829,11 @@ struct sk_buff *__xdp_build_skb_from_frame(struct xdp_frame *xdpf,
/* Essential SKB info: protocol and skb->dev */
skb->protocol = eth_type_trans(skb, dev);
+ skb_record_rx_queue(skb, xdpf->queue_index);
/* Optional SKB info, currently missing:
* - HW checksum info (skb->ip_summed)
* - HW RX hash (skb_set_hash)
- * - RX ring dev queue index (skb_record_rx_queue)
*/
if (xdpf->mem_type == MEM_TYPE_PAGE_POOL)
--
2.53.0
^ permalink raw reply related
* Re: [RFC PATCH] bpf: cpumap: report queue_index to xdp_rxq_info
From: Alexei Starovoitov @ 2026-04-11 18:09 UTC (permalink / raw)
To: Jose A. Perez de Azpillaga
Cc: bpf, Madalin Bucur, Andrew Lunn, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
Simon Horman, Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman,
Kumar Kartikeya Dwivedi, Song Liu, Yonghong Song, Jiri Olsa,
Network Development, LKML
In-Reply-To: <20260411175134.771552-1-azpijr@gmail.com>
On Sat, Apr 11, 2026 at 10:51 AM Jose A. Perez de Azpillaga
<azpijr@gmail.com> wrote:
>
> When a packet is redirected to a CPU map entry,
> cpu_map_bpf_prog_run_xdp() reconstructs a minimal xdp_rxq_info from
> xdp_frame fields (dev_rx and mem_type) before re-running the BPF program
> on the target CPU. However, queue_index was never preserved across the
> CPU boundary, so BPF programs running in cpumap context always observe
> ctx->rx_queue_index == 0, regardless of which hardware queue originally
> received the packet.
>
> Fix this by storing the originating queue_index in struct xdp_frame,
> following the same pattern already established for dev_rx and mem_type.
> The field is populated from rxq->queue_index in
> xdp_convert_buff_to_frame() during NAPI context, when the rxq_info is
> still valid, and restored into the reconstructed rxq_info in
> cpu_map_bpf_prog_run_xdp().
>
> Also use xdpf->queue_index in __xdp_build_skb_from_frame() to call
> skb_record_rx_queue(), which was previously listed as missing
> information in that function's comment.
>
> Also propagate queue_index in dpaa_a050385_wa_xdpf(), which manually
> constructs a new xdp_frame from an uninitialized page. Without this,
> queue_index would contain stale data from the page allocator.
>
> Signed-off-by: Jose A. Perez de Azpillaga <azpijr@gmail.com>
> ---
> Note: this patch was only compiled, not tested.
>
> drivers/net/ethernet/freescale/dpaa/dpaa_eth.c | 1 +
> include/net/xdp.h | 4 +++-
> kernel/bpf/cpumap.c | 2 +-
> net/core/xdp.c | 2 +-
> 4 files changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> index 3edc8d142dd5..00e36b0ac74d 100644
> --- a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> +++ b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> @@ -2281,6 +2281,7 @@ static int dpaa_a050385_wa_xdpf(struct dpaa_priv *priv,
> new_xdpf->headroom = priv->tx_headroom;
> new_xdpf->frame_sz = DPAA_BP_RAW_SIZE;
> new_xdpf->mem_type = MEM_TYPE_PAGE_ORDER0;
> + new_xdpf->queue_index = xdpf->queue_index;
>
> /* Release the initial buffer */
> xdp_return_frame_rx_napi(xdpf);
> diff --git a/include/net/xdp.h b/include/net/xdp.h
> index aa742f413c35..6db10e6a8864 100644
> --- a/include/net/xdp.h
> +++ b/include/net/xdp.h
> @@ -297,10 +297,11 @@ struct xdp_frame {
> u32 headroom;
> u32 metasize; /* uses lower 8-bits */
> /* Lifetime of xdp_rxq_info is limited to NAPI/enqueue time,
> - * while mem_type is valid on remote CPU.
> + * while mem_type and queue_index are valid on remote CPU.
> */
> enum xdp_mem_type mem_type:32;
> struct net_device *dev_rx; /* used by cpumap */
> + u32 queue_index; /* used by cpumap */
> u32 frame_sz;
> u32 flags; /* supported values defined in xdp_buff_flags */
> };
> @@ -441,6 +442,7 @@ struct xdp_frame *xdp_convert_buff_to_frame(struct xdp_buff *xdp)
>
> /* rxq only valid until napi_schedule ends, convert to xdp_mem_type */
> xdp_frame->mem_type = xdp->rxq->mem.type;
> + xdp_frame->queue_index = xdp->rxq->queue_index;
>
> return xdp_frame;
> }
> diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c
> index 5e59ab896f05..448da572de9a 100644
> --- a/kernel/bpf/cpumap.c
> +++ b/kernel/bpf/cpumap.c
> @@ -197,7 +197,7 @@ static int cpu_map_bpf_prog_run_xdp(struct bpf_cpu_map_entry *rcpu,
>
> rxq.dev = xdpf->dev_rx;
> rxq.mem.type = xdpf->mem_type;
> - /* TODO: report queue_index to xdp_rxq_info */
> + rxq.queue_index = xdpf->queue_index;
This is like 5th time people attempt to address this TODO.
Just remove that comment. Don't send broken patches.
pw-bot: cr
^ permalink raw reply
* Re: [RFC PATCH] bpf: cpumap: report queue_index to xdp_rxq_info
From: bot+bpf-ci @ 2026-04-11 18:30 UTC (permalink / raw)
To: azpijr, bpf
Cc: madalin.bucur, andrew+netdev, davem, edumazet, kuba, pabeni, ast,
daniel, hawk, john.fastabend, sdf, horms, andrii, martin.lau,
eddyz87, memxor, song, yonghong.song, jolsa, netdev, linux-kernel,
ast, andrii, daniel, martin.lau, eddyz87, yonghong.song, clm,
ihor.solodrai
In-Reply-To: <20260411175134.771552-1-azpijr@gmail.com>
[-- Attachment #1: Type: text/plain, Size: 2007 bytes --]
> diff --git a/include/net/xdp.h b/include/net/xdp.h
> --- a/include/net/xdp.h
> +++ b/include/net/xdp.h
> @@ -297,10 +297,11 @@ struct xdp_frame {
> u32 headroom;
> u32 metasize; /* uses lower 8-bits */
> /* Lifetime of xdp_rxq_info is limited to NAPI/enqueue time,
> - * while mem_type is valid on remote CPU.
> + * while mem_type and queue_index are valid on remote CPU.
> */
> enum xdp_mem_type mem_type:32;
> struct net_device *dev_rx; /* used by cpumap */
> + u32 queue_index; /* used by cpumap */
> u32 frame_sz;
> u32 flags; /* supported values defined in xdp_buff_flags */
> };
Is the updated comment accurate for all paths?
xdp_convert_buff_to_frame() has an early return for the zero-copy
path:
if (xdp->rxq->mem.type == MEM_TYPE_XSK_BUFF_POOL)
return xdp_convert_zc_to_xdp_frame(xdp);
xdp_convert_zc_to_xdp_frame() zeroes the frame with memset and
explicitly sets mem_type but never sets queue_index:
net/core/xdp.c:xdp_convert_zc_to_xdp_frame() {
...
memset(xdpf, 0, sizeof(*xdpf));
...
xdpf->mem_type = MEM_TYPE_PAGE_ORDER0;
/* queue_index is not set -- remains 0 from memset */
...
}
The comment says queue_index is "valid on remote CPU" but for
zero-copy converted frames it will always be 0 regardless of which
hardware queue originated the packet.
> @@ -441,6 +442,7 @@ struct xdp_frame *xdp_convert_buff_to_frame(struct xdp_buff *xdp)
>
> /* rxq only valid until napi_schedule ends, convert to xdp_mem_type */
> xdp_frame->mem_type = xdp->rxq->mem.type;
> + xdp_frame->queue_index = xdp->rxq->queue_index;
>
> return xdp_frame;
This assignment is only reached for the non-ZC path. Should
xdp_convert_zc_to_xdp_frame() also set queue_index from
xdp->rxq->queue_index to match?
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/24288456014
^ permalink raw reply
* [PATCH] netfilter: nfnl_cthelper: apply per-class values when updating policies
From: David Carlier @ 2026-04-11 18:57 UTC (permalink / raw)
To: pablo
Cc: fw, phil, kadlec, netfilter-devel, coreteam, netdev, linux-kernel,
David Carlier, stable
When a userspace conntrack helper with multiple expectation classes is
updated via nfnetlink, every class ends up with the first class's
max_expected and timeout values.
nfnl_cthelper_update_policy_all() validates each new policy into the
corresponding slot of the temporary new_policy array, but the second
loop that commits the values into the live helper dereferences
new_policy as a pointer instead of indexing it, so every iteration
reads new_policy[0] regardless of i. An update that changes per-class
values is silently collapsed onto class 0's values with no error
returned to userspace.
Index the temporary array by i in the commit loop so each class gets
its own validated values.
Fixes: 2c422257550f ("netfilter: nfnl_cthelper: fix runtime expectation policy updates")
Cc: stable@vger.kernel.org
Signed-off-by: David Carlier <devnexen@gmail.com>
---
net/netfilter/nfnetlink_cthelper.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/netfilter/nfnetlink_cthelper.c b/net/netfilter/nfnetlink_cthelper.c
index 0d16ad82d70c..34af6840803e 100644
--- a/net/netfilter/nfnetlink_cthelper.c
+++ b/net/netfilter/nfnetlink_cthelper.c
@@ -346,8 +346,8 @@ static int nfnl_cthelper_update_policy_all(struct nlattr *tb[],
for (i = 0; i < helper->expect_class_max + 1; i++) {
policy = (struct nf_conntrack_expect_policy *)
&helper->expect_policy[i];
- policy->max_expected = new_policy->max_expected;
- policy->timeout = new_policy->timeout;
+ policy->max_expected = new_policy[i].max_expected;
+ policy->timeout = new_policy[i].timeout;
}
err:
--
2.53.0
^ permalink raw reply related
* Re: [RFC PATCH] bpf: cpumap: report queue_index to xdp_rxq_info
From: Jose A. Perez de Azpillaga @ 2026-04-11 19:10 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: bpf, Madalin Bucur, Andrew Lunn, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Alexei Starovoitov, Daniel Borkmann,
Jesper Dangaard Brouer, John Fastabend, Stanislav Fomichev,
Simon Horman, Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman,
Kumar Kartikeya Dwivedi, Song Liu, Yonghong Song, Jiri Olsa,
Network Development, LKML
In-Reply-To: <CAADnVQJOKc0zP4=-STEX0szgzUeS7RaxQTtre=P92R0UStug8A@mail.gmail.com>
On Sat, Apr 11, 2026 at 11:09:56AM -0700, Alexei Starovoitov wrote:
> On Sat, Apr 11, 2026 at 10:51 AM Jose A. Perez de Azpillaga
> <azpijr@gmail.com> wrote:
> >
> > When a packet is redirected to a CPU map entry,
> > cpu_map_bpf_prog_run_xdp() reconstructs a minimal xdp_rxq_info from
> > xdp_frame fields (dev_rx and mem_type) before re-running the BPF program
> > on the target CPU. However, queue_index was never preserved across the
> > CPU boundary, so BPF programs running in cpumap context always observe
> > ctx->rx_queue_index == 0, regardless of which hardware queue originally
> > received the packet.
> >
> > Fix this by storing the originating queue_index in struct xdp_frame,
> > following the same pattern already established for dev_rx and mem_type.
> > The field is populated from rxq->queue_index in
> > xdp_convert_buff_to_frame() during NAPI context, when the rxq_info is
> > still valid, and restored into the reconstructed rxq_info in
> > cpu_map_bpf_prog_run_xdp().
> >
> > Also use xdpf->queue_index in __xdp_build_skb_from_frame() to call
> > skb_record_rx_queue(), which was previously listed as missing
> > information in that function's comment.
> >
> > Also propagate queue_index in dpaa_a050385_wa_xdpf(), which manually
> > constructs a new xdp_frame from an uninitialized page. Without this,
> > queue_index would contain stale data from the page allocator.
> >
> > Signed-off-by: Jose A. Perez de Azpillaga <azpijr@gmail.com>
> > ---
> > Note: this patch was only compiled, not tested.
> >
> > drivers/net/ethernet/freescale/dpaa/dpaa_eth.c | 1 +
> > include/net/xdp.h | 4 +++-
> > kernel/bpf/cpumap.c | 2 +-
> > net/core/xdp.c | 2 +-
> > 4 files changed, 6 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> > index 3edc8d142dd5..00e36b0ac74d 100644
> > --- a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> > +++ b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> > @@ -2281,6 +2281,7 @@ static int dpaa_a050385_wa_xdpf(struct dpaa_priv *priv,
> > new_xdpf->headroom = priv->tx_headroom;
> > new_xdpf->frame_sz = DPAA_BP_RAW_SIZE;
> > new_xdpf->mem_type = MEM_TYPE_PAGE_ORDER0;
> > + new_xdpf->queue_index = xdpf->queue_index;
> >
> > /* Release the initial buffer */
> > xdp_return_frame_rx_napi(xdpf);
> > diff --git a/include/net/xdp.h b/include/net/xdp.h
> > index aa742f413c35..6db10e6a8864 100644
> > --- a/include/net/xdp.h
> > +++ b/include/net/xdp.h
> > @@ -297,10 +297,11 @@ struct xdp_frame {
> > u32 headroom;
> > u32 metasize; /* uses lower 8-bits */
> > /* Lifetime of xdp_rxq_info is limited to NAPI/enqueue time,
> > - * while mem_type is valid on remote CPU.
> > + * while mem_type and queue_index are valid on remote CPU.
> > */
> > enum xdp_mem_type mem_type:32;
> > struct net_device *dev_rx; /* used by cpumap */
> > + u32 queue_index; /* used by cpumap */
> > u32 frame_sz;
> > u32 flags; /* supported values defined in xdp_buff_flags */
> > };
> > @@ -441,6 +442,7 @@ struct xdp_frame *xdp_convert_buff_to_frame(struct xdp_buff *xdp)
> >
> > /* rxq only valid until napi_schedule ends, convert to xdp_mem_type */
> > xdp_frame->mem_type = xdp->rxq->mem.type;
> > + xdp_frame->queue_index = xdp->rxq->queue_index;
> >
> > return xdp_frame;
> > }
> > diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c
> > index 5e59ab896f05..448da572de9a 100644
> > --- a/kernel/bpf/cpumap.c
> > +++ b/kernel/bpf/cpumap.c
> > @@ -197,7 +197,7 @@ static int cpu_map_bpf_prog_run_xdp(struct bpf_cpu_map_entry *rcpu,
> >
> > rxq.dev = xdpf->dev_rx;
> > rxq.mem.type = xdpf->mem_type;
> > - /* TODO: report queue_index to xdp_rxq_info */
> > + rxq.queue_index = xdpf->queue_index;
>
> This is like 5th time people attempt to address this TODO.
>
> Just remove that comment. Don't send broken patches.
Oh... okay. But I have a question: the bot detected something I didn't,
namely that queue_index is not propagated in
xdp_convert_zc_to_xdp_frame(). Should it be, or is that intentional?
Is it better to do as you said and just remove the comment, or to do
what the bot suggested, with proper testing?
--
regards,
jose a. p-a
^ permalink raw reply
* Re: [PATCH v1 net-next] selftest: net: Use port outside of the default ip_local_ports in csum.c.
From: Willem de Bruijn @ 2026-04-11 19:18 UTC (permalink / raw)
To: Kuniyuki Iwashima, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni
Cc: Simon Horman, Willem de Bruijn, Mahesh Bandewar,
Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
In-Reply-To: <20260410215420.1698033-1-kuniyu@google.com>
Kuniyuki Iwashima wrote:
> csum.c binds a socket on a fixed port in init_net to test
> the csum offload feature between two machines.
>
> In our testbed, the test sometimes fails with -EADDRINUSE.
>
> bind r: Address already in use
> bind dgram 6: Address already in use
>
> The fixed ports (33000, 33001, 34000) are all within the default
> ip_local_ports range (32768 ~ 60999), and other processes may
> happen to be using them.
>
> Let's use ports outside of the default ip_local_ports range to
> deflake the test.
>
> # cat /etc/services | grep -E "(13000|13001|13002)" || echo no service
> no service
> # rpm -qf /etc/services
> setup-2.15.0-28.fc44.noarch
>
> We could add an option to specify ports if needed.
>
> Suggested-by: Mahesh Bandewar <maheshb@google.com>
> Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
> ---
> tools/testing/selftests/net/lib/csum.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/tools/testing/selftests/net/lib/csum.c b/tools/testing/selftests/net/lib/csum.c
> index e28884ce3ab3..4e044689bc37 100644
> --- a/tools/testing/selftests/net/lib/csum.c
> +++ b/tools/testing/selftests/net/lib/csum.c
> @@ -105,9 +105,9 @@ static char *cfg_mac_src;
> static int cfg_proto = IPPROTO_UDP;
> static int cfg_payload_char = 'a';
> static int cfg_payload_len = 100;
> -static uint16_t cfg_port_dst = 34000;
This is paired with wait_port_listen(34000, .. in
tools/testing/selftests/drivers/net/hw/csum.py
It is also used in tools/testing/selftests/drivers/net/hw/tso.py,
which uses rand_port to select a port.
Probably more robust to indeed add an option to specify a port, and
in all callers use rand_port(s).
> -static uint16_t cfg_port_src = 33000;
> -static uint16_t cfg_port_src_encap = 33001;
> +static uint16_t cfg_port_dst = 13000;
> +static uint16_t cfg_port_src = 13001;
> +static uint16_t cfg_port_src_encap = 13002;
> static unsigned int cfg_random_seed;
> static int cfg_rcvbuf = 1 << 22; /* be able to queue large cfg_num_pkt */
> static bool cfg_send_pfpacket;
> --
> 2.53.0.1213.gd9a14994de-goog
>
^ permalink raw reply
* Re: [PATCH net 1/2] netfilter: skip recording stale or retransmitted INIT
From: Florian Westphal @ 2026-04-11 20:16 UTC (permalink / raw)
To: Xin Long
Cc: network dev, linux-sctp, davem, kuba, Eric Dumazet, Paolo Abeni,
Simon Horman, Marcelo Ricardo Leitner, Yi Chen
In-Reply-To: <6e09f9a8d1f13f3ce691c696d3dd7b2a2e6c6184.1775847557.git.lucien.xin@gmail.com>
Xin Long <lucien.xin@gmail.com> wrote:
> diff --git a/net/netfilter/nf_conntrack_proto_sctp.c b/net/netfilter/nf_conntrack_proto_sctp.c
> index 645d2c43ebf7..7e10fa65cbdd 100644
> --- a/net/netfilter/nf_conntrack_proto_sctp.c
> +++ b/net/netfilter/nf_conntrack_proto_sctp.c
> @@ -466,9 +466,13 @@ int nf_conntrack_sctp_packet(struct nf_conn *ct,
> if (!ih)
> goto out_unlock;
>
> - if (ct->proto.sctp.init[dir] && ct->proto.sctp.init[!dir])
> - ct->proto.sctp.init[!dir] = 0;
> - ct->proto.sctp.init[dir] = 1;
> + /* Do not record INIT matching peer vtag (stale or retransmitted INIT). */
> + if (old_state == SCTP_CONNTRACK_NONE ||
> + ct->proto.sctp.vtag[!dir] != ih->init_tag) {
Should ct->proto.sctp.vtag[!dir] == ih->init_tag case also
set ignore = true?
^ permalink raw reply
* Re: [PATCH net] netrom: do some basic forms of validation on incoming frames
From: Chris Maness @ 2026-04-11 20:33 UTC (permalink / raw)
To: hugh
Cc: Craig, Kuniyuki Iwashima, kuba, davem, edumazet, gregkh, horms,
linux-hams, linux-kernel, netdev, pabeni, stable, workflows,
yizhe
In-Reply-To: <CANnsUMEniMzLnp5h=Gz83=Wcegc-jGz9vqyWyEpWx-OH=Dij1w@mail.gmail.com>
I for one run two BBS’s that use LinFBB. LinFBB uses the kernel code
for its AX.25 stack. This is still excellent software that is
maintained. I also use Linux as a netrom node for said BBS with real
radio ports and internet connectivity to out of town nodes using the
ax25ipd/netromd that both use kernel stacks.
73 de Chris KQ6UP
Thanks,
Chris Maness
-Sent from my iPhone
On Fri, Apr 10, 2026 at 4:53 PM Chris Maness
<christopher.maness@gmail.com> wrote:
>
> I for one run two BBS’s that use LinFBB. LinFBB uses the kernel code for its AX.25 stack. This is still excellent software that is maintained. I also use Linux as a netrom node for said BBS with real radio ports and internet connectivity to out of town nodes using the ax25ipd/netromd that both use kernel stacks.
>
> 73 de Chris KQ6UP
>
> Thanks,
> Chris Maness
> -Sent from my iPhone
>
>
> On Fri, Apr 10, 2026 at 4:38 PM Hugh Blemings <hugh@blemings.org> wrote:
>>
>>
>> On 11/4/2026 08:51, Craig wrote:
>> >> If the main concern here is ongoing maintenance of these Ham Radio
>> >> related protocols/drivers, can we pause for a moment on anything as
>> >> dramatic as removing from the tree entirely ?
>> >>
>> >> There is a good cohort of capable kernel folks that either are or
>> >> were ham radio operators who I believe, upon realising that things
>> >> have got to this point, will be happy to redouble efforts to ensure
>> >> this code is maintained and tested to a satisfactory standard.
>> >>
>> >> Or, alternatively, as a technical community it may be that the Ham
>> >> Radio interested folks conclude that out of tree or user space
>> >> solutions are a better way forward as others have proposed.
>> >>
>> >> Give us a few days, please, for the word to be put around that we
>> >> need to pull ourselves together a bit as a technical group :)
>> >>
>> >
>> > I, for one, really can't imagine pulling an entire network subsystem
>> > out of the kernel without any
>> > knowledge of how/if/when it's used. Like intercontinental radio
>> > networks, global email, ax.25
>> > keyboard-to-keyboard, BBS and other emergency-communication systems
>> > throughout the
>> > world. If you're sure the Internet will never fail, I guess it makes
>> > sense removing all of this
>> > since it's inconvenient to maintain.
>> >
>> > Global AX.25 keyboard-to-keyboard on 14.105 MHz
>> >
>> > https://qsl.net/kb9pvh/105.html
>> >
>> > AX.25/netrom VHF routed networks spanning from Oregon to Los Angeles.
>> >
>> > https://www.easymapmaker.com/map/80666c4898ec6e8fa0c35add5d03282d
>> >
>> > Global radio email using AX.25
>> >
>> > https://winlink.org/RMSChannels (1,336 AX.25 email packet nodes on
>> > the Earth and Space)
>> >
>> > This is all in operation by Amateur Radio ARES emergency
>> > protocols/technologies. This
>> > will not pass the headline test when it comes to Linux detractors.
>> >
>> > Most of this is running on Raspberry Pi / Linux 24/7.
>> >
>> > If we want to kill all these apps and somehow force them into user space,
>> > it's akin to just switching to Windows - and flounder with the
>> > Microsoft folks
>> > trying to do the same thing.
>>
>> Your email Craig neatly encapsulates just some of the practical and
>> ongoing applications of the kernel code in question - I don't think this
>> is in dispute.
>>
>> What's pertinent is if we as the ham/amateur radio community can agree
>> on whether in tree, out of tree modules, or a userspace device driver
>> approach makes the most sense. If we are to keep code in the kernel in
>> any form, we as a community need to find someone(s) that have the skills
>> and bandwidth to keep the in tree code up to date.
>>
>> I don't think this would be onerous and I have a couple of people in
>> mind to nudge who may be happy to do so if that proves the right way
>> forward. At a pinch I could do it, but that'll mean a lot of catching
>> up. But I think it reasonable that the responsibility here falls to
>> folks that are closer to the code in question than the wider and
>> overworked kernel maintainer community.
>>
>> That said, I think Dan Cross's (KZ2X) earlier email makes a pretty strong
>> case for moving out of the kernel while still providing a way to have
>> backward compatibility, perhaps this might be the way forward?
>>
>> In any case, done well, this approach would not kill the apps or force
>> anything like switching to Windows! :) Great projects like digipi would
>> be able to continue with minimal changes.
>>
>> I wonder if a separate thread in linux-hams makes sense to discuss the
>> various longer term approaches to maintaining these capabilities - I'll
>> try to make time later today to kick one off - such deliberations will be
>> of less interest to the broader LKML and other lists.
>>
>> Cheers/73
>> Hugh
>>
>>
>>
>> >
>> >
>> > -craig
>> > https://digipi.org/
>> >
>> >
>> --
>> I am slowly moving to hugh@blemings.id.au as my main email address.
>> If you're using hugh@blemings.org please update your address book accordingly.
>> Thank you :)
>>
>>
--
Thanks,
Chris Maness
* Re: [PATCH net v4 0/2] stmmac crash/stall fixes when under memory pressure
From: Sam Edwards @ 2026-04-11 20:40 UTC (permalink / raw)
To: Russell King (Oracle)
Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Maxime Coquelin, Alexandre Torgue, Maxime Chevallier,
Ovidiu Panait, Vladimir Oltean, Baruch Siach, Serge Semin,
Giuseppe Cavallaro, netdev, linux-stm32, linux-arm-kernel,
linux-kernel
In-Reply-To: <adjrtRSepmac2hpN@shell.armlinux.org.uk>
On Fri, Apr 10, 2026 at 5:23 AM Russell King (Oracle)
<linux@armlinux.org.uk> wrote:
>
> On Thu, Apr 02, 2026 at 10:39:32AM -0700, Sam Edwards wrote:
> > On Thu, Apr 2, 2026 at 10:16 AM Russell King (Oracle)
> > <linux@armlinux.org.uk> wrote:
> > > I've tested this on my Jetson Xavier platform. One of the issues I've
> > > had is that running iperf3 results in the receive side stalling because
> > > it runs out of descriptors. However, despite the receive ring
> > > eventually being re-filled and the hardware appropriately prodded, it
> > > steadfastly refuses to restart, despite the descriptors having been
> > > updated.
> >
> > Hi Russell,
> >
> > Just to make sure I understand correctly: before my patches, you've
> > been observing this problem on Xavier for a while (no interrupts, ring
> > goes dry); with my patches, the ring is refilled, but the dwmac5
> > doesn't resume DMA. (Ah, just saw your follow-up email.)
> >
> > > Any ideas?
> >
> > Off the top of my head, my hypothesis is that dwmac5 has an additional
> > tripwire when the receive DMA is exhausted, and the
> > stmmac_set_rx_tail_ptr()/stmmac_enable_dma_reception() at the end of
> > stmmac_rx_refill() aren't sufficient to wake it back up.
> >
> > I think this is new to dwmac5, because my RK3588 (dwmac4.20 iirc)
> > happily resumes after the same condition.
> >
> > You gave a lot of info; thanks! I'll try to scrape up some
> > documentation on dwmac5 to see if there's something more
> > stmmac_rx_refill() ought to be doing. I think I have a Xavier NX
> > around here somewhere, I'll see if I can repro the problem.
>
> I've added dma_rmb() into dwmac4_wrback_get_tx_status() and
> dwmac4_wrback_get_rx_status(), and with that I've had an iperf3
> instance finally complete... but only once:
Hi Russell,

To me it feels relevant that the T194 doesn't use first-party
ARM/Cortex cores but rather Nvidia's in-house "Carmel" architecture.
Do you suppose the cache there is quirky in such a way that either:

1) We're seeing poor cache hygiene in stmmac where other caches are
more forgiving (more likely)
2) Carmel's cache has a subtle hardware bug triggered by stmmac's
specific access pattern (less likely)?

I'm still trying to get my Xavier NX to boot on net-next. It's running
into eMMC corruption/stalls very early in the boot process (at
slightly different times; feels like a problem in autocalibration)
that I'm not seeing on older kernels. Once I'm done bisecting that
regression I'll take a deeper look at this stmmac mystery. :)

Cheers,
Sam
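
For readers following the dma_rmb() discussion in this message, here is a
minimal userspace sketch of the "check ownership first, then read the rest
of the descriptor" ordering that is being debated. It is an analogy written
with C11 atomics, not stmmac code: struct desc, DESC_OWN, desc_complete()
and desc_read() are hypothetical names, and the acquire fence only stands in
for the role a read barrier plays between the OWN check and the later loads.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag: set while the "device" still owns the descriptor. */
#define DESC_OWN (1u << 31)

struct desc {
	uint32_t des1;            /* payload word, plain memory */
	uint32_t des2;            /* payload word, plain memory */
	_Atomic uint32_t des3;    /* status word carrying the OWN bit */
};

/* Producer side: write the payload, then publish it by clearing OWN. */
static void desc_complete(struct desc *d, uint32_t des1, uint32_t des2,
			  uint32_t status)
{
	assert(d);
	d->des1 = des1;
	d->des2 = des2;
	atomic_store_explicit(&d->des3, status & ~DESC_OWN,
			      memory_order_release);
}

/* Consumer side: look at OWN first; only after the fence is it safe to
 * read the other words, because the fence keeps those loads from being
 * satisfied before the ownership check.
 */
static bool desc_read(struct desc *d, uint32_t *des1, uint32_t *des2)
{
	uint32_t des3;

	assert(d && des1 && des2);
	des3 = atomic_load_explicit(&d->des3, memory_order_relaxed);
	if (des3 & DESC_OWN)
		return false;     /* still owned by the producer */

	atomic_thread_fence(memory_order_acquire);
	*des1 = d->des1;
	*des2 = d->des2;
	return true;
}
```

In this sketch a reader that finds OWN set returns early without touching
des1/des2, and a reader that finds OWN clear is guaranteed to see the
payload words written before the release store. Whether the real hardware
and cache hierarchy on T194 behave this way is exactly the open question in
this thread.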
>
> root@tegra-ubuntu:~# iperf3 -c 192.168.248.1 -R
> Connecting to host 192.168.248.1, port 5201
> Reverse mode, remote host 192.168.248.1 is sending
> [ 5] local 192.168.248.174 port 42232 connected to 192.168.248.1 port 5201
> [ ID] Interval Transfer Bitrate
> [ 5] 0.00-1.00 sec 50.8 MBytes 426 Mbits/sec
> [ 5] 1.00-2.00 sec 54.9 MBytes 460 Mbits/sec
> [ 5] 2.00-3.00 sec 54.0 MBytes 453 Mbits/sec
> [ 5] 3.00-4.00 sec 53.8 MBytes 452 Mbits/sec
> [ 5] 4.00-5.00 sec 52.4 MBytes 438 Mbits/sec
> [ 5] 5.00-6.00 sec 54.3 MBytes 455 Mbits/sec
> [ 5] 6.00-7.00 sec 53.7 MBytes 452 Mbits/sec
> [ 5] 7.00-8.00 sec 52.8 MBytes 443 Mbits/sec
> [ 5] 8.00-9.00 sec 53.7 MBytes 451 Mbits/sec
> [ 5] 9.00-10.00 sec 54.3 MBytes 455 Mbits/sec
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval Transfer Bitrate Retr
> [ 5] 0.00-10.01 sec 537 MBytes 450 Mbits/sec 13 sender
> [ 5] 0.00-10.00 sec 535 MBytes 448 Mbits/sec receiver
>
> iperf Done.
>
> So, it seems better, but not completely solved.
>
> diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c
> index 2994df41ec2c..119f31c94b61 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c
> @@ -17,10 +17,12 @@ static int dwmac4_wrback_get_tx_status(struct stmmac_extra_stats *x,
> struct dma_desc *p,
> void __iomem *ioaddr)
> {
> - u32 tdes3 = le32_to_cpu(p->des3);
> + u32 tdes3;
> int ret = tx_done;
>
> /* Get tx owner first */
> + dma_rmb();
> + tdes3 = le32_to_cpu(p->des3);
> if (unlikely(tdes3 & TDES3_OWN))
> return tx_dma_own;
>
> @@ -70,12 +72,12 @@ static int dwmac4_wrback_get_tx_status(struct stmmac_extra_stats *x,
> static int dwmac4_wrback_get_rx_status(struct stmmac_extra_stats *x,
> struct dma_desc *p)
> {
> - u32 rdes1 = le32_to_cpu(p->des1);
> - u32 rdes2 = le32_to_cpu(p->des2);
> - u32 rdes3 = le32_to_cpu(p->des3);
> + u32 rdes1, rdes2, rdes3;
> int message_type;
> int ret = good_frame;
>
> + dma_rmb();
> + rdes3 = le32_to_cpu(p->des3);
> if (unlikely(rdes3 & RDES3_OWN))
> return dma_own;
>
> @@ -107,6 +109,7 @@ static int dwmac4_wrback_get_rx_status(struct stmmac_extra_stats *x,
>
> message_type = FIELD_GET(RDES1_PTP_MSG_TYPE_MASK, rdes1);
>
> + rdes1 = le32_to_cpu(p->des1);
> if (rdes1 & RDES1_IP_HDR_ERROR) {
> x->ip_hdr_err++;
> ret |= csum_none;
> @@ -152,6 +155,7 @@ static int dwmac4_wrback_get_rx_status(struct stmmac_extra_stats *x,
> if (rdes1 & RDES1_TIMESTAMP_DROPPED)
> x->timestamp_dropped++;
>
> + rdes2 = le32_to_cpu(p->des2);
> if (unlikely(rdes2 & RDES2_SA_FILTER_FAIL)) {
> x->sa_rx_filter_fail++;
> ret = discard_frame;
>
> --
> RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
> FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
* Re: [PATCH next-next] net: phy: mscc: Drop redundant phydev->lock
From: Andrew Lunn @ 2026-04-11 20:44 UTC (permalink / raw)
To: Biju
Cc: Heiner Kallweit, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Biju Das, Russell King, Lad Prabhakar,
Horatiu Vultur, Vladimir Oltean, netdev, linux-kernel,
Geert Uytterhoeven, linux-renesas-soc
In-Reply-To: <20260411154959.200091-1-biju.das.jz@bp.renesas.com>
On Sat, Apr 11, 2026 at 04:49:56PM +0100, Biju wrote:
> From: Biju Das <biju.das.jz@bp.renesas.com>
>
> Remove manual mutex_lock/unlock(&phydev->lock) calls from several
> functions in the MSCC PHY driver, as the PHY core already holds this lock
> when invoking these callbacks.
>
> The affected functions are:
>
> vsc85xx_edge_rate_cntl_set() — lock/unlock around phy_modify_paged()
> vsc85xx_mac_if_set() — lock/unlock with a goto out_unlock error path
> vsc8531_pre_init_seq_set() — lock/unlock around phy_select/restore_page()
> vsc85xx_eee_init_seq_set() — lock/unlock around phy_select/restore_page()
>
> Along with dropping the locks, error-path labels are renamed from
> out_unlock to err or restore_oldpage to better reflect their purpose now
> that no unlocking is performed. In vsc8531_pre_init_seq_set() and
> vsc85xx_eee_init_seq_set(), the redundant intermediate assignment of
> oldpage before returning is also eliminated.
>
> No functional change intended.
This patch needs to be sent as part of the patchset with your other
change. The order they get merged matters, otherwise a git bisect
could land on a deadlock.
Andrew
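
As an illustration of the ordering concern, here is a small userspace
sketch, assuming a caller that holds a non-recursive lock while invoking a
callback. It is not PHY core code: fake_phy, core_resume() and the two
callbacks are hypothetical names, and a POSIX error-checking mutex is used
so that the second lock attempt reports EDEADLK instead of hanging the way
a plain mutex would.

```c
#define _POSIX_C_SOURCE 200809L   /* for PTHREAD_MUTEX_ERRORCHECK */
#include <assert.h>
#include <errno.h>
#include <pthread.h>

struct fake_phy {
	pthread_mutex_t lock;     /* stands in for phydev->lock */
};

/* Set up an error-checking mutex so relocking is reported, not fatal. */
static int fake_phy_init(struct fake_phy *phy)
{
	pthread_mutexattr_t attr;
	int ret;

	assert(phy);
	ret = pthread_mutexattr_init(&attr);
	if (ret)
		return ret;
	ret = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	if (!ret)
		ret = pthread_mutex_init(&phy->lock, &attr);
	pthread_mutexattr_destroy(&attr);
	return ret;
}

/* Callback that still takes the lock itself (the "before" state). */
static int callback_locked(struct fake_phy *phy)
{
	int ret = pthread_mutex_lock(&phy->lock);

	if (ret)
		return ret;       /* EDEADLK if the caller already holds it */
	pthread_mutex_unlock(&phy->lock);
	return 0;
}

/* Callback that relies on the caller's lock (the "after" state). */
static int callback_unlocked(struct fake_phy *phy)
{
	(void)phy;
	return 0;
}

/* Caller that holds the lock across the callback. */
static int core_resume(struct fake_phy *phy, int (*cb)(struct fake_phy *))
{
	int ret;

	assert(phy && cb);
	ret = pthread_mutex_lock(&phy->lock);
	if (ret)
		return ret;
	ret = cb(phy);
	pthread_mutex_unlock(&phy->lock);
	return ret;
}
```

With these definitions, core_resume(&phy, callback_locked) returns EDEADLK
while core_resume(&phy, callback_unlocked) returns 0. A bisect that lands
between the two patches would be in the first situation, which is why they
need to be merged in a fixed order.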
---
pw-bot: cr
* Re: [PATCH net] net: phy: qcom: at803x: Use the correct bit to disable extended next page
From: Andrew Lunn @ 2026-04-11 20:49 UTC (permalink / raw)
To: Maxime Chevallier
Cc: Jakub Kicinski, davem, Eric Dumazet, Paolo Abeni, Simon Horman,
Russell King, thomas.petazzoni, netdev, linux-kernel,
linux-arm-msm
In-Reply-To: <c37f182e-cbb4-4f0b-817a-759d39940212@bootlin.com>
> > Should we keep ADVERTISE_RESV?
> >
> > 45.2.7.6 AN advertisement register
> >
> > If the Auto-Negotiation advertisement register (register 4) is
> > present, (see 28.2.4.1.3), then this register is a copy of the
> > Auto-Negotiation advertisement register (register 4). In this case,
> > reads to the AN advertisement register (7.16) report the value of
> > the Auto-Negotiation advertisement register (register 4); writes to
> > the AN advertisement register (7.16) cause a write to occur to the
> > Auto-Negotiation advertisement register.
> >
> > So MDIO_MMD_AN:MDIO_AN_ADVERTISE is a straight copy of MII_ADVERTISE.
> >
> > ef4_mdio_write(efx, MDIO_MMD_AN, MDIO_AN_ADVERTISE, reg);
> > ret = phy_write_mmd(phydev, MDIO_MMD_AN, MDIO_AN_ADVERTISE, adv);
> >
> > So ADVERTISE_XNP is just as valid in the other two drivers using
> > ADVERTISE_RESV. I think we should change those as well to
> > ADVERTISE_XNP and remove ADVERTISE_RESV?
> >
> > Andrew
>
> I agree with that yes and I've considered converting these drivers once
> we have net merged into net-next should this patch be applied :)
Ah, sorry, missed the patch was targeting net. Please do submit a
cleanup for net-next later on.
> That said, ADVERTISE_RESV is in uapi; is it even possible to remove it?
Good point. It probably does have to stay.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Andrew
* Re: [PATCH net-next 1/2] net: dsa: mxl862xx: add ethtool statistics support
From: Andrew Lunn @ 2026-04-11 22:41 UTC (permalink / raw)
To: Daniel Golle
Cc: Vladimir Oltean, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Russell King, netdev, linux-kernel, Frank Wunderlich,
Chad Monroe, Cezary Wilmanski, Liang Xu, Benny (Ying-Tsan) Weng,
Jose Maria Verdu Munoz, Avinash Jayaraman, John Crispin
In-Reply-To: <127b374aba1fb8eb4ad9215e0c0826dd4545fc58.1775865049.git.daniel@makrotopia.org>
> +static int mxl862xx_read_rmon(struct dsa_switch *ds, int port,
> + struct mxl862xx_rmon_port_cnt *cnt)
> +{
> + memset(cnt, 0, sizeof(*cnt));
> + cnt->port_type = cpu_to_le32(MXL862XX_CTP_PORT);
> + cnt->port_id = cpu_to_le16(port);
> +
> + return MXL862XX_API_READ(ds->priv, MXL862XX_RMON_PORT_GET, *cnt);
> +}
> +
> +static void mxl862xx_get_ethtool_stats(struct dsa_switch *ds, int port,
> + u64 *data)
> +{
> + const struct mxl862xx_mib_desc *mib;
> + struct mxl862xx_rmon_port_cnt cnt;
> + int ret, i;
> + void *field;
> +
> + ret = mxl862xx_read_rmon(ds, port, &cnt);
> + if (ret) {
> + dev_err(ds->dev, "failed to read RMON stats on port %d\n", port);
> + return;
> + }
RMON statistics should be returned via the .get_rmon_stats in
dsa_switch_ops. Please only return statistics here which don't fit any
of the other statistics functions in dsa_switch_ops.
Andrew
---
pw-bot: cr
* [PATCH net-next v2 0/2] net: dsa: mxl862xx: add statistics support
From: Daniel Golle @ 2026-04-12 0:01 UTC (permalink / raw)
To: Daniel Golle, Andrew Lunn, Vladimir Oltean, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Russell King, netdev,
linux-kernel
Cc: Frank Wunderlich, Chad Monroe, Cezary Wilmanski, Liang Xu,
Benny (Ying-Tsan) Weng, Jose Maria Verdu Munoz, Avinash Jayaraman,
John Crispin
Add per-port RMON statistics support for the MxL862xx DSA driver,
covering hardware-specific ethtool -S counters, standard IEEE 802.3
MAC/ctrl/pause statistics, and rtnl_link_stats64 via polled 64-bit
accumulation.
Changes since v1:
* trim mxl862xx_mib[] to counters not covered elsewhere only
* remove histogram counters (moved to .get_rmon_stats)
* remove RMON error counters (moved to .get_rmon_stats)
* remove counters already in .get_eth_mac_stats
* remove counters already in .get_stats64
* add mxl862xx_rmon_ranges[] and mxl862xx_get_rmon_stats()
Daniel Golle (2):
net: dsa: mxl862xx: add ethtool statistics support
net: dsa: mxl862xx: implement .get_stats64
drivers/net/dsa/mxl862xx/mxl862xx-api.h | 142 +++++++++
drivers/net/dsa/mxl862xx/mxl862xx-cmd.h | 3 +
drivers/net/dsa/mxl862xx/mxl862xx-host.c | 8 +-
drivers/net/dsa/mxl862xx/mxl862xx.c | 348 +++++++++++++++++++++++
drivers/net/dsa/mxl862xx/mxl862xx.h | 94 +++++-
5 files changed, 588 insertions(+), 7 deletions(-)
--
2.53.0
* [PATCH net-next v2 1/2] net: dsa: mxl862xx: add ethtool statistics support
From: Daniel Golle @ 2026-04-12 0:01 UTC (permalink / raw)
To: Daniel Golle, Andrew Lunn, Vladimir Oltean, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Russell King, netdev,
linux-kernel
Cc: Frank Wunderlich, Chad Monroe, Cezary Wilmanski, Liang Xu,
Benny (Ying-Tsan) Weng, Jose Maria Verdu Munoz, Avinash Jayaraman,
John Crispin
In-Reply-To: <cover.1775951347.git.daniel@makrotopia.org>
The MxL862xx firmware exposes per-port RMON counters through the
RMON_PORT_GET command, covering standard IEEE 802.3 MAC statistics
(unicast/multicast/broadcast packet and byte counts, collision
counters, pause frames) as well as hardware-specific counters such
as extended VLAN discard and MTU exceed events.
Add the RMON counter firmware API structures and command definitions.
Implement .get_strings, .get_sset_count, and .get_ethtool_stats for
legacy ethtool -S support. Implement .get_eth_mac_stats,
.get_eth_ctrl_stats, and .get_pause_stats for the standardized
IEEE 802.3 statistics interface.
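
A standalone sketch of the offset-table approach used for the legacy
ethtool -S counters may help when reading the patch. It assumes a tiny
stand-in structure and host-endian values; demo_cnt, demo_desc, DEMO_DESC
and demo_get_stats() are hypothetical names and do not appear in the
driver, which additionally converts from little-endian firmware data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the packed firmware reply (GCC/Clang packed attribute). */
struct demo_cnt {
	uint8_t  flags;
	uint32_t filtered_pkts;
	uint64_t bad_bytes;
} __attribute__((packed));

struct demo_desc {
	unsigned int size;        /* 1 = 32-bit counter, 2 = 64-bit counter */
	size_t offset;            /* byte offset inside struct demo_cnt */
	const char *name;         /* string reported to userspace */
};

#define DEMO_DESC(_size, _name, _element)                       \
{                                                               \
	.size = _size,                                          \
	.name = _name,                                          \
	.offset = offsetof(struct demo_cnt, _element)           \
}

static const struct demo_desc demo_mib[] = {
	DEMO_DESC(1, "RxFilteredPkts", filtered_pkts),
	DEMO_DESC(2, "RxBadBytes", bad_bytes),
};

/* Walk the table and widen every counter to 64 bits; returns the count.
 * memcpy() is used because fields of a packed struct may be unaligned.
 */
static size_t demo_get_stats(const struct demo_cnt *cnt, uint64_t *data)
{
	size_t n = sizeof(demo_mib) / sizeof(demo_mib[0]);
	size_t i;

	assert(cnt && data);
	for (i = 0; i < n; i++) {
		const unsigned char *field =
			(const unsigned char *)cnt + demo_mib[i].offset;

		if (demo_mib[i].size == 1) {
			uint32_t v;

			memcpy(&v, field, sizeof(v));
			data[i] = v;
		} else {
			uint64_t v;

			memcpy(&v, field, sizeof(v));
			data[i] = v;
		}
	}
	return n;
}
```

The point of the table is that the string list and the value list are
generated from the same array, so their order cannot drift apart.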
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
---
v2:
* trim mxl862xx_mib[] to counters not covered elsewhere only
* remove histogram counters (moved to .get_rmon_stats)
* remove RMON error counters (moved to .get_rmon_stats)
* remove counters already in .get_eth_mac_stats
* remove counters already in .get_stats64
* add mxl862xx_rmon_ranges[] and mxl862xx_get_rmon_stats()
drivers/net/dsa/mxl862xx/mxl862xx-api.h | 142 +++++++++++++++++++
drivers/net/dsa/mxl862xx/mxl862xx-cmd.h | 3 +
drivers/net/dsa/mxl862xx/mxl862xx.c | 173 ++++++++++++++++++++++++
3 files changed, 318 insertions(+)
diff --git a/drivers/net/dsa/mxl862xx/mxl862xx-api.h b/drivers/net/dsa/mxl862xx/mxl862xx-api.h
index c902e90397e5f..fb21ddc1bf1c0 100644
--- a/drivers/net/dsa/mxl862xx/mxl862xx-api.h
+++ b/drivers/net/dsa/mxl862xx/mxl862xx-api.h
@@ -1224,4 +1224,146 @@ struct mxl862xx_sys_fw_image_version {
__le32 iv_build_num;
} __packed;
+/**
+ * enum mxl862xx_port_type - Port Type
+ * @MXL862XX_LOGICAL_PORT: Logical Port
+ * @MXL862XX_PHYSICAL_PORT: Physical Port
+ * @MXL862XX_CTP_PORT: Connectivity Termination Port (CTP)
+ * @MXL862XX_BRIDGE_PORT: Bridge Port
+ */
+enum mxl862xx_port_type {
+ MXL862XX_LOGICAL_PORT = 0,
+ MXL862XX_PHYSICAL_PORT,
+ MXL862XX_CTP_PORT,
+ MXL862XX_BRIDGE_PORT,
+};
+
+/**
+ * enum mxl862xx_rmon_port_type - RMON counter table type
+ * @MXL862XX_RMON_CTP_PORT_RX: CTP RX counters
+ * @MXL862XX_RMON_CTP_PORT_TX: CTP TX counters
+ * @MXL862XX_RMON_BRIDGE_PORT_RX: Bridge port RX counters
+ * @MXL862XX_RMON_BRIDGE_PORT_TX: Bridge port TX counters
+ * @MXL862XX_RMON_CTP_PORT_PCE_BYPASS: CTP PCE bypass counters
+ * @MXL862XX_RMON_TFLOW_RX: TFLOW RX counters
+ * @MXL862XX_RMON_TFLOW_TX: TFLOW TX counters
+ * @MXL862XX_RMON_QMAP: QMAP counters
+ * @MXL862XX_RMON_METER: Meter counters
+ * @MXL862XX_RMON_PMAC: PMAC counters
+ */
+enum mxl862xx_rmon_port_type {
+ MXL862XX_RMON_CTP_PORT_RX = 0,
+ MXL862XX_RMON_CTP_PORT_TX,
+ MXL862XX_RMON_BRIDGE_PORT_RX,
+ MXL862XX_RMON_BRIDGE_PORT_TX,
+ MXL862XX_RMON_CTP_PORT_PCE_BYPASS,
+ MXL862XX_RMON_TFLOW_RX,
+ MXL862XX_RMON_TFLOW_TX,
+ MXL862XX_RMON_QMAP = 0x0e,
+ MXL862XX_RMON_METER = 0x19,
+ MXL862XX_RMON_PMAC = 0x1c,
+};
+
+/**
+ * struct mxl862xx_rmon_port_cnt - RMON counters for a port
+ * @port_type: Port type for counter retrieval (see &enum mxl862xx_port_type)
+ * @port_id: Ethernet port number (zero-based)
+ * @sub_if_id_group: Sub-interface ID group
+ * @pce_bypass: Separate CTP Tx counters when PCE is bypassed
+ * @rx_extended_vlan_discard_pkts: Discarded at extended VLAN operation
+ * @mtu_exceed_discard_pkts: Discarded due to MTU exceeded
+ * @tx_under_size_good_pkts: Tx undersize (<64) packet count
+ * @tx_oversize_good_pkts: Tx oversize (>1518) packet count
+ * @rx_good_pkts: Received good packet count
+ * @rx_unicast_pkts: Received unicast packet count
+ * @rx_broadcast_pkts: Received broadcast packet count
+ * @rx_multicast_pkts: Received multicast packet count
+ * @rx_fcserror_pkts: Received FCS error packet count
+ * @rx_under_size_good_pkts: Received undersize good packet count
+ * @rx_oversize_good_pkts: Received oversize good packet count
+ * @rx_under_size_error_pkts: Received undersize error packet count
+ * @rx_good_pause_pkts: Received good pause packet count
+ * @rx_oversize_error_pkts: Received oversize error packet count
+ * @rx_align_error_pkts: Received alignment error packet count
+ * @rx_filtered_pkts: Filtered packet count
+ * @rx64byte_pkts: Received 64-byte packet count
+ * @rx127byte_pkts: Received 65-127 byte packet count
+ * @rx255byte_pkts: Received 128-255 byte packet count
+ * @rx511byte_pkts: Received 256-511 byte packet count
+ * @rx1023byte_pkts: Received 512-1023 byte packet count
+ * @rx_max_byte_pkts: Received 1024-max byte packet count
+ * @tx_good_pkts: Transmitted good packet count
+ * @tx_unicast_pkts: Transmitted unicast packet count
+ * @tx_broadcast_pkts: Transmitted broadcast packet count
+ * @tx_multicast_pkts: Transmitted multicast packet count
+ * @tx_single_coll_count: Transmit single collision count
+ * @tx_mult_coll_count: Transmit multiple collision count
+ * @tx_late_coll_count: Transmit late collision count
+ * @tx_excess_coll_count: Transmit excessive collision count
+ * @tx_coll_count: Transmit collision count
+ * @tx_pause_count: Transmit pause packet count
+ * @tx64byte_pkts: Transmitted 64-byte packet count
+ * @tx127byte_pkts: Transmitted 65-127 byte packet count
+ * @tx255byte_pkts: Transmitted 128-255 byte packet count
+ * @tx511byte_pkts: Transmitted 256-511 byte packet count
+ * @tx1023byte_pkts: Transmitted 512-1023 byte packet count
+ * @tx_max_byte_pkts: Transmitted 1024-max byte packet count
+ * @tx_dropped_pkts: Transmit dropped packet count
+ * @tx_acm_dropped_pkts: Transmit ACM dropped packet count
+ * @rx_dropped_pkts: Received dropped packet count
+ * @rx_good_bytes: Received good byte count (64-bit)
+ * @rx_bad_bytes: Received bad byte count (64-bit)
+ * @tx_good_bytes: Transmitted good byte count (64-bit)
+ */
+struct mxl862xx_rmon_port_cnt {
+ __le32 port_type; /* enum mxl862xx_port_type */
+ __le16 port_id;
+ __le16 sub_if_id_group;
+ u8 pce_bypass;
+ __le32 rx_extended_vlan_discard_pkts;
+ __le32 mtu_exceed_discard_pkts;
+ __le32 tx_under_size_good_pkts;
+ __le32 tx_oversize_good_pkts;
+ __le32 rx_good_pkts;
+ __le32 rx_unicast_pkts;
+ __le32 rx_broadcast_pkts;
+ __le32 rx_multicast_pkts;
+ __le32 rx_fcserror_pkts;
+ __le32 rx_under_size_good_pkts;
+ __le32 rx_oversize_good_pkts;
+ __le32 rx_under_size_error_pkts;
+ __le32 rx_good_pause_pkts;
+ __le32 rx_oversize_error_pkts;
+ __le32 rx_align_error_pkts;
+ __le32 rx_filtered_pkts;
+ __le32 rx64byte_pkts;
+ __le32 rx127byte_pkts;
+ __le32 rx255byte_pkts;
+ __le32 rx511byte_pkts;
+ __le32 rx1023byte_pkts;
+ __le32 rx_max_byte_pkts;
+ __le32 tx_good_pkts;
+ __le32 tx_unicast_pkts;
+ __le32 tx_broadcast_pkts;
+ __le32 tx_multicast_pkts;
+ __le32 tx_single_coll_count;
+ __le32 tx_mult_coll_count;
+ __le32 tx_late_coll_count;
+ __le32 tx_excess_coll_count;
+ __le32 tx_coll_count;
+ __le32 tx_pause_count;
+ __le32 tx64byte_pkts;
+ __le32 tx127byte_pkts;
+ __le32 tx255byte_pkts;
+ __le32 tx511byte_pkts;
+ __le32 tx1023byte_pkts;
+ __le32 tx_max_byte_pkts;
+ __le32 tx_dropped_pkts;
+ __le32 tx_acm_dropped_pkts;
+ __le32 rx_dropped_pkts;
+ __le64 rx_good_bytes;
+ __le64 rx_bad_bytes;
+ __le64 tx_good_bytes;
+} __packed;
+
#endif /* __MXL862XX_API_H */
diff --git a/drivers/net/dsa/mxl862xx/mxl862xx-cmd.h b/drivers/net/dsa/mxl862xx/mxl862xx-cmd.h
index 45df37cde40d1..f1ea40aa7ea08 100644
--- a/drivers/net/dsa/mxl862xx/mxl862xx-cmd.h
+++ b/drivers/net/dsa/mxl862xx/mxl862xx-cmd.h
@@ -16,6 +16,7 @@
#define MXL862XX_BRDGPORT_MAGIC 0x400
#define MXL862XX_CTP_MAGIC 0x500
#define MXL862XX_QOS_MAGIC 0x600
+#define MXL862XX_RMON_MAGIC 0x700
#define MXL862XX_SWMAC_MAGIC 0xa00
#define MXL862XX_EXTVLAN_MAGIC 0xb00
#define MXL862XX_VLANFILTER_MAGIC 0xc00
@@ -43,6 +44,8 @@
#define MXL862XX_QOS_METERCFGSET (MXL862XX_QOS_MAGIC + 0x2)
#define MXL862XX_QOS_METERALLOC (MXL862XX_QOS_MAGIC + 0x2a)
+#define MXL862XX_RMON_PORT_GET (MXL862XX_RMON_MAGIC + 0x1)
+
#define MXL862XX_MAC_TABLEENTRYADD (MXL862XX_SWMAC_MAGIC + 0x2)
#define MXL862XX_MAC_TABLEENTRYREAD (MXL862XX_SWMAC_MAGIC + 0x3)
#define MXL862XX_MAC_TABLEENTRYQUERY (MXL862XX_SWMAC_MAGIC + 0x4)
diff --git a/drivers/net/dsa/mxl862xx/mxl862xx.c b/drivers/net/dsa/mxl862xx/mxl862xx.c
index fca9a3e36bb69..58bf7210c6d40 100644
--- a/drivers/net/dsa/mxl862xx/mxl862xx.c
+++ b/drivers/net/dsa/mxl862xx/mxl862xx.c
@@ -30,6 +30,38 @@
#define MXL862XX_API_READ_QUIET(dev, cmd, data) \
mxl862xx_api_wrap(dev, cmd, &(data), sizeof((data)), true, true)
+struct mxl862xx_mib_desc {
+ unsigned int size;
+ unsigned int offset;
+ const char *name;
+};
+
+#define MIB_DESC(_size, _name, _element) \
+{ \
+ .size = _size, \
+ .name = _name, \
+ .offset = offsetof(struct mxl862xx_rmon_port_cnt, _element) \
+}
+
+/* Hardware-specific counters not covered by any standardized stats callback. */
+static const struct mxl862xx_mib_desc mxl862xx_mib[] = {
+ MIB_DESC(1, "TxAcmDroppedPkts", tx_acm_dropped_pkts),
+ MIB_DESC(1, "RxFilteredPkts", rx_filtered_pkts),
+ MIB_DESC(1, "RxExtendedVlanDiscardPkts", rx_extended_vlan_discard_pkts),
+ MIB_DESC(1, "MtuExceedDiscardPkts", mtu_exceed_discard_pkts),
+ MIB_DESC(2, "RxBadBytes", rx_bad_bytes),
+};
+
+static const struct ethtool_rmon_hist_range mxl862xx_rmon_ranges[] = {
+ { 0, 64 },
+ { 65, 127 },
+ { 128, 255 },
+ { 256, 511 },
+ { 512, 1023 },
+ { 1024, 10240 },
+ {}
+};
+
#define MXL862XX_SDMA_PCTRLP(p) (0xbc0 + ((p) * 0x6))
#define MXL862XX_SDMA_PCTRL_EN BIT(0)
@@ -1734,6 +1766,140 @@ static int mxl862xx_port_bridge_flags(struct dsa_switch *ds, int port,
return 0;
}
+static void mxl862xx_get_strings(struct dsa_switch *ds, int port,
+ u32 stringset, u8 *data)
+{
+ int i;
+
+ if (stringset != ETH_SS_STATS)
+ return;
+
+ for (i = 0; i < ARRAY_SIZE(mxl862xx_mib); i++)
+ ethtool_puts(&data, mxl862xx_mib[i].name);
+}
+
+static int mxl862xx_get_sset_count(struct dsa_switch *ds, int port, int sset)
+{
+ if (sset != ETH_SS_STATS)
+ return 0;
+
+ return ARRAY_SIZE(mxl862xx_mib);
+}
+
+static int mxl862xx_read_rmon(struct dsa_switch *ds, int port,
+ struct mxl862xx_rmon_port_cnt *cnt)
+{
+ memset(cnt, 0, sizeof(*cnt));
+ cnt->port_type = cpu_to_le32(MXL862XX_CTP_PORT);
+ cnt->port_id = cpu_to_le16(port);
+
+ return MXL862XX_API_READ(ds->priv, MXL862XX_RMON_PORT_GET, *cnt);
+}
+
+static void mxl862xx_get_ethtool_stats(struct dsa_switch *ds, int port,
+ u64 *data)
+{
+ const struct mxl862xx_mib_desc *mib;
+ struct mxl862xx_rmon_port_cnt cnt;
+ int ret, i;
+ void *field;
+
+ ret = mxl862xx_read_rmon(ds, port, &cnt);
+ if (ret) {
+ dev_err(ds->dev, "failed to read RMON stats on port %d\n", port);
+ return;
+ }
+
+ for (i = 0; i < ARRAY_SIZE(mxl862xx_mib); i++) {
+ mib = &mxl862xx_mib[i];
+ field = (u8 *)&cnt + mib->offset;
+
+ if (mib->size == 1)
+ *data++ = le32_to_cpu(*(__le32 *)field);
+ else
+ *data++ = le64_to_cpu(*(__le64 *)field);
+ }
+}
+
+static void mxl862xx_get_eth_mac_stats(struct dsa_switch *ds, int port,
+ struct ethtool_eth_mac_stats *mac_stats)
+{
+ struct mxl862xx_rmon_port_cnt cnt;
+
+ if (mxl862xx_read_rmon(ds, port, &cnt))
+ return;
+
+ mac_stats->FramesTransmittedOK = le32_to_cpu(cnt.tx_good_pkts);
+ mac_stats->SingleCollisionFrames = le32_to_cpu(cnt.tx_single_coll_count);
+ mac_stats->MultipleCollisionFrames = le32_to_cpu(cnt.tx_mult_coll_count);
+ mac_stats->FramesReceivedOK = le32_to_cpu(cnt.rx_good_pkts);
+ mac_stats->FrameCheckSequenceErrors = le32_to_cpu(cnt.rx_fcserror_pkts);
+ mac_stats->AlignmentErrors = le32_to_cpu(cnt.rx_align_error_pkts);
+ mac_stats->OctetsTransmittedOK = le64_to_cpu(cnt.tx_good_bytes);
+ mac_stats->LateCollisions = le32_to_cpu(cnt.tx_late_coll_count);
+ mac_stats->FramesAbortedDueToXSColls = le32_to_cpu(cnt.tx_excess_coll_count);
+ mac_stats->OctetsReceivedOK = le64_to_cpu(cnt.rx_good_bytes);
+ mac_stats->MulticastFramesXmittedOK = le32_to_cpu(cnt.tx_multicast_pkts);
+ mac_stats->BroadcastFramesXmittedOK = le32_to_cpu(cnt.tx_broadcast_pkts);
+ mac_stats->MulticastFramesReceivedOK = le32_to_cpu(cnt.rx_multicast_pkts);
+ mac_stats->BroadcastFramesReceivedOK = le32_to_cpu(cnt.rx_broadcast_pkts);
+ mac_stats->FrameTooLongErrors = le32_to_cpu(cnt.rx_oversize_error_pkts);
+}
+
+static void mxl862xx_get_eth_ctrl_stats(struct dsa_switch *ds, int port,
+ struct ethtool_eth_ctrl_stats *ctrl_stats)
+{
+ struct mxl862xx_rmon_port_cnt cnt;
+
+ if (mxl862xx_read_rmon(ds, port, &cnt))
+ return;
+
+ ctrl_stats->MACControlFramesTransmitted = le32_to_cpu(cnt.tx_pause_count);
+ ctrl_stats->MACControlFramesReceived = le32_to_cpu(cnt.rx_good_pause_pkts);
+}
+
+static void mxl862xx_get_pause_stats(struct dsa_switch *ds, int port,
+ struct ethtool_pause_stats *pause_stats)
+{
+ struct mxl862xx_rmon_port_cnt cnt;
+
+ if (mxl862xx_read_rmon(ds, port, &cnt))
+ return;
+
+ pause_stats->tx_pause_frames = le32_to_cpu(cnt.tx_pause_count);
+ pause_stats->rx_pause_frames = le32_to_cpu(cnt.rx_good_pause_pkts);
+}
+
+static void mxl862xx_get_rmon_stats(struct dsa_switch *ds, int port,
+ struct ethtool_rmon_stats *rmon_stats,
+ const struct ethtool_rmon_hist_range **ranges)
+{
+ struct mxl862xx_rmon_port_cnt cnt;
+
+ if (mxl862xx_read_rmon(ds, port, &cnt))
+ return;
+
+ rmon_stats->undersize_pkts = le32_to_cpu(cnt.rx_under_size_good_pkts);
+ rmon_stats->oversize_pkts = le32_to_cpu(cnt.rx_oversize_good_pkts);
+ rmon_stats->fragments = le32_to_cpu(cnt.rx_under_size_error_pkts);
+ rmon_stats->jabbers = le32_to_cpu(cnt.rx_oversize_error_pkts);
+
+ rmon_stats->hist[0] = le32_to_cpu(cnt.rx64byte_pkts);
+ rmon_stats->hist[1] = le32_to_cpu(cnt.rx127byte_pkts);
+ rmon_stats->hist[2] = le32_to_cpu(cnt.rx255byte_pkts);
+ rmon_stats->hist[3] = le32_to_cpu(cnt.rx511byte_pkts);
+ rmon_stats->hist[4] = le32_to_cpu(cnt.rx1023byte_pkts);
+ rmon_stats->hist[5] = le32_to_cpu(cnt.rx_max_byte_pkts);
+
+ rmon_stats->hist_tx[0] = le32_to_cpu(cnt.tx64byte_pkts);
+ rmon_stats->hist_tx[1] = le32_to_cpu(cnt.tx127byte_pkts);
+ rmon_stats->hist_tx[2] = le32_to_cpu(cnt.tx255byte_pkts);
+ rmon_stats->hist_tx[3] = le32_to_cpu(cnt.tx511byte_pkts);
+ rmon_stats->hist_tx[4] = le32_to_cpu(cnt.tx1023byte_pkts);
+ rmon_stats->hist_tx[5] = le32_to_cpu(cnt.tx_max_byte_pkts);
+
+ *ranges = mxl862xx_rmon_ranges;
+}
static const struct dsa_switch_ops mxl862xx_switch_ops = {
.get_tag_protocol = mxl862xx_get_tag_protocol,
.setup = mxl862xx_setup,
@@ -1758,6 +1924,13 @@ static const struct dsa_switch_ops mxl862xx_switch_ops = {
.port_vlan_filtering = mxl862xx_port_vlan_filtering,
.port_vlan_add = mxl862xx_port_vlan_add,
.port_vlan_del = mxl862xx_port_vlan_del,
+ .get_strings = mxl862xx_get_strings,
+ .get_sset_count = mxl862xx_get_sset_count,
+ .get_ethtool_stats = mxl862xx_get_ethtool_stats,
+ .get_eth_mac_stats = mxl862xx_get_eth_mac_stats,
+ .get_eth_ctrl_stats = mxl862xx_get_eth_ctrl_stats,
+ .get_pause_stats = mxl862xx_get_pause_stats,
+ .get_rmon_stats = mxl862xx_get_rmon_stats,
};
static void mxl862xx_phylink_mac_config(struct phylink_config *config,
--
2.53.0
* [PATCH net-next v2 2/2] net: dsa: mxl862xx: implement .get_stats64
From: Daniel Golle @ 2026-04-12 0:02 UTC (permalink / raw)
To: Daniel Golle, Andrew Lunn, Vladimir Oltean, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Russell King, netdev,
linux-kernel
Cc: Frank Wunderlich, Chad Monroe, Cezary Wilmanski, Liang Xu,
Benny (Ying-Tsan) Weng, Jose Maria Verdu Munoz, Avinash Jayaraman,
John Crispin
In-Reply-To: <cover.1775951347.git.daniel@makrotopia.org>
Poll free-running firmware RMON counters every 2 seconds and accumulate
deltas into 64-bit per-port statistics. 32-bit packet counters wrap
in ~220s at 10 Gbps line rate with minimum-size frames; the 2s polling
interval provides a comfortable margin. The .get_stats64 callback
forces a fresh poll so that counters are always up to date when queried.
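
The wrap-time figure and the delta accumulation can be checked with a
short standalone sketch. It assumes back-to-back 64-byte frames and counts
frame bytes only (no preamble or inter-frame gap), which is what
reproduces the ~220s estimate; wrap_seconds(), delta32(), struct acc and
acc_update() are hypothetical helper names, not driver functions.

```c
#include <assert.h>
#include <stdint.h>

/* Whole seconds until a 32-bit packet counter wraps at the given rate,
 * counting frame bytes only (assumption stated above).
 */
static uint64_t wrap_seconds(uint64_t rate_bps, uint32_t frame_bytes)
{
	uint64_t pps;

	assert(rate_bps > 0 && frame_bytes > 0);
	pps = rate_bps / (8ull * frame_bytes);
	assert(pps > 0);
	return (1ull << 32) / pps;
}

/* Delta between two snapshots of a free-running 32-bit counter. The
 * unsigned subtraction is modulo 2^32, so one wrap is handled for free.
 */
static uint64_t delta32(uint32_t cur, uint32_t prev)
{
	return (uint32_t)(cur - prev);
}

struct acc {
	uint64_t total;           /* 64-bit accumulated value */
	uint32_t prev;            /* last raw snapshot */
};

/* Fold a new snapshot into the 64-bit total and remember it. */
static void acc_update(struct acc *a, uint32_t cur)
{
	assert(a);
	a->total += delta32(cur, a->prev);
	a->prev = cur;
}
```

Under these assumptions wrap_seconds() gives 219 for 10 Gbps and 879 for
2.5 Gbps, so a 2s poll sees at most a small fraction of the counter range
between snapshots, and a single wrap between polls still yields the right
delta.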
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
---
v2: no changes
drivers/net/dsa/mxl862xx/mxl862xx-host.c | 8 +-
drivers/net/dsa/mxl862xx/mxl862xx.c | 175 +++++++++++++++++++++++
drivers/net/dsa/mxl862xx/mxl862xx.h | 94 +++++++++++-
3 files changed, 270 insertions(+), 7 deletions(-)
diff --git a/drivers/net/dsa/mxl862xx/mxl862xx-host.c b/drivers/net/dsa/mxl862xx/mxl862xx-host.c
index cadbdb590cf43..d55f9dff6433e 100644
--- a/drivers/net/dsa/mxl862xx/mxl862xx-host.c
+++ b/drivers/net/dsa/mxl862xx/mxl862xx-host.c
@@ -48,7 +48,7 @@ static void mxl862xx_crc_err_work_fn(struct work_struct *work)
dev_close(dp->conduit);
rtnl_unlock();
- clear_bit(0, &priv->crc_err);
+ clear_bit(MXL862XX_FLAG_CRC_ERR, &priv->flags);
}
/* Firmware CRC error codes (outside normal Zephyr errno range). */
@@ -247,7 +247,7 @@ static int mxl862xx_issue_cmd(struct mxl862xx_priv *priv, u16 cmd, u16 len)
ret = mxl862xx_crc6_verify(ctrl_enc, len_enc, &fw_result);
if (ret) {
- if (!test_and_set_bit(0, &priv->crc_err))
+ if (!test_and_set_bit(MXL862XX_FLAG_CRC_ERR, &priv->flags))
schedule_work(&priv->crc_err_work);
return -EIO;
}
@@ -314,7 +314,7 @@ static int mxl862xx_send_cmd(struct mxl862xx_priv *priv, u16 cmd, u16 size,
if (ret < 0) {
if ((ret == MXL862XX_FW_CRC6_ERR ||
ret == MXL862XX_FW_CRC16_ERR) &&
- !test_and_set_bit(0, &priv->crc_err))
+ !test_and_set_bit(MXL862XX_FLAG_CRC_ERR, &priv->flags))
schedule_work(&priv->crc_err_work);
if (!quiet)
dev_err(&priv->mdiodev->dev,
@@ -458,7 +458,7 @@ int mxl862xx_api_wrap(struct mxl862xx_priv *priv, u16 cmd, void *_data,
}
if (crc16(0xffff, (const u8 *)data, size) != crc) {
- if (!test_and_set_bit(0, &priv->crc_err))
+ if (!test_and_set_bit(MXL862XX_FLAG_CRC_ERR, &priv->flags))
schedule_work(&priv->crc_err_work);
ret = -EIO;
goto out;
diff --git a/drivers/net/dsa/mxl862xx/mxl862xx.c b/drivers/net/dsa/mxl862xx/mxl862xx.c
index 58bf7210c6d40..b60482d93a855 100644
--- a/drivers/net/dsa/mxl862xx/mxl862xx.c
+++ b/drivers/net/dsa/mxl862xx/mxl862xx.c
@@ -30,6 +30,12 @@
#define MXL862XX_API_READ_QUIET(dev, cmd, data) \
mxl862xx_api_wrap(dev, cmd, &(data), sizeof((data)), true, true)
+/* Polling interval for RMON counter accumulation. At 2.5 Gbps with
+ * minimum-size (64-byte) frames, a 32-bit packet counter wraps in ~880s.
+ * 2s gives a comfortable margin.
+ */
+#define MXL862XX_STATS_POLL_INTERVAL (2 * HZ)
+
struct mxl862xx_mib_desc {
unsigned int size;
unsigned int offset;
@@ -677,6 +683,9 @@ static int mxl862xx_setup(struct dsa_switch *ds)
if (ret)
return ret;
+ schedule_delayed_work(&priv->stats_work,
+ MXL862XX_STATS_POLL_INTERVAL);
+
return mxl862xx_setup_mdio(ds);
}
@@ -1900,6 +1909,159 @@ static void mxl862xx_get_rmon_stats(struct dsa_switch *ds, int port,
*ranges = mxl862xx_rmon_ranges;
}
+
+/* Compute the delta between two 32-bit free-running counter snapshots,
+ * handling a single wrap-around correctly via unsigned subtraction.
+ */
+static u64 mxl862xx_delta32(u32 cur, u32 prev)
+{
+ return (u32)(cur - prev);
+}
+
+/**
+ * mxl862xx_stats_poll - Read RMON counters and accumulate into 64-bit stats
+ * @ds: DSA switch
+ * @port: port index
+ *
+ * The firmware RMON counters are free-running 32-bit values (64-bit for
+ * byte counters). This function reads the hardware via MDIO (may sleep),
+ * computes deltas from the previous snapshot, and accumulates them into
+ * 64-bit per-port stats under a spinlock.
+ *
+ * Called only from the stats polling workqueue -- serialized by the
+ * single-threaded delayed_work, so no MDIO locking is needed here.
+ */
+static void mxl862xx_stats_poll(struct dsa_switch *ds, int port)
+{
+ struct mxl862xx_priv *priv = ds->priv;
+ struct mxl862xx_port_stats *s = &priv->ports[port].stats;
+ u32 rx_fcserr, rx_under, rx_over, rx_align, tx_drop;
+ u32 rx_drop, rx_evlan, mtu_exc, tx_acm;
+ struct mxl862xx_rmon_port_cnt cnt;
+ u64 rx_bytes, tx_bytes;
+ u32 rx_mcast, tx_coll;
+ u32 rx_pkts, tx_pkts;
+
+ /* MDIO read -- may sleep, done outside the spinlock. */
+ if (mxl862xx_read_rmon(ds, port, &cnt))
+ return;
+
+ rx_pkts = le32_to_cpu(cnt.rx_good_pkts);
+ tx_pkts = le32_to_cpu(cnt.tx_good_pkts);
+ rx_bytes = le64_to_cpu(cnt.rx_good_bytes);
+ tx_bytes = le64_to_cpu(cnt.tx_good_bytes);
+ rx_fcserr = le32_to_cpu(cnt.rx_fcserror_pkts);
+ rx_under = le32_to_cpu(cnt.rx_under_size_error_pkts);
+ rx_over = le32_to_cpu(cnt.rx_oversize_error_pkts);
+ rx_align = le32_to_cpu(cnt.rx_align_error_pkts);
+ tx_drop = le32_to_cpu(cnt.tx_dropped_pkts);
+ rx_drop = le32_to_cpu(cnt.rx_dropped_pkts);
+ rx_evlan = le32_to_cpu(cnt.rx_extended_vlan_discard_pkts);
+ mtu_exc = le32_to_cpu(cnt.mtu_exceed_discard_pkts);
+ tx_acm = le32_to_cpu(cnt.tx_acm_dropped_pkts);
+ rx_mcast = le32_to_cpu(cnt.rx_multicast_pkts);
+ tx_coll = le32_to_cpu(cnt.tx_coll_count);
+
+ /* Accumulate deltas under spinlock -- .get_stats64 reads these. */
+ spin_lock_bh(&priv->ports[port].stats_lock);
+
+ s->rx_packets += mxl862xx_delta32(rx_pkts, s->prev_rx_good_pkts);
+ s->tx_packets += mxl862xx_delta32(tx_pkts, s->prev_tx_good_pkts);
+ s->rx_bytes += rx_bytes - s->prev_rx_good_bytes;
+ s->tx_bytes += tx_bytes - s->prev_tx_good_bytes;
+
+ s->rx_errors +=
+ mxl862xx_delta32(rx_fcserr, s->prev_rx_fcserror_pkts) +
+ mxl862xx_delta32(rx_under, s->prev_rx_under_size_error_pkts) +
+ mxl862xx_delta32(rx_over, s->prev_rx_oversize_error_pkts) +
+ mxl862xx_delta32(rx_align, s->prev_rx_align_error_pkts);
+ s->tx_errors +=
+ mxl862xx_delta32(tx_drop, s->prev_tx_dropped_pkts);
+
+ s->rx_dropped +=
+ mxl862xx_delta32(rx_drop, s->prev_rx_dropped_pkts) +
+ mxl862xx_delta32(rx_evlan, s->prev_rx_evlan_discard_pkts) +
+ mxl862xx_delta32(mtu_exc, s->prev_mtu_exceed_discard_pkts);
+ s->tx_dropped +=
+ mxl862xx_delta32(tx_drop, s->prev_tx_dropped_pkts) +
+ mxl862xx_delta32(tx_acm, s->prev_tx_acm_dropped_pkts);
+
+ s->multicast += mxl862xx_delta32(rx_mcast, s->prev_rx_multicast_pkts);
+ s->collisions += mxl862xx_delta32(tx_coll, s->prev_tx_coll_count);
+
+ s->rx_length_errors +=
+ mxl862xx_delta32(rx_under, s->prev_rx_under_size_error_pkts) +
+ mxl862xx_delta32(rx_over, s->prev_rx_oversize_error_pkts);
+ s->rx_crc_errors +=
+ mxl862xx_delta32(rx_fcserr, s->prev_rx_fcserror_pkts);
+ s->rx_frame_errors +=
+ mxl862xx_delta32(rx_align, s->prev_rx_align_error_pkts);
+
+ s->prev_rx_good_pkts = rx_pkts;
+ s->prev_tx_good_pkts = tx_pkts;
+ s->prev_rx_good_bytes = rx_bytes;
+ s->prev_tx_good_bytes = tx_bytes;
+ s->prev_rx_fcserror_pkts = rx_fcserr;
+ s->prev_rx_under_size_error_pkts = rx_under;
+ s->prev_rx_oversize_error_pkts = rx_over;
+ s->prev_rx_align_error_pkts = rx_align;
+ s->prev_tx_dropped_pkts = tx_drop;
+ s->prev_rx_dropped_pkts = rx_drop;
+ s->prev_rx_evlan_discard_pkts = rx_evlan;
+ s->prev_mtu_exceed_discard_pkts = mtu_exc;
+ s->prev_tx_acm_dropped_pkts = tx_acm;
+ s->prev_rx_multicast_pkts = rx_mcast;
+ s->prev_tx_coll_count = tx_coll;
+
+ spin_unlock_bh(&priv->ports[port].stats_lock);
+}
+
+static void mxl862xx_stats_work_fn(struct work_struct *work)
+{
+ struct mxl862xx_priv *priv =
+ container_of(work, struct mxl862xx_priv, stats_work.work);
+ struct dsa_switch *ds = priv->ds;
+ struct dsa_port *dp;
+
+ dsa_switch_for_each_available_port(dp, ds)
+ mxl862xx_stats_poll(ds, dp->index);
+
+ if (!test_bit(MXL862XX_FLAG_WORK_STOPPED, &priv->flags))
+ schedule_delayed_work(&priv->stats_work,
+ MXL862XX_STATS_POLL_INTERVAL);
+}
+
+static void mxl862xx_get_stats64(struct dsa_switch *ds, int port,
+ struct rtnl_link_stats64 *s)
+{
+ struct mxl862xx_priv *priv = ds->priv;
+ struct mxl862xx_port_stats *ps = &priv->ports[port].stats;
+
+ spin_lock_bh(&priv->ports[port].stats_lock);
+
+ s->rx_packets = ps->rx_packets;
+ s->tx_packets = ps->tx_packets;
+ s->rx_bytes = ps->rx_bytes;
+ s->tx_bytes = ps->tx_bytes;
+ s->rx_errors = ps->rx_errors;
+ s->tx_errors = ps->tx_errors;
+ s->rx_dropped = ps->rx_dropped;
+ s->tx_dropped = ps->tx_dropped;
+ s->multicast = ps->multicast;
+ s->collisions = ps->collisions;
+ s->rx_length_errors = ps->rx_length_errors;
+ s->rx_crc_errors = ps->rx_crc_errors;
+ s->rx_frame_errors = ps->rx_frame_errors;
+
+ spin_unlock_bh(&priv->ports[port].stats_lock);
+
+ /* Trigger a fresh poll so the next read sees up-to-date counters.
+ * No-op if the work is already pending, running, or teardown started.
+ */
+ if (!test_bit(MXL862XX_FLAG_WORK_STOPPED, &priv->flags))
+ schedule_delayed_work(&priv->stats_work, 0);
+}
+
static const struct dsa_switch_ops mxl862xx_switch_ops = {
.get_tag_protocol = mxl862xx_get_tag_protocol,
.setup = mxl862xx_setup,
@@ -1931,6 +2093,7 @@ static const struct dsa_switch_ops mxl862xx_switch_ops = {
.get_eth_ctrl_stats = mxl862xx_get_eth_ctrl_stats,
.get_pause_stats = mxl862xx_get_pause_stats,
.get_rmon_stats = mxl862xx_get_rmon_stats,
+ .get_stats64 = mxl862xx_get_stats64,
};
static void mxl862xx_phylink_mac_config(struct phylink_config *config,
@@ -1992,16 +2155,22 @@ static int mxl862xx_probe(struct mdio_device *mdiodev)
priv->ports[i].priv = priv;
INIT_WORK(&priv->ports[i].host_flood_work,
mxl862xx_host_flood_work_fn);
+ spin_lock_init(&priv->ports[i].stats_lock);
}
+ INIT_DELAYED_WORK(&priv->stats_work, mxl862xx_stats_work_fn);
+
dev_set_drvdata(dev, ds);
err = dsa_register_switch(ds);
if (err) {
+ set_bit(MXL862XX_FLAG_WORK_STOPPED, &priv->flags);
+ cancel_delayed_work_sync(&priv->stats_work);
mxl862xx_host_shutdown(priv);
for (i = 0; i < MXL862XX_MAX_PORTS; i++)
cancel_work_sync(&priv->ports[i].host_flood_work);
}
+
return err;
}
@@ -2016,6 +2185,9 @@ static void mxl862xx_remove(struct mdio_device *mdiodev)
priv = ds->priv;
+ set_bit(MXL862XX_FLAG_WORK_STOPPED, &priv->flags);
+ cancel_delayed_work_sync(&priv->stats_work);
+
dsa_unregister_switch(ds);
mxl862xx_host_shutdown(priv);
@@ -2042,6 +2214,9 @@ static void mxl862xx_shutdown(struct mdio_device *mdiodev)
dsa_switch_shutdown(ds);
+ set_bit(MXL862XX_FLAG_WORK_STOPPED, &priv->flags);
+ cancel_delayed_work_sync(&priv->stats_work);
+
mxl862xx_host_shutdown(priv);
for (i = 0; i < MXL862XX_MAX_PORTS; i++)
diff --git a/drivers/net/dsa/mxl862xx/mxl862xx.h b/drivers/net/dsa/mxl862xx/mxl862xx.h
index a010cf6b961a9..80053ab40e4ce 100644
--- a/drivers/net/dsa/mxl862xx/mxl862xx.h
+++ b/drivers/net/dsa/mxl862xx/mxl862xx.h
@@ -116,6 +116,79 @@ struct mxl862xx_evlan_block {
u16 n_active;
};
+/**
+ * struct mxl862xx_port_stats - 64-bit accumulated hardware port statistics
+ * @rx_packets: total received packets
+ * @tx_packets: total transmitted packets
+ * @rx_bytes: total received bytes
+ * @tx_bytes: total transmitted bytes
+ * @rx_errors: total receive errors
+ * @tx_errors: total transmit errors
+ * @rx_dropped: total received packets dropped
+ * @tx_dropped: total transmitted packets dropped
+ * @multicast: total received multicast packets
+ * @collisions: total transmit collisions
+ * @rx_length_errors: received length errors (undersize + oversize)
+ * @rx_crc_errors: received FCS errors
+ * @rx_frame_errors: received alignment errors
+ * @prev_rx_good_pkts: previous snapshot of rx good packet counter
+ * @prev_tx_good_pkts: previous snapshot of tx good packet counter
+ * @prev_rx_good_bytes: previous snapshot of rx good byte counter
+ * @prev_tx_good_bytes: previous snapshot of tx good byte counter
+ * @prev_rx_fcserror_pkts: previous snapshot of rx FCS error counter
+ * @prev_rx_under_size_error_pkts: previous snapshot of rx undersize
+ * error counter
+ * @prev_rx_oversize_error_pkts: previous snapshot of rx oversize
+ * error counter
+ * @prev_rx_align_error_pkts: previous snapshot of rx alignment
+ * error counter
+ * @prev_tx_dropped_pkts: previous snapshot of tx dropped counter
+ * @prev_rx_dropped_pkts: previous snapshot of rx dropped counter
+ * @prev_rx_evlan_discard_pkts: previous snapshot of extended VLAN
+ * discard counter
+ * @prev_mtu_exceed_discard_pkts: previous snapshot of MTU exceed
+ * discard counter
+ * @prev_tx_acm_dropped_pkts: previous snapshot of tx ACM dropped
+ * counter
+ * @prev_rx_multicast_pkts: previous snapshot of rx multicast counter
+ * @prev_tx_coll_count: previous snapshot of tx collision counter
+ *
+ * The firmware RMON counters are 32-bit free-running (64-bit for byte
+ * counters). This structure holds 64-bit accumulators alongside the
+ * previous raw snapshot so that deltas can be computed across polls,
+ * handling 32-bit wrap correctly via unsigned subtraction.
+ */
+struct mxl862xx_port_stats {
+ u64 rx_packets;
+ u64 tx_packets;
+ u64 rx_bytes;
+ u64 tx_bytes;
+ u64 rx_errors;
+ u64 tx_errors;
+ u64 rx_dropped;
+ u64 tx_dropped;
+ u64 multicast;
+ u64 collisions;
+ u64 rx_length_errors;
+ u64 rx_crc_errors;
+ u64 rx_frame_errors;
+ u32 prev_rx_good_pkts;
+ u32 prev_tx_good_pkts;
+ u64 prev_rx_good_bytes;
+ u64 prev_tx_good_bytes;
+ u32 prev_rx_fcserror_pkts;
+ u32 prev_rx_under_size_error_pkts;
+ u32 prev_rx_oversize_error_pkts;
+ u32 prev_rx_align_error_pkts;
+ u32 prev_tx_dropped_pkts;
+ u32 prev_rx_dropped_pkts;
+ u32 prev_rx_evlan_discard_pkts;
+ u32 prev_mtu_exceed_discard_pkts;
+ u32 prev_tx_acm_dropped_pkts;
+ u32 prev_rx_multicast_pkts;
+ u32 prev_tx_coll_count;
+};
+
/**
* struct mxl862xx_port - per-port state tracked by the driver
* @priv: back-pointer to switch private data; needed by
@@ -145,6 +218,10 @@ struct mxl862xx_evlan_block {
* The worker acquires rtnl_lock() to serialize with
* DSA callbacks and checks @setup_done to avoid
* acting on torn-down ports.
+ * @stats: 64-bit accumulated hardware statistics; updated
+ * periodically by the stats polling work
+ * @stats_lock: protects accumulator reads in .get_stats64 against
+ * concurrent updates from the polling work
*/
struct mxl862xx_port {
struct mxl862xx_priv *priv;
@@ -160,16 +237,24 @@ struct mxl862xx_port {
bool host_flood_uc;
bool host_flood_mc;
struct work_struct host_flood_work;
+ struct mxl862xx_port_stats stats;
+ spinlock_t stats_lock; /* protects stats accumulators */
};
+/* Bit indices for struct mxl862xx_priv::flags */
+#define MXL862XX_FLAG_CRC_ERR 0
+#define MXL862XX_FLAG_WORK_STOPPED 1
+
/**
* struct mxl862xx_priv - driver private data for an MxL862xx switch
* @ds: pointer to the DSA switch instance
* @mdiodev: MDIO device used to communicate with the switch firmware
* @crc_err_work: deferred work for shutting down all ports on MDIO CRC
* errors
- * @crc_err: set atomically before CRC-triggered shutdown, cleared
- * after
+ * @flags: atomic status flags; %MXL862XX_FLAG_CRC_ERR is set
+ * before CRC-triggered shutdown and cleared after;
+ * %MXL862XX_FLAG_WORK_STOPPED is set before cancelling
+ * stats_work to prevent rescheduling during teardown
* @drop_meter: index of the single shared zero-rate firmware meter
* used to unconditionally drop traffic (used to block
* flooding)
@@ -181,18 +266,21 @@ struct mxl862xx_port {
* @evlan_ingress_size: per-port ingress Extended VLAN block size
* @evlan_egress_size: per-port egress Extended VLAN block size
* @vf_block_size: per-port VLAN Filter block size
+ * @stats_work: periodic work item that polls RMON hardware counters
+ * and accumulates them into 64-bit per-port stats
*/
struct mxl862xx_priv {
struct dsa_switch *ds;
struct mdio_device *mdiodev;
struct work_struct crc_err_work;
- unsigned long crc_err;
+ unsigned long flags;
u16 drop_meter;
struct mxl862xx_port ports[MXL862XX_MAX_PORTS];
u16 bridges[MXL862XX_MAX_BRIDGES + 1];
u16 evlan_ingress_size;
u16 evlan_egress_size;
u16 vf_block_size;
+ struct delayed_work stats_work;
};
#endif /* __MXL862XX_H */
--
2.53.0
^ permalink raw reply related
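A note on the arithmetic in the patch above: the wrap-safe delta and the "~880s" figure can both be checked in plain user-space C. The sketch below is not driver code. The `toy_` names are hypothetical, and the 84-byte case assumes the usual 20 bytes of preamble, start-of-frame delimiter and inter-frame gap per frame, which the patch comment does not count.

```c
#include <stdint.h>

/*
 * Delta between two snapshots of a free-running 32-bit counter.
 * Unsigned subtraction is modulo 2^32, so a single wrap between the
 * two snapshots still yields the right difference.
 */
static uint64_t toy_delta32(uint32_t cur, uint32_t prev)
{
	return (uint32_t)(cur - prev);
}

/*
 * Seconds until a 32-bit packet counter wraps at a given link rate,
 * for a given number of bytes occupied on the wire per frame.
 */
static double toy_wrap_seconds(double link_bps, unsigned int wire_bytes)
{
	double frames_per_sec = link_bps / (8.0 * wire_bytes);

	return 4294967296.0 / frames_per_sec; /* 2^32 frames */
}
```

With 64 bytes per frame at 2.5 Gbps this gives roughly 880 seconds, matching the patch comment; counting the 20 bytes of per-frame overhead gives roughly 1154 seconds. Either way a 2-second poll interval leaves a wide margin, provided no more than one wrap happens between polls.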
* [PATCH v4] net/mlx5: Fix OOB access and stack information leak in PTP event handling
From: Prathamesh Deshpande @ 2026-04-12 0:04 UTC (permalink / raw)
To: Carolina Jubran, Saeed Mahameed, Leon Romanovsky
Cc: Richard Cochran, Tariq Toukan, Mark Bloch, netdev, linux-rdma,
linux-kernel, Prathamesh Deshpande
In mlx5_pps_event(), several critical issues were identified:
1. The 'pin' index from the hardware event was used without bounds
checking to index 'pin_config' and 'pps_info->start'. Check against
MAX_PIN_NUM to prevent out-of-bounds access.
2. 'ptp_event' was not zero-initialized, potentially leaking stack
memory through the union.
3. A NULL 'pin_config' could be dereferenced if initialization failed.
4. 'clock->ptp' could be NULL if ptp_clock_register() failed.
Fixes: 7c39afb394c7 ("net/mlx5: PTP code migration to driver core section")
Suggested-by: Carolina Jubran <cjubran@nvidia.com>
Signed-off-by: Prathamesh Deshpande <prathameshdeshpande7@gmail.com>
---
v4:
- Validate pin index against MAX_PIN_NUM instead of n_pins [Carolina].
v3:
- Fix union corruption by using a local timestamp variable [Sashiko].
- Validate pin index against n_pins with WARN_ON_ONCE [Carolina].
- Remove redundant pin < 0 check and cleanup TODO comment.
v2:
- Zero-initialize ptp_event to prevent stack information leak [Sashiko].
- Add bounds check for hardware pin index to prevent OOB access [Sashiko].
- Add NULL guard for pin_config to handle initialization failures [Sashiko].
- Add NULL check for clock->ptp as originally intended.
.../net/ethernet/mellanox/mlx5/core/lib/clock.c | 17 ++++++++++++-----
1 file changed, 12 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c
index bd4e042077af..ff03dfa12a67 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c
@@ -1164,16 +1164,22 @@ static int mlx5_pps_event(struct notifier_block *nb,
pps_nb);
struct mlx5_core_dev *mdev = clock_state->mdev;
struct mlx5_clock *clock = mdev->clock;
- struct ptp_clock_event ptp_event;
+ struct ptp_clock_event ptp_event = {};
struct mlx5_eqe *eqe = data;
int pin = eqe->data.pps.pin;
unsigned long flags;
u64 ns;
+ if (!clock->ptp_info.pin_config)
+ return NOTIFY_OK;
+
+ if (WARN_ON_ONCE(pin >= MAX_PIN_NUM))
+ return NOTIFY_OK;
+
switch (clock->ptp_info.pin_config[pin].func) {
case PTP_PF_EXTTS:
ptp_event.index = pin;
- ptp_event.timestamp = mlx5_real_time_mode(mdev) ?
+ ns = mlx5_real_time_mode(mdev) ?
mlx5_real_time_cyc2time(clock,
be64_to_cpu(eqe->data.pps.time_stamp)) :
mlx5_timecounter_cyc2time(clock,
@@ -1181,12 +1187,13 @@ static int mlx5_pps_event(struct notifier_block *nb,
if (clock->pps_info.enabled) {
ptp_event.type = PTP_CLOCK_PPSUSR;
ptp_event.pps_times.ts_real =
- ns_to_timespec64(ptp_event.timestamp);
+ ns_to_timespec64(ns);
} else {
ptp_event.type = PTP_CLOCK_EXTTS;
+ ptp_event.timestamp = ns;
}
- /* TODOL clock->ptp can be NULL if ptp_clock_register fails */
- ptp_clock_event(clock->ptp, &ptp_event);
+ if (clock->ptp)
+ ptp_clock_event(clock->ptp, &ptp_event);
break;
case PTP_PF_PEROUT:
if (clock->shared) {
--
2.43.0
^ permalink raw reply related
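To illustrate why v4 checks the pin index against MAX_PIN_NUM rather than n_pins, here is a minimal user-space sketch. It is not the mlx5 code: the `toy_` names are hypothetical, and the capacity of 8 is an assumption taken from the "more than 8 pins" remark in the review. The point is that an index is only safe if it is below the capacity of the fixed-size array, whatever count the device reports.

```c
#include <stdbool.h>

#define TOY_MAX_PIN_NUM 8 /* assumed array capacity, see lead-in */

struct toy_clock {
	int pin_func[TOY_MAX_PIN_NUM]; /* fixed-size per-pin state */
	int n_pins;                    /* count reported by the device */
};

/* v3-style check: trusts the device-reported count. */
static bool toy_pin_ok_by_count(const struct toy_clock *c, int pin)
{
	return pin >= 0 && pin < c->n_pins;
}

/* v4-style check: bounded by what the array can actually hold. */
static bool toy_pin_ok_by_capacity(const struct toy_clock *c, int pin)
{
	(void)c; /* the capacity is a compile-time constant */
	return pin >= 0 && pin < TOY_MAX_PIN_NUM;
}
```

If a hypothetical device reported n_pins = 12, index 10 would pass the count-based check and still land outside pin_func[], while the capacity-based check rejects it. The `pin >= 0` test is only there because the sketch takes a plain int; the changelog notes the driver dropped that check as redundant.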
* Re: [PATCH v3] net/mlx5: Fix OOB access and stack information leak in PTP event handling
From: prathamesh deshpande @ 2026-04-12 0:15 UTC (permalink / raw)
To: Carolina Jubran
Cc: Richard Cochran, Tariq Toukan, Mark Bloch, netdev, linux-rdma,
linux-kernel, Leon Romanovsky, Saeed Mahameed
In-Reply-To: <c30f21a3-5a27-43fb-957d-107775b00faf@nvidia.com>
Hi Carolina,
Thanks for the feedback. I have just submitted v4 which addresses this
by checking the pin index against MAX_PIN_NUM.
Thanks,
Prathamesh
On Sat, Apr 11, 2026 at 12:35 PM Carolina Jubran <cjubran@nvidia.com> wrote:
>
>
> On 10/04/2026 4:53, Prathamesh Deshpande wrote:
> > In mlx5_pps_event(), several critical issues were identified during
> > review by Sashiko:
> >
> > 1. The 'pin' index from the hardware event was used without bounds
> > checking to index 'pin_config' and 'pps_info->start', leading to
> > potential out-of-bounds memory access.
> > 2. 'ptp_event' was not zero-initialized. Since it contains a union,
> > assigning a timestamp partially leaves the 'ts_raw' field with
> > uninitialized stack memory, which can leak kernel data or
> > corrupt time sync logic in hardpps().
> > 3. A NULL 'pin_config' could be dereferenced if initialization failed.
> > 4. 'clock->ptp' could be NULL if ptp_clock_register() failed.
> >
> > Fix these by zero-initializing the event struct, adding a bounds
> > check against n_pins, and adding appropriate NULL guards.
> >
> > Fixes: 7c39afb394c7 ("net/mlx5: PTP code migration to driver core section")
> > Suggested-by: Carolina Jubran <cjubran@nvidia.com>
> > Signed-off-by: Prathamesh Deshpande <prathameshdeshpande7@gmail.com>
> > ---
> > v3:
> > - Fix union corruption by using a local timestamp variable [Sashiko].
> > - Validate pin index against n_pins with WARN_ON_ONCE [Carolina].
> > - Remove redundant pin < 0 check and cleanup TODO comment.
> > v2:
> > - Zero-initialize ptp_event to prevent stack information leak [Sashiko].
> > - Add bounds check for hardware pin index to prevent OOB access [Sashiko].
> > - Add NULL guard for pin_config to handle initialization failures [Sashiko].
> > - Add NULL check for clock->ptp as originally intended.
> >
> > .../net/ethernet/mellanox/mlx5/core/lib/clock.c | 17 ++++++++++++-----
> > 1 file changed, 12 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c
> > index bd4e042077af..674dd048a6b8 100644
> > --- a/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c
> > +++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c
> > @@ -1164,16 +1164,22 @@ static int mlx5_pps_event(struct notifier_block *nb,
> > pps_nb);
> > struct mlx5_core_dev *mdev = clock_state->mdev;
> > struct mlx5_clock *clock = mdev->clock;
> > - struct ptp_clock_event ptp_event;
> > + struct ptp_clock_event ptp_event = {};
> > struct mlx5_eqe *eqe = data;
> > int pin = eqe->data.pps.pin;
> > unsigned long flags;
> > u64 ns;
> >
> > + if (!clock->ptp_info.pin_config)
> > + return NOTIFY_OK;
> > +
> > + if (WARN_ON_ONCE(pin >= clock->ptp_info.n_pins))
> > + return NOTIFY_OK;
>
>
> Sorry if my previous comment wasn't clear enough.
>
> The firmware will never report a pin higher than n_pins; that's not the
> concern here. If future hardware reports n_pins > 8, checking against
> n_pins would still allow OOB access on those arrays. The check should
> compare against MAX_PIN_NUM instead, since that's the actual hard limit
> of the driver's data structures. And if a new device supports more than
> 8 pins, the WARN_ON_ONCE would let us know we need to update the driver.
>
>
> Thanks,
>
> Carolina
>
--
Thanks and Regards,
Prathamesh Deshpande
^ permalink raw reply
* Re: [PATCH v2 net-next 2/5] net: phy: make mdio_device.c part of libphy
From: Stephen Boyd @ 2026-04-12 0:25 UTC (permalink / raw)
To: Andrew Lunn, Bjorn Andersson, David Miller, Eric Dumazet,
Heiner Kallweit, Jakub Kicinski, Michael Turquette,
Neil Armstrong, Paolo Abeni, Russell King - ARM Linux, Vinod Koul
Cc: netdev@vger.kernel.org, Philipp Zabel, linux-arm-msm, linux-clk,
linux-phy
In-Reply-To: <c6dbf9b3-3ca0-434b-ad3a-71fe602ab809@gmail.com>
Quoting Heiner Kallweit (2026-03-09 10:03:31)
> This patch
> - makes mdio_device.c part of libphy
> - makes mdio_device_(un)register_reset() static
> - moves mdiobus_(un)register_device() from mdio_bus.c to mdio_device.c,
> stops exporting both functions and makes them private to phylib
>
> This further decouples the MDIO consumer functionality from libphy.
>
> Note: This makes MDIO driver registration part of phylib, therefore
> adjust Kconfig dependencies where needed.
>
> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
> ---
Acked-by: Stephen Boyd <sboyd@kernel.org>
^ permalink raw reply
* [PATCH net-next v2] r8169: Use napi_schedule_irqoff()
From: Matt Vollrath @ 2026-04-12 1:40 UTC (permalink / raw)
To: netdev
Cc: Matt Vollrath, edumazet, pabeni, hkallweit1, kuba, andrew+netdev,
nic_swsd
napi_schedule() masks hard interrupts while doing its work, which is
redundant when called from an interrupt handler where hard interrupts
are already masked. Use napi_schedule_irqoff() instead to bypass this
redundant masking. This is an optimization.
Tested on a Lenovo RTL8168h/8111h.
Signed-off-by: Matt Vollrath <tactii@gmail.com>
---
drivers/net/ethernet/realtek/r8169_main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
index 791277e750ba..4c0ad0de3410 100644
--- a/drivers/net/ethernet/realtek/r8169_main.c
+++ b/drivers/net/ethernet/realtek/r8169_main.c
@@ -4873,7 +4873,7 @@ static irqreturn_t rtl8169_interrupt(int irq, void *dev_instance)
phy_mac_interrupt(tp->phydev);
rtl_irq_disable(tp);
- napi_schedule(&tp->napi);
+ napi_schedule_irqoff(&tp->napi);
out:
rtl_ack_events(tp, status);
--
2.43.0
Changes:
v2:
* CC the maintainers, make the CI board green
^ permalink raw reply related
* [PATCH] xfrm: fix memory leak in xfrm_add_policy()
From: Deepanshu Kartikey @ 2026-04-12 2:08 UTC (permalink / raw)
To: steffen.klassert, herbert, davem, edumazet, kuba, pabeni, horms
Cc: leon, netdev, linux-kernel, Deepanshu Kartikey,
syzbot+901d48e0b95aed4a2548
When xfrm_policy_insert() fails, the error path performs manual
cleanup by calling xfrm_dev_policy_free(), security_xfrm_policy_free()
and kfree() directly. This is incorrect because xfrm_policy_destroy()
already handles all of these, causing a memory leak detected by
kmemleak.
Replace the open-coded cleanup with xfrm_policy_destroy(), consistent
with the error handling in xfrm_policy_construct(). The walk.dead
flag must be set before calling xfrm_policy_destroy() as it requires
it via BUG_ON(!policy->walk.dead).
Fixes: 94b95dfaa814 ("xfrm: release all offloaded policy memory")
Reported-by: syzbot+901d48e0b95aed4a2548@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=901d48e0b95aed4a2548
Tested-by: syzbot+901d48e0b95aed4a2548@syzkaller.appspotmail.com
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
---
net/xfrm/xfrm_user.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
index d56450f61669..ae144d1e4a65 100644
--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -2267,9 +2267,8 @@ static int xfrm_add_policy(struct sk_buff *skb, struct nlmsghdr *nlh,
if (err) {
xfrm_dev_policy_delete(xp);
- xfrm_dev_policy_free(xp);
- security_xfrm_policy_free(xp->security);
- kfree(xp);
+ xp->walk.dead = 1;
+ xfrm_policy_destroy(xp);
return err;
}
--
2.43.0
^ permalink raw reply related
* Re: [PATCH net] netrom: do some basic forms of validation on incoming frames
From: Hugh Blemings @ 2026-04-12 2:32 UTC (permalink / raw)
To: Greg KH
Cc: Kuniyuki Iwashima, kuba, davem, edumazet, horms, linux-hams,
linux-kernel, netdev, pabeni, stable, workflows, yizhe
In-Reply-To: <2026041124-hyphen-circulate-34ae@gregkh>
On 11/4/2026 18:58, Greg KH wrote:
> On Sat, Apr 11, 2026 at 05:24:17PM +1000, Hugh Blemings wrote:
>> On 11/4/2026 15:50, Greg KH wrote:
>>> On Sat, Apr 11, 2026 at 08:25:19AM +1000, Hugh Blemings wrote:
>>>> On 11/4/2026 08:11, Kuniyuki Iwashima wrote:
>>>>> From: Jakub Kicinski <kuba@kernel.org>
>>>>> Date: Fri, 10 Apr 2026 14:54:48 -0700
>>>>>> On Fri, 10 Apr 2026 14:30:42 -0700 Jakub Kicinski wrote:
>>>>>>> On Fri, 10 Apr 2026 07:24:36 +0200 Greg Kroah-Hartman wrote:
>>>>>>>> On Thu, Apr 09, 2026 at 08:32:35PM -0700, Jakub Kicinski wrote:
>>>>>>>>> Or for simplicity we could also be testing against skb_headlen()
>>>>>>>>> since we don't expect any legit non-linear frames here? Dunno.
>>>>>>>> I'll be glad to change this either way, your call. Given that this is
>>>>>>>> an obsolete protocol that seems to only be a target for drive-by fuzzers
>>>>>>>> to attack, whatever the simplest thing to do to quiet them up I'll be
>>>>>>>> glad to implement.
>>>>>>>>
>>>>>>>> Or can we just delete this stuff entirely? :)
>>>>>>> Yes.
>>>>>>>
>>>>>>> My thinking is to delete hamradio, nfc, atm, caif.. [more to come]
>>>>>>> Create GH repos which provide them as OOT modules.
>>>>>>> Hopefully we can convince any existing users to switch to that.
>>>>>>>
>>>>>>> The only thing stopping me is the concern that this is just the softest
>>>>>>> target and the LLMs will find something else to focus on which we can't
>>>>>>> delete. I suspect any PCIe driver can be flooded with "aren't you
>>>>>>> trusting the HW to provide valid responses here?" bullshit.
>>>>>>>
>>>>>>> But hey, let's try. I'll post a patch nuking all of hamradio later
>>>>>>> today.
>>>>>> Well, either we "expunge" this code to OOT repos, or we mark it
>>>>>> as broken and tell everyone that we don't take security fixes
>>>>>> for anything that depends on BROKEN. I'd personally rather expunge.
>>>>> +1 for "expunge" to prevent LLM-based patch flood.
>>>>>
>>>>> IIRC, we did that recently for one driver only used by OpenWRT ?
>>>>>
>>>>>
>>>> If the main concern here is ongoing maintenance of these Ham Radio related
>>>> protocols/drivers, can we pause for a moment on anything as dramatic as
>>>> removing from the tree entirely ?
>>> Sure, but:
>>>
>>>> There is a good cohort of capable kernel folks that either are or were ham
>>>> radio operators who I believe, upon realising that things have got to this
>>>> point, will be happy to redouble efforts to ensure this code is maintained and
>>>> tested to a satisfactory standard.
>>> We need this code to be maintained, because as is being shown, there are
>>> reported problems with it that will affect these devices/networks that
>>> you all are using. So all we need is a maintainer for this to be able
>>> to take reports that we get and fix things up as needed. I know you
>>> have that experience, want to come back to kernel development, we've
>>> missed you :)
>> That's most kind Greg, thank you, have missed all you cool kids too :)
>>
>> More seriously though - I'd be up for doing it, but I think there may be
>> others better placed than I who haven't yet realised we have this conundrum.
>> I'm nudging a few folks offline on this front.
> The main "conundrum" is that this protocol completely trusts the
> hardware to give the kernel the "correct" data. So if you trust the
> hardware to work properly, it will be fine, but as the fuzzing tools are
> finding, if the data from the hardware modems is a bit out-of-spec,
> "bad" things can happen.
>
> I don't know how well controlled the data is from these devices, if it's
> just a "pass through" from what they get off the "wire" or if the
> devices always ensure the protocol packets are sane before passing them
> off to the kernel. That's going to be something you all with the
> hardware are going to have to determine in order to keep this a working
> system over time. Especially given that this is a wireless protocol
> where you "have" to trust the remote end.
Thanks for the thoughts Greg - and ya, I guess on balance I come back to
being generally skeptical of both hardware and software to Do The Right
Thing (TM).
So bounds checking and the like seem prudent irrespective of whether
the kernel is getting the data from real hardware, software modems etc.
I've done some initial digging around that confirms my suspicion that
this in-kernel code remains quite widely used, if somewhat out of view.
Accordingly I lean towards working to get these various mitigations
in place, with revised patches etc. as needed, and into the main tree.
Once this is done I think that'll give me a good sense of whether I or
someone else is well positioned to keep the code maintained longer term
and thus justify it remaining in tree or not.
More to follow once I finish remembering this kernel thing!
Cheers,
Hugh
^ permalink raw reply
* [PATCH v2 0/3] bpf: fix sock_ops rtt_min OOB read and related guard issues
From: Werner Kasselman @ 2026-04-12 3:03 UTC (permalink / raw)
To: Martin KaFai Lau, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko
Cc: John Fastabend, Lawrence Brakmo, Eduard Zingerman, Song Liu,
Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman, bpf@vger.kernel.org, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org, Werner Kasselman
Patch 3 fixes an out-of-bounds read in sock_ops_convert_ctx_access()
for the rtt_min context field. It is the only tcp_sock-backed field
that bypasses the is_locked_tcp_sock guard, so on request_sock-backed
sock_ops callbacks the converted BPF load reads past the end of a
tcp_request_sock.
Patches 1 and 2 are groundwork. Patch 1 fixes a pre-existing info
leak in SOCK_OPS_GET_FIELD() and SOCK_OPS_GET_SK() where dst_reg is
left holding the context pointer on the guard-failure branch when
dst_reg == src_reg, instead of being zeroed. Patch 2 extracts
SOCK_OPS_LOAD_TCP_SOCK_FIELD() from SOCK_OPS_GET_FIELD() so the
rtt_min sub-field access in patch 3 can reuse it.
Patches 1 and 3 carry Fixes: tags and Cc: stable. Patch 2 is a pure
refactor.
v1: https://lore.kernel.org/bpf/ (earlier single-patch posting)
- Inlined the guarded load sequence by hand.
- Feedback: please factor it through the existing helper instead
of open-coding 30 lines.
v2:
- Patch 1 (new): fix latent dst == src info leak in both macros.
- Patch 2 (new): refactor SOCK_OPS_GET_FIELD().
- Patch 3: use SOCK_OPS_LOAD_TCP_SOCK_FIELD() for rtt_min and use
offsetof(struct minmax_sample, v) for the sub-field offset.
Werner Kasselman (3):
bpf: zero dst_reg on sock_ops field guard failure when dst == src
bpf: extract SOCK_OPS_LOAD_TCP_SOCK_FIELD from SOCK_OPS_GET_FIELD
bpf: guard sock_ops rtt_min against non-locked tcp_sock
net/core/filter.c | 37 +++++++++++++++++++++----------------
1 file changed, 21 insertions(+), 16 deletions(-)
--
2.43.0
^ permalink raw reply
* [PATCH v2 1/3] bpf: zero dst_reg on sock_ops field guard failure when dst == src
From: Werner Kasselman @ 2026-04-12 3:03 UTC (permalink / raw)
To: Martin KaFai Lau, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko
Cc: John Fastabend, Lawrence Brakmo, Eduard Zingerman, Song Liu,
Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman, bpf@vger.kernel.org, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org, Werner Kasselman
In-Reply-To: <20260412030306.3469543-1-werner@verivus.com>
When a BPF_PROG_TYPE_SOCK_OPS program reads a tcp_sock-backed context
field (e.g. ctx->snd_ssthresh) or ctx->sk using the same register for
source and destination, SOCK_OPS_GET_FIELD() and SOCK_OPS_GET_SK()
load is_locked_tcp_sock/is_fullsock into a scratch register rather
than into dst_reg. On the guard-failure branch the macro only
restores the scratch register before falling through, leaving
dst_reg holding the unchanged context pointer.
Callers expect dst_reg to read as a scalar 0 when the guard fails.
Instead the BPF program sees a kernel heap address, which the
verifier has already typed as a scalar, giving a narrow kernel
pointer leak. Clang does not emit the dst == src pattern for normal
C ctx field reads, but it is reachable via inline asm and
hand-written BPF.
Add an explicit BPF_MOV64_IMM(dst_reg, 0) on the failure path in
both macros and bump the success-path BPF_JMP_A() to skip over it.
Found via AST-based call-graph analysis using sqry.
Fixes: fd09af010788 ("bpf: sock_ops ctx access may stomp registers in corner case")
Fixes: 84f44df664e9 ("bpf: sock_ops sk access may stomp registers when dst_reg = src_reg")
Cc: stable@vger.kernel.org
Signed-off-by: Werner Kasselman <werner@verivus.com>
---
net/core/filter.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/net/core/filter.c b/net/core/filter.c
index 78b548158fb0..53ce06ed4a88 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -10581,10 +10581,11 @@ static u32 sock_ops_convert_ctx_access(enum bpf_access_type type,
si->dst_reg, si->dst_reg, \
offsetof(OBJ, OBJ_FIELD)); \
if (si->dst_reg == si->src_reg) { \
- *insn++ = BPF_JMP_A(1); \
+ *insn++ = BPF_JMP_A(2); \
*insn++ = BPF_LDX_MEM(BPF_DW, reg, si->src_reg, \
offsetof(struct bpf_sock_ops_kern, \
temp)); \
+ *insn++ = BPF_MOV64_IMM(si->dst_reg, 0); \
} \
} while (0)
@@ -10618,10 +10619,11 @@ static u32 sock_ops_convert_ctx_access(enum bpf_access_type type,
si->dst_reg, si->src_reg, \
offsetof(struct bpf_sock_ops_kern, sk));\
if (si->dst_reg == si->src_reg) { \
- *insn++ = BPF_JMP_A(1); \
+ *insn++ = BPF_JMP_A(2); \
*insn++ = BPF_LDX_MEM(BPF_DW, reg, si->src_reg, \
offsetof(struct bpf_sock_ops_kern, \
temp)); \
+ *insn++ = BPF_MOV64_IMM(si->dst_reg, 0); \
} \
} while (0)
--
2.43.0
^ permalink raw reply related
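The control-flow change in patch 1/3 can be modelled with a toy interpreter in user-space C. This is a heavily simplified sketch, not the instruction sequence the kernel emits: the `toy_` names, the fake context address and the field value are hypothetical, and the success path is collapsed into a single load. Jump offsets are relative to the next instruction, as in BPF.

```c
#include <stdint.h>

#define TOY_FAKE_CTX  0xffff888000001000ULL /* stand-in for the ctx pointer */
#define TOY_FIELD_VAL 42ULL                 /* stand-in for the field value */

enum toy_op {
	TOY_LOAD_GUARD, /* scratch = guard flag */
	TOY_JEQ0,       /* if (scratch == 0) skip 'off' instructions */
	TOY_LOAD_FIELD, /* dst = field value (success path) */
	TOY_JMP_A,      /* unconditional skip of 'off' instructions */
	TOY_RESTORE,    /* restore the scratch register (failure path) */
	TOY_MOV0        /* dst = 0, the instruction added by the patch */
};

struct toy_insn {
	enum toy_op op;
	int off;
};

/* Returns what the program sees in dst when dst == src (the ctx). */
static uint64_t toy_run(int guard, int with_fix)
{
	const struct toy_insn prog[6] = {
		{ TOY_LOAD_GUARD, 0 },
		{ TOY_JEQ0, 2 },                 /* jump to the failure path */
		{ TOY_LOAD_FIELD, 0 },
		{ TOY_JMP_A, with_fix ? 2 : 1 }, /* skip the failure path */
		{ TOY_RESTORE, 0 },
		{ TOY_MOV0, 0 },                 /* only emitted with the fix */
	};
	int len = with_fix ? 6 : 5;
	uint64_t dst = TOY_FAKE_CTX; /* dst == src: starts as the ctx pointer */
	uint64_t scratch = 0;
	int pc = 0;

	while (pc < len) {
		struct toy_insn in = prog[pc++];

		switch (in.op) {
		case TOY_LOAD_GUARD: scratch = (uint64_t)guard; break;
		case TOY_JEQ0:       if (scratch == 0) pc += in.off; break;
		case TOY_LOAD_FIELD: dst = TOY_FIELD_VAL; break;
		case TOY_JMP_A:      pc += in.off; break;
		case TOY_RESTORE:    scratch = 0; break;
		case TOY_MOV0:       dst = 0; break;
		}
	}
	return dst;
}
```

In this model, a failed guard without the fix leaves dst equal to the fake context pointer, which is the leak the commit message describes. With the extra instruction and the JMP_A offset raised from 1 to 2, the failure path yields 0 and the success path is unchanged.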