* [PATCH net-next] net: stmmac: fix dwmac4 transmit performance regression
@ 2026-01-16 0:49 Russell King (Oracle)
2026-01-16 7:42 ` Maxime Chevallier
` (2 more replies)
0 siblings, 3 replies; 9+ messages in thread
From: Russell King (Oracle) @ 2026-01-16 0:49 UTC (permalink / raw)
To: Andrew Lunn, Heiner Kallweit
Cc: Alexandre Torgue, Andrew Lunn, David S. Miller, Eric Dumazet,
Jakub Kicinski, linux-arm-kernel, linux-stm32, Maxime Coquelin,
netdev, Paolo Abeni
dwmac4's transmit performance dropped by a factor of four due to an
incorrect assumption about which definitions are for what. This
highlights the need for sane register macros.
Commit 8409495bf6c9 ("net: stmmac: cores: remove many xxx_SHIFT
definitions") changed the way the txpbl value is merged into the
register:
value = readl(ioaddr + DMA_CHAN_TX_CONTROL(dwmac4_addrs, chan));
- value = value | (txpbl << DMA_BUS_MODE_PBL_SHIFT);
+ value = value | FIELD_PREP(DMA_BUS_MODE_PBL, txpbl);
With the following in the header file:
#define DMA_BUS_MODE_PBL BIT(16)
-#define DMA_BUS_MODE_PBL_SHIFT 16
The assumption here was that DMA_BUS_MODE_PBL was the mask for
DMA_BUS_MODE_PBL_SHIFT, but this turns out not to be the case.
The field is actually six bits wide, buts 21:16, and is called
TXPBL.
What's even more confusing is, there turns out to be a PBLX8
single bit in the DMA_CHAN_CONTROL register (0x1100 for channel 0),
and DMA_BUS_MODE_PBL seems to be used for that. However, this bit
et.al. was listed under a comment "/* DMA SYS Bus Mode bitmap */"
which is for register 0x1004.
Fix this up by adding an appropriately named field definition under
the DMA_CHAN_TX_CONTROL() register address definition.
Move the RPBL mask definition under DMA_CHAN_RX_CONTROL(), correctly
renaming it as well.
Also move the PBL bit definition under DMA_CHAN_CONTROL(), correctly
renaming it.
This removes confusion over the PBL fields.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
---
drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c | 8 ++++----
drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h | 7 ++++---
2 files changed, 8 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c
index 7036beccfc85..aaa83e9ff4f0 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c
@@ -52,7 +52,7 @@ static void dwmac4_dma_init_rx_chan(struct stmmac_priv *priv,
u32 rxpbl = dma_cfg->rxpbl ?: dma_cfg->pbl;
value = readl(ioaddr + DMA_CHAN_RX_CONTROL(dwmac4_addrs, chan));
- value = value | FIELD_PREP(DMA_BUS_MODE_RPBL_MASK, rxpbl);
+ value = value | FIELD_PREP(DMA_CHAN_RX_CTRL_RXPBL_MASK, rxpbl);
writel(value, ioaddr + DMA_CHAN_RX_CONTROL(dwmac4_addrs, chan));
if (IS_ENABLED(CONFIG_ARCH_DMA_ADDR_T_64BIT) && likely(dma_cfg->eame))
@@ -73,7 +73,7 @@ static void dwmac4_dma_init_tx_chan(struct stmmac_priv *priv,
u32 txpbl = dma_cfg->txpbl ?: dma_cfg->pbl;
value = readl(ioaddr + DMA_CHAN_TX_CONTROL(dwmac4_addrs, chan));
- value = value | FIELD_PREP(DMA_BUS_MODE_PBL, txpbl);
+ value = value | FIELD_PREP(DMA_CHAN_TX_CTRL_TXPBL_MASK, txpbl);
/* Enable OSP to get best performance */
value |= DMA_CONTROL_OSP;
@@ -98,7 +98,7 @@ static void dwmac4_dma_init_channel(struct stmmac_priv *priv,
/* common channel control register config */
value = readl(ioaddr + DMA_CHAN_CONTROL(dwmac4_addrs, chan));
if (dma_cfg->pblx8)
- value = value | DMA_BUS_MODE_PBL;
+ value = value | DMA_CHAN_CTRL_PBLX8;
writel(value, ioaddr + DMA_CHAN_CONTROL(dwmac4_addrs, chan));
/* Mask interrupts by writing to CSR7 */
@@ -116,7 +116,7 @@ static void dwmac410_dma_init_channel(struct stmmac_priv *priv,
/* common channel control register config */
value = readl(ioaddr + DMA_CHAN_CONTROL(dwmac4_addrs, chan));
if (dma_cfg->pblx8)
- value = value | DMA_BUS_MODE_PBL;
+ value = value | DMA_CHAN_CTRL_PBLX8;
writel(value, ioaddr + DMA_CHAN_CONTROL(dwmac4_addrs, chan));
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h
index 5f1e2916f099..9d9077a4ac9f 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h
@@ -24,8 +24,6 @@
#define DMA_SYS_BUS_MODE 0x00001004
-#define DMA_BUS_MODE_PBL BIT(16)
-#define DMA_BUS_MODE_RPBL_MASK GENMASK(21, 16)
#define DMA_BUS_MODE_MB BIT(14)
#define DMA_BUS_MODE_FB BIT(0)
@@ -68,19 +66,22 @@ static inline u32 dma_chanx_base_addr(const struct dwmac4_addrs *addrs,
#define DMA_CHAN_CONTROL(addrs, x) dma_chanx_base_addr(addrs, x)
+#define DMA_CHAN_CTRL_PBLX8 BIT(16)
#define DMA_CONTROL_SPH BIT(24)
#define DMA_CHAN_TX_CONTROL(addrs, x) (dma_chanx_base_addr(addrs, x) + 0x4)
#define DMA_CONTROL_EDSE BIT(28)
+#define DMA_CHAN_TX_CTRL_TXPBL_MASK GENMASK(21, 16)
#define DMA_CONTROL_TSE BIT(12)
#define DMA_CONTROL_OSP BIT(4)
#define DMA_CONTROL_ST BIT(0)
#define DMA_CHAN_RX_CONTROL(addrs, x) (dma_chanx_base_addr(addrs, x) + 0x8)
-#define DMA_CONTROL_SR BIT(0)
+#define DMA_CHAN_RX_CTRL_RXPBL_MASK GENMASK(21, 16)
#define DMA_RBSZ_MASK GENMASK(14, 1)
+#define DMA_CONTROL_SR BIT(0)
#define DMA_CHAN_TX_BASE_ADDR_HI(addrs, x) (dma_chanx_base_addr(addrs, x) + 0x10)
#define DMA_CHAN_TX_BASE_ADDR(addrs, x) (dma_chanx_base_addr(addrs, x) + 0x14)
--
2.47.3
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH net-next] net: stmmac: fix dwmac4 transmit performance regression
2026-01-16 0:49 [PATCH net-next] net: stmmac: fix dwmac4 transmit performance regression Russell King (Oracle)
@ 2026-01-16 7:42 ` Maxime Chevallier
2026-01-16 23:21 ` Russell King (Oracle)
2026-01-19 14:19 ` patchwork-bot+netdevbpf
2026-03-13 15:03 ` Georg Gottleuber
2 siblings, 1 reply; 9+ messages in thread
From: Maxime Chevallier @ 2026-01-16 7:42 UTC (permalink / raw)
To: Russell King (Oracle), Andrew Lunn, Heiner Kallweit
Cc: Alexandre Torgue, Andrew Lunn, David S. Miller, Eric Dumazet,
Jakub Kicinski, linux-arm-kernel, linux-stm32, Maxime Coquelin,
netdev, Paolo Abeni
Hi Russell,
On 16/01/2026 01:49, Russell King (Oracle) wrote:
> dwmac4's transmit performance dropped by a factor of four due to an
> incorrect assumption about which definitions are for what. This
> highlights the need for sane register macros.
>
> Commit 8409495bf6c9 ("net: stmmac: cores: remove many xxx_SHIFT
> definitions") changed the way the txpbl value is merged into the
> register:
>
> value = readl(ioaddr + DMA_CHAN_TX_CONTROL(dwmac4_addrs, chan));
> - value = value | (txpbl << DMA_BUS_MODE_PBL_SHIFT);
> + value = value | FIELD_PREP(DMA_BUS_MODE_PBL, txpbl);
>
> With the following in the header file:
>
> #define DMA_BUS_MODE_PBL BIT(16)
> -#define DMA_BUS_MODE_PBL_SHIFT 16
>
> The assumption here was that DMA_BUS_MODE_PBL was the mask for
> DMA_BUS_MODE_PBL_SHIFT, but this turns out not to be the case.
>
> The field is actually six bits wide, buts 21:16, and is called
> TXPBL.
>
> What's even more confusing is, there turns out to be a PBLX8
> single bit in the DMA_CHAN_CONTROL register (0x1100 for channel 0),
> and DMA_BUS_MODE_PBL seems to be used for that. However, this bit
> et.al. was listed under a comment "/* DMA SYS Bus Mode bitmap */"
> which is for register 0x1004.
>
> Fix this up by adding an appropriately named field definition under
> the DMA_CHAN_TX_CONTROL() register address definition.
>
> Move the RPBL mask definition under DMA_CHAN_RX_CONTROL(), correctly
> renaming it as well.
>
> Also move the PBL bit definition under DMA_CHAN_CONTROL(), correctly
> renaming it.
>
> This removes confusion over the PBL fields.
>
> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Good job finding the problem ! However you need a Fixes tag, even though
ths is is for net-next.
It would also have been nice to be in CC, I spent some time on the bisect...
Besides that, problem solved on an imx8mp setup :)
Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Maxime
> ---
> drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c | 8 ++++----
> drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h | 7 ++++---
> 2 files changed, 8 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c
> index 7036beccfc85..aaa83e9ff4f0 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c
> @@ -52,7 +52,7 @@ static void dwmac4_dma_init_rx_chan(struct stmmac_priv *priv,
> u32 rxpbl = dma_cfg->rxpbl ?: dma_cfg->pbl;
>
> value = readl(ioaddr + DMA_CHAN_RX_CONTROL(dwmac4_addrs, chan));
> - value = value | FIELD_PREP(DMA_BUS_MODE_RPBL_MASK, rxpbl);
> + value = value | FIELD_PREP(DMA_CHAN_RX_CTRL_RXPBL_MASK, rxpbl);
> writel(value, ioaddr + DMA_CHAN_RX_CONTROL(dwmac4_addrs, chan));
>
> if (IS_ENABLED(CONFIG_ARCH_DMA_ADDR_T_64BIT) && likely(dma_cfg->eame))
> @@ -73,7 +73,7 @@ static void dwmac4_dma_init_tx_chan(struct stmmac_priv *priv,
> u32 txpbl = dma_cfg->txpbl ?: dma_cfg->pbl;
>
> value = readl(ioaddr + DMA_CHAN_TX_CONTROL(dwmac4_addrs, chan));
> - value = value | FIELD_PREP(DMA_BUS_MODE_PBL, txpbl);
> + value = value | FIELD_PREP(DMA_CHAN_TX_CTRL_TXPBL_MASK, txpbl);
>
> /* Enable OSP to get best performance */
> value |= DMA_CONTROL_OSP;
> @@ -98,7 +98,7 @@ static void dwmac4_dma_init_channel(struct stmmac_priv *priv,
> /* common channel control register config */
> value = readl(ioaddr + DMA_CHAN_CONTROL(dwmac4_addrs, chan));
> if (dma_cfg->pblx8)
> - value = value | DMA_BUS_MODE_PBL;
> + value = value | DMA_CHAN_CTRL_PBLX8;
> writel(value, ioaddr + DMA_CHAN_CONTROL(dwmac4_addrs, chan));
>
> /* Mask interrupts by writing to CSR7 */
> @@ -116,7 +116,7 @@ static void dwmac410_dma_init_channel(struct stmmac_priv *priv,
> /* common channel control register config */
> value = readl(ioaddr + DMA_CHAN_CONTROL(dwmac4_addrs, chan));
> if (dma_cfg->pblx8)
> - value = value | DMA_BUS_MODE_PBL;
> + value = value | DMA_CHAN_CTRL_PBLX8;
>
> writel(value, ioaddr + DMA_CHAN_CONTROL(dwmac4_addrs, chan));
>
> diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h
> index 5f1e2916f099..9d9077a4ac9f 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h
> +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h
> @@ -24,8 +24,6 @@
>
> #define DMA_SYS_BUS_MODE 0x00001004
>
> -#define DMA_BUS_MODE_PBL BIT(16)
> -#define DMA_BUS_MODE_RPBL_MASK GENMASK(21, 16)
> #define DMA_BUS_MODE_MB BIT(14)
> #define DMA_BUS_MODE_FB BIT(0)
>
> @@ -68,19 +66,22 @@ static inline u32 dma_chanx_base_addr(const struct dwmac4_addrs *addrs,
>
> #define DMA_CHAN_CONTROL(addrs, x) dma_chanx_base_addr(addrs, x)
>
> +#define DMA_CHAN_CTRL_PBLX8 BIT(16)
> #define DMA_CONTROL_SPH BIT(24)
>
> #define DMA_CHAN_TX_CONTROL(addrs, x) (dma_chanx_base_addr(addrs, x) + 0x4)
>
> #define DMA_CONTROL_EDSE BIT(28)
> +#define DMA_CHAN_TX_CTRL_TXPBL_MASK GENMASK(21, 16)
> #define DMA_CONTROL_TSE BIT(12)
> #define DMA_CONTROL_OSP BIT(4)
> #define DMA_CONTROL_ST BIT(0)
>
> #define DMA_CHAN_RX_CONTROL(addrs, x) (dma_chanx_base_addr(addrs, x) + 0x8)
>
> -#define DMA_CONTROL_SR BIT(0)
> +#define DMA_CHAN_RX_CTRL_RXPBL_MASK GENMASK(21, 16)
> #define DMA_RBSZ_MASK GENMASK(14, 1)
> +#define DMA_CONTROL_SR BIT(0)
>
> #define DMA_CHAN_TX_BASE_ADDR_HI(addrs, x) (dma_chanx_base_addr(addrs, x) + 0x10)
> #define DMA_CHAN_TX_BASE_ADDR(addrs, x) (dma_chanx_base_addr(addrs, x) + 0x14)
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net-next] net: stmmac: fix dwmac4 transmit performance regression
2026-01-16 7:42 ` Maxime Chevallier
@ 2026-01-16 23:21 ` Russell King (Oracle)
2026-01-17 9:26 ` Maxime Chevallier
0 siblings, 1 reply; 9+ messages in thread
From: Russell King (Oracle) @ 2026-01-16 23:21 UTC (permalink / raw)
To: Maxime Chevallier
Cc: Andrew Lunn, Heiner Kallweit, Alexandre Torgue, Andrew Lunn,
David S. Miller, Eric Dumazet, Jakub Kicinski, linux-arm-kernel,
linux-stm32, Maxime Coquelin, netdev, Paolo Abeni
On Fri, Jan 16, 2026 at 08:42:19AM +0100, Maxime Chevallier wrote:
> Hi Russell,
>
> On 16/01/2026 01:49, Russell King (Oracle) wrote:
> > dwmac4's transmit performance dropped by a factor of four due to an
> > incorrect assumption about which definitions are for what. This
> > highlights the need for sane register macros.
> >
> > Commit 8409495bf6c9 ("net: stmmac: cores: remove many xxx_SHIFT
> > definitions") changed the way the txpbl value is merged into the
> > register:
> >
> > value = readl(ioaddr + DMA_CHAN_TX_CONTROL(dwmac4_addrs, chan));
> > - value = value | (txpbl << DMA_BUS_MODE_PBL_SHIFT);
> > + value = value | FIELD_PREP(DMA_BUS_MODE_PBL, txpbl);
> >
> > With the following in the header file:
> >
> > #define DMA_BUS_MODE_PBL BIT(16)
> > -#define DMA_BUS_MODE_PBL_SHIFT 16
> >
> > The assumption here was that DMA_BUS_MODE_PBL was the mask for
> > DMA_BUS_MODE_PBL_SHIFT, but this turns out not to be the case.
> >
> > The field is actually six bits wide, buts 21:16, and is called
> > TXPBL.
> >
> > What's even more confusing is, there turns out to be a PBLX8
> > single bit in the DMA_CHAN_CONTROL register (0x1100 for channel 0),
> > and DMA_BUS_MODE_PBL seems to be used for that. However, this bit
> > et.al. was listed under a comment "/* DMA SYS Bus Mode bitmap */"
> > which is for register 0x1004.
> >
> > Fix this up by adding an appropriately named field definition under
> > the DMA_CHAN_TX_CONTROL() register address definition.
> >
> > Move the RPBL mask definition under DMA_CHAN_RX_CONTROL(), correctly
> > renaming it as well.
> >
> > Also move the PBL bit definition under DMA_CHAN_CONTROL(), correctly
> > renaming it.
> >
> > This removes confusion over the PBL fields.
> >
> > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
>
> Good job finding the problem ! However you need a Fixes tag, even though
> ths is is for net-next.
I really give up with when fixes should be added or not, because it
seems quite random when it's needed and when it isn't.
And no, don't quote the stable-kernel-rules nonsense that is
meaningless ot stable kernel people, when they use AI to analyse
commits and pick stuff completely randomly.
> It would also have been nice to be in CC, I spent some time on the bisect...
I thought you were, but I see now it was a different Maxime!
> Besides that, problem solved on an imx8mp setup :)
>
> Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Thanks.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net-next] net: stmmac: fix dwmac4 transmit performance regression
2026-01-16 23:21 ` Russell King (Oracle)
@ 2026-01-17 9:26 ` Maxime Chevallier
0 siblings, 0 replies; 9+ messages in thread
From: Maxime Chevallier @ 2026-01-17 9:26 UTC (permalink / raw)
To: Russell King (Oracle)
Cc: Andrew Lunn, Heiner Kallweit, Alexandre Torgue, Andrew Lunn,
David S. Miller, Eric Dumazet, Jakub Kicinski, linux-arm-kernel,
linux-stm32, Maxime Coquelin, netdev, Paolo Abeni
> I really give up with when fixes should be added or not, because it
> seems quite random when it's needed and when it isn't.
>
> And no, don't quote the stable-kernel-rules nonsense that is
> meaningless ot stable kernel people, when they use AI to analyse
> commits and pick stuff completely randomly.
The rule I'm following is the one in the netdev-maintainer doc, that says:
'for fixes the Fixes: tag is required, regardless of the tree'
>
>> It would also have been nice to be in CC, I spent some time on the bisect...
>
> I thought you were, but I see now it was a different Maxime!
Bah, no worries :)
Seems we have a bunch of french people with similar names working with
st platforms or stmmac on a regular basis, I understand the confusion :
Maxime Coquelin
Maxime Chevallier
Gatien Chevallier
Anyways, thanks for the quick fix on that !
Maxime
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net-next] net: stmmac: fix dwmac4 transmit performance regression
2026-01-16 0:49 [PATCH net-next] net: stmmac: fix dwmac4 transmit performance regression Russell King (Oracle)
2026-01-16 7:42 ` Maxime Chevallier
@ 2026-01-19 14:19 ` patchwork-bot+netdevbpf
2026-03-13 15:03 ` Georg Gottleuber
2 siblings, 0 replies; 9+ messages in thread
From: patchwork-bot+netdevbpf @ 2026-01-19 14:19 UTC (permalink / raw)
To: Russell King
Cc: andrew, hkallweit1, alexandre.torgue, andrew+netdev, davem,
edumazet, kuba, linux-arm-kernel, linux-stm32, mcoquelin.stm32,
netdev, pabeni
Hello:
This patch was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Fri, 16 Jan 2026 00:49:24 +0000 you wrote:
> dwmac4's transmit performance dropped by a factor of four due to an
> incorrect assumption about which definitions are for what. This
> highlights the need for sane register macros.
>
> Commit 8409495bf6c9 ("net: stmmac: cores: remove many xxx_SHIFT
> definitions") changed the way the txpbl value is merged into the
> register:
>
> [...]
Here is the summary with links:
- [net-next] net: stmmac: fix dwmac4 transmit performance regression
https://git.kernel.org/netdev/net-next/c/5ccde4c81e84
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net-next] net: stmmac: fix dwmac4 transmit performance regression
2026-01-16 0:49 [PATCH net-next] net: stmmac: fix dwmac4 transmit performance regression Russell King (Oracle)
2026-01-16 7:42 ` Maxime Chevallier
2026-01-19 14:19 ` patchwork-bot+netdevbpf
@ 2026-03-13 15:03 ` Georg Gottleuber
2026-03-13 16:35 ` Russell King (Oracle)
2 siblings, 1 reply; 9+ messages in thread
From: Georg Gottleuber @ 2026-03-13 15:03 UTC (permalink / raw)
To: Russell King (Oracle), Andrew Lunn, Heiner Kallweit
Cc: Alexandre Torgue, Andrew Lunn, David S. Miller, Eric Dumazet,
Jakub Kicinski, linux-arm-kernel, linux-stm32, Maxime Coquelin,
netdev, Paolo Abeni, Christoffer Sandberg, Werner Sembach
[-- Attachment #1: Type: text/plain, Size: 2137 bytes --]
Am 16.01.26 um 01:49 schrieb Russell King (Oracle):
> dwmac4's transmit performance dropped by a factor of four due to an
> incorrect assumption about which definitions are for what. This
> highlights the need for sane register macros.
>
> Commit 8409495bf6c9 ("net: stmmac: cores: remove many xxx_SHIFT
> definitions") changed the way the txpbl value is merged into the
> register:
>
> value = readl(ioaddr + DMA_CHAN_TX_CONTROL(dwmac4_addrs, chan));
> - value = value | (txpbl << DMA_BUS_MODE_PBL_SHIFT);
> + value = value | FIELD_PREP(DMA_BUS_MODE_PBL, txpbl);
>
> With the following in the header file:
>
> #define DMA_BUS_MODE_PBL BIT(16)
> -#define DMA_BUS_MODE_PBL_SHIFT 16
>
> The assumption here was that DMA_BUS_MODE_PBL was the mask for
> DMA_BUS_MODE_PBL_SHIFT, but this turns out not to be the case.
>
> The field is actually six bits wide, buts 21:16, and is called
> TXPBL.
>
> What's even more confusing is, there turns out to be a PBLX8
> single bit in the DMA_CHAN_CONTROL register (0x1100 for channel 0),
> and DMA_BUS_MODE_PBL seems to be used for that. However, this bit
> et.al. was listed under a comment "/* DMA SYS Bus Mode bitmap */"
> which is for register 0x1004.
>
> Fix this up by adding an appropriately named field definition under
> the DMA_CHAN_TX_CONTROL() register address definition.
>
> Move the RPBL mask definition under DMA_CHAN_RX_CONTROL(), correctly
> renaming it as well.
>
> Also move the PBL bit definition under DMA_CHAN_CONTROL(), correctly
> renaming it.
>
> This removes confusion over the PBL fields.
>
> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Thank you for this patch, which significantly speeds up the transmit (by
more than a factor of nine on our devices with Motorcomm yt6801).
Unfortunately, this patch also causes DMA errors on two of our devices;
logs from a iperf3 test are attached.
Strangely enough, a third device with the same Motorcomm yt6801 does not
appear to be affected by the DMA errors. However, further testing is needed.
Do you have any ideas for further tests?
Regards,
Georg
[-- Attachment #2: dmesg_stellaris16_intel_gen6.log.gz --]
[-- Type: application/gzip, Size: 32509 bytes --]
[-- Attachment #3: dmesg_stellaris_slim_15_intel_gen6.log.gz --]
[-- Type: application/gzip, Size: 39139 bytes --]
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net-next] net: stmmac: fix dwmac4 transmit performance regression
2026-03-13 15:03 ` Georg Gottleuber
@ 2026-03-13 16:35 ` Russell King (Oracle)
2026-03-13 18:39 ` Russell King (Oracle)
0 siblings, 1 reply; 9+ messages in thread
From: Russell King (Oracle) @ 2026-03-13 16:35 UTC (permalink / raw)
To: Georg Gottleuber
Cc: Andrew Lunn, Heiner Kallweit, Alexandre Torgue, Andrew Lunn,
David S. Miller, Eric Dumazet, Jakub Kicinski, linux-arm-kernel,
linux-stm32, Maxime Coquelin, netdev, Paolo Abeni,
Christoffer Sandberg, Werner Sembach
On Fri, Mar 13, 2026 at 04:03:16PM +0100, Georg Gottleuber wrote:
> Am 16.01.26 um 01:49 schrieb Russell King (Oracle):
> > dwmac4's transmit performance dropped by a factor of four due to an
> > incorrect assumption about which definitions are for what. This
> > highlights the need for sane register macros.
> >
> > Commit 8409495bf6c9 ("net: stmmac: cores: remove many xxx_SHIFT
> > definitions") changed the way the txpbl value is merged into the
> > register:
> >
> > value = readl(ioaddr + DMA_CHAN_TX_CONTROL(dwmac4_addrs, chan));
> > - value = value | (txpbl << DMA_BUS_MODE_PBL_SHIFT);
> > + value = value | FIELD_PREP(DMA_BUS_MODE_PBL, txpbl);
> >
> > With the following in the header file:
> >
> > #define DMA_BUS_MODE_PBL BIT(16)
> > -#define DMA_BUS_MODE_PBL_SHIFT 16
> >
> > The assumption here was that DMA_BUS_MODE_PBL was the mask for
> > DMA_BUS_MODE_PBL_SHIFT, but this turns out not to be the case.
> >
> > The field is actually six bits wide, buts 21:16, and is called
> > TXPBL.
> >
> > What's even more confusing is, there turns out to be a PBLX8
> > single bit in the DMA_CHAN_CONTROL register (0x1100 for channel 0),
> > and DMA_BUS_MODE_PBL seems to be used for that. However, this bit
> > et.al. was listed under a comment "/* DMA SYS Bus Mode bitmap */"
> > which is for register 0x1004.
> >
> > Fix this up by adding an appropriately named field definition under
> > the DMA_CHAN_TX_CONTROL() register address definition.
> >
> > Move the RPBL mask definition under DMA_CHAN_RX_CONTROL(), correctly
> > renaming it as well.
> >
> > Also move the PBL bit definition under DMA_CHAN_CONTROL(), correctly
> > renaming it.
> >
> > This removes confusion over the PBL fields.
> >
> > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
>
> Thank you for this patch, which significantly speeds up the transmit (by
> more than a factor of nine on our devices with Motorcomm yt6801).
>
> Unfortunately, this patch also causes DMA errors on two of our devices;
> logs from a iperf3 test are attached.
>
> Strangely enough, a third device with the same Motorcomm yt6801 does not
> appear to be affected by the DMA errors. However, further testing is needed.
>
> Do you have any ideas for further tests?
I would suggest dumping the contents of these control registers prior
to commit 8409495bf6c9, and after this commit, comparing their values
to identify what has changed. I'm sorry, I don't have the bandwidth to
inspect the patches to see what may have been inadvertently changed.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net-next] net: stmmac: fix dwmac4 transmit performance regression
2026-03-13 16:35 ` Russell King (Oracle)
@ 2026-03-13 18:39 ` Russell King (Oracle)
2026-03-16 10:51 ` Georg Gottleuber
0 siblings, 1 reply; 9+ messages in thread
From: Russell King (Oracle) @ 2026-03-13 18:39 UTC (permalink / raw)
To: Georg Gottleuber
Cc: Andrew Lunn, Heiner Kallweit, Alexandre Torgue, Andrew Lunn,
David S. Miller, Eric Dumazet, Jakub Kicinski, linux-arm-kernel,
linux-stm32, Maxime Coquelin, netdev, Paolo Abeni,
Christoffer Sandberg, Werner Sembach
On Fri, Mar 13, 2026 at 04:35:02PM +0000, Russell King (Oracle) wrote:
> On Fri, Mar 13, 2026 at 04:03:16PM +0100, Georg Gottleuber wrote:
> > Am 16.01.26 um 01:49 schrieb Russell King (Oracle):
> > > dwmac4's transmit performance dropped by a factor of four due to an
> > > incorrect assumption about which definitions are for what. This
> > > highlights the need for sane register macros.
> > >
> > > Commit 8409495bf6c9 ("net: stmmac: cores: remove many xxx_SHIFT
> > > definitions") changed the way the txpbl value is merged into the
> > > register:
> > >
> > > value = readl(ioaddr + DMA_CHAN_TX_CONTROL(dwmac4_addrs, chan));
> > > - value = value | (txpbl << DMA_BUS_MODE_PBL_SHIFT);
> > > + value = value | FIELD_PREP(DMA_BUS_MODE_PBL, txpbl);
> > >
> > > With the following in the header file:
> > >
> > > #define DMA_BUS_MODE_PBL BIT(16)
> > > -#define DMA_BUS_MODE_PBL_SHIFT 16
> > >
> > > The assumption here was that DMA_BUS_MODE_PBL was the mask for
> > > DMA_BUS_MODE_PBL_SHIFT, but this turns out not to be the case.
> > >
> > > The field is actually six bits wide, buts 21:16, and is called
> > > TXPBL.
> > >
> > > What's even more confusing is, there turns out to be a PBLX8
> > > single bit in the DMA_CHAN_CONTROL register (0x1100 for channel 0),
> > > and DMA_BUS_MODE_PBL seems to be used for that. However, this bit
> > > et.al. was listed under a comment "/* DMA SYS Bus Mode bitmap */"
> > > which is for register 0x1004.
> > >
> > > Fix this up by adding an appropriately named field definition under
> > > the DMA_CHAN_TX_CONTROL() register address definition.
> > >
> > > Move the RPBL mask definition under DMA_CHAN_RX_CONTROL(), correctly
> > > renaming it as well.
> > >
> > > Also move the PBL bit definition under DMA_CHAN_CONTROL(), correctly
> > > renaming it.
> > >
> > > This removes confusion over the PBL fields.
> > >
> > > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> >
> > Thank you for this patch, which significantly speeds up the transmit (by
> > more than a factor of nine on our devices with Motorcomm yt6801).
> >
> > Unfortunately, this patch also causes DMA errors on two of our devices;
> > logs from a iperf3 test are attached.
> >
> > Strangely enough, a third device with the same Motorcomm yt6801 does not
> > appear to be affected by the DMA errors. However, further testing is needed.
> >
> > Do you have any ideas for further tests?
>
> I would suggest dumping the contents of these control registers prior
> to commit 8409495bf6c9, and after this commit, comparing their values
> to identify what has changed. I'm sorry, I don't have the bandwidth to
> inspect the patches to see what may have been inadvertently changed.
Note that dwmac-motorcomm was merged between the broken commit
8409495bf6c9 and the fix 5ccde4c81e84. However, the broken commit was
merged on 12 Jan, but there were postings of dwmac-motorcomm before
this:
https://lore.kernel.org/netdev/20251014164746.50696-5-ziyao@disroot.org/
which uses the same parameters.
So, I wonder whether what you're running into is that with the
breakage, the PCIe errors are masked. However, what I will note is that
setting TxPBL to zero (as will happen with the breakage, since 32 & 1
is 0) is documented as having undefined behaviour - so it's definitely
wrong. Even if zero works there, you're operating the IP in undefined
documented territory. If you really want to test that, with commit
5ccde4c81e84 applied, set txpbl to 64, as
FIELD_PREP(DMA_CHAN_TX_CTRL_TXPBL_MASK, 64)
will be zero because it overflows the 21:16 bitmask.
I suspect if you wind the kernel tree back to the before 8409495bf6c9,
and then apply the motorcomm support patches, you may well see these
PCIe errors - and that would rule out these changes. It would suggest
that this is a pre-existing problem, or maybe a hardware issue.
Another idea would be to try reducing txpbl to see if there's a value
where things stabilise.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net-next] net: stmmac: fix dwmac4 transmit performance regression
2026-03-13 18:39 ` Russell King (Oracle)
@ 2026-03-16 10:51 ` Georg Gottleuber
0 siblings, 0 replies; 9+ messages in thread
From: Georg Gottleuber @ 2026-03-16 10:51 UTC (permalink / raw)
To: Russell King (Oracle), Georg Gottleuber
Cc: Andrew Lunn, Heiner Kallweit, Alexandre Torgue, Andrew Lunn,
David S. Miller, Eric Dumazet, Jakub Kicinski, linux-arm-kernel,
linux-stm32, Maxime Coquelin, netdev, Paolo Abeni,
Christoffer Sandberg, Werner Sembach
Am 13.03.26 um 19:39 schrieb Russell King (Oracle):
> On Fri, Mar 13, 2026 at 04:35:02PM +0000, Russell King (Oracle) wrote:
>> On Fri, Mar 13, 2026 at 04:03:16PM +0100, Georg Gottleuber wrote:
>>> Am 16.01.26 um 01:49 schrieb Russell King (Oracle):
>>>> dwmac4's transmit performance dropped by a factor of four due to an
>>>> incorrect assumption about which definitions are for what. This
>>>> highlights the need for sane register macros.
>>>>
>>>> Commit 8409495bf6c9 ("net: stmmac: cores: remove many xxx_SHIFT
>>>> definitions") changed the way the txpbl value is merged into the
>>>> register:
>>>>
>>>> value = readl(ioaddr + DMA_CHAN_TX_CONTROL(dwmac4_addrs, chan));
>>>> - value = value | (txpbl << DMA_BUS_MODE_PBL_SHIFT);
>>>> + value = value | FIELD_PREP(DMA_BUS_MODE_PBL, txpbl);
>>>>
>>>> With the following in the header file:
>>>>
>>>> #define DMA_BUS_MODE_PBL BIT(16)
>>>> -#define DMA_BUS_MODE_PBL_SHIFT 16
>>>>
>>>> The assumption here was that DMA_BUS_MODE_PBL was the mask for
>>>> DMA_BUS_MODE_PBL_SHIFT, but this turns out not to be the case.
>>>>
>>>> The field is actually six bits wide, buts 21:16, and is called
>>>> TXPBL.
>>>>
>>>> What's even more confusing is, there turns out to be a PBLX8
>>>> single bit in the DMA_CHAN_CONTROL register (0x1100 for channel 0),
>>>> and DMA_BUS_MODE_PBL seems to be used for that. However, this bit
>>>> et.al. was listed under a comment "/* DMA SYS Bus Mode bitmap */"
>>>> which is for register 0x1004.
>>>>
>>>> Fix this up by adding an appropriately named field definition under
>>>> the DMA_CHAN_TX_CONTROL() register address definition.
>>>>
>>>> Move the RPBL mask definition under DMA_CHAN_RX_CONTROL(), correctly
>>>> renaming it as well.
>>>>
>>>> Also move the PBL bit definition under DMA_CHAN_CONTROL(), correctly
>>>> renaming it.
>>>>
>>>> This removes confusion over the PBL fields.
>>>>
>>>> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
>>>
>>> Thank you for this patch, which significantly speeds up the transmit (by
>>> more than a factor of nine on our devices with Motorcomm yt6801).
>>>
>>> Unfortunately, this patch also causes DMA errors on two of our devices;
>>> logs from a iperf3 test are attached.
>>>
>>> Strangely enough, a third device with the same Motorcomm yt6801 does not
>>> appear to be affected by the DMA errors. However, further testing is needed.
>>>
>>> Do you have any ideas for further tests?
>>
>> I would suggest dumping the contents of these control registers prior
>> to commit 8409495bf6c9, and after this commit, comparing their values
>> to identify what has changed. I'm sorry, I don't have the bandwidth to
>> inspect the patches to see what may have been inadvertently changed.
>
> Note that dwmac-motorcomm was merged between the broken commit
> 8409495bf6c9 and the fix 5ccde4c81e84. However, the broken commit was
> merged on 12 Jan, but there were postings of dwmac-motorcomm before
> this:
>
> https://lore.kernel.org/netdev/20251014164746.50696-5-ziyao@disroot.org/
>
> which uses the same parameters.
>
> So, I wonder whether what you're running into is that with the
> breakage, the PCIe errors are masked. However, what I will note is that
> setting TxPBL to zero (as will happen with the breakage, since 32 & 1
> is 0) is documented as having undefined behaviour - so it's definitely
> wrong. Even if zero works there, you're operating the IP in undefined
> documented territory. If you really want to test that, with commit
> 5ccde4c81e84 applied, set txpbl to 64, as
>
> FIELD_PREP(DMA_CHAN_TX_CTRL_TXPBL_MASK, 64)
>
> will be zero because it overflows the 21:16 bitmask.
>
> I suspect if you wind the kernel tree back to the before 8409495bf6c9,
> and then apply the motorcomm support patches, you may well see these
> PCIe errors - and that would rule out these changes. It would suggest
> that this is a pre-existing problem, or maybe a hardware issue.
Thank you very much for the explanation. You're right. I saw the DMA
errors in this case as well.
> Another idea would be to try reducing txpbl to see if there's a value
> where things stabilise.
>
I'll give it a try.
Regards,
Georg
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2026-03-16 10:51 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-01-16 0:49 [PATCH net-next] net: stmmac: fix dwmac4 transmit performance regression Russell King (Oracle)
2026-01-16 7:42 ` Maxime Chevallier
2026-01-16 23:21 ` Russell King (Oracle)
2026-01-17 9:26 ` Maxime Chevallier
2026-01-19 14:19 ` patchwork-bot+netdevbpf
2026-03-13 15:03 ` Georg Gottleuber
2026-03-13 16:35 ` Russell King (Oracle)
2026-03-13 18:39 ` Russell King (Oracle)
2026-03-16 10:51 ` Georg Gottleuber
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox