* [PATCH net v3] net: stmmac: fix transmit queue timed out after resume for tso
@ 2026-01-16 9:39 Tao Wang
2026-01-16 23:29 ` Russell King (Oracle)
From: Tao Wang @ 2026-01-16 9:39 UTC (permalink / raw)
To: kuba, linux, maxime.chevallier, netdev, davem, edumazet, pabeni,
horms, andrew+netdev, mcoquelin.stm32, alexandre.torgue,
rmk+kernel
Cc: linux-kernel, linux-doc, Tao Wang
After resume, dev_watchdog() reports:
"NETDEV WATCHDOG: CPU: x: transmit queue x timed out xx ms"
The triggering scenario is as follows:
stmmac_tso_xmit() sets tx_skbuff_dma[tx_q->cur_tx].last_segment = true,
but stmmac_free_tx_buffer() does not clear this value across a
suspend/resume cycle, so restarting TSO transmission may incorrectly
reuse tx_q->tx_skbuff_dma[first_entry].last_segment = true for a new
TSO packet.
When the TX queue timed out, the EMAC TX descriptors were as follows:
eth0: 221 [0x0000000876d10dd0]: 0x73660cbe 0x8 0x42 0xb04416a0
eth0: 222 [0x0000000876d10de0]: 0x77731d40 0x8 0x16a0 0x90000000
Descriptor 221 is the TSO header, and descriptor 222 is the TSO payload.
In tdes3 of descriptor 221 (0xb04416a0), both bit 29 (first descriptor)
and bit 28 (last descriptor) are set, which is invalid: the two bits
must never be set simultaneously on the same TSO descriptor. Since
descriptor 222 is the actual last descriptor, failing to mark the
descriptors properly causes the EMAC DMA to stop and hang.
To solve the issue, do not rely on the stale last_segment value:
explicitly set last_segment to false in stmmac_tso_xmit().
Fixes: c2837423cb54 ("net: stmmac: Rework TX Coalesce logic")
Signed-off-by: Tao Wang <tao03.wang@horizon.auto>
---
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index b3730312aeed..1735f1b50a71 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -4448,6 +4448,7 @@ static netdev_tx_t stmmac_tso_xmit(struct sk_buff *skb, struct net_device *dev)
if (dma_mapping_error(priv->device, des))
goto dma_map_err;
+ tx_q->tx_skbuff_dma[first_entry].last_segment = false;
stmmac_set_desc_addr(priv, first, des);
stmmac_tso_allocator(priv, des + proto_hdr_len, pay_len,
(nfrags == 0), queue);
--
2.52.0
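The invalid descriptor state in the dump above can be checked with a short stand-alone C sketch. The bit positions for FD/LD are taken from the commit message; the macro and function names below are illustrative, not the driver's:

```c
#include <stdint.h>

/* Bit positions as stated in the commit message: in TDES3, bit 29 marks
 * the first descriptor (FD) and bit 28 the last descriptor (LD) of a
 * packet. Macro names here are illustrative stand-ins.
 */
#define TDES3_FD (1u << 29)
#define TDES3_LD (1u << 28)

/* A TSO header descriptor must never carry FD and LD at once; if it
 * does, the DMA believes the packet ends before its payload descriptor.
 */
static inline int tdes3_is_bogus_tso_header(uint32_t tdes3)
{
    return (tdes3 & TDES3_FD) && (tdes3 & TDES3_LD);
}
```

Applied to the dump: descriptor 221's tdes3 (0xb04416a0) has both bits set, while descriptor 222's (0x90000000) carries only LD, matching the analysis in the commit message.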
* Re: [PATCH net v3] net: stmmac: fix transmit queue timed out after resume for tso
2026-01-16 9:39 [PATCH net v3] net: stmmac: fix transmit queue timed out after resume for tso Tao Wang
@ 2026-01-16 23:29 ` Russell King (Oracle)
2026-01-17 7:59 ` Tao Wang
From: Russell King (Oracle) @ 2026-01-16 23:29 UTC (permalink / raw)
To: Tao Wang
Cc: kuba, maxime.chevallier, netdev, davem, edumazet, pabeni, horms,
andrew+netdev, mcoquelin.stm32, alexandre.torgue, linux-kernel,
linux-doc
On Fri, Jan 16, 2026 at 05:39:31PM +0800, Tao Wang wrote:
> After resume, dev_watchdog() reports:
> "NETDEV WATCHDOG: CPU: x: transmit queue x timed out xx ms"
>
> The triggering scenario is as follows:
> stmmac_tso_xmit() sets tx_skbuff_dma[tx_q->cur_tx].last_segment = true,
> but stmmac_free_tx_buffer() does not clear this value across a
> suspend/resume cycle, so restarting TSO transmission may incorrectly
> reuse tx_q->tx_skbuff_dma[first_entry].last_segment = true for a new
> TSO packet.
>
> When the TX queue timed out, the EMAC TX descriptors were as follows:
> eth0: 221 [0x0000000876d10dd0]: 0x73660cbe 0x8 0x42 0xb04416a0
> eth0: 222 [0x0000000876d10de0]: 0x77731d40 0x8 0x16a0 0x90000000
>
> Descriptor 221 is the TSO header, and descriptor 222 is the TSO payload.
> In tdes3 of descriptor 221 (0xb04416a0), both bit 29 (first descriptor)
> and bit 28 (last descriptor) are set, which is invalid: the two bits
> must never be set simultaneously on the same TSO descriptor. Since
> descriptor 222 is the actual last descriptor, failing to mark the
> descriptors properly causes the EMAC DMA to stop and hang.
>
> To solve the issue, do not rely on the stale last_segment value:
> explicitly set last_segment to false in stmmac_tso_xmit().
>
> Fixes: c2837423cb54 ("net: stmmac: Rework TX Coalesce logic")
> Signed-off-by: Tao Wang <tao03.wang@horizon.auto>
> ---
> drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> index b3730312aeed..1735f1b50a71 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> @@ -4448,6 +4448,7 @@ static netdev_tx_t stmmac_tso_xmit(struct sk_buff *skb, struct net_device *dev)
> if (dma_mapping_error(priv->device, des))
> goto dma_map_err;
>
> + tx_q->tx_skbuff_dma[first_entry].last_segment = false;
> stmmac_set_desc_addr(priv, first, des);
> stmmac_tso_allocator(priv, des + proto_hdr_len, pay_len,
> (nfrags == 0), queue);
Buried in the patches I worked on as a result of the previous version
of this patch, I came up with a completely different way to deal with
this which doesn't even need .last_segment set correctly.
8<===
From: "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk>
Subject: [PATCH net-next] net: stmmac: calculate tso last_segment
Rather than using tx_q->tx_skbuff_dma[].last_segment to determine
whether the first descriptor entry is the only segment, calculate the
number of descriptor entries used. If there is only one descriptor,
then the first is also the last, so mark it as such.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
---
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index c2589f02ff7e..e0da51222966 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -4359,11 +4359,11 @@ static netdev_tx_t stmmac_tso_xmit(struct sk_buff *skb, struct net_device *dev)
unsigned int first_entry, tx_packets;
struct stmmac_txq_stats *txq_stats;
struct stmmac_tx_queue *tx_q;
+ bool set_ic, is_last_segment;
u32 pay_len, mss, queue;
int i, first_tx, nfrags;
u8 proto_hdr_len, hdr;
dma_addr_t des;
- bool set_ic;
/* Always insert VLAN tag to SKB payload for TSO frames.
*
@@ -4551,10 +4551,16 @@ static netdev_tx_t stmmac_tso_xmit(struct sk_buff *skb, struct net_device *dev)
stmmac_enable_tx_timestamp(priv, first);
}
+ /* If we only have one DMA descriptor used, then the first entry
+ * is the last segment.
+ */
+ is_last_segment = ((tx_q->cur_tx - first_entry) &
+ (priv->dma_conf.dma_tx_size - 1)) == 1;
+
/* Complete the first descriptor before granting the DMA */
stmmac_prepare_tso_tx_desc(priv, first, 1, proto_hdr_len, 0, 1,
- tx_q->tx_skbuff_dma[first_entry].last_segment,
- hdr / 4, (skb->len - proto_hdr_len));
+ is_last_segment, hdr / 4,
+ skb->len - proto_hdr_len);
/* If context desc is used to change MSS */
if (mss_desc) {
--
2.47.3
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
* Re: [PATCH net v3] net: stmmac: fix transmit queue timed out after resume for tso
2026-01-16 23:29 ` Russell King (Oracle)
@ 2026-01-17 7:59 ` Tao Wang
2026-01-17 11:09 ` Russell King (Oracle)
From: Tao Wang @ 2026-01-17 7:59 UTC (permalink / raw)
To: linux
Cc: alexandre.torgue, andrew+netdev, davem, edumazet, horms, kuba,
linux-doc, linux-kernel, maxime.chevallier, mcoquelin.stm32,
netdev, pabeni, tao03.wang
> Rather than using tx_q->tx_skbuff_dma[].last_segment to determine
> whether the first descriptor entry is the only segment, calculate the
> number of descriptor entries used. If there is only one descriptor,
> then the first is also the last, so mark it as such.
This is a good idea. tx_q->tx_skbuff_dma[].last_segment no longer carries
much meaning and can indeed be removed altogether.
> + is_last_segment = ((tx_q->cur_tx - first_entry) &
> + (priv->dma_conf.dma_tx_size - 1)) == 1;
Since tx_q->cur_tx may wrap around and become smaller than first_entry,
the following statement is more concise:
is_last_segment = (tx_q->cur_tx == first_entry);
* Re: [PATCH net v3] net: stmmac: fix transmit queue timed out after resume for tso
2026-01-17 7:59 ` Tao Wang
@ 2026-01-17 11:09 ` Russell King (Oracle)
From: Russell King (Oracle) @ 2026-01-17 11:09 UTC (permalink / raw)
To: Tao Wang
Cc: alexandre.torgue, andrew+netdev, davem, edumazet, horms, kuba,
linux-doc, linux-kernel, maxime.chevallier, mcoquelin.stm32,
netdev, pabeni
On Sat, Jan 17, 2026 at 03:59:22PM +0800, Tao Wang wrote:
> > Rather than using tx_q->tx_skbuff_dma[].last_segment to determine
> > whether the first descriptor entry is the only segment, calculate the
> > number of descriptor entries used. If there is only one descriptor,
> > then the first is also the last, so mark it as such.
>
> This is a good idea. tx_q->tx_skbuff_dma[].last_segment no longer carries
> much meaning and can indeed be removed altogether.
>
> > + is_last_segment = ((tx_q->cur_tx - first_entry) &
> > + (priv->dma_conf.dma_tx_size - 1)) == 1;
>
> Since tx_q->cur_tx may wrap around and become smaller than first_entry,
> the following statement is more concise:
> is_last_segment = (tx_q->cur_tx == first_entry);
That's incorrect. We advance tx_q->cur_tx by at least one by this
point:
first_entry = tx_q->cur_tx;
... fill descriptors ...
/* We've used all descriptors we need for this skb, however,
* advance cur_tx so that it references a fresh descriptor.
* ndo_start_xmit will fill this descriptor the next time it's
* called and stmmac_tx_clean may clean up to this descriptor.
*/
tx_q->cur_tx = STMMAC_GET_ENTRY(tx_q->cur_tx, priv->dma_conf.dma_tx_size);
...
/* If we only have one entry used, then the first entry is the last
* segment.
*/
is_last_segment = ((tx_q->cur_tx - first_entry) &
(priv->dma_conf.dma_tx_size - 1)) == 1;
So, replacing this with a check for tx_q->cur_tx == first_entry
would always be false here, unless we completely filled the ring
with a single TSO.
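The arithmetic above can be verified stand-alone. RING_SIZE and get_entry() below are illustrative stand-ins for priv->dma_conf.dma_tx_size and the driver's STMMAC_GET_ENTRY() macro:

```c
/* Illustrative model of the ring index arithmetic discussed above.
 * RING_SIZE stands in for dma_tx_size (always a power of two) and
 * get_entry() for STMMAC_GET_ENTRY(), which advances an index with
 * wrap-around.
 */
#define RING_SIZE 512u

static inline unsigned int get_entry(unsigned int x)
{
    return (x + 1u) & (RING_SIZE - 1u);
}

/* Number of descriptors used between first_entry and cur_tx; the mask
 * keeps the subtraction correct even after cur_tx wraps past zero.
 */
static inline unsigned int descs_used(unsigned int cur_tx,
                                      unsigned int first_entry)
{
    return (cur_tx - first_entry) & (RING_SIZE - 1u);
}
```

With first_entry = 511 and a single descriptor filled, cur_tx wraps to 0: descs_used(0, 511) is still 1, while cur_tx == first_entry is false, which is the point being made above.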
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!