* [PATCH] net: stmmac: fix fatal bus error on resume by reinitializing RX buffers
@ 2026-05-15 5:38 Ding Hui
2026-05-15 11:58 ` Andrew Lunn
0 siblings, 1 reply; 2+ messages in thread
From: Ding Hui @ 2026-05-15 5:38 UTC
To: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Maxime Coquelin, Alexandre Torgue,
Russell King (Oracle), Maxime Chevallier,
open list:STMMAC ETHERNET DRIVER,
moderated list:ARM/STM32 ARCHITECTURE,
moderated list:ARM/STM32 ARCHITECTURE, open list
Cc: dinghui, xiasanbo, yangchen11, liuxuanjun
From: Ding Hui <dinghui@lixiang.com>
On suspend, stmmac_suspend() calls stmmac_disable_all_queues() which
stops the RX NAPI, but the RX DMA engine may still be running for a
short window before stmmac_stop_all_dma() takes effect. During that
window the hardware can write incoming frames into the buffers pointed
to by the RX descriptors and write back the descriptors (clearing the
OWN bit, updating length/status). Because NAPI is already disabled,
the driver never refills these descriptors, so the RX ring is left in
a "consumed but not refilled" state with HW-written content in the
descriptor buffer-address fields.
On resume, stmmac_clear_descriptors() only re-arms the OWN bit (rdes3)
and does not repopulate the RX buffer address fields. As a result the
descriptors still contain whatever the hardware wrote back during the
suspend race. When the DMA is restarted, it dereferences these stale
addresses and triggers a fatal bus error.
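To make the failure mode concrete, here is a minimal user-space model of the descriptor state described above. It is only a sketch: the descriptor layout, the OWN bit position, and the names rx_ring, hw_writeback(), clear_own_only() and desc_is_valid() are assumptions for illustration, not the driver's actual structures or functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 4
#define DESC_OWN  (1u << 31)	/* assumed OWN bit position in des3 */

/* Simplified 4-word RX descriptor; the field usage is illustrative only. */
struct rx_desc {
	uint32_t des0;	/* buffer address when armed, status after write-back */
	uint32_t des1;
	uint32_t des2;
	uint32_t des3;	/* OWN bit plus length/status */
};

struct rx_ring {
	struct rx_desc desc[RING_SIZE];
	uint32_t buf_addr[RING_SIZE];	/* driver-side record of mapped buffers */
};

/* Arm every descriptor with its buffer address and hand it to the hardware. */
static void ring_init(struct rx_ring *r)
{
	for (int i = 0; i < RING_SIZE; i++) {
		r->buf_addr[i] = 0x1000u * (uint32_t)(i + 1);
		r->desc[i] = (struct rx_desc){ .des0 = r->buf_addr[i],
					       .des3 = DESC_OWN };
	}
}

/* Model of a hardware write-back: des0 is overwritten and OWN is cleared. */
static void hw_writeback(struct rx_ring *r, int i, uint32_t status)
{
	assert(r->desc[i].des3 & DESC_OWN);
	r->desc[i].des0 = status;
	r->desc[i].des3 = 64;	/* frame length, OWN bit now clear */
}

/* Model of a clear that only re-arms OWN and leaves des0 untouched. */
static void clear_own_only(struct rx_ring *r)
{
	for (int i = 0; i < RING_SIZE; i++)
		r->desc[i].des3 = DESC_OWN;
}

/* True if the descriptor is armed and points at the buffer mapped for it. */
static bool desc_is_valid(const struct rx_ring *r, int i)
{
	return (r->desc[i].des3 & DESC_OWN) &&
	       r->desc[i].des0 == r->buf_addr[i];
}
```

In this model, a descriptor that saw a write-back followed by clear_own_only() is owned by the hardware again but no longer passes desc_is_valid(), which is the state the DMA would trip over on restart.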
Fix this by treating the RX ring the same way as on close/open around
a PM transition:

  - In stmmac_suspend(), after stmmac_stop_all_dma(), walk every RX
    queue and free its buffers via dma_free_rx_xskbufs() when an XSK
    pool is attached or dma_free_rx_skbufs() otherwise, then reset
    rx_q->buf_alloc_num and clear rx_q->xsk_pool so the queue state
    matches a freshly closed queue.

  - In stmmac_resume(), call init_dma_rx_desc_rings() before
    stmmac_reset_queues_param() so RX buffers are re-allocated and
    the descriptor buffer-address fields are properly repopulated
    before the DMA is restarted.
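The two steps above can be sketched as a small user-space model. This is not the driver code: struct rx_queue here is a stand-in, malloc()/free() stand in for the page and DMA mapping helpers, and rx_queue_free_bufs() and rx_queue_init_bufs() are hypothetical names. The point is only the ordering and the bookkeeping reset.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define RING_SIZE 4
#define BUF_SIZE  2048

/* Stand-in for a per-queue RX state; not the driver's stmmac_rx_queue. */
struct rx_queue {
	void *buf[RING_SIZE];
	uintptr_t desc_addr[RING_SIZE];	/* models the descriptor address field */
	unsigned int buf_alloc_num;
};

/* Suspend side: release every allocated buffer and reset the bookkeeping. */
static void rx_queue_free_bufs(struct rx_queue *q)
{
	assert(q != NULL);
	for (unsigned int i = 0; i < q->buf_alloc_num; i++) {
		free(q->buf[i]);
		q->buf[i] = NULL;
		q->desc_addr[i] = 0;
	}
	q->buf_alloc_num = 0;
}

/*
 * Resume side: allocate fresh buffers and repopulate the address fields.
 * Expects a queue with no live buffers. Returns 0 on success, -1 if an
 * allocation fails, in which case the queue is left empty.
 */
static int rx_queue_init_bufs(struct rx_queue *q)
{
	assert(q != NULL);
	q->buf_alloc_num = 0;
	for (int i = 0; i < RING_SIZE; i++) {
		void *b = malloc(BUF_SIZE);

		if (!b) {
			rx_queue_free_bufs(q);
			return -1;
		}
		q->buf[i] = b;
		q->desc_addr[i] = (uintptr_t)b;
		q->buf_alloc_num++;
	}
	return 0;
}
```

Note that the -1 path leaves the queue without any buffers, so in this model a failed resume has nothing to fall back on.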
After this change, post-resume RX descriptors always reference freshly
allocated, driver-owned buffers, and the bus error no longer occurs.
Signed-off-by: Ding Hui <dinghui@lixiang.com>
---
.../net/ethernet/stmicro/stmmac/stmmac_main.c | 24 +++++++++++++++++++
1 file changed, 24 insertions(+)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 3591755ea30b..8ed43187cf20 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -8176,6 +8176,9 @@ int stmmac_suspend(struct device *dev)
 {
 	struct net_device *ndev = dev_get_drvdata(dev);
 	struct stmmac_priv *priv = netdev_priv(ndev);
+	u32 rx_count = priv->plat->rx_queues_to_use;
+	struct stmmac_rx_queue *rx_q;
+	u32 queue;
 	u8 chan;
 	if (!ndev || !netif_running(ndev))
@@ -8198,6 +8201,19 @@ int stmmac_suspend(struct device *dev)
 	/* Stop TX/RX DMA */
 	stmmac_stop_all_dma(priv);
+	/* Free RX queue resources */
+	for (queue = 0; queue < rx_count; queue++) {
+		rx_q = &priv->dma_conf.rx_queue[queue];
+
+		/* Release the DMA RX socket buffers */
+		if (rx_q->xsk_pool)
+			dma_free_rx_xskbufs(priv, &priv->dma_conf, queue);
+		else
+			dma_free_rx_skbufs(priv, &priv->dma_conf, queue);
+		rx_q->buf_alloc_num = 0;
+		rx_q->xsk_pool = NULL;
+	}
+
 	stmmac_legacy_serdes_power_down(priv);
 	/* Enable Power down mode by programming the PMT regs */
@@ -8316,6 +8332,14 @@ int stmmac_resume(struct device *dev)
 	mutex_lock(&priv->lock);
+	ret = init_dma_rx_desc_rings(ndev, &priv->dma_conf, GFP_KERNEL);
+	if (ret < 0) {
+		netdev_err(priv->dev, "%s: rx dma desc rings init failed\n", __func__);
+		mutex_unlock(&priv->lock);
+		rtnl_unlock();
+		return ret;
+	}
+
 	stmmac_reset_queues_param(priv);
 	stmmac_free_tx_skbufs(priv);
--
2.34.1
* Re: [PATCH] net: stmmac: fix fatal bus error on resume by reinitializing RX buffers
From: Andrew Lunn @ 2026-05-15 11:58 UTC
To: Ding Hui
Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Maxime Coquelin, Alexandre Torgue,
Russell King (Oracle), Maxime Chevallier,
open list:STMMAC ETHERNET DRIVER,
moderated list:ARM/STM32 ARCHITECTURE,
moderated list:ARM/STM32 ARCHITECTURE, open list, dinghui,
xiasanbo, yangchen11, liuxuanjun
> Fix this by treating the RX ring the same way as on close/open around
> a PM transition:
>
> - In stmmac_suspend(), after stmmac_stop_all_dma(), walk every RX
> queue and free its buffers via dma_free_rx_xskbufs() when an XSK
> pool is attached or dma_free_rx_skbufs() otherwise, then reset
> rx_q->buf_alloc_num and clear rx_q->xsk_pool so the queue state
> matches a freshly closed queue.
>
> - In stmmac_resume(), call init_dma_rx_desc_rings() before
> stmmac_reset_queues_param() so RX buffers are re-allocated and
> the descriptor buffer-address fields are properly repopulated
> before the DMA is restarted.
The problem with this is, if the system is under memory pressure, it
might not be able to allocate the new RX buffers. So on resume, your
network interface dies.
For configuration changes which require buffers to be changed, like
ethtool --set-ring, sometimes changing the MTU, you first allocate the
new buffers, and only if successful do you free the old buffers, so
that you can gracefully fail.
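As an illustration of the allocate-first ordering described here, a minimal user-space sketch might look like the following. The names struct ring, bufs_alloc(), bufs_free() and ring_resize() are hypothetical, and malloc() stands in for whatever buffer allocator a driver would really use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct ring {
	void **buf;	/* array of buffer pointers */
	size_t count;	/* number of entries in buf */
};

/* Free a buffer set; accepts NULL so an empty ring needs no special case. */
static void bufs_free(void **buf, size_t count)
{
	if (!buf)
		return;
	for (size_t i = 0; i < count; i++)
		free(buf[i]);
	free(buf);
}

/* Allocate count buffers of buf_size bytes; NULL on failure, nothing leaked. */
static void **bufs_alloc(size_t count, size_t buf_size)
{
	void **buf = calloc(count, sizeof(*buf));

	if (!buf)
		return NULL;
	for (size_t i = 0; i < count; i++) {
		buf[i] = malloc(buf_size);
		if (!buf[i]) {
			bufs_free(buf, i);
			return NULL;
		}
	}
	return buf;
}

/*
 * Resize the ring: build the complete new buffer set first, and release
 * the old one only after that succeeded. On failure the ring is unchanged.
 */
static int ring_resize(struct ring *r, size_t new_count, size_t buf_size)
{
	void **new_buf;

	assert(r != NULL);
	new_buf = bufs_alloc(new_count, buf_size);
	if (!new_buf)
		return -1;	/* old buffers are still in place */

	bufs_free(r->buf, r->count);
	r->buf = new_buf;
	r->count = new_count;
	return 0;
}
```

The key property is that a failed ring_resize() leaves the ring exactly as it was, so the interface keeps working with its old configuration.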
That allocate-first, then-release idea does not work for resume.
So, can you live with the buffers you have, and just reset the
descriptors?
Andrew