[PATCH v4] net: mvneta_bm: add suspend/resume support to prevent crash after resume

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH v4] net: mvneta_bm: add suspend/resume support to prevent crash after resume
@ 2026-06-18 14:35 Yun Zhou
  2026-06-20 19:00 ` Simon Horman
  0 siblings, 1 reply; 2+ messages in thread
From: Yun Zhou @ 2026-06-18 14:35 UTC (permalink / raw)
  To: marcin.s.wojtas, andrew+netdev, davem, edumazet, kuba, pabeni
  Cc: netdev, linux-kernel, yun.zhou

The mvneta driver uses the hardware Buffer Manager (BM) for RX buffer
allocation. During suspend, mvneta disables its clock, causing BM to
lose all buffer address state. On resume, mvneta_bm_port_init() re-
attaches the BM pool to the NIC, but BM hardware returns stale/garbage
buffer addresses. When NAPI poll processes these buffers, DMA cache
sync hits an invalid virtual address causing a kernel panic:

 Unable to handle kernel paging request at virtual address b0000080
 PC is at v7_dma_inv_range
 Call trace:
  v7_dma_inv_range from arch_sync_dma_for_cpu+0x94/0x158
  arch_sync_dma_for_cpu from __dma_sync_single_for_cpu+0xc4/0x15c
  __dma_sync_single_for_cpu from mvneta_rx_swbm+0x6c8/0xf48
  mvneta_rx_swbm from mvneta_poll+0x6fc/0x70c
  mvneta_poll from __napi_poll.constprop.0+0x2c/0x1e0
  __napi_poll.constprop.0 from net_rx_action+0x160/0x2c4
  net_rx_action from handle_softirqs+0xd8/0x2b8
  handle_softirqs from run_ksoftirqd+0x30/0x94
  run_ksoftirqd from smpboot_thread_fn+0x100/0x204
  smpboot_thread_fn from kthread+0xf4/0x110
  kthread from ret_from_fork+0x14/0x28

Fix by adding suspend/resume callbacks to the BM driver:

- suspend: drain all buffers (with DMA unmapping), free the BPPE
  regions, and reset pool state to FREE before stopping BM and gating
  the clock.

- resume: enable the clock, reinitialize BM defaults, and restore pool
  read/write pointers and size registers. Pool allocation and buffer
  refill are handled by mvneta_resume() through the normal
  mvneta_bm_port_init() path, which sees pools as FREE and performs
  full initialization identical to probe.

Add a device_link (DL_FLAG_AUTOREMOVE_CONSUMER) in mvneta_probe to
guarantee BM resumes before mvneta and suspends after mvneta. If the
link cannot be created, fall back to SW buffer management to avoid a
potential crash on resume due to unordered PM transitions.

Signed-off-by: Yun Zhou <yun.zhou@windriver.com>
---
v4:
  - On device_link_add() failure, fall back to SW buffer management
    (destroy pools, put BM reference, clear bm_priv) instead of merely
    emitting a warning. Without the link, suspend/resume ordering is
    not guaranteed and the original crash can still occur.

v3:
  - Restore per-pool POOL_SIZE_REG, POOL_READ_PTR_REG, and
    POOL_WRITE_PTR_REG in resume, since clock gating loses all BM
    register state.
  - Check device_link_add() return value and emit dev_warn on failure.
  - Replace SIMPLE_DEV_PM_OPS (deprecated) with
    DEFINE_SIMPLE_DEV_PM_OPS and pm_sleep_ptr(), removing the
    #ifdef CONFIG_PM_SLEEP guard.
  - Add dev_warn in suspend if not all buffers could be freed.

v2:
  - Drain buffers via mvneta_bm_bufs_free() in suspend instead of only
    stopping BM and gating the clock. This ensures proper DMA unmapping
    and avoids buffer leaks.
  - Free the BPPE DMA-coherent region in suspend so that resume takes
    the full probe-time initialization path (alloc + fill), eliminating
    the need to modify mvneta_bm_pool_create().
  - Reset pool type to MVNETA_BM_FREE in suspend so mvneta_bm_pool_use()
    correctly re-creates and refills pools on resume.
  - Check clk_prepare_enable() return value in resume.
  - Add device_link between mvneta (consumer) and mvneta_bm (supplier)
    to guarantee correct suspend/resume ordering.

 drivers/net/ethernet/marvell/mvneta.c    | 18 ++++++++
 drivers/net/ethernet/marvell/mvneta_bm.c | 58 ++++++++++++++++++++++++
 2 files changed, 76 insertions(+)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index 744d6585a949..543e566425c1 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -5678,6 +5678,24 @@ static int mvneta_probe(struct platform_device *pdev)
 					 "use SW buffer management\n");
 				mvneta_bm_put(pp->bm_priv);
 				pp->bm_priv = NULL;
+			} else if (!device_link_add(&pdev->dev,
+						    &pp->bm_priv->pdev->dev,
+						    DL_FLAG_AUTOREMOVE_CONSUMER)) {
+				/*
+				 * Link guarantees BM resumes before mvneta.
+				 * Without it, BM may not be ready when
+				 * mvneta_bm_port_init() runs on resume,
+				 * causing stale buffer addresses and a crash.
+				 * Fall back to SW management to be safe.
+				 */
+				dev_warn(&pdev->dev,
+					 "failed to link to BM, use SW buffer management\n");
+				mvneta_bm_pool_destroy(pp->bm_priv,
+						       pp->pool_long, 1 << pp->id);
+				mvneta_bm_pool_destroy(pp->bm_priv,
+						       pp->pool_short, 1 << pp->id);
+				mvneta_bm_put(pp->bm_priv);
+				pp->bm_priv = NULL;
 			}
 		}
 		/* Set RX packet offset correction for platforms, whose
diff --git a/drivers/net/ethernet/marvell/mvneta_bm.c b/drivers/net/ethernet/marvell/mvneta_bm.c
index 6bb380494919..85162a43eaf6 100644
--- a/drivers/net/ethernet/marvell/mvneta_bm.c
+++ b/drivers/net/ethernet/marvell/mvneta_bm.c
@@ -477,6 +477,63 @@ static void mvneta_bm_remove(struct platform_device *pdev)
 	clk_disable_unprepare(priv->clk);
 }
 
+static int mvneta_bm_suspend(struct device *dev)
+{
+	struct mvneta_bm *priv = dev_get_drvdata(dev);
+	int i;
+
+	/* Drain buffers and free pool resources while BM is still clocked */
+	for (i = 0; i < MVNETA_BM_POOLS_NUM; i++) {
+		struct mvneta_bm_pool *bm_pool = &priv->bm_pools[i];
+		int size_bytes;
+
+		if (bm_pool->type == MVNETA_BM_FREE)
+			continue;
+
+		mvneta_bm_bufs_free(priv, bm_pool, bm_pool->port_map);
+		if (bm_pool->hwbm_pool.buf_num)
+			dev_warn(&priv->pdev->dev,
+				 "pool %d: %d buffers not freed\n",
+				 bm_pool->id, bm_pool->hwbm_pool.buf_num);
+
+		size_bytes = sizeof(u32) * bm_pool->hwbm_pool.size;
+		dma_free_coherent(&priv->pdev->dev, size_bytes,
+				  bm_pool->virt_addr, bm_pool->phys_addr);
+		bm_pool->virt_addr = NULL;
+		bm_pool->type = MVNETA_BM_FREE;
+	}
+
+	mvneta_bm_write(priv, MVNETA_BM_COMMAND_REG, MVNETA_BM_STOP_MASK);
+	clk_disable_unprepare(priv->clk);
+	return 0;
+}
+
+static int mvneta_bm_resume(struct device *dev)
+{
+	struct mvneta_bm *priv = dev_get_drvdata(dev);
+	int i, err;
+
+	err = clk_prepare_enable(priv->clk);
+	if (err)
+		return err;
+
+	/* Reinitialize BM hardware; pools are refilled by mvneta_resume() */
+	mvneta_bm_default_set(priv);
+
+	/* Restore pool registers lost during clock gating */
+	for (i = 0; i < MVNETA_BM_POOLS_NUM; i++) {
+		mvneta_bm_write(priv, MVNETA_BM_POOL_READ_PTR_REG(i), 0);
+		mvneta_bm_write(priv, MVNETA_BM_POOL_WRITE_PTR_REG(i), 0);
+		mvneta_bm_write(priv, MVNETA_BM_POOL_SIZE_REG(i),
+				priv->bm_pools[i].hwbm_pool.size);
+	}
+
+	mvneta_bm_write(priv, MVNETA_BM_COMMAND_REG, MVNETA_BM_START_MASK);
+	return 0;
+}
+
+static DEFINE_SIMPLE_DEV_PM_OPS(mvneta_bm_pm_ops, mvneta_bm_suspend, mvneta_bm_resume);
+
 static const struct of_device_id mvneta_bm_match[] = {
 	{ .compatible = "marvell,armada-380-neta-bm" },
 	{ }
@@ -489,6 +546,7 @@ static struct platform_driver mvneta_bm_driver = {
 	.driver = {
 		.name = MVNETA_BM_DRIVER_NAME,
 		.of_match_table = mvneta_bm_match,
+		.pm = pm_sleep_ptr(&mvneta_bm_pm_ops),
 	},
 };
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: [PATCH v4] net: mvneta_bm: add suspend/resume support to prevent crash after resume
  2026-06-18 14:35 [PATCH v4] net: mvneta_bm: add suspend/resume support to prevent crash after resume Yun Zhou
@ 2026-06-20 19:00 ` Simon Horman
  0 siblings, 0 replies; 2+ messages in thread
From: Simon Horman @ 2026-06-20 19:00 UTC (permalink / raw)
  To: yun.zhou
  Cc: Simon Horman, marcin.s.wojtas, andrew+netdev, davem, edumazet,
	kuba, pabeni, netdev, linux-kernel

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://netdev-ai.bots.linux.dev/sashiko/
---
net: mvneta_bm: add suspend/resume support to prevent crash after resume

This patch adds suspend/resume callbacks to mvneta_bm so the BM hardware
state is drained at suspend and reinitialised at resume, and adds a
device_link from mvneta to mvneta_bm so PM ordering between the two is
guaranteed. A few questions on the new paths below.

> diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
> index 0c061fb0ed07..488f2663ad2c 100644
> --- a/drivers/net/ethernet/marvell/mvneta.c
> +++ b/drivers/net/ethernet/marvell/mvneta.c
> @@ -5678,6 +5678,24 @@ static int mvneta_probe(struct platform_device *pdev)
>  					 "use SW buffer management\n");
>  				mvneta_bm_put(pp->bm_priv);
>  				pp->bm_priv = NULL;
> +			} else if (!device_link_add(&pdev->dev,
> +						    &pp->bm_priv->pdev->dev,
> +						    DL_FLAG_AUTOREMOVE_CONSUMER)) {

[Medium]
Can the existing mvneta_bm_port_init() failure path leave priv->bm_pools[i]
in a state that the new suspend handler will then double-free?

mvneta_bm_pool_use() sets new_pool->type to LONG/SHORT before calling
mvneta_bm_pool_create(). If mvneta_bm_pool_create() fails after the
dma_alloc_coherent (for example at the alignment check or
mvebu_mbus_get_dram_win_info()), it does this:

  drivers/net/ethernet/marvell/mvneta_bm.c:mvneta_bm_pool_create() {
      ...
      dma_free_coherent(&pdev->dev, sizeof(u32) * size,
                        bm_pool->virt_addr, bm_pool->phys_addr);
      return -ENOMEM;
  }

It does not zero virt_addr, and the existing probe error path here only
calls mvneta_bm_put() and sets pp->bm_priv = NULL, without invoking
mvneta_bm_pool_destroy() on the partially-initialised pool:

  if (err < 0) {
      dev_info(&pdev->dev,
               "use SW buffer management\n");
      mvneta_bm_put(pp->bm_priv);
      pp->bm_priv = NULL;
  } else if (!device_link_add(...)) {

So priv->bm_pools[i] can be left with type != FREE and a freed-but-
non-NULL virt_addr. When mvneta_bm_suspend() later runs, it skips only
on type == MVNETA_BM_FREE and unconditionally calls
dma_free_coherent(bm_pool->virt_addr, ...) on the stale pointer.

Should the suspend loop mirror mvneta_bm_pool_destroy() and guard with
"if (bm_pool->virt_addr)" before calling dma_free_coherent(), and
NULL-check after?

> +				/*
> +				 * Link guarantees BM resumes before mvneta.
> +				 * Without it, BM may not be ready when
> +				 * mvneta_bm_port_init() runs on resume,
> +				 * causing stale buffer addresses and a crash.
> +				 * Fall back to SW management to be safe.
> +				 */
> +				dev_warn(&pdev->dev,
> +					 "failed to link to BM, use SW buffer management\n");
> +				mvneta_bm_pool_destroy(pp->bm_priv,
> +						       pp->pool_long, 1 << pp->id);
> +				mvneta_bm_pool_destroy(pp->bm_priv,
> +						       pp->pool_short, 1 << pp->id);
> +				mvneta_bm_put(pp->bm_priv);
> +				pp->bm_priv = NULL;
>  			}
>  		}
>  		/* Set RX packet offset correction for platforms, whose
> diff --git a/drivers/net/ethernet/marvell/mvneta_bm.c b/drivers/net/ethernet/marvell/mvneta_bm.c
> index 6bb380494919..85162a43eaf6 100644
> --- a/drivers/net/ethernet/marvell/mvneta_bm.c
> +++ b/drivers/net/ethernet/marvell/mvneta_bm.c
> @@ -477,6 +477,63 @@ static void mvneta_bm_remove(struct platform_device *pdev)
>  	clk_disable_unprepare(priv->clk);
>  }
>  
> +static int mvneta_bm_suspend(struct device *dev)
> +{
> +	struct mvneta_bm *priv = dev_get_drvdata(dev);
> +	int i;
> +
> +	/* Drain buffers and free pool resources while BM is still clocked */
> +	for (i = 0; i < MVNETA_BM_POOLS_NUM; i++) {
> +		struct mvneta_bm_pool *bm_pool = &priv->bm_pools[i];
> +		int size_bytes;
> +
> +		if (bm_pool->type == MVNETA_BM_FREE)
> +			continue;
> +
> +		mvneta_bm_bufs_free(priv, bm_pool, bm_pool->port_map);
> +		if (bm_pool->hwbm_pool.buf_num)
> +			dev_warn(&priv->pdev->dev,
> +				 "pool %d: %d buffers not freed\n",
> +				 bm_pool->id, bm_pool->hwbm_pool.buf_num);
> +
> +		size_bytes = sizeof(u32) * bm_pool->hwbm_pool.size;
> +		dma_free_coherent(&priv->pdev->dev, size_bytes,
> +				  bm_pool->virt_addr, bm_pool->phys_addr);
> +		bm_pool->virt_addr = NULL;
> +		bm_pool->type = MVNETA_BM_FREE;
> +	}
> +
> +	mvneta_bm_write(priv, MVNETA_BM_COMMAND_REG, MVNETA_BM_STOP_MASK);
> +	clk_disable_unprepare(priv->clk);
> +	return 0;
> +}

[Medium]
Why does this loop not call mvneta_bm_pool_disable() (or otherwise clear
MVNETA_BM_POOL_ENABLE_MASK in BASE_REG) per pool the way
mvneta_bm_pool_destroy() does?

mvneta_bm_pool_destroy() clears the per-pool enable bit before the
dma_free_coherent of the BPPE region. The new suspend ordering is
instead drain via BPPI -> dma_free_coherent(BPPE) -> next pool -> a
single global MVNETA_BM_STOP_MASK write at the end.

mvneta_bm_bufs_free() also clears MVNETA_BM_EMPTY_LIMIT_MASK on exit,
which re-enables auto-refill:

  drivers/net/ethernet/marvell/mvneta_bm.c:mvneta_bm_bufs_free() {
      ...
      mvneta_bm_config_clear(priv, MVNETA_BM_EMPTY_LIMIT_MASK);
      ...
  }

Between dma_free_coherent() for one pool's BPPE and the eventual global
STOP, BM hardware still has BASE_REG pointing at the freed phys_addr
with ENABLE_MASK set. Would matching mvneta_bm_remove()'s
"mvneta_bm_pool_destroy() per pool, then global STOP" ordering be safer
here?

[Low]
Resetting bm_pool->type to MVNETA_BM_FREE causes mvneta_bm_pool_use()
to re-enter its initialisation branch on every resume cycle, which
includes:

  mutex_init(&hwbm_pool->buf_lock);

on a struct mutex that was already initialised at first use. Repeated
mutex_init() on a previously-initialised mutex discards lockdep history
for that lock.

Could suspend either call mutex_destroy() on buf_lock to pair the
re-init, or leave bm_pool->type unchanged and only clear the dynamic
state (virt_addr, buf_num) so mvneta_bm_pool_use() does not re-enter
the init branch?

> +
> +static int mvneta_bm_resume(struct device *dev)
> +{
> +	struct mvneta_bm *priv = dev_get_drvdata(dev);
> +	int i, err;
> +
> +	err = clk_prepare_enable(priv->clk);
> +	if (err)
> +		return err;
> +
> +	/* Reinitialize BM hardware; pools are refilled by mvneta_resume() */
> +	mvneta_bm_default_set(priv);
> +
> +	/* Restore pool registers lost during clock gating */
> +	for (i = 0; i < MVNETA_BM_POOLS_NUM; i++) {
> +		mvneta_bm_write(priv, MVNETA_BM_POOL_READ_PTR_REG(i), 0);
> +		mvneta_bm_write(priv, MVNETA_BM_POOL_WRITE_PTR_REG(i), 0);
> +		mvneta_bm_write(priv, MVNETA_BM_POOL_SIZE_REG(i),
> +				priv->bm_pools[i].hwbm_pool.size);
> +	}
> +
> +	mvneta_bm_write(priv, MVNETA_BM_COMMAND_REG, MVNETA_BM_START_MASK);
> +	return 0;
> +}

[Medium]
The comment says "Restore pool registers lost during clock gating", but
the loop only restores SIZE_REG, READ_PTR_REG and WRITE_PTR_REG, and
does not touch MVNETA_BM_POOL_BASE_REG (which carries the BPPE physical
address and the per-pool MVNETA_BM_POOL_ENABLE_MASK).

Combined with mvneta_bm_suspend() not clearing the per-pool ENABLE_MASK
before clk_disable_unprepare(), MVNETA_BM_START_MASK is being written
here with each BASE_REG in an unspecified state.

If the IP retains BASE_REG content across clock gating, BM would be
started with stale BPPE phys_addrs (now freed memory) and ENABLE_MASK
set, until mvneta_resume() drives mvneta_bm_pool_create() to overwrite
BASE_REG.

Could resume explicitly clear each MVNETA_BM_POOL_BASE_REG (or at least
its ENABLE_MASK) before writing START_MASK, or could suspend clear the
per-pool ENABLE_MASK before disabling the clock?

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2026-06-20 19:00 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-18 14:35 [PATCH v4] net: mvneta_bm: add suspend/resume support to prevent crash after resume Yun Zhou
2026-06-20 19:00 ` Simon Horman

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.