[PATCH] net: mvneta_bm: add suspend/resume to prevent crash after resume

Netdev List
 help / color / mirror / Atom feed

* [PATCH] net: mvneta_bm: add suspend/resume to prevent crash after resume
@ 2026-06-11  2:09 Yun Zhou
  2026-06-11  6:03 ` Andrew Lunn
  2026-06-12 16:50 ` Simon Horman
  0 siblings, 2 replies; 4+ messages in thread
From: Yun Zhou @ 2026-06-11  2:09 UTC (permalink / raw)
  To: marcin.s.wojtas, andrew+netdev, davem, edumazet, kuba, pabeni
  Cc: netdev, linux-kernel, yun.zhou

The mvneta driver uses the hardware Buffer Manager (BM) for RX buffer
allocation. During suspend, mvneta disables its clock, causing BM to
lose all buffer address state. On resume, mvneta_bm_port_init() re-
attaches the BM pool to the NIC, but BM hardware returns stale/garbage
buffer addresses. When NAPI poll processes these buffers, DMA cache
sync hits an invalid virtual address causing a kernel panic:

  Unable to handle kernel paging request at virtual address b0000080
  PC is at v7_dma_inv_range
  Call trace:
   v7_dma_inv_range from arch_sync_dma_for_cpu+0x94/0x158
   arch_sync_dma_for_cpu from __dma_sync_single_for_cpu+0xc4/0x15c
   __dma_sync_single_for_cpu from mvneta_rx_swbm+0x6c8/0xf48
   mvneta_rx_swbm from mvneta_poll+0x6fc/0x70c
   mvneta_poll from __napi_poll.constprop.0+0x2c/0x1e0
   __napi_poll.constprop.0 from net_rx_action+0x160/0x2c4
   net_rx_action from handle_softirqs+0xd8/0x2b8
   handle_softirqs from run_ksoftirqd+0x30/0x94
   run_ksoftirqd from smpboot_thread_fn+0x100/0x204
   smpboot_thread_fn from kthread+0xf4/0x110
   kthread from ret_from_fork+0x14/0x28

Fix by adding suspend/resume callbacks to the BM driver:
- suspend: stop BM unit and disable clock
- resume: enable clock, reinitialize BM defaults and restart pools

Signed-off-by: Yun Zhou <yun.zhou@windriver.com>
---
 drivers/net/ethernet/marvell/mvneta_bm.c | 25 ++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/drivers/net/ethernet/marvell/mvneta_bm.c b/drivers/net/ethernet/marvell/mvneta_bm.c
index 6bb380494919..a0800b99c007 100644
--- a/drivers/net/ethernet/marvell/mvneta_bm.c
+++ b/drivers/net/ethernet/marvell/mvneta_bm.c
@@ -477,6 +477,30 @@ static void mvneta_bm_remove(struct platform_device *pdev)
 	clk_disable_unprepare(priv->clk);
 }
 
+#ifdef CONFIG_PM_SLEEP
+static int mvneta_bm_suspend(struct device *dev)
+{
+	struct mvneta_bm *priv = dev_get_drvdata(dev);
+
+	mvneta_bm_write(priv, MVNETA_BM_COMMAND_REG, MVNETA_BM_STOP_MASK);
+	clk_disable_unprepare(priv->clk);
+	return 0;
+}
+
+static int mvneta_bm_resume(struct device *dev)
+{
+	struct mvneta_bm *priv = dev_get_drvdata(dev);
+
+	clk_prepare_enable(priv->clk);
+	mdelay(1);
+	mvneta_bm_default_set(priv);
+	mvneta_bm_pools_init(priv);
+	return 0;
+}
+#endif
+
+static SIMPLE_DEV_PM_OPS(mvneta_bm_pm_ops, mvneta_bm_suspend, mvneta_bm_resume);
+
 static const struct of_device_id mvneta_bm_match[] = {
 	{ .compatible = "marvell,armada-380-neta-bm" },
 	{ }
@@ -489,6 +513,7 @@ static struct platform_driver mvneta_bm_driver = {
 	.driver = {
 		.name = MVNETA_BM_DRIVER_NAME,
 		.of_match_table = mvneta_bm_match,
+		.pm = &mvneta_bm_pm_ops,
 	},
 };
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] net: mvneta_bm: add suspend/resume to prevent crash after resume
  2026-06-11  2:09 [PATCH] net: mvneta_bm: add suspend/resume to prevent crash after resume Yun Zhou
@ 2026-06-11  6:03 ` Andrew Lunn
  2026-06-11  7:23   ` Zhou, Yun
  2026-06-12 16:50 ` Simon Horman
  1 sibling, 1 reply; 4+ messages in thread
From: Andrew Lunn @ 2026-06-11  6:03 UTC (permalink / raw)
  To: Yun Zhou
  Cc: marcin.s.wojtas, andrew+netdev, davem, edumazet, kuba, pabeni,
	netdev, linux-kernel

On Thu, Jun 11, 2026 at 10:09:35AM +0800, Yun Zhou wrote:
> The mvneta driver uses the hardware Buffer Manager (BM) for RX buffer
> allocation. During suspend, mvneta disables its clock, causing BM to
> lose all buffer address state. On resume, mvneta_bm_port_init() re-
> attaches the BM pool to the NIC, but BM hardware returns stale/garbage
> buffer addresses. When NAPI poll processes these buffers, DMA cache
> sync hits an invalid virtual address causing a kernel panic:

Has this never worked, or has something changed recently which broke
it?

	Andrew

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] net: mvneta_bm: add suspend/resume to prevent crash after resume
  2026-06-11  6:03 ` Andrew Lunn
@ 2026-06-11  7:23   ` Zhou, Yun
  0 siblings, 0 replies; 4+ messages in thread
From: Zhou, Yun @ 2026-06-11  7:23 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: marcin.s.wojtas, andrew+netdev, davem, edumazet, kuba, pabeni,
	netdev, linux-kernel



On 6/11/26 14:03, Andrew Lunn wrote:
> CAUTION: This email comes from a non Wind River email account!
> Do not click links or open attachments unless you recognize the sender and know the content is safe.
>
> On Thu, Jun 11, 2026 at 10:09:35AM +0800, Yun Zhou wrote:
>> The mvneta driver uses the hardware Buffer Manager (BM) for RX buffer
>> allocation. During suspend, mvneta disables its clock, causing BM to
>> lose all buffer address state. On resume, mvneta_bm_port_init() re-
>> attaches the BM pool to the NIC, but BM hardware returns stale/garbage
>> buffer addresses. When NAPI poll processes these buffers, DMA cache
>> sync hits an invalid virtual address causing a kernel panic:
> Has this never worked, or has something changed recently which broke
> it?
>
>          Andrew
It should have never worked before, because the mvneta_bm module
does not have PM related support.On the Marvel Armada XP
development board, if try to put the system to STR while there is
network packet transmission, and then wake it up, after a period
of time (a few minutes or an hour), the crash mentioned in the patch
will appear.

By the way, there is more than one issue on this platform that causes
STR to not work, like this:
https://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux.git/commit/?h=gpio/for-current&id=b9ad50d7505ebd48282ec3630258dc820fc85c81
I guess that the vendor may not have verified STR.

BR,
Yun

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] net: mvneta_bm: add suspend/resume to prevent crash after resume
  2026-06-11  2:09 [PATCH] net: mvneta_bm: add suspend/resume to prevent crash after resume Yun Zhou
  2026-06-11  6:03 ` Andrew Lunn
@ 2026-06-12 16:50 ` Simon Horman
  1 sibling, 0 replies; 4+ messages in thread
From: Simon Horman @ 2026-06-12 16:50 UTC (permalink / raw)
  To: yun.zhou
  Cc: Simon Horman, marcin.s.wojtas, andrew+netdev, davem, edumazet,
	kuba, pabeni, netdev, linux-kernel

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://netdev-ai.bots.linux.dev/sashiko/
---
net: mvneta_bm: add suspend/resume to prevent crash after resume

This patch adds suspend/resume PM callbacks to the mvneta_bm driver to
avoid a NAPI-time panic after system resume caused by BM hardware
returning stale buffer addresses. The suspend handler stops BM and gates
the clock; the resume handler re-enables the clock, calls
mvneta_bm_default_set() and mvneta_bm_pools_init().

> diff --git a/drivers/net/ethernet/marvell/mvneta_bm.c b/drivers/net/ethernet/marvell/mvneta_bm.c
> index 6bb380494919..a0800b99c007 100644
> --- a/drivers/net/ethernet/marvell/mvneta_bm.c
> +++ b/drivers/net/ethernet/marvell/mvneta_bm.c
> @@ -477,6 +477,30 @@ static void mvneta_bm_remove(struct platform_device *pdev)
>  	clk_disable_unprepare(priv->clk);
>  }
>  
> +#ifdef CONFIG_PM_SLEEP
> +static int mvneta_bm_suspend(struct device *dev)
> +{
> +	struct mvneta_bm *priv = dev_get_drvdata(dev);
> +
> +	mvneta_bm_write(priv, MVNETA_BM_COMMAND_REG, MVNETA_BM_STOP_MASK);
> +	clk_disable_unprepare(priv->clk);
> +	return 0;
> +}

[High]
Does this suspend path leak every RX buffer currently parked in the BM
hardware?

Buffers allocated via mvneta_bm_construct() are tracked only by the BM
unit itself: their virtual address is stored in the first 4 bytes of
the buffer, while only the phys address is pushed to BM through
mvneta_bm_pool_put_bp(). There is no kernel-side shadow list.

Once MVNETA_BM_STOP_MASK is written and the clock is gated, the commit
message itself says BM "loses all buffer address state". On a later
teardown, mvneta_bm_bufs_free() will read zero/garbage out of the now
empty BPPI/BPPE and bail on:

	if (buf_phys_addr == 0)
		continue;

so dma_unmap_single() / hwbm_buf_free() are never called for the
original allocations. Should suspend drain BM via mvneta_bm_bufs_free()
while BM still knows the addresses, and resume refill via
hwbm_pool_add()?

> +
> +static int mvneta_bm_resume(struct device *dev)
> +{
> +	struct mvneta_bm *priv = dev_get_drvdata(dev);
> +
> +	clk_prepare_enable(priv->clk);

[Medium]
Should the return value of clk_prepare_enable() be checked here?

mvneta_bm_probe() checks the same call and propagates the error. If
the call fails on the resume path, the immediately-following readl/
writel accesses in mvneta_bm_default_set() (MVNETA_BM_INTR_MASK_REG,
MVNETA_BM_INTR_CAUSE_REG, MVNETA_BM_CONFIG_REG) and
mvneta_bm_pools_init() (MVNETA_BM_COMMAND_REG and per-pool registers)
target an unclocked peripheral, which on Armada 380 typically raises a
synchronous external abort.

> +	mdelay(1);

[Low]
Resume callbacks for system sleep run in process context. Could
usleep_range() be used here instead of the busy-wait mdelay(1)?

Also, what is the 1ms delay waiting for? The probe path does the same
clk_prepare_enable() followed by mvneta_bm_default_set() with no
delay, and there is no comment or datasheet citation explaining what
this is for.

> +	mvneta_bm_default_set(priv);
> +	mvneta_bm_pools_init(priv);

[High, High]
Two questions about reusing mvneta_bm_pools_init() on the resume path.

First, does resume leave BM hardware with no buffer addresses, in spite
of the commit log saying the resume should "restart pools"?

mvneta_bm_pools_init() resets bm_pool->type to MVNETA_BM_FREE but does
not reset hwbm_pool->buf_num. The original probe-time fill set
buf_num = hwbm_pool->size, so the post-resume call chain
mvneta_bm_port_init() -> mvneta_bm_pool_use() -> hwbm_pool_add() takes
this early-return in net/core/hwbm.c:

	int hwbm_pool_add(struct hwbm_pool *bm_pool, unsigned int buf_num)
	{
		...
		if (bm_pool->buf_num == bm_pool->size) {
			pr_warn("pool already filled\n");
			return bm_pool->buf_num;
		}

hwbm_pool_refill() is never called, so no buffers are constructed and
no phys addresses are pushed via mvneta_bm_pool_put_bp(). Meanwhile,
mvneta_bm_pool_create() reprograms MVNETA_BM_POOL_BASE_REG to a fresh
dma_alloc_coherent() region. Doesn't this leave BM started with an
empty pool, recreating the same class of failure the patch claims to
fix the next time mvneta_rx_hwbm() runs?

Second, can this leak the previously allocated BPPE region per
suspend/resume cycle?

mvneta_bm_pools_init() unconditionally sets bm_pool->type =
MVNETA_BM_FREE for every pool but does not free
bm_pool->virt_addr / bm_pool->phys_addr (the DMA-coherent BPPE region
allocated by mvneta_bm_pool_create() during probe). On resume,
mvneta_bm_port_init() -> mvneta_bm_pool_use() observes
new_pool->type == MVNETA_BM_FREE and re-enters
mvneta_bm_pool_create(), which calls dma_alloc_coherent() and
overwrites virt_addr/phys_addr without dma_free_coherent() on the
previous allocation. mvneta_bm_pool_destroy() is the only freer of
virt_addr and is not called from the suspend path.

> +	return 0;
> +}
> +#endif
> +
> +static SIMPLE_DEV_PM_OPS(mvneta_bm_pm_ops, mvneta_bm_suspend, mvneta_bm_resume);
> +
>  static const struct of_device_id mvneta_bm_match[] = {
>  	{ .compatible = "marvell,armada-380-neta-bm" },
>  	{ }
> @@ -489,6 +513,7 @@ static struct platform_driver mvneta_bm_driver = {
>  	.driver = {
>  		.name = MVNETA_BM_DRIVER_NAME,
>  		.of_match_table = mvneta_bm_match,
> +		.pm = &mvneta_bm_pm_ops,
>  	},
>  };

[Medium]
Is the suspend/resume ordering between the mvneta_bm supplier and the
mvneta consumer guaranteed by anything beyond dpm_list registration
order?

mvneta_bm_get() does of_find_device_by_node() and
platform_get_drvdata(), but does not call device_link_add(). Without a
device link or equivalent dpm dependency, ordering between two
independent platform_drivers is determined by probe completion order.
Under deferred-probe scenarios (e.g., mvneta probes first with the
bm_priv == NULL fallback, then mvneta_bm probes later), would the
ordering still keep mvneta_bm_resume() before any mvneta_resume() ->
mvneta_bm_port_init() touching BM registers, and the mvneta consumer
quiesced before mvneta_bm_suspend() stops BM and gates the clock?

Would adding a device_link_add(supplier=bm_pdev, consumer=mvneta_pdev,
DL_FLAG_PM_RUNTIME | DL_FLAG_AUTOREMOVE_CONSUMER) when a consumer
attaches make this ordering explicit?
-- 
pw-bot: changes-requested

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2026-06-12 16:51 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-11  2:09 [PATCH] net: mvneta_bm: add suspend/resume to prevent crash after resume Yun Zhou
2026-06-11  6:03 ` Andrew Lunn
2026-06-11  7:23   ` Zhou, Yun
2026-06-12 16:50 ` Simon Horman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox