From: Simon Horman <horms@kernel.org>
To: yun.zhou@windriver.com
Cc: Simon Horman <horms@kernel.org>,
marcin.s.wojtas@gmail.com, andrew+netdev@lunn.ch,
davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
pabeni@redhat.com, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4] net: mvneta_bm: add suspend/resume support to prevent crash after resume
Date: Sat, 20 Jun 2026 20:00:36 +0100 [thread overview]
Message-ID: <20260620190036.997651-1-horms@kernel.org> (raw)
In-Reply-To: <20260618143516.1131805-1-yun.zhou@windriver.com>
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://netdev-ai.bots.linux.dev/sashiko/
---
net: mvneta_bm: add suspend/resume support to prevent crash after resume
This patch adds suspend/resume callbacks to mvneta_bm so the BM hardware
state is drained at suspend and reinitialised at resume, and adds a
device_link from mvneta to mvneta_bm so PM ordering between the two is
guaranteed. A few questions on the new paths below.
> diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
> index 0c061fb0ed07..488f2663ad2c 100644
> --- a/drivers/net/ethernet/marvell/mvneta.c
> +++ b/drivers/net/ethernet/marvell/mvneta.c
> @@ -5678,6 +5678,24 @@ static int mvneta_probe(struct platform_device *pdev)
> "use SW buffer management\n");
> mvneta_bm_put(pp->bm_priv);
> pp->bm_priv = NULL;
> + } else if (!device_link_add(&pdev->dev,
> + &pp->bm_priv->pdev->dev,
> + DL_FLAG_AUTOREMOVE_CONSUMER)) {
[Medium]
Can the existing mvneta_bm_port_init() failure path leave priv->bm_pools[i]
in a state that the new suspend handler will then double-free?
mvneta_bm_pool_use() sets new_pool->type to LONG/SHORT before calling
mvneta_bm_pool_create(). If mvneta_bm_pool_create() fails after the
dma_alloc_coherent (for example at the alignment check or
mvebu_mbus_get_dram_win_info()), it does this:
drivers/net/ethernet/marvell/mvneta_bm.c:mvneta_bm_pool_create() {
...
dma_free_coherent(&pdev->dev, sizeof(u32) * size,
bm_pool->virt_addr, bm_pool->phys_addr);
return -ENOMEM;
}
It does not zero virt_addr, and the existing probe error path here only
calls mvneta_bm_put() and sets pp->bm_priv = NULL, without invoking
mvneta_bm_pool_destroy() on the partially-initialised pool:
if (err < 0) {
dev_info(&pdev->dev,
"use SW buffer management\n");
mvneta_bm_put(pp->bm_priv);
pp->bm_priv = NULL;
} else if (!device_link_add(...)) {
So priv->bm_pools[i] can be left with type != FREE and a freed-but-
non-NULL virt_addr. When mvneta_bm_suspend() later runs, it skips only
on type == MVNETA_BM_FREE and unconditionally calls
dma_free_coherent(bm_pool->virt_addr, ...) on the stale pointer.
Should the suspend loop mirror mvneta_bm_pool_destroy() and guard with
"if (bm_pool->virt_addr)" before calling dma_free_coherent(), and
NULL-check after?
> + /*
> + * Link guarantees BM resumes before mvneta.
> + * Without it, BM may not be ready when
> + * mvneta_bm_port_init() runs on resume,
> + * causing stale buffer addresses and a crash.
> + * Fall back to SW management to be safe.
> + */
> + dev_warn(&pdev->dev,
> + "failed to link to BM, use SW buffer management\n");
> + mvneta_bm_pool_destroy(pp->bm_priv,
> + pp->pool_long, 1 << pp->id);
> + mvneta_bm_pool_destroy(pp->bm_priv,
> + pp->pool_short, 1 << pp->id);
> + mvneta_bm_put(pp->bm_priv);
> + pp->bm_priv = NULL;
> }
> }
> /* Set RX packet offset correction for platforms, whose
> diff --git a/drivers/net/ethernet/marvell/mvneta_bm.c b/drivers/net/ethernet/marvell/mvneta_bm.c
> index 6bb380494919..85162a43eaf6 100644
> --- a/drivers/net/ethernet/marvell/mvneta_bm.c
> +++ b/drivers/net/ethernet/marvell/mvneta_bm.c
> @@ -477,6 +477,63 @@ static void mvneta_bm_remove(struct platform_device *pdev)
> clk_disable_unprepare(priv->clk);
> }
>
> +static int mvneta_bm_suspend(struct device *dev)
> +{
> + struct mvneta_bm *priv = dev_get_drvdata(dev);
> + int i;
> +
> + /* Drain buffers and free pool resources while BM is still clocked */
> + for (i = 0; i < MVNETA_BM_POOLS_NUM; i++) {
> + struct mvneta_bm_pool *bm_pool = &priv->bm_pools[i];
> + int size_bytes;
> +
> + if (bm_pool->type == MVNETA_BM_FREE)
> + continue;
> +
> + mvneta_bm_bufs_free(priv, bm_pool, bm_pool->port_map);
> + if (bm_pool->hwbm_pool.buf_num)
> + dev_warn(&priv->pdev->dev,
> + "pool %d: %d buffers not freed\n",
> + bm_pool->id, bm_pool->hwbm_pool.buf_num);
> +
> + size_bytes = sizeof(u32) * bm_pool->hwbm_pool.size;
> + dma_free_coherent(&priv->pdev->dev, size_bytes,
> + bm_pool->virt_addr, bm_pool->phys_addr);
> + bm_pool->virt_addr = NULL;
> + bm_pool->type = MVNETA_BM_FREE;
> + }
> +
> + mvneta_bm_write(priv, MVNETA_BM_COMMAND_REG, MVNETA_BM_STOP_MASK);
> + clk_disable_unprepare(priv->clk);
> + return 0;
> +}
[Medium]
Why does this loop not call mvneta_bm_pool_disable() (or otherwise clear
MVNETA_BM_POOL_ENABLE_MASK in BASE_REG) per pool the way
mvneta_bm_pool_destroy() does?
mvneta_bm_pool_destroy() clears the per-pool enable bit before the
dma_free_coherent of the BPPE region. The new suspend ordering is
instead drain via BPPI -> dma_free_coherent(BPPE) -> next pool -> a
single global MVNETA_BM_STOP_MASK write at the end.
mvneta_bm_bufs_free() also clears MVNETA_BM_EMPTY_LIMIT_MASK on exit,
which re-enables auto-refill:
drivers/net/ethernet/marvell/mvneta_bm.c:mvneta_bm_bufs_free() {
...
mvneta_bm_config_clear(priv, MVNETA_BM_EMPTY_LIMIT_MASK);
...
}
Between dma_free_coherent() for one pool's BPPE and the eventual global
STOP, BM hardware still has BASE_REG pointing at the freed phys_addr
with ENABLE_MASK set. Would matching mvneta_bm_remove()'s
"mvneta_bm_pool_destroy() per pool, then global STOP" ordering be safer
here?
[Low]
Resetting bm_pool->type to MVNETA_BM_FREE causes mvneta_bm_pool_use()
to re-enter its initialisation branch on every resume cycle, which
includes:
mutex_init(&hwbm_pool->buf_lock);
on a struct mutex that was already initialised at first use. Repeated
mutex_init() on a previously-initialised mutex discards lockdep history
for that lock.
Could suspend either call mutex_destroy() on buf_lock to pair the
re-init, or leave bm_pool->type unchanged and only clear the dynamic
state (virt_addr, buf_num) so mvneta_bm_pool_use() does not re-enter
the init branch?
> +
> +static int mvneta_bm_resume(struct device *dev)
> +{
> + struct mvneta_bm *priv = dev_get_drvdata(dev);
> + int i, err;
> +
> + err = clk_prepare_enable(priv->clk);
> + if (err)
> + return err;
> +
> + /* Reinitialize BM hardware; pools are refilled by mvneta_resume() */
> + mvneta_bm_default_set(priv);
> +
> + /* Restore pool registers lost during clock gating */
> + for (i = 0; i < MVNETA_BM_POOLS_NUM; i++) {
> + mvneta_bm_write(priv, MVNETA_BM_POOL_READ_PTR_REG(i), 0);
> + mvneta_bm_write(priv, MVNETA_BM_POOL_WRITE_PTR_REG(i), 0);
> + mvneta_bm_write(priv, MVNETA_BM_POOL_SIZE_REG(i),
> + priv->bm_pools[i].hwbm_pool.size);
> + }
> +
> + mvneta_bm_write(priv, MVNETA_BM_COMMAND_REG, MVNETA_BM_START_MASK);
> + return 0;
> +}
[Medium]
The comment says "Restore pool registers lost during clock gating", but
the loop only restores SIZE_REG, READ_PTR_REG and WRITE_PTR_REG, and
does not touch MVNETA_BM_POOL_BASE_REG (which carries the BPPE physical
address and the per-pool MVNETA_BM_POOL_ENABLE_MASK).
Combined with mvneta_bm_suspend() not clearing the per-pool ENABLE_MASK
before clk_disable_unprepare(), MVNETA_BM_START_MASK is being written
here with each BASE_REG in an unspecified state.
If the IP retains BASE_REG content across clock gating, BM would be
started with stale BPPE phys_addrs (now freed memory) and ENABLE_MASK
set, until mvneta_resume() drives mvneta_bm_pool_create() to overwrite
BASE_REG.
Could resume explicitly clear each MVNETA_BM_POOL_BASE_REG (or at least
its ENABLE_MASK) before writing START_MASK, or could suspend clear the
per-pool ENABLE_MASK before disabling the clock?
prev parent reply other threads:[~2026-06-20 19:00 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-18 14:35 [PATCH v4] net: mvneta_bm: add suspend/resume support to prevent crash after resume Yun Zhou
2026-06-20 19:00 ` Simon Horman [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260620190036.997651-1-horms@kernel.org \
--to=horms@kernel.org \
--cc=andrew+netdev@lunn.ch \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=marcin.s.wojtas@gmail.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=yun.zhou@windriver.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox