From: Simon Horman <horms@kernel.org>
To: yun.zhou@windriver.com
Cc: Simon Horman <horms@kernel.org>,
marcin.s.wojtas@gmail.com, andrew+netdev@lunn.ch,
davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
pabeni@redhat.com, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4] net: mvneta_bm: add suspend/resume support to prevent crash after resume
Date: Sat, 20 Jun 2026 20:00:36 +0100 [thread overview]
Message-ID: <20260620190036.997651-1-horms@kernel.org> (raw)
In-Reply-To: <20260618143516.1131805-1-yun.zhou@windriver.com>
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://netdev-ai.bots.linux.dev/sashiko/
---
net: mvneta_bm: add suspend/resume support to prevent crash after resume
This patch adds suspend/resume callbacks to mvneta_bm so the BM hardware
state is drained at suspend and reinitialised at resume, and adds a
device_link from mvneta to mvneta_bm so PM ordering between the two is
guaranteed. A few questions on the new paths below.
> diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
> index 0c061fb0ed07..488f2663ad2c 100644
> --- a/drivers/net/ethernet/marvell/mvneta.c
> +++ b/drivers/net/ethernet/marvell/mvneta.c
> @@ -5678,6 +5678,24 @@ static int mvneta_probe(struct platform_device *pdev)
> "use SW buffer management\n");
> mvneta_bm_put(pp->bm_priv);
> pp->bm_priv = NULL;
> + } else if (!device_link_add(&pdev->dev,
> + &pp->bm_priv->pdev->dev,
> + DL_FLAG_AUTOREMOVE_CONSUMER)) {
[Medium]
Can the existing mvneta_bm_port_init() failure path leave priv->bm_pools[i]
in a state that the new suspend handler will then double-free?
mvneta_bm_pool_use() sets new_pool->type to LONG/SHORT before calling
mvneta_bm_pool_create(). If mvneta_bm_pool_create() fails after the
dma_alloc_coherent (for example at the alignment check or
mvebu_mbus_get_dram_win_info()), it does this:
drivers/net/ethernet/marvell/mvneta_bm.c:mvneta_bm_pool_create() {
...
dma_free_coherent(&pdev->dev, sizeof(u32) * size,
bm_pool->virt_addr, bm_pool->phys_addr);
return -ENOMEM;
}
It does not zero virt_addr, and the existing probe error path here only
calls mvneta_bm_put() and sets pp->bm_priv = NULL, without invoking
mvneta_bm_pool_destroy() on the partially-initialised pool:
if (err < 0) {
dev_info(&pdev->dev,
"use SW buffer management\n");
mvneta_bm_put(pp->bm_priv);
pp->bm_priv = NULL;
} else if (!device_link_add(...)) {
So priv->bm_pools[i] can be left with type != FREE and a freed-but-
non-NULL virt_addr. When mvneta_bm_suspend() later runs, it skips only
on type == MVNETA_BM_FREE and unconditionally calls
dma_free_coherent(bm_pool->virt_addr, ...) on the stale pointer.
Should the suspend loop mirror mvneta_bm_pool_destroy() and guard with
"if (bm_pool->virt_addr)" before calling dma_free_coherent(), and
NULL-check after?
> + /*
> + * Link guarantees BM resumes before mvneta.
> + * Without it, BM may not be ready when
> + * mvneta_bm_port_init() runs on resume,
> + * causing stale buffer addresses and a crash.
> + * Fall back to SW management to be safe.
> + */
> + dev_warn(&pdev->dev,
> + "failed to link to BM, use SW buffer management\n");
> + mvneta_bm_pool_destroy(pp->bm_priv,
> + pp->pool_long, 1 << pp->id);
> + mvneta_bm_pool_destroy(pp->bm_priv,
> + pp->pool_short, 1 << pp->id);
> + mvneta_bm_put(pp->bm_priv);
> + pp->bm_priv = NULL;
> }
> }
> /* Set RX packet offset correction for platforms, whose
> diff --git a/drivers/net/ethernet/marvell/mvneta_bm.c b/drivers/net/ethernet/marvell/mvneta_bm.c
> index 6bb380494919..85162a43eaf6 100644
> --- a/drivers/net/ethernet/marvell/mvneta_bm.c
> +++ b/drivers/net/ethernet/marvell/mvneta_bm.c
> @@ -477,6 +477,63 @@ static void mvneta_bm_remove(struct platform_device *pdev)
> clk_disable_unprepare(priv->clk);
> }
>
> +static int mvneta_bm_suspend(struct device *dev)
> +{
> + struct mvneta_bm *priv = dev_get_drvdata(dev);
> + int i;
> +
> + /* Drain buffers and free pool resources while BM is still clocked */
> + for (i = 0; i < MVNETA_BM_POOLS_NUM; i++) {
> + struct mvneta_bm_pool *bm_pool = &priv->bm_pools[i];
> + int size_bytes;
> +
> + if (bm_pool->type == MVNETA_BM_FREE)
> + continue;
> +
> + mvneta_bm_bufs_free(priv, bm_pool, bm_pool->port_map);
> + if (bm_pool->hwbm_pool.buf_num)
> + dev_warn(&priv->pdev->dev,
> + "pool %d: %d buffers not freed\n",
> + bm_pool->id, bm_pool->hwbm_pool.buf_num);
> +
> + size_bytes = sizeof(u32) * bm_pool->hwbm_pool.size;
> + dma_free_coherent(&priv->pdev->dev, size_bytes,
> + bm_pool->virt_addr, bm_pool->phys_addr);
> + bm_pool->virt_addr = NULL;
> + bm_pool->type = MVNETA_BM_FREE;
> + }
> +
> + mvneta_bm_write(priv, MVNETA_BM_COMMAND_REG, MVNETA_BM_STOP_MASK);
> + clk_disable_unprepare(priv->clk);
> + return 0;
> +}
[Medium]
Why does this loop not call mvneta_bm_pool_disable() (or otherwise clear
MVNETA_BM_POOL_ENABLE_MASK in BASE_REG) per pool the way
mvneta_bm_pool_destroy() does?
mvneta_bm_pool_destroy() clears the per-pool enable bit before the
dma_free_coherent of the BPPE region. The new suspend ordering is
instead drain via BPPI -> dma_free_coherent(BPPE) -> next pool -> a
single global MVNETA_BM_STOP_MASK write at the end.
mvneta_bm_bufs_free() also clears MVNETA_BM_EMPTY_LIMIT_MASK on exit,
which re-enables auto-refill:
drivers/net/ethernet/marvell/mvneta_bm.c:mvneta_bm_bufs_free() {
...
mvneta_bm_config_clear(priv, MVNETA_BM_EMPTY_LIMIT_MASK);
...
}
Between dma_free_coherent() for one pool's BPPE and the eventual global
STOP, BM hardware still has BASE_REG pointing at the freed phys_addr
with ENABLE_MASK set. Would matching mvneta_bm_remove()'s
"mvneta_bm_pool_destroy() per pool, then global STOP" ordering be safer
here?
[Low]
Resetting bm_pool->type to MVNETA_BM_FREE causes mvneta_bm_pool_use()
to re-enter its initialisation branch on every resume cycle, which
includes:
mutex_init(&hwbm_pool->buf_lock);
on a struct mutex that was already initialised at first use. Repeated
mutex_init() on a previously-initialised mutex discards lockdep history
for that lock.
Could suspend either call mutex_destroy() on buf_lock to pair the
re-init, or leave bm_pool->type unchanged and only clear the dynamic
state (virt_addr, buf_num) so mvneta_bm_pool_use() does not re-enter
the init branch?
> +
> +static int mvneta_bm_resume(struct device *dev)
> +{
> + struct mvneta_bm *priv = dev_get_drvdata(dev);
> + int i, err;
> +
> + err = clk_prepare_enable(priv->clk);
> + if (err)
> + return err;
> +
> + /* Reinitialize BM hardware; pools are refilled by mvneta_resume() */
> + mvneta_bm_default_set(priv);
> +
> + /* Restore pool registers lost during clock gating */
> + for (i = 0; i < MVNETA_BM_POOLS_NUM; i++) {
> + mvneta_bm_write(priv, MVNETA_BM_POOL_READ_PTR_REG(i), 0);
> + mvneta_bm_write(priv, MVNETA_BM_POOL_WRITE_PTR_REG(i), 0);
> + mvneta_bm_write(priv, MVNETA_BM_POOL_SIZE_REG(i),
> + priv->bm_pools[i].hwbm_pool.size);
> + }
> +
> + mvneta_bm_write(priv, MVNETA_BM_COMMAND_REG, MVNETA_BM_START_MASK);
> + return 0;
> +}
[Medium]
The comment says "Restore pool registers lost during clock gating", but
the loop only restores SIZE_REG, READ_PTR_REG and WRITE_PTR_REG, and
does not touch MVNETA_BM_POOL_BASE_REG (which carries the BPPE physical
address and the per-pool MVNETA_BM_POOL_ENABLE_MASK).
Combined with mvneta_bm_suspend() not clearing the per-pool ENABLE_MASK
before clk_disable_unprepare(), MVNETA_BM_START_MASK is being written
here with each BASE_REG in an unspecified state.
If the IP retains BASE_REG content across clock gating, BM would be
started with stale BPPE phys_addrs (now freed memory) and ENABLE_MASK
set, until mvneta_resume() drives mvneta_bm_pool_create() to overwrite
BASE_REG.
Could resume explicitly clear each MVNETA_BM_POOL_BASE_REG (or at least
its ENABLE_MASK) before writing START_MASK, or could suspend clear the
per-pool ENABLE_MASK before disabling the clock?
prev parent reply other threads:[~2026-06-20 19:00 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-18 14:35 [PATCH v4] net: mvneta_bm: add suspend/resume support to prevent crash after resume Yun Zhou
2026-06-20 19:00 ` Simon Horman [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260620190036.997651-1-horms@kernel.org \
--to=horms@kernel.org \
--cc=andrew+netdev@lunn.ch \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=marcin.s.wojtas@gmail.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=yun.zhou@windriver.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.