From: Frank Li <Frank.li@oss.nxp.com>
To: sashiko-reviews@lists.linux.dev
Cc: joy.zou@oss.nxp.com, Frank.Li@kernel.org, imx@lists.linux.dev,
vkoul@kernel.org, dmaengine@vger.kernel.org
Subject: Re: [PATCH v6 4/5] dmaengine: fsl-edma: add runtime suspend/resume support
Date: Wed, 1 Jul 2026 09:50:34 -0500
Message-ID: <akUpOso_-leCZeJu@SMW015318>
In-Reply-To: <20260701094312.B0EE61F000E9@smtp.kernel.org>

On Wed, Jul 01, 2026 at 09:43:12AM +0000, sashiko-bot@kernel.org wrote:
> Thank you for your contribution! Sashiko AI review found 5 potential issue(s) to consider:
> - [High] Skipping hardware interrupt acknowledgment for inactive PM channels leads to an unrecoverable infinite IRQ storm.
> - [High] The channel runtime suspend callback calls `synchronize_irq()` with a negative error code and fails to synchronize actual per-channel shared IRQs.
> - [High] Accessing a freed `device_link` in `fsl_edma3_detach_pd()` due to devres teardown order inversion.
> - [Medium] The DMA engine and its channels are exposed to clients before their IRQs and runtime PM domains are fully initialized.
> - [Medium] Runtime PM usage counters for the engine device and channel power domains are permanently leaked on multiple error paths during probe.
> --
>
> commit a5800ff547600d938eb443a21684d6291b4658d1
> Author: Joy Zou <joy.zou@nxp.com>
>
> dmaengine: fsl-edma: add runtime suspend/resume support
>
> Introduce runtime suspend and resume support for FSL eDMA. Enable
> per-channel power domain management to facilitate runtime suspend and
> resume operations.
>
> > diff --git a/drivers/dma/fsl-edma-main.c b/drivers/dma/fsl-edma-main.c
> > index fe02b68d75fd3..3518dfb4292d2 100644
> > --- a/drivers/dma/fsl-edma-main.c
> > +++ b/drivers/dma/fsl-edma-main.c
> > @@ -37,15 +37,27 @@ static irqreturn_t fsl_edma_tx_handler(int irq, void *dev_id)
> >  	unsigned int intr, ch;
> >  	struct edma_regs *regs = &fsl_edma->regs;
> >
> > +	if (pm_runtime_get_if_active(fsl_edma->dma_dev.dev) <= 0)
> > +		return IRQ_NONE;
> > +
> >  	intr = edma_readl(fsl_edma, regs->intl);
> > -	if (!intr)
> > +	if (!intr) {
> > +		pm_runtime_put(fsl_edma->dma_dev.dev);
> >  		return IRQ_NONE;
> > +	}
> >
> > +	pm_runtime_put(fsl_edma->dma_dev.dev);
> >  	for (ch = 0; ch < fsl_edma->n_chans; ch++) {
> > +		struct fsl_edma_chan *fsl_chan = &fsl_edma->chans[ch];
> > +
> > +		if (pm_runtime_get_if_active(&fsl_chan->vchan.chan.dev->device) <= 0)
> > +			continue;
> > +
> >  		if (intr & (0x1 << ch)) {
> >  			edma_writeb(fsl_edma, EDMA_CINT_CINT(ch), regs->cint);
> >  			fsl_edma_tx_chan_handler(&fsl_edma->chans[ch]);
> >  		}
> > +		pm_runtime_put(&fsl_chan->vchan.chan.dev->device);
> >  	}
> >  	return IRQ_HANDLED;
> > }
>
> [Severity: High]
> If a channel is runtime suspended (e.g., due to a hardware race or a spurious
> interrupt), the interrupt bit is never cleared because the loop evaluates
> pm_runtime_get_if_active() <= 0 and continues to the next iteration.

Makes sense. Please also check the other AI review results.

Frank

>
> However, the function ultimately returns IRQ_HANDLED unconditionally.
> Could returning IRQ_HANDLED without clearing the hardware interrupt flag
> trick the interrupt controller into thinking the event was serviced,
> potentially leading to an infinite IRQ storm?
>
> This same pattern appears in fsl_edma_err_handler() and
> fsl_edma3_or_err_handler().
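
A minimal user-space sketch of the concern above, not driver code. All
names (model_edma, model_tx_handler, MODEL_IRQ_*) are hypothetical
stand-ins. It models one possible shape of the loop: a pending bit is
always acknowledged, completion work runs only for active channels, and
the handler reports "handled" only if it actually cleared something.
Whether the real acknowledge register can be written while a channel is
runtime suspended is a separate question this model does not answer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's irqreturn_t values. */
enum model_irqreturn { MODEL_IRQ_NONE = 0, MODEL_IRQ_HANDLED = 1 };

#define MODEL_N_CHANS 4

struct model_edma {
	uint32_t pending;                      /* models the interrupt status register */
	bool chan_active[MODEL_N_CHANS];       /* models "channel is runtime-active" */
	unsigned int completed[MODEL_N_CHANS]; /* completion work done per channel */
};

static enum model_irqreturn model_tx_handler(struct model_edma *e)
{
	enum model_irqreturn ret = MODEL_IRQ_NONE;
	uint32_t intr = e->pending;
	unsigned int ch;

	for (ch = 0; ch < MODEL_N_CHANS; ch++) {
		if (!(intr & (UINT32_C(1) << ch)))
			continue;

		/* Always clear the status bit so the line can deassert. */
		e->pending &= ~(UINT32_C(1) << ch);
		ret = MODEL_IRQ_HANDLED;

		/* Only run completion work for channels that are active. */
		if (e->chan_active[ch])
			e->completed[ch]++;
	}

	/* IRQ_NONE is reported when no status bit was cleared. */
	return ret;
}
```

In this model a pending bit on an inactive channel is cleared instead of
being skipped, so a second invocation sees nothing pending and returns
MODEL_IRQ_NONE rather than claiming the interrupt again.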
>
> [ ... ]
>
> > @@ -691,6 +711,36 @@ static int fsl_edma3_attach_pd(struct platform_device *pdev, struct fsl_edma_eng
> >  	return -EINVAL;
> >  }
> >
> > +/* Per channel dma power domain */
> > +static int fsl_edma_chan_runtime_suspend(struct device *dev)
> > +{
> > +	struct fsl_edma_chan *fsl_chan = dev_get_drvdata(dev);
> > +	struct fsl_edma_engine *fsl_edma = fsl_chan->edma;
> > +
> > +	if (fsl_edma->txirq)
> > +		synchronize_irq(fsl_edma->txirq);
> > +
> > +	if (fsl_edma->errirq)
> > +		synchronize_irq(fsl_edma->errirq);
> > +
> > +	clk_disable_unprepare(fsl_chan->clk);
> > +
> > +	return 0;
> > +}
>
> [Severity: High]
> Are we synchronizing the correct IRQs before disabling the channel clock?
>
> During probe, global IRQs like fsl_edma->txirq are initialized to -EINVAL
> for SoCs using per-channel IRQs. This check evaluates to true because -EINVAL
> is non-zero, causing a negative error code to be passed to synchronize_irq().
>
> Additionally, this code doesn't synchronize the channel-specific shared
> IRQ (fsl_chan->txirq). Could this allow a shared IRQ handler to execute
> and access hardware after the clock is disabled?
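
A small user-space sketch of the check discussed above, not driver code.
The struct and function names are hypothetical. It assumes, as the review
states, that unused global IRQ fields hold -EINVAL, and it treats only
positive values as valid IRQ numbers. It also includes the per-channel
IRQ in the set to synchronize.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the IRQ numbers relevant to one channel. */
struct model_chan_irqs {
	int engine_txirq;  /* global TX IRQ, or -EINVAL if unused */
	int engine_errirq; /* global error IRQ, or -EINVAL if unused */
	int chan_txirq;    /* per-channel IRQ, or -EINVAL if unused */
};

/* The patch's style of check: true for any non-zero value, including -EINVAL. */
static bool model_irq_nonzero(int irq)
{
	return irq != 0;
}

/* Stricter check: only positive values count as valid IRQ numbers. */
static bool model_irq_valid(int irq)
{
	return irq > 0;
}

/* Collect the IRQs that would be passed to synchronize_irq(); returns the count. */
static size_t model_irqs_to_sync(const struct model_chan_irqs *c, int out[3])
{
	size_t n = 0;

	if (model_irq_valid(c->engine_txirq))
		out[n++] = c->engine_txirq;
	if (model_irq_valid(c->engine_errirq))
		out[n++] = c->engine_errirq;
	if (model_irq_valid(c->chan_txirq))
		out[n++] = c->chan_txirq;

	return n;
}
```

With the global fields set to -EINVAL and a per-channel IRQ present, the
non-zero test would accept -EINVAL, while the stricter test selects only
the per-channel IRQ.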
>
> [ ... ]
>
> > @@ -891,6 +917,64 @@ static int fsl_edma_probe(struct platform_device *pdev)
> >  		return dev_err_probe(&pdev->dev, ret,
> >  				     "Can't register Freescale eDMA engine.\n");
> >
> > +	ret = devm_pm_runtime_enable(&pdev->dev);
> > +	if (ret)
> > +		return dev_err_probe(&pdev->dev, ret, "Can't enable eDMA engine PM runtime!\n");
> > +
> > +	ret = pm_runtime_resume_and_get(&pdev->dev);
> > +	if (ret < 0)
> > +		return dev_err_probe(&pdev->dev, ret, "Failed to resume eDMA engine!\n");
> > +
> > +	ret = fsl_edma->drvdata->setup_irq(pdev, fsl_edma);
> > +	if (ret)
> > +		return ret;
>
> [Severity: Medium]
> By calling dmaenginem_async_device_register() just prior to enabling
> runtime PM and setting up IRQs, do we expose the channels to clients
> prematurely?
>
> If a client allocates a channel concurrently before runtime PM is enabled,
> fsl_edma_alloc_chan_resources() will invoke pm_runtime_resume_and_get(),
> which could fail with -EINVAL.
>
> [Severity: Medium]
> If fsl_edma->drvdata->setup_irq() fails here, we return the error code
> directly. Does this leave the pdev->dev usage counter permanently
> incremented from the pm_runtime_resume_and_get() call just above?
>
> > +	for (i = 0; i < fsl_edma->n_chans; i++) {
> > +		struct fsl_edma_chan *fsl_chan = &fsl_edma->chans[i];
> > +		struct device *chan_dev;
> > +
> > +		if (fsl_edma->chan_masked & BIT(i))
> > +			continue;
> > +
> > +		chan_dev = &fsl_chan->vchan.chan.dev->device;
> > +		dev_set_drvdata(chan_dev, fsl_chan);
> > +		dev_pm_domain_set(chan_dev, &fsl_edma_chan_pm_domain);
> > +
> > +		if (fsl_chan->pd_dev) {
> > +			fsl_chan->pd_dev_link = device_link_add(chan_dev, fsl_chan->pd_dev,
> > +								DL_FLAG_STATELESS |
> > +								DL_FLAG_PM_RUNTIME);
> > +			if (!fsl_chan->pd_dev_link) {
> > +				return dev_err_probe(&pdev->dev, -EINVAL,
> > +						     "Failed to add device_link to %d!\n", i);
> > +			}
> > +		}
>
> [Severity: High]
> Could the teardown order of these stateless device links result in a
> use-after-free on probe failure or driver unbind?
>
> Earlier in probe, devm_fsl_edma3_detach_pd is registered as a devres
> cleanup action before the DMA engine is registered.
>
> However, the stateless device link is added here with the channel device
> (chan_dev) as the consumer. On unbind or probe failure, devres actions
> execute in reverse order:
>
> 1. The DMA engine unregisters, destroying chan_dev.
> 2. The driver core automatically purges and frees all device links attached
> to the deleted device.
> 3. devm_fsl_edma3_detach_pd executes and calls device_link_del() on the
> now-freed fsl_chan->pd_dev_link.
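
The ordering described in the three steps above can be modelled in user
space. This is a sketch, not driver code: model_devres and the two
callbacks are hypothetical, and the only property taken from the review
is that release actions run in reverse registration order.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_ACTIONS 8

typedef void (*model_action_fn)(void *data);

/* Hypothetical LIFO list of cleanup actions, modelling devres. */
struct model_devres {
	model_action_fn fn[MODEL_MAX_ACTIONS];
	void *data[MODEL_MAX_ACTIONS];
	size_t count;
};

static int model_add_action(struct model_devres *dr, model_action_fn fn, void *data)
{
	if (dr->count >= MODEL_MAX_ACTIONS)
		return -1;
	dr->fn[dr->count] = fn;
	dr->data[dr->count] = data;
	dr->count++;
	return 0;
}

/* Run actions newest-first, as on unbind or probe failure. */
static void model_release_all(struct model_devres *dr)
{
	while (dr->count > 0) {
		dr->count--;
		dr->fn[dr->count](dr->data[dr->count]);
	}
}

struct model_state {
	int link_alive; /* 1 while the modelled device link exists */
	int stale_del;  /* set if detach runs after the link is gone */
};

/* Models engine unregistration: destroying the consumer purges its links. */
static void model_unregister_engine(void *data)
{
	struct model_state *s = data;

	s->link_alive = 0;
}

/* Models the detach action: flags the case that would touch a freed link. */
static void model_detach_pd(void *data)
{
	struct model_state *s = data;

	if (!s->link_alive)
		s->stale_del = 1;
}
```

Registering model_detach_pd first and model_unregister_engine second
reproduces the reported sequence: on release, the link is gone before the
detach action looks at it.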
>
> > +		ret = devm_pm_runtime_enable(chan_dev);
> > +		if (ret)
> > +			return dev_err_probe(&pdev->dev, ret,
> > +					     "Can't enable eDMA channel PM runtime!\n");
> > +
> > +		if (fsl_chan->pd_dev) {
> > +			ret = pm_runtime_resume_and_get(fsl_chan->pd_dev);
> > +			if (ret)
> > +				return dev_err_probe(&pdev->dev, ret,
> > +						     "Failed to power on eDMA channel %d!\n",
> > +						     fsl_chan->vchan.chan.chan_id);
> > +		}
> > +
> > +		ret = pm_runtime_resume_and_get(chan_dev);
> > +		if (ret < 0) {
> > +			return dev_err_probe(&pdev->dev, ret,
> > +					     "Failed to resume eDMA channel %d!\n",
> > +					     fsl_chan->vchan.chan.chan_id);
> > +		}
>
> [Severity: Medium]
> If pm_runtime_resume_and_get(chan_dev) fails here, we return directly.
> Does this leak both the engine's usage counter (pdev->dev) and the channel
> power domain's usage counter (fsl_chan->pd_dev) acquired earlier in this
> function?
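
A user-space sketch of the usage-counter leaks raised above, not driver
code. model_dev, model_resume_and_get() and model_probe() are
hypothetical. The model assumes the documented behaviour of
pm_runtime_resume_and_get(), which keeps no reference when it fails, and
shows one conventional way to unwind earlier references with goto labels.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical device with a runtime-PM style usage counter. */
struct model_dev {
	int usage;       /* models the runtime PM usage counter */
	int fail_resume; /* non-zero makes the next resume fail */
};

/* Takes a reference on success; keeps none on failure. */
static int model_resume_and_get(struct model_dev *d)
{
	if (d->fail_resume)
		return -EIO;
	d->usage++;
	return 0;
}

static void model_put(struct model_dev *d)
{
	d->usage--;
}

/* Acquire engine, power domain, then channel; unwind in reverse on error. */
static int model_probe(struct model_dev *engine, struct model_dev *pd,
		       struct model_dev *chan)
{
	int ret;

	ret = model_resume_and_get(engine);
	if (ret < 0)
		return ret;

	ret = model_resume_and_get(pd);
	if (ret < 0)
		goto err_put_engine;

	ret = model_resume_and_get(chan);
	if (ret < 0)
		goto err_put_pd;

	return 0;

err_put_pd:
	model_put(pd);
err_put_engine:
	model_put(engine);
	return ret;
}
```

When the channel resume fails in this model, both earlier counters return
to zero, which is the balance the review asks about.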
>
> --
> Sashiko AI review · https://sashiko.dev/#/patchset/20260701-b4-edma-runtime-opt-v6-0-354ff4229c00@oss.nxp.com?part=4