Re: [PATCH v6 4/5] dmaengine: fsl-edma: add runtime suspend/resume support

From: Frank Li <Frank.li@oss.nxp.com>
To: sashiko-reviews@lists.linux.dev
Cc: joy.zou@oss.nxp.com, Frank.Li@kernel.org, imx@lists.linux.dev,
	vkoul@kernel.org, dmaengine@vger.kernel.org
Subject: Re: [PATCH v6 4/5] dmaengine: fsl-edma: add runtime suspend/resume support
Date: Wed, 1 Jul 2026 09:50:34 -0500
Message-ID: <akUpOso_-leCZeJu@SMW015318>
In-Reply-To: <20260701094312.B0EE61F000E9@smtp.kernel.org>

On Wed, Jul 01, 2026 at 09:43:12AM +0000, sashiko-bot@kernel.org wrote:
> Thank you for your contribution! Sashiko AI review found 5 potential issue(s) to consider:
> - [High] Skipping hardware interrupt acknowledgment for inactive PM channels leads to an unrecoverable infinite IRQ storm.
> - [High] The channel runtime suspend callback calls `synchronize_irq()` with a negative error code and fails to synchronize actual per-channel shared IRQs.
> - [High] Accessing a freed `device_link` in `fsl_edma3_detach_pd()` due to devres teardown order inversion.
> - [Medium] The DMA engine and its channels are exposed to clients before their IRQs and runtime PM domains are fully initialized.
> - [Medium] Runtime PM usage counters for the engine device and channel power domains are permanently leaked on multiple error paths during probe.
> --
>
> commit a5800ff547600d938eb443a21684d6291b4658d1
> Author: Joy Zou <joy.zou@nxp.com>
>
> dmaengine: fsl-edma: add runtime suspend/resume support
>
> Introduce runtime suspend and resume support for FSL eDMA. Enable
> per-channel power domain management to facilitate runtime suspend and
> resume operations.
>
> > diff --git a/drivers/dma/fsl-edma-main.c b/drivers/dma/fsl-edma-main.c
> > index fe02b68d75fd3..3518dfb4292d2 100644
> > --- a/drivers/dma/fsl-edma-main.c
> > +++ b/drivers/dma/fsl-edma-main.c
> > @@ -37,15 +37,27 @@ static irqreturn_t fsl_edma_tx_handler(int irq, void *dev_id)
> >  	unsigned int intr, ch;
> >  	struct edma_regs *regs = &fsl_edma->regs;
> >
> > +	if (pm_runtime_get_if_active(fsl_edma->dma_dev.dev) <= 0)
> > +		return IRQ_NONE;
> > +
> >  	intr = edma_readl(fsl_edma, regs->intl);
> > -	if (!intr)
> > +	if (!intr) {
> > +		pm_runtime_put(fsl_edma->dma_dev.dev);
> >  		return IRQ_NONE;
> > +	}
> >
> > +	pm_runtime_put(fsl_edma->dma_dev.dev);
> >  	for (ch = 0; ch < fsl_edma->n_chans; ch++) {
> > +		struct fsl_edma_chan *fsl_chan = &fsl_edma->chans[ch];
> > +
> > +		if (pm_runtime_get_if_active(&fsl_chan->vchan.chan.dev->device) <= 0)
> > +			continue;
> > +
> >  		if (intr & (0x1 << ch)) {
> >  			edma_writeb(fsl_edma, EDMA_CINT_CINT(ch), regs->cint);
> >  			fsl_edma_tx_chan_handler(&fsl_edma->chans[ch]);
> >  		}
> > +		pm_runtime_put(&fsl_chan->vchan.chan.dev->device);
> >  	}
> >  	return IRQ_HANDLED;
> >  }
>
> [Severity: High]
> If a channel is runtime suspended (e.g., due to a hardware race or a spurious
> interrupt), the interrupt bit is never cleared because the loop evaluates
> pm_runtime_get_if_active() <= 0 and continues to the next iteration.

Makes sense. Please check the other AI review results as well.

Frank

>
> However, the function ultimately returns IRQ_HANDLED unconditionally.
> Could returning IRQ_HANDLED without clearing the hardware interrupt flag
> trick the interrupt controller into thinking the event was serviced,
> potentially leading to an infinite IRQ storm?
>
> This same pattern appears in fsl_edma_err_handler() and
> fsl_edma3_or_err_handler().
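
As a rough illustration of the concern above, here is a userspace model, not
driver code. The struct, the field names and model_tx_handler() are all
hypothetical. It assumes the pending bit can be acknowledged independently of
whether the channel is runtime-active; whether the real hardware allows the
acknowledge while a channel clock is off is not something this sketch covers.

```c
#include <stdbool.h>
#include <stdint.h>

#define N_CHANS 4

/* Hypothetical stand-in for the eDMA interrupt status and channel state. */
struct model_edma {
	uint32_t intr_pending;             /* one bit per channel, set by "hardware" */
	bool chan_active[N_CHANS];         /* models pm_runtime_get_if_active() > 0 */
	unsigned int completions[N_CHANS]; /* counts per-channel handler runs */
};

/* Returns the number of bits acknowledged; 0 corresponds to IRQ_NONE. */
static unsigned int model_tx_handler(struct model_edma *e)
{
	unsigned int ch, acked = 0;

	for (ch = 0; ch < N_CHANS; ch++) {
		if (!(e->intr_pending & (1u << ch)))
			continue;

		/* Always clear the pending bit, even for a suspended channel. */
		e->intr_pending &= ~(1u << ch);
		acked++;

		/* Only touch per-channel state when the channel is active. */
		if (e->chan_active[ch])
			e->completions[ch]++;
	}

	return acked;
}
```

With this ordering no pending bit survives the handler, so returning
"handled" matches what was actually done to the status word.
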
>
> [ ... ]
>
> > @@ -691,6 +711,36 @@ static int fsl_edma3_attach_pd(struct platform_device *pdev, struct fsl_edma_eng
> >  	return -EINVAL;
> >  }
> >
> > +/* Per channel dma power domain */
> > +static int fsl_edma_chan_runtime_suspend(struct device *dev)
> > +{
> > +	struct fsl_edma_chan *fsl_chan = dev_get_drvdata(dev);
> > +	struct fsl_edma_engine *fsl_edma = fsl_chan->edma;
> > +
> > +	if (fsl_edma->txirq)
> > +		synchronize_irq(fsl_edma->txirq);
> > +
> > +	if (fsl_edma->errirq)
> > +		synchronize_irq(fsl_edma->errirq);
> > +
> > +	clk_disable_unprepare(fsl_chan->clk);
> > +
> > +	return 0;
> > +}
>
> [Severity: High]
> Are we synchronizing the correct IRQs before disabling the channel clock?
>
> During probe, global IRQs like fsl_edma->txirq are initialized to -EINVAL
> for SoCs using per-channel IRQs. This check evaluates to true because -EINVAL
> is non-zero, causing a negative error code to be passed to synchronize_irq().
>
> Additionally, this code doesn't synchronize the channel-specific shared
> IRQ (fsl_chan->txirq). Could this allow a shared IRQ handler to execute
> and access hardware after the clock is disabled?
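
To make the truthiness point concrete, a small userspace sketch follows. Both
helper names are hypothetical, and the "irq > 0" convention is an assumption
about how a valid IRQ number would be recognized here, not something taken
from the patch.

```c
#include <errno.h>
#include <stdbool.h>

/* The style of check in the quoted hunk: any non-zero value passes. */
static bool model_irq_nonzero(int irq)
{
	return irq != 0;
}

/* Hypothetical stricter check: only positive values count as an IRQ. */
static bool model_irq_is_valid(int irq)
{
	return irq > 0;
}
```

model_irq_nonzero(-EINVAL) is true, so a placeholder error value would be
treated as an IRQ number, while model_irq_is_valid(-EINVAL) is false.
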
>
> [ ... ]
>
> > @@ -891,6 +917,64 @@ static int fsl_edma_probe(struct platform_device *pdev)
> >  		return dev_err_probe(&pdev->dev, ret,
> >  				     "Can't register Freescale eDMA engine.\n");
> >
> > +	ret = devm_pm_runtime_enable(&pdev->dev);
> > +	if (ret)
> > +		return dev_err_probe(&pdev->dev, ret, "Can't enable eDMA engine PM runtime!\n");
> > +
> > +	ret = pm_runtime_resume_and_get(&pdev->dev);
> > +	if (ret < 0)
> > +		return dev_err_probe(&pdev->dev, ret, "Failed to resume eDMA engine!\n");
> > +
> > +	ret = fsl_edma->drvdata->setup_irq(pdev, fsl_edma);
> > +	if (ret)
> > +		return ret;
>
> [Severity: Medium]
> By calling dmaenginem_async_device_register() just prior to enabling
> runtime PM and setting up IRQs, do we expose the channels to clients
> prematurely?
>
> If a client allocates a channel concurrently before runtime PM is enabled,
> fsl_edma_alloc_chan_resources() will invoke pm_runtime_resume_and_get(),
> which could fail with -EINVAL.
>
> [Severity: Medium]
> If fsl_edma->drvdata->setup_irq() fails here, we return the error code
> directly. Does this leave the pdev->dev usage counter permanently
> incremented from the pm_runtime_resume_and_get() call just above?
>
> > +	for (i = 0; i < fsl_edma->n_chans; i++) {
> > +		struct fsl_edma_chan *fsl_chan = &fsl_edma->chans[i];
> > +		struct device *chan_dev;
> > +
> > +		if (fsl_edma->chan_masked & BIT(i))
> > +			continue;
> > +
> > +		chan_dev = &fsl_chan->vchan.chan.dev->device;
> > +		dev_set_drvdata(chan_dev, fsl_chan);
> > +		dev_pm_domain_set(chan_dev, &fsl_edma_chan_pm_domain);
> > +
> > +		if (fsl_chan->pd_dev) {
> > +			fsl_chan->pd_dev_link = device_link_add(chan_dev, fsl_chan->pd_dev,
> > +								DL_FLAG_STATELESS |
> > +								DL_FLAG_PM_RUNTIME);
> > +			if (!fsl_chan->pd_dev_link) {
> > +				return dev_err_probe(&pdev->dev, -EINVAL,
> > +						     "Failed to add device_link to %d!\n", i);
> > +			}
> > +		}
>
> [Severity: High]
> Could the teardown order of these stateless device links result in a
> use-after-free on probe failure or driver unbind?
>
> Earlier in probe, devm_fsl_edma3_detach_pd is registered as a devres
> cleanup action before the DMA engine is registered.
>
> However, the stateless device link is added here with the channel device
> (chan_dev) as the consumer. On unbind or probe failure, devres actions
> execute in reverse order:
>
> 1. The DMA engine unregisters, destroying chan_dev.
> 2. The driver core automatically purges and frees all device links attached
>    to the deleted device.
> 3. devm_fsl_edma3_detach_pd executes and calls device_link_del() on the
>    now-freed fsl_chan->pd_dev_link.
>
> > +		ret = devm_pm_runtime_enable(chan_dev);
> > +		if (ret)
> > +			return dev_err_probe(&pdev->dev, ret,
> > +					     "Can't enable eDMA channel PM runtime!\n");
> > +
> > +		if (fsl_chan->pd_dev) {
> > +			ret = pm_runtime_resume_and_get(fsl_chan->pd_dev);
> > +			if (ret)
> > +				return dev_err_probe(&pdev->dev, ret,
> > +						     "Failed to power on eDMA channel %d!\n",
> > +						     fsl_chan->vchan.chan.chan_id);
> > +		}
> > +
> > +		ret = pm_runtime_resume_and_get(chan_dev);
> > +		if (ret < 0) {
> > +			return dev_err_probe(&pdev->dev, ret,
> > +					     "Failed to resume eDMA channel %d!\n",
> > +					     fsl_chan->vchan.chan.chan_id);
> > +		}
>
> [Severity: Medium]
> If pm_runtime_resume_and_get(chan_dev) fails here, we return directly.
> Does this leak both the engine's usage counter (pdev->dev) and the channel
> power domain's usage counter (fsl_chan->pd_dev) acquired earlier in this
> function?
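
One common way to keep such counters balanced is a goto-based unwind. The
following is a userspace model only. struct model_dev, model_get() and
model_put() are hypothetical stand-ins for the runtime PM calls, with
model_get() leaving the counter untouched on failure, which is how
pm_runtime_resume_and_get() behaves. It is a sketch of the pattern, not a
proposed fix for this driver.

```c
#include <errno.h>

/* Hypothetical stand-in for a device with a runtime PM usage counter. */
struct model_dev {
	int usage;
};

/* Takes a reference on success; on failure the counter is left unchanged. */
static int model_get(struct model_dev *d, int fail)
{
	if (fail)
		return -EIO;
	d->usage++;
	return 0;
}

static void model_put(struct model_dev *d)
{
	d->usage--;
}

/* Acquire engine, power domain and channel; release in reverse on error. */
static int model_probe_chan(struct model_dev *engine, struct model_dev *pd,
			    struct model_dev *chan, int fail_chan)
{
	int ret;

	ret = model_get(engine, 0);
	if (ret)
		return ret;

	ret = model_get(pd, 0);
	if (ret)
		goto err_put_engine;

	ret = model_get(chan, fail_chan);
	if (ret)
		goto err_put_pd;

	return 0;

err_put_pd:
	model_put(pd);
err_put_engine:
	model_put(engine);
	return ret;
}
```

When the last step fails, both earlier references are dropped before the
error is returned, so every counter ends where it started.
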
>
> --
> Sashiko AI review · https://sashiko.dev/#/patchset/20260701-b4-edma-runtime-opt-v6-0-354ff4229c00@oss.nxp.com?part=4
