Re: [PATCH v6 5/5] dmaengine: fsl-edma: fix use-after-free after dev_pm_domain_detach()

Linux kernel and device drivers for NXP i.MX platforms
 help / color / mirror / Atom feed

From: sashiko-bot@kernel.org
To: joy.zou@oss.nxp.com
Cc: imx@lists.linux.dev, dmaengine@vger.kernel.org,
	Frank.Li@kernel.org, vkoul@kernel.org
Subject: Re: [PATCH v6 5/5] dmaengine: fsl-edma: fix use-after-free after dev_pm_domain_detach()
Date: Wed, 01 Jul 2026 09:44:36 +0000	[thread overview]
Message-ID: <20260701094436.B87531F000E9@smtp.kernel.org> (raw)
In-Reply-To: <20260701-b4-edma-runtime-opt-v6-5-354ff4229c00@oss.nxp.com>

Thank you for your contribution! Sashiko AI review found 6 potential issue(s) to consider:

New issues:
- [Medium] The fix moves a meaningless `pm_runtime_set_suspended()` call instead of deleting it, creating dead code and an API misuse.

Pre-existing issues:
- [High] Use-after-free of device link pointer in the probe error path due to incorrect devres unroll ordering.
- [High] Missing PM runtime checks in `fsl_edma3_err_handler_per_chan` and `fsl_edma3_err_handler_shared` interrupt handlers.
- [High] The `fsl_edma_chan_runtime_suspend()` callback synchronizes the global engine IRQs instead of the shared per-channel IRQs.
- [High] Runtime PM usage counter leak on the probe error path.
- [High] Unacknowledged hardware interrupts cause an IRQ storm and system lockup if a channel is suspended.
--

commit be8ea96b6aa69c23219ef38987b11d3332453c91
Author: Joy Zou <joy.zou@nxp.com>

dmaengine: fsl-edma: fix use-after-free after dev_pm_domain_detach()

Remove pm_runtime_set_suspended() call after dev_pm_domain_detach()
to prevent use-after-free. When a power domain is attached via
dev_pm_domain_attach_by_id(), calling dev_pm_domain_detach()
unregisters and frees the underlying virtual device, making
fsl_chan->pd_dev a dangling pointer.

Accessing the freed pointer in pm_runtime_set_suspended() triggers
undefined behavior and potential crashes.

Fixes: ccfa3131d4a0 ("dmaengine: fsl-edma: implement the cleanup path of fsl_edma3_attach_pd()")

> diff --git a/drivers/dma/fsl-edma-main.c b/drivers/dma/fsl-edma-main.c
> index 3518dfb4292d2..266cc082a9f04 100644
> --- a/drivers/dma/fsl-edma-main.c
> +++ b/drivers/dma/fsl-edma-main.c
> @@ -671,8 +671,8 @@ static void fsl_edma3_detach_pd(struct fsl_edma_engine *fsl_edma)
>  			fsl_chan->pd_dev_link = NULL;
>  		}

[Severity: High]
This isn't a bug introduced by this patch, but does this attempt to delete
a device link that has already been purged and freed by the dmaengine devres
rollback?

If fsl_edma_probe() fails after successfully calling device_link_add(), the
devres unwinding triggers. First, dmaenginem_async_device_unregister()
executes, which unregisters chan_dev. 

The driver core automatically purges and frees all device links associated
with this device, freeing fsl_chan->pd_dev_link. Next in the devres stack,
devm_fsl_edma3_detach_pd() executes and calls
device_link_del(fsl_chan->pd_dev_link), which could result in a
use-after-free.

>  		if (fsl_chan->pd_dev) {
> -			dev_pm_domain_detach(fsl_chan->pd_dev, false);
>  			pm_runtime_set_suspended(fsl_chan->pd_dev);
> +			dev_pm_domain_detach(fsl_chan->pd_dev, false);

[Severity: Medium]
The commit message states "Remove pm_runtime_set_suspended() call after
dev_pm_domain_detach()", but the call was moved to occur before detach rather
than being completely deleted.

Since fsl_chan->pd_dev is a virtual device with disable_depth == 0, doesn't
pm_runtime_set_suspended() immediately return -EAGAIN and do nothing?

Should this call be completely removed to avoid dead code and API misuse?

>  		}
>  	}
>  }

While reviewing the file, I noticed a few additional pre-existing issues.

In drivers/dma/fsl-edma-main.c:fsl_edma3_err_handler_per_chan():

	fsl_edma3_err_check(fsl_chan);

[Severity: High]
This is a pre-existing issue, but does the per-channel error handler lack the
necessary pm_runtime_get_if_active() check?

If a shared interrupt triggers or a spurious interrupt fires while the eDMA
hardware is suspended, these handlers invoke fsl_edma3_err_check(), which
performs direct MMIO register reads. Accessing registers on a suspended device
with its clock disabled will crash the system.

A similar issue appears in fsl_edma3_err_handler_shared():

		fsl_edma3_err_check(&fsl_edma->chans[ch]);

[Severity: High]
This is a pre-existing issue, but the shared error handler accesses all
channel registers without verifying their PM runtime state.

In drivers/dma/fsl-edma-main.c:fsl_edma_chan_runtime_suspend():

	if (fsl_edma->txirq)
		synchronize_irq(fsl_edma->txirq);

	if (fsl_edma->errirq)
		synchronize_irq(fsl_edma->errirq);

[Severity: High]
This is a pre-existing issue, but does the runtime suspend callback omit the
synchronize_irq() calls for the actual IRQF_SHARED IRQs requested for the
specific channel?

In fsl_edma_alloc_chan_resources(), the driver requests per-channel IRQs
using the IRQF_SHARED flag. However, fsl_edma_chan_runtime_suspend()
synchronizes the global engine IRQs instead of the channel's IRQs. Could a
per-channel IRQ handler execute concurrently while the channel hardware
clock/power is being disabled?

In drivers/dma/fsl-edma-main.c:fsl_edma_probe():

	ret = fsl_edma->drvdata->setup_irq(pdev, fsl_edma);
	if (ret)
		return ret;

[Severity: High]
This is a pre-existing issue, but does this early return leak the PM usage
counter of pdev->dev and any already resumed channels?

During probe, the driver calls pm_runtime_resume_and_get(&pdev->dev) and
pm_runtime_resume_and_get(chan_dev), incrementing their usage counters. If an
error occurs subsequently, the function returns directly without dropping the
counters, preventing the hardware from suspending upon subsequent unbind or
retry.

In drivers/dma/fsl-edma-main.c:fsl_edma_tx_handler():

		if (pm_runtime_get_if_active(&fsl_chan->vchan.chan.dev->device) <= 0)
			continue;

		if (intr & (0x1 << ch)) {
			edma_writeb(fsl_edma, EDMA_CINT_CINT(ch), regs->cint);

[Severity: High]
This is a pre-existing issue, but if the channel's PM reference cannot be
acquired, does the loop skip clearing the completion interrupt and leave the
hardware IRQ asserted?

If a channel with a pending interrupt is suspended or suspending,
pm_runtime_get_if_active() returns <= 0 and the handler skips the channel
using continue. Because the hardware interrupt source is never cleared, but
the handler returns IRQ_HANDLED, the interrupt controller will immediately
re-trigger the ISR, potentially causing an infinite IRQ storm and CPU lockup.

This also appears in fsl_edma_err_handler():

		if (pm_runtime_get_if_active(&fsl_chan->vchan.chan.dev->device) <= 0)
			continue;

		if (err & (0x1 << ch)) {
			fsl_edma_disable_request(&fsl_edma->chans[ch]);
			edma_writeb(fsl_edma, EDMA_CERR_CERR(ch), regs->cerr);

[Severity: High]
This is a pre-existing issue, but it appears to skip clearing the error
interrupt here under the same conditions.

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260701-b4-edma-runtime-opt-v6-0-354ff4229c00@oss.nxp.com?part=5

next prev parent reply	other threads:[~2026-07-01  9:44 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-07-01  9:29 [PATCH v6 0/5] add runtime suspend/resume support joy.zou
2026-07-01  9:29 ` [PATCH v6 1/5] dmaengine: fsl-edma: use devm_clk_get_optional() for channel clock joy.zou
2026-07-01  9:38   ` sashiko-bot
2026-07-01  9:29 ` [PATCH v6 2/5] dmaengine: fsl-edma: use devm_clk_get_optional() for DMA engine clock joy.zou
2026-07-01  9:40   ` sashiko-bot
2026-07-01  9:29 ` [PATCH v6 3/5] dmaengine: fsl-edma: convert DMAMUX clock handling to bulk clock API joy.zou
2026-07-01  9:39   ` sashiko-bot
2026-07-01  9:29 ` [PATCH v6 4/5] dmaengine: fsl-edma: add runtime suspend/resume support joy.zou
2026-07-01  9:43   ` sashiko-bot
2026-07-01 14:50     ` Frank Li
2026-07-01  9:29 ` [PATCH v6 5/5] dmaengine: fsl-edma: fix use-after-free after dev_pm_domain_detach() joy.zou
2026-07-01  9:44   ` sashiko-bot [this message]
2026-07-01 14:47   ` Frank Li

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260701094436.B87531F000E9@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=Frank.Li@kernel.org \
    --cc=dmaengine@vger.kernel.org \
    --cc=imx@lists.linux.dev \
    --cc=joy.zou@oss.nxp.com \
    --cc=sashiko-reviews@lists.linux.dev \
    --cc=vkoul@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox