DMA Engine development
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: "Joy Zou" <joy.zou@nxp.com>
Cc: vkoul@kernel.org, dmaengine@vger.kernel.org, Frank.Li@kernel.org,
	imx@lists.linux.dev
Subject: Re: [PATCH v5 4/4] dmaengine: fsl-edma: add runtime suspend/resume support
Date: Thu, 14 May 2026 05:55:09 +0000	[thread overview]
Message-ID: <20260514055509.A379BC2BCB7@smtp.kernel.org> (raw)
In-Reply-To: <20260513-b4-b4-edma-runtime-opt-v5-4-1e595bfb8423@nxp.com>

Thank you for your contribution! Sashiko AI review found 9 potential issue(s) to consider:
- [Critical] Double-free and Use-After-Free in the probe error path if device_link_add() fails.
- [Critical] Use-After-Free during driver unbind due to manual device link deletion after core purging.
- [Critical] Pre-existing Use-After-Free vulnerability in `fsl_edma3_detach_pd`.
- [Critical] Unclocked register access in shared IRQ handlers causing system crashes.
- [High] Initialization race condition exposing uninitialized DMA channels to consumers.
- [High] System hang risk due to enabling channel clocks before their corresponding power domains are resumed.
- [High] Unbalanced clock disable and CCF underflow on driver removal.
- [High] Hardware remains inaccessible, crashing the system if CONFIG_PM is disabled.
- [High] Unclocked register access during system resume (`resume_early`).
--

commit 29a9c658afa0aacc5ffd271f1b4ac1fc006b0536
Author: Joy Zou <joy.zou@nxp.com>

dmaengine: fsl-edma: add runtime suspend/resume support

Introduce runtime suspend and resume support for FSL eDMA. Enable
per-channel power domain management to facilitate runtime suspend and
resume operations.

> diff --git a/drivers/dma/fsl-edma-main.c b/drivers/dma/fsl-edma-main.c
> index c12126ea6552d..9446a0c3bc576 100644
> --- a/drivers/dma/fsl-edma-main.c
> +++ b/drivers/dma/fsl-edma-main.c
> @@ -629,7 +629,6 @@ static void fsl_edma3_detach_pd(struct fsl_edma_engine *fsl_edma)
>  			device_link_del(fsl_chan->pd_dev_link);

During driver unbind, the devres cleanups are executed in reverse order.
The DMA engine and its channel devices are unregistered first via
dmaenginem_async_device_unregister(), which triggers device_links_purge()
to delete all links where the channel is a consumer. 

When the devres action later calls fsl_edma3_detach_pd() and reaches this
device_link_del(), will it cause a use-after-free by attempting to delete
a link that has already been purged by the driver core?

>  		if (fsl_chan->pd_dev) {
>  			dev_pm_domain_detach(fsl_chan->pd_dev, false);
> -			pm_runtime_dont_use_autosuspend(fsl_chan->pd_dev);
>  			pm_runtime_set_suspended(fsl_chan->pd_dev);

This issue pre-dates the patch, but could this cause a use-after-free?

For power domains attached via dev_pm_domain_attach_by_id(), calling
dev_pm_domain_detach() unregisters and frees the underlying virtual device.
Passing fsl_chan->pd_dev to pm_runtime_set_suspended() immediately after
appears to use a freed pointer.

[ ... ]

> @@ -684,6 +668,29 @@ static int fsl_edma3_attach_pd(struct platform_device *pdev, struct fsl_edma_eng
>  	return -EINVAL;
>  }
>  
> +/* Per channel dma power domain */
> +static int fsl_edma_chan_runtime_suspend(struct device *dev)
> +{
> +	struct fsl_edma_chan *fsl_chan = dev_get_drvdata(dev);
> +
> +	clk_disable_unprepare(fsl_chan->clk);
> +
> +	return 0;
> +}

Does this function need to call synchronize_irq() before disabling the clock?

Since the driver uses shared interrupts (IRQF_SHARED), if another device on
the same IRQ line triggers an interrupt while the channel is suspended,
the eDMA interrupt handler might attempt to read hardware registers while the
clock is disabled.

Should the interrupt handlers (like fsl_edma3_tx_handler) also use
pm_runtime_get_if_active() to ensure the device is powered before checking
status registers?

[ ... ]

> @@ -869,6 +872,51 @@ static int fsl_edma_probe(struct platform_device *pdev)
>  		return dev_err_probe(&pdev->dev, ret,
>  				     "Can't register Freescale eDMA engine.\n");

Is there a race condition here?

By calling dmaenginem_async_device_register() before setting up the runtime
PM domains and hardware registers below, clients can immediately request a
channel.

If a client allocates a channel concurrently, it could call
fsl_edma_alloc_chan_resources() before devm_pm_runtime_enable() is called,
or race with the unconditional clk_disable_unprepare() and edma_write_tcdreg()
in the loop below, potentially corrupting hardware state.

> +	ret = devm_pm_runtime_enable(&pdev->dev);
> +	if (ret)
> +		return dev_err_probe(&pdev->dev, ret, "Can't enable eDMA engine PM runtime!");
> +
> +	for (i = 0; i < fsl_edma->n_chans; i++) {
> +		struct fsl_edma_chan *fsl_chan = &fsl_edma->chans[i];
> +		struct device *chan_dev;
> +
> +		if (fsl_edma->chan_masked & BIT(i))
> +			continue;
> +
> +		chan_dev = &fsl_chan->vchan.chan.dev->device;
> +		dev_set_drvdata(chan_dev, fsl_chan);
> +		dev_pm_domain_set(chan_dev, &fsl_edma_chan_pm_domain);
> +
> +		if (fsl_chan->pd_dev) {
> +			fsl_chan->pd_dev_link = device_link_add(chan_dev, fsl_chan->pd_dev,
> +								DL_FLAG_STATELESS |
> +								DL_FLAG_PM_RUNTIME);
> +			if (!fsl_chan->pd_dev_link) {
> +				fsl_edma3_detach_pd(fsl_edma);
> +				return dev_err_probe(&pdev->dev, -EINVAL,
> +						     "Failed to add device_link to %d\n", i);

If device_link_add() fails, fsl_edma3_detach_pd() is called manually to detach
domains and delete links, but it doesn't clear the pointers.

Since devm_fsl_edma3_detach_pd() was already registered as a devres action
earlier in probe, returning an error here will cause the devres framework to
call fsl_edma3_detach_pd() a second time. Could this lead to a double-free
of the device link and double-detach of the PM domain?

> +			}
> +		}
> +		ret = devm_pm_runtime_enable(chan_dev);
> +		if (ret)
> +			return dev_err_probe(&pdev->dev, ret,
> +					     "Can't enable eDMA channel PM runtime!");
> +
> +		if (fsl_chan->pd_dev) {
> +			ret = pm_runtime_resume_and_get(fsl_chan->pd_dev);

Since the earlier loop in probe acquires the channel clocks with
devm_clk_get_optional_enabled(), the clocks are already enabled before we
reach this point.

Because the previous DL_FLAG_RPM_ACTIVE flag was removed from the device link,
does enabling the channel clock before its parent power domain is resumed
risk an asynchronous external abort on NXP SoCs?

> +			if (ret)
> +				return dev_err_probe(&pdev->dev, ret,
> +						     "Failed to power on eDMA channel %d\n",
> +						     fsl_chan->vchan.chan.chan_id);
> +		}
> +
> +		edma_write_tcdreg(fsl_chan, cpu_to_le32(0), csr);
> +		fsl_edma_chan_mux(fsl_chan, 0, false);
> +		clk_disable_unprepare(fsl_chan->clk);

If the kernel is compiled with CONFIG_PM=n, pm_runtime_resume_and_get()
evaluates to 1 (success) but does not call the runtime PM resume callbacks.

By unconditionally disabling the clocks here, do they remain permanently
disabled when CONFIG_PM=n, leading to bus faults on subsequent DMA requests?

> +		if (fsl_chan->pd_dev)
> +			pm_runtime_put_sync_suspend(fsl_chan->pd_dev);
> +	}
> +
>  	ret = devm_of_dma_controller_register(&pdev->dev, np,
>  			drvdata->dmamuxs ? fsl_edma_xlate : fsl_edma3_xlate,
>  			fsl_edma);
> @@ -880,6 +928,9 @@ static int fsl_edma_probe(struct platform_device *pdev)
>  	if (!(drvdata->flags & FSL_EDMA_DRV_SPLIT_REG))
>  		edma_writel(fsl_edma, EDMA_CR_ERGA | EDMA_CR_ERCA, regs->cr);
>  
> +	clk_disable_unprepare(fsl_edma->dmaclk);
> +	clk_bulk_disable_unprepare(fsl_edma->drvdata->dmamuxs, fsl_edma->muxclk);

These clocks were acquired using devm_clk_get_optional_enabled() earlier in
probe.

If they are manually disabled here, and the device is suspended during
driver unbind, won't the devres cleanup attempt to disable them again,
causing a Common Clock Framework underflow warning?

Similarly, if CONFIG_PM=n, these engine clocks will never be re-enabled.

[ ... ]

> @@ -928,6 +992,13 @@ static int fsl_edma_resume_early(struct device *dev)
>  		fsl_chan = &fsl_edma->chans[i];
>  		if (fsl_edma->chan_masked & BIT(i))
>  			continue;
> +
> +		if (pm_runtime_status_suspended(&fsl_chan->vchan.chan.dev->device) ||
> +		    (!(fsl_edma->drvdata->flags & FSL_EDMA_DRV_HAS_PD) &&
> +		     (fsl_edma->drvdata->flags & FSL_EDMA_DRV_SPLIT_REG) &&
> +		     !fsl_chan->srcid))
> +			continue;
> +
>  		fsl_chan->pm_state = RUNNING;
>  		edma_write_tcdreg(fsl_chan, 0, csr);
>  		if (fsl_chan->srcid != 0)
> @@ -940,6 +1011,35 @@ static int fsl_edma_resume_early(struct device *dev)
>  	return 0;
>  }

Before returning, fsl_edma_resume_early() writes to regs->cr:

    if (!(fsl_edma->drvdata->flags & FSL_EDMA_DRV_SPLIT_REG))
        edma_writel(fsl_edma, EDMA_CR_ERGA | EDMA_CR_ERCA, regs->cr);

If all channels were idle prior to system suspend, the eDMA engine is in a
runtime-suspended state and dmaclk is gated. Since the PM core doesn't
automatically wake runtime-suspended devices during resume_early, could
writing to regs->cr while dmaclk is gated trigger a bus fault?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260513-b4-b4-edma-runtime-opt-v5-0-1e595bfb8423@nxp.com?part=4

      reply	other threads:[~2026-05-14  5:55 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-13 11:23 [PATCH v5 0/4] add runtime suspend/resume support Joy Zou
2026-05-13 11:23 ` [PATCH v5 1/4] dmaengine: fsl-edma: use devm_clk_get_optional_enabled() for channel clock Joy Zou
2026-05-13 14:51   ` Frank Li
2026-05-13 11:23 ` [PATCH v5 2/4] dmaengine: fsl-edma: use devm_clk_get_optional_enabled() for DMA engine clock Joy Zou
2026-05-13 14:53   ` Frank Li
2026-05-14  4:55   ` sashiko-bot
2026-05-13 11:23 ` [PATCH v5 3/4] dmaengine: fsl-edma: convert DMAMUX clock handling to bulk clock API Joy Zou
2026-05-13 14:54   ` Frank Li
2026-05-14  5:18   ` sashiko-bot
2026-05-13 11:23 ` [PATCH v5 4/4] dmaengine: fsl-edma: add runtime suspend/resume support Joy Zou
2026-05-14  5:55   ` sashiko-bot [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260514055509.A379BC2BCB7@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=Frank.Li@kernel.org \
    --cc=dmaengine@vger.kernel.org \
    --cc=imx@lists.linux.dev \
    --cc=joy.zou@nxp.com \
    --cc=sashiko-reviews@lists.linux.dev \
    --cc=vkoul@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox