All of lore.kernel.org
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: "Linus Walleij" <linusw@kernel.org>
Cc: Frank.Li@kernel.org, robh@kernel.org, vkoul@kernel.org,
	dmaengine@vger.kernel.org, devicetree@vger.kernel.org,
	dri-devel@lists.freedesktop.org, conor+dt@kernel.org
Subject: Re: [PATCH 08/11] dmaengine: ste_dma40: Use power domain for LCLA SRAM
Date: Thu, 18 Jun 2026 05:15:38 +0000	[thread overview]
Message-ID: <20260618051539.15E201F000E9@smtp.kernel.org> (raw)
In-Reply-To: <20260618-ux500-power-domains-v7-1-v1-8-eb5e50b1a588@kernel.org>

Thank you for your contribution! Sashiko AI review found 8 potential issue(s) to consider:

New issues:
- [High] Calling synchronous runtime PM functions on a non-IRQ-safe device from atomic context.
- [High] Forcing immediate synchronous suspend of the SRAM PM domain defeats the DMA controller's autosuspend design and causes severe performance degradation.
- [High] The consumer driver forcefully enables runtime PM on a supplier device, risking unbalanced PM states.
- [High] Missing device link between the DMA controller and the external LCLA SRAM device, leading to system suspend ordering races.

Pre-existing issues:
- [Critical] Validation logic incorrectly bounds `dev_type`, leading to a critical out-of-bounds pointer write in `lookup_log_chans`.
- [High] An early probe failure unconditionally calls `pm_runtime_disable(base->dev)`, permanently disabling runtime PM on subsequent probe retries.
- [High] Missing `free_irq()` in the error paths of `d40_probe()`.
- [High] Unchecked return value of `d40_queue_start()` leaks runtime PM reference and permanently hangs the channel.
--

commit ee86cbf896c47f035fc9bd6d0f30e49d8a83841a
Author: Linus Walleij <linusw@kernel.org>

dmaengine: ste_dma40: Use power domain for LCLA SRAM

This patch replaces the LCLA ESRAM regulator with runtime PM support.
It fetches the SRAM device for the ESRAM34 power domain from the device tree.
The driver then holds a runtime PM reference on this domain while DMA
transfers are active.

> diff --git a/drivers/dma/ste_dma40.c b/drivers/dma/ste_dma40.c
> index 9b803c0aec254..6ca67ec446dc5 100644
> --- a/drivers/dma/ste_dma40.c
> +++ b/drivers/dma/ste_dma40.c
[ ... ]
> @@ -628,6 +630,22 @@ static struct device *chan2dev(struct d40_chan *d40c)
>  	return &d40c->chan.dev->device;
>  }
>  
> +static void d40_transfer_runtime_get(struct d40_base *base)
> +{
> +	if (base->lcla_dev)
> +		pm_runtime_get_sync(base->lcla_dev);

[Severity: High]
Does this code call a blocking runtime PM function on an external SRAM device
from an atomic context?

d40_transfer_runtime_get() and d40_transfer_runtime_put() are called from
d40_queue_start() and dma_tc_handle(). Both of these execute under the
d40c->lock spinlock, and dma_tc_handle() runs from the hardirq handler.

Because base->lcla_dev is an external SRAM device whose PM domain requires
sleeping locks to power on or off, it is not IRQ-safe. Calling synchronous
PM operations on it in an atomic context might trigger a might_sleep() warning
and a scheduling-while-atomic kernel panic.

> +
> +	pm_runtime_get_sync(base->dev);
> +}
> +
> +static void d40_transfer_runtime_put(struct d40_base *base)
> +{
> +	pm_runtime_put_autosuspend(base->dev);
> +
> +	if (base->lcla_dev)
> +		pm_runtime_put_sync_suspend(base->lcla_dev);

[Severity: High]
Could forcing an immediate synchronous suspend of the SRAM PM domain defeat
the DMA controller's autosuspend design and cause performance degradation?

By using pm_runtime_put_sync_suspend() here, the external SRAM device is
forced to power down immediately as soon as a single DMA transfer completes.
The DMA controller itself uses pm_runtime_put_autosuspend() to handle bursty
transfers efficiently.

Suspending the shared SRAM immediately and synchronously might cause every
subsequent DMA transfer to block while the PM domain wakes up, which could
introduce significant latency.

> +}
[ ... ]
> @@ -1516,7 +1534,7 @@ static struct d40_desc *d40_queue_start(struct d40_chan *d40c)
>  	if (d40d != NULL) {
>  		if (!d40c->busy) {
>  			d40c->busy = true;
> -			pm_runtime_get_sync(d40c->base->dev);
> +			d40_transfer_runtime_get(d40c->base);
>  		}
>  
>  		/* Remove from queue */

[Severity: High]
This is a pre-existing issue, but does d40_queue_start() leak a runtime PM
reference and hang the channel if the hardware start fails?

If d40_start(d40c) fails later in this function, it returns NULL without
reverting d40c->busy or dropping the PM reference:

    err = d40_start(d40c);
    if (err)
        return NULL;

Callers like d40_issue_pending() ignore this return value. The channel then
remains permanently marked as busy, preventing future transfers, and the
leaked PM reference prevents the device from suspending.

[ ... ]
> @@ -3590,23 +3593,32 @@ static int __init d40_probe(struct platform_device *pdev)
>  	}
>  
>  	if (base->plat_data->use_esram_lcla) {
> +		np_lcla = of_parse_phandle(np, "sram", 1);
> +		if (!np_lcla) {
> +			dev_err(dev, "no LCLA SRAM node\n");
> +			ret = -EINVAL;
> +			goto destroy_cache;
> +		}
>  
> -		base->lcpa_regulator = regulator_get(base->dev, "lcla_esram");
> -		if (IS_ERR(base->lcpa_regulator)) {
> -			d40_err(dev, "Failed to get lcpa_regulator\n");
> -			ret = PTR_ERR(base->lcpa_regulator);
> -			base->lcpa_regulator = NULL;
> +		np_lcla_parent = of_get_parent(np_lcla);
> +		of_node_put(np_lcla);
> +		if (!np_lcla_parent) {
> +			dev_err(dev, "no LCLA SRAM parent node\n");
> +			ret = -EINVAL;
>  			goto destroy_cache;
>  		}
>  
> -		ret = regulator_enable(base->lcpa_regulator);
> -		if (ret) {
> -			d40_err(dev,
> -				"Failed to enable lcpa_regulator\n");
> -			regulator_put(base->lcpa_regulator);
> -			base->lcpa_regulator = NULL;
> +		lcla_pdev = of_find_device_by_node(np_lcla_parent);
> +		of_node_put(np_lcla_parent);
> +		if (!lcla_pdev) {
> +			ret = -EPROBE_DEFER;
>  			goto destroy_cache;
>  		}
> +		base->lcla_dev = &lcla_pdev->dev;

[Severity: High]
Is a device link missing between the DMA controller and the external LCLA
SRAM device?

The driver relies on the runtime PM of lcla_dev, but it doesn't appear to
call device_link_add() to inform the PM core that base->dev depends on
base->lcla_dev.

During a system sleep transition, the PM core might suspend devices in
parallel or based on standard device tree hierarchy, potentially suspending
lcla_dev before the DMA controller. This could physically power off the SRAM
while active DMA transfers are still running.

> +		if (!pm_runtime_enabled(base->lcla_dev)) {
> +			pm_runtime_enable(base->lcla_dev);

[Severity: High]
Is it safe for a consumer driver to forcefully alter the core runtime PM
state of an external supplier device?

A consumer driver generally shouldn't manually manipulate the enable/disable
state of a supplier device it does not own.

If the supplier SRAM driver probes concurrently or manages its own state,
its call to pm_runtime_enable() will find disable_depth already at 0,
which could trigger an "Unbalanced pm_runtime_enable!" warning and
corrupt the device's internal power state machine.

> +			base->lcla_pm_enabled = true;
> +		}
>  	}
>  
>  	writel_relaxed(D40_DREG_GCC_ENABLE_ALL, base->virtbase + D40_DREG_GCC);
[ ... ]
> @@ -3642,16 +3654,17 @@ static int __init d40_probe(struct platform_device *pdev)
>  				 SZ_1K * base->num_phy_chans,
>  				 DMA_TO_DEVICE);
>  
> -	if (!base->lcla_pool.base_unaligned && base->lcla_pool.base)
> +	if (!base->lcla_pool.base_unaligned && base->lcla_pool.base &&
> +	    base->lcla_pool.pages)
>  		free_pages((unsigned long)base->lcla_pool.base,
>  			   base->lcla_pool.pages);
>  
>  	kfree(base->lcla_pool.base_unaligned);
>  
> -	if (base->lcpa_regulator) {
> -		regulator_disable(base->lcpa_regulator);
> -		regulator_put(base->lcpa_regulator);
> -	}
> +	if (base->lcla_pm_enabled)
> +		pm_runtime_disable(base->lcla_dev);
> +	if (base->lcla_dev)
> +		put_device(base->lcla_dev);
>  	pm_runtime_disable(base->dev);

[Severity: High]
This is a pre-existing issue, but might this error path permanently disable
runtime PM on subsequent probe retries?

If d40_probe() encounters an error before pm_runtime_enable(base->dev) is
executed (for example, when lcla_pdev is not found and returns -EPROBE_DEFER),
it jumps to destroy_cache.

This unconditionally calls pm_runtime_disable(base->dev), incrementing
disable_depth from 1 to 2. When the probe is retried and succeeds,
pm_runtime_enable() will only decrement the depth to 1, leaving the device
permanently disabled for runtime PM.

[Severity: High]
This is a pre-existing issue, but could the destroy_cache error path leak
the registered IRQ?

Earlier in d40_probe(), request_irq() is called to register the hardware
interrupt handler:

    ret = request_irq(base->irq, d40_handle_interrupt, 0, D40_NAME, base);

If a subsequent step fails, the function jumps to destroy_cache, which cleans
up memory and DMA mappings but does not call free_irq().

The interrupt handler remains registered for a device whose driver data and
resources have been freed. If the hardware asserts an interrupt, the kernel
might execute the handler using freed memory.

>  
>   report_failure:

[Severity: Critical]
This is a pre-existing issue, but does the validation logic for dev_type
allow an out-of-bounds write during channel allocation?

In d40_validate_conf(), conf->dev_type is validated against num_log_chans:

    if ((is_log && conf->dev_type > d40c->base->num_log_chans)

However, in d40_allocate_channel(), log_num is calculated as either
2 * dev_type or 2 * dev_type + 1. This allows log_num to reach up to
2 * 256 + 1 = 513.

The driver then writes to the lookup array:

    d40c->base->lookup_log_chans[d40c->log_num] = d40c;

Since lookup_log_chans is only sized to hold num_log_chans elements,
this results in a write past the end of the array.

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260618-ux500-power-domains-v7-1-v1-0-eb5e50b1a588@kernel.org?part=8

  reply	other threads:[~2026-06-18  5:15 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-18  5:00 [PATCH 00/11] pmdomain: st: ux500: Implement ux500 power domains Linus Walleij
2026-06-18  5:00 ` [PATCH 01/11] dt-bindings: power: Convert Ux500 PM domains to schema Linus Walleij
2026-06-18  5:07   ` sashiko-bot
2026-06-18  5:00 ` [PATCH 02/11] dt-bindings: Add the actual power domains on U8500 Linus Walleij
2026-06-18  5:00 ` [PATCH 03/11] pmdomain: st: ux500: Implement more power domains Linus Walleij
2026-06-18  5:10   ` sashiko-bot
2026-06-18  5:00 ` [PATCH 04/11] ARM: dts: ux500: Rename power domains node Linus Walleij
2026-06-18  5:00 ` [PATCH 05/11] ARM: dts: ux500: Add power domains Linus Walleij
2026-06-18  5:14   ` sashiko-bot
2026-06-18  5:00 ` [PATCH 06/11] pmdomain: st: ux500: Control DB8500 EPODs Linus Walleij
2026-06-18  5:14   ` sashiko-bot
2026-06-18  5:00 ` [PATCH 07/11] drm/mcde: Use power domain for display power Linus Walleij
2026-06-18  5:11   ` sashiko-bot
2026-06-18  5:00 ` [PATCH 08/11] dmaengine: ste_dma40: Use power domain for LCLA SRAM Linus Walleij
2026-06-18  5:15   ` sashiko-bot [this message]
2026-06-18  5:00 ` [PATCH 09/11] regulator: db8500-prcmu: Remove EPOD regulators Linus Walleij
2026-06-18  5:15   ` sashiko-bot
2026-06-18  5:00 ` [PATCH 10/11] regulator: db8500: Add power domain regulators Linus Walleij
2026-06-18  5:23   ` sashiko-bot
2026-06-18  5:00 ` [PATCH 11/11] ARM: dts: ux500: Remove DB8500 EPOD regulators Linus Walleij
2026-06-18  7:20   ` sashiko-bot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260618051539.15E201F000E9@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=Frank.Li@kernel.org \
    --cc=conor+dt@kernel.org \
    --cc=devicetree@vger.kernel.org \
    --cc=dmaengine@vger.kernel.org \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=linusw@kernel.org \
    --cc=robh@kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    --cc=vkoul@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.