From: Vinod Koul <vkoul@kernel.org>
To: "Golla, Nagendra" <Nagendra.Golla@amd.com>
Cc: sashiko-reviews@lists.linux.dev, robh@kernel.org,
devicetree@vger.kernel.org, Frank.Li@kernel.org,
dmaengine@vger.kernel.org, conor+dt@kernel.org
Subject: Re: [PATCH 2/2] dmaengine: zynqmp_dma: Add per-channel reset support
Date: Tue, 16 Jun 2026 14:14:33 +0530 [thread overview]
Message-ID: <ajEM8QfB5brd3sOQ@vaman> (raw)
In-Reply-To: <efa45058-6724-45cb-8b8f-75427446f62c@amd.com>
On 15-06-26, 14:59, Golla, Nagendra wrote:
>
>
> On 5/25/2026 5:03 PM, sashiko-bot@kernel.org wrote:
> > Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
> >
> > Pre-existing issues:
> > - [High] The probe error path leaks a runtime PM usage counter reference, permanently breaking runtime PM and leaving hardware clocks enabled.
> > - [High] Dropping the spinlock inside `list_for_each_entry_safe` allows concurrent modification of the list, leading to memory corruption.
> > - [High] Hardware registers are accessed in the IRQ handler without verifying if the device is active, risking a kernel panic on spurious interrupts.
> > --
> >
> > commit b2d2a678f48614c7188f8d81cbc70143b2bdfc0f
> > Author: Golla Nagendra <nagendra.golla@amd.com>
> > Subject: dmaengine: zynqmp_dma: Add per-channel reset support
> >
> > This commit introduces per-channel reset support for Versal Gen 2 and
> > Versal Net SoCs. It adds reset handling in the channel probe path using
> > device_reset_optional() to trigger a reset pulse during initialization.
> > Platforms without per-channel reset are unaffected.
> >
> > > diff --git a/drivers/dma/xilinx/zynqmp_dma.c b/drivers/dma/xilinx/zynqmp_dma.c
> > > index f6a812e49ddca..51c831760372c 100644
> > > --- a/drivers/dma/xilinx/zynqmp_dma.c
> > > +++ b/drivers/dma/xilinx/zynqmp_dma.c
> > [ ... ]
> > > @@ -916,6 +917,11 @@ static int zynqmp_dma_chan_probe(struct zynqmp_dma_device *zdev,
> > > if (IS_ERR(chan->regs))
> > > return PTR_ERR(chan->regs);
> > > + err = device_reset_optional(&pdev->dev);
> > > + if (err)
> > > + return dev_err_probe(&pdev->dev, err,
> > > + "failed to reset channel\n");
> > > +
> >
> > [Severity: High]
> > This isn't a bug introduced by this patch, but does this error path leak a
> > runtime PM usage counter reference?
> >
> > In zynqmp_dma_probe(), pm_runtime_resume_and_get() is called to increment
> > the runtime PM usage counter and enable the device clocks. If a subsequent
> > initialization step fails, such as the newly added device_reset_optional()
> > returning -EPROBE_DEFER, the code jumps to the err_disable_pm label:
> >
> > zynqmp_dma_probe()
> > ...
> > ret = pm_runtime_resume_and_get(zdev->dev);
> > ...
> > ret = zynqmp_dma_chan_probe(zdev, pdev);
> > if (ret) {
> > dev_err_probe(&pdev->dev, ret, "Probing channel failed\n");
> > goto err_disable_pm;
> > }
> > ...
> > err_disable_pm:
> > if (!pm_runtime_enabled(zdev->dev))
> > zynqmp_dma_runtime_suspend(zdev->dev);
> > pm_runtime_disable(zdev->dev);
> > return ret;
> >
> > Since the device is still runtime PM enabled, it skips
> > zynqmp_dma_runtime_suspend() and fails to drop the usage counter with
> > pm_runtime_put_noidle().
> >
> > Because -EPROBE_DEFER is a normal occurrence during boot, this path can
> > be hit multiple times. Will this permanently leave the usage counter above
> > zero, preventing the device from ever entering runtime suspend and leaving
> > its clocks enabled indefinitely?
> >
> > > chan->bus_width = ZYNQMP_DMA_BUS_WIDTH_64;
> > > chan->dst_burst_len = ZYNQMP_DMA_MAX_DST_BURST_LEN;
> >
> > [Severity: High]
> > This is a pre-existing issue, but does dropping the spinlock inside
> > list_for_each_entry_safe() in zynqmp_dma_chan_desc_cleanup() allow concurrent
> > modification of the list, leading to memory corruption?
> >
> > When zynqmp_dma_chan_desc_cleanup() temporarily drops the lock to invoke
> > the DMA callback:
> >
> > zynqmp_dma_chan_desc_cleanup()
> > ...
> > list_for_each_entry_safe(desc, next, &chan->done_list, node) {
> > ...
> > if (dmaengine_desc_callback_valid(&cb)) {
> > spin_unlock_irqrestore(&chan->lock, irqflags);
> > dmaengine_desc_callback_invoke(&cb, NULL);
> > spin_lock_irqsave(&chan->lock, irqflags);
> > }
> >
> > A client driver can concurrently invoke dmaengine_terminate_sync(), which
> > calls zynqmp_dma_device_terminate_all() and executes
> > zynqmp_dma_free_descriptors(). This moves all remaining elements, including
> > the prefetched next pointer, from done_list to free_list.
> >
> > When zynqmp_dma_chan_desc_cleanup() reacquires the lock, won't it continue
> > iterating using the corrupted next pointer that now resides in free_list,
> > eventually interpreting the list head &chan->free_list as a struct
> > zynqmp_dma_desc_sw and resulting in out-of-bounds memory accesses?
> >
> > [Severity: High]
> > This is a pre-existing issue, but are hardware registers accessed in the
> > IRQ handler without verifying if the device is active, risking a kernel
> > panic on spurious interrupts?
>
> Hi,
>
> Thanks for pointing this out.After reviewing the runtime PM guidelines and
> the driver behavior, I believe this is not a strict requirement for all IRQ
> handlers. The PM documentation does not mandate that
> pm_runtime_get_if_active() must be used before every register access in an
> ISR. It is mainly intended for cases where an interrupt can occur while the
> device might be runtime-suspended, such as shared IRQ or spurious interrupt
> scenarios.
>
> In the case of the zynqmp DMA driver, the interrupt is generated only when
> the hardware is active during a transfer, and the driver ensures the device
> remains runtime-active while DMA operations are in progress. Because of
> this, the ISR is not expected to run when the device is suspended, and
> accessing registers directly should be safe.
>
> So, in this context, adding pm_runtime_get_if_active() in the IRQ handler
> may not be necessary. That said, please let me know if you see a scenario
> where the interrupt could be triggered while the device is
> runtime-suspended, and I can revisit accordingly.
We can have a spurious interrupt. if you have an isr registered, your
cide should be able to handle the invocation... I think it is indeed
pragmatic to handle this
--
~Vinod
prev parent reply other threads:[~2026-06-16 8:44 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-25 10:50 [PATCH 0/2] dmaengine: zynqmp_dma: Add per-channel reset support Golla Nagendra
2026-05-25 10:50 ` [PATCH 1/2] dt-bindings: dma: xilinx: Add optional resets property for ZDMA Golla Nagendra
2026-05-25 11:00 ` sashiko-bot
2026-05-25 17:02 ` Conor Dooley
2026-06-04 11:28 ` Golla, Nagendra
2026-06-04 11:26 ` Golla, Nagendra
2026-05-25 10:50 ` [PATCH 2/2] dmaengine: zynqmp_dma: Add per-channel reset support Golla Nagendra
2026-05-25 11:33 ` sashiko-bot
2026-06-15 9:29 ` Golla, Nagendra
2026-06-16 8:44 ` Vinod Koul [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ajEM8QfB5brd3sOQ@vaman \
--to=vkoul@kernel.org \
--cc=Frank.Li@kernel.org \
--cc=Nagendra.Golla@amd.com \
--cc=conor+dt@kernel.org \
--cc=devicetree@vger.kernel.org \
--cc=dmaengine@vger.kernel.org \
--cc=robh@kernel.org \
--cc=sashiko-reviews@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.