From: sashiko-bot@kernel.org
To: "Rosen Penev" <rosenp@gmail.com>
Cc: Frank.Li@kernel.org, vkoul@kernel.org, dmaengine@vger.kernel.org
Subject: Re: [PATCHv3 02/15] dmaengine: fsldma: drop desc_lock before invoking client callback
Date: Tue, 09 Jun 2026 22:32:03 +0000
Message-ID: <20260609223203.B15D51F00893@smtp.kernel.org>
In-Reply-To: <20260609221926.35538-3-rosenp@gmail.com>
Thank you for your contribution! Sashiko AI review found 5 potential issue(s) to consider:
New issues:
- [Critical] Hardware State Corruption due to incorrect `chan->idle` assignment.
- [Critical] Premature completion of active descriptors due to stale `curr_phys` caching.
- [High] Concurrency, callback ordering violation, and completion cookie regression.
- [High] Use-After-Free / NULL pointer dereference of descriptor pool during teardown.
Pre-existing issues:
- [High] Missing `tasklet_kill()` in `fsl_dma_chan_remove` leading to Use-After-Free on `chan`.
--
commit 3632dcf83e269dafd0b93a77697b31d6b209bd8a
Author: Rosen Penev <rosenp@gmail.com>
dmaengine: fsldma: drop desc_lock before invoking client callback
This commit modifies the descriptor cleanup routine in the fsldma driver to
extract callback information under a lock, drop the lock, and invoke the client
callback. This is intended to fix a self-deadlock when a client submits a new
transaction from the completion callback.
> diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
> index 22d62d958abd9..5c8a8fe25617d 100644
> --- a/drivers/dma/fsldma.c
> +++ b/drivers/dma/fsldma.c
[ ... ]
> @@ -603,22 +603,23 @@ static void fsldma_cleanup_descriptors(struct fsldma_chan *chan)
> {
> struct fsl_desc_sw *desc;
> dma_cookie_t cookie = 0;
> dma_addr_t curr_phys = get_cdar(chan);
[Severity: Critical]
Is it safe to cache curr_phys outside the while loop when dropping the lock
inside the loop?
If the lock is dropped inside the loop to invoke callbacks, a concurrent thread
could enter, process the remaining completed descriptors (including the one
matching this thread's cached curr_phys), and remove them from ld_running.
When the first thread reacquires the lock, its cached curr_phys is now stale and
no longer matches any descriptor in the list. As a result, the loop's safety
break condition:
    if (desc->async_tx.phys == curr_phys) {
        seen_current = 1;
will never be met. The first thread will erroneously process all remaining
descriptors in ld_running as completed, running their callbacks and freeing them
even if the hardware is actively executing them.
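The hazard can be pictured with a small user-space model. This is only a sketch under stated assumptions: `model_desc` and `model_count_completed()` are hypothetical names that do not exist in fsldma.c, the running list is flattened into an array, and the busy check is simplified to "stop at the descriptor the hardware currently owns".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a software descriptor; not the driver's struct. */
struct model_desc {
	uint32_t phys;	/* plays the role of desc->async_tx.phys */
};

/*
 * Count how many descriptors at the head of running[] the cleanup loop would
 * treat as completed. The walk stops at the entry whose address equals
 * curr_phys, which models the "seen_current" break for a busy channel.
 * If no entry matches, every descriptor is counted as completed.
 */
static inline size_t model_count_completed(const struct model_desc *running,
					    size_t n, uint32_t curr_phys)
{
	size_t done = 0;

	assert(running != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (running[i].phys == curr_phys)
			break;	/* hardware still owns this one */
		done++;
	}
	return done;
}
```

With a list of 0x100, 0x200, 0x300 and the hardware at 0x200, the model completes one descriptor. If another context has since retired 0x100 and 0x200 and the hardware has moved on to 0x300, the stale value 0x200 matches nothing and the in-flight 0x300 is counted as completed, while a freshly read 0x300 completes none.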
> int seen_current = 0;
>
> fsldma_clean_completed_descriptor(chan);
>
[ ... ]
> @@ -662,9 +631,42 @@ static void fsldma_cleanup_descriptors(struct fsldma_chan *chan)
> break;
> }
>
> - cookie = fsldma_run_tx_complete_actions(chan, desc, cookie);
> + cookie = fsldma_run_tx_complete_actions(chan, desc, cookie, &cb);
>
> - fsldma_clean_running_descriptor(chan, desc);
> + /*
> + * Remove from the running list before dropping the lock so
> + * that terminate_all cannot free this descriptor while we
> + * call into the client below.
> + */
> + list_del(&desc->node);
> +
> + /*
> + * Prevent dma_run_dependencies() from calling
> + * fsl_chan_xfer_ld_queue() while we are not holding the
> + * lock. That would splice pending descriptors into
> + * ld_running before they have been completed by hardware.
> + * fsl_chan_xfer_ld_queue at the end of this function will
> + * re-evaluate the situation.
> + */
> + chan->idle = false;
> +
> + /*
> + * Drop the lock before invoking the client callback, since
> + * the DMAengine API explicitly allows clients to submit new
> + * transactions from their completion callback. Otherwise
> + * we self-deadlock on chan->desc_lock.
> + */
> + spin_unlock(&chan->desc_lock);
> + dmaengine_desc_callback_invoke(&cb, NULL);
> + dma_run_dependencies(&desc->async_tx);
> + spin_lock(&chan->desc_lock);
[Severity: High]
Does dropping the lock here expose the list iteration to concurrent execution?
Since the lock is dropped within the loop over the shared ld_running list,
multiple threads (such as a tasklet and a thread calling fsl_tx_status()) can
concurrently enter fsldma_cleanup_descriptors() and process descriptors
simultaneously.
This could cause callbacks to be invoked concurrently or out of order.
Additionally, each thread updates chan->common.completed_cookie at the end of
the function using its own local cookie variable:
    if (cookie > 0)
        chan->common.completed_cookie = cookie;
If a thread processing an older descriptor finishes after a thread processing a
newer descriptor, the completed_cookie will regress to a smaller value, causing
clients waiting for DMA completion to hang.
> +
> + chan->idle = true;
[Severity: Critical]
Can this unconditional assignment to chan->idle mask the active busy state
of the hardware?
If a descriptor completes, the loop drops the lock, runs the callback,
reacquires the lock, and sets chan->idle = true.
If the next descriptor in the queue is currently being executed by the hardware,
the loop breaks. At this point, chan->idle incorrectly remains true from the
previous iteration, despite the hardware actively processing the current
descriptor.
When fsl_chan_xfer_ld_queue() is subsequently called at the end of the function,
it checks this variable:
drivers/dma/fsldma.c:fsl_chan_xfer_ld_queue() {
    ...
    if (!chan->idle) {
        chan_dbg(chan, "DMA controller still busy\n");
        return;
    }
    ...
}
Because chan->idle is improperly true, fsl_chan_xfer_ld_queue() will bypass its
safety check, inappropriately restart the controller, and overwrite the hardware
registers while the hardware is actively running.
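The effect on the busy guard can be sketched with a user-space model. Everything here is an assumption made for illustration: `model_chan` and the `model_*` functions are hypothetical, the lock and callback are reduced to comments, and restoring the saved flag is shown only as one conceivable alternative, not as a vetted fix (it does not account for the flag changing while the lock is dropped).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the state involved; not the driver's struct. */
struct model_chan {
	bool idle;	/* software view, like chan->idle */
	int restarts;	/* times the modeled controller was reprogrammed */
};

/* Models the guard in fsl_chan_xfer_ld_queue(): only start when idle. */
static inline void model_xfer_ld_queue(struct model_chan *c)
{
	assert(c != NULL);
	if (!c->idle)
		return;		/* "DMA controller still busy" */
	c->idle = false;
	c->restarts++;
}

/* Callback window as in the patch: force busy, then unconditionally idle. */
static inline void model_window_unconditional(struct model_chan *c)
{
	assert(c != NULL);
	c->idle = false;
	/* lock dropped, client callback runs, lock retaken */
	c->idle = true;
}

/* Assumed alternative: put back whatever the flag held before the window. */
static inline void model_window_restore(struct model_chan *c)
{
	bool was_idle;

	assert(c != NULL);
	was_idle = c->idle;
	c->idle = false;
	/* lock dropped, client callback runs, lock retaken */
	c->idle = was_idle;
}
```

Starting from idle == false (hardware busy with the next descriptor), the unconditional window followed by model_xfer_ld_queue() reprograms the modeled controller once, whereas the restoring window leaves it alone.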
> +
> + if (!async_tx_test_ack(&desc->async_tx))
> + list_add_tail(&desc->node, &chan->ld_completed);
> + else
> + dma_pool_free(chan->desc_pool, desc, desc->async_tx.phys);
[Severity: High]
Is it possible for chan->desc_pool to be destroyed by concurrent teardown before
we call dma_pool_free() here?
Because chan->desc_lock was dropped earlier to invoke callbacks, this breaks
mutual exclusion with the channel teardown.
drivers/dma/fsldma.c:fsl_dma_free_chan_resources() {
    ...
    dma_pool_destroy(chan->desc_pool);
    chan->desc_pool = NULL;
    ...
}
If fsl_dma_free_chan_resources() runs on another CPU while the lock is dropped
here, it can finish teardown and destroy the pool. When the lock is reacquired
and this code is executed, it will dereference the destroyed (NULL) pool.
> }
>
> /*
[Severity: High]
This isn't a bug introduced by this patch, but is there a missing tasklet_kill()
when removing the channel?
drivers/dma/fsldma.c:fsl_dma_chan_remove() {
    ...
    kfree(chan);
    ...
}
If a tasklet is scheduled (e.g., by an interrupt) just before or during channel
removal, the tasklet will execute dma_do_tasklet() after chan has been freed,
dereferencing the freed pointer and causing a use-after-free.
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260609221926.35538-3-rosenp@gmail.com?part=1