DMA Engine development
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: "Nathan Lynch via B4 Relay" <devnull+nathan.lynch.amd.com@kernel.org>
Cc: linux-pci@vger.kernel.org, Frank.Li@kernel.org, vkoul@kernel.org,
	dmaengine@vger.kernel.org
Subject: Re: [PATCH v3 23/23] dmaengine: sdxi: Add DMA engine provider
Date: Sat, 06 Jun 2026 00:33:41 +0000	[thread overview]
Message-ID: <20260606003341.DA7871F00893@smtp.kernel.org> (raw)
In-Reply-To: <20260605-sdxi-base-v3-23-4d38ca2bdffe@amd.com>

Thank you for your contribution! Sashiko AI review found 8 potential issue(s) to consider:
- [Critical] Polling timeout in `sdxi_dma_synchronize()` or `sdxi_stop_cxt()` frees the completion status block while the hardware DMA engine may still write to it.
- [Critical] The interrupt handler silently ignores DMA hardware errors, treating failed transfers as successful and masking data corruption.
- [High] `sdxi_dma_terminate_all()` fails to abort or reclaim issued transfers, violating `dmaengine_terminate_all` semantics and causing DMA Use-After-Free.
- [High] Error paths in `sdxi_dma_synchronize()` bypass `vchan_synchronize()` and leave uninitialized holes in the hardware ring, permanently stalling the channel.
- [High] `sdxi_tx_status()` returns `DMA_ERROR` without unlinking the errored descriptor, causing destructive state inconsistencies.
- [High] Polling `sdxi_tx_status()` completes descriptors out of order, stranding older unpolled transfers in `desc_issued`.
- [High] Freeing an unsubmitted descriptor leaves invalid slots in the hardware ring, permanently hanging the DMA channel.
- [Medium] Missing error check for `sdxi_dma_register` in `sdxi_device_init` ignores registration failures.
--

commit c729c81d44b7c609574437378a8416c4d4f6cf2b
Author: Nathan Lynch <nathan.lynch@amd.com>

dmaengine: sdxi: Add DMA engine provider

Register a DMA engine provider that implements memcpy. The number of
channels per SDXI function can be controlled via a module
parameter (dma_channels). The provider uses the virt-dma library.

> diff --git a/drivers/dma/sdxi/device.c b/drivers/dma/sdxi/device.c
> index 26be376c9545f..d6097a6d8992a 100644
> --- a/drivers/dma/sdxi/device.c
> +++ b/drivers/dma/sdxi/device.c
> @@ -321,6 +322,7 @@ static int sdxi_device_init(struct sdxi_dev *sdxi)
>  	if (err)
>  		return err;
>  
> +	sdxi_dma_register(sdxi);

[Severity: Medium]
Is it intentional to ignore the return value of sdxi_dma_register() here?

If registration fails, the device probe will succeed even though the DMA
provider interface isn't actually registered.

>  	return 0;
>  }
>  
> diff --git a/drivers/dma/sdxi/dma.c b/drivers/dma/sdxi/dma.c
> new file mode 100644
> index 0000000000000..01c85cc2763d8
> --- /dev/null
> +++ b/drivers/dma/sdxi/dma.c
[ ... ]
> +static void sdxi_tx_desc_free(struct virt_dma_desc *vdesc)
> +{
> +	struct sdxi_dma_desc *sddesc = to_sdxi_dma_desc(vdesc);
> +
> +	sdxi_completion_free(sddesc->completion);
> +	kfree(to_sdxi_dma_desc(vdesc));
> +}

[Severity: High]
If a client frees a prepared descriptor without submitting it, won't the
reserved hardware slots be left in the ring with their valid bits cleared?

When subsequent descriptors are submitted, it seems the hardware will
permanently hang waiting for these abandoned slots to become valid.

[ ... ]
> +static enum dma_status sdxi_tx_status(struct dma_chan *chan,
> +				      dma_cookie_t cookie,
> +				      struct dma_tx_state *state)
> +{
[ ... ]
> +	if (!sdxi_completion_signaled(sddesc->completion))
> +		return DMA_IN_PROGRESS;
> +
> +	if (sdxi_completion_errored(sddesc->completion))
> +		return DMA_ERROR;

[Severity: High]
If sdxi_completion_errored() returns true, this returns DMA_ERROR without
unlinking the descriptor from the desc_issued list.

Will the interrupt handler sdxi_dma_cxt_irq() then process this errored
descriptor, ignore the error, and successfully complete it via
vchan_cookie_complete()? Does this cause subsequent polling to report
DMA_COMPLETE and destroy the error state?

> +
> +	list_del(&vdesc->node);
> +	vchan_cookie_complete(vdesc);

[Severity: High]
If this function successfully calls vchan_cookie_complete(), doesn't it
unconditionally set chan->completed_cookie to this single descriptor's cookie?

If there are older, uncompleted descriptors preceding it in desc_issued, will
subsequent status checks for those older descriptors return DMA_COMPLETE
early?

> +
> +	return dma_cookie_status(chan, cookie, state);
> +}
[ ... ]
> +static int sdxi_dma_terminate_all(struct dma_chan *dma_chan)
> +{
> +	struct virt_dma_chan *vchan = to_virt_chan(dma_chan);
> +	u64 dbval = 0;
[ ... ]
> +	scoped_guard(spinlock_irqsave, &vchan->lock) {
> +		struct virt_dma_desc *vdesc;
> +		LIST_HEAD(head);
> +
> +		list_splice_tail_init(&vchan->desc_allocated, &head);
> +		list_splice_tail_init(&vchan->desc_submitted, &head);

[Severity: High]
Does this completely ignore the desc_issued list and fail to halt the
hardware context using sdxi_stop_cxt()?

If the desc_issued list is ignored, won't the hardware continue executing
active transfers and writing to destination buffers that the client may have
already freed after calling terminate_all?

> +
> +		if (list_empty(&head))
> +			return 0;
[ ... ]
> +static void sdxi_dma_synchronize(struct dma_chan *dma_chan)
> +{
> +	struct sdxi_cxt *cxt = to_sdxi_dma_chan(dma_chan)->cxt;
> +	struct sdxi_ring_resv resv;
> +	struct sdxi_desc *nop;
> +	int err;
> +
> +	/* Submit a single nop with fence and wait for it to complete. */
> +
> +	if (sdxi_ring_reserve(cxt->ring_state, 1, &resv))
> +		return;
> +
> +	struct sdxi_completion *comp __free(sdxi_completion) = sdxi_completion_alloc(cxt->sdxi);
> +	if (!comp)
> +		return;

[Severity: High]
If sdxi_completion_alloc() fails and returns early, does this leave the
reserved hardware slot completely uninitialized with a valid bit of 0?

Will the hardware permanently hang when it encounters this slot on a
subsequent doorbell ring? Also, does returning early bypass
vchan_synchronize() while the channel might still be active?

> +
> +	nop = sdxi_ring_resv_next(&resv);
> +	sdxi_serialize_nop(nop);
> +	sdxi_completion_attach(nop, comp);
> +	sdxi_desc_set_fence(nop);
> +	sdxi_desc_make_valid(nop);
> +	sdxi_cxt_push_doorbell(cxt, sdxi_ring_resv_dbval(&resv));
> +
> +	err = sdxi_completion_poll(comp);
> +	WARN_ONCE(err, "got %d polling cst_blk", err);

[Severity: Critical]
If sdxi_completion_poll() times out and returns -ETIMEDOUT, the function
returns and comp is automatically freed by the __free() cleanup wrapper.

Does this return the physical memory to the DMA pool while the hardware DMA
engine retains the physical address pointer?

If the hardware eventually processes the descriptor, will it blindly write
status updates to freed DMA memory that may have been reallocated?

(Note: this identical polling timeout issue also seems to affect
sdxi_stop_cxt() in context.c)

> +
> +	vchan_synchronize(to_virt_chan(dma_chan));
> +}
> +
> +static irqreturn_t sdxi_dma_cxt_irq(int irq, void *data)
> +{
> +	struct sdxi_dma_chan *sdchan = data;
> +	struct virt_dma_chan *vchan = &sdchan->vchan;
> +	struct virt_dma_desc *vdesc;
> +	bool completed = false;
> +
> +	guard(spinlock_irqsave)(&vchan->lock);
> +
> +	while ((vdesc = vchan_next_desc(vchan))) {
> +		struct sdxi_dma_desc *sddesc = to_sdxi_dma_desc(vdesc);
> +
> +		if (!sdxi_completion_signaled(sddesc->completion))
> +			break;
> +
> +		list_del(&vdesc->node);
> +		vchan_cookie_complete(&sddesc->vdesc);
> +		completed = true;
> +	}

[Severity: Critical]
Does this interrupt handler blindly process signaled descriptors without
calling sdxi_completion_errored() to check for hardware errors?

If the hardware reported an error bit, does this mask the failure and treat
the transfer as successful, potentially causing undetected data corruption?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260605-sdxi-base-v3-0-4d38ca2bdffe@amd.com?part=23

      reply	other threads:[~2026-06-06  0:33 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-06  0:02 [PATCH v3 00/23] dmaengine: Smart Data Accelerator Interface (SDXI) basic support Nathan Lynch via B4 Relay
2026-06-06  0:02 ` [PATCH v3 01/23] PCI: Add SNIA SDXI accelerator sub-class Nathan Lynch via B4 Relay
2026-06-06  0:02 ` [PATCH v3 02/23] MAINTAINERS: Add entry for SDXI driver Nathan Lynch via B4 Relay
2026-06-06  0:02 ` [PATCH v3 03/23] dmaengine: sdxi: Add PCI initialization Nathan Lynch via B4 Relay
2026-06-06  0:02 ` [PATCH v3 04/23] dmaengine: sdxi: Feature discovery and initial configuration Nathan Lynch via B4 Relay
2026-06-06  0:14   ` sashiko-bot
2026-06-06  0:02 ` [PATCH v3 05/23] dmaengine: sdxi: Configure context tables Nathan Lynch via B4 Relay
2026-06-06  0:02 ` [PATCH v3 06/23] dmaengine: sdxi: Allocate DMA pools Nathan Lynch via B4 Relay
2026-06-06  0:15   ` sashiko-bot
2026-06-06  0:02 ` [PATCH v3 07/23] dmaengine: sdxi: Allocate administrative context Nathan Lynch via B4 Relay
2026-06-06  0:02 ` [PATCH v3 08/23] dmaengine: sdxi: Install " Nathan Lynch via B4 Relay
2026-06-06  0:26   ` sashiko-bot
2026-06-06  0:02 ` [PATCH v3 09/23] dmaengine: sdxi: Start functions on probe, stop on remove Nathan Lynch via B4 Relay
2026-06-06  0:14   ` sashiko-bot
2026-06-06  0:02 ` [PATCH v3 10/23] dmaengine: sdxi: Complete administrative context jump start Nathan Lynch via B4 Relay
2026-06-06  0:12   ` sashiko-bot
2026-06-06  0:02 ` [PATCH v3 11/23] dmaengine: sdxi: Add client context alloc and release APIs Nathan Lynch via B4 Relay
2026-06-06  0:22   ` sashiko-bot
2026-06-06  0:02 ` [PATCH v3 12/23] dmaengine: sdxi: Add descriptor ring management Nathan Lynch via B4 Relay
2026-06-06  0:19   ` sashiko-bot
2026-06-06  0:02 ` [PATCH v3 13/23] dmaengine: sdxi: Add unit tests for descriptor ring reservations Nathan Lynch via B4 Relay
2026-06-06  0:16   ` sashiko-bot
2026-06-06  0:02 ` [PATCH v3 14/23] dmaengine: sdxi: Attach descriptor ring state to contexts Nathan Lynch via B4 Relay
2026-06-06  0:24   ` sashiko-bot
2026-06-06  0:02 ` [PATCH v3 15/23] dmaengine: sdxi: Per-context access key (AKey) table entry allocator Nathan Lynch via B4 Relay
2026-06-06  0:20   ` sashiko-bot
2026-06-06  0:02 ` [PATCH v3 16/23] dmaengine: sdxi: Generic descriptor manipulation helpers Nathan Lynch via B4 Relay
2026-06-06  0:02 ` [PATCH v3 17/23] dmaengine: sdxi: Add completion status block API Nathan Lynch via B4 Relay
2026-06-06  0:21   ` sashiko-bot
2026-06-06  0:02 ` [PATCH v3 18/23] dmaengine: sdxi: Encode context start, stop, and sync descriptors Nathan Lynch via B4 Relay
2026-06-06  0:02 ` [PATCH v3 19/23] dmaengine: sdxi: Provide context start and stop APIs Nathan Lynch via B4 Relay
2026-06-06  0:22   ` sashiko-bot
2026-06-06  0:02 ` [PATCH v3 20/23] dmaengine: sdxi: Encode nop, copy, and interrupt descriptors Nathan Lynch via B4 Relay
2026-06-06  0:20   ` sashiko-bot
2026-06-06  0:02 ` [PATCH v3 21/23] dmaengine: sdxi: Add unit tests for descriptor encoding Nathan Lynch via B4 Relay
2026-06-06  0:26   ` sashiko-bot
2026-06-06  0:02 ` [PATCH v3 22/23] dmaengine: sdxi: MSI/MSI-X vector allocation and mapping Nathan Lynch via B4 Relay
2026-06-06  0:31   ` sashiko-bot
2026-06-06  0:02 ` [PATCH v3 23/23] dmaengine: sdxi: Add DMA engine provider Nathan Lynch via B4 Relay
2026-06-06  0:33   ` sashiko-bot [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260606003341.DA7871F00893@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=Frank.Li@kernel.org \
    --cc=devnull+nathan.lynch.amd.com@kernel.org \
    --cc=dmaengine@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    --cc=vkoul@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox