[PATCH v5 0/2] dmaengine: fsl-edma: Scatter/gather improvements

Linux kernel and device drivers for NXP i.MX platforms
 help / color / mirror / Atom feed

* [PATCH v5 0/2] dmaengine: fsl-edma: Scatter/gather improvements
@ 2026-07-02 12:34 Benoît Monin
  2026-07-02 12:34 ` [PATCH v5 1/2] dmaengine: fsl-edma: Implement device_prep_peripheral_dma_vec Benoît Monin
  2026-07-02 12:34 ` [PATCH v5 2/2] dmaengine: fsl-edma: Support dynamic scatter/gather chaining Benoît Monin
  0 siblings, 2 replies; 6+ messages in thread
From: Benoît Monin @ 2026-07-02 12:34 UTC (permalink / raw)
  To: Frank Li, Vinod Koul
  Cc: Thomas Petazzoni, Frank Li, imx, dmaengine, linux-kernel,
	Benoît Monin

This series adds support for scatter/gather DMA transfers via dma_vec
and dynamic descriptor chaining to the Freescale eDMA controller driver.

The first patch implements the .device_prep_peripheral_dma_vec() callback,
enabling the DMA engine to accept an array of dma_vec structures. This
callback supports both regular and cyclic transfer modes.

The second patch introduces dynamic scatter/gather chaining, which allows
multiple DMA descriptors to be linked together without stopping the channel.
This optimization eliminates idle periods when back-to-back transfers are
submitted, improving throughput and reducing latency. The implementation
carefully preserves cyclic transfer semantics and respects hardware
constraints on platforms with split register layouts.

I tested it on the i.MX93. The dynamic scatter/gather chaining should
work with other eDMA controller with split register layout.

Signed-off-by: Benoît Monin <benoit.monin@bootlin.com>
---
Changes in v5:
- Rebased on v7.2-rc1.
- Add a call to dma_wmb() to ensure that dlast_sga is updated
  before csr when linking scatter/gather transactions.
- Don't update TCD registers if updating csr requires clearing the
  channel DONE bit to avoid a status mismatch in fsl_edma_tx_chan_handler().
- Link to v4: https://patch.msgid.link/20260518-fsl-edma-dyn-sg-v4-0-8ce7d95b1ce9@bootlin.com

Changes in v4:
- To keep transactions in order, link DMA transaction to the end of
  submitted list first, only lookup the issued list is the submitted
  list is empty.
- Link to v3: https://patch.msgid.link/20260511-fsl-edma-dyn-sg-v3-0-98a181775dae@bootlin.com

Changes in v3:
- Fix formatting errors reported by Frank Li.
- Add fsl_edma_tx_submit() to link the DMA transactions
  when they are submitted, not when they are prepared.
- Link to v2: https://patch.msgid.link/20260506-fsl-edma-dyn-sg-v2-0-66439cdd414e@bootlin.com

Changes in v2:
- Drop the RFC prefix, as asked by Frank Li
- No code change
- Link to v1: https://patch.msgid.link/20260430-fsl-edma-dyn-sg-v1-0-4e0ecbe2df66@bootlin.com

To: Frank Li <Frank.Li@nxp.com>
To: Vinod Koul <vkoul@kernel.org>
Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Cc: Frank Li <Frank.Li@kernel.org>
Cc: imx@lists.linux.dev
Cc: dmaengine@vger.kernel.org
Cc: linux-kernel@vger.kernel.org

---
Benoît Monin (2):
      dmaengine: fsl-edma: Implement device_prep_peripheral_dma_vec
      dmaengine: fsl-edma: Support dynamic scatter/gather chaining

 drivers/dma/fsl-edma-common.c | 197 ++++++++++++++++++++++++++++++++++++++++--
 drivers/dma/fsl-edma-common.h |   4 +
 drivers/dma/fsl-edma-main.c   |   2 +
 drivers/dma/fsl-edma-trace.h  |   5 ++
 4 files changed, 202 insertions(+), 6 deletions(-)
---
base-commit: dc59e4fea9d83f03bad6bddf3fa2e52491777482
change-id: 20260428-fsl-edma-dyn-sg-960731e37da2

Best regards,
--  
Benoît Monin, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v5 1/2] dmaengine: fsl-edma: Implement device_prep_peripheral_dma_vec
  2026-07-02 12:34 [PATCH v5 0/2] dmaengine: fsl-edma: Scatter/gather improvements Benoît Monin
@ 2026-07-02 12:34 ` Benoît Monin
  2026-07-02 12:47   ` sashiko-bot
  2026-07-02 12:34 ` [PATCH v5 2/2] dmaengine: fsl-edma: Support dynamic scatter/gather chaining Benoît Monin
  1 sibling, 1 reply; 6+ messages in thread
From: Benoît Monin @ 2026-07-02 12:34 UTC (permalink / raw)
  To: Frank Li, Vinod Koul
  Cc: Thomas Petazzoni, Frank Li, imx, dmaengine, linux-kernel,
	Benoît Monin

Add implementation of .device_prep_peripheral_dma_vec() callback to setup
a scatter/gather DMA transfer from an array of dma_vec structures. Setup
a cyclic transfer if the DMA_PREP_REPEAT flag is set.

Reviewed-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Benoît Monin <benoit.monin@bootlin.com>
---
 drivers/dma/fsl-edma-common.c | 109 ++++++++++++++++++++++++++++++++++++++++++
 drivers/dma/fsl-edma-common.h |   4 ++
 drivers/dma/fsl-edma-main.c   |   2 +
 3 files changed, 115 insertions(+)

diff --git a/drivers/dma/fsl-edma-common.c b/drivers/dma/fsl-edma-common.c
index bb7531c456df..c10190164926 100644
--- a/drivers/dma/fsl-edma-common.c
+++ b/drivers/dma/fsl-edma-common.c
@@ -673,6 +673,115 @@ struct dma_async_tx_descriptor *fsl_edma_prep_dma_cyclic(
 	return vchan_tx_prep(&fsl_chan->vchan, &fsl_desc->vdesc, flags);
 }
 
+struct dma_async_tx_descriptor *
+fsl_edma_prep_peripheral_dma_vec(struct dma_chan *chan, const struct dma_vec *vecs,
+				 size_t nb, enum dma_transfer_direction direction,
+				 unsigned long flags)
+{
+	struct fsl_edma_chan *fsl_chan = to_fsl_edma_chan(chan);
+	dma_addr_t src_addr, dst_addr, last_sg;
+	struct fsl_edma_desc *fsl_desc;
+	u16 soff, doff, iter;
+	u32 nbytes;
+	int i;
+
+	if (!is_slave_direction(direction))
+		return NULL;
+
+	if (!fsl_edma_prep_slave_dma(fsl_chan, direction))
+		return NULL;
+
+	fsl_desc = fsl_edma_alloc_desc(fsl_chan, nb);
+	if (!fsl_desc)
+		return NULL;
+	fsl_desc->iscyclic = flags & DMA_PREP_REPEAT;
+	fsl_desc->dirn = direction;
+
+	if (direction == DMA_MEM_TO_DEV) {
+		if (!fsl_chan->cfg.src_addr_width)
+			fsl_chan->cfg.src_addr_width = fsl_chan->cfg.dst_addr_width;
+		fsl_chan->attr =
+			fsl_edma_get_tcd_attr(fsl_chan->cfg.src_addr_width,
+					      fsl_chan->cfg.dst_addr_width);
+		nbytes = fsl_chan->cfg.dst_addr_width * fsl_chan->cfg.dst_maxburst;
+	} else {
+		if (!fsl_chan->cfg.dst_addr_width)
+			fsl_chan->cfg.dst_addr_width = fsl_chan->cfg.src_addr_width;
+		fsl_chan->attr =
+			fsl_edma_get_tcd_attr(fsl_chan->cfg.src_addr_width,
+					      fsl_chan->cfg.dst_addr_width);
+		nbytes = fsl_chan->cfg.src_addr_width * fsl_chan->cfg.src_maxburst;
+	}
+
+	for (i = 0; i < nb; i++) {
+		if (direction == DMA_MEM_TO_DEV) {
+			src_addr = vecs[i].addr;
+			dst_addr = fsl_chan->dma_dev_addr;
+			soff = fsl_chan->cfg.dst_addr_width;
+			doff = 0;
+		} else if (direction == DMA_DEV_TO_MEM) {
+			src_addr = fsl_chan->dma_dev_addr;
+			dst_addr = vecs[i].addr;
+			soff = 0;
+			doff = fsl_chan->cfg.src_addr_width;
+		} else {
+			/* DMA_DEV_TO_DEV */
+			src_addr = fsl_chan->cfg.src_addr;
+			dst_addr = fsl_chan->cfg.dst_addr;
+			soff = 0;
+			doff = 0;
+		}
+
+		/*
+		 * Choose the suitable burst length if dma_vec length is not
+		 * multiple of burst length so that the whole transfer length is
+		 * multiple of minor loop(burst length).
+		 */
+		if (vecs[i].len % nbytes) {
+			u32 width = (direction == DMA_DEV_TO_MEM) ? doff : soff;
+			u32 burst = (direction == DMA_DEV_TO_MEM) ?
+						fsl_chan->cfg.src_maxburst :
+						fsl_chan->cfg.dst_maxburst;
+			int j;
+
+			for (j = burst; j > 1; j--) {
+				if (!(vecs[i].len % (j * width))) {
+					nbytes = j * width;
+					break;
+				}
+			}
+			/* Set burst size as 1 if there's no suitable one */
+			if (j == 1)
+				nbytes = width;
+		}
+
+		iter = vecs[i].len / nbytes;
+		if (i < nb - 1) {
+			last_sg = fsl_desc->tcd[(i + 1)].ptcd;
+			fsl_edma_fill_tcd(fsl_chan, fsl_desc->tcd[i].vtcd, src_addr,
+					  dst_addr, fsl_chan->attr, soff,
+					  nbytes, 0, iter, iter, doff, last_sg,
+					  false, false, true);
+		} else {
+			if (fsl_desc->iscyclic) {
+				last_sg = fsl_desc->tcd[0].ptcd;
+				fsl_edma_fill_tcd(fsl_chan, fsl_desc->tcd[i].vtcd, src_addr,
+						  dst_addr, fsl_chan->attr, soff,
+						  nbytes, 0, iter, iter, doff, last_sg,
+						  true, false, true);
+			} else {
+				last_sg = 0;
+				fsl_edma_fill_tcd(fsl_chan, fsl_desc->tcd[i].vtcd, src_addr,
+						  dst_addr, fsl_chan->attr, soff,
+						  nbytes, 0, iter, iter, doff, last_sg,
+						  true, true, false);
+			}
+		}
+	}
+
+	return vchan_tx_prep(&fsl_chan->vchan, &fsl_desc->vdesc, flags);
+}
+
 struct dma_async_tx_descriptor *fsl_edma_prep_slave_sg(
 		struct dma_chan *chan, struct scatterlist *sgl,
 		unsigned int sg_len, enum dma_transfer_direction direction,
diff --git a/drivers/dma/fsl-edma-common.h b/drivers/dma/fsl-edma-common.h
index 205a96489094..0d028048701d 100644
--- a/drivers/dma/fsl-edma-common.h
+++ b/drivers/dma/fsl-edma-common.h
@@ -496,6 +496,10 @@ struct dma_async_tx_descriptor *fsl_edma_prep_dma_cyclic(
 		struct dma_chan *chan, dma_addr_t dma_addr, size_t buf_len,
 		size_t period_len, enum dma_transfer_direction direction,
 		unsigned long flags);
+struct dma_async_tx_descriptor *fsl_edma_prep_peripheral_dma_vec(
+		struct dma_chan *chan, const struct dma_vec *vecs,
+		size_t nb, enum dma_transfer_direction direction,
+		unsigned long flags);
 struct dma_async_tx_descriptor *fsl_edma_prep_slave_sg(
 		struct dma_chan *chan, struct scatterlist *sgl,
 		unsigned int sg_len, enum dma_transfer_direction direction,
diff --git a/drivers/dma/fsl-edma-main.c b/drivers/dma/fsl-edma-main.c
index 36155ab1602a..6693b4270a1a 100644
--- a/drivers/dma/fsl-edma-main.c
+++ b/drivers/dma/fsl-edma-main.c
@@ -841,6 +841,8 @@ static int fsl_edma_probe(struct platform_device *pdev)
 	fsl_edma->dma_dev.device_free_chan_resources
 		= fsl_edma_free_chan_resources;
 	fsl_edma->dma_dev.device_tx_status = fsl_edma_tx_status;
+	fsl_edma->dma_dev.device_prep_peripheral_dma_vec
+		= fsl_edma_prep_peripheral_dma_vec;
 	fsl_edma->dma_dev.device_prep_slave_sg = fsl_edma_prep_slave_sg;
 	fsl_edma->dma_dev.device_prep_dma_cyclic = fsl_edma_prep_dma_cyclic;
 	fsl_edma->dma_dev.device_prep_dma_memcpy = fsl_edma_prep_memcpy;

-- 
2.54.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH v5 2/2] dmaengine: fsl-edma: Support dynamic scatter/gather chaining
  2026-07-02 12:34 [PATCH v5 0/2] dmaengine: fsl-edma: Scatter/gather improvements Benoît Monin
  2026-07-02 12:34 ` [PATCH v5 1/2] dmaengine: fsl-edma: Implement device_prep_peripheral_dma_vec Benoît Monin
@ 2026-07-02 12:34 ` Benoît Monin
  2026-07-02 12:45   ` sashiko-bot
  1 sibling, 1 reply; 6+ messages in thread
From: Benoît Monin @ 2026-07-02 12:34 UTC (permalink / raw)
  To: Frank Li, Vinod Koul
  Cc: Thomas Petazzoni, Frank Li, imx, dmaengine, linux-kernel,
	Benoît Monin

Implement dynamic linking of scatter/gather transfers to enable
chaining multiple DMA descriptors without stopping the channel.
This avoids waiting for the channel to go idle if there is another
transaction already issued.

Add fsl_edma_link_sg() to dynamically link the last TCD of a previously
submitted descriptor to the first TCD of a new descriptor by setting
the scatter/gather address and the E_SG flag, and keeping the channel
active by clearing the DREQ bit.

Linking is done when the transaction is submitted by fsl_edma_tx_submit().
To do so, the .tx_submit() callback is overridden for non-cyclic
transactions prepared by fsl_edma_prep_peripheral_dma_vec() and
fsl_edma_prep_slave_sg(). This ensures that transactions are linked
in the order they are submitted.

Update fsl_edma_xfer_desc() to avoid re-initializing the hardware when a
transfer is already in progress, allowing seamless chaining of descriptors.

Modify the transfer completion handler to check the DONE flag in the
channel CSR before marking the transfer complete. Since this flag is
only available on SoC with the split registers layout, we only link
transactions for DMA controllers flagged with FSL_EDMA_DRV_SPLIT_REG.

Add trace event for scatter/gather linking operations.

Signed-off-by: Benoît Monin <benoit.monin@bootlin.com>
---
 drivers/dma/fsl-edma-common.c | 90 +++++++++++++++++++++++++++++++++++++++----
 drivers/dma/fsl-edma-trace.h  |  5 +++
 2 files changed, 88 insertions(+), 7 deletions(-)

diff --git a/drivers/dma/fsl-edma-common.c b/drivers/dma/fsl-edma-common.c
index c10190164926..e2f480c8fe21 100644
--- a/drivers/dma/fsl-edma-common.c
+++ b/drivers/dma/fsl-edma-common.c
@@ -58,7 +58,10 @@ void fsl_edma_tx_chan_handler(struct fsl_edma_chan *fsl_chan)
 		list_del(&fsl_chan->edesc->vdesc.node);
 		vchan_cookie_complete(&fsl_chan->edesc->vdesc);
 		fsl_chan->edesc = NULL;
-		fsl_chan->status = DMA_COMPLETE;
+		if (!(fsl_edma_drvflags(fsl_chan) & FSL_EDMA_DRV_SPLIT_REG) ||
+		    (edma_readl_chreg(fsl_chan, ch_csr) & EDMA_V3_CH_CSR_DONE)) {
+			fsl_chan->status = DMA_COMPLETE;
+		}
 	} else {
 		vchan_cyclic_callback(&fsl_chan->edesc->vdesc);
 	}
@@ -673,6 +676,68 @@ struct dma_async_tx_descriptor *fsl_edma_prep_dma_cyclic(
 	return vchan_tx_prep(&fsl_chan->vchan, &fsl_desc->vdesc, flags);
 }
 
+static void fsl_edma_link_sg(struct fsl_edma_chan *fsl_chan, struct fsl_edma_desc *fsl_desc)
+{
+	u32 flags = fsl_edma_drvflags(fsl_chan);
+	struct fsl_edma_hw_tcd *last_tcd;
+	struct fsl_edma_desc *prev_desc;
+	struct virt_dma_desc *vdesc;
+	u16 csr;
+
+	lockdep_assert_held(&fsl_chan->vchan.lock);
+
+	if (!(flags & FSL_EDMA_DRV_SPLIT_REG))
+		return;
+
+	vdesc = list_last_entry_or_null(&fsl_chan->vchan.desc_submitted,
+					struct virt_dma_desc, node);
+	if (!vdesc)
+		vdesc = list_last_entry_or_null(&fsl_chan->vchan.desc_issued,
+						struct virt_dma_desc, node);
+	if (!vdesc)
+		return;
+
+	prev_desc = to_fsl_edma_desc(vdesc);
+	last_tcd = prev_desc->tcd[prev_desc->n_tcds - 1].vtcd;
+
+	csr = fsl_edma_get_tcd_to_cpu(fsl_chan, last_tcd, csr);
+	if (!(csr & EDMA_TCD_CSR_D_REQ))
+		return;
+
+	fsl_edma_set_tcd_to_le(fsl_chan, last_tcd, fsl_desc->tcd[0].ptcd, dlast_sga);
+	dma_wmb();
+
+	csr &= ~EDMA_TCD_CSR_D_REQ;
+	csr |= EDMA_TCD_CSR_E_SG;
+	fsl_edma_set_tcd_to_le(fsl_chan, last_tcd, csr, csr);
+
+	if (prev_desc == fsl_chan->edesc &&
+	    prev_desc->n_tcds == 1 &&
+	    !(flags & FSL_EDMA_DRV_CLEAR_DONE_E_SG)) {
+		edma_cp_tcd_to_reg(fsl_chan, last_tcd, dlast_sga);
+		edma_cp_tcd_to_reg(fsl_chan, last_tcd, csr);
+	}
+
+	trace_edma_link_sg(fsl_chan, last_tcd);
+}
+
+static dma_cookie_t fsl_edma_tx_submit(struct dma_async_tx_descriptor *tx)
+{
+	struct virt_dma_desc *vd = container_of(tx, struct virt_dma_desc, tx);
+	struct fsl_edma_chan *fsl_chan = to_fsl_edma_chan(tx->chan);
+	struct fsl_edma_desc *fsl_desc = to_fsl_edma_desc(vd);
+	struct virt_dma_chan *vc = to_virt_chan(tx->chan);
+	dma_cookie_t cookie;
+
+	guard(spinlock_irqsave)(&fsl_chan->vchan.lock);
+
+	fsl_edma_link_sg(fsl_chan, fsl_desc);
+	cookie = dma_cookie_assign(tx);
+	list_move_tail(&vd->node, &vc->desc_submitted);
+
+	return cookie;
+}
+
 struct dma_async_tx_descriptor *
 fsl_edma_prep_peripheral_dma_vec(struct dma_chan *chan, const struct dma_vec *vecs,
 				 size_t nb, enum dma_transfer_direction direction,
@@ -680,6 +745,7 @@ fsl_edma_prep_peripheral_dma_vec(struct dma_chan *chan, const struct dma_vec *ve
 {
 	struct fsl_edma_chan *fsl_chan = to_fsl_edma_chan(chan);
 	dma_addr_t src_addr, dst_addr, last_sg;
+	struct dma_async_tx_descriptor *tx;
 	struct fsl_edma_desc *fsl_desc;
 	u16 soff, doff, iter;
 	u32 nbytes;
@@ -779,7 +845,10 @@ fsl_edma_prep_peripheral_dma_vec(struct dma_chan *chan, const struct dma_vec *ve
 		}
 	}
 
-	return vchan_tx_prep(&fsl_chan->vchan, &fsl_desc->vdesc, flags);
+	tx = vchan_tx_prep(&fsl_chan->vchan, &fsl_desc->vdesc, flags);
+	if (!fsl_desc->iscyclic)
+		tx->tx_submit = fsl_edma_tx_submit;
+	return tx;
 }
 
 struct dma_async_tx_descriptor *fsl_edma_prep_slave_sg(
@@ -788,9 +857,10 @@ struct dma_async_tx_descriptor *fsl_edma_prep_slave_sg(
 		unsigned long flags, void *context)
 {
 	struct fsl_edma_chan *fsl_chan = to_fsl_edma_chan(chan);
+	dma_addr_t src_addr, dst_addr, last_sg;
+	struct dma_async_tx_descriptor *tx;
 	struct fsl_edma_desc *fsl_desc;
 	struct scatterlist *sg;
-	dma_addr_t src_addr, dst_addr, last_sg;
 	u16 soff, doff, iter;
 	u32 nbytes;
 	int i;
@@ -882,7 +952,10 @@ struct dma_async_tx_descriptor *fsl_edma_prep_slave_sg(
 		}
 	}
 
-	return vchan_tx_prep(&fsl_chan->vchan, &fsl_desc->vdesc, flags);
+	tx = vchan_tx_prep(&fsl_chan->vchan, &fsl_desc->vdesc, flags);
+	tx->tx_submit = fsl_edma_tx_submit;
+
+	return tx;
 }
 
 struct dma_async_tx_descriptor *fsl_edma_prep_memcpy(struct dma_chan *chan,
@@ -924,9 +997,12 @@ void fsl_edma_xfer_desc(struct fsl_edma_chan *fsl_chan)
 	if (!vdesc)
 		return;
 	fsl_chan->edesc = to_fsl_edma_desc(vdesc);
-	fsl_edma_set_tcd_regs(fsl_chan, fsl_chan->edesc->tcd[0].vtcd);
-	fsl_edma_enable_request(fsl_chan);
-	fsl_chan->status = DMA_IN_PROGRESS;
+
+	if (fsl_chan->status != DMA_IN_PROGRESS) {
+		fsl_edma_set_tcd_regs(fsl_chan, fsl_chan->edesc->tcd[0].vtcd);
+		fsl_edma_enable_request(fsl_chan);
+		fsl_chan->status = DMA_IN_PROGRESS;
+	}
 }
 
 void fsl_edma_issue_pending(struct dma_chan *chan)
diff --git a/drivers/dma/fsl-edma-trace.h b/drivers/dma/fsl-edma-trace.h
index d3541301a247..ac319d2dbb90 100644
--- a/drivers/dma/fsl-edma-trace.h
+++ b/drivers/dma/fsl-edma-trace.h
@@ -119,6 +119,11 @@ DEFINE_EVENT(edma_log_tcd, edma_fill_tcd,
 	TP_ARGS(chan, tcd)
 );
 
+DEFINE_EVENT(edma_log_tcd, edma_link_sg,
+	     TP_PROTO(struct fsl_edma_chan *chan, void *tcd),
+	     TP_ARGS(chan, tcd)
+);
+
 #endif
 
 /* this part must be outside header guard */

-- 
2.54.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v5 2/2] dmaengine: fsl-edma: Support dynamic scatter/gather chaining
  2026-07-02 12:34 ` [PATCH v5 2/2] dmaengine: fsl-edma: Support dynamic scatter/gather chaining Benoît Monin
@ 2026-07-02 12:45   ` sashiko-bot
  2026-07-02 13:17     ` Frank Li
  0 siblings, 1 reply; 6+ messages in thread
From: sashiko-bot @ 2026-07-02 12:45 UTC (permalink / raw)
  To: Benoît Monin; +Cc: Frank.Li, imx, vkoul, dmaengine

Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [Critical] Double-execution of DMA descriptors due to state machine desync when interrupts coalesce.
- [Critical] Channel deadlock due to dynamic descriptor linking bypassing `issue_pending()`.
--

commit a67c20eb8f20e15fe2cadb092322588069fe328e
Author: Benoît Monin <benoit.monin@bootlin.com>

dmaengine: fsl-edma: Support dynamic scatter/gather chaining

Implement dynamic linking of scatter/gather transfers to enable
chaining multiple DMA descriptors without stopping the channel.
This avoids waiting for the channel to go idle if there is another
transaction already issued.

> diff --git a/drivers/dma/fsl-edma-common.c b/drivers/dma/fsl-edma-common.c
> --- a/drivers/dma/fsl-edma-common.c
> +++ b/drivers/dma/fsl-edma-common.c
> @@ -58,7 +58,10 @@ void fsl_edma_tx_chan_handler(struct fsl_edma_chan *fsl_chan)
>  		list_del(&fsl_chan->edesc->vdesc.node);
>  		vchan_cookie_complete(&fsl_chan->edesc->vdesc);
>  		fsl_chan->edesc = NULL;
> -		fsl_chan->status = DMA_COMPLETE;
> +		if (!(fsl_edma_drvflags(fsl_chan) & FSL_EDMA_DRV_SPLIT_REG) ||
> +		    (edma_readl_chreg(fsl_chan, ch_csr) & EDMA_V3_CH_CSR_DONE)) {
> +			fsl_chan->status = DMA_COMPLETE;
> +		}

[Severity: Critical]
Does checking the channel-global DONE flag here cause chained descriptors
to execute twice if interrupts coalesce?

For example, if descriptors A and B are chained and the hardware finishes
both before the interrupt handler runs, won't the handler complete A,
see DONE is set, and mark the channel DMA_COMPLETE?

If that happens, wouldn't fsl_edma_xfer_desc() then fetch B from the
issued list and restart the hardware, executing B a second time?

[ ... ]

> @@ -673,6 +676,68 @@ struct dma_async_tx_descriptor *fsl_edma_prep_dma_cyclic(
> +static dma_cookie_t fsl_edma_tx_submit(struct dma_async_tx_descriptor *tx)
> +{
> +	struct virt_dma_desc *vd = container_of(tx, struct virt_dma_desc, tx);
> +	struct fsl_edma_chan *fsl_chan = to_fsl_edma_chan(tx->chan);
> +	struct fsl_edma_desc *fsl_desc = to_fsl_edma_desc(vd);
> +	struct virt_dma_chan *vc = to_virt_chan(tx->chan);
> +	dma_cookie_t cookie;
> +
> +	guard(spinlock_irqsave)(&fsl_chan->vchan.lock);
> +
> +	fsl_edma_link_sg(fsl_chan, fsl_desc);
> +	cookie = dma_cookie_assign(tx);
> +	list_move_tail(&vd->node, &vc->desc_submitted);
> +
> +	return cookie;
> +}

[Severity: Critical]
Can this cause a channel deadlock by executing unissued descriptors?

Since hardware linking happens in fsl_edma_tx_submit() rather than waiting
for the issue_pending() callback, if the hardware completes the newly linked
descriptor before the client calls issue_pending(), won't the interrupt
handler fire but find edesc is NULL?

In fsl_edma_tx_chan_handler():

    if (!fsl_chan->edesc) {
        /* terminate_all called before */
        spin_unlock(&fsl_chan->vchan.lock);
        return;
    }

Wouldn't this drop the completion and leave the channel stuck when
issue_pending() is eventually called, since the channel status is still left
as DMA_IN_PROGRESS?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260702-fsl-edma-dyn-sg-v5-0-16787185be49@bootlin.com?part=2

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v5 1/2] dmaengine: fsl-edma: Implement device_prep_peripheral_dma_vec
  2026-07-02 12:34 ` [PATCH v5 1/2] dmaengine: fsl-edma: Implement device_prep_peripheral_dma_vec Benoît Monin
@ 2026-07-02 12:47   ` sashiko-bot
  0 siblings, 0 replies; 6+ messages in thread
From: sashiko-bot @ 2026-07-02 12:47 UTC (permalink / raw)
  To: Benoît Monin; +Cc: dmaengine, imx, Frank.Li, vkoul

Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:

Pre-existing issues:
- [High] The driver is vulnerable to divide-by-zero kernel panics during DMA preparation, primarily when handling unaligned DMA_DEV_TO_DEV transfers or when slave configuration attributes are uninitialized (zero).
- [High] Passing a zero-length array (`nb == 0` or `sg_len == 0`) leads to a zero-length allocation for a flexible array, causing an out-of-bounds read/write during descriptor submission and transfer execution.
- [High] Multi-FIFO configuration is ignored for offsets in SG/vec transfers, leading to negative address progression and hardware state corruption.
--

commit 19b9121b085ec8ee890fa0c25990d53b2a25758c
Author: Benoît Monin <benoit.monin@bootlin.com>

dmaengine: fsl-edma: Implement device_prep_peripheral_dma_vec

Add implementation of .device_prep_peripheral_dma_vec() callback to setup
a scatter/gather DMA transfer from an array of dma_vec structures. Setup
a cyclic transfer if the DMA_PREP_REPEAT flag is set.

> diff --git a/drivers/dma/fsl-edma-common.c b/drivers/dma/fsl-edma-common.c
> index bb7531c456dfa..c10190164926a 100644
> --- a/drivers/dma/fsl-edma-common.c
> +++ b/drivers/dma/fsl-edma-common.c
[ ... ]
> @@ -673,6 +673,115 @@ struct dma_async_tx_descriptor *fsl_edma_prep_dma_cyclic(
>  	return vchan_tx_prep(&fsl_chan->vchan, &fsl_desc->vdesc, flags);
>  }
>  
> +struct dma_async_tx_descriptor *
> +fsl_edma_prep_peripheral_dma_vec(struct dma_chan *chan, const struct dma_vec *vecs,
> +				 size_t nb, enum dma_transfer_direction direction,
> +				 unsigned long flags)
> +{
> +	struct fsl_edma_chan *fsl_chan = to_fsl_edma_chan(chan);
> +	dma_addr_t src_addr, dst_addr, last_sg;
> +	struct fsl_edma_desc *fsl_desc;
> +	u16 soff, doff, iter;
> +	u32 nbytes;
> +	int i;
> +
> +	if (!is_slave_direction(direction))
> +		return NULL;
> +
> +	if (!fsl_edma_prep_slave_dma(fsl_chan, direction))
> +		return NULL;
> +
> +	fsl_desc = fsl_edma_alloc_desc(fsl_chan, nb);

[Severity: High]
This is a pre-existing issue, but does this allocation lack a check for nb > 0?

If nb is 0, fsl_edma_alloc_desc() allocates a descriptor with a 0-length
flexible array for tcd. Later, when the descriptor is submitted and executed:

drivers/dma/fsl-edma-common.c:fsl_edma_xfer_desc() {
    ...
    fsl_edma_set_tcd_regs(fsl_chan, fsl_chan->edesc->tcd[0].vtcd);
    ...
}

This unconditionally accesses tcd[0], resulting in an out-of-bounds read.

Also, fsl_edma_link_sg() accesses prev_desc->tcd[prev_desc->n_tcds - 1], which
would evaluate to index -1. Could this cause memory corruption or a crash if
nb is 0?

> +	if (!fsl_desc)
> +		return NULL;
> +	fsl_desc->iscyclic = flags & DMA_PREP_REPEAT;
> +	fsl_desc->dirn = direction;
> +
> +	if (direction == DMA_MEM_TO_DEV) {
> +		if (!fsl_chan->cfg.src_addr_width)
> +			fsl_chan->cfg.src_addr_width = fsl_chan->cfg.dst_addr_width;
> +		fsl_chan->attr =
> +			fsl_edma_get_tcd_attr(fsl_chan->cfg.src_addr_width,
> +					      fsl_chan->cfg.dst_addr_width);
> +		nbytes = fsl_chan->cfg.dst_addr_width * fsl_chan->cfg.dst_maxburst;
> +	} else {
> +		if (!fsl_chan->cfg.dst_addr_width)
> +			fsl_chan->cfg.dst_addr_width = fsl_chan->cfg.src_addr_width;
> +		fsl_chan->attr =
> +			fsl_edma_get_tcd_attr(fsl_chan->cfg.src_addr_width,
> +					      fsl_chan->cfg.dst_addr_width);
> +		nbytes = fsl_chan->cfg.src_addr_width * fsl_chan->cfg.src_maxburst;
> +	}
> +
> +	for (i = 0; i < nb; i++) {
> +		if (direction == DMA_MEM_TO_DEV) {
> +			src_addr = vecs[i].addr;
> +			dst_addr = fsl_chan->dma_dev_addr;
> +			soff = fsl_chan->cfg.dst_addr_width;
> +			doff = 0;

[Severity: High]
This is a pre-existing issue, but does hardcoding doff to 0 break multi-FIFO
devices?

In fsl_edma_prep_dma_cyclic(), doff is set to 4 if fsl_chan->is_multi_fifo is
true. Here it is hardcoded to 0. When fsl_edma_fill_tcd() calculates a
negative minor loop offset for multi-FIFO configurations:

drivers/dma/fsl-edma-common.c:fsl_edma_fill_tcd() {
    ...
    nbytes |= EDMA_V3_TCD_NBYTES_MLOFF(-burst);
    ...
}

Since doff is 0, the address does not advance during the minor loop, and the
negative offset continually rewinds the DMA address backwards in memory.
Can this corrupt adjacent memory or hardware registers?

> +		} else if (direction == DMA_DEV_TO_MEM) {
> +			src_addr = fsl_chan->dma_dev_addr;
> +			dst_addr = vecs[i].addr;
> +			soff = 0;
> +			doff = fsl_chan->cfg.src_addr_width;
> +		} else {
> +			/* DMA_DEV_TO_DEV */
> +			src_addr = fsl_chan->cfg.src_addr;
> +			dst_addr = fsl_chan->cfg.dst_addr;
> +			soff = 0;
> +			doff = 0;
> +		}
> +
> +		/*
> +		 * Choose the suitable burst length if dma_vec length is not
> +		 * multiple of burst length so that the whole transfer length is
> +		 * multiple of minor loop(burst length).
> +		 */
> +		if (vecs[i].len % nbytes) {
> +			u32 width = (direction == DMA_DEV_TO_MEM) ? doff : soff;
> +			u32 burst = (direction == DMA_DEV_TO_MEM) ?
> +						fsl_chan->cfg.src_maxburst :
> +						fsl_chan->cfg.dst_maxburst;
> +			int j;
> +
> +			for (j = burst; j > 1; j--) {
> +				if (!(vecs[i].len % (j * width))) {

[Severity: High]
This is a pre-existing issue, but can this code cause a divide-by-zero panic?

If direction is DMA_DEV_TO_DEV, both doff and soff are hardcoded to 0, which
makes width evaluate to 0. Then, the fallback loop calculates:
    vecs[i].len % (j * width)
resulting in a modulo by zero.

Additionally, if the vector length is perfectly aligned and the block is
skipped, the later calculation:
    iter = vecs[i].len / nbytes;
could divide by zero if nbytes evaluates to 0 due to uninitialized slave
address width configuration.

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260702-fsl-edma-dyn-sg-v5-0-16787185be49@bootlin.com?part=1

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v5 2/2] dmaengine: fsl-edma: Support dynamic scatter/gather chaining
  2026-07-02 12:45   ` sashiko-bot
@ 2026-07-02 13:17     ` Frank Li
  0 siblings, 0 replies; 6+ messages in thread
From: Frank Li @ 2026-07-02 13:17 UTC (permalink / raw)
  To: sashiko-reviews; +Cc: Benoît Monin, Frank.Li, imx, vkoul, dmaengine

On Thu, Jul 02, 2026 at 12:45:52PM +0000, sashiko-bot@kernel.org wrote:
> Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
> - [Critical] Double-execution of DMA descriptors due to state machine desync when interrupts coalesce.
> - [Critical] Channel deadlock due to dynamic descriptor linking bypassing `issue_pending()`.
> --
>
> commit a67c20eb8f20e15fe2cadb092322588069fe328e
> Author: Benoît Monin <benoit.monin@bootlin.com>
>
> dmaengine: fsl-edma: Support dynamic scatter/gather chaining
>
> Implement dynamic linking of scatter/gather transfers to enable
> chaining multiple DMA descriptors without stopping the channel.
> This avoids waiting for the channel to go idle if there is another
> transaction already issued.
>
> > diff --git a/drivers/dma/fsl-edma-common.c b/drivers/dma/fsl-edma-common.c
> > --- a/drivers/dma/fsl-edma-common.c
> > +++ b/drivers/dma/fsl-edma-common.c
> > @@ -58,7 +58,10 @@ void fsl_edma_tx_chan_handler(struct fsl_edma_chan *fsl_chan)
> >  		list_del(&fsl_chan->edesc->vdesc.node);
> >  		vchan_cookie_complete(&fsl_chan->edesc->vdesc);
> >  		fsl_chan->edesc = NULL;
> > -		fsl_chan->status = DMA_COMPLETE;
> > +		if (!(fsl_edma_drvflags(fsl_chan) & FSL_EDMA_DRV_SPLIT_REG) ||
> > +		    (edma_readl_chreg(fsl_chan, ch_csr) & EDMA_V3_CH_CSR_DONE)) {
> > +			fsl_chan->status = DMA_COMPLETE;
> > +		}
>
> [Severity: Critical]
> Does checking the channel-global DONE flag here cause chained descriptors
> to execute twice if interrupts coalesce?
>
> For example, if descriptors A and B are chained and the hardware finishes
> both before the interrupt handler runs, won't the handler complete A,
> see DONE is set, and mark the channel DMA_COMPLETE?
>
> If that happens, wouldn't fsl_edma_xfer_desc() then fetch B from the
> issued list and restart the hardware, executing B a second time?

This is problem, need go though issue list and compared current hardware
working pos. then move all finished item to compelete queue

A
B
C  <- hardware working this
D

Need move A and B to complete queue. Because eDMA have not write back
status to descript and no compelete queue, it needs some efforts to find
dma is working on which item in issue list

>
> [ ... ]
>
> > @@ -673,6 +676,68 @@ struct dma_async_tx_descriptor *fsl_edma_prep_dma_cyclic(
> > +static dma_cookie_t fsl_edma_tx_submit(struct dma_async_tx_descriptor *tx)
> > +{
> > +	struct virt_dma_desc *vd = container_of(tx, struct virt_dma_desc, tx);
> > +	struct fsl_edma_chan *fsl_chan = to_fsl_edma_chan(tx->chan);
> > +	struct fsl_edma_desc *fsl_desc = to_fsl_edma_desc(vd);
> > +	struct virt_dma_chan *vc = to_virt_chan(tx->chan);
> > +	dma_cookie_t cookie;
> > +
> > +	guard(spinlock_irqsave)(&fsl_chan->vchan.lock);
> > +
> > +	fsl_edma_link_sg(fsl_chan, fsl_desc);
> > +	cookie = dma_cookie_assign(tx);
> > +	list_move_tail(&vd->node, &vc->desc_submitted);
> > +
> > +	return cookie;
> > +}
>
> [Severity: Critical]
> Can this cause a channel deadlock by executing unissued descriptors?
>
> Since hardware linking happens in fsl_edma_tx_submit() rather than waiting
> for the issue_pending() callback, if the hardware completes the newly linked
> descriptor before the client calls issue_pending(), won't the interrupt
> handler fire but find edesc is NULL?

Yes, you can't link hardware queue in tx_submit

you can create hardware queue list,  N1->N2->N3.

at issue_pending() function, assume issue queue I1->I2->I3

link submit queue header N3 to issue queue tail I1.

Frank

>
> In fsl_edma_tx_chan_handler():
>
>     if (!fsl_chan->edesc) {
>         /* terminate_all called before */
>         spin_unlock(&fsl_chan->vchan.lock);
>         return;
>     }
>
> Wouldn't this drop the completion and leave the channel stuck when
> issue_pending() is eventually called, since the channel status is still left
> as DMA_IN_PROGRESS?
>
> --
> Sashiko AI review · https://sashiko.dev/#/patchset/20260702-fsl-edma-dyn-sg-v5-0-16787185be49@bootlin.com?part=2

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2026-07-02 13:17 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-07-02 12:34 [PATCH v5 0/2] dmaengine: fsl-edma: Scatter/gather improvements Benoît Monin
2026-07-02 12:34 ` [PATCH v5 1/2] dmaengine: fsl-edma: Implement device_prep_peripheral_dma_vec Benoît Monin
2026-07-02 12:47   ` sashiko-bot
2026-07-02 12:34 ` [PATCH v5 2/2] dmaengine: fsl-edma: Support dynamic scatter/gather chaining Benoît Monin
2026-07-02 12:45   ` sashiko-bot
2026-07-02 13:17     ` Frank Li

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox