* [PATCH v3 0/2] dmaengine: fsl-edma: Scatter/gather improvements
@ 2026-05-11 13:57 Benoît Monin
2026-05-11 13:57 ` [PATCH v3 1/2] dmaengine: fsl-edma: Implement device_prep_peripheral_dma_vec Benoît Monin
2026-05-11 13:57 ` [PATCH v3 2/2] dmaengine: fsl-edma: Support dynamic scatter/gather chaining Benoît Monin
0 siblings, 2 replies; 7+ messages in thread
From: Benoît Monin @ 2026-05-11 13:57 UTC (permalink / raw)
To: Frank Li, Vinod Koul
Cc: Thomas Petazzoni, Frank Li, imx, dmaengine, linux-kernel,
Benoît Monin
This series adds support for scatter/gather DMA transfers via dma_vec
and dynamic descriptor chaining to the Freescale eDMA controller driver.
The first patch implements the .device_prep_peripheral_dma_vec() callback,
enabling the DMA engine to accept an array of dma_vec structures. This
callback supports both regular and cyclic transfer modes.
The second patch introduces dynamic scatter/gather chaining, which allows
multiple DMA descriptors to be linked together without stopping the channel.
This optimization eliminates idle periods when back-to-back transfers are
submitted, improving throughput and reducing latency. The implementation
carefully preserves cyclic transfer semantics and respects hardware
constraints on platforms with split register layouts.
I tested this series on the i.MX93. The dynamic scatter/gather chaining
should also work on other eDMA controllers with a split register layout.
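For context, struct dma_vec is just an (address, length) pair. The sketch
below is a minimal host-side model of how a client could build the vector
array passed to the new callback; the types are simplified stand-ins for
illustration, not the kernel definitions, and build_dma_vecs() is a
hypothetical helper, not part of this series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel types, for illustration only. */
typedef uint64_t dma_addr_t;

struct dma_vec {
	dma_addr_t addr;
	size_t len;
};

/* Split one contiguous DMA-mapped buffer into fixed-size dma_vec
 * entries, as a client of device_prep_peripheral_dma_vec() might. */
static size_t build_dma_vecs(struct dma_vec *vecs, size_t max_vecs,
			     dma_addr_t base, size_t total, size_t chunk)
{
	size_t n = 0;

	while (total && n < max_vecs) {
		size_t len = total < chunk ? total : chunk;

		vecs[n].addr = base;
		vecs[n].len = len;
		base += len;
		total -= len;
		n++;
	}
	return n;
}
```

A 10 KiB buffer split into 4 KiB chunks yields three entries, the last
one short, which is exactly the shape of transfer the series targets.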
Signed-off-by: Benoît Monin <benoit.monin@bootlin.com>
---
Changes in v3:
- Fix formatting errors reported by Frank Li.
- Add fsl_edma_tx_submit() to link the DMA transactions
when they are submitted, not when they are prepared.
- Link to v2: https://patch.msgid.link/20260506-fsl-edma-dyn-sg-v2-0-66439cdd414e@bootlin.com
Changes in v2:
- Drop the RFC prefix, as asked by Frank Li
- No code change
- Link to v1: https://patch.msgid.link/20260430-fsl-edma-dyn-sg-v1-0-4e0ecbe2df66@bootlin.com
To: Frank Li <Frank.Li@nxp.com>
To: Vinod Koul <vkoul@kernel.org>
Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Cc: Frank Li <Frank.Li@kernel.org>
Cc: imx@lists.linux.dev
Cc: dmaengine@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
Benoît Monin (2):
dmaengine: fsl-edma: Implement device_prep_peripheral_dma_vec
dmaengine: fsl-edma: Support dynamic scatter/gather chaining
drivers/dma/fsl-edma-common.c | 197 ++++++++++++++++++++++++++++++++++++++++--
drivers/dma/fsl-edma-common.h | 4 +
drivers/dma/fsl-edma-main.c | 2 +
drivers/dma/fsl-edma-trace.h | 5 ++
4 files changed, 202 insertions(+), 6 deletions(-)
---
base-commit: 254f49634ee16a731174d2ae34bc50bd5f45e731
change-id: 20260428-fsl-edma-dyn-sg-960731e37da2
Best regards,
--
Benoît Monin, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH v3 1/2] dmaengine: fsl-edma: Implement device_prep_peripheral_dma_vec
2026-05-11 13:57 [PATCH v3 0/2] dmaengine: fsl-edma: Scatter/gather improvements Benoît Monin
@ 2026-05-11 13:57 ` Benoît Monin
2026-05-11 19:13 ` Frank Li
2026-05-12 5:06 ` sashiko-bot
2026-05-11 13:57 ` [PATCH v3 2/2] dmaengine: fsl-edma: Support dynamic scatter/gather chaining Benoît Monin
1 sibling, 2 replies; 7+ messages in thread
From: Benoît Monin @ 2026-05-11 13:57 UTC (permalink / raw)
To: Frank Li, Vinod Koul
Cc: Thomas Petazzoni, Frank Li, imx, dmaengine, linux-kernel,
Benoît Monin
Add an implementation of the .device_prep_peripheral_dma_vec() callback to
set up a scatter/gather DMA transfer from an array of dma_vec structures.
Set up a cyclic transfer if the DMA_PREP_REPEAT flag is set.
Signed-off-by: Benoît Monin <benoit.monin@bootlin.com>
---
drivers/dma/fsl-edma-common.c | 109 ++++++++++++++++++++++++++++++++++++++++++
drivers/dma/fsl-edma-common.h | 4 ++
drivers/dma/fsl-edma-main.c | 2 +
3 files changed, 115 insertions(+)
diff --git a/drivers/dma/fsl-edma-common.c b/drivers/dma/fsl-edma-common.c
index bb7531c456df..c10190164926 100644
--- a/drivers/dma/fsl-edma-common.c
+++ b/drivers/dma/fsl-edma-common.c
@@ -673,6 +673,115 @@ struct dma_async_tx_descriptor *fsl_edma_prep_dma_cyclic(
return vchan_tx_prep(&fsl_chan->vchan, &fsl_desc->vdesc, flags);
}
+struct dma_async_tx_descriptor *
+fsl_edma_prep_peripheral_dma_vec(struct dma_chan *chan, const struct dma_vec *vecs,
+ size_t nb, enum dma_transfer_direction direction,
+ unsigned long flags)
+{
+ struct fsl_edma_chan *fsl_chan = to_fsl_edma_chan(chan);
+ dma_addr_t src_addr, dst_addr, last_sg;
+ struct fsl_edma_desc *fsl_desc;
+ u16 soff, doff, iter;
+ u32 nbytes;
+ int i;
+
+ if (!is_slave_direction(direction))
+ return NULL;
+
+ if (!fsl_edma_prep_slave_dma(fsl_chan, direction))
+ return NULL;
+
+ fsl_desc = fsl_edma_alloc_desc(fsl_chan, nb);
+ if (!fsl_desc)
+ return NULL;
+ fsl_desc->iscyclic = flags & DMA_PREP_REPEAT;
+ fsl_desc->dirn = direction;
+
+ if (direction == DMA_MEM_TO_DEV) {
+ if (!fsl_chan->cfg.src_addr_width)
+ fsl_chan->cfg.src_addr_width = fsl_chan->cfg.dst_addr_width;
+ fsl_chan->attr =
+ fsl_edma_get_tcd_attr(fsl_chan->cfg.src_addr_width,
+ fsl_chan->cfg.dst_addr_width);
+ nbytes = fsl_chan->cfg.dst_addr_width * fsl_chan->cfg.dst_maxburst;
+ } else {
+ if (!fsl_chan->cfg.dst_addr_width)
+ fsl_chan->cfg.dst_addr_width = fsl_chan->cfg.src_addr_width;
+ fsl_chan->attr =
+ fsl_edma_get_tcd_attr(fsl_chan->cfg.src_addr_width,
+ fsl_chan->cfg.dst_addr_width);
+ nbytes = fsl_chan->cfg.src_addr_width * fsl_chan->cfg.src_maxburst;
+ }
+
+ for (i = 0; i < nb; i++) {
+ if (direction == DMA_MEM_TO_DEV) {
+ src_addr = vecs[i].addr;
+ dst_addr = fsl_chan->dma_dev_addr;
+ soff = fsl_chan->cfg.dst_addr_width;
+ doff = 0;
+ } else if (direction == DMA_DEV_TO_MEM) {
+ src_addr = fsl_chan->dma_dev_addr;
+ dst_addr = vecs[i].addr;
+ soff = 0;
+ doff = fsl_chan->cfg.src_addr_width;
+ } else {
+ /* DMA_DEV_TO_DEV */
+ src_addr = fsl_chan->cfg.src_addr;
+ dst_addr = fsl_chan->cfg.dst_addr;
+ soff = 0;
+ doff = 0;
+ }
+
+ /*
+ * Choose the suitable burst length if dma_vec length is not
+ * multiple of burst length so that the whole transfer length is
+ * multiple of minor loop(burst length).
+ */
+ if (vecs[i].len % nbytes) {
+ u32 width = (direction == DMA_DEV_TO_MEM) ? doff : soff;
+ u32 burst = (direction == DMA_DEV_TO_MEM) ?
+ fsl_chan->cfg.src_maxburst :
+ fsl_chan->cfg.dst_maxburst;
+ int j;
+
+ for (j = burst; j > 1; j--) {
+ if (!(vecs[i].len % (j * width))) {
+ nbytes = j * width;
+ break;
+ }
+ }
+ /* Set burst size as 1 if there's no suitable one */
+ if (j == 1)
+ nbytes = width;
+ }
+
+ iter = vecs[i].len / nbytes;
+ if (i < nb - 1) {
+ last_sg = fsl_desc->tcd[(i + 1)].ptcd;
+ fsl_edma_fill_tcd(fsl_chan, fsl_desc->tcd[i].vtcd, src_addr,
+ dst_addr, fsl_chan->attr, soff,
+ nbytes, 0, iter, iter, doff, last_sg,
+ false, false, true);
+ } else {
+ if (fsl_desc->iscyclic) {
+ last_sg = fsl_desc->tcd[0].ptcd;
+ fsl_edma_fill_tcd(fsl_chan, fsl_desc->tcd[i].vtcd, src_addr,
+ dst_addr, fsl_chan->attr, soff,
+ nbytes, 0, iter, iter, doff, last_sg,
+ true, false, true);
+ } else {
+ last_sg = 0;
+ fsl_edma_fill_tcd(fsl_chan, fsl_desc->tcd[i].vtcd, src_addr,
+ dst_addr, fsl_chan->attr, soff,
+ nbytes, 0, iter, iter, doff, last_sg,
+ true, true, false);
+ }
+ }
+ }
+
+ return vchan_tx_prep(&fsl_chan->vchan, &fsl_desc->vdesc, flags);
+}
+
struct dma_async_tx_descriptor *fsl_edma_prep_slave_sg(
struct dma_chan *chan, struct scatterlist *sgl,
unsigned int sg_len, enum dma_transfer_direction direction,
diff --git a/drivers/dma/fsl-edma-common.h b/drivers/dma/fsl-edma-common.h
index 205a96489094..0d028048701d 100644
--- a/drivers/dma/fsl-edma-common.h
+++ b/drivers/dma/fsl-edma-common.h
@@ -496,6 +496,10 @@ struct dma_async_tx_descriptor *fsl_edma_prep_dma_cyclic(
struct dma_chan *chan, dma_addr_t dma_addr, size_t buf_len,
size_t period_len, enum dma_transfer_direction direction,
unsigned long flags);
+struct dma_async_tx_descriptor *fsl_edma_prep_peripheral_dma_vec(
+ struct dma_chan *chan, const struct dma_vec *vecs,
+ size_t nb, enum dma_transfer_direction direction,
+ unsigned long flags);
struct dma_async_tx_descriptor *fsl_edma_prep_slave_sg(
struct dma_chan *chan, struct scatterlist *sgl,
unsigned int sg_len, enum dma_transfer_direction direction,
diff --git a/drivers/dma/fsl-edma-main.c b/drivers/dma/fsl-edma-main.c
index 36155ab1602a..6693b4270a1a 100644
--- a/drivers/dma/fsl-edma-main.c
+++ b/drivers/dma/fsl-edma-main.c
@@ -841,6 +841,8 @@ static int fsl_edma_probe(struct platform_device *pdev)
fsl_edma->dma_dev.device_free_chan_resources
= fsl_edma_free_chan_resources;
fsl_edma->dma_dev.device_tx_status = fsl_edma_tx_status;
+ fsl_edma->dma_dev.device_prep_peripheral_dma_vec
+ = fsl_edma_prep_peripheral_dma_vec;
fsl_edma->dma_dev.device_prep_slave_sg = fsl_edma_prep_slave_sg;
fsl_edma->dma_dev.device_prep_dma_cyclic = fsl_edma_prep_dma_cyclic;
fsl_edma->dma_dev.device_prep_dma_memcpy = fsl_edma_prep_memcpy;
--
2.54.0
* [PATCH v3 2/2] dmaengine: fsl-edma: Support dynamic scatter/gather chaining
2026-05-11 13:57 [PATCH v3 0/2] dmaengine: fsl-edma: Scatter/gather improvements Benoît Monin
2026-05-11 13:57 ` [PATCH v3 1/2] dmaengine: fsl-edma: Implement device_prep_peripheral_dma_vec Benoît Monin
@ 2026-05-11 13:57 ` Benoît Monin
2026-05-11 19:20 ` Frank Li
2026-05-12 5:49 ` sashiko-bot
1 sibling, 2 replies; 7+ messages in thread
From: Benoît Monin @ 2026-05-11 13:57 UTC (permalink / raw)
To: Frank Li, Vinod Koul
Cc: Thomas Petazzoni, Frank Li, imx, dmaengine, linux-kernel,
Benoît Monin
Implement dynamic linking of scatter/gather transfers to enable
chaining multiple DMA descriptors without stopping the channel.
This avoids waiting for the channel to go idle if there is another
transaction already issued.
Add fsl_edma_link_sg() to dynamically link the last TCD of a previously
submitted descriptor to the first TCD of a new descriptor by setting
the scatter/gather address and the E_SG flag, and keeping the channel
active by clearing the DREQ bit.
Linking is done when the transaction is submitted by fsl_edma_tx_submit().
To do so, the .tx_submit() callback is overridden for non-cyclic
transactions prepared by fsl_edma_prep_peripheral_dma_vec() and
fsl_edma_prep_slave_sg(). This ensures that transactions are linked
in the order they are submitted.
Update fsl_edma_xfer_desc() to avoid re-initializing the hardware when a
transfer is already in progress, allowing seamless chaining of descriptors.
Modify the transfer completion handler to check the DONE flag in the
channel CSR before marking the transfer complete. Since this flag is
only available on SoCs with the split register layout, transactions are
only linked for DMA controllers flagged with FSL_EDMA_DRV_SPLIT_REG.
Add a trace event for scatter/gather linking operations.
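The CSR rewrite applied to the previous descriptor's last TCD can be
sketched in isolation as follows; the bit positions used here are
assumptions for illustration and are not taken from the driver headers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed TCD CSR bit positions, for illustration only. */
#define TCD_CSR_DREQ	(1u << 3)	/* disable request at major loop end */
#define TCD_CSR_E_SG	(1u << 4)	/* scatter/gather on major loop end */

/*
 * Model of the CSR update in fsl_edma_link_sg(): clear DREQ so the
 * channel stays active past the major loop, and set E_SG so the next
 * TCD is loaded from dlast_sga.
 */
static uint16_t link_csr(uint16_t csr)
{
	csr &= (uint16_t)~TCD_CSR_DREQ;
	csr |= TCD_CSR_E_SG;
	return csr;
}
```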
Signed-off-by: Benoît Monin <benoit.monin@bootlin.com>
---
drivers/dma/fsl-edma-common.c | 90 +++++++++++++++++++++++++++++++++++++++----
drivers/dma/fsl-edma-trace.h | 5 +++
2 files changed, 88 insertions(+), 7 deletions(-)
diff --git a/drivers/dma/fsl-edma-common.c b/drivers/dma/fsl-edma-common.c
index c10190164926..b83d1b91dca2 100644
--- a/drivers/dma/fsl-edma-common.c
+++ b/drivers/dma/fsl-edma-common.c
@@ -58,7 +58,10 @@ void fsl_edma_tx_chan_handler(struct fsl_edma_chan *fsl_chan)
list_del(&fsl_chan->edesc->vdesc.node);
vchan_cookie_complete(&fsl_chan->edesc->vdesc);
fsl_chan->edesc = NULL;
- fsl_chan->status = DMA_COMPLETE;
+ if (!(fsl_edma_drvflags(fsl_chan) & FSL_EDMA_DRV_SPLIT_REG) ||
+ (edma_readl_chreg(fsl_chan, ch_csr) & EDMA_V3_CH_CSR_DONE)) {
+ fsl_chan->status = DMA_COMPLETE;
+ }
} else {
vchan_cyclic_callback(&fsl_chan->edesc->vdesc);
}
@@ -673,6 +676,68 @@ struct dma_async_tx_descriptor *fsl_edma_prep_dma_cyclic(
return vchan_tx_prep(&fsl_chan->vchan, &fsl_desc->vdesc, flags);
}
+static void fsl_edma_link_sg(struct fsl_edma_chan *fsl_chan, struct fsl_edma_desc *fsl_desc)
+{
+ u32 flags = fsl_edma_drvflags(fsl_chan);
+ struct fsl_edma_hw_tcd *last_tcd;
+ struct fsl_edma_desc *prev_desc;
+ struct virt_dma_desc *vdesc;
+ u16 csr;
+
+ lockdep_assert_held(&fsl_chan->vchan.lock);
+
+ if (!(flags & FSL_EDMA_DRV_SPLIT_REG))
+ return;
+
+ vdesc = list_last_entry_or_null(&fsl_chan->vchan.desc_issued,
+ struct virt_dma_desc, node);
+ if (!vdesc)
+ vdesc = list_last_entry_or_null(&fsl_chan->vchan.desc_submitted,
+ struct virt_dma_desc, node);
+ if (!vdesc)
+ return;
+
+ prev_desc = to_fsl_edma_desc(vdesc);
+ last_tcd = prev_desc->tcd[prev_desc->n_tcds - 1].vtcd;
+
+ csr = fsl_edma_get_tcd_to_cpu(fsl_chan, last_tcd, csr);
+ if (!(csr & EDMA_TCD_CSR_D_REQ))
+ return;
+
+ fsl_edma_set_tcd_to_le(fsl_chan, last_tcd, fsl_desc->tcd[0].ptcd, dlast_sga);
+
+ csr &= ~EDMA_TCD_CSR_D_REQ;
+ csr |= EDMA_TCD_CSR_E_SG;
+ fsl_edma_set_tcd_to_le(fsl_chan, last_tcd, csr, csr);
+
+ if (prev_desc == fsl_chan->edesc && prev_desc->n_tcds == 1) {
+ if (flags & FSL_EDMA_DRV_CLEAR_DONE_E_SG)
+ edma_writel_chreg(fsl_chan, edma_readl_chreg(fsl_chan, ch_csr), ch_csr);
+
+ edma_cp_tcd_to_reg(fsl_chan, last_tcd, dlast_sga);
+ edma_cp_tcd_to_reg(fsl_chan, last_tcd, csr);
+ }
+
+ trace_edma_link_sg(fsl_chan, last_tcd);
+}
+
+static dma_cookie_t fsl_edma_tx_submit(struct dma_async_tx_descriptor *tx)
+{
+ struct virt_dma_desc *vd = container_of(tx, struct virt_dma_desc, tx);
+ struct fsl_edma_chan *fsl_chan = to_fsl_edma_chan(tx->chan);
+ struct fsl_edma_desc *fsl_desc = to_fsl_edma_desc(vd);
+ struct virt_dma_chan *vc = to_virt_chan(tx->chan);
+ dma_cookie_t cookie;
+
+ guard(spinlock_irqsave)(&fsl_chan->vchan.lock);
+
+ fsl_edma_link_sg(fsl_chan, fsl_desc);
+ cookie = dma_cookie_assign(tx);
+ list_move_tail(&vd->node, &vc->desc_submitted);
+
+ return cookie;
+}
+
struct dma_async_tx_descriptor *
fsl_edma_prep_peripheral_dma_vec(struct dma_chan *chan, const struct dma_vec *vecs,
size_t nb, enum dma_transfer_direction direction,
@@ -680,6 +745,7 @@ fsl_edma_prep_peripheral_dma_vec(struct dma_chan *chan, const struct dma_vec *ve
{
struct fsl_edma_chan *fsl_chan = to_fsl_edma_chan(chan);
dma_addr_t src_addr, dst_addr, last_sg;
+ struct dma_async_tx_descriptor *tx;
struct fsl_edma_desc *fsl_desc;
u16 soff, doff, iter;
u32 nbytes;
@@ -779,7 +845,10 @@ fsl_edma_prep_peripheral_dma_vec(struct dma_chan *chan, const struct dma_vec *ve
}
}
- return vchan_tx_prep(&fsl_chan->vchan, &fsl_desc->vdesc, flags);
+ tx = vchan_tx_prep(&fsl_chan->vchan, &fsl_desc->vdesc, flags);
+ if (!fsl_desc->iscyclic)
+ tx->tx_submit = fsl_edma_tx_submit;
+ return tx;
}
struct dma_async_tx_descriptor *fsl_edma_prep_slave_sg(
@@ -788,9 +857,10 @@ struct dma_async_tx_descriptor *fsl_edma_prep_slave_sg(
unsigned long flags, void *context)
{
struct fsl_edma_chan *fsl_chan = to_fsl_edma_chan(chan);
+ dma_addr_t src_addr, dst_addr, last_sg;
+ struct dma_async_tx_descriptor *tx;
struct fsl_edma_desc *fsl_desc;
struct scatterlist *sg;
- dma_addr_t src_addr, dst_addr, last_sg;
u16 soff, doff, iter;
u32 nbytes;
int i;
@@ -882,7 +952,10 @@ struct dma_async_tx_descriptor *fsl_edma_prep_slave_sg(
}
}
- return vchan_tx_prep(&fsl_chan->vchan, &fsl_desc->vdesc, flags);
+ tx = vchan_tx_prep(&fsl_chan->vchan, &fsl_desc->vdesc, flags);
+ tx->tx_submit = fsl_edma_tx_submit;
+
+ return tx;
}
struct dma_async_tx_descriptor *fsl_edma_prep_memcpy(struct dma_chan *chan,
@@ -924,9 +997,12 @@ void fsl_edma_xfer_desc(struct fsl_edma_chan *fsl_chan)
if (!vdesc)
return;
fsl_chan->edesc = to_fsl_edma_desc(vdesc);
- fsl_edma_set_tcd_regs(fsl_chan, fsl_chan->edesc->tcd[0].vtcd);
- fsl_edma_enable_request(fsl_chan);
- fsl_chan->status = DMA_IN_PROGRESS;
+
+ if (fsl_chan->status != DMA_IN_PROGRESS) {
+ fsl_edma_set_tcd_regs(fsl_chan, fsl_chan->edesc->tcd[0].vtcd);
+ fsl_edma_enable_request(fsl_chan);
+ fsl_chan->status = DMA_IN_PROGRESS;
+ }
}
void fsl_edma_issue_pending(struct dma_chan *chan)
diff --git a/drivers/dma/fsl-edma-trace.h b/drivers/dma/fsl-edma-trace.h
index d3541301a247..ac319d2dbb90 100644
--- a/drivers/dma/fsl-edma-trace.h
+++ b/drivers/dma/fsl-edma-trace.h
@@ -119,6 +119,11 @@ DEFINE_EVENT(edma_log_tcd, edma_fill_tcd,
TP_ARGS(chan, tcd)
);
+DEFINE_EVENT(edma_log_tcd, edma_link_sg,
+ TP_PROTO(struct fsl_edma_chan *chan, void *tcd),
+ TP_ARGS(chan, tcd)
+);
+
#endif
/* this part must be outside header guard */
--
2.54.0
* Re: [PATCH v3 1/2] dmaengine: fsl-edma: Implement device_prep_peripheral_dma_vec
2026-05-11 13:57 ` [PATCH v3 1/2] dmaengine: fsl-edma: Implement device_prep_peripheral_dma_vec Benoît Monin
@ 2026-05-11 19:13 ` Frank Li
2026-05-12 5:06 ` sashiko-bot
1 sibling, 0 replies; 7+ messages in thread
From: Frank Li @ 2026-05-11 19:13 UTC (permalink / raw)
To: Benoît Monin
Cc: Vinod Koul, Thomas Petazzoni, Frank Li, imx, dmaengine,
linux-kernel
On Mon, May 11, 2026 at 03:57:19PM +0200, Benoît Monin wrote:
> Add an implementation of the .device_prep_peripheral_dma_vec() callback to
> set up a scatter/gather DMA transfer from an array of dma_vec structures.
> Set up a cyclic transfer if the DMA_PREP_REPEAT flag is set.
>
> Signed-off-by: Benoît Monin <benoit.monin@bootlin.com>
> ---
Reviewed-by: Frank Li <Frank.Li@nxp.com>
> drivers/dma/fsl-edma-common.c | 109 ++++++++++++++++++++++++++++++++++++++++++
> drivers/dma/fsl-edma-common.h | 4 ++
> drivers/dma/fsl-edma-main.c | 2 +
> 3 files changed, 115 insertions(+)
>
> diff --git a/drivers/dma/fsl-edma-common.c b/drivers/dma/fsl-edma-common.c
> index bb7531c456df..c10190164926 100644
> --- a/drivers/dma/fsl-edma-common.c
> +++ b/drivers/dma/fsl-edma-common.c
> @@ -673,6 +673,115 @@ struct dma_async_tx_descriptor *fsl_edma_prep_dma_cyclic(
> return vchan_tx_prep(&fsl_chan->vchan, &fsl_desc->vdesc, flags);
> }
>
> +struct dma_async_tx_descriptor *
> +fsl_edma_prep_peripheral_dma_vec(struct dma_chan *chan, const struct dma_vec *vecs,
> + size_t nb, enum dma_transfer_direction direction,
> + unsigned long flags)
> +{
> + struct fsl_edma_chan *fsl_chan = to_fsl_edma_chan(chan);
> + dma_addr_t src_addr, dst_addr, last_sg;
> + struct fsl_edma_desc *fsl_desc;
> + u16 soff, doff, iter;
> + u32 nbytes;
> + int i;
> +
> + if (!is_slave_direction(direction))
> + return NULL;
> +
> + if (!fsl_edma_prep_slave_dma(fsl_chan, direction))
> + return NULL;
> +
> + fsl_desc = fsl_edma_alloc_desc(fsl_chan, nb);
> + if (!fsl_desc)
> + return NULL;
> + fsl_desc->iscyclic = flags & DMA_PREP_REPEAT;
> + fsl_desc->dirn = direction;
> +
> + if (direction == DMA_MEM_TO_DEV) {
> + if (!fsl_chan->cfg.src_addr_width)
> + fsl_chan->cfg.src_addr_width = fsl_chan->cfg.dst_addr_width;
> + fsl_chan->attr =
> + fsl_edma_get_tcd_attr(fsl_chan->cfg.src_addr_width,
> + fsl_chan->cfg.dst_addr_width);
> + nbytes = fsl_chan->cfg.dst_addr_width * fsl_chan->cfg.dst_maxburst;
> + } else {
> + if (!fsl_chan->cfg.dst_addr_width)
> + fsl_chan->cfg.dst_addr_width = fsl_chan->cfg.src_addr_width;
> + fsl_chan->attr =
> + fsl_edma_get_tcd_attr(fsl_chan->cfg.src_addr_width,
> + fsl_chan->cfg.dst_addr_width);
> + nbytes = fsl_chan->cfg.src_addr_width * fsl_chan->cfg.src_maxburst;
> + }
> +
> + for (i = 0; i < nb; i++) {
> + if (direction == DMA_MEM_TO_DEV) {
> + src_addr = vecs[i].addr;
> + dst_addr = fsl_chan->dma_dev_addr;
> + soff = fsl_chan->cfg.dst_addr_width;
> + doff = 0;
> + } else if (direction == DMA_DEV_TO_MEM) {
> + src_addr = fsl_chan->dma_dev_addr;
> + dst_addr = vecs[i].addr;
> + soff = 0;
> + doff = fsl_chan->cfg.src_addr_width;
> + } else {
> + /* DMA_DEV_TO_DEV */
> + src_addr = fsl_chan->cfg.src_addr;
> + dst_addr = fsl_chan->cfg.dst_addr;
> + soff = 0;
> + doff = 0;
> + }
> +
> + /*
> + * Choose the suitable burst length if dma_vec length is not
> + * multiple of burst length so that the whole transfer length is
> + * multiple of minor loop(burst length).
> + */
> + if (vecs[i].len % nbytes) {
> + u32 width = (direction == DMA_DEV_TO_MEM) ? doff : soff;
> + u32 burst = (direction == DMA_DEV_TO_MEM) ?
> + fsl_chan->cfg.src_maxburst :
> + fsl_chan->cfg.dst_maxburst;
> + int j;
> +
> + for (j = burst; j > 1; j--) {
> + if (!(vecs[i].len % (j * width))) {
> + nbytes = j * width;
> + break;
> + }
> + }
> + /* Set burst size as 1 if there's no suitable one */
> + if (j == 1)
> + nbytes = width;
> + }
> +
> + iter = vecs[i].len / nbytes;
> + if (i < nb - 1) {
> + last_sg = fsl_desc->tcd[(i + 1)].ptcd;
> + fsl_edma_fill_tcd(fsl_chan, fsl_desc->tcd[i].vtcd, src_addr,
> + dst_addr, fsl_chan->attr, soff,
> + nbytes, 0, iter, iter, doff, last_sg,
> + false, false, true);
> + } else {
> + if (fsl_desc->iscyclic) {
> + last_sg = fsl_desc->tcd[0].ptcd;
> + fsl_edma_fill_tcd(fsl_chan, fsl_desc->tcd[i].vtcd, src_addr,
> + dst_addr, fsl_chan->attr, soff,
> + nbytes, 0, iter, iter, doff, last_sg,
> + true, false, true);
> + } else {
> + last_sg = 0;
> + fsl_edma_fill_tcd(fsl_chan, fsl_desc->tcd[i].vtcd, src_addr,
> + dst_addr, fsl_chan->attr, soff,
> + nbytes, 0, iter, iter, doff, last_sg,
> + true, true, false);
> + }
> + }
> + }
> +
> + return vchan_tx_prep(&fsl_chan->vchan, &fsl_desc->vdesc, flags);
> +}
> +
> struct dma_async_tx_descriptor *fsl_edma_prep_slave_sg(
> struct dma_chan *chan, struct scatterlist *sgl,
> unsigned int sg_len, enum dma_transfer_direction direction,
> diff --git a/drivers/dma/fsl-edma-common.h b/drivers/dma/fsl-edma-common.h
> index 205a96489094..0d028048701d 100644
> --- a/drivers/dma/fsl-edma-common.h
> +++ b/drivers/dma/fsl-edma-common.h
> @@ -496,6 +496,10 @@ struct dma_async_tx_descriptor *fsl_edma_prep_dma_cyclic(
> struct dma_chan *chan, dma_addr_t dma_addr, size_t buf_len,
> size_t period_len, enum dma_transfer_direction direction,
> unsigned long flags);
> +struct dma_async_tx_descriptor *fsl_edma_prep_peripheral_dma_vec(
> + struct dma_chan *chan, const struct dma_vec *vecs,
> + size_t nb, enum dma_transfer_direction direction,
> + unsigned long flags);
> struct dma_async_tx_descriptor *fsl_edma_prep_slave_sg(
> struct dma_chan *chan, struct scatterlist *sgl,
> unsigned int sg_len, enum dma_transfer_direction direction,
> diff --git a/drivers/dma/fsl-edma-main.c b/drivers/dma/fsl-edma-main.c
> index 36155ab1602a..6693b4270a1a 100644
> --- a/drivers/dma/fsl-edma-main.c
> +++ b/drivers/dma/fsl-edma-main.c
> @@ -841,6 +841,8 @@ static int fsl_edma_probe(struct platform_device *pdev)
> fsl_edma->dma_dev.device_free_chan_resources
> = fsl_edma_free_chan_resources;
> fsl_edma->dma_dev.device_tx_status = fsl_edma_tx_status;
> + fsl_edma->dma_dev.device_prep_peripheral_dma_vec
> + = fsl_edma_prep_peripheral_dma_vec;
> fsl_edma->dma_dev.device_prep_slave_sg = fsl_edma_prep_slave_sg;
> fsl_edma->dma_dev.device_prep_dma_cyclic = fsl_edma_prep_dma_cyclic;
> fsl_edma->dma_dev.device_prep_dma_memcpy = fsl_edma_prep_memcpy;
>
> --
> 2.54.0
>
* Re: [PATCH v3 2/2] dmaengine: fsl-edma: Support dynamic scatter/gather chaining
2026-05-11 13:57 ` [PATCH v3 2/2] dmaengine: fsl-edma: Support dynamic scatter/gather chaining Benoît Monin
@ 2026-05-11 19:20 ` Frank Li
2026-05-12 5:49 ` sashiko-bot
1 sibling, 0 replies; 7+ messages in thread
From: Frank Li @ 2026-05-11 19:20 UTC (permalink / raw)
To: Benoît Monin
Cc: Vinod Koul, Thomas Petazzoni, Frank Li, imx, dmaengine,
linux-kernel
On Mon, May 11, 2026 at 03:57:20PM +0200, Benoît Monin wrote:
> Implement dynamic linking of scatter/gather transfers to enable
> chaining multiple DMA descriptors without stopping the channel.
> This avoids waiting for the channel to go idle if there is another
> transaction already issued.
>
> Add fsl_edma_link_sg() to dynamically link the last TCD of a previously
> submitted descriptor to the first TCD of a new descriptor by setting
> the scatter/gather address and the E_SG flag, and keeping the channel
> active by clearing the DREQ bit.
>
> Linking is done when the transaction is submitted by fsl_edma_tx_submit().
> To do so, the .tx_submit() callback is overridden for non-cyclic
> transactions prepared by fsl_edma_prep_peripheral_dma_vec() and
> fsl_edma_prep_slave_sg(). This ensures that transactions are linked
> in the order they are submitted.
>
> Update fsl_edma_xfer_desc() to avoid re-initializing the hardware when a
> transfer is already in progress, allowing seamless chaining of descriptors.
>
> Modify the transfer completion handler to check the DONE flag in the
> channel CSR before marking the transfer complete. Since this flag is
> only available on SoCs with the split register layout, transactions are
> only linked for DMA controllers flagged with FSL_EDMA_DRV_SPLIT_REG.
>
> Add a trace event for scatter/gather linking operations.
>
> Signed-off-by: Benoît Monin <benoit.monin@bootlin.com>
> ---
> drivers/dma/fsl-edma-common.c | 90 +++++++++++++++++++++++++++++++++++++++----
> drivers/dma/fsl-edma-trace.h | 5 +++
> 2 files changed, 88 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/dma/fsl-edma-common.c b/drivers/dma/fsl-edma-common.c
> index c10190164926..b83d1b91dca2 100644
> --- a/drivers/dma/fsl-edma-common.c
> +++ b/drivers/dma/fsl-edma-common.c
> @@ -58,7 +58,10 @@ void fsl_edma_tx_chan_handler(struct fsl_edma_chan *fsl_chan)
> list_del(&fsl_chan->edesc->vdesc.node);
> vchan_cookie_complete(&fsl_chan->edesc->vdesc);
> fsl_chan->edesc = NULL;
> - fsl_chan->status = DMA_COMPLETE;
> + if (!(fsl_edma_drvflags(fsl_chan) & FSL_EDMA_DRV_SPLIT_REG) ||
> + (edma_readl_chreg(fsl_chan, ch_csr) & EDMA_V3_CH_CSR_DONE)) {
> + fsl_chan->status = DMA_COMPLETE;
Does fsl_edma_desc_residue() need to be updated as well?
> + }
> } else {
> vchan_cyclic_callback(&fsl_chan->edesc->vdesc);
> }
> @@ -673,6 +676,68 @@ struct dma_async_tx_descriptor *fsl_edma_prep_dma_cyclic(
> return vchan_tx_prep(&fsl_chan->vchan, &fsl_desc->vdesc, flags);
> }
>
> +static void fsl_edma_link_sg(struct fsl_edma_chan *fsl_chan, struct fsl_edma_desc *fsl_desc)
> +{
> + u32 flags = fsl_edma_drvflags(fsl_chan);
> + struct fsl_edma_hw_tcd *last_tcd;
> + struct fsl_edma_desc *prev_desc;
> + struct virt_dma_desc *vdesc;
> + u16 csr;
> +
> + lockdep_assert_held(&fsl_chan->vchan.lock);
> +
> + if (!(flags & FSL_EDMA_DRV_SPLIT_REG))
> + return;
> +
> + vdesc = list_last_entry_or_null(&fsl_chan->vchan.desc_issued,
> + struct virt_dma_desc, node);
> + if (!vdesc)
> + vdesc = list_last_entry_or_null(&fsl_chan->vchan.desc_submitted,
> + struct virt_dma_desc, node);
> + if (!vdesc)
> + return;
Suppose you only check the submitted queue: issuing the transfer will
move descriptors from the submitted queue to the issued queue.
Frank
> --
> 2.54.0
>
* Re: [PATCH v3 1/2] dmaengine: fsl-edma: Implement device_prep_peripheral_dma_vec
2026-05-11 13:57 ` [PATCH v3 1/2] dmaengine: fsl-edma: Implement device_prep_peripheral_dma_vec Benoît Monin
2026-05-11 19:13 ` Frank Li
@ 2026-05-12 5:06 ` sashiko-bot
1 sibling, 0 replies; 7+ messages in thread
From: sashiko-bot @ 2026-05-12 5:06 UTC (permalink / raw)
To: Benoît Monin; +Cc: vkoul, dmaengine, Frank.Li, imx
Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:
- [High] Submitting a transfer with zero elements (nb == 0) causes an out-of-bounds memory read and kernel oops.
- [High] Modulo by zero kernel panic for DMA_DEV_TO_DEV transfers with unaligned lengths.
- [High] Missing multi-FIFO support causes backward memory corruption for peripheral transfers.
- [High] The nbytes burst size is never restored per loop iteration, causing subsequent vectors to permanently inherit a heavily degraded burst size.
--
commit 995264f0108e475b7f259a8a9e431ac6988e4a11
Author: Benoît Monin <benoit.monin@bootlin.com>
dmaengine: fsl-edma: Implement device_prep_peripheral_dma_vec
Add an implementation of the .device_prep_peripheral_dma_vec() callback to
set up a scatter/gather DMA transfer from an array of dma_vec structures.
Set up a cyclic transfer if the DMA_PREP_REPEAT flag is set.
> diff --git a/drivers/dma/fsl-edma-common.c b/drivers/dma/fsl-edma-common.c
> index bb7531c456dfa..c10190164926a 100644
> --- a/drivers/dma/fsl-edma-common.c
> +++ b/drivers/dma/fsl-edma-common.c
[ ... ]
> +struct dma_async_tx_descriptor *
> +fsl_edma_prep_peripheral_dma_vec(struct dma_chan *chan, const struct dma_vec *vecs,
> + size_t nb, enum dma_transfer_direction direction,
> + unsigned long flags)
> +{
> + struct fsl_edma_chan *fsl_chan = to_fsl_edma_chan(chan);
> + dma_addr_t src_addr, dst_addr, last_sg;
> + struct fsl_edma_desc *fsl_desc;
> + u16 soff, doff, iter;
> + u32 nbytes;
> + int i;
> +
> + if (!is_slave_direction(direction))
> + return NULL;
> +
> + if (!fsl_edma_prep_slave_dma(fsl_chan, direction))
> + return NULL;
> +
> + fsl_desc = fsl_edma_alloc_desc(fsl_chan, nb);
This problem wasn't introduced by this patch, but does this function need to
validate that nb > 0?
If nb == 0 is provided, fsl_edma_alloc_desc() allocates a descriptor with
zero tcd structures in its flexible array. When the DMA engine later attempts
to execute it in fsl_edma_xfer_desc(), it unconditionally dereferences
fsl_chan->edesc->tcd[0].vtcd. This reads a garbage pointer from adjacent heap
memory and leads to an immediate kernel oops when trying to write to the DMA
hardware registers.
Additionally, the subsequent patch "Support dynamic scatter/gather chaining"
introduces prev_desc->n_tcds - 1, which underflows to UINT_MAX when nb == 0,
guaranteeing another out-of-bounds crash on tx_submit().
> + if (!fsl_desc)
> + return NULL;
> + fsl_desc->iscyclic = flags & DMA_PREP_REPEAT;
> + fsl_desc->dirn = direction;
> +
> + if (direction == DMA_MEM_TO_DEV) {
> + if (!fsl_chan->cfg.src_addr_width)
> + fsl_chan->cfg.src_addr_width = fsl_chan->cfg.dst_addr_width;
> + fsl_chan->attr =
> + fsl_edma_get_tcd_attr(fsl_chan->cfg.src_addr_width,
> + fsl_chan->cfg.dst_addr_width);
> + nbytes = fsl_chan->cfg.dst_addr_width * fsl_chan->cfg.dst_maxburst;
> + } else {
> + if (!fsl_chan->cfg.dst_addr_width)
> + fsl_chan->cfg.dst_addr_width = fsl_chan->cfg.src_addr_width;
> + fsl_chan->attr =
> + fsl_edma_get_tcd_attr(fsl_chan->cfg.src_addr_width,
> + fsl_chan->cfg.dst_addr_width);
> + nbytes = fsl_chan->cfg.src_addr_width * fsl_chan->cfg.src_maxburst;
> + }
> +
> + for (i = 0; i < nb; i++) {
This isn't a regression from this patch, but does nbytes need to be restored
to its default optimal value at the start of each loop iteration?
If vecs[i].len is not a multiple of nbytes, nbytes is permanently reduced to
a smaller valid multiple (or down to width). However, because nbytes is not
reset to its default optimal value at the start of each loop iteration, all
remaining vectors in the array will be forced to use the heavily degraded
burst size, even if they are perfectly aligned.
This substantially increases the number of hardware DMA requests (minor loops)
for the remainder of the transfer, which can cause peripheral FIFO overruns
on fast data streams.
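For reference, the selection loop can be exercised in isolation with a
host-side model; this is a sketch with the kernel context stripped out,
not the driver code itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Host-side model of the burst-length selection: pick the largest burst
 * count whose total size divides the vector length, falling back to a
 * single beat of 'width' bytes when none fits.
 */
static uint32_t pick_nbytes(size_t len, uint32_t width, uint32_t maxburst)
{
	uint32_t j;

	for (j = maxburst; j > 1; j--) {
		if (!(len % ((size_t)j * width)))
			return j * width;
	}
	return width;	/* no suitable burst: one beat per minor loop */
}
```

Unlike this model, the driver writes the reduced value back into nbytes,
so the degraded burst size carries over to every subsequent vector.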
> + if (direction == DMA_MEM_TO_DEV) {
> + src_addr = vecs[i].addr;
> + dst_addr = fsl_chan->dma_dev_addr;
> + soff = fsl_chan->cfg.dst_addr_width;
> + doff = 0;
This problem wasn't introduced by this patch, but should soff and doff handle
multi-FIFO support here?
Unlike fsl_edma_prep_dma_cyclic(), which correctly assigns doff and soff to
support multi-FIFO (e.g., doff = fsl_chan->is_multi_fifo ? 4 : 0), this
function unconditionally hardcodes doff = 0 and soff = 0 for device addresses.
However, fsl_edma_fill_tcd() still applies a negative Minor Loop Offset
(MLOFF) at the end of the minor loop if is_multi_fifo or port_window_size are
enabled on the channel. Because the address never incrementally advanced
during the minor loop, applying MLOFF forces the device address to step
backwards continuously, severely corrupting system memory or registers located
before the target device address.
> + } else if (direction == DMA_DEV_TO_MEM) {
> + src_addr = fsl_chan->dma_dev_addr;
> + dst_addr = vecs[i].addr;
> + soff = 0;
> + doff = fsl_chan->cfg.src_addr_width;
> + } else {
> + /* DMA_DEV_TO_DEV */
> + src_addr = fsl_chan->cfg.src_addr;
> + dst_addr = fsl_chan->cfg.dst_addr;
> + soff = 0;
> + doff = 0;
> + }
> +
> + /*
> + * Choose the suitable burst length if dma_vec length is not
> + * multiple of burst length so that the whole transfer length is
> + * multiple of minor loop(burst length).
> + */
> + if (vecs[i].len % nbytes) {
> + u32 width = (direction == DMA_DEV_TO_MEM) ? doff : soff;
> + u32 burst = (direction == DMA_DEV_TO_MEM) ?
> + fsl_chan->cfg.src_maxburst :
> + fsl_chan->cfg.dst_maxburst;
> + int j;
> +
> + for (j = burst; j > 1; j--) {
> + if (!(vecs[i].len % (j * width))) {
This problem wasn't introduced by this patch, but can this cause a
division-by-zero kernel panic for DMA_DEV_TO_DEV transfers?
For DMA_DEV_TO_DEV, both soff and doff are explicitly set to 0. If
vecs[i].len is not divisible by the initial burst size nbytes, the driver
assigns width = soff, which evaluates to 0.
The loop then computes vecs[i].len % (j * width), an immediate division by
zero.
Furthermore, if the loop finishes with j == 1, nbytes is set to width (0),
and iter = vecs[i].len / nbytes triggers another division by zero.
> + nbytes = j * width;
> + break;
> + }
> + }
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260511-fsl-edma-dyn-sg-v3-0-98a181775dae@bootlin.com?part=1
* Re: [PATCH v3 2/2] dmaengine: fsl-edma: Support dynamic scatter/gather chaining
2026-05-11 13:57 ` [PATCH v3 2/2] dmaengine: fsl-edma: Support dynamic scatter/gather chaining Benoît Monin
2026-05-11 19:20 ` Frank Li
@ 2026-05-12 5:49 ` sashiko-bot
1 sibling, 0 replies; 7+ messages in thread
From: sashiko-bot @ 2026-05-12 5:49 UTC (permalink / raw)
To: Benoît Monin; +Cc: vkoul, Frank.Li, dmaengine, imx
Thank you for your contribution! Sashiko AI review found 5 potential issues to consider:
- [Critical] Hardware executes dynamically linked descriptors before issue_pending() is called, causing missed interrupts and double execution.
- [Critical] TOCTOU race condition with the channel DONE bit in the interrupt handler causes double execution of chained descriptors.
- [High] Incorrect list check order in fsl_edma_link_sg() drops intermediate descriptors from the hardware scatter-gather chain.
- [High] Clearing the channel DONE bit during dynamic linking causes a permanent DMA channel stall.
- [High] Missing memory barrier when updating in-memory scatter-gather links risks torn hardware reads and system bus errors.
--
commit 09e7fbe06b869eaaa82c7821bd60464f115cd6b5
Author: Benoît Monin <benoit.monin@bootlin.com>
dmaengine: fsl-edma: Support dynamic scatter/gather chaining
This commit implements dynamic linking of scatter/gather transfers to enable
chaining multiple DMA descriptors without stopping the channel. It adds
fsl_edma_link_sg() which is called during tx_submit() to link the new
descriptor to the previously submitted one.
> diff --git a/drivers/dma/fsl-edma-common.c b/drivers/dma/fsl-edma-common.c
> index c10190164926a..b83d1b91dca2b 100644
> --- a/drivers/dma/fsl-edma-common.c
> +++ b/drivers/dma/fsl-edma-common.c
> @@ -58,7 +58,10 @@ void fsl_edma_tx_chan_handler(struct fsl_edma_chan *fsl_chan)
> list_del(&fsl_chan->edesc->vdesc.node);
> vchan_cookie_complete(&fsl_chan->edesc->vdesc);
> fsl_chan->edesc = NULL;
> - fsl_chan->status = DMA_COMPLETE;
> + if (!(fsl_edma_drvflags(fsl_chan) & FSL_EDMA_DRV_SPLIT_REG) ||
> + (edma_readl_chreg(fsl_chan, ch_csr) & EDMA_V3_CH_CSR_DONE)) {
> + fsl_chan->status = DMA_COMPLETE;
> + }
If the hardware dynamically chains from descriptor A to descriptor B, it
still raises an interrupt for A. If the CPU is delayed and the hardware
finishes executing B and halts, setting the DONE bit, won't the ISR for A
see the DONE bit set here, set the status to DMA_COMPLETE, and then later
explicitly re-enable the channel to execute B again, causing memory
corruption?
> } else {
> vchan_cyclic_callback(&fsl_chan->edesc->vdesc);
> }
[ ... ]
> @@ -673,6 +676,68 @@ struct dma_async_tx_descriptor *fsl_edma_prep_dma_cyclic(
> return vchan_tx_prep(&fsl_chan->vchan, &fsl_desc->vdesc, flags);
> }
>
> +static void fsl_edma_link_sg(struct fsl_edma_chan *fsl_chan, struct fsl_edma_desc *fsl_desc)
> +{
> + u32 flags = fsl_edma_drvflags(fsl_chan);
> + struct fsl_edma_hw_tcd *last_tcd;
> + struct fsl_edma_desc *prev_desc;
> + struct virt_dma_desc *vdesc;
> + u16 csr;
> +
> + lockdep_assert_held(&fsl_chan->vchan.lock);
> +
> + if (!(flags & FSL_EDMA_DRV_SPLIT_REG))
> + return;
> +
> + vdesc = list_last_entry_or_null(&fsl_chan->vchan.desc_issued,
> + struct virt_dma_desc, node);
> + if (!vdesc)
> + vdesc = list_last_entry_or_null(&fsl_chan->vchan.desc_submitted,
> + struct virt_dma_desc, node);
If desc_issued is not empty but there are also descriptors in
desc_submitted, this will link against the tail of desc_issued instead of
the newest descriptor at the tail of desc_submitted. Will this cause the
hardware to skip intermediate submissions and corrupt the transfer sequence?
> + if (!vdesc)
> + return;
> +
> + prev_desc = to_fsl_edma_desc(vdesc);
> + last_tcd = prev_desc->tcd[prev_desc->n_tcds - 1].vtcd;
> +
> + csr = fsl_edma_get_tcd_to_cpu(fsl_chan, last_tcd, csr);
> + if (!(csr & EDMA_TCD_CSR_D_REQ))
> + return;
> +
> + fsl_edma_set_tcd_to_le(fsl_chan, last_tcd, fsl_desc->tcd[0].ptcd, dlast_sga);
> +
> + csr &= ~EDMA_TCD_CSR_D_REQ;
> + csr |= EDMA_TCD_CSR_E_SG;
> + fsl_edma_set_tcd_to_le(fsl_chan, last_tcd, csr, csr);
Is a write memory barrier like dma_wmb() needed between updating dlast_sga
and setting the E_SG bit in the csr? If the CPU or compiler reorders these
writes, could the actively running eDMA controller observe the E_SG bit set
with a stale dlast_sga pointer and trigger a bus error?
> +
> + if (prev_desc == fsl_chan->edesc && prev_desc->n_tcds == 1) {
> + if (flags & FSL_EDMA_DRV_CLEAR_DONE_E_SG)
> + edma_writel_chreg(fsl_chan, edma_readl_chreg(fsl_chan, ch_csr), ch_csr);
If the hardware has just finished executing the transfer and halted before
this register access, the DONE bit is already set. Writing it back clears
it. Will the subsequent completion interrupt ISR then wrongly assume the
channel chained successfully, leaving the state as DMA_IN_PROGRESS and
causing a permanent channel stall?
> +
> + edma_cp_tcd_to_reg(fsl_chan, last_tcd, dlast_sga);
> + edma_cp_tcd_to_reg(fsl_chan, last_tcd, csr);
> + }
> +
> + trace_edma_link_sg(fsl_chan, last_tcd);
> +}
> +
> +static dma_cookie_t fsl_edma_tx_submit(struct dma_async_tx_descriptor *tx)
> +{
> + struct virt_dma_desc *vd = container_of(tx, struct virt_dma_desc, tx);
> + struct fsl_edma_chan *fsl_chan = to_fsl_edma_chan(tx->chan);
> + struct fsl_edma_desc *fsl_desc = to_fsl_edma_desc(vd);
> + struct virt_dma_chan *vc = to_virt_chan(tx->chan);
> + dma_cookie_t cookie;
> +
> + guard(spinlock_irqsave)(&fsl_chan->vchan.lock);
> +
> + fsl_edma_link_sg(fsl_chan, fsl_desc);
Because the new descriptor is dynamically chained here onto the descriptor
the hardware is currently running, the hardware could start executing it
before dma_async_issue_pending() is called.
If the hardware completes the new descriptor while it is still in the
desc_submitted list, could the interrupt handler ignore its completion
because it only processes desc_issued?
If so, would this result in double execution later when issue_pending() is
finally called and the software explicitly restarts the already-completed
descriptor?
> + cookie = dma_cookie_assign(tx);
> + list_move_tail(&vd->node, &vc->desc_submitted);
> +
> + return cookie;
> +}
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260511-fsl-edma-dyn-sg-v3-0-98a181775dae@bootlin.com?part=2