* [PATCH v4 1/6] spi: spi-fsl-dspi: Clear completion counter before initiating transfer
2025-06-27 10:21 [PATCH v4 0/6] spi: spi-fsl-dspi: Target mode improvements James Clark
@ 2025-06-27 10:21 ` James Clark
2025-06-27 19:41 ` Frank Li
2025-06-27 10:21 ` [PATCH v4 2/6] spi: spi-fsl-dspi: Store status directly in cur_msg->status James Clark
` (6 subsequent siblings)
7 siblings, 1 reply; 35+ messages in thread
From: James Clark @ 2025-06-27 10:21 UTC (permalink / raw)
To: Vladimir Oltean, Mark Brown, Vladimir Oltean, Arnd Bergmann,
Larisa Grigore, Frank Li, Christoph Hellwig
Cc: linux-spi, imx, linux-kernel, James Clark
In target mode, extra interrupts can be received between the end of a
transfer and halting the module if the host continues sending more data.
If the interrupt from this occurs after the reinit_completion() then the
completion counter is left at a non-zero value. The next unrelated
transfer initiated by userspace will then complete immediately without
waiting for the interrupt or writing to the RX buffer.
Fix it by resetting the counter before the transfer so that lingering
values are cleared. This is done after clearing the FIFOs and the
status register but before the transfer is initiated, so no interrupts
should be received at this point resulting in other race conditions.
Fixes: 4f5ee75ea171 ("spi: spi-fsl-dspi: Replace interruptible wait queue with a simple completion")
Signed-off-by: James Clark <james.clark@linaro.org>
---
drivers/spi/spi-fsl-dspi.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/drivers/spi/spi-fsl-dspi.c b/drivers/spi/spi-fsl-dspi.c
index 04c88d090c4d..4bd4377551b5 100644
--- a/drivers/spi/spi-fsl-dspi.c
+++ b/drivers/spi/spi-fsl-dspi.c
@@ -1122,11 +1122,20 @@ static int dspi_transfer_one_message(struct spi_controller *ctlr,
if (dspi->devtype_data->trans_mode == DSPI_DMA_MODE) {
status = dspi_dma_xfer(dspi);
} else {
+ /*
+ * Reinitialize the completion before transferring data
+ * to avoid the case where it might remain in the done
+ * state due to a spurious interrupt from a previous
+ * transfer. This could falsely signal that the current
+ * transfer has completed.
+ */
+ if (dspi->irq)
+ reinit_completion(&dspi->xfer_done);
+
dspi_fifo_write(dspi);
if (dspi->irq) {
wait_for_completion(&dspi->xfer_done);
- reinit_completion(&dspi->xfer_done);
} else {
do {
status = dspi_poll(dspi);
--
2.34.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [PATCH v4 1/6] spi: spi-fsl-dspi: Clear completion counter before initiating transfer
2025-06-27 10:21 ` [PATCH v4 1/6] spi: spi-fsl-dspi: Clear completion counter before initiating transfer James Clark
@ 2025-06-27 19:41 ` Frank Li
0 siblings, 0 replies; 35+ messages in thread
From: Frank Li @ 2025-06-27 19:41 UTC (permalink / raw)
To: James Clark
Cc: Vladimir Oltean, Mark Brown, Vladimir Oltean, Arnd Bergmann,
Larisa Grigore, Christoph Hellwig, linux-spi, imx, linux-kernel
On Fri, Jun 27, 2025 at 11:21:37AM +0100, James Clark wrote:
> In target mode, extra interrupts can be received between the end of a
> transfer and halting the module if the host continues sending more data.
> If the interrupt from this occurs after the reinit_completion() then the
> completion counter is left at a non-zero value. The next unrelated
> transfer initiated by userspace will then complete immediately without
> waiting for the interrupt or writing to the RX buffer.
>
> Fix it by resetting the counter before the transfer so that lingering
> values are cleared. This is done after clearing the FIFOs and the
> status register but before the transfer is initiated, so no interrupts
> should be received at this point resulting in other race conditions.
>
> Fixes: 4f5ee75ea171 ("spi: spi-fsl-dspi: Replace interruptible wait queue with a simple completion")
> Signed-off-by: James Clark <james.clark@linaro.org>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
> ---
> drivers/spi/spi-fsl-dspi.c | 11 ++++++++++-
> 1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/spi/spi-fsl-dspi.c b/drivers/spi/spi-fsl-dspi.c
> index 04c88d090c4d..4bd4377551b5 100644
> --- a/drivers/spi/spi-fsl-dspi.c
> +++ b/drivers/spi/spi-fsl-dspi.c
> @@ -1122,11 +1122,20 @@ static int dspi_transfer_one_message(struct spi_controller *ctlr,
> if (dspi->devtype_data->trans_mode == DSPI_DMA_MODE) {
> status = dspi_dma_xfer(dspi);
> } else {
> + /*
> + * Reinitialize the completion before transferring data
> + * to avoid the case where it might remain in the done
> + * state due to a spurious interrupt from a previous
> + * transfer. This could falsely signal that the current
> + * transfer has completed.
> + */
> + if (dspi->irq)
> + reinit_completion(&dspi->xfer_done);
> +
> dspi_fifo_write(dspi);
>
> if (dspi->irq) {
> wait_for_completion(&dspi->xfer_done);
> - reinit_completion(&dspi->xfer_done);
> } else {
> do {
> status = dspi_poll(dspi);
>
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH v4 2/6] spi: spi-fsl-dspi: Store status directly in cur_msg->status
2025-06-27 10:21 [PATCH v4 0/6] spi: spi-fsl-dspi: Target mode improvements James Clark
2025-06-27 10:21 ` [PATCH v4 1/6] spi: spi-fsl-dspi: Clear completion counter before initiating transfer James Clark
@ 2025-06-27 10:21 ` James Clark
2025-06-27 21:30 ` Vladimir Oltean
2025-06-27 10:21 ` [PATCH v4 3/6] spi: spi-fsl-dspi: Stub out DMA functions James Clark
` (5 subsequent siblings)
7 siblings, 1 reply; 35+ messages in thread
From: James Clark @ 2025-06-27 10:21 UTC (permalink / raw)
To: Vladimir Oltean, Mark Brown, Vladimir Oltean, Arnd Bergmann,
Larisa Grigore, Frank Li, Christoph Hellwig
Cc: linux-spi, imx, linux-kernel, James Clark
This will allow us to return a status from the interrupt handler in a
later commit and avoids copying it at the end of
dspi_transfer_one_message(). For consistency make polling and DMA modes
use the same mechanism.
Refactor dspi_rxtx() and dspi_poll() to not return -EINPROGRESS because
this isn't actually a status that was ever returned to the core layer
but some internal state. Wherever that was used we can look at dspi->len
instead.
No functional changes intended.
Signed-off-by: James Clark <james.clark@linaro.org>
---
drivers/spi/spi-fsl-dspi.c | 70 ++++++++++++++++++++++++----------------------
1 file changed, 36 insertions(+), 34 deletions(-)
diff --git a/drivers/spi/spi-fsl-dspi.c b/drivers/spi/spi-fsl-dspi.c
index 4bd4377551b5..65567745fe9a 100644
--- a/drivers/spi/spi-fsl-dspi.c
+++ b/drivers/spi/spi-fsl-dspi.c
@@ -591,11 +591,10 @@ static int dspi_next_xfer_dma_submit(struct fsl_dspi *dspi)
static void dspi_setup_accel(struct fsl_dspi *dspi);
-static int dspi_dma_xfer(struct fsl_dspi *dspi)
+static void dspi_dma_xfer(struct fsl_dspi *dspi)
{
struct spi_message *message = dspi->cur_msg;
struct device *dev = &dspi->pdev->dev;
- int ret = 0;
/*
* dspi->len gets decremented by dspi_pop_tx_pushr in
@@ -612,14 +611,12 @@ static int dspi_dma_xfer(struct fsl_dspi *dspi)
message->actual_length += dspi->words_in_flight *
dspi->oper_word_size;
- ret = dspi_next_xfer_dma_submit(dspi);
- if (ret) {
+ message->status = dspi_next_xfer_dma_submit(dspi);
+ if (message->status) {
dev_err(dev, "DMA transfer failed\n");
break;
}
}
-
- return ret;
}
static int dspi_request_dma(struct fsl_dspi *dspi, phys_addr_t phy_addr)
@@ -986,36 +983,40 @@ static void dspi_fifo_write(struct fsl_dspi *dspi)
dspi->progress, !dspi->irq);
}
-static int dspi_rxtx(struct fsl_dspi *dspi)
+static void dspi_rxtx(struct fsl_dspi *dspi)
{
dspi_fifo_read(dspi);
if (!dspi->len)
/* Success! */
- return 0;
+ return;
dspi_fifo_write(dspi);
-
- return -EINPROGRESS;
}
-static int dspi_poll(struct fsl_dspi *dspi)
+static void dspi_poll(struct fsl_dspi *dspi)
{
- int tries = 1000;
+ int tries;
u32 spi_sr;
- do {
- regmap_read(dspi->regmap, SPI_SR, &spi_sr);
- regmap_write(dspi->regmap, SPI_SR, spi_sr);
+ while (dspi->len) {
+ for (tries = 1000; tries > 0; --tries) {
+ regmap_read(dspi->regmap, SPI_SR, &spi_sr);
+ regmap_write(dspi->regmap, SPI_SR, spi_sr);
- if (spi_sr & SPI_SR_CMDTCF)
- break;
- } while (--tries);
+ if (spi_sr & SPI_SR_CMDTCF)
+ break;
+ }
- if (!tries)
- return -ETIMEDOUT;
+ if (!tries) {
+ dspi->cur_msg->status = -ETIMEDOUT;
+ return;
+ }
- return dspi_rxtx(dspi);
+ dspi_rxtx(dspi);
+ }
+
+ dspi->cur_msg->status = 0;
}
static irqreturn_t dspi_interrupt(int irq, void *dev_id)
@@ -1029,8 +1030,13 @@ static irqreturn_t dspi_interrupt(int irq, void *dev_id)
if (!(spi_sr & SPI_SR_CMDTCF))
return IRQ_NONE;
- if (dspi_rxtx(dspi) == 0)
+ dspi_rxtx(dspi);
+
+ if (!dspi->len) {
+ if (dspi->cur_msg)
+ WRITE_ONCE(dspi->cur_msg->status, 0);
complete(&dspi->xfer_done);
+ }
return IRQ_HANDLED;
}
@@ -1060,7 +1066,6 @@ static int dspi_transfer_one_message(struct spi_controller *ctlr,
struct spi_device *spi = message->spi;
struct spi_transfer *transfer;
bool cs = false;
- int status = 0;
u32 val = 0;
bool cs_change = false;
@@ -1120,7 +1125,7 @@ static int dspi_transfer_one_message(struct spi_controller *ctlr,
dspi->progress, !dspi->irq);
if (dspi->devtype_data->trans_mode == DSPI_DMA_MODE) {
- status = dspi_dma_xfer(dspi);
+ dspi_dma_xfer(dspi);
} else {
/*
* Reinitialize the completion before transferring data
@@ -1134,15 +1139,12 @@ static int dspi_transfer_one_message(struct spi_controller *ctlr,
dspi_fifo_write(dspi);
- if (dspi->irq) {
+ if (dspi->irq)
wait_for_completion(&dspi->xfer_done);
- } else {
- do {
- status = dspi_poll(dspi);
- } while (status == -EINPROGRESS);
- }
+ else
+ dspi_poll(dspi);
}
- if (status)
+ if (READ_ONCE(message->status))
break;
spi_transfer_delay_exec(transfer);
@@ -1151,7 +1153,8 @@ static int dspi_transfer_one_message(struct spi_controller *ctlr,
dspi_deassert_cs(spi, &cs);
}
- if (status || !cs_change) {
+ dspi->cur_msg = NULL;
+ if (message->status || !cs_change) {
/* Put DSPI in stop mode */
regmap_update_bits(dspi->regmap, SPI_MCR,
SPI_MCR_HALT, SPI_MCR_HALT);
@@ -1160,10 +1163,9 @@ static int dspi_transfer_one_message(struct spi_controller *ctlr,
;
}
- message->status = status;
spi_finalize_current_message(ctlr);
- return status;
+ return message->status;
}
static int dspi_set_mtf(struct fsl_dspi *dspi)
--
2.34.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [PATCH v4 2/6] spi: spi-fsl-dspi: Store status directly in cur_msg->status
2025-06-27 10:21 ` [PATCH v4 2/6] spi: spi-fsl-dspi: Store status directly in cur_msg->status James Clark
@ 2025-06-27 21:30 ` Vladimir Oltean
2025-06-30 12:54 ` James Clark
2025-07-21 13:25 ` James Clark
0 siblings, 2 replies; 35+ messages in thread
From: Vladimir Oltean @ 2025-06-27 21:30 UTC (permalink / raw)
To: James Clark
Cc: Vladimir Oltean, Mark Brown, Arnd Bergmann, Larisa Grigore,
Frank Li, Christoph Hellwig, linux-spi, imx, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 5712 bytes --]
On Fri, Jun 27, 2025 at 11:21:38AM +0100, James Clark wrote:
> This will allow us to return a status from the interrupt handler in a
> later commit and avoids copying it at the end of
> dspi_transfer_one_message(). For consistency make polling and DMA modes
> use the same mechanism.
>
> Refactor dspi_rxtx() and dspi_poll() to not return -EINPROGRESS because
> this isn't actually a status that was ever returned to the core layer
> but some internal state. Wherever that was used we can look at dspi->len
> instead.
>
> No functional changes intended.
>
> Signed-off-by: James Clark <james.clark@linaro.org>
> ---
This commit doesn't work, please do not merge this patch.
You are changing the logic in DMA mode, interrupt-based FIFO and PIO all
in one go, in a commit whose title and primary purpose is unrelated to
that. Just a mention of the type "while at it, also do that". And in
that process, that bundled refactoring introduces a subtle, but severe bug.
No, that is discouraged. Make one patch per logical change, where only
one thing is happening and which is obviously correct. It helps you and
it helps the reviewer.
Please find attached a set of 3 patches that represent a broken down and
corrected variant of this one. First 2 should be squashed together in
your next submission, they are just to illustrate the bug that you've
introduced (which can be reproduced on any SoC in XSPI mode).
The panic message is slightly confusing and does not directly point to
the issue, I'm attaching it just for the sake of having a future reference.
[ 4.154185] DSA: tree 0 setup
[ 4.157380] sja1105 spi2.0: Probed switch chip: SJA1105S
[ 4.173894] sja1105 spi2.0: configuring for fixed/sgmii link mode
[ 4.232527] sja1105 spi2.0: Link is Up - 1Gbps/Full - flow control off
[ 4.312798] sja1105 spi2.0 sw0p0 (uninitialized): PHY [0000:00:00.3:07] driver [RTL8211F Gigabit Ethernet] (irq=POLL)
[ 4.443689] sja1105 spi2.0 sw0p1 (uninitialized): PHY [0000:00:00.3:00] driver [Microsemi GE VSC8502 SyncE] (irq=POLL)
[ 4.575718] sja1105 spi2.0 sw0p2 (uninitialized): PHY [0000:00:00.3:01] driver [Microsemi GE VSC8502 SyncE] (irq=POLL)
[ 4.588012] Unable to handle kernel paging request at virtual address ffff8000801ac000
[ 4.595960] Mem abort info:
[ 4.598757] ESR = 0x0000000096000007
[ 4.602515] EC = 0x25: DABT (current EL), IL = 32 bits
[ 4.607843] SET = 0, FnV = 0
[ 4.610902] EA = 0, S1PTW = 0
[ 4.614048] FSC = 0x07: level 3 translation fault
[ 4.618939] Data abort info:
[ 4.621822] ISV = 0, ISS = 0x00000007, ISS2 = 0x00000000
[ 4.627323] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 4.632388] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 4.637714] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000082b7a000
[ 4.644437] [ffff8000801ac000] pgd=0000000000000000, p4d=1000002080020403, pud=1000002080021403, pmd=1000002080022403, pte=0000000000000000
[ 4.657016] Internal error: Oops: 0000000096000007 [#1] SMP
[ 4.662693] Modules linked in:
[ 4.665756] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.16.0-rc3+ #30 PREEMPT
[ 4.673615] Hardware name: random LS1028A board
[ 4.679116] pstate: 200000c5 (nzCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 4.686103] pc : dspi_8on32_host_to_dev+0x8/0x24
[ 4.690742] lr : dspi_fifo_write+0x178/0x1cc
[ 4.695025] sp : ffff800080003eb0
[ 4.698346] x29: ffff800080003ec0 x28: ffffc25414698b00 x27: ffffc2541464c170
[ 4.705512] x26: 0000000000000001 x25: ffffc25414b06000 x24: 0000000111705fd3
[ 4.712677] x23: ffffc25414257bae x22: ffff8000801ab5e8 x21: 00000000fffffd98
[ 4.719842] x20: 0000000000000000 x19: ffff00200039a480 x18: 0000000000000006
[ 4.727007] x17: ffff3dcc6b076000 x16: ffff800080000000 x15: 0000000078b30c40
[ 4.734171] x14: 0000000000000000 x13: 0000000000000048 x12: 0000000000000128
[ 4.741335] x11: 0000000000000001 x10: 0000000000000000 x9 : 0000000100010001
[ 4.748500] x8 : ffff8000801ac000 x7 : 0000000000000000 x6 : 0000000000000000
[ 4.755664] x5 : 0000000000000000 x4 : ffffc25411a308d0 x3 : 0000000000000000
[ 4.762828] x2 : 0000000000000000 x1 : ffff800080003eb4 x0 : ffff00200039a480
[ 4.769992] Call trace:
[ 4.772441] dspi_8on32_host_to_dev+0x8/0x24 (P)
[ 4.777074] dspi_interrupt+0x6c/0xf0
[ 4.780747] __handle_irq_event_percpu+0x8c/0x160
[ 4.785470] handle_irq_event+0x48/0xa0
[ 4.789319] handle_fasteoi_irq+0xf4/0x208
[ 4.793428] generic_handle_domain_irq+0x40/0x64
[ 4.798060] gic_handle_irq+0x4c/0x110
[ 4.801820] call_on_irq_stack+0x24/0x30
[ 4.805757] el1_interrupt+0x74/0xc0
[ 4.809346] el1h_64_irq_handler+0x18/0x24
[ 4.813457] el1h_64_irq+0x6c/0x70
[ 4.816867] arch_local_irq_enable+0x8/0xc (P)
[ 4.821330] cpuidle_enter+0x38/0x50
[ 4.824914] do_idle+0x1c4/0x250
[ 4.828152] cpu_startup_entry+0x34/0x38
[ 4.832087] kernel_init+0x0/0x1a0
[ 4.835500] start_kernel+0x2ec/0x398
[ 4.839175] __primary_switched+0x88/0x90
[ 4.843200] Code: f9003008 d65f03c0 d503245f f9402c08 (b9400108)
[ 4.849313] ---[ end trace 0000000000000000 ]---
[ 4.853943] Kernel panic - not syncing: Oops: Fatal exception in interrupt
[ 4.860840] SMP: stopping secondary CPUs
[ 4.864788] Kernel Offset: 0x425391a00000 from 0xffff800080000000
[ 4.870900] PHYS_OFFSET: 0xfff1000080000000
[ 4.875093] CPU features: 0x1000,000804b0,02000801,0400421b
[ 4.880683] Memory Limit: none
[ 4.883750] ---[ end Kernel panic - not syncing: Oops: Fatal exception in interrupt ]---
I still intend to do more testing, so please don't send the next version
just yet. Tracking down this issue took a bit more than I was planning.
[-- Attachment #2: 0001-spi-fsl-dspi-avoid-using-EINPROGRESS-error-code.patch --]
[-- Type: text/x-diff, Size: 2620 bytes --]
From 44aad50011575e720633ad1d733fd053e4a862b4 Mon Sep 17 00:00:00 2001
From: James Clark <james.clark@linaro.org>
Date: Fri, 27 Jun 2025 23:53:55 +0300
Subject: [PATCH 1/3] spi: fsl-dspi: avoid using -EINPROGRESS error code
THIS IS BUGGY because it changes the logic. More info in the next patch,
together with which it should be squashed.
Refactor dspi_rxtx() and dspi_poll() to not return -EINPROGRESS because
this isn't actually a status that was ever returned to the core layer
but some internal state. Wherever that was used we can look at dspi->len
instead.
No functional changes intended.
Signed-off-by: James Clark <james.clark@linaro.org>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
drivers/spi/spi-fsl-dspi.c | 38 ++++++++++++++++++++------------------
1 file changed, 20 insertions(+), 18 deletions(-)
diff --git a/drivers/spi/spi-fsl-dspi.c b/drivers/spi/spi-fsl-dspi.c
index 4bd4377551b5..c0a6c6c6459e 100644
--- a/drivers/spi/spi-fsl-dspi.c
+++ b/drivers/spi/spi-fsl-dspi.c
@@ -986,36 +986,38 @@ static void dspi_fifo_write(struct fsl_dspi *dspi)
dspi->progress, !dspi->irq);
}
-static int dspi_rxtx(struct fsl_dspi *dspi)
+static void dspi_rxtx(struct fsl_dspi *dspi)
{
dspi_fifo_read(dspi);
if (!dspi->len)
/* Success! */
- return 0;
+ return;
dspi_fifo_write(dspi);
-
- return -EINPROGRESS;
}
static int dspi_poll(struct fsl_dspi *dspi)
{
int tries = 1000;
+ int err = 0;
u32 spi_sr;
- do {
- regmap_read(dspi->regmap, SPI_SR, &spi_sr);
- regmap_write(dspi->regmap, SPI_SR, spi_sr);
-
- if (spi_sr & SPI_SR_CMDTCF)
+ while (dspi->len) {
+ for (tries = 1000; tries > 0; --tries) {
+ regmap_read(dspi->regmap, SPI_SR, &spi_sr);
+ regmap_write(dspi->regmap, SPI_SR, spi_sr);
+ if (spi_sr & SPI_SR_CMDTCF)
+ break;
+ }
+ if (!tries) {
+ err = -ETIMEDOUT;
break;
- } while (--tries);
-
- if (!tries)
- return -ETIMEDOUT;
+ }
+ dspi_rxtx(dspi);
+ }
- return dspi_rxtx(dspi);
+ return err;
}
static irqreturn_t dspi_interrupt(int irq, void *dev_id)
@@ -1029,7 +1031,9 @@ static irqreturn_t dspi_interrupt(int irq, void *dev_id)
if (!(spi_sr & SPI_SR_CMDTCF))
return IRQ_NONE;
- if (dspi_rxtx(dspi) == 0)
+ dspi_rxtx(dspi);
+
+ if (!dspi->len)
complete(&dspi->xfer_done);
return IRQ_HANDLED;
@@ -1137,9 +1141,7 @@ static int dspi_transfer_one_message(struct spi_controller *ctlr,
if (dspi->irq) {
wait_for_completion(&dspi->xfer_done);
} else {
- do {
- status = dspi_poll(dspi);
- } while (status == -EINPROGRESS);
+ status = dspi_poll(dspi);
}
}
if (status)
--
2.34.1
[-- Attachment #3: 0002-spi-fsl-dspi-fix-logic-bug-introduced-by-previous-co.patch --]
[-- Type: text/x-diff, Size: 2305 bytes --]
From de229c0b2602a2cf3d936993a7946f3cb7a80ef2 Mon Sep 17 00:00:00 2001
From: Vladimir Oltean <vladimir.oltean@nxp.com>
Date: Fri, 27 Jun 2025 23:57:53 +0300
Subject: [PATCH 2/3] spi: fsl-dspi: fix logic bug introduced by previous
commit
dspi_rxtx() is actually "rx, and then tx if necessary". Refactoring the
code to look at dspi->len outside of this function means we are losing a
necessary call to dspi_fifo_read(). This makes XSPI-based transfers
eventually hang.
I don't necessarily agree with the premise that using an errno value
privately within the driver is an anti-pattern, but let's at least make
the code functionally correct and use a boolean to track whether there
is data left to send, while still allowing data to be received.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
drivers/spi/spi-fsl-dspi.c | 18 ++++++++++--------
1 file changed, 10 insertions(+), 8 deletions(-)
diff --git a/drivers/spi/spi-fsl-dspi.c b/drivers/spi/spi-fsl-dspi.c
index c0a6c6c6459e..e74ff6e9cb02 100644
--- a/drivers/spi/spi-fsl-dspi.c
+++ b/drivers/spi/spi-fsl-dspi.c
@@ -986,15 +986,20 @@ static void dspi_fifo_write(struct fsl_dspi *dspi)
dspi->progress, !dspi->irq);
}
-static void dspi_rxtx(struct fsl_dspi *dspi)
+/* Returns false if the buffer to be transmitted is empty, and true if
+ * there is still data to transmit.
+ */
+static bool dspi_rxtx(struct fsl_dspi *dspi)
{
dspi_fifo_read(dspi);
if (!dspi->len)
/* Success! */
- return;
+ return false;
dspi_fifo_write(dspi);
+
+ return true;
}
static int dspi_poll(struct fsl_dspi *dspi)
@@ -1003,7 +1008,7 @@ static int dspi_poll(struct fsl_dspi *dspi)
int err = 0;
u32 spi_sr;
- while (dspi->len) {
+ do {
for (tries = 1000; tries > 0; --tries) {
regmap_read(dspi->regmap, SPI_SR, &spi_sr);
regmap_write(dspi->regmap, SPI_SR, spi_sr);
@@ -1014,8 +1019,7 @@ static int dspi_poll(struct fsl_dspi *dspi)
err = -ETIMEDOUT;
break;
}
- dspi_rxtx(dspi);
- }
+ } while (dspi_rxtx(dspi));
return err;
}
@@ -1031,9 +1035,7 @@ static irqreturn_t dspi_interrupt(int irq, void *dev_id)
if (!(spi_sr & SPI_SR_CMDTCF))
return IRQ_NONE;
- dspi_rxtx(dspi);
-
- if (!dspi->len)
+ if (dspi_rxtx(dspi) == false)
complete(&dspi->xfer_done);
return IRQ_HANDLED;
--
2.34.1
[-- Attachment #4: 0003-spi-fsl-dspi-Store-status-directly-in-cur_msg-status.patch --]
[-- Type: text/x-diff, Size: 4068 bytes --]
From f478ca8a462881249ed65b8d279ae77a9bc1ac52 Mon Sep 17 00:00:00 2001
From: James Clark <james.clark@linaro.org>
Date: Sat, 28 Jun 2025 00:08:49 +0300
Subject: [PATCH 3/3] spi: fsl-dspi: Store status directly in cur_msg->status
This will allow us to return a status from the interrupt handler in a
later commit and avoids copying it at the end of
dspi_transfer_one_message(). For consistency make polling and DMA modes
use the same mechanism.
No functional changes intended.
Signed-off-by: James Clark <james.clark@linaro.org>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
drivers/spi/spi-fsl-dspi.c | 36 +++++++++++++++++-------------------
1 file changed, 17 insertions(+), 19 deletions(-)
diff --git a/drivers/spi/spi-fsl-dspi.c b/drivers/spi/spi-fsl-dspi.c
index e74ff6e9cb02..e586694502eb 100644
--- a/drivers/spi/spi-fsl-dspi.c
+++ b/drivers/spi/spi-fsl-dspi.c
@@ -591,11 +591,10 @@ static int dspi_next_xfer_dma_submit(struct fsl_dspi *dspi)
static void dspi_setup_accel(struct fsl_dspi *dspi);
-static int dspi_dma_xfer(struct fsl_dspi *dspi)
+static void dspi_dma_xfer(struct fsl_dspi *dspi)
{
struct spi_message *message = dspi->cur_msg;
struct device *dev = &dspi->pdev->dev;
- int ret = 0;
/*
* dspi->len gets decremented by dspi_pop_tx_pushr in
@@ -612,14 +611,12 @@ static int dspi_dma_xfer(struct fsl_dspi *dspi)
message->actual_length += dspi->words_in_flight *
dspi->oper_word_size;
- ret = dspi_next_xfer_dma_submit(dspi);
- if (ret) {
+ message->status = dspi_next_xfer_dma_submit(dspi);
+ if (message->status) {
dev_err(dev, "DMA transfer failed\n");
break;
}
}
-
- return ret;
}
static int dspi_request_dma(struct fsl_dspi *dspi, phys_addr_t phy_addr)
@@ -1002,7 +999,7 @@ static bool dspi_rxtx(struct fsl_dspi *dspi)
return true;
}
-static int dspi_poll(struct fsl_dspi *dspi)
+static void dspi_poll(struct fsl_dspi *dspi)
{
int tries = 1000;
int err = 0;
@@ -1021,7 +1018,7 @@ static int dspi_poll(struct fsl_dspi *dspi)
}
} while (dspi_rxtx(dspi));
- return err;
+ dspi->cur_msg->status = err;
}
static irqreturn_t dspi_interrupt(int irq, void *dev_id)
@@ -1035,8 +1032,11 @@ static irqreturn_t dspi_interrupt(int irq, void *dev_id)
if (!(spi_sr & SPI_SR_CMDTCF))
return IRQ_NONE;
- if (dspi_rxtx(dspi) == false)
+ if (dspi_rxtx(dspi) == false) {
+ if (dspi->cur_msg)
+ WRITE_ONCE(dspi->cur_msg->status, 0);
complete(&dspi->xfer_done);
+ }
return IRQ_HANDLED;
}
@@ -1066,7 +1066,6 @@ static int dspi_transfer_one_message(struct spi_controller *ctlr,
struct spi_device *spi = message->spi;
struct spi_transfer *transfer;
bool cs = false;
- int status = 0;
u32 val = 0;
bool cs_change = false;
@@ -1126,7 +1125,7 @@ static int dspi_transfer_one_message(struct spi_controller *ctlr,
dspi->progress, !dspi->irq);
if (dspi->devtype_data->trans_mode == DSPI_DMA_MODE) {
- status = dspi_dma_xfer(dspi);
+ dspi_dma_xfer(dspi);
} else {
/*
* Reinitialize the completion before transferring data
@@ -1140,13 +1139,12 @@ static int dspi_transfer_one_message(struct spi_controller *ctlr,
dspi_fifo_write(dspi);
- if (dspi->irq) {
+ if (dspi->irq)
wait_for_completion(&dspi->xfer_done);
- } else {
- status = dspi_poll(dspi);
- }
+ else
+ dspi_poll(dspi);
}
- if (status)
+ if (READ_ONCE(message->status))
break;
spi_transfer_delay_exec(transfer);
@@ -1155,7 +1153,8 @@ static int dspi_transfer_one_message(struct spi_controller *ctlr,
dspi_deassert_cs(spi, &cs);
}
- if (status || !cs_change) {
+ dspi->cur_msg = NULL;
+ if (message->status || !cs_change) {
/* Put DSPI in stop mode */
regmap_update_bits(dspi->regmap, SPI_MCR,
SPI_MCR_HALT, SPI_MCR_HALT);
@@ -1164,10 +1163,9 @@ static int dspi_transfer_one_message(struct spi_controller *ctlr,
;
}
- message->status = status;
spi_finalize_current_message(ctlr);
- return status;
+ return message->status;
}
static int dspi_set_mtf(struct fsl_dspi *dspi)
--
2.34.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [PATCH v4 2/6] spi: spi-fsl-dspi: Store status directly in cur_msg->status
2025-06-27 21:30 ` Vladimir Oltean
@ 2025-06-30 12:54 ` James Clark
2025-06-30 20:41 ` Vladimir Oltean
2025-07-21 13:25 ` James Clark
1 sibling, 1 reply; 35+ messages in thread
From: James Clark @ 2025-06-30 12:54 UTC (permalink / raw)
To: Vladimir Oltean
Cc: Vladimir Oltean, Mark Brown, Arnd Bergmann, Larisa Grigore,
Frank Li, Christoph Hellwig, linux-spi, imx, linux-kernel
On 27/06/2025 10:30 pm, Vladimir Oltean wrote:
> On Fri, Jun 27, 2025 at 11:21:38AM +0100, James Clark wrote:
>> This will allow us to return a status from the interrupt handler in a
>> later commit and avoids copying it at the end of
>> dspi_transfer_one_message(). For consistency make polling and DMA modes
>> use the same mechanism.
>>
>> Refactor dspi_rxtx() and dspi_poll() to not return -EINPROGRESS because
>> this isn't actually a status that was ever returned to the core layer
>> but some internal state. Wherever that was used we can look at dspi->len
>> instead.
>>
>> No functional changes intended.
>>
>> Signed-off-by: James Clark <james.clark@linaro.org>
>> ---
>
> This commit doesn't work, please do not merge this patch.
>
> You are changing the logic in DMA mode, interrupt-based FIFO and PIO all
> in one go, in a commit whose title and primary purpose is unrelated to
> that. Just a mention of the type "while at it, also do that". And in
> that process, that bundled refactoring introduces a subtle, but severe bug.
>
> No, that is discouraged. Make one patch per logical change, where only
> one thing is happening and which is obviously correct. It helps you and
> it helps the reviewer.
>
> Please find attached a set of 3 patches that represent a broken down and
> corrected variant of this one. First 2 should be squashed together in
> your next submission, they are just to illustrate the bug that you've
> introduced (which can be reproduced on any SoC in XSPI mode).
>
Thanks for the debugging, yes it looks like the patches could be broken
down a bit.
Just for clarity, is this bug affecting host+polling mode? I can see the
logic bug in dspi_poll() which I must have tested less thoroughly, but I
can't actually see any difference in dspi_interrupt().
WRT the patches attached, we could remove dspi_rxtx(). It's only
encapsulating two function calls, and the return half way through is
something that is for the outer loop in the caller anyway. Open open
coding the dspi_fifo_read() and dspi_fifo_write() ends up being clearer IMO.
> The panic message is slightly confusing and does not directly point to
> the issue, I'm attaching it just for the sake of having a future reference.
>
The panic is a bit confusing, I didn't expect to see dspi_interrupt() if
it only affects polling mode. Do you have a reproducer? I wasn't able to
reproduce it in interrupt mode.
> [ 4.154185] DSA: tree 0 setup
> [ 4.157380] sja1105 spi2.0: Probed switch chip: SJA1105S
> [ 4.173894] sja1105 spi2.0: configuring for fixed/sgmii link mode
> [ 4.232527] sja1105 spi2.0: Link is Up - 1Gbps/Full - flow control off
> [ 4.312798] sja1105 spi2.0 sw0p0 (uninitialized): PHY [0000:00:00.3:07] driver [RTL8211F Gigabit Ethernet] (irq=POLL)
> [ 4.443689] sja1105 spi2.0 sw0p1 (uninitialized): PHY [0000:00:00.3:00] driver [Microsemi GE VSC8502 SyncE] (irq=POLL)
> [ 4.575718] sja1105 spi2.0 sw0p2 (uninitialized): PHY [0000:00:00.3:01] driver [Microsemi GE VSC8502 SyncE] (irq=POLL)
> [ 4.588012] Unable to handle kernel paging request at virtual address ffff8000801ac000
> [ 4.595960] Mem abort info:
> [ 4.598757] ESR = 0x0000000096000007
> [ 4.602515] EC = 0x25: DABT (current EL), IL = 32 bits
> [ 4.607843] SET = 0, FnV = 0
> [ 4.610902] EA = 0, S1PTW = 0
> [ 4.614048] FSC = 0x07: level 3 translation fault
> [ 4.618939] Data abort info:
> [ 4.621822] ISV = 0, ISS = 0x00000007, ISS2 = 0x00000000
> [ 4.627323] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
> [ 4.632388] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> [ 4.637714] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000082b7a000
> [ 4.644437] [ffff8000801ac000] pgd=0000000000000000, p4d=1000002080020403, pud=1000002080021403, pmd=1000002080022403, pte=0000000000000000
> [ 4.657016] Internal error: Oops: 0000000096000007 [#1] SMP
> [ 4.662693] Modules linked in:
> [ 4.665756] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.16.0-rc3+ #30 PREEMPT
> [ 4.673615] Hardware name: random LS1028A board
> [ 4.679116] pstate: 200000c5 (nzCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [ 4.686103] pc : dspi_8on32_host_to_dev+0x8/0x24
> [ 4.690742] lr : dspi_fifo_write+0x178/0x1cc
> [ 4.695025] sp : ffff800080003eb0
> [ 4.698346] x29: ffff800080003ec0 x28: ffffc25414698b00 x27: ffffc2541464c170
> [ 4.705512] x26: 0000000000000001 x25: ffffc25414b06000 x24: 0000000111705fd3
> [ 4.712677] x23: ffffc25414257bae x22: ffff8000801ab5e8 x21: 00000000fffffd98
> [ 4.719842] x20: 0000000000000000 x19: ffff00200039a480 x18: 0000000000000006
> [ 4.727007] x17: ffff3dcc6b076000 x16: ffff800080000000 x15: 0000000078b30c40
> [ 4.734171] x14: 0000000000000000 x13: 0000000000000048 x12: 0000000000000128
> [ 4.741335] x11: 0000000000000001 x10: 0000000000000000 x9 : 0000000100010001
> [ 4.748500] x8 : ffff8000801ac000 x7 : 0000000000000000 x6 : 0000000000000000
> [ 4.755664] x5 : 0000000000000000 x4 : ffffc25411a308d0 x3 : 0000000000000000
> [ 4.762828] x2 : 0000000000000000 x1 : ffff800080003eb4 x0 : ffff00200039a480
> [ 4.769992] Call trace:
> [ 4.772441] dspi_8on32_host_to_dev+0x8/0x24 (P)
> [ 4.777074] dspi_interrupt+0x6c/0xf0
> [ 4.780747] __handle_irq_event_percpu+0x8c/0x160
> [ 4.785470] handle_irq_event+0x48/0xa0
> [ 4.789319] handle_fasteoi_irq+0xf4/0x208
> [ 4.793428] generic_handle_domain_irq+0x40/0x64
> [ 4.798060] gic_handle_irq+0x4c/0x110
> [ 4.801820] call_on_irq_stack+0x24/0x30
> [ 4.805757] el1_interrupt+0x74/0xc0
> [ 4.809346] el1h_64_irq_handler+0x18/0x24
> [ 4.813457] el1h_64_irq+0x6c/0x70
> [ 4.816867] arch_local_irq_enable+0x8/0xc (P)
> [ 4.821330] cpuidle_enter+0x38/0x50
> [ 4.824914] do_idle+0x1c4/0x250
> [ 4.828152] cpu_startup_entry+0x34/0x38
> [ 4.832087] kernel_init+0x0/0x1a0
> [ 4.835500] start_kernel+0x2ec/0x398
> [ 4.839175] __primary_switched+0x88/0x90
> [ 4.843200] Code: f9003008 d65f03c0 d503245f f9402c08 (b9400108)
> [ 4.849313] ---[ end trace 0000000000000000 ]---
> [ 4.853943] Kernel panic - not syncing: Oops: Fatal exception in interrupt
> [ 4.860840] SMP: stopping secondary CPUs
> [ 4.864788] Kernel Offset: 0x425391a00000 from 0xffff800080000000
> [ 4.870900] PHYS_OFFSET: 0xfff1000080000000
> [ 4.875093] CPU features: 0x1000,000804b0,02000801,0400421b
> [ 4.880683] Memory Limit: none
> [ 4.883750] ---[ end Kernel panic - not syncing: Oops: Fatal exception in interrupt ]---
>
> I still intend to do more testing, so please don't send the next version
> just yet. Tracking down this issue took a bit more than I was planning.
We could also revert to the original idea of adding the status int to
struct fsl_dspi. Much simpler and nothing needs to be changed except
reading it out in interrupt mode.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v4 2/6] spi: spi-fsl-dspi: Store status directly in cur_msg->status
2025-06-30 12:54 ` James Clark
@ 2025-06-30 20:41 ` Vladimir Oltean
2025-07-01 10:02 ` James Clark
0 siblings, 1 reply; 35+ messages in thread
From: Vladimir Oltean @ 2025-06-30 20:41 UTC (permalink / raw)
To: James Clark
Cc: Vladimir Oltean, Mark Brown, Arnd Bergmann, Larisa Grigore,
Frank Li, Christoph Hellwig, linux-spi, imx, linux-kernel
On Mon, Jun 30, 2025 at 01:54:11PM +0100, James Clark wrote:
> On 27/06/2025 10:30 pm, Vladimir Oltean wrote:
> > On Fri, Jun 27, 2025 at 11:21:38AM +0100, James Clark wrote:
> > > This will allow us to return a status from the interrupt handler in a
> > > later commit and avoids copying it at the end of
> > > dspi_transfer_one_message(). For consistency make polling and DMA modes
> > > use the same mechanism.
> > >
> > > Refactor dspi_rxtx() and dspi_poll() to not return -EINPROGRESS because
> > > this isn't actually a status that was ever returned to the core layer
> > > but some internal state. Wherever that was used we can look at dspi->len
> > > instead.
> > >
> > > No functional changes intended.
> > >
> > > Signed-off-by: James Clark <james.clark@linaro.org>
> > > ---
> >
> > This commit doesn't work, please do not merge this patch.
> >
> > You are changing the logic in DMA mode, interrupt-based FIFO and PIO all
> > in one go, in a commit whose title and primary purpose is unrelated to
> > that. Just a mention of the type "while at it, also do that". And in
> > that process, that bundled refactoring introduces a subtle, but severe bug.
> >
> > No, that is discouraged. Make one patch per logical change, where only
> > one thing is happening and which is obviously correct. It helps you and
> > it helps the reviewer.
> >
> > Please find attached a set of 3 patches that represent a broken down and
> > corrected variant of this one. First 2 should be squashed together in
> > your next submission, they are just to illustrate the bug that you've
> > introduced (which can be reproduced on any SoC in XSPI mode).
> >
>
> Thanks for the debugging, yes it looks like the patches could be broken down
> a bit.
>
> Just for clarity, is this bug affecting host+polling mode? I can see the
> logic bug in dspi_poll() which I must have tested less thoroughly, but I
> can't actually see any difference in dspi_interrupt().
It should affect both, I tested your patches unmodified, i.e. interrupt
based XSPI FIFO mode (in master mode).
Assume (not real numbers, just for explanation's sake) dspi->len is 2
(2 FIFO sizes worth of 32-bit words, but let's assume for simplicity
that each dspi_pop_tx() call simply decrements the len by 1).
The correct behavior would be this:
dspi_transfer_one_message()
-> dspi->len = 2
-> dspi_fifo_write()
-> dspi_xspi_fifo_write()
-> dspi_pop_tx()
-> dspi->len = 1
-> wait_for_completion(&dspi->xfer_done)
<IRQ>
dspi_interrupt()
-> dspi_rxtx()
-> dspi_fifo_read()
-> dspi_fifo_write()
-> dspi_xspi_fifo_write()
-> dspi_pop_tx()
-> dspi->len = 0
<IRQ>
dspi_interrupt()
-> dspi_rxtx()
-> dspi_fifo_read()
-> complete(&dspi->xfer_done)
-> reinit_completion(&dspi->xfer_done)
but the behavior with your proposed logic is this:
dspi_transfer_one_message()
-> dspi->len = 2
-> dspi_fifo_write()
-> dspi_xspi_fifo_write()
-> dspi_pop_tx()
-> dspi->len = 1
-> wait_for_completion(&dspi->xfer_done)
<IRQ>
dspi_interrupt()
-> dspi_rxtx()
-> dspi_fifo_read()
-> dspi_fifo_write()
-> dspi_xspi_fifo_write()
-> dspi_pop_tx()
-> dspi->len = 0
-> complete(&dspi->xfer_done)
-> reinit_completion(&dspi->xfer_done)
<IRQ>
dspi_interrupt()
-> Second interrupt is spurious at
this point, since the process
context may have proceeded
to change pointers in
dspi->cur_transfer, etc.
Clearer now? Essentially the complete() call is premature, it needs to
be not after the dspi_fifo_write() call, but after its subsequent
dspi_fifo_read(), which comes after yet another IRQ, in the IRQ-triggered
path.
Not sure why you are not able to reproduce this, maybe luck had it that
the complete() call never woke up the process context earlier than the
second IRQ in the above case triggered?
I'm not doing anything special in particular, just booted a board with a
SPI device driver (sja1105). This transfers some sequences of relatively
large buffers (256 bytes) at probe time, maybe that exercises the
controller driver more than the average peripheral driver.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v4 2/6] spi: spi-fsl-dspi: Store status directly in cur_msg->status
2025-06-30 20:41 ` Vladimir Oltean
@ 2025-07-01 10:02 ` James Clark
0 siblings, 0 replies; 35+ messages in thread
From: James Clark @ 2025-07-01 10:02 UTC (permalink / raw)
To: Vladimir Oltean
Cc: Vladimir Oltean, Mark Brown, Arnd Bergmann, Larisa Grigore,
Frank Li, Christoph Hellwig, linux-spi, imx, linux-kernel
On 30/06/2025 9:41 pm, Vladimir Oltean wrote:
> On Mon, Jun 30, 2025 at 01:54:11PM +0100, James Clark wrote:
>> On 27/06/2025 10:30 pm, Vladimir Oltean wrote:
>>> On Fri, Jun 27, 2025 at 11:21:38AM +0100, James Clark wrote:
>>>> This will allow us to return a status from the interrupt handler in a
>>>> later commit and avoids copying it at the end of
>>>> dspi_transfer_one_message(). For consistency make polling and DMA modes
>>>> use the same mechanism.
>>>>
>>>> Refactor dspi_rxtx() and dspi_poll() to not return -EINPROGRESS because
>>>> this isn't actually a status that was ever returned to the core layer
>>>> but some internal state. Wherever that was used we can look at dspi->len
>>>> instead.
>>>>
>>>> No functional changes intended.
>>>>
>>>> Signed-off-by: James Clark <james.clark@linaro.org>
>>>> ---
>>>
>>> This commit doesn't work, please do not merge this patch.
>>>
>>> You are changing the logic in DMA mode, interrupt-based FIFO and PIO all
>>> in one go, in a commit whose title and primary purpose is unrelated to
>>> that. Just a mention of the type "while at it, also do that". And in
>>> that process, that bundled refactoring introduces a subtle, but severe bug.
>>>
>>> No, that is discouraged. Make one patch per logical change, where only
>>> one thing is happening and which is obviously correct. It helps you and
>>> it helps the reviewer.
>>>
>>> Please find attached a set of 3 patches that represent a broken down and
>>> corrected variant of this one. First 2 should be squashed together in
>>> your next submission, they are just to illustrate the bug that you've
>>> introduced (which can be reproduced on any SoC in XSPI mode).
>>>
>>
>> Thanks for the debugging, yes it looks like the patches could be broken down
>> a bit.
>>
>> Just for clarity, is this bug affecting host+polling mode? I can see the
>> logic bug in dspi_poll() which I must have tested less thoroughly, but I
>> can't actually see any difference in dspi_interrupt().
>
> It should affect both, I tested your patches unmodified, i.e. interrupt
> based XSPI FIFO mode (in master mode).
>
> Assume (not real numbers, just for explanation's sake) dspi->len is 2
> (2 FIFO sizes worth of 32-bit words, but let's assume for simplicity
> that each dspi_pop_tx() call simply decrements the len by 1).
>
> The correct behavior would be this:
>
> dspi_transfer_one_message()
> -> dspi->len = 2
> -> dspi_fifo_write()
> -> dspi_xspi_fifo_write()
> -> dspi_pop_tx()
> -> dspi->len = 1
> -> wait_for_completion(&dspi->xfer_done)
> <IRQ>
> dspi_interrupt()
> -> dspi_rxtx()
> -> dspi_fifo_read()
> -> dspi_fifo_write()
> -> dspi_xspi_fifo_write()
> -> dspi_pop_tx()
> -> dspi->len = 0
> <IRQ>
> dspi_interrupt()
> -> dspi_rxtx()
> -> dspi_fifo_read()
> -> complete(&dspi->xfer_done)
> -> reinit_completion(&dspi->xfer_done)
>
> but the behavior with your proposed logic is this:
>
> dspi_transfer_one_message()
> -> dspi->len = 2
> -> dspi_fifo_write()
> -> dspi_xspi_fifo_write()
> -> dspi_pop_tx()
> -> dspi->len = 1
> -> wait_for_completion(&dspi->xfer_done)
> <IRQ>
> dspi_interrupt()
> -> dspi_rxtx()
> -> dspi_fifo_read()
> -> dspi_fifo_write()
> -> dspi_xspi_fifo_write()
> -> dspi_pop_tx()
> -> dspi->len = 0
> -> complete(&dspi->xfer_done)
> -> reinit_completion(&dspi->xfer_done)
> <IRQ>
> dspi_interrupt()
> -> Second interrupt is spurious at
> this point, since the process
> context may have proceeded
> to change pointers in
> dspi->cur_transfer, etc.
>
> Clearer now? Essentially the complete() call is premature, it needs to
> be not after the dspi_fifo_write() call, but after its subsequent
> dspi_fifo_read(), which comes after yet another IRQ, in the IRQ-triggered
> path.
>
Much clearer, thanks. Not sure how I missed that, maybe a confusion
about whether it was dspi_fifo_read() or dspi_fifo_write() that modifies
dspi->len.
> Not sure why you are not able to reproduce this, maybe luck had it that
> the complete() call never woke up the process context earlier than the
> second IRQ in the above case triggered?
>
> I'm not doing anything special in particular, just booted a board with a
> SPI device driver (sja1105). This transfers some sequences of relatively
> large buffers (256 bytes) at probe time, maybe that exercises the
> controller driver more than the average peripheral driver.
It's strange because I was stressing it quite a lot, especially with the
performance testing.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v4 2/6] spi: spi-fsl-dspi: Store status directly in cur_msg->status
2025-06-27 21:30 ` Vladimir Oltean
2025-06-30 12:54 ` James Clark
@ 2025-07-21 13:25 ` James Clark
2025-07-21 13:39 ` Vladimir Oltean
1 sibling, 1 reply; 35+ messages in thread
From: James Clark @ 2025-07-21 13:25 UTC (permalink / raw)
To: Vladimir Oltean
Cc: Vladimir Oltean, Mark Brown, Arnd Bergmann, Larisa Grigore,
Frank Li, Christoph Hellwig, linux-spi, imx, linux-kernel
On 27/06/2025 10:30 pm, Vladimir Oltean wrote:
> On Fri, Jun 27, 2025 at 11:21:38AM +0100, James Clark wrote:
>> This will allow us to return a status from the interrupt handler in a
>> later commit and avoids copying it at the end of
>> dspi_transfer_one_message(). For consistency make polling and DMA modes
>> use the same mechanism.
>>
>> Refactor dspi_rxtx() and dspi_poll() to not return -EINPROGRESS because
>> this isn't actually a status that was ever returned to the core layer
>> but some internal state. Wherever that was used we can look at dspi->len
>> instead.
>>
>> No functional changes intended.
>>
>> Signed-off-by: James Clark <james.clark@linaro.org>
>> ---
>
> This commit doesn't work, please do not merge this patch.
>
> You are changing the logic in DMA mode, interrupt-based FIFO and PIO all
> in one go, in a commit whose title and primary purpose is unrelated to
> that. Just a mention of the type "while at it, also do that". And in
> that process, that bundled refactoring introduces a subtle, but severe bug.
>
> No, that is discouraged. Make one patch per logical change, where only
> one thing is happening and which is obviously correct. It helps you and
> it helps the reviewer.
>
> Please find attached a set of 3 patches that represent a broken down and
> corrected variant of this one. First 2 should be squashed together in
> your next submission, they are just to illustrate the bug that you've
> introduced (which can be reproduced on any SoC in XSPI mode).
>
> The panic message is slightly confusing and does not directly point to
> the issue, I'm attaching it just for the sake of having a future reference.
>
> [ 4.154185] DSA: tree 0 setup
> [ 4.157380] sja1105 spi2.0: Probed switch chip: SJA1105S
> [ 4.173894] sja1105 spi2.0: configuring for fixed/sgmii link mode
> [ 4.232527] sja1105 spi2.0: Link is Up - 1Gbps/Full - flow control off
> [ 4.312798] sja1105 spi2.0 sw0p0 (uninitialized): PHY [0000:00:00.3:07] driver [RTL8211F Gigabit Ethernet] (irq=POLL)
> [ 4.443689] sja1105 spi2.0 sw0p1 (uninitialized): PHY [0000:00:00.3:00] driver [Microsemi GE VSC8502 SyncE] (irq=POLL)
> [ 4.575718] sja1105 spi2.0 sw0p2 (uninitialized): PHY [0000:00:00.3:01] driver [Microsemi GE VSC8502 SyncE] (irq=POLL)
> [ 4.588012] Unable to handle kernel paging request at virtual address ffff8000801ac000
> [ 4.595960] Mem abort info:
> [ 4.598757] ESR = 0x0000000096000007
> [ 4.602515] EC = 0x25: DABT (current EL), IL = 32 bits
> [ 4.607843] SET = 0, FnV = 0
> [ 4.610902] EA = 0, S1PTW = 0
> [ 4.614048] FSC = 0x07: level 3 translation fault
> [ 4.618939] Data abort info:
> [ 4.621822] ISV = 0, ISS = 0x00000007, ISS2 = 0x00000000
> [ 4.627323] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
> [ 4.632388] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> [ 4.637714] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000082b7a000
> [ 4.644437] [ffff8000801ac000] pgd=0000000000000000, p4d=1000002080020403, pud=1000002080021403, pmd=1000002080022403, pte=0000000000000000
> [ 4.657016] Internal error: Oops: 0000000096000007 [#1] SMP
> [ 4.662693] Modules linked in:
> [ 4.665756] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.16.0-rc3+ #30 PREEMPT
> [ 4.673615] Hardware name: random LS1028A board
> [ 4.679116] pstate: 200000c5 (nzCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [ 4.686103] pc : dspi_8on32_host_to_dev+0x8/0x24
> [ 4.690742] lr : dspi_fifo_write+0x178/0x1cc
> [ 4.695025] sp : ffff800080003eb0
> [ 4.698346] x29: ffff800080003ec0 x28: ffffc25414698b00 x27: ffffc2541464c170
> [ 4.705512] x26: 0000000000000001 x25: ffffc25414b06000 x24: 0000000111705fd3
> [ 4.712677] x23: ffffc25414257bae x22: ffff8000801ab5e8 x21: 00000000fffffd98
> [ 4.719842] x20: 0000000000000000 x19: ffff00200039a480 x18: 0000000000000006
> [ 4.727007] x17: ffff3dcc6b076000 x16: ffff800080000000 x15: 0000000078b30c40
> [ 4.734171] x14: 0000000000000000 x13: 0000000000000048 x12: 0000000000000128
> [ 4.741335] x11: 0000000000000001 x10: 0000000000000000 x9 : 0000000100010001
> [ 4.748500] x8 : ffff8000801ac000 x7 : 0000000000000000 x6 : 0000000000000000
> [ 4.755664] x5 : 0000000000000000 x4 : ffffc25411a308d0 x3 : 0000000000000000
> [ 4.762828] x2 : 0000000000000000 x1 : ffff800080003eb4 x0 : ffff00200039a480
> [ 4.769992] Call trace:
> [ 4.772441] dspi_8on32_host_to_dev+0x8/0x24 (P)
> [ 4.777074] dspi_interrupt+0x6c/0xf0
> [ 4.780747] __handle_irq_event_percpu+0x8c/0x160
> [ 4.785470] handle_irq_event+0x48/0xa0
> [ 4.789319] handle_fasteoi_irq+0xf4/0x208
> [ 4.793428] generic_handle_domain_irq+0x40/0x64
> [ 4.798060] gic_handle_irq+0x4c/0x110
> [ 4.801820] call_on_irq_stack+0x24/0x30
> [ 4.805757] el1_interrupt+0x74/0xc0
> [ 4.809346] el1h_64_irq_handler+0x18/0x24
> [ 4.813457] el1h_64_irq+0x6c/0x70
> [ 4.816867] arch_local_irq_enable+0x8/0xc (P)
> [ 4.821330] cpuidle_enter+0x38/0x50
> [ 4.824914] do_idle+0x1c4/0x250
> [ 4.828152] cpu_startup_entry+0x34/0x38
> [ 4.832087] kernel_init+0x0/0x1a0
> [ 4.835500] start_kernel+0x2ec/0x398
> [ 4.839175] __primary_switched+0x88/0x90
> [ 4.843200] Code: f9003008 d65f03c0 d503245f f9402c08 (b9400108)
> [ 4.849313] ---[ end trace 0000000000000000 ]---
> [ 4.853943] Kernel panic - not syncing: Oops: Fatal exception in interrupt
> [ 4.860840] SMP: stopping secondary CPUs
> [ 4.864788] Kernel Offset: 0x425391a00000 from 0xffff800080000000
> [ 4.870900] PHYS_OFFSET: 0xfff1000080000000
> [ 4.875093] CPU features: 0x1000,000804b0,02000801,0400421b
> [ 4.880683] Memory Limit: none
> [ 4.883750] ---[ end Kernel panic - not syncing: Oops: Fatal exception in interrupt ]---
>
> I still intend to do more testing, so please don't send the next version
> just yet. Tracking down this issue took a bit more than I was planning.
Hi Vladimir,
Just wanted to check if you are ok for me to send a new version with
your fixes included now?
I assume from the other discussion that we don't want to always enable
DMA mode either, and we'll just leave it for s32g target mode only?
Thanks
James
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v4 2/6] spi: spi-fsl-dspi: Store status directly in cur_msg->status
2025-07-21 13:25 ` James Clark
@ 2025-07-21 13:39 ` Vladimir Oltean
2025-07-21 14:02 ` James Clark
0 siblings, 1 reply; 35+ messages in thread
From: Vladimir Oltean @ 2025-07-21 13:39 UTC (permalink / raw)
To: James Clark
Cc: Vladimir Oltean, Mark Brown, Arnd Bergmann, Larisa Grigore,
Frank Li, Christoph Hellwig, linux-spi, imx, linux-kernel
On Mon, Jul 21, 2025 at 02:25:55PM +0100, James Clark wrote:
> Hi Vladimir,
>
> Just wanted to check if you are ok for me to send a new version with your
> fixes included now?
Yes, sorry for not being clear, the tests which I wanted to run were
these ones:
https://lore.kernel.org/linux-spi/20250630152612.npdobwbcezl5nlym@skbuf/
> I assume from the other discussion that we don't want to always enable DMA
> mode either, and we'll just leave it for s32g target mode only?
Yeah, for sure don't enable DMA mode unconditionally. I don't have time
right now to look into Mark's can_dma() suggestion - if you don't have
either, I suppose it is what it is, and the performance improvements
brought by your enhanced DMA patches can be brought over to other SoCs
at some unspecified time in the future.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v4 2/6] spi: spi-fsl-dspi: Store status directly in cur_msg->status
2025-07-21 13:39 ` Vladimir Oltean
@ 2025-07-21 14:02 ` James Clark
2025-07-21 14:04 ` Mark Brown
0 siblings, 1 reply; 35+ messages in thread
From: James Clark @ 2025-07-21 14:02 UTC (permalink / raw)
To: Vladimir Oltean
Cc: Vladimir Oltean, Mark Brown, Arnd Bergmann, Larisa Grigore,
Frank Li, Christoph Hellwig, linux-spi, imx, linux-kernel
On 21/07/2025 2:39 pm, Vladimir Oltean wrote:
> On Mon, Jul 21, 2025 at 02:25:55PM +0100, James Clark wrote:
>> Hi Vladimir,
>>
>> Just wanted to check if you are ok for me to send a new version with your
>> fixes included now?
>
> Yes, sorry for not being clear, the tests which I wanted to run were
> these ones:
> https://lore.kernel.org/linux-spi/20250630152612.npdobwbcezl5nlym@skbuf/
>
Got it, thanks.
>> I assume from the other discussion that we don't want to always enable DMA
>> mode either, and we'll just leave it for s32g target mode only?
>
> Yeah, for sure don't enable DMA mode unconditionally. I don't have time
> right now to look into Mark's can_dma() suggestion - if you don't have
> either, I suppose it is what it is, and the performance improvements
> brought by your enhanced DMA patches can be brought over to other SoCs
> at some unspecified time in the future.
I think incremental is better yes, there wouldn't be much or any of this
thrown away anyway.
I'm also not sure if it would fit with can_dma as that starts mapping
stuff in the core layer. We want to avoid that because we need to write
that control word for each SPI word. We could still change mode
conditionally in the DSPI driver though, or make can_dma more flexible.
But on top of this for sure.
James
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v4 2/6] spi: spi-fsl-dspi: Store status directly in cur_msg->status
2025-07-21 14:02 ` James Clark
@ 2025-07-21 14:04 ` Mark Brown
0 siblings, 0 replies; 35+ messages in thread
From: Mark Brown @ 2025-07-21 14:04 UTC (permalink / raw)
To: James Clark
Cc: Vladimir Oltean, Vladimir Oltean, Arnd Bergmann, Larisa Grigore,
Frank Li, Christoph Hellwig, linux-spi, imx, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 477 bytes --]
On Mon, Jul 21, 2025 at 03:02:09PM +0100, James Clark wrote:
> I'm also not sure if it would fit with can_dma as that starts mapping stuff
> in the core layer. We want to avoid that because we need to write that
> control word for each SPI word. We could still change mode conditionally in
> the DSPI driver though, or make can_dma more flexible. But on top of this
> for sure.
Even if you need to open code the infrastructure the idea of a copybreak
should still be useful.
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH v4 3/6] spi: spi-fsl-dspi: Stub out DMA functions
2025-06-27 10:21 [PATCH v4 0/6] spi: spi-fsl-dspi: Target mode improvements James Clark
2025-06-27 10:21 ` [PATCH v4 1/6] spi: spi-fsl-dspi: Clear completion counter before initiating transfer James Clark
2025-06-27 10:21 ` [PATCH v4 2/6] spi: spi-fsl-dspi: Store status directly in cur_msg->status James Clark
@ 2025-06-27 10:21 ` James Clark
2025-06-27 10:21 ` [PATCH v4 4/6] spi: spi-fsl-dspi: Use non-coherent memory for DMA James Clark
` (4 subsequent siblings)
7 siblings, 0 replies; 35+ messages in thread
From: James Clark @ 2025-06-27 10:21 UTC (permalink / raw)
To: Vladimir Oltean, Mark Brown, Vladimir Oltean, Arnd Bergmann,
Larisa Grigore, Frank Li, Christoph Hellwig
Cc: linux-spi, imx, linux-kernel, James Clark
This will allow the build to succeed with !CONFIG_HAS_DMA, either due to
a randconfig build test or when the target only uses one of the non-DMA
transfer modes which this driver supports.
Signed-off-by: James Clark <james.clark@linaro.org>
---
drivers/spi/spi-fsl-dspi.c | 34 ++++++++++++++++++++++++----------
1 file changed, 24 insertions(+), 10 deletions(-)
diff --git a/drivers/spi/spi-fsl-dspi.c b/drivers/spi/spi-fsl-dspi.c
index 65567745fe9a..feff475dddfc 100644
--- a/drivers/spi/spi-fsl-dspi.c
+++ b/drivers/spi/spi-fsl-dspi.c
@@ -373,6 +373,8 @@ struct fsl_dspi {
void (*dev_to_host)(struct fsl_dspi *dspi, u32 rxdata);
};
+static void dspi_setup_accel(struct fsl_dspi *dspi);
+
static bool is_s32g_dspi(struct fsl_dspi *data)
{
return data->devtype_data == &devtype_data[S32G] ||
@@ -468,6 +470,16 @@ static u32 dspi_pop_tx(struct fsl_dspi *dspi)
return txdata;
}
+/* Push one word to the RX buffer from the POPR register (RX FIFO) */
+static void dspi_push_rx(struct fsl_dspi *dspi, u32 rxdata)
+{
+ if (!dspi->rx)
+ return;
+ dspi->dev_to_host(dspi, rxdata);
+}
+
+#if IS_ENABLED(CONFIG_DMA_ENGINE)
+
/* Prepare one TX FIFO entry (txdata plus cmd) */
static u32 dspi_pop_tx_pushr(struct fsl_dspi *dspi)
{
@@ -481,14 +493,6 @@ static u32 dspi_pop_tx_pushr(struct fsl_dspi *dspi)
return cmd << 16 | data;
}
-/* Push one word to the RX buffer from the POPR register (RX FIFO) */
-static void dspi_push_rx(struct fsl_dspi *dspi, u32 rxdata)
-{
- if (!dspi->rx)
- return;
- dspi->dev_to_host(dspi, rxdata);
-}
-
static void dspi_tx_dma_callback(void *arg)
{
struct fsl_dspi *dspi = arg;
@@ -589,8 +593,6 @@ static int dspi_next_xfer_dma_submit(struct fsl_dspi *dspi)
return 0;
}
-static void dspi_setup_accel(struct fsl_dspi *dspi);
-
static void dspi_dma_xfer(struct fsl_dspi *dspi)
{
struct spi_message *message = dspi->cur_msg;
@@ -722,6 +724,18 @@ static void dspi_release_dma(struct fsl_dspi *dspi)
dma_release_channel(dma->chan_rx);
}
}
+#else
+static void dspi_dma_xfer(struct fsl_dspi *dspi)
+{
+ dspi->cur_msg->status = -EINVAL;
+}
+static int dspi_request_dma(struct fsl_dspi *dspi, phys_addr_t phy_addr)
+{
+ dev_err(&dspi->pdev->dev, "DMA support not enabled in kernel\n");
+ return -EINVAL;
+}
+static void dspi_release_dma(struct fsl_dspi *dspi) {}
+#endif
static void hz_to_spi_baud(char *pbr, char *br, int speed_hz,
unsigned long clkrate, bool mtf_enabled)
--
2.34.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH v4 4/6] spi: spi-fsl-dspi: Use non-coherent memory for DMA
2025-06-27 10:21 [PATCH v4 0/6] spi: spi-fsl-dspi: Target mode improvements James Clark
` (2 preceding siblings ...)
2025-06-27 10:21 ` [PATCH v4 3/6] spi: spi-fsl-dspi: Stub out DMA functions James Clark
@ 2025-06-27 10:21 ` James Clark
2025-06-27 19:38 ` Frank Li
2025-06-27 10:21 ` [PATCH v4 5/6] spi: spi-fsl-dspi: Increase DMA buffer size James Clark
` (3 subsequent siblings)
7 siblings, 1 reply; 35+ messages in thread
From: James Clark @ 2025-06-27 10:21 UTC (permalink / raw)
To: Vladimir Oltean, Mark Brown, Vladimir Oltean, Arnd Bergmann,
Larisa Grigore, Frank Li, Christoph Hellwig
Cc: linux-spi, imx, linux-kernel, James Clark
Using coherent memory here isn't functionally necessary, we're only
either sending data to the device or reading from it. This means
explicit synchronizations are only required around those points and the
change is fairly trivial.
This gives us around a 10% increase in throughput for large DMA
transfers and no loss for small transfers.
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: James Clark <james.clark@linaro.org>
---
drivers/spi/spi-fsl-dspi.c | 55 +++++++++++++++++++++++++++++-----------------
1 file changed, 35 insertions(+), 20 deletions(-)
diff --git a/drivers/spi/spi-fsl-dspi.c b/drivers/spi/spi-fsl-dspi.c
index feff475dddfc..e7856f9c9440 100644
--- a/drivers/spi/spi-fsl-dspi.c
+++ b/drivers/spi/spi-fsl-dspi.c
@@ -493,11 +493,19 @@ static u32 dspi_pop_tx_pushr(struct fsl_dspi *dspi)
return cmd << 16 | data;
}
+static int dspi_dma_transfer_size(struct fsl_dspi *dspi)
+{
+ return dspi->words_in_flight * DMA_SLAVE_BUSWIDTH_4_BYTES;
+}
+
static void dspi_tx_dma_callback(void *arg)
{
struct fsl_dspi *dspi = arg;
struct fsl_dspi_dma *dma = dspi->dma;
+ struct device *dev = &dspi->pdev->dev;
+ dma_sync_single_for_cpu(dev, dma->tx_dma_phys,
+ dspi_dma_transfer_size(dspi), DMA_TO_DEVICE);
complete(&dma->cmd_tx_complete);
}
@@ -505,9 +513,13 @@ static void dspi_rx_dma_callback(void *arg)
{
struct fsl_dspi *dspi = arg;
struct fsl_dspi_dma *dma = dspi->dma;
+ struct device *dev = &dspi->pdev->dev;
int i;
if (dspi->rx) {
+ dma_sync_single_for_cpu(dev, dma->rx_dma_phys,
+ dspi_dma_transfer_size(dspi),
+ DMA_FROM_DEVICE);
for (i = 0; i < dspi->words_in_flight; i++)
dspi_push_rx(dspi, dspi->dma->rx_dma_buf[i]);
}
@@ -517,6 +529,7 @@ static void dspi_rx_dma_callback(void *arg)
static int dspi_next_xfer_dma_submit(struct fsl_dspi *dspi)
{
+ size_t size = dspi_dma_transfer_size(dspi);
struct device *dev = &dspi->pdev->dev;
struct fsl_dspi_dma *dma = dspi->dma;
int time_left;
@@ -525,10 +538,9 @@ static int dspi_next_xfer_dma_submit(struct fsl_dspi *dspi)
for (i = 0; i < dspi->words_in_flight; i++)
dspi->dma->tx_dma_buf[i] = dspi_pop_tx_pushr(dspi);
+ dma_sync_single_for_device(dev, dma->tx_dma_phys, size, DMA_TO_DEVICE);
dma->tx_desc = dmaengine_prep_slave_single(dma->chan_tx,
- dma->tx_dma_phys,
- dspi->words_in_flight *
- DMA_SLAVE_BUSWIDTH_4_BYTES,
+ dma->tx_dma_phys, size,
DMA_MEM_TO_DEV,
DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
if (!dma->tx_desc) {
@@ -543,10 +555,10 @@ static int dspi_next_xfer_dma_submit(struct fsl_dspi *dspi)
return -EINVAL;
}
+ dma_sync_single_for_device(dev, dma->rx_dma_phys, size,
+ DMA_FROM_DEVICE);
dma->rx_desc = dmaengine_prep_slave_single(dma->chan_rx,
- dma->rx_dma_phys,
- dspi->words_in_flight *
- DMA_SLAVE_BUSWIDTH_4_BYTES,
+ dma->rx_dma_phys, size,
DMA_DEV_TO_MEM,
DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
if (!dma->rx_desc) {
@@ -643,17 +655,17 @@ static int dspi_request_dma(struct fsl_dspi *dspi, phys_addr_t phy_addr)
goto err_tx_channel;
}
- dma->tx_dma_buf = dma_alloc_coherent(dma->chan_tx->device->dev,
- dma_bufsize, &dma->tx_dma_phys,
- GFP_KERNEL);
+ dma->tx_dma_buf = dma_alloc_noncoherent(dma->chan_tx->device->dev,
+ dma_bufsize, &dma->tx_dma_phys,
+ DMA_TO_DEVICE, GFP_KERNEL);
if (!dma->tx_dma_buf) {
ret = -ENOMEM;
goto err_tx_dma_buf;
}
- dma->rx_dma_buf = dma_alloc_coherent(dma->chan_rx->device->dev,
- dma_bufsize, &dma->rx_dma_phys,
- GFP_KERNEL);
+ dma->rx_dma_buf = dma_alloc_noncoherent(dma->chan_rx->device->dev,
+ dma_bufsize, &dma->rx_dma_phys,
+ DMA_FROM_DEVICE, GFP_KERNEL);
if (!dma->rx_dma_buf) {
ret = -ENOMEM;
goto err_rx_dma_buf;
@@ -688,11 +700,12 @@ static int dspi_request_dma(struct fsl_dspi *dspi, phys_addr_t phy_addr)
return 0;
err_slave_config:
- dma_free_coherent(dma->chan_rx->device->dev,
- dma_bufsize, dma->rx_dma_buf, dma->rx_dma_phys);
+ dma_free_noncoherent(dma->chan_rx->device->dev, dma_bufsize,
+ dma->rx_dma_buf, dma->rx_dma_phys,
+ DMA_FROM_DEVICE);
err_rx_dma_buf:
- dma_free_coherent(dma->chan_tx->device->dev,
- dma_bufsize, dma->tx_dma_buf, dma->tx_dma_phys);
+ dma_free_noncoherent(dma->chan_tx->device->dev, dma_bufsize,
+ dma->tx_dma_buf, dma->tx_dma_phys, DMA_TO_DEVICE);
err_tx_dma_buf:
dma_release_channel(dma->chan_tx);
err_tx_channel:
@@ -713,14 +726,16 @@ static void dspi_release_dma(struct fsl_dspi *dspi)
return;
if (dma->chan_tx) {
- dma_free_coherent(dma->chan_tx->device->dev, dma_bufsize,
- dma->tx_dma_buf, dma->tx_dma_phys);
+ dma_free_noncoherent(dma->chan_tx->device->dev, dma_bufsize,
+ dma->tx_dma_buf, dma->tx_dma_phys,
+ DMA_TO_DEVICE);
dma_release_channel(dma->chan_tx);
}
if (dma->chan_rx) {
- dma_free_coherent(dma->chan_rx->device->dev, dma_bufsize,
- dma->rx_dma_buf, dma->rx_dma_phys);
+ dma_free_noncoherent(dma->chan_rx->device->dev, dma_bufsize,
+ dma->rx_dma_buf, dma->rx_dma_phys,
+ DMA_FROM_DEVICE);
dma_release_channel(dma->chan_rx);
}
}
--
2.34.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [PATCH v4 4/6] spi: spi-fsl-dspi: Use non-coherent memory for DMA
2025-06-27 10:21 ` [PATCH v4 4/6] spi: spi-fsl-dspi: Use non-coherent memory for DMA James Clark
@ 2025-06-27 19:38 ` Frank Li
0 siblings, 0 replies; 35+ messages in thread
From: Frank Li @ 2025-06-27 19:38 UTC (permalink / raw)
To: James Clark
Cc: Vladimir Oltean, Mark Brown, Vladimir Oltean, Arnd Bergmann,
Larisa Grigore, Christoph Hellwig, linux-spi, imx, linux-kernel
On Fri, Jun 27, 2025 at 11:21:40AM +0100, James Clark wrote:
> Using coherent memory here isn't functionally necessary, we're only
> either sending data to the device or reading from it. This means
> explicit synchronizations are only required around those points and the
> change is fairly trivial.
>
> This gives us around a 10% increase in throughput for large DMA
> transfers and no loss for small transfers.
>
> Suggested-by: Arnd Bergmann <arnd@arndb.de>
> Signed-off-by: James Clark <james.clark@linaro.org>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
> ---
> drivers/spi/spi-fsl-dspi.c | 55 +++++++++++++++++++++++++++++-----------------
> 1 file changed, 35 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/spi/spi-fsl-dspi.c b/drivers/spi/spi-fsl-dspi.c
> index feff475dddfc..e7856f9c9440 100644
> --- a/drivers/spi/spi-fsl-dspi.c
> +++ b/drivers/spi/spi-fsl-dspi.c
> @@ -493,11 +493,19 @@ static u32 dspi_pop_tx_pushr(struct fsl_dspi *dspi)
> return cmd << 16 | data;
> }
>
> +static int dspi_dma_transfer_size(struct fsl_dspi *dspi)
> +{
> + return dspi->words_in_flight * DMA_SLAVE_BUSWIDTH_4_BYTES;
> +}
> +
> static void dspi_tx_dma_callback(void *arg)
> {
> struct fsl_dspi *dspi = arg;
> struct fsl_dspi_dma *dma = dspi->dma;
> + struct device *dev = &dspi->pdev->dev;
>
> + dma_sync_single_for_cpu(dev, dma->tx_dma_phys,
> + dspi_dma_transfer_size(dspi), DMA_TO_DEVICE);
> complete(&dma->cmd_tx_complete);
> }
>
> @@ -505,9 +513,13 @@ static void dspi_rx_dma_callback(void *arg)
> {
> struct fsl_dspi *dspi = arg;
> struct fsl_dspi_dma *dma = dspi->dma;
> + struct device *dev = &dspi->pdev->dev;
> int i;
>
> if (dspi->rx) {
> + dma_sync_single_for_cpu(dev, dma->rx_dma_phys,
> + dspi_dma_transfer_size(dspi),
> + DMA_FROM_DEVICE);
> for (i = 0; i < dspi->words_in_flight; i++)
> dspi_push_rx(dspi, dspi->dma->rx_dma_buf[i]);
> }
> @@ -517,6 +529,7 @@ static void dspi_rx_dma_callback(void *arg)
>
> static int dspi_next_xfer_dma_submit(struct fsl_dspi *dspi)
> {
> + size_t size = dspi_dma_transfer_size(dspi);
> struct device *dev = &dspi->pdev->dev;
> struct fsl_dspi_dma *dma = dspi->dma;
> int time_left;
> @@ -525,10 +538,9 @@ static int dspi_next_xfer_dma_submit(struct fsl_dspi *dspi)
> for (i = 0; i < dspi->words_in_flight; i++)
> dspi->dma->tx_dma_buf[i] = dspi_pop_tx_pushr(dspi);
>
> + dma_sync_single_for_device(dev, dma->tx_dma_phys, size, DMA_TO_DEVICE);
> dma->tx_desc = dmaengine_prep_slave_single(dma->chan_tx,
> - dma->tx_dma_phys,
> - dspi->words_in_flight *
> - DMA_SLAVE_BUSWIDTH_4_BYTES,
> + dma->tx_dma_phys, size,
> DMA_MEM_TO_DEV,
> DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
> if (!dma->tx_desc) {
> @@ -543,10 +555,10 @@ static int dspi_next_xfer_dma_submit(struct fsl_dspi *dspi)
> return -EINVAL;
> }
>
> + dma_sync_single_for_device(dev, dma->rx_dma_phys, size,
> + DMA_FROM_DEVICE);
> dma->rx_desc = dmaengine_prep_slave_single(dma->chan_rx,
> - dma->rx_dma_phys,
> - dspi->words_in_flight *
> - DMA_SLAVE_BUSWIDTH_4_BYTES,
> + dma->rx_dma_phys, size,
> DMA_DEV_TO_MEM,
> DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
> if (!dma->rx_desc) {
> @@ -643,17 +655,17 @@ static int dspi_request_dma(struct fsl_dspi *dspi, phys_addr_t phy_addr)
> goto err_tx_channel;
> }
>
> - dma->tx_dma_buf = dma_alloc_coherent(dma->chan_tx->device->dev,
> - dma_bufsize, &dma->tx_dma_phys,
> - GFP_KERNEL);
> + dma->tx_dma_buf = dma_alloc_noncoherent(dma->chan_tx->device->dev,
> + dma_bufsize, &dma->tx_dma_phys,
> + DMA_TO_DEVICE, GFP_KERNEL);
> if (!dma->tx_dma_buf) {
> ret = -ENOMEM;
> goto err_tx_dma_buf;
> }
>
> - dma->rx_dma_buf = dma_alloc_coherent(dma->chan_rx->device->dev,
> - dma_bufsize, &dma->rx_dma_phys,
> - GFP_KERNEL);
> + dma->rx_dma_buf = dma_alloc_noncoherent(dma->chan_rx->device->dev,
> + dma_bufsize, &dma->rx_dma_phys,
> + DMA_FROM_DEVICE, GFP_KERNEL);
> if (!dma->rx_dma_buf) {
> ret = -ENOMEM;
> goto err_rx_dma_buf;
> @@ -688,11 +700,12 @@ static int dspi_request_dma(struct fsl_dspi *dspi, phys_addr_t phy_addr)
> return 0;
>
> err_slave_config:
> - dma_free_coherent(dma->chan_rx->device->dev,
> - dma_bufsize, dma->rx_dma_buf, dma->rx_dma_phys);
> + dma_free_noncoherent(dma->chan_rx->device->dev, dma_bufsize,
> + dma->rx_dma_buf, dma->rx_dma_phys,
> + DMA_FROM_DEVICE);
> err_rx_dma_buf:
> - dma_free_coherent(dma->chan_tx->device->dev,
> - dma_bufsize, dma->tx_dma_buf, dma->tx_dma_phys);
> + dma_free_noncoherent(dma->chan_tx->device->dev, dma_bufsize,
> + dma->tx_dma_buf, dma->tx_dma_phys, DMA_TO_DEVICE);
> err_tx_dma_buf:
> dma_release_channel(dma->chan_tx);
> err_tx_channel:
> @@ -713,14 +726,16 @@ static void dspi_release_dma(struct fsl_dspi *dspi)
> return;
>
> if (dma->chan_tx) {
> - dma_free_coherent(dma->chan_tx->device->dev, dma_bufsize,
> - dma->tx_dma_buf, dma->tx_dma_phys);
> + dma_free_noncoherent(dma->chan_tx->device->dev, dma_bufsize,
> + dma->tx_dma_buf, dma->tx_dma_phys,
> + DMA_TO_DEVICE);
> dma_release_channel(dma->chan_tx);
> }
>
> if (dma->chan_rx) {
> - dma_free_coherent(dma->chan_rx->device->dev, dma_bufsize,
> - dma->rx_dma_buf, dma->rx_dma_phys);
> + dma_free_noncoherent(dma->chan_rx->device->dev, dma_bufsize,
> + dma->rx_dma_buf, dma->rx_dma_phys,
> + DMA_FROM_DEVICE);
> dma_release_channel(dma->chan_rx);
> }
> }
>
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH v4 5/6] spi: spi-fsl-dspi: Increase DMA buffer size
2025-06-27 10:21 [PATCH v4 0/6] spi: spi-fsl-dspi: Target mode improvements James Clark
` (3 preceding siblings ...)
2025-06-27 10:21 ` [PATCH v4 4/6] spi: spi-fsl-dspi: Use non-coherent memory for DMA James Clark
@ 2025-06-27 10:21 ` James Clark
2025-06-27 19:44 ` Frank Li
2025-07-01 14:47 ` Vladimir Oltean
2025-06-27 10:21 ` [PATCH v4 6/6] spi: spi-fsl-dspi: Report FIFO overflows as errors James Clark
` (2 subsequent siblings)
7 siblings, 2 replies; 35+ messages in thread
From: James Clark @ 2025-06-27 10:21 UTC (permalink / raw)
To: Vladimir Oltean, Mark Brown, Vladimir Oltean, Arnd Bergmann,
Larisa Grigore, Frank Li, Christoph Hellwig
Cc: linux-spi, imx, linux-kernel, James Clark
From: Larisa Grigore <larisa.grigore@nxp.com>
When the device is configured as a target, the host won't stop sending
data while we're draining the buffer which leads to FIFO underflows
and corruption.
Increase the DMA buffer size to the maximum words that edma can
transfer once to reduce the chance of this happening.
While we're here, also change the buffer size for host mode back to a
page as it was before commit a957499bd437 ("spi: spi-fsl-dspi: Fix
bits-per-word acceleration in DMA mode"). dma_alloc_noncoherent()
allocations are backed by a full page anyway, so we might as well use it
all.
Signed-off-by: Larisa Grigore <larisa.grigore@nxp.com>
Signed-off-by: James Clark <james.clark@linaro.org>
---
drivers/spi/spi-fsl-dspi.c | 42 ++++++++++++++++++++++++++++++++++++++----
1 file changed, 38 insertions(+), 4 deletions(-)
diff --git a/drivers/spi/spi-fsl-dspi.c b/drivers/spi/spi-fsl-dspi.c
index e7856f9c9440..46d3cae9efed 100644
--- a/drivers/spi/spi-fsl-dspi.c
+++ b/drivers/spi/spi-fsl-dspi.c
@@ -493,6 +493,39 @@ static u32 dspi_pop_tx_pushr(struct fsl_dspi *dspi)
return cmd << 16 | data;
}
+static int dspi_dma_bufsize(struct fsl_dspi *dspi)
+{
+ if (spi_controller_is_target(dspi->ctlr)) {
+ /*
+ * In target mode we have to be ready to receive the maximum
+ * that can possibly be transferred at once by EDMA without any
+ * FIFO underflows. This is CITER * SSIZE, where SSIZE is a max
+ * of 4 when transferring to a peripheral.
+ */
+ return GENMASK(14, 0) * DMA_SLAVE_BUSWIDTH_4_BYTES;
+ }
+
+ return PAGE_SIZE;
+}
+
+static int dspi_dma_max_datawords(struct fsl_dspi *dspi)
+{
+ /*
+ * Transfers look like this so we always use a full DMA word regardless
+ * of SPI word size:
+ *
+ * 31 16 15 0
+ * -----------------------------------------
+ * | CONTROL WORD | 16-bit DATA |
+ * -----------------------------------------
+ * or
+ * -----------------------------------------
+ * | CONTROL WORD | UNUSED | 8-bit DATA |
+ * -----------------------------------------
+ */
+ return dspi_dma_bufsize(dspi) / DMA_SLAVE_BUSWIDTH_4_BYTES;
+}
+
static int dspi_dma_transfer_size(struct fsl_dspi *dspi)
{
return dspi->words_in_flight * DMA_SLAVE_BUSWIDTH_4_BYTES;
@@ -608,6 +641,7 @@ static int dspi_next_xfer_dma_submit(struct fsl_dspi *dspi)
static void dspi_dma_xfer(struct fsl_dspi *dspi)
{
struct spi_message *message = dspi->cur_msg;
+ int max_words = dspi_dma_max_datawords(dspi);
struct device *dev = &dspi->pdev->dev;
/*
@@ -619,8 +653,8 @@ static void dspi_dma_xfer(struct fsl_dspi *dspi)
dspi_setup_accel(dspi);
dspi->words_in_flight = dspi->len / dspi->oper_word_size;
- if (dspi->words_in_flight > dspi->devtype_data->fifo_size)
- dspi->words_in_flight = dspi->devtype_data->fifo_size;
+ if (dspi->words_in_flight > max_words)
+ dspi->words_in_flight = max_words;
message->actual_length += dspi->words_in_flight *
dspi->oper_word_size;
@@ -635,7 +669,7 @@ static void dspi_dma_xfer(struct fsl_dspi *dspi)
static int dspi_request_dma(struct fsl_dspi *dspi, phys_addr_t phy_addr)
{
- int dma_bufsize = dspi->devtype_data->fifo_size * 2;
+ int dma_bufsize = dspi_dma_bufsize(dspi);
struct device *dev = &dspi->pdev->dev;
struct dma_slave_config cfg;
struct fsl_dspi_dma *dma;
@@ -719,7 +753,7 @@ static int dspi_request_dma(struct fsl_dspi *dspi, phys_addr_t phy_addr)
static void dspi_release_dma(struct fsl_dspi *dspi)
{
- int dma_bufsize = dspi->devtype_data->fifo_size * 2;
+ int dma_bufsize = dspi_dma_bufsize(dspi);
struct fsl_dspi_dma *dma = dspi->dma;
if (!dma)
--
2.34.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [PATCH v4 5/6] spi: spi-fsl-dspi: Increase DMA buffer size
2025-06-27 10:21 ` [PATCH v4 5/6] spi: spi-fsl-dspi: Increase DMA buffer size James Clark
@ 2025-06-27 19:44 ` Frank Li
2025-06-30 8:59 ` James Clark
2025-07-01 14:47 ` Vladimir Oltean
1 sibling, 1 reply; 35+ messages in thread
From: Frank Li @ 2025-06-27 19:44 UTC (permalink / raw)
To: James Clark
Cc: Vladimir Oltean, Mark Brown, Vladimir Oltean, Arnd Bergmann,
Larisa Grigore, Christoph Hellwig, linux-spi, imx, linux-kernel
On Fri, Jun 27, 2025 at 11:21:41AM +0100, James Clark wrote:
> From: Larisa Grigore <larisa.grigore@nxp.com>
>
> When the device is configured as a target, the host won't stop sending
> data while we're draining the buffer which leads to FIFO underflows
> and corruption.
>
> Increase the DMA buffer size to the maximum words that edma can
> transfer once to reduce the chance of this happening.
>
> While we're here, also change the buffer size for host mode back to a
> page as it was before commit a957499bd437 ("spi: spi-fsl-dspi: Fix
> bits-per-word acceleration in DMA mode"). dma_alloc_noncoherent()
> allocations are backed by a full page anyway, so we might as well use it
> all.
>
> Signed-off-by: Larisa Grigore <larisa.grigore@nxp.com>
> Signed-off-by: James Clark <james.clark@linaro.org>
> ---
> drivers/spi/spi-fsl-dspi.c | 42 ++++++++++++++++++++++++++++++++++++++----
> 1 file changed, 38 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/spi/spi-fsl-dspi.c b/drivers/spi/spi-fsl-dspi.c
> index e7856f9c9440..46d3cae9efed 100644
> --- a/drivers/spi/spi-fsl-dspi.c
> +++ b/drivers/spi/spi-fsl-dspi.c
> @@ -493,6 +493,39 @@ static u32 dspi_pop_tx_pushr(struct fsl_dspi *dspi)
> return cmd << 16 | data;
> }
>
> +static int dspi_dma_bufsize(struct fsl_dspi *dspi)
> +{
> + if (spi_controller_is_target(dspi->ctlr)) {
> + /*
> + * In target mode we have to be ready to receive the maximum
> + * that can possibly be transferred at once by EDMA without any
> + * FIFO underflows. This is CITER * SSIZE, where SSIZE is a max
> + * of 4 when transferring to a peripheral.
> + */
> + return GENMASK(14, 0) * DMA_SLAVE_BUSWIDTH_4_BYTES;
> + }
> +
> + return PAGE_SIZE;
> +}
> +
> +static int dspi_dma_max_datawords(struct fsl_dspi *dspi)
> +{
> + /*
> + * Transfers look like this so we always use a full DMA word regardless
> + * of SPI word size:
> + *
> + * 31 16 15 0
> + * -----------------------------------------
> + * | CONTROL WORD | 16-bit DATA |
> + * -----------------------------------------
> + * or
> + * -----------------------------------------
> + * | CONTROL WORD | UNUSED | 8-bit DATA |
> + * -----------------------------------------
> + */
> + return dspi_dma_bufsize(dspi) / DMA_SLAVE_BUSWIDTH_4_BYTES;
> +}
> +
> static int dspi_dma_transfer_size(struct fsl_dspi *dspi)
> {
> return dspi->words_in_flight * DMA_SLAVE_BUSWIDTH_4_BYTES;
> @@ -608,6 +641,7 @@ static int dspi_next_xfer_dma_submit(struct fsl_dspi *dspi)
> static void dspi_dma_xfer(struct fsl_dspi *dspi)
> {
> struct spi_message *message = dspi->cur_msg;
> + int max_words = dspi_dma_max_datawords(dspi);
> struct device *dev = &dspi->pdev->dev;
>
> /*
> @@ -619,8 +653,8 @@ static void dspi_dma_xfer(struct fsl_dspi *dspi)
> dspi_setup_accel(dspi);
>
> dspi->words_in_flight = dspi->len / dspi->oper_word_size;
> - if (dspi->words_in_flight > dspi->devtype_data->fifo_size)
> - dspi->words_in_flight = dspi->devtype_data->fifo_size;
> + if (dspi->words_in_flight > max_words)
> + dspi->words_in_flight = max_words;
you touch this line
dspi->words_in_flight = min(dspi->words_in_flight, max_words);
Frank
>
> message->actual_length += dspi->words_in_flight *
> dspi->oper_word_size;
> @@ -635,7 +669,7 @@ static void dspi_dma_xfer(struct fsl_dspi *dspi)
>
> static int dspi_request_dma(struct fsl_dspi *dspi, phys_addr_t phy_addr)
> {
> - int dma_bufsize = dspi->devtype_data->fifo_size * 2;
> + int dma_bufsize = dspi_dma_bufsize(dspi);
> struct device *dev = &dspi->pdev->dev;
> struct dma_slave_config cfg;
> struct fsl_dspi_dma *dma;
> @@ -719,7 +753,7 @@ static int dspi_request_dma(struct fsl_dspi *dspi, phys_addr_t phy_addr)
>
> static void dspi_release_dma(struct fsl_dspi *dspi)
> {
> - int dma_bufsize = dspi->devtype_data->fifo_size * 2;
> + int dma_bufsize = dspi_dma_bufsize(dspi);
> struct fsl_dspi_dma *dma = dspi->dma;
>
> if (!dma)
>
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v4 5/6] spi: spi-fsl-dspi: Increase DMA buffer size
2025-06-27 19:44 ` Frank Li
@ 2025-06-30 8:59 ` James Clark
0 siblings, 0 replies; 35+ messages in thread
From: James Clark @ 2025-06-30 8:59 UTC (permalink / raw)
To: Frank Li
Cc: Vladimir Oltean, Mark Brown, Vladimir Oltean, Arnd Bergmann,
Larisa Grigore, Christoph Hellwig, linux-spi, imx, linux-kernel
On 27/06/2025 8:44 pm, Frank Li wrote:
> On Fri, Jun 27, 2025 at 11:21:41AM +0100, James Clark wrote:
>> From: Larisa Grigore <larisa.grigore@nxp.com>
>>
>> When the device is configured as a target, the host won't stop sending
>> data while we're draining the buffer which leads to FIFO underflows
>> and corruption.
>>
>> Increase the DMA buffer size to the maximum words that edma can
>> transfer once to reduce the chance of this happening.
>>
>> While we're here, also change the buffer size for host mode back to a
>> page as it was before commit a957499bd437 ("spi: spi-fsl-dspi: Fix
>> bits-per-word acceleration in DMA mode"). dma_alloc_noncoherent()
>> allocations are backed by a full page anyway, so we might as well use it
>> all.
>>
>> Signed-off-by: Larisa Grigore <larisa.grigore@nxp.com>
>> Signed-off-by: James Clark <james.clark@linaro.org>
>> ---
>> drivers/spi/spi-fsl-dspi.c | 42 ++++++++++++++++++++++++++++++++++++++----
>> 1 file changed, 38 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/spi/spi-fsl-dspi.c b/drivers/spi/spi-fsl-dspi.c
>> index e7856f9c9440..46d3cae9efed 100644
>> --- a/drivers/spi/spi-fsl-dspi.c
>> +++ b/drivers/spi/spi-fsl-dspi.c
>> @@ -493,6 +493,39 @@ static u32 dspi_pop_tx_pushr(struct fsl_dspi *dspi)
>> return cmd << 16 | data;
>> }
>>
>> +static int dspi_dma_bufsize(struct fsl_dspi *dspi)
>> +{
>> + if (spi_controller_is_target(dspi->ctlr)) {
>> + /*
>> + * In target mode we have to be ready to receive the maximum
>> + * that can possibly be transferred at once by EDMA without any
>> + * FIFO underflows. This is CITER * SSIZE, where SSIZE is a max
>> + * of 4 when transferring to a peripheral.
>> + */
>> + return GENMASK(14, 0) * DMA_SLAVE_BUSWIDTH_4_BYTES;
>> + }
>> +
>> + return PAGE_SIZE;
>> +}
>> +
>> +static int dspi_dma_max_datawords(struct fsl_dspi *dspi)
>> +{
>> + /*
>> + * Transfers look like this so we always use a full DMA word regardless
>> + * of SPI word size:
>> + *
>> + * 31 16 15 0
>> + * -----------------------------------------
>> + * | CONTROL WORD | 16-bit DATA |
>> + * -----------------------------------------
>> + * or
>> + * -----------------------------------------
>> + * | CONTROL WORD | UNUSED | 8-bit DATA |
>> + * -----------------------------------------
>> + */
>> + return dspi_dma_bufsize(dspi) / DMA_SLAVE_BUSWIDTH_4_BYTES;
>> +}
>> +
>> static int dspi_dma_transfer_size(struct fsl_dspi *dspi)
>> {
>> return dspi->words_in_flight * DMA_SLAVE_BUSWIDTH_4_BYTES;
>> @@ -608,6 +641,7 @@ static int dspi_next_xfer_dma_submit(struct fsl_dspi *dspi)
>> static void dspi_dma_xfer(struct fsl_dspi *dspi)
>> {
>> struct spi_message *message = dspi->cur_msg;
>> + int max_words = dspi_dma_max_datawords(dspi);
>> struct device *dev = &dspi->pdev->dev;
>>
>> /*
>> @@ -619,8 +653,8 @@ static void dspi_dma_xfer(struct fsl_dspi *dspi)
>> dspi_setup_accel(dspi);
>>
>> dspi->words_in_flight = dspi->len / dspi->oper_word_size;
>> - if (dspi->words_in_flight > dspi->devtype_data->fifo_size)
>> - dspi->words_in_flight = dspi->devtype_data->fifo_size;
>> + if (dspi->words_in_flight > max_words)
>> + dspi->words_in_flight = max_words;
>
> you touch this line
>
> dspi->words_in_flight = min(dspi->words_in_flight, max_words);
>
> Frank
Ack
>>
>> message->actual_length += dspi->words_in_flight *
>> dspi->oper_word_size;
>> @@ -635,7 +669,7 @@ static void dspi_dma_xfer(struct fsl_dspi *dspi)
>>
>> static int dspi_request_dma(struct fsl_dspi *dspi, phys_addr_t phy_addr)
>> {
>> - int dma_bufsize = dspi->devtype_data->fifo_size * 2;
>> + int dma_bufsize = dspi_dma_bufsize(dspi);
>> struct device *dev = &dspi->pdev->dev;
>> struct dma_slave_config cfg;
>> struct fsl_dspi_dma *dma;
>> @@ -719,7 +753,7 @@ static int dspi_request_dma(struct fsl_dspi *dspi, phys_addr_t phy_addr)
>>
>> static void dspi_release_dma(struct fsl_dspi *dspi)
>> {
>> - int dma_bufsize = dspi->devtype_data->fifo_size * 2;
>> + int dma_bufsize = dspi_dma_bufsize(dspi);
>> struct fsl_dspi_dma *dma = dspi->dma;
>>
>> if (!dma)
>>
>> --
>> 2.34.1
>>
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v4 5/6] spi: spi-fsl-dspi: Increase DMA buffer size
2025-06-27 10:21 ` [PATCH v4 5/6] spi: spi-fsl-dspi: Increase DMA buffer size James Clark
2025-06-27 19:44 ` Frank Li
@ 2025-07-01 14:47 ` Vladimir Oltean
2025-07-01 15:08 ` James Clark
2025-07-01 15:09 ` Arnd Bergmann
1 sibling, 2 replies; 35+ messages in thread
From: Vladimir Oltean @ 2025-07-01 14:47 UTC (permalink / raw)
To: James Clark
Cc: Vladimir Oltean, Mark Brown, Arnd Bergmann, Larisa Grigore,
Frank Li, Christoph Hellwig, linux-spi, imx, linux-kernel
On Fri, Jun 27, 2025 at 11:21:41AM +0100, James Clark wrote:
> diff --git a/drivers/spi/spi-fsl-dspi.c b/drivers/spi/spi-fsl-dspi.c
> index e7856f9c9440..46d3cae9efed 100644
> --- a/drivers/spi/spi-fsl-dspi.c
> +++ b/drivers/spi/spi-fsl-dspi.c
> @@ -493,6 +493,39 @@ static u32 dspi_pop_tx_pushr(struct fsl_dspi *dspi)
> return cmd << 16 | data;
> }
>
> +static int dspi_dma_bufsize(struct fsl_dspi *dspi)
> +{
> + if (spi_controller_is_target(dspi->ctlr)) {
> + /*
> + * In target mode we have to be ready to receive the maximum
> + * that can possibly be transferred at once by EDMA without any
> + * FIFO underflows. This is CITER * SSIZE, where SSIZE is a max
> + * of 4 when transferring to a peripheral.
> + */
> + return GENMASK(14, 0) * DMA_SLAVE_BUSWIDTH_4_BYTES;
Is this really a constant that can be hardcoded? Should this be queried
from the EDMA driver somehow?
I'm not well versed in the dmaengine/dma-mapping API at all, but I see
fsl_edma_probe() makes a call to dma_set_max_seg_size(), which consumer
drivers such as DSPI can query using dma_get_max_seg_size(). To the
untrained eye, and from a great distance, it looks like the value you're
interested in. Apologies if that isn't the case.
> + }
> +
> + return PAGE_SIZE;
> +}
The other question is: what's fundamentally different between the host
and target operating modes, such that we return different values? Why
not the same?
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v4 5/6] spi: spi-fsl-dspi: Increase DMA buffer size
2025-07-01 14:47 ` Vladimir Oltean
@ 2025-07-01 15:08 ` James Clark
2025-07-01 15:09 ` Arnd Bergmann
1 sibling, 0 replies; 35+ messages in thread
From: James Clark @ 2025-07-01 15:08 UTC (permalink / raw)
To: Vladimir Oltean
Cc: Vladimir Oltean, Mark Brown, Arnd Bergmann, Larisa Grigore,
Frank Li, Christoph Hellwig, linux-spi, imx, linux-kernel
On 01/07/2025 3:47 pm, Vladimir Oltean wrote:
> On Fri, Jun 27, 2025 at 11:21:41AM +0100, James Clark wrote:
>> diff --git a/drivers/spi/spi-fsl-dspi.c b/drivers/spi/spi-fsl-dspi.c
>> index e7856f9c9440..46d3cae9efed 100644
>> --- a/drivers/spi/spi-fsl-dspi.c
>> +++ b/drivers/spi/spi-fsl-dspi.c
>> @@ -493,6 +493,39 @@ static u32 dspi_pop_tx_pushr(struct fsl_dspi *dspi)
>> return cmd << 16 | data;
>> }
>>
>> +static int dspi_dma_bufsize(struct fsl_dspi *dspi)
>> +{
>> + if (spi_controller_is_target(dspi->ctlr)) {
>> + /*
>> + * In target mode we have to be ready to receive the maximum
>> + * that can possibly be transferred at once by EDMA without any
>> + * FIFO underflows. This is CITER * SSIZE, where SSIZE is a max
>> + * of 4 when transferring to a peripheral.
>> + */
>> + return GENMASK(14, 0) * DMA_SLAVE_BUSWIDTH_4_BYTES;
>
> Is this really a constant that can be hardcoded? Should this be queried
> from the EDMA driver somehow?
>
> I'm not well versed in the dmaengine/dma-mapping API at all, but I see
> fsl_edma_probe() makes a call to dma_set_max_seg_size(), which consumer
> drivers such as DSPI can query using dma_get_max_seg_size(). To the
> untrained eye, and from a great distance, it looks like the value you're
> interested in. Apologies if that isn't the case.
>
You're probably right, and there's no particular reason to hard code it
if it can be queried. I'll have a look at this.
>> + }
>> +
>> + return PAGE_SIZE;
>> +}
>
> The other question is: what's fundamentally different between the host
> and target operating modes, such that we return different values? Why
> not the same?
This is missing from the commit message, but the reason is because it's
a large allocation (256K with both tx and rx buffers) that should be
avoided unless absolutely necessary so we wanted to limit it to only
target devices.
The other reason to not allocate it dynamically based on the size of the
message is because we assumed that it was better to do large contiguous
allocations at boot time. If it's delayed until the device is used then
the allocation might fail due to memory fragmentation.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v4 5/6] spi: spi-fsl-dspi: Increase DMA buffer size
2025-07-01 14:47 ` Vladimir Oltean
2025-07-01 15:08 ` James Clark
@ 2025-07-01 15:09 ` Arnd Bergmann
1 sibling, 0 replies; 35+ messages in thread
From: Arnd Bergmann @ 2025-07-01 15:09 UTC (permalink / raw)
To: Vladimir Oltean, James Clark
Cc: Vladimir Oltean, Mark Brown, Larisa Grigore, Frank Li,
Christoph Hellwig, linux-spi, imx, linux-kernel
On Tue, Jul 1, 2025, at 16:47, Vladimir Oltean wrote:
> On Fri, Jun 27, 2025 at 11:21:41AM +0100, James Clark wrote:
>> + }
>> +
>> + return PAGE_SIZE;
>> +}
>
> The other question is: what's fundamentally different between the host
> and target operating modes, such that we return different values? Why
> not the same?
In host mode, the driver is able to split up a transfer into smaller
chunks, while in target mode the length of the transfer is determined
by the remote host and can be larger than whatever default buffer
size we pick.
Using PAGE_SIZE as the default host buffer makes sense since that
is the smallest underlying size for dma_alloc_noncoherent, and
larger buffers would be fairly wasteful.
Endpoint mode should only be enabled if it's actually being used
and in that case the allocation is as large as possible.
Arnd
^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH v4 6/6] spi: spi-fsl-dspi: Report FIFO overflows as errors
2025-06-27 10:21 [PATCH v4 0/6] spi: spi-fsl-dspi: Target mode improvements James Clark
` (4 preceding siblings ...)
2025-06-27 10:21 ` [PATCH v4 5/6] spi: spi-fsl-dspi: Increase DMA buffer size James Clark
@ 2025-06-27 10:21 ` James Clark
2025-06-27 19:56 ` Frank Li
2025-06-30 11:40 ` (subset) [PATCH v4 0/6] spi: spi-fsl-dspi: Target mode improvements Mark Brown
2025-06-30 15:26 ` Vladimir Oltean
7 siblings, 1 reply; 35+ messages in thread
From: James Clark @ 2025-06-27 10:21 UTC (permalink / raw)
To: Vladimir Oltean, Mark Brown, Vladimir Oltean, Arnd Bergmann,
Larisa Grigore, Frank Li, Christoph Hellwig
Cc: linux-spi, imx, linux-kernel, James Clark
In target mode, the host sending more data than can be consumed would be
a common problem for any message exceeding the FIFO or DMA buffer size.
Cancel the whole message as soon as this condition is hit as the message
will be corrupted.
Only do this for target mode in a DMA transfer, it's not likely these
flags will be set in host mode so it's not worth adding extra checks. In
IRQ and polling modes we use the same transfer functions for hosts and
targets so the error flags always get checked. This is slightly
inconsistent but it's not worth doing the check conditionally because it
may catch some host programming errors in the future.
Signed-off-by: James Clark <james.clark@linaro.org>
---
drivers/spi/spi-fsl-dspi.c | 27 ++++++++++++++++++++++++++-
1 file changed, 26 insertions(+), 1 deletion(-)
diff --git a/drivers/spi/spi-fsl-dspi.c b/drivers/spi/spi-fsl-dspi.c
index 46d3cae9efed..2c2a263c5276 100644
--- a/drivers/spi/spi-fsl-dspi.c
+++ b/drivers/spi/spi-fsl-dspi.c
@@ -478,6 +478,17 @@ static void dspi_push_rx(struct fsl_dspi *dspi, u32 rxdata)
dspi->dev_to_host(dspi, rxdata);
}
+static int dspi_fifo_error(struct fsl_dspi *dspi, u32 spi_sr)
+{
+ if (spi_sr & (SPI_SR_TFUF | SPI_SR_RFOF)) {
+ dev_err_ratelimited(&dspi->pdev->dev, "FIFO errors:%s%s\n",
+ spi_sr & SPI_SR_TFUF ? " TX underflow," : "",
+ spi_sr & SPI_SR_RFOF ? " RX overflow," : "");
+ return -EIO;
+ }
+ return 0;
+}
+
#if IS_ENABLED(CONFIG_DMA_ENGINE)
/* Prepare one TX FIFO entry (txdata plus cmd) */
@@ -566,6 +577,7 @@ static int dspi_next_xfer_dma_submit(struct fsl_dspi *dspi)
struct device *dev = &dspi->pdev->dev;
struct fsl_dspi_dma *dma = dspi->dma;
int time_left;
+ u32 spi_sr;
int i;
for (i = 0; i < dspi->words_in_flight; i++)
@@ -614,7 +626,8 @@ static int dspi_next_xfer_dma_submit(struct fsl_dspi *dspi)
if (spi_controller_is_target(dspi->ctlr)) {
wait_for_completion_interruptible(&dspi->dma->cmd_rx_complete);
- return 0;
+ regmap_read(dspi->regmap, SPI_SR, &spi_sr);
+ return dspi_fifo_error(dspi, spi_sr);
}
time_left = wait_for_completion_timeout(&dspi->dma->cmd_tx_complete,
@@ -1067,6 +1080,9 @@ static void dspi_poll(struct fsl_dspi *dspi)
regmap_read(dspi->regmap, SPI_SR, &spi_sr);
regmap_write(dspi->regmap, SPI_SR, spi_sr);
+ dspi->cur_msg->status = dspi_fifo_error(dspi, spi_sr);
+ if (dspi->cur_msg->status)
+ return;
if (spi_sr & SPI_SR_CMDTCF)
break;
}
@@ -1085,6 +1101,7 @@ static void dspi_poll(struct fsl_dspi *dspi)
static irqreturn_t dspi_interrupt(int irq, void *dev_id)
{
struct fsl_dspi *dspi = (struct fsl_dspi *)dev_id;
+ int status;
u32 spi_sr;
regmap_read(dspi->regmap, SPI_SR, &spi_sr);
@@ -1093,6 +1110,14 @@ static irqreturn_t dspi_interrupt(int irq, void *dev_id)
if (!(spi_sr & SPI_SR_CMDTCF))
return IRQ_NONE;
+ status = dspi_fifo_error(dspi, spi_sr);
+ if (status) {
+ if (dspi->cur_msg)
+ WRITE_ONCE(dspi->cur_msg->status, status);
+ complete(&dspi->xfer_done);
+ return IRQ_HANDLED;
+ }
+
dspi_rxtx(dspi);
if (!dspi->len) {
--
2.34.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [PATCH v4 6/6] spi: spi-fsl-dspi: Report FIFO overflows as errors
2025-06-27 10:21 ` [PATCH v4 6/6] spi: spi-fsl-dspi: Report FIFO overflows as errors James Clark
@ 2025-06-27 19:56 ` Frank Li
2025-06-27 21:41 ` Mark Brown
0 siblings, 1 reply; 35+ messages in thread
From: Frank Li @ 2025-06-27 19:56 UTC (permalink / raw)
To: James Clark
Cc: Vladimir Oltean, Mark Brown, Vladimir Oltean, Arnd Bergmann,
Larisa Grigore, Christoph Hellwig, linux-spi, imx, linux-kernel
On Fri, Jun 27, 2025 at 11:21:42AM +0100, James Clark wrote:
> In target mode, the host sending more data than can be consumed would be
> a common problem for any message exceeding the FIFO or DMA buffer size.
> Cancel the whole message as soon as this condition is hit as the message
> will be corrupted.
>
> Only do this for target mode in a DMA transfer, it's not likely these
> flags will be set in host mode
"not likely", I think SPI controller should stop CLK if FIFO empty and full.
It should be "never" happen.
Only check FIFO error flags at target mode because it never happen at host mode.
You can remove below whole paragram.
Frank
> so it's not worth adding extra checks. In
> IRQ and polling modes we use the same transfer functions for hosts and
> targets so the error flags always get checked. This is slightly
> inconsistent but it's not worth doing the check conditionally because it
> may catch some host programming errors in the future.
>
> Signed-off-by: James Clark <james.clark@linaro.org>
> ---
> drivers/spi/spi-fsl-dspi.c | 27 ++++++++++++++++++++++++++-
> 1 file changed, 26 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/spi/spi-fsl-dspi.c b/drivers/spi/spi-fsl-dspi.c
> index 46d3cae9efed..2c2a263c5276 100644
> --- a/drivers/spi/spi-fsl-dspi.c
> +++ b/drivers/spi/spi-fsl-dspi.c
> @@ -478,6 +478,17 @@ static void dspi_push_rx(struct fsl_dspi *dspi, u32 rxdata)
> dspi->dev_to_host(dspi, rxdata);
> }
>
> +static int dspi_fifo_error(struct fsl_dspi *dspi, u32 spi_sr)
> +{
> + if (spi_sr & (SPI_SR_TFUF | SPI_SR_RFOF)) {
> + dev_err_ratelimited(&dspi->pdev->dev, "FIFO errors:%s%s\n",
> + spi_sr & SPI_SR_TFUF ? " TX underflow," : "",
> + spi_sr & SPI_SR_RFOF ? " RX overflow," : "");
> + return -EIO;
> + }
> + return 0;
> +}
> +
> #if IS_ENABLED(CONFIG_DMA_ENGINE)
>
> /* Prepare one TX FIFO entry (txdata plus cmd) */
> @@ -566,6 +577,7 @@ static int dspi_next_xfer_dma_submit(struct fsl_dspi *dspi)
> struct device *dev = &dspi->pdev->dev;
> struct fsl_dspi_dma *dma = dspi->dma;
> int time_left;
> + u32 spi_sr;
> int i;
>
> for (i = 0; i < dspi->words_in_flight; i++)
> @@ -614,7 +626,8 @@ static int dspi_next_xfer_dma_submit(struct fsl_dspi *dspi)
>
> if (spi_controller_is_target(dspi->ctlr)) {
> wait_for_completion_interruptible(&dspi->dma->cmd_rx_complete);
> - return 0;
> + regmap_read(dspi->regmap, SPI_SR, &spi_sr);
> + return dspi_fifo_error(dspi, spi_sr);
> }
>
> time_left = wait_for_completion_timeout(&dspi->dma->cmd_tx_complete,
> @@ -1067,6 +1080,9 @@ static void dspi_poll(struct fsl_dspi *dspi)
> regmap_read(dspi->regmap, SPI_SR, &spi_sr);
> regmap_write(dspi->regmap, SPI_SR, spi_sr);
>
> + dspi->cur_msg->status = dspi_fifo_error(dspi, spi_sr);
> + if (dspi->cur_msg->status)
> + return;
> if (spi_sr & SPI_SR_CMDTCF)
> break;
> }
> @@ -1085,6 +1101,7 @@ static void dspi_poll(struct fsl_dspi *dspi)
> static irqreturn_t dspi_interrupt(int irq, void *dev_id)
> {
> struct fsl_dspi *dspi = (struct fsl_dspi *)dev_id;
> + int status;
> u32 spi_sr;
>
> regmap_read(dspi->regmap, SPI_SR, &spi_sr);
> @@ -1093,6 +1110,14 @@ static irqreturn_t dspi_interrupt(int irq, void *dev_id)
> if (!(spi_sr & SPI_SR_CMDTCF))
> return IRQ_NONE;
>
> + status = dspi_fifo_error(dspi, spi_sr);
> + if (status) {
> + if (dspi->cur_msg)
> + WRITE_ONCE(dspi->cur_msg->status, status);
> + complete(&dspi->xfer_done);
> + return IRQ_HANDLED;
> + }
> +
> dspi_rxtx(dspi);
>
> if (!dspi->len) {
>
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v4 6/6] spi: spi-fsl-dspi: Report FIFO overflows as errors
2025-06-27 19:56 ` Frank Li
@ 2025-06-27 21:41 ` Mark Brown
2025-06-30 10:46 ` James Clark
0 siblings, 1 reply; 35+ messages in thread
From: Mark Brown @ 2025-06-27 21:41 UTC (permalink / raw)
To: Frank Li
Cc: James Clark, Vladimir Oltean, Vladimir Oltean, Arnd Bergmann,
Larisa Grigore, Christoph Hellwig, linux-spi, imx, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 1461 bytes --]
On Fri, Jun 27, 2025 at 03:56:43PM -0400, Frank Li wrote:
> On Fri, Jun 27, 2025 at 11:21:42AM +0100, James Clark wrote:
> > In target mode, the host sending more data than can be consumed would be
> > a common problem for any message exceeding the FIFO or DMA buffer size.
> > Cancel the whole message as soon as this condition is hit as the message
> > will be corrupted.
> > Only do this for target mode in a DMA transfer, it's not likely these
> > flags will be set in host mode
> "not likely", I think SPI controller should stop CLK if FIFO empty and full.
> It should be "never" happen.
> Only check FIFO error flags at target mode because it never happen at host mode.
> You can remove below whole paragram.
I agree it *should* never happen in host mode but it can be good
practice to look in case something gets messed up. That said extra
device accesses in the hot path are probably going to be noticable so
likely not worth it, but in the dspi_poll() case:
> @@ -1067,6 +1080,9 @@ static void dspi_poll(struct fsl_dspi *dspi)
> regmap_read(dspi->regmap, SPI_SR, &spi_sr);
> regmap_write(dspi->regmap, SPI_SR, spi_sr);
>
> + dspi->cur_msg->status = dspi_fifo_error(dspi, spi_sr);
> + if (dspi->cur_msg->status)
> + return;
we're reading the register anyway so the overhead is much lower.
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v4 6/6] spi: spi-fsl-dspi: Report FIFO overflows as errors
2025-06-27 21:41 ` Mark Brown
@ 2025-06-30 10:46 ` James Clark
0 siblings, 0 replies; 35+ messages in thread
From: James Clark @ 2025-06-30 10:46 UTC (permalink / raw)
To: Mark Brown, Frank Li
Cc: Vladimir Oltean, Vladimir Oltean, Arnd Bergmann, Larisa Grigore,
Christoph Hellwig, linux-spi, imx, linux-kernel
On 27/06/2025 10:41 pm, Mark Brown wrote:
> On Fri, Jun 27, 2025 at 03:56:43PM -0400, Frank Li wrote:
>> On Fri, Jun 27, 2025 at 11:21:42AM +0100, James Clark wrote:
>
>>> In target mode, the host sending more data than can be consumed would be
>>> a common problem for any message exceeding the FIFO or DMA buffer size.
>>> Cancel the whole message as soon as this condition is hit as the message
>>> will be corrupted.
>
>>> Only do this for target mode in a DMA transfer, it's not likely these
>>> flags will be set in host mode
>
>> "not likely", I think SPI controller should stop CLK if FIFO empty and full.
>> It should be "never" happen.
>
>> Only check FIFO error flags at target mode because it never happen at host mode.
>
>> You can remove below whole paragram.
>
> I agree it *should* never happen in host mode but it can be good
> practice to look in case something gets messed up. That said extra
> device accesses in the hot path are probably going to be noticable so
> likely not worth it, but in the dspi_poll() case:
>
Yeah the point was to be defensive. Even though it shouldn't happen I
don't see an issue with checking error flags.
We could add a condition to only do it in target mode, but it doesn't
make the code more readable, and it wouldn't be any faster than just
checking the flags. So I'm not sure what problem we're trying to solve
by removing it.
>> @@ -1067,6 +1080,9 @@ static void dspi_poll(struct fsl_dspi *dspi)
>> regmap_read(dspi->regmap, SPI_SR, &spi_sr);
>> regmap_write(dspi->regmap, SPI_SR, spi_sr);
>>
>> + dspi->cur_msg->status = dspi_fifo_error(dspi, spi_sr);
>> + if (dspi->cur_msg->status)
>> + return;
>
> we're reading the register anyway so the overhead is much lower.
In both poll and interrupt mode we're already reading it. Only DMA mode
didn't have a read and I didn't add a check for error flags there anyway.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: (subset) [PATCH v4 0/6] spi: spi-fsl-dspi: Target mode improvements
2025-06-27 10:21 [PATCH v4 0/6] spi: spi-fsl-dspi: Target mode improvements James Clark
` (5 preceding siblings ...)
2025-06-27 10:21 ` [PATCH v4 6/6] spi: spi-fsl-dspi: Report FIFO overflows as errors James Clark
@ 2025-06-30 11:40 ` Mark Brown
2025-06-30 15:26 ` Vladimir Oltean
7 siblings, 0 replies; 35+ messages in thread
From: Mark Brown @ 2025-06-30 11:40 UTC (permalink / raw)
To: Vladimir Oltean, Vladimir Oltean, Arnd Bergmann, Larisa Grigore,
Frank Li, Christoph Hellwig, James Clark
Cc: linux-spi, imx, linux-kernel
On Fri, 27 Jun 2025 11:21:36 +0100, James Clark wrote:
> Improve usability of target mode by reporting FIFO errors and increasing
> the buffer size when DMA is used. While we're touching DMA stuff also
> switch to non-coherent memory, although this is unrelated to target
> mode.
>
> The first commit is marked as a fix because it can fix intermittent
> issues with existing transfers, rather than the later fixes which
> improve larger than FIFO target mode transfers which would have never
> worked.
>
> [...]
Applied to
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git for-next
Thanks!
[1/6] spi: spi-fsl-dspi: Clear completion counter before initiating transfer
commit: fa60c094c19b97e103d653f528f8d9c178b6a5f5
All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.
You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.
If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.
Please add any relevant lists and maintainers to the CCs when replying
to this mail.
Thanks,
Mark
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v4 0/6] spi: spi-fsl-dspi: Target mode improvements
2025-06-27 10:21 [PATCH v4 0/6] spi: spi-fsl-dspi: Target mode improvements James Clark
` (6 preceding siblings ...)
2025-06-30 11:40 ` (subset) [PATCH v4 0/6] spi: spi-fsl-dspi: Target mode improvements Mark Brown
@ 2025-06-30 15:26 ` Vladimir Oltean
2025-07-01 12:42 ` James Clark
7 siblings, 1 reply; 35+ messages in thread
From: Vladimir Oltean @ 2025-06-30 15:26 UTC (permalink / raw)
To: James Clark
Cc: Vladimir Oltean, Mark Brown, Arnd Bergmann, Larisa Grigore,
Frank Li, Christoph Hellwig, linux-spi, imx, linux-kernel
On Fri, Jun 27, 2025 at 11:21:36AM +0100, James Clark wrote:
> Improve usability of target mode by reporting FIFO errors and increasing
> the buffer size when DMA is used. While we're touching DMA stuff also
> switch to non-coherent memory, although this is unrelated to target
> mode.
>
> The first commit is marked as a fix because it can fix intermittent
> issues with existing transfers, rather than the later fixes which
> improve larger than FIFO target mode transfers which would have never
> worked.
>
> With the combination of the commit to increase the DMA buffer size and
> the commit to use non-coherent memory, the host mode performance figures
> are as follows on S32G3:
>
> # spidev_test --device /dev/spidev1.0 --bpw 8 --size <test_size> --cpha --iter 10000000 --speed 10000000
>
> Coherent (4096 byte transfers): 6534 kbps
> Non-coherent: 7347 kbps
>
> Coherent (16 byte transfers): 447 kbps
> Non-coherent: 448 kbps
>
> Just for comparison running the same test in XSPI mode:
>
> 4096 byte transfers: 2143 kbps
> 16 byte transfers: 637 kbps
>
> These tests required hacking S32G3 to use DMA in host mode, although
> the figures should be representative of target mode too where DMA is
> used. And the other devices that use DMA in host mode should see similar
> improvements.
>
> Signed-off-by: James Clark <james.clark@linaro.org>
> ---
My test numbers on LS1028A:
Baseline XSPI (unmodified driver):
$ ./spidev_test --device /dev/spidev2.1 --bpw 8 --size 8 --cpha --iter 10000000 --speed 10000000
rate: tx 2710.6kbps, rx 2710.6kbps
$ ./spidev_test --device /dev/spidev2.1 --bpw 8 --size 16 --cpha --iter 10000000 --speed 10000000
rate: tx 3217.5kbps, rx 3217.5kbps
$ ./spidev_test --device /dev/spidev2.1 --bpw 8 --size 4096 --cpha --iter 10000000 --speed 10000000
rate: tx 5118.4kbps, rx 5118.4kbps
Baseline DMA (modified just DSPI_XSPI_MODE -> DSPI_DMA_MODE):
$ ./spidev_test --device /dev/spidev2.1 --bpw 8 --size 8 --cpha --iter 10000000 --speed 10000000
rate: tx 1359.5kbps, rx 1359.5kbps
$ ./spidev_test --device /dev/spidev2.1 --bpw 8 --size 16 --cpha --iter 10000000 --speed 10000000
rate: tx 1461.1kbps, rx 1461.1kbps
$ ./spidev_test --device /dev/spidev2.1 --bpw 8 --size 4096 --cpha --iter 10000000 --speed 10000000
rate: tx 1664.6kbps, rx 1664.6kbps
Intermediary LS1028A DMA mode (using non-coherent buffers but still
small DMA buffers, i.e. holding just 1 FIFO size worth of data):
$ ./spidev_test --device /dev/spidev2.1 --bpw 8 --size 8 --cpha --iter 10000000 --speed 10000000
rate: tx 1345.1kbps, rx 1345.1kbps
$ ./spidev_test --device /dev/spidev2.1 --bpw 8 --size 16 --cpha --iter 10000000 --speed 10000000
rate: tx 1522.5kbps, rx 1522.5kbps
$ ./spidev_test --device /dev/spidev2.1 --bpw 8 --size 4096 --cpha --iter 10000000 --speed 10000000
rate: tx 1690.8kbps, rx 1690.8kbps
Final LS1028A DMA mode (with the patch to send large messages as a
single DMA buffer applied):
$ ./spidev_test --device /dev/spidev2.1 --bpw 8 --size 8 --cpha --iter 10000000 --speed 10000000
rate: tx 2247.0kbps, rx 2247.0kbps
$ ./spidev_test --device /dev/spidev2.1 --bpw 8 --size 16 --cpha --iter 10000000 --speed 10000000
rate: tx 3477.4kbps, rx 3477.4kbps
$ ./spidev_test --device /dev/spidev2.1 --bpw 8 --size 4096 --cpha --iter 10000000 --speed 10000000
rate: tx 8978.4kbps, rx 8978.4kbps
So after your patch set, DMA mode on LS1028A becomes more performant and
should replace XSPI. This is an outstanding result. That can be done as
follow-up work.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v4 0/6] spi: spi-fsl-dspi: Target mode improvements
2025-06-30 15:26 ` Vladimir Oltean
@ 2025-07-01 12:42 ` James Clark
2025-07-01 13:18 ` Mark Brown
2025-07-01 13:57 ` Vladimir Oltean
0 siblings, 2 replies; 35+ messages in thread
From: James Clark @ 2025-07-01 12:42 UTC (permalink / raw)
To: Vladimir Oltean
Cc: Vladimir Oltean, Mark Brown, Arnd Bergmann, Larisa Grigore,
Frank Li, Christoph Hellwig, linux-spi, imx, linux-kernel
On 30/06/2025 4:26 pm, Vladimir Oltean wrote:
> On Fri, Jun 27, 2025 at 11:21:36AM +0100, James Clark wrote:
>> Improve usability of target mode by reporting FIFO errors and increasing
>> the buffer size when DMA is used. While we're touching DMA stuff also
>> switch to non-coherent memory, although this is unrelated to target
>> mode.
>>
>> The first commit is marked as a fix because it can fix intermittent
>> issues with existing transfers, rather than the later fixes which
>> improve larger than FIFO target mode transfers which would have never
>> worked.
>>
>> With the combination of the commit to increase the DMA buffer size and
>> the commit to use non-coherent memory, the host mode performance figures
>> are as follows on S32G3:
>>
>> # spidev_test --device /dev/spidev1.0 --bpw 8 --size <test_size> --cpha --iter 10000000 --speed 10000000
>>
>> Coherent (4096 byte transfers): 6534 kbps
>> Non-coherent: 7347 kbps
>>
>> Coherent (16 byte transfers): 447 kbps
>> Non-coherent: 448 kbps
>>
>> Just for comparison running the same test in XSPI mode:
>>
>> 4096 byte transfers: 2143 kbps
>> 16 byte transfers: 637 kbps
>>
>> These tests required hacking S32G3 to use DMA in host mode, although
>> the figures should be representative of target mode too where DMA is
>> used. And the other devices that use DMA in host mode should see similar
>> improvements.
>>
>> Signed-off-by: James Clark <james.clark@linaro.org>
>> ---
>
> My test numbers on LS1028A:
>
> Baseline XSPI (unmodified driver):
> $ ./spidev_test --device /dev/spidev2.1 --bpw 8 --size 8 --cpha --iter 10000000 --speed 10000000
> rate: tx 2710.6kbps, rx 2710.6kbps
> $ ./spidev_test --device /dev/spidev2.1 --bpw 8 --size 16 --cpha --iter 10000000 --speed 10000000
> rate: tx 3217.5kbps, rx 3217.5kbps
> $ ./spidev_test --device /dev/spidev2.1 --bpw 8 --size 4096 --cpha --iter 10000000 --speed 10000000
> rate: tx 5118.4kbps, rx 5118.4kbps
>
> Baseline DMA (modified just DSPI_XSPI_MODE -> DSPI_DMA_MODE):
> $ ./spidev_test --device /dev/spidev2.1 --bpw 8 --size 8 --cpha --iter 10000000 --speed 10000000
> rate: tx 1359.5kbps, rx 1359.5kbps
> $ ./spidev_test --device /dev/spidev2.1 --bpw 8 --size 16 --cpha --iter 10000000 --speed 10000000
> rate: tx 1461.1kbps, rx 1461.1kbps
> $ ./spidev_test --device /dev/spidev2.1 --bpw 8 --size 4096 --cpha --iter 10000000 --speed 10000000
> rate: tx 1664.6kbps, rx 1664.6kbps
>
> Intermediary LS1028A DMA mode (using non-coherent buffers but still
> small DMA buffers, i.e. holding just 1 FIFO size worth of data):
> $ ./spidev_test --device /dev/spidev2.1 --bpw 8 --size 8 --cpha --iter 10000000 --speed 10000000
> rate: tx 1345.1kbps, rx 1345.1kbps
> $ ./spidev_test --device /dev/spidev2.1 --bpw 8 --size 16 --cpha --iter 10000000 --speed 10000000
> rate: tx 1522.5kbps, rx 1522.5kbps
> $ ./spidev_test --device /dev/spidev2.1 --bpw 8 --size 4096 --cpha --iter 10000000 --speed 10000000
> rate: tx 1690.8kbps, rx 1690.8kbps
>
> Final LS1028A DMA mode (with the patch to send large messages as a
> single DMA buffer applied):
> $ ./spidev_test --device /dev/spidev2.1 --bpw 8 --size 8 --cpha --iter 10000000 --speed 10000000
> rate: tx 2247.0kbps, rx 2247.0kbps
> $ ./spidev_test --device /dev/spidev2.1 --bpw 8 --size 16 --cpha --iter 10000000 --speed 10000000
> rate: tx 3477.4kbps, rx 3477.4kbps
> $ ./spidev_test --device /dev/spidev2.1 --bpw 8 --size 4096 --cpha --iter 10000000 --speed 10000000
> rate: tx 8978.4kbps, rx 8978.4kbps
>
> So after your patch set, DMA mode on LS1028A becomes more performant and
> should replace XSPI. This is an outstanding result. That can be done as
> follow-up work.
I wonder if latency could be higher despite increased throughput? It
probably wouldn't be a big enough increase that anyone would care. And
based on the structure of the driver if throughput is higher the latency
might even be lower.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v4 0/6] spi: spi-fsl-dspi: Target mode improvements
2025-07-01 12:42 ` James Clark
@ 2025-07-01 13:18 ` Mark Brown
2025-07-01 13:57 ` Vladimir Oltean
1 sibling, 0 replies; 35+ messages in thread
From: Mark Brown @ 2025-07-01 13:18 UTC (permalink / raw)
To: James Clark
Cc: Vladimir Oltean, Vladimir Oltean, Arnd Bergmann, Larisa Grigore,
Frank Li, Christoph Hellwig, linux-spi, imx, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 782 bytes --]
On Tue, Jul 01, 2025 at 01:42:46PM +0100, James Clark wrote:
> On 30/06/2025 4:26 pm, Vladimir Oltean wrote:
> > On Fri, Jun 27, 2025 at 11:21:36AM +0100, James Clark wrote:
> > So after your patch set, DMA mode on LS1028A becomes more performant and
> > should replace XSPI. This is an outstanding result. That can be done as
> > follow-up work.
> I wonder if latency could be higher despite increased throughput? It
> probably wouldn't be a big enough increase that anyone would care. And based
> on the structure of the driver if throughput is higher the latency might
> even be lower.
Some CAN users are likely to care, some of them care a lot about
throughput so will be happy but AIUI some are very focused on round trip
times. That's a fairly specialist use case though.
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v4 0/6] spi: spi-fsl-dspi: Target mode improvements
2025-07-01 12:42 ` James Clark
2025-07-01 13:18 ` Mark Brown
@ 2025-07-01 13:57 ` Vladimir Oltean
2025-07-01 14:36 ` Mark Brown
1 sibling, 1 reply; 35+ messages in thread
From: Vladimir Oltean @ 2025-07-01 13:57 UTC (permalink / raw)
To: James Clark
Cc: Vladimir Oltean, Mark Brown, Arnd Bergmann, Larisa Grigore,
Frank Li, Christoph Hellwig, linux-spi, imx, linux-kernel
On Tue, Jul 01, 2025 at 01:42:46PM +0100, James Clark wrote:
> I wonder if latency could be higher despite increased throughput? It
> probably wouldn't be a big enough increase that anyone would care. And based
> on the structure of the driver if throughput is higher the latency might
> even be lower.
Actually, I do have a metric for that, sort of. I have a SPI-controlled
Ethernet switch with support for IEEE 1588, and synchronizing its
hardware clock over SPI benefits greatly from having a high precision
software timestamping point for the SPI transfers themselves.
Essentially, with XSPI FIFO mode we are able to provide a timestamping
granularity of $(FIFO size) words, see the spi_take_timestamp_pre() and
spi_take_timestamp_post() calls. Whereas with DMA, we let the core take
a message-level software timestamp which is much coarser, because at
driver level we can't guarantee a much more precise transmission time
interval for a particular requested byte. See __spi_pump_transfer_message().
If you're not familiar with phc2sys, an interpretation of the logs below
is as follows.
phc2sys synchronizes the sw2p0 (target) clock to CLOCK_REALTIME (the
source clock). "delay" is the time it took for the kernel to read the
target clock once, and the system clock twice (before and after).
When software timestamps the SPI transfer that reads the hardware time,
this is called a "cross timestamp". The smaller and less jittery this
delay, the more stable the cross-timestamp and the better will software
be able to discipline the target clock (aka the smaller the offsets will
be).
Before:
$ phc2sys -s CLOCK_REALTIME -c sw2p0 -O 0 -m
phc2sys[38.432]: sw2p0 sys offset -1741272972548124929 s0 freq +0 delay 6720
phc2sys[39.434]: sw2p0 sys offset -1741272972548179141 s1 freq -54094 delay 5960
phc2sys[40.436]: sw2p0 sys offset 190 s2 freq -53904 delay 6001
phc2sys[41.437]: sw2p0 sys offset 306 s2 freq -53731 delay 6520
phc2sys[42.438]: sw2p0 sys offset 275 s2 freq -53670 delay 6401
phc2sys[43.441]: sw2p0 sys offset 350 s2 freq -53513 delay 6881
phc2sys[44.442]: sw2p0 sys offset -302 s2 freq -54060 delay 6321
phc2sys[45.444]: sw2p0 sys offset 35 s2 freq -53814 delay 6761
phc2sys[46.446]: sw2p0 sys offset -103 s2 freq -53941 delay 6481
phc2sys[47.447]: sw2p0 sys offset -43 s2 freq -53912 delay 6361
phc2sys[48.450]: sw2p0 sys offset 314 s2 freq -53568 delay 6960
phc2sys[49.451]: sw2p0 sys offset -310 s2 freq -54098 delay 6441
phc2sys[50.453]: sw2p0 sys offset -86 s2 freq -53967 delay 6321
phc2sys[51.455]: sw2p0 sys offset -5 s2 freq -53911 delay 6401
phc2sys[52.457]: sw2p0 sys offset -2 s2 freq -53910 delay 6320
phc2sys[53.458]: sw2p0 sys offset 77 s2 freq -53832 delay 6400
phc2sys[54.459]: sw2p0 sys offset -112 s2 freq -53997 delay 6240
phc2sys[55.461]: sw2p0 sys offset 66 s2 freq -53853 delay 6480
phc2sys[56.463]: sw2p0 sys offset -33 s2 freq -53932 delay 6441
phc2sys[57.465]: sw2p0 sys offset -33 s2 freq -53942 delay 6441
phc2sys[58.467]: sw2p0 sys offset 17 s2 freq -53902 delay 6440
phc2sys[59.468]: sw2p0 sys offset -14 s2 freq -53928 delay 6520
phc2sys[60.470]: sw2p0 sys offset -133 s2 freq -54051 delay 6281
phc2sys[61.472]: sw2p0 sys offset 8 s2 freq -53950 delay 6400
phc2sys[62.473]: sw2p0 sys offset 25 s2 freq -53931 delay 6400
phc2sys[63.474]: sw2p0 sys offset -113 s2 freq -54061 delay 6040
phc2sys[64.476]: sw2p0 sys offset 44 s2 freq -53938 delay 6281
phc2sys[65.477]: sw2p0 sys offset -17 s2 freq -53986 delay 6320
phc2sys[66.479]: sw2p0 sys offset -86 s2 freq -54060 delay 5841
phc2sys[67.480]: sw2p0 sys offset 141 s2 freq -53859 delay 6361
phc2sys[68.481]: sw2p0 sys offset -11 s2 freq -53968 delay 6320
phc2sys[69.483]: sw2p0 sys offset -15 s2 freq -53976 delay 6321
phc2sys[70.484]: sw2p0 sys offset -109 s2 freq -54074 delay 5960
phc2sys[71.486]: sw2p0 sys offset 115 s2 freq -53883 delay 6520
phc2sys[72.488]: sw2p0 sys offset -86 s2 freq -54049 delay 6280
phc2sys[73.489]: sw2p0 sys offset 234 s2 freq -53755 delay 6801
phc2sys[74.491]: sw2p0 sys offset -219 s2 freq -54138 delay 6361
^Cphc2sys[74.923]: sw2p0 sys offset -174 s2 freq -54159 delay 6360
After:
$ phc2sys -s CLOCK_REALTIME -c sw2p0 -O 0 -m
phc2sys[753.479]: sw2p0 sys offset 1882248595 s0 freq +32000000 delay 150440
phc2sys[754.482]: sw2p0 sys offset 1850232103 s1 freq +46787 delay 141960
phc2sys[755.483]: sw2p0 sys offset -33278 s2 freq +13509 delay 143160
phc2sys[756.485]: sw2p0 sys offset -5074 s2 freq +31729 delay 150040
phc2sys[757.486]: sw2p0 sys offset 11060 s2 freq +46341 delay 140240
phc2sys[758.488]: sw2p0 sys offset 4804 s2 freq +43403 delay 151320
phc2sys[759.489]: sw2p0 sys offset 10358 s2 freq +50398 delay 141879
phc2sys[760.491]: sw2p0 sys offset 409 s2 freq +43557 delay 148840
phc2sys[761.493]: sw2p0 sys offset 3863 s2 freq +47133 delay 143360
phc2sys[762.494]: sw2p0 sys offset 259 s2 freq +44688 delay 145840
phc2sys[763.496]: sw2p0 sys offset 1849 s2 freq +46356 delay 141000
phc2sys[764.497]: sw2p0 sys offset -1800 s2 freq +43262 delay 144160
phc2sys[765.499]: sw2p0 sys offset -184 s2 freq +44338 delay 139880
phc2sys[766.501]: sw2p0 sys offset -1677 s2 freq +42790 delay 146120
phc2sys[767.502]: sw2p0 sys offset 2529 s2 freq +46492 delay 141040
phc2sys[768.504]: sw2p0 sys offset -4368 s2 freq +40354 delay 151240
phc2sys[769.505]: sw2p0 sys offset 1112 s2 freq +44524 delay 147680
phc2sys[770.507]: sw2p0 sys offset 3002 s2 freq +46747 delay 142960
phc2sys[771.509]: sw2p0 sys offset -899 s2 freq +43747 delay 145440
phc2sys[772.510]: sw2p0 sys offset -2003 s2 freq +42373 delay 148360
phc2sys[773.512]: sw2p0 sys offset 3675 s2 freq +47450 delay 141440
phc2sys[774.514]: sw2p0 sys offset -1417 s2 freq +43461 delay 144960
phc2sys[775.515]: sw2p0 sys offset 802 s2 freq +45255 delay 142559
phc2sys[776.517]: sw2p0 sys offset 1368 s2 freq +46061 delay 140040
phc2sys[777.518]: sw2p0 sys offset -1897 s2 freq +43207 delay 141840
phc2sys[778.520]: sw2p0 sys offset -774 s2 freq +43761 delay 141680
phc2sys[779.522]: sw2p0 sys offset -1715 s2 freq +42587 delay 145199
phc2sys[780.523]: sw2p0 sys offset 4045 s2 freq +47833 delay 134839
phc2sys[781.525]: sw2p0 sys offset -4809 s2 freq +40192 delay 146840
phc2sys[782.526]: sw2p0 sys offset 363 s2 freq +43922 delay 144759
phc2sys[783.528]: sw2p0 sys offset 3328 s2 freq +46996 delay 140240
phc2sys[784.530]: sw2p0 sys offset -293 s2 freq +44373 delay 142480
phc2sys[785.531]: sw2p0 sys offset 46 s2 freq +44624 delay 142000
phc2sys[786.533]: sw2p0 sys offset -3422 s2 freq +41170 delay 148080
phc2sys[787.534]: sw2p0 sys offset 2932 s2 freq +46497 delay 140720
phc2sys[788.536]: sw2p0 sys offset -1961 s2 freq +42484 delay 147040
phc2sys[789.537]: sw2p0 sys offset -945 s2 freq +42912 delay 149160
phc2sys[790.539]: sw2p0 sys offset 3221 s2 freq +46794 delay 143040
phc2sys[791.541]: sw2p0 sys offset 41 s2 freq +44580 delay 144160
phc2sys[792.542]: sw2p0 sys offset -748 s2 freq +43804 delay 145120
Here, the synchronization offsets in DMA mode are an order of magnitude
worse, so yeah, initial enthusiasm definitely curbed now.
For me, what matters is not the latency itself, but the ability to
cross-timestamp one byte within the SPI transfer with high granularity,
and for the uncertainty of that timestamp to be as small and constant as
possible.
For that reason, I can post a third output log, taken in XSPI FIFO mode
but with "ctlr->ptp_sts_supported = true" removed. That causes the core
to take message-level software timestamps, which are a better indicator
of latency.
You can see that in FIFO mode, the minimum is much smaller (108 us) but
the spread is larger (the maximum is 209 us). In DMA mode, the latencies
are much more stable. But despite this, XSPI is still better for the
ability to zoom in on the particular byte of interest.
$ phc2sys -s CLOCK_REALTIME -c sw2p0 -O 0 -m
phc2sys[246.568]: sw2p0 sys offset 2872475 s0 freq -88840 delay 131332
phc2sys[247.571]: sw2p0 sys offset 2874267 s1 freq -87052 delay 194739
phc2sys[248.572]: sw2p0 sys offset 71966 s2 freq -15086 delay 114971
phc2sys[249.573]: sw2p0 sys offset 34792 s2 freq -30670 delay 108331
phc2sys[250.575]: sw2p0 sys offset -39553 s2 freq -94578 delay 208580
phc2sys[251.577]: sw2p0 sys offset 50369 s2 freq -16521 delay 107410
phc2sys[252.578]: sw2p0 sys offset 1597 s2 freq -50183 delay 128292
phc2sys[253.579]: sw2p0 sys offset 6685 s2 freq -44616 delay 107810
phc2sys[254.581]: sw2p0 sys offset -4102 s2 freq -53397 delay 108530
phc2sys[255.582]: sw2p0 sys offset -7256 s2 freq -57782 delay 112051
phc2sys[256.584]: sw2p0 sys offset -2910 s2 freq -55613 delay 108610
phc2sys[257.586]: sw2p0 sys offset -52981 s2 freq -106557 delay 209460
phc2sys[258.587]: sw2p0 sys offset 49914 s2 freq -19556 delay 107130
phc2sys[259.589]: sw2p0 sys offset -29913 s2 freq -84409 delay 195699
phc2sys[260.591]: sw2p0 sys offset 42439 s2 freq -21031 delay 110411
phc2sys[261.592]: sw2p0 sys offset 3048 s2 freq -47690 delay 120571
phc2sys[262.594]: sw2p0 sys offset -853 s2 freq -50676 delay 113291
phc2sys[263.596]: sw2p0 sys offset -35260 s2 freq -85339 delay 173937
phc2sys[264.597]: sw2p0 sys offset 26479 s2 freq -34178 delay 110570
phc2sys[265.599]: sw2p0 sys offset -36802 s2 freq -89516 delay 195699
phc2sys[266.601]: sw2p0 sys offset 39945 s2 freq -23809 delay 110571
phc2sys[267.603]: sw2p0 sys offset -32036 s2 freq -83807 delay 194858
phc2sys[268.604]: sw2p0 sys offset 37721 s2 freq -23661 delay 110570
phc2sys[269.606]: sw2p0 sys offset 5110 s2 freq -44955 delay 112571
phc2sys[270.607]: sw2p0 sys offset -3526 s2 freq -52058 delay 109570
phc2sys[271.608]: sw2p0 sys offset -7856 s2 freq -57446 delay 112491
phc2sys[272.610]: sw2p0 sys offset -5259 s2 freq -57206 delay 112051
phc2sys[273.612]: sw2p0 sys offset -43272 s2 freq -96797 delay 194178
phc2sys[274.613]: sw2p0 sys offset 40708 s2 freq -25798 delay 108291
phc2sys[275.615]: sw2p0 sys offset -38753 s2 freq -93047 delay 208900
phc2sys[276.616]: sw2p0 sys offset 47948 s2 freq -17972 delay 111050
phc2sys[277.618]: sw2p0 sys offset 10692 s2 freq -40843 delay 111131
phc2sys[278.619]: sw2p0 sys offset -2179 s2 freq -50507 delay 108530
phc2sys[279.620]: sw2p0 sys offset -8143 s2 freq -57124 delay 111571
phc2sys[280.623]: sw2p0 sys offset -49486 s2 freq -100910 delay 199179
phc2sys[281.625]: sw2p0 sys offset -3684 s2 freq -69954 delay 199419
phc2sys[282.626]: sw2p0 sys offset 54475 s2 freq -12900 delay 111651
phc2sys[283.628]: sw2p0 sys offset -36562 s2 freq -87595 delay 209420
^Cphc2sys[284.181]: sw2p0 sys offset -11239 s2 freq -73240 delay 194499
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v4 0/6] spi: spi-fsl-dspi: Target mode improvements
2025-07-01 13:57 ` Vladimir Oltean
@ 2025-07-01 14:36 ` Mark Brown
2025-07-01 14:53 ` Vladimir Oltean
0 siblings, 1 reply; 35+ messages in thread
From: Mark Brown @ 2025-07-01 14:36 UTC (permalink / raw)
To: Vladimir Oltean
Cc: James Clark, Vladimir Oltean, Arnd Bergmann, Larisa Grigore,
Frank Li, Christoph Hellwig, linux-spi, imx, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 647 bytes --]
On Tue, Jul 01, 2025 at 04:57:47PM +0300, Vladimir Oltean wrote:
> Here, the synchronization offsets in DMA mode are an order of magnitude
> worse, so yeah, initial enthusiasm definitely curbed now.
> For me, what matters is not the latency itself, but the ability to
> cross-timestamp one byte within the SPI transfer with high granularity,
> and for the uncertainty of that timestamp to be as small and constant as
> possible.
This is sounding like a copybreak type situation with the mode selected
depending on how big the transfer is, that's a very common pattern.
I've not looked how easy it is to flip this hardware between modes
though.
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v4 0/6] spi: spi-fsl-dspi: Target mode improvements
2025-07-01 14:36 ` Mark Brown
@ 2025-07-01 14:53 ` Vladimir Oltean
2025-07-01 15:16 ` Mark Brown
0 siblings, 1 reply; 35+ messages in thread
From: Vladimir Oltean @ 2025-07-01 14:53 UTC (permalink / raw)
To: Mark Brown
Cc: James Clark, Vladimir Oltean, Arnd Bergmann, Larisa Grigore,
Frank Li, Christoph Hellwig, linux-spi, imx, linux-kernel
On Tue, Jul 01, 2025 at 03:36:21PM +0100, Mark Brown wrote:
> On Tue, Jul 01, 2025 at 04:57:47PM +0300, Vladimir Oltean wrote:
>
> > Here, the synchronization offsets in DMA mode are an order of magnitude
> > worse, so yeah, initial enthusiasm definitely curbed now.
>
> > For me, what matters is not the latency itself, but the ability to
> > cross-timestamp one byte within the SPI transfer with high granularity,
> > and for the uncertainty of that timestamp to be as small and constant as
> > possible.
>
> This is sounding like a copybreak type situation with the mode selected
> depending on how big the transfer is, that's a very common pattern.
> I've not looked how easy it is to flip this hardware between modes
> though.
I suppose one could try using FIFO mode for transfers which request
timestamping and DMA for transfers which don't. I don't have an insight
into what impact that will have on the driver, but I suspect at the very
least one will have to transform "DSPI_DMA_MODE" into "dspi->dma_available"
and "dspi->dma_in_use", and reconfigure the SPI_RSER register (interrupt
routing, to DMA engine or to CPUs) at every transfer rather than at dspi_init().
The question is whether you would be willing to see and maintain such
complexity increase, when AFAIK, the LS1028A FIFO mode passes its
requirements.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v4 0/6] spi: spi-fsl-dspi: Target mode improvements
2025-07-01 14:53 ` Vladimir Oltean
@ 2025-07-01 15:16 ` Mark Brown
2025-07-01 15:24 ` Vladimir Oltean
0 siblings, 1 reply; 35+ messages in thread
From: Mark Brown @ 2025-07-01 15:16 UTC (permalink / raw)
To: Vladimir Oltean
Cc: James Clark, Vladimir Oltean, Arnd Bergmann, Larisa Grigore,
Frank Li, Christoph Hellwig, linux-spi, imx, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 955 bytes --]
On Tue, Jul 01, 2025 at 05:53:12PM +0300, Vladimir Oltean wrote:
> I suppose one could try using FIFO mode for transfers which request
> timestamping and DMA for transfers which don't. I don't have an insight
> into what impact that will have on the driver, but I suspect at the very
> least one will have to transform "DSPI_DMA_MODE" into "dspi->dma_available"
> and "dspi->dma_in_use", and reconfigure the SPI_RSER register (interrupt
> routing, to DMA engine or to CPUs) at every transfer rather than at dspi_init().
> The question is whether you would be willing to see and maintain such
> complexity increase, when AFAIK, the LS1028A FIFO mode passes its
> requirements.
Switching between modes is incredibly common, usually between PIO (for
very short transfers) and DMA, that's no problem. Factoring in
timestamping seems like a reasonable signal I guess, might trip someone
who was trying to benchmark things up but probably not normal users.
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v4 0/6] spi: spi-fsl-dspi: Target mode improvements
2025-07-01 15:16 ` Mark Brown
@ 2025-07-01 15:24 ` Vladimir Oltean
2025-07-01 15:30 ` Mark Brown
0 siblings, 1 reply; 35+ messages in thread
From: Vladimir Oltean @ 2025-07-01 15:24 UTC (permalink / raw)
To: Mark Brown
Cc: James Clark, Vladimir Oltean, Arnd Bergmann, Larisa Grigore,
Frank Li, Christoph Hellwig, linux-spi, imx, linux-kernel
On Tue, Jul 01, 2025 at 04:16:50PM +0100, Mark Brown wrote:
> On Tue, Jul 01, 2025 at 05:53:12PM +0300, Vladimir Oltean wrote:
>
> > I suppose one could try using FIFO mode for transfers which request
> > timestamping and DMA for transfers which don't. I don't have an insight
> > into what impact that will have on the driver, but I suspect at the very
> > least one will have to transform "DSPI_DMA_MODE" into "dspi->dma_available"
> > and "dspi->dma_in_use", and reconfigure the SPI_RSER register (interrupt
> > routing, to DMA engine or to CPUs) at every transfer rather than at dspi_init().
>
> > The question is whether you would be willing to see and maintain such
> > complexity increase, when AFAIK, the LS1028A FIFO mode passes its
> > requirements.
>
> Switching between modes is incredibly common, usually between PIO (for
> very short transfers) and DMA, that's no problem. Factoring in
> timestamping seems like a reasonable signal I guess, might trip someone
> who was trying to benchmark things up but probably not normal users.
Ah, ok, I vaguely remember something being discussed about can_dma()
on previous iterations of this patch, but in a different context.
Then that's an avenue to explore, I guess. Looking at that method's
prototype, I suppose dspi could simply return can_dma = false "if (xfer->ptp_sts)"
(timestamp requested), i.e. no core involvement in the decision process at all?
Sounds interesting, but I can't promise I'll follow up with patches.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v4 0/6] spi: spi-fsl-dspi: Target mode improvements
2025-07-01 15:24 ` Vladimir Oltean
@ 2025-07-01 15:30 ` Mark Brown
0 siblings, 0 replies; 35+ messages in thread
From: Mark Brown @ 2025-07-01 15:30 UTC (permalink / raw)
To: Vladimir Oltean
Cc: James Clark, Vladimir Oltean, Arnd Bergmann, Larisa Grigore,
Frank Li, Christoph Hellwig, linux-spi, imx, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 880 bytes --]
On Tue, Jul 01, 2025 at 06:24:33PM +0300, Vladimir Oltean wrote:
> On Tue, Jul 01, 2025 at 04:16:50PM +0100, Mark Brown wrote:
> > Switching between modes is incredibly common, usually between PIO (for
> > very short transfers) and DMA, that's no problem. Factoring in
> > timestamping seems like a reasonable signal I guess, might trip someone
> > who was trying to benchmark things up but probably not normal users.
> Ah, ok, I vaguely remember something being discussed about can_dma()
> on previous iterations of this patch, but in a different context.
> Then that's an avenue to explore, I guess. Looking at that method's
> prototype, I suppose dspi could simply return can_dma = false "if (xfer->ptp_sts)"
> (timestamp requested), i.e. no core involvement in the decision process at all?
Yes, exactly. It can base the decision on whatever amuses it about the
transfer.
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 484 bytes --]
^ permalink raw reply [flat|nested] 35+ messages in thread