Linux SPI subsystem development
* [PATCH 0/5] spi: dw: use threaded interrupt
@ 2026-06-15  4:40 Jisheng Zhang
  2026-06-15  4:40 ` [PATCH 1/5] spi: dw: fix first spi transfer with dma always fallback to PIO Jisheng Zhang
                   ` (4 more replies)
  0 siblings, 5 replies; 7+ messages in thread
From: Jisheng Zhang @ 2026-06-15  4:40 UTC (permalink / raw)
  To: Mark Brown; +Cc: linux-spi, linux-kernel

To avoid blocking for an excessive amount of time, eventually impacting
on system responsiveness, hard interrupt handlers should finish
executing in as little time as possible.

Use threaded interrupt and move the SPI transfer handling to an
interrupt thread.

After that, since dw_reader() and dw_writer() are now called from the
threaded ISR, we can delay unmasking the interrupts until no more rx or
tx progress is made, which reduces the number of interrupts further.

patch1 fixes a trivial bug which prevents DMA usage for the first
transfer.
patch2, patch3 and patch4 are small cleanups.
patch5 does the conversion and optimization.

Here are the performance numbers:

Tested with the two command groups below:
./spidev_test -D /dev/spidev1.3 -s 30000000 -S 327680 -I 1

./spidev_test -D /dev/spidev1.3 -s 30000000 -S 327680 -I 1000
./rtla timerlat top -q -k -P f:95

The first command checks the result of the interrupt count optimization;
the second command group checks the threaded interrupt improvement.

Before the patch:
each 32KB spi spidev_test transfer triggers 33118 interrupts

spidev_test reports ~22090kbps
and rtla reports:
                                     Timer Latency
  0 00:00:37   |          IRQ Timer Latency (us)        |         Thread Timer Latency (us)
CPU COUNT      |      cur       min       avg       max |      cur       min       avg       max
  0 #9958      |        1         0        67    103394 |        6         4      2198    105031
  1 #36902     |        1         0         1        18 |        5         4         5        29

After the patch:
each 32KB spi spidev_test transfer only triggers 3 interrupts

spidev_test reports ~23520kbps
and now rtla reports:
                                    Timer Latency
  0 00:00:58   |          IRQ Timer Latency (us)        |         Thread Timer Latency (us)
CPU COUNT      |      cur       min       avg       max |      cur       min       avg       max
  0 #58362     |        1         0         0        29 |        6         3         4        56
  1 #58363     |        1         0         1        23 |        6         4         5        68

In summary:
before the patch        after the patch
33118 interrupts        3 interrupts            reduced by 11038 times!
103394 us max latency   29 us max latency       reduced by 3564 times!
22090 kbps              23520 kbps              improved by 6.5%
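
The summary figures can be re-derived from the raw numbers. The sketch below is only a cross-check, not part of the series: the function names are hypothetical, and the "(before - after) / after" reading of "reduced by N times" is an assumption that happens to match the quoted values when truncated to an integer.

```c
#include <stdio.h>

/* "Reduced by N times", read as (before - after) / after (assumption). */
double reduction_times(double before, double after)
{
	return (before - after) / after;
}

/* Throughput gain in percent, relative to the "before" value. */
double gain_percent(double before, double after)
{
	return (after - before) / before * 100.0;
}

/* Print the three rows of the summary using the figures quoted above. */
void print_summary(void)
{
	printf("interrupts:  reduced by %.2f times\n",
	       reduction_times(33118.0, 3.0));
	printf("max latency: reduced by %.2f times\n",
	       reduction_times(103394.0, 29.0));
	printf("throughput:  improved by %.2f%%\n",
	       gain_percent(22090.0, 23520.0));
}
```

Calling print_summary() from a small main() is enough to compare the computed values with the table.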


Jisheng Zhang (5):
  spi: dw: fix first spi transfer with dma always fallback to PIO
  spi: dw: use the correct error msg if request_irq() fails
  spi: dw: use DW_SPI_ISR directly
  spi: dw: use DW_SPI_INT_MASK instead of hardcoded 0xff
  spi: dw: use threaded interrupt and optimize the threaded ISR

 drivers/spi/spi-dw-core.c | 108 +++++++++++++++++++++++---------------
 drivers/spi/spi-dw-dma.c  |   3 +-
 2 files changed, 67 insertions(+), 44 deletions(-)

-- 
2.53.0



* [PATCH 1/5] spi: dw: fix first spi transfer with dma always fallback to PIO
  2026-06-15  4:40 [PATCH 0/5] spi: dw: use threaded interrupt Jisheng Zhang
@ 2026-06-15  4:40 ` Jisheng Zhang
  2026-06-15  4:40 ` [PATCH 2/5] spi: dw: use the correct error msg if request_irq() fails Jisheng Zhang
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: Jisheng Zhang @ 2026-06-15  4:40 UTC (permalink / raw)
  To: Mark Brown; +Cc: linux-spi, linux-kernel

Even with proper DMA engine support, the first SPI transfer always
falls back to PIO. The reason is that dws->n_bytes is 0 after
initialization, so dw_spi_can_dma(), called from __spi_map_msg(),
returns false. Both tx_sg_mapped and rx_sg_mapped are therefore false,
so spi_xfer_is_dma_mapped() reports false for the first SPI transfer,
which then falls back to PIO.

Although this does no harm, we can simply fix the issue by
calculating "n_bytes" from xfer->bits_per_word.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
---
 drivers/spi/spi-dw-dma.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c
index fe726b9b1780..bd70a7ed8067 100644
--- a/drivers/spi/spi-dw-dma.c
+++ b/drivers/spi/spi-dw-dma.c
@@ -247,11 +247,12 @@ static bool dw_spi_can_dma(struct spi_controller *ctlr,
 {
 	struct dw_spi *dws = spi_controller_get_devdata(ctlr);
 	enum dma_slave_buswidth dma_bus_width;
+	u8 n_bytes = roundup_pow_of_two(BITS_TO_BYTES(xfer->bits_per_word));
 
 	if (xfer->len <= dws->fifo_len)
 		return false;
 
-	dma_bus_width = dw_spi_dma_convert_width(dws->n_bytes);
+	dma_bus_width = dw_spi_dma_convert_width(n_bytes);
 
 	return dws->dma_addr_widths & BIT(dma_bus_width);
 }
-- 
2.53.0
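
The word-size computation used in the patch above can be tried outside the kernel. This is a minimal user-space sketch: bits_to_bytes(), roundup_pow2() and n_bytes_for() are hypothetical stand-ins written for illustration, assumed to behave like BITS_TO_BYTES() and roundup_pow_of_two() for small non-zero inputs.

```c
#include <stdint.h>

/* Stand-in for BITS_TO_BYTES(): number of bytes needed to hold 'bits'. */
unsigned int bits_to_bytes(unsigned int bits)
{
	return (bits + 7) / 8;
}

/* Stand-in for roundup_pow_of_two(); 'n' must be non-zero. */
unsigned int roundup_pow2(unsigned int n)
{
	unsigned int p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/* Word size in bytes derived from the transfer's bits_per_word alone. */
uint8_t n_bytes_for(unsigned int bits_per_word)
{
	return (uint8_t)roundup_pow2(bits_to_bytes(bits_per_word));
}
```

Because the value depends only on bits_per_word, it is available before any transfer has run, which is the point of the fix: for example, 12 bits per word needs 2 bytes, and 20 bits per word rounds up from 3 bytes to 4.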



* [PATCH 2/5] spi: dw: use the correct error msg if request_irq() fails
  2026-06-15  4:40 [PATCH 0/5] spi: dw: use threaded interrupt Jisheng Zhang
  2026-06-15  4:40 ` [PATCH 1/5] spi: dw: fix first spi transfer with dma always fallback to PIO Jisheng Zhang
@ 2026-06-15  4:40 ` Jisheng Zhang
  2026-06-15  4:40 ` [PATCH 3/5] spi: dw: use DW_SPI_ISR directly Jisheng Zhang
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: Jisheng Zhang @ 2026-06-15  4:40 UTC (permalink / raw)
  To: Mark Brown; +Cc: linux-spi, linux-kernel

If request_irq() fails, report "can not request IRQ" rather than "can
not get IRQ" which may be misread as platform_get_irq() failure.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
---
 drivers/spi/spi-dw-core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c
index b47637888c5c..02e0b4b66a5b 100644
--- a/drivers/spi/spi-dw-core.c
+++ b/drivers/spi/spi-dw-core.c
@@ -947,7 +947,7 @@ int dw_spi_add_controller(struct device *dev, struct dw_spi *dws)
 	ret = request_irq(dws->irq, dw_spi_irq, IRQF_SHARED, dev_name(dev),
 			  ctlr);
 	if (ret < 0 && ret != -ENOTCONN) {
-		dev_err(dev, "can not get IRQ\n");
+		dev_err(dev, "can not request IRQ\n");
 		goto err_free_ctlr;
 	}
 
-- 
2.53.0



* [PATCH 3/5] spi: dw: use DW_SPI_ISR directly
  2026-06-15  4:40 [PATCH 0/5] spi: dw: use threaded interrupt Jisheng Zhang
  2026-06-15  4:40 ` [PATCH 1/5] spi: dw: fix first spi transfer with dma always fallback to PIO Jisheng Zhang
  2026-06-15  4:40 ` [PATCH 2/5] spi: dw: use the correct error msg if request_irq() fails Jisheng Zhang
@ 2026-06-15  4:40 ` Jisheng Zhang
  2026-06-15  4:40 ` [PATCH 4/5] spi: dw: use DW_SPI_INT_MASK instead of hardcoded 0xff Jisheng Zhang
  2026-06-15  4:40 ` [PATCH 5/5] spi: dw: use threaded interrupt and optimize the threaded ISR Jisheng Zhang
  4 siblings, 0 replies; 7+ messages in thread
From: Jisheng Zhang @ 2026-06-15  4:40 UTC (permalink / raw)
  To: Mark Brown; +Cc: linux-spi, linux-kernel

The DW_SPI_ISR register already reports the masked interrupts, so there
is no need to mask the value again.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
---
 drivers/spi/spi-dw-core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c
index 02e0b4b66a5b..aa2e51d0f959 100644
--- a/drivers/spi/spi-dw-core.c
+++ b/drivers/spi/spi-dw-core.c
@@ -252,7 +252,7 @@ static irqreturn_t dw_spi_irq(int irq, void *dev_id)
 {
 	struct spi_controller *ctlr = dev_id;
 	struct dw_spi *dws = spi_controller_get_devdata(ctlr);
-	u16 irq_status = dw_readl(dws, DW_SPI_ISR) & DW_SPI_INT_MASK;
+	u16 irq_status = dw_readl(dws, DW_SPI_ISR);
 
 	if (!irq_status)
 		return IRQ_NONE;
-- 
2.53.0
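
Why the extra mask is redundant can be illustrated with a small register model. This is a sketch under stated assumptions, not the hardware definition: it assumes the masked status equals the raw status gated by the enable mask, and that the driver only ever enables bits [5:0]. All names are hypothetical.

```c
#include <stdint.h>

/* Assumption for this model: only bits [5:0] are ever enabled. */
#define MODEL_INT_MASK 0x3fu

/* Masked status in the model: raw status gated by the enable mask. */
uint32_t model_isr(uint32_t risr, uint32_t imr)
{
	return risr & imr;
}

/*
 * Count the (risr, imr) pairs for which an additional
 * "& MODEL_INT_MASK" would change the masked status value.
 */
unsigned long model_count_mask_diffs(void)
{
	unsigned long diffs = 0;
	uint32_t risr, imr;

	for (risr = 0; risr <= 0xffffu; risr++)
		for (imr = 0; imr <= MODEL_INT_MASK; imr++)
			if ((model_isr(risr, imr) & MODEL_INT_MASK) !=
			    model_isr(risr, imr))
				diffs++;
	return diffs;
}
```

Under these assumptions model_count_mask_diffs() returns 0, meaning the second mask never changes the result.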



* [PATCH 4/5] spi: dw: use DW_SPI_INT_MASK instead of hardcoded 0xff
  2026-06-15  4:40 [PATCH 0/5] spi: dw: use threaded interrupt Jisheng Zhang
                   ` (2 preceding siblings ...)
  2026-06-15  4:40 ` [PATCH 3/5] spi: dw: use DW_SPI_ISR directly Jisheng Zhang
@ 2026-06-15  4:40 ` Jisheng Zhang
  2026-06-15  4:40 ` [PATCH 5/5] spi: dw: use threaded interrupt and optimize the threaded ISR Jisheng Zhang
  4 siblings, 0 replies; 7+ messages in thread
From: Jisheng Zhang @ 2026-06-15  4:40 UTC (permalink / raw)
  To: Mark Brown; +Cc: linux-spi, linux-kernel

The valid bits of the Interrupt Mask Register are bit[5:0], which
DW_SPI_INT_MASK already defines. Use it instead of the hardcoded 0xff,
which is incorrect (though harmless).

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
---
 drivers/spi/spi-dw-core.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c
index aa2e51d0f959..feac17655847 100644
--- a/drivers/spi/spi-dw-core.c
+++ b/drivers/spi/spi-dw-core.c
@@ -228,7 +228,7 @@ static irqreturn_t dw_spi_transfer_handler(struct dw_spi *dws)
 	 */
 	dw_reader(dws);
 	if (!dws->rx_len) {
-		dw_spi_mask_intr(dws, 0xff);
+		dw_spi_mask_intr(dws, DW_SPI_INT_MASK);
 		spi_finalize_current_transfer(dws->ctlr);
 	} else if (dws->rx_len <= dw_readl(dws, DW_SPI_RXFTLR)) {
 		dw_writel(dws, DW_SPI_RXFTLR, dws->rx_len - 1);
@@ -258,7 +258,7 @@ static irqreturn_t dw_spi_irq(int irq, void *dev_id)
 		return IRQ_NONE;
 
 	if (!ctlr->cur_msg) {
-		dw_spi_mask_intr(dws, 0xff);
+		dw_spi_mask_intr(dws, DW_SPI_INT_MASK);
 		return IRQ_HANDLED;
 	}
 
@@ -445,7 +445,7 @@ static int dw_spi_transfer_one(struct spi_controller *ctlr,
 	dws->dma_mapped = spi_xfer_is_dma_mapped(ctlr, spi, transfer);
 
 	/* For poll mode just disable all interrupts */
-	dw_spi_mask_intr(dws, 0xff);
+	dw_spi_mask_intr(dws, DW_SPI_INT_MASK);
 
 	if (dws->dma_mapped) {
 		ret = dws->dma_ops->dma_setup(dws, transfer);
@@ -704,7 +704,7 @@ static int dw_spi_exec_mem_op(struct spi_mem *mem, const struct spi_mem_op *op)
 
 	dw_spi_update_config(dws, mem->spi, &cfg);
 
-	dw_spi_mask_intr(dws, 0xff);
+	dw_spi_mask_intr(dws, DW_SPI_INT_MASK);
 
 	dw_spi_enable_chip(dws, 1);
 
-- 
2.53.0
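
The "incorrect but harmless" remark can be checked with a short model. The sketch assumes that masking clears the selected enable bits (imr & ~mask) and that DW_SPI_INT_MASK corresponds to GENMASK(5, 0). Both are assumptions here, and the helper names are hypothetical.

```c
#include <stdint.h>

/* Portable stand-in for GENMASK(h, l) on 32-bit values (0 <= l <= h <= 31). */
uint32_t model_genmask(unsigned int h, unsigned int l)
{
	return (0xffffffffu >> (31 - h)) & (0xffffffffu << l);
}

/* Model of masking interrupts: clear the selected enable bits. */
uint32_t model_mask_intr(uint32_t imr, uint32_t mask)
{
	return imr & ~mask;
}
```

For any enable value confined to bits [5:0], masking with 0xff and masking with model_genmask(5, 0) give the same result. The named constant only avoids touching bits that the register does not define.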



* [PATCH 5/5] spi: dw: use threaded interrupt and optimize the threaded ISR
  2026-06-15  4:40 [PATCH 0/5] spi: dw: use threaded interrupt Jisheng Zhang
                   ` (3 preceding siblings ...)
  2026-06-15  4:40 ` [PATCH 4/5] spi: dw: use DW_SPI_INT_MASK instead of hardcoded 0xff Jisheng Zhang
@ 2026-06-15  4:40 ` Jisheng Zhang
  2026-06-15  5:37   ` Christophe JAILLET
  4 siblings, 1 reply; 7+ messages in thread
From: Jisheng Zhang @ 2026-06-15  4:40 UTC (permalink / raw)
  To: Mark Brown; +Cc: linux-spi, linux-kernel

To avoid blocking for an excessive amount of time, eventually impacting
on system responsiveness, hard interrupt handlers should finish
executing in as little time as possible.

Use threaded interrupt and move the SPI transfer handling to an
interrupt thread.

After that, since dw_reader() and dw_writer() are now called from the
threaded ISR, we can delay unmasking the interrupts until no more rx or
tx progress is made, which reduces the number of interrupts further.

Tested with the two command groups below:
./spidev_test -D /dev/spidev1.3 -s 30000000 -S 327680 -I 1

./spidev_test -D /dev/spidev1.3 -s 30000000 -S 327680 -I 1000
./rtla timerlat top -q -k -P f:95

The first command checks the result of the interrupt count optimization;
the second command group checks the threaded interrupt improvement.

Before the patch:
each 32KB spi spidev_test transfer triggers 33118 interrupts

spidev_test reports ~22090kbps
and rtla reports:
                                     Timer Latency
  0 00:00:37   |          IRQ Timer Latency (us)        |         Thread Timer Latency (us)
CPU COUNT      |      cur       min       avg       max |      cur       min       avg       max
  0 #9958      |        1         0        67    103394 |        6         4      2198    105031
  1 #36902     |        1         0         1        18 |        5         4         5        29

After the patch:
each 32KB spi spidev_test transfer only triggers 3 interrupts

spidev_test reports ~23520kbps
and now rtla reports:
                                    Timer Latency
  0 00:00:58   |          IRQ Timer Latency (us)        |         Thread Timer Latency (us)
CPU COUNT      |      cur       min       avg       max |      cur       min       avg       max
  0 #58362     |        1         0         0        29 |        6         3         4        56
  1 #58363     |        1         0         1        23 |        6         4         5        68

In summary:
before the patch        after the patch
33118 interrupts        3 interrupts            reduced by 11038 times!
103394 us max latency   29 us max latency       reduced by 3564 times!
22090 kbps              23520 kbps              improved by 6.5%

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
---
 drivers/spi/spi-dw-core.c | 100 +++++++++++++++++++++++---------------
 1 file changed, 61 insertions(+), 39 deletions(-)

diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c
index feac17655847..cc4cc058ee72 100644
--- a/drivers/spi/spi-dw-core.c
+++ b/drivers/spi/spi-dw-core.c
@@ -132,10 +132,11 @@ static inline u32 dw_spi_rx_max(struct dw_spi *dws)
 	return min_t(u32, dws->rx_len, dw_readl(dws, DW_SPI_RXFLR));
 }
 
-static void dw_writer(struct dw_spi *dws)
+static u32 dw_writer(struct dw_spi *dws)
 {
 	u32 max = dw_spi_tx_max(dws);
 	u32 txw = 0;
+	u32 tx = 0;
 
 	while (max--) {
 		if (dws->tx) {
@@ -150,13 +151,16 @@ static void dw_writer(struct dw_spi *dws)
 		}
 		dw_write_io_reg(dws, DW_SPI_DR, txw);
 		--dws->tx_len;
+		++tx;
 	}
+	return tx;
 }
 
-static void dw_reader(struct dw_spi *dws)
+static u32 dw_reader(struct dw_spi *dws)
 {
 	u32 max = dw_spi_rx_max(dws);
 	u32 rxw;
+	u32 rx = 0;
 
 	while (max--) {
 		rxw = dw_read_io_reg(dws, DW_SPI_DR);
@@ -171,7 +175,9 @@ static void dw_reader(struct dw_spi *dws)
 			dws->rx += dws->n_bytes;
 		}
 		--dws->rx_len;
+		++rx;
 	}
+	return rx;
 }
 
 int dw_spi_check_status(struct dw_spi *dws, bool raw)
@@ -210,42 +216,59 @@ int dw_spi_check_status(struct dw_spi *dws, bool raw)
 }
 EXPORT_SYMBOL_NS_GPL(dw_spi_check_status, "SPI_DW_CORE");
 
-static irqreturn_t dw_spi_transfer_handler(struct dw_spi *dws)
+static irqreturn_t dw_spi_irq_thread_fn(int irq, void *dev_id)
 {
-	u16 irq_status = dw_readl(dws, DW_SPI_ISR);
+	struct spi_controller *ctlr = dev_id;
+	struct dw_spi *dws = spi_controller_get_devdata(ctlr);
+	u16 irq_status = dw_readl(dws, DW_SPI_RISR);
+	u32 rx, tx, imask, mask = 0;
+
+	do {
+		/*
+		 * Read data from the Rx FIFO every time we've got a chance executing
+		 * this method. If there is nothing left to receive, terminate the
+		 * procedure. Otherwise adjust the Rx FIFO Threshold level if it's a
+		 * final stage of the transfer. By doing so we'll get the next IRQ
+		 * right when the leftover incoming data is received.
+		 */
+		rx = dw_reader(dws);
+		if (!dws->rx_len) {
+			mask = DW_SPI_INT_MASK;
+			spi_finalize_current_transfer(dws->ctlr);
+		} else if (dws->rx_len <= dw_readl(dws, DW_SPI_RXFTLR)) {
+			dw_writel(dws, DW_SPI_RXFTLR, dws->rx_len - 1);
+		}
+
+		/*
+		 * Send data out if Tx FIFO Empty IRQ is received. The IRQ will be
+		 * disabled after the data transmission is finished so not to
+		 * have the TXE IRQ flood at the final stage of the transfer.
+		 */
+		if (irq_status & DW_SPI_INT_TXEI) {
+			tx = dw_writer(dws);
+			if (!dws->tx_len)
+				mask = DW_SPI_INT_TXEI;
+		}
+	} while (rx != 0 || tx != 0);
+
+	imask = DW_SPI_INT_TXEI | DW_SPI_INT_TXOI |
+		DW_SPI_INT_RXUI | DW_SPI_INT_RXOI | DW_SPI_INT_RXFI;
+	imask &= ~mask;
+	dw_spi_umask_intr(dws, imask);
 
+	return IRQ_HANDLED;
+}
+
+static irqreturn_t dw_spi_transfer_handler(struct dw_spi *dws)
+{
 	if (dw_spi_check_status(dws, false)) {
 		spi_finalize_current_transfer(dws->ctlr);
 		return IRQ_HANDLED;
 	}
 
-	/*
-	 * Read data from the Rx FIFO every time we've got a chance executing
-	 * this method. If there is nothing left to receive, terminate the
-	 * procedure. Otherwise adjust the Rx FIFO Threshold level if it's a
-	 * final stage of the transfer. By doing so we'll get the next IRQ
-	 * right when the leftover incoming data is received.
-	 */
-	dw_reader(dws);
-	if (!dws->rx_len) {
-		dw_spi_mask_intr(dws, DW_SPI_INT_MASK);
-		spi_finalize_current_transfer(dws->ctlr);
-	} else if (dws->rx_len <= dw_readl(dws, DW_SPI_RXFTLR)) {
-		dw_writel(dws, DW_SPI_RXFTLR, dws->rx_len - 1);
-	}
-
-	/*
-	 * Send data out if Tx FIFO Empty IRQ is received. The IRQ will be
-	 * disabled after the data transmission is finished so not to
-	 * have the TXE IRQ flood at the final stage of the transfer.
-	 */
-	if (irq_status & DW_SPI_INT_TXEI) {
-		dw_writer(dws);
-		if (!dws->tx_len)
-			dw_spi_mask_intr(dws, DW_SPI_INT_TXEI);
-	}
+	dw_spi_mask_intr(dws, DW_SPI_INT_MASK);
 
-	return IRQ_HANDLED;
+	return IRQ_WAKE_THREAD;
 }
 
 static irqreturn_t dw_spi_irq(int irq, void *dev_id)
@@ -944,13 +967,6 @@ int dw_spi_add_controller(struct device *dev, struct dw_spi *dws)
 	/* Basic HW init */
 	dw_spi_hw_init(dev, dws);
 
-	ret = request_irq(dws->irq, dw_spi_irq, IRQF_SHARED, dev_name(dev),
-			  ctlr);
-	if (ret < 0 && ret != -ENOTCONN) {
-		dev_err(dev, "can not request IRQ\n");
-		goto err_free_ctlr;
-	}
-
 	dw_spi_init_mem_ops(dws);
 
 	ctlr->mode_bits = SPI_CPOL | SPI_CPHA;
@@ -990,7 +1006,7 @@ int dw_spi_add_controller(struct device *dev, struct dw_spi *dws)
 	if (dws->dma_ops && dws->dma_ops->dma_init) {
 		ret = dws->dma_ops->dma_init(dev, dws);
 		if (ret == -EPROBE_DEFER) {
-			goto err_free_irq;
+			goto err_free_ctlr;
 		} else if (ret) {
 			dev_warn(dev, "DMA init failed\n");
 		} else {
@@ -999,6 +1015,13 @@ int dw_spi_add_controller(struct device *dev, struct dw_spi *dws)
 		}
 	}
 
+	ret = request_threaded_irq(dws->irq, dw_spi_irq, dw_spi_irq_thread_fn,
+				   IRQF_SHARED, dev_name(dev), ctlr);
+	if (ret < 0 && ret != -ENOTCONN) {
+		dev_err(dev, "can not request IRQ\n");
+		goto err_free_ctlr;
+	}
+
 	ret = spi_register_controller(ctlr);
 	if (ret) {
 		dev_err_probe(dev, ret, "problem registering spi controller\n");
@@ -1012,7 +1035,6 @@ int dw_spi_add_controller(struct device *dev, struct dw_spi *dws)
 	if (dws->dma_ops && dws->dma_ops->dma_exit)
 		dws->dma_ops->dma_exit(dws);
 	dw_spi_enable_chip(dws, 0);
-err_free_irq:
 	free_irq(dws->irq, ctlr);
 err_free_ctlr:
 	spi_controller_put(ctlr);
-- 
2.53.0
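
The "keep going until neither direction made progress" loop from the patch above can be modelled in user space. This is a deliberately idealized sketch, not the driver: the FIFO depth, the struct and all function names are hypothetical, the bus is assumed to loop every queued Tx word back into Rx instantly, and the TXEI check is omitted so that tx is assigned on every pass.

```c
#include <stdint.h>

#define MODEL_FIFO_DEPTH 16u	/* hypothetical FIFO depth */

struct model_spi {
	uint32_t tx_len;	/* words still to be queued for transmit */
	uint32_t rx_len;	/* words still to be read back */
	uint32_t tx_fifo;	/* words currently in the Tx FIFO */
	uint32_t rx_fifo;	/* words currently in the Rx FIFO */
};

static uint32_t model_min(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/* Pop whatever is available from the Rx FIFO; return the word count. */
uint32_t model_reader(struct model_spi *s)
{
	uint32_t n = model_min(s->rx_len, s->rx_fifo);

	s->rx_fifo -= n;
	s->rx_len -= n;
	return n;
}

/* Fill the free Tx FIFO space; return the word count. */
uint32_t model_writer(struct model_spi *s)
{
	uint32_t n = model_min(s->tx_len, MODEL_FIFO_DEPTH - s->tx_fifo);

	s->tx_fifo += n;
	s->tx_len -= n;
	return n;
}

/* Idealized bus: queued Tx words are looped back into Rx at once. */
void model_bus_shift(struct model_spi *s)
{
	uint32_t n = model_min(s->tx_fifo, MODEL_FIFO_DEPTH - s->rx_fifo);

	s->tx_fifo -= n;
	s->rx_fifo += n;
}

/* Thread body: loop until a pass moves no data; return the pass count. */
uint32_t model_irq_thread(struct model_spi *s)
{
	uint32_t rx, tx, passes = 0;

	do {
		rx = model_reader(s);
		tx = model_writer(s);
		model_bus_shift(s);
		passes++;
	} while (rx != 0 || tx != 0);

	return passes;
}
```

With a 40-word transfer the model finishes in a single thread invocation. On real hardware shifting takes time, so the loop may exit early and rely on the re-enabled interrupts to be woken again.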



* Re: [PATCH 5/5] spi: dw: use threaded interrupt and optimize the threaded ISR
  2026-06-15  4:40 ` [PATCH 5/5] spi: dw: use threaded interrupt and optimize the threaded ISR Jisheng Zhang
@ 2026-06-15  5:37   ` Christophe JAILLET
  0 siblings, 0 replies; 7+ messages in thread
From: Christophe JAILLET @ 2026-06-15  5:37 UTC (permalink / raw)
  To: jszhang; +Cc: broonie, linux-kernel, linux-spi

On 15/06/2026 at 06:40, Jisheng Zhang wrote:
> To avoid blocking for an excessive amount of time, eventually impacting
> on system responsiveness, hard interrupt handlers should finish
> executing in as little time as possible.
> 
> Use threaded interrupt and move the SPI transfer handling to an
> interrupt thread.
> 
> After that, since dw_reader() and dw_writer() are now called from the
> threaded ISR, we can delay unmasking the interrupts until no more rx or
> tx progress is made, which reduces the number of interrupts further.
> 
> Tested with the two command groups below:
> ./spidev_test -D /dev/spidev1.3 -s 30000000 -S 327680 -I 1
> 
> ./spidev_test -D /dev/spidev1.3 -s 30000000 -S 327680 -I 1000
> ./rtla timerlat top -q -k -P f:95
> 
> The first command checks the result of the interrupt count optimization;
> the second command group checks the threaded interrupt improvement.
> 
> Before the patch:
> each 32KB spi spidev_test transfer triggers 33118 interrupts
> 
> spidev_test reports ~22090kbps
> and rtla reports:
>                                       Timer Latency
>    0 00:00:37   |          IRQ Timer Latency (us)        |         Thread Timer Latency (us)
> CPU COUNT      |      cur       min       avg       max |      cur       min       avg       max
>    0 #9958      |        1         0        67    103394 |        6         4      2198    105031
>    1 #36902     |        1         0         1        18 |        5         4         5        29
> 
> After the patch:
> each 32KB spi spidev_test transfer only triggers 3 interrupts
> 
> spidev_test reports ~23520kbps
> and now rtla reports:
>                                      Timer Latency
>    0 00:00:58   |          IRQ Timer Latency (us)        |         Thread Timer Latency (us)
> CPU COUNT      |      cur       min       avg       max |      cur       min       avg       max
>    0 #58362     |        1         0         0        29 |        6         3         4        56
>    1 #58363     |        1         0         1        23 |        6         4         5        68
> 
> In summary:
> before the patch        after the patch
> 33118 interrupts        3 interrupts            reduced by 11038 times!
> 103394 us max latency   29 us max latency       reduced by 3564 times!
> 22090 kbps              23520 kbps              improved by 6.5%
> 
> Signed-off-by: Jisheng Zhang <jszhang-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> ---
>   drivers/spi/spi-dw-core.c | 100 +++++++++++++++++++++++---------------
>   1 file changed, 61 insertions(+), 39 deletions(-)
> 
> diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c
> index feac17655847..cc4cc058ee72 100644
> --- a/drivers/spi/spi-dw-core.c
> +++ b/drivers/spi/spi-dw-core.c
> @@ -132,10 +132,11 @@ static inline u32 dw_spi_rx_max(struct dw_spi *dws)
>   	return min_t(u32, dws->rx_len, dw_readl(dws, DW_SPI_RXFLR));
>   }
>   
> -static void dw_writer(struct dw_spi *dws)
> +static u32 dw_writer(struct dw_spi *dws)
>   {
>   	u32 max = dw_spi_tx_max(dws);
>   	u32 txw = 0;
> +	u32 tx = 0;
>   
>   	while (max--) {
>   		if (dws->tx) {
> @@ -150,13 +151,16 @@ static void dw_writer(struct dw_spi *dws)
>   		}
>   		dw_write_io_reg(dws, DW_SPI_DR, txw);
>   		--dws->tx_len;
> +		++tx;
>   	}
> +	return tx;
>   }
>   
> -static void dw_reader(struct dw_spi *dws)
> +static u32 dw_reader(struct dw_spi *dws)
>   {
>   	u32 max = dw_spi_rx_max(dws);
>   	u32 rxw;
> +	u32 rx = 0;
>   
>   	while (max--) {
>   		rxw = dw_read_io_reg(dws, DW_SPI_DR);
> @@ -171,7 +175,9 @@ static void dw_reader(struct dw_spi *dws)
>   			dws->rx += dws->n_bytes;
>   		}
>   		--dws->rx_len;
> +		++rx;
>   	}
> +	return rx;
>   }
>   
>   int dw_spi_check_status(struct dw_spi *dws, bool raw)
> @@ -210,42 +216,59 @@ int dw_spi_check_status(struct dw_spi *dws, bool raw)
>   }
>   EXPORT_SYMBOL_NS_GPL(dw_spi_check_status, "SPI_DW_CORE");
>   
> -static irqreturn_t dw_spi_transfer_handler(struct dw_spi *dws)
> +static irqreturn_t dw_spi_irq_thread_fn(int irq, void *dev_id)
>   {
> -	u16 irq_status = dw_readl(dws, DW_SPI_ISR);
> +	struct spi_controller *ctlr = dev_id;
> +	struct dw_spi *dws = spi_controller_get_devdata(ctlr);
> +	u16 irq_status = dw_readl(dws, DW_SPI_RISR);
> +	u32 rx, tx, imask, mask = 0;
> +
> +	do {
> +		/*
> +		 * Read data from the Rx FIFO every time we've got a chance executing
> +		 * this method. If there is nothing left to receive, terminate the
> +		 * procedure. Otherwise adjust the Rx FIFO Threshold level if it's a
> +		 * final stage of the transfer. By doing so we'll get the next IRQ
> +		 * right when the leftover incoming data is received.
> +		 */
> +		rx = dw_reader(dws);
> +		if (!dws->rx_len) {
> +			mask = DW_SPI_INT_MASK;
> +			spi_finalize_current_transfer(dws->ctlr);
> +		} else if (dws->rx_len <= dw_readl(dws, DW_SPI_RXFTLR)) {
> +			dw_writel(dws, DW_SPI_RXFTLR, dws->rx_len - 1);
> +		}
> +
> +		/*
> +		 * Send data out if Tx FIFO Empty IRQ is received. The IRQ will be
> +		 * disabled after the data transmission is finished so not to
> +		 * have the TXE IRQ flood at the final stage of the transfer.
> +		 */
> +		if (irq_status & DW_SPI_INT_TXEI) {
> +			tx = dw_writer(dws);
> +			if (!dws->tx_len)
> +				mask = DW_SPI_INT_TXEI;
> +		}
> +	} while (rx != 0 || tx != 0);
> +
> +	imask = DW_SPI_INT_TXEI | DW_SPI_INT_TXOI |
> +		DW_SPI_INT_RXUI | DW_SPI_INT_RXOI | DW_SPI_INT_RXFI;
> +	imask &= ~mask;
> +	dw_spi_umask_intr(dws, imask);
>   
> +	return IRQ_HANDLED;
> +}
> +
> +static irqreturn_t dw_spi_transfer_handler(struct dw_spi *dws)
> +{
>   	if (dw_spi_check_status(dws, false)) {
>   		spi_finalize_current_transfer(dws->ctlr);
>   		return IRQ_HANDLED;
>   	}
>   
> -	/*
> -	 * Read data from the Rx FIFO every time we've got a chance executing
> -	 * this method. If there is nothing left to receive, terminate the
> -	 * procedure. Otherwise adjust the Rx FIFO Threshold level if it's a
> -	 * final stage of the transfer. By doing so we'll get the next IRQ
> -	 * right when the leftover incoming data is received.
> -	 */
> -	dw_reader(dws);
> -	if (!dws->rx_len) {
> -		dw_spi_mask_intr(dws, DW_SPI_INT_MASK);
> -		spi_finalize_current_transfer(dws->ctlr);
> -	} else if (dws->rx_len <= dw_readl(dws, DW_SPI_RXFTLR)) {
> -		dw_writel(dws, DW_SPI_RXFTLR, dws->rx_len - 1);
> -	}
> -
> -	/*
> -	 * Send data out if Tx FIFO Empty IRQ is received. The IRQ will be
> -	 * disabled after the data transmission is finished so not to
> -	 * have the TXE IRQ flood at the final stage of the transfer.
> -	 */
> -	if (irq_status & DW_SPI_INT_TXEI) {
> -		dw_writer(dws);
> -		if (!dws->tx_len)
> -			dw_spi_mask_intr(dws, DW_SPI_INT_TXEI);
> -	}
> +	dw_spi_mask_intr(dws, DW_SPI_INT_MASK);
>   
> -	return IRQ_HANDLED;
> +	return IRQ_WAKE_THREAD;
>   }
>   
>   static irqreturn_t dw_spi_irq(int irq, void *dev_id)
> @@ -944,13 +967,6 @@ int dw_spi_add_controller(struct device *dev, struct dw_spi *dws)
>   	/* Basic HW init */
>   	dw_spi_hw_init(dev, dws);
>   
> -	ret = request_irq(dws->irq, dw_spi_irq, IRQF_SHARED, dev_name(dev),
> -			  ctlr);
> -	if (ret < 0 && ret != -ENOTCONN) {
> -		dev_err(dev, "can not request IRQ\n");
> -		goto err_free_ctlr;
> -	}
> -
>   	dw_spi_init_mem_ops(dws);
>   
>   	ctlr->mode_bits = SPI_CPOL | SPI_CPHA;
> @@ -990,7 +1006,7 @@ int dw_spi_add_controller(struct device *dev, struct dw_spi *dws)
>   	if (dws->dma_ops && dws->dma_ops->dma_init) {
>   		ret = dws->dma_ops->dma_init(dev, dws);
>   		if (ret == -EPROBE_DEFER) {
> -			goto err_free_irq;
> +			goto err_free_ctlr;
>   		} else if (ret) {
>   			dev_warn(dev, "DMA init failed\n");
>   		} else {
> @@ -999,6 +1015,13 @@ int dw_spi_add_controller(struct device *dev, struct dw_spi *dws)
>   		}
>   	}
>   
> +	ret = request_threaded_irq(dws->irq, dw_spi_irq, dw_spi_irq_thread_fn,
> +				   IRQF_SHARED, dev_name(dev), ctlr);
> +	if (ret < 0 && ret != -ENOTCONN) {
> +		dev_err(dev, "can not request IRQ\n");
> +		goto err_free_ctlr;

Hi,

I guess that the error handling path should be updated to move 
free_irq() a few lines above.

Here, IIUC, we should jump to err_dma_exit to undo the dma_init stuff, 
but without calling free_irq().

> +	}
> +
>   	ret = spi_register_controller(ctlr);
>   	if (ret) {
>   		dev_err_probe(dev, ret, "problem registering spi controller\n");
> @@ -1012,7 +1035,6 @@ int dw_spi_add_controller(struct device *dev, struct dw_spi *dws)
>   	if (dws->dma_ops && dws->dma_ops->dma_exit)
>   		dws->dma_ops->dma_exit(dws);
>   	dw_spi_enable_chip(dws, 0);
> -err_free_irq:
>   	free_irq(dws->irq, ctlr);
>   err_free_ctlr:
>   	spi_controller_put(ctlr);


Also, but completely unrelated to this patch, dw_spi_enable_chip(0) and 
dw_spi_enable_chip(1) seem to be paired.

Is it on purpose that at [1], it is left disabled?

[1]: 
https://elixir.bootlin.com/linux/v7.1-rc7/source/drivers/spi/spi-dw-core.c#L453

Just my 2c,

CJ
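
The unwinding order suggested in the review comment on the error path can be sketched as a small user-space model. Everything here is hypothetical and only illustrates the ordering of the labels: the enum, the trace struct and model_probe() are made up for the example, and DMA init is assumed to have succeeded.

```c
#include <stdbool.h>

/* Records which cleanup steps ran during unwinding. */
struct probe_trace {
	bool dma_exit_called;
	bool free_irq_called;
};

/* Which step of the modelled probe sequence should fail. */
enum fail_at { FAIL_NONE, FAIL_REQUEST_IRQ, FAIL_REGISTER };

int model_probe(enum fail_at fail, struct probe_trace *t)
{
	int ret = 0;

	t->dma_exit_called = false;
	t->free_irq_called = false;

	/* Step 1: DMA init, assumed to succeed in this model. */

	/* Step 2: request the IRQ. */
	if (fail == FAIL_REQUEST_IRQ) {
		ret = -1;
		/* The IRQ was never obtained, so free_irq() is skipped. */
		goto err_dma_exit;
	}

	/* Step 3: register the controller. */
	if (fail == FAIL_REGISTER) {
		ret = -1;
		goto err_free_irq;
	}

	return 0;

err_free_irq:
	t->free_irq_called = true;
err_dma_exit:
	t->dma_exit_called = true;
	return ret;
}
```

Placing the IRQ release above the DMA cleanup label means a failed IRQ request still undoes the DMA setup without releasing an IRQ that was never acquired.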
