The Linux Kernel Mailing List
 help / color / mirror / Atom feed
* [PATCH 0/2] serial: sc16is7xx: fix TX inter-frame gaps on SPI UARTs
@ 2026-06-23 11:22 Paul Mbewe
  2026-06-23 11:22 ` [PATCH 1/2] serial: sc16is7xx: fix TX gap caused by kfifo circular buffer wrap-around Paul Mbewe
  2026-06-23 11:22 ` [PATCH 2/2] serial: sc16is7xx: set TX FIFO trigger level to half FIFO to prevent underruns Paul Mbewe
  0 siblings, 2 replies; 13+ messages in thread
From: Paul Mbewe @ 2026-06-23 11:22 UTC (permalink / raw)
  To: linux-serial, linux-kernel; +Cc: gregkh, jirislaby, hvilleneuve, Paul Mbewe

This series fixes two related TX path issues in the sc16is7xx SPI UART
driver that cause visible inter-frame gaps on the wire, breaking Modbus
RTU communication.

Patch 1 fixes a data loss / gap caused by kfifo_out_linear_ptr() only
returning one contiguous segment of the circular buffer. When TX data
wraps around the kfifo boundary, the remainder is delayed until the next
TX interrupt, producing a silence gap that violates the Modbus 1.5
character-time limit.

Patch 2 raises the TX FIFO trigger level from 8 to 32 free spaces via
the TLR register. The default trigger of 8 causes excessive SPI
round-trips on slow single-core hosts, leading to TX FIFO underruns
between bursts.

Tested on SC16IS752 (SPI, 1 MHz) on i.MX6ULL driving RS-485 at 115200
baud. Oscilloscope confirmed both gap types are eliminated after the
fix.

Paul Mbewe (2):
  serial: sc16is7xx: fix TX gap caused by kfifo circular buffer
    wrap-around
  serial: sc16is7xx: set TX FIFO trigger level to half FIFO to prevent
    underruns

 drivers/tty/serial/sc16is7xx.c | 24 +++++++++++++++++++++---
 1 file changed, 21 insertions(+), 3 deletions(-)

-- 
2.43.0


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/2] serial: sc16is7xx: fix TX gap caused by kfifo circular buffer wrap-around
  2026-06-23 11:22 [PATCH 0/2] serial: sc16is7xx: fix TX inter-frame gaps on SPI UARTs Paul Mbewe
@ 2026-06-23 11:22 ` Paul Mbewe
  2026-06-23 11:22 ` [PATCH 2/2] serial: sc16is7xx: set TX FIFO trigger level to half FIFO to prevent underruns Paul Mbewe
  1 sibling, 0 replies; 13+ messages in thread
From: Paul Mbewe @ 2026-06-23 11:22 UTC (permalink / raw)
  To: linux-serial, linux-kernel
  Cc: gregkh, jirislaby, hvilleneuve, Paul Mbewe, stable,
	Tobias Gannert, Joachim Knorr

kfifo_out_linear_ptr() returns only one contiguous linear segment of the
circular kfifo buffer. When transmit data wraps around the end of the
buffer, only the first segment (up to the buffer end) is sent. The
remaining data at the start of the buffer is not sent until the next TX
interrupt fires, resulting in a visible inter-frame gap on the wire.

This gap violates the Modbus RTU 1.5 character-time inter-character
silence limit. Receivers interpret any silence exceeding 1.5 character
times as an end-of-frame marker, splitting a single valid frame into
two malformed fragments and corrupting communication on the bus.

The incomplete transfer also causes unnecessary TX interrupts: instead
of draining the full available FIFO space in one pass, the driver fires
an extra interrupt per wrap-around just to send the remaining bytes.

The pre-kfifo code handled wrap-around by copying bytes one at a time
from the circ_buf into a linear staging buffer. The conversion to kfifo
replaced this with a single kfifo_out_linear_ptr() call, losing the
wrap-around handling. The max310x driver (a similar SPI UART) correctly
handles this with a while loop.

Fix this by calling kfifo_out_linear_ptr() in a loop, advancing through
all contiguous segments until the available TX FIFO space is exhausted
or the kfifo is empty.

Tested on SC16IS752 (SPI) driving RS-485 at 115200 baud 8N1 on an
i.MX6ULL based board. Oscilloscope confirmed mid-frame breaks at the
kfifo wrap-around boundary before the fix; no breaks observed after.

Fixes: 1788cf6a91d9 ("tty: serial: switch from circ_buf to kfifo")
Cc: stable@vger.kernel.org
Reported-by: Tobias Gannert <tobias.gannert@ziehl-abegg.de>
Tested-by: Tobias Gannert <tobias.gannert@ziehl-abegg.de>
Reviewed-by: Joachim Knorr <joachim.knorr@ziehl-abegg.de>
Signed-off-by: Paul Mbewe <paultyson.mbewe@ziehl-abegg.de>
---
 drivers/tty/serial/sc16is7xx.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/tty/serial/sc16is7xx.c b/drivers/tty/serial/sc16is7xx.c
index 1a2c4c14f6aa..395a219280be 100644
--- a/drivers/tty/serial/sc16is7xx.c
+++ b/drivers/tty/serial/sc16is7xx.c
@@ -730,9 +730,17 @@ static void sc16is7xx_handle_tx(struct uart_port *port)
 		txlen = 0;
 	}
 
-	txlen = kfifo_out_linear_ptr(&tport->xmit_fifo, &tail, txlen);
-	sc16is7xx_fifo_write(port, tail, txlen);
-	uart_xmit_advance(port, txlen);
+	/* Handle circular buffer wrap-around by sending in contiguous segments */
+	while (txlen > 0 && !kfifo_is_empty(&tport->xmit_fifo)) {
+		unsigned int to_send;
+
+		to_send = kfifo_out_linear_ptr(&tport->xmit_fifo, &tail, txlen);
+		if (!to_send)
+			break;
+		sc16is7xx_fifo_write(port, tail, to_send);
+		uart_xmit_advance(port, to_send);
+		txlen -= to_send;
+	}
 
 	uart_port_lock_irqsave(port, &flags);
 	if (kfifo_len(&tport->xmit_fifo) < WAKEUP_CHARS)
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/2] serial: sc16is7xx: set TX FIFO trigger level to half FIFO to prevent underruns
  2026-06-23 11:22 [PATCH 0/2] serial: sc16is7xx: fix TX inter-frame gaps on SPI UARTs Paul Mbewe
  2026-06-23 11:22 ` [PATCH 1/2] serial: sc16is7xx: fix TX gap caused by kfifo circular buffer wrap-around Paul Mbewe
@ 2026-06-23 11:22 ` Paul Mbewe
  2026-06-23 12:45   ` David Laight
  1 sibling, 1 reply; 13+ messages in thread
From: Paul Mbewe @ 2026-06-23 11:22 UTC (permalink / raw)
  To: linux-serial, linux-kernel
  Cc: gregkh, jirislaby, hvilleneuve, Paul Mbewe, stable,
	Tobias Gannert, Joachim Knorr

The THRI interrupt (IER[1]) fires when the TX FIFO free space reaches the
configured threshold. With the reset default (TLR=0), the chip falls back
to the FCR TX trigger of 8 free spaces (FCR[5:4]=00), causing THRI to
assert after every 8 bytes drain from the FIFO.

At 115200 baud 8N1, 8 bytes drain in 694 us. On slow single-core SPI
hosts, the combined latency of an SPI IIR read, TXLVL read, and 8-byte
THR write per interrupt, plus kthread scheduling jitter, can exceed this
window on a loaded system. When the kthread cannot refill the FIFO within
694 us, the FIFO empties and produces an idle gap on the TX line.

This violates the Modbus RTU specification, which treats any intra-frame
silence longer than 1.5 character times (~130 us at 115200 baud) as a
frame boundary, causing receivers to fragment frames and report CRC errors.
Oscilloscope measurements confirmed a 757 us inter-burst gap during
continuous transmission without this fix.

Setting the TX trigger to 32 free spaces (half FIFO) via TLR[3:0]=8
widens the refill window to 2778 us at 115200 baud, reducing THRI events
per 256-byte frame from ~32 to ~8 and eliminating the underrun.

Only TLR[3:0] is written; TLR[7:4] is left at zero, so the RX trigger
retains its FCR default. Only TX interrupt timing is affected.

While increasing the SPI clock would also reduce per-round-trip latency,
the driver should work correctly regardless of SPI speed. The fix belongs
in the driver.

Tested on i.MX6ULL (ARM Cortex-A7, single-core) with SC16IS752IBS over
SPI at 1 MHz, 115200 baud 8N1, 256-byte Modbus RTU frames under production
load:

  IRQ thread (irq/134-spi2.0):  ~15-17%  ->  ~5%    (~67% reduction)
  sys CPU:                       ~51-61%  ->  ~19-28% (~55% reduction)
  load average:                  ~2.0-2.2 ->  ~0.65-1.3

No mid-frame gaps observed after fix. Without fix, oscilloscope confirmed
757 us inter-burst gaps causing Modbus frame fragmentation.

Cc: stable@vger.kernel.org
Reported-by: Tobias Gannert <tobias.gannert@ziehl-abegg.de>
Tested-by: Tobias Gannert <tobias.gannert@ziehl-abegg.de>
Reviewed-by: Joachim Knorr <joachim.knorr@ziehl-abegg.de>
Signed-off-by: Paul Mbewe <paultyson.mbewe@ziehl-abegg.de>
---
 drivers/tty/serial/sc16is7xx.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/tty/serial/sc16is7xx.c b/drivers/tty/serial/sc16is7xx.c
index 395a219280be..476e0dd3fa7f 100644
--- a/drivers/tty/serial/sc16is7xx.c
+++ b/drivers/tty/serial/sc16is7xx.c
@@ -1196,6 +1196,16 @@ static int sc16is7xx_startup(struct uart_port *port)
 			     SC16IS7XX_TCR_RX_RESUME(24) |
 			     SC16IS7XX_TCR_RX_HALT(48));
 
+	/*
+	 * Set TX FIFO trigger level to 32 spaces (half FIFO) via TLR. The reset
+	 * default (TLR=0) falls back to the FCR TX trigger of 8 free spaces,
+	 * requiring ~8 SPI round-trips per 64-byte FIFO load. On slow single-core
+	 * SPI hosts, this accumulated latency can cause a TX FIFO underrun gap
+	 * between bursts.
+	 */
+	sc16is7xx_port_write(port, SC16IS7XX_TLR_REG,
+			     SC16IS7XX_TLR_TX_TRIGGER(32));
+
 	/* Disable TCR/TLR access */
 	sc16is7xx_port_update(port, SC16IS7XX_MCR_REG, SC16IS7XX_MCR_TCRTLR_BIT, 0);
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] serial: sc16is7xx: set TX FIFO trigger level to half FIFO to prevent underruns
  2026-06-23 11:22 ` [PATCH 2/2] serial: sc16is7xx: set TX FIFO trigger level to half FIFO to prevent underruns Paul Mbewe
@ 2026-06-23 12:45   ` David Laight
  2026-06-23 14:01     ` Paul Mbewe
  0 siblings, 1 reply; 13+ messages in thread
From: David Laight @ 2026-06-23 12:45 UTC (permalink / raw)
  To: Paul Mbewe
  Cc: linux-serial, linux-kernel, gregkh, jirislaby, hvilleneuve,
	stable, Tobias Gannert, Joachim Knorr

On Tue, 23 Jun 2026 13:22:25 +0200
Paul Mbewe <paultyson.mbewe@ziehl-abegg.de> wrote:

> The THRI interrupt (IER[1]) fires when the TX FIFO free space reaches the
> configured threshold. With the reset default (TLR=0), the chip falls back
> to the FCR TX trigger of 8 free spaces (FCR[5:4]=00), causing THRI to
> assert after every 8 bytes drain from the FIFO.
> 
> At 115200 baud 8N1, 8 bytes drain in 694 us. On slow single-core SPI
> hosts, the combined latency of an SPI IIR read, TXLVL read, and 8-byte
> THR write per interrupt, plus kthread scheduling jitter, can exceed this
> window on a loaded system. When the kthread cannot refill the FIFO within
> 694 us, the FIFO empties and produces an idle gap on the TX line.

That seems strange.
The earlier interrupt (8 free fifo spaces) ought to make it less likely
that the fifo will underrun.

Are you sure it isn't the other way around?
So the interrupt happens when there are 8 bytes left in the fifo.
With a 64 byte fifo each interrupt would fill at least 56 bytes.
Increasing it to 32 (bytes left in the fifo) gives more time for the
interrupt latency (etc), but reduces the number of bytes written by
each isr to at least 32.

The other possibility is some error passing the level sensitive 'fifo empty'
state through to scheduling the kthread.

It may well be that the system interrupt latency is small enough that you
can take the interrupt at 'half full', and that the associated reduction
in the number of interrupts makes the system behave better.
But this isn't what the commit message says.

	David

> 
> This violates the Modbus RTU specification, which treats any intra-frame
> silence longer than 1.5 character times (~130 us at 115200 baud) as a
> frame boundary, causing receivers to fragment frames and report CRC errors.
> Oscilloscope measurements confirmed a 757 us inter-burst gap during
> continuous transmission without this fix.
> 
> Setting the TX trigger to 32 free spaces (half FIFO) via TLR[3:0]=8
> widens the refill window to 2778 us at 115200 baud, reducing THRI events
> per 256-byte frame from ~32 to ~8 and eliminating the underrun.
> 
> Only TLR[3:0] is written; TLR[7:4] is left at zero, so the RX trigger
> retains its FCR default. Only TX interrupt timing is affected.
> 
> While increasing the SPI clock would also reduce per-round-trip latency,
> the driver should work correctly regardless of SPI speed. The fix belongs
> in the driver.
> 
> Tested on i.MX6ULL (ARM Cortex-A7, single-core) with SC16IS752IBS over
> SPI at 1 MHz, 115200 baud 8N1, 256-byte Modbus RTU frames under production
> load:
> 
>   IRQ thread (irq/134-spi2.0):  ~15-17%  ->  ~5%    (~67% reduction)
>   sys CPU:                       ~51-61%  ->  ~19-28% (~55% reduction)
>   load average:                  ~2.0-2.2 ->  ~0.65-1.3
> 
> No mid-frame gaps observed after fix. Without fix, oscilloscope confirmed
> 757 us inter-burst gaps causing Modbus frame fragmentation.
> 
> Cc: stable@vger.kernel.org
> Reported-by: Tobias Gannert <tobias.gannert@ziehl-abegg.de>
> Tested-by: Tobias Gannert <tobias.gannert@ziehl-abegg.de>
> Reviewed-by: Joachim Knorr <joachim.knorr@ziehl-abegg.de>
> Signed-off-by: Paul Mbewe <paultyson.mbewe@ziehl-abegg.de>
> ---
>  drivers/tty/serial/sc16is7xx.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/drivers/tty/serial/sc16is7xx.c b/drivers/tty/serial/sc16is7xx.c
> index 395a219280be..476e0dd3fa7f 100644
> --- a/drivers/tty/serial/sc16is7xx.c
> +++ b/drivers/tty/serial/sc16is7xx.c
> @@ -1196,6 +1196,16 @@ static int sc16is7xx_startup(struct uart_port *port)
>  			     SC16IS7XX_TCR_RX_RESUME(24) |
>  			     SC16IS7XX_TCR_RX_HALT(48));
>  
> +	/*
> +	 * Set TX FIFO trigger level to 32 spaces (half FIFO) via TLR. The reset
> +	 * default (TLR=0) falls back to the FCR TX trigger of 8 free spaces,
> +	 * requiring ~8 SPI round-trips per 64-byte FIFO load. On slow single-core
> +	 * SPI hosts, this accumulated latency can cause a TX FIFO underrun gap
> +	 * between bursts.
> +	 */
> +	sc16is7xx_port_write(port, SC16IS7XX_TLR_REG,
> +			     SC16IS7XX_TLR_TX_TRIGGER(32));
> +
>  	/* Disable TCR/TLR access */
>  	sc16is7xx_port_update(port, SC16IS7XX_MCR_REG, SC16IS7XX_MCR_TCRTLR_BIT, 0);
>  


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] serial: sc16is7xx: set TX FIFO trigger level to half FIFO to prevent underruns
  2026-06-23 12:45   ` David Laight
@ 2026-06-23 14:01     ` Paul Mbewe
  2026-06-23 15:07       ` David Laight
  0 siblings, 1 reply; 13+ messages in thread
From: Paul Mbewe @ 2026-06-23 14:01 UTC (permalink / raw)
  To: david.laight.linux
  Cc: linux-serial, linux-kernel, gregkh, jirislaby, hvilleneuve,
	stable, tobias.gannert, joachim.knorr, Paul Mbewe

Hi David,

Thanks for the detailed review.

According to the SC16IS7xx datasheet, the TX trigger level is defined
in terms of free FIFO spaces, not remaining data. So with the default
configuration (FCR[5:4] = 00), the THRI interrupt fires when the FIFO
has 8 free entries, i.e. when it still contains 56 bytes.

While this in theory leaves enough data in the FIFO, in practice the
system has to service many small refill cycles (~8 bytes per interrupt).
On slow SPI hosts, each cycle involves threaded interrupt handling and
multiple SPI transactions, and the cumulative latency plus scheduling
jitter can exceed the available margin between refills under load.

Increasing the trigger level to 32 free spaces reduces the number of
refill cycles significantly (from ~8 per FIFO load to ~2), and increases
the amount of data written per cycle. This reduces scheduling pressure
and, in practice, avoids the FIFO draining to empty between bursts.

The current commit message focuses too much on the "refill window" and
does not explain this aspect clearly. I can rephrase it to better
describe the interaction between trigger level, refill granularity and
system latency.

Thanks,
Paul

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] serial: sc16is7xx: set TX FIFO trigger level to half FIFO to prevent underruns
  2026-06-23 14:01     ` Paul Mbewe
@ 2026-06-23 15:07       ` David Laight
  2026-06-23 17:13         ` Paul Mbewe
  0 siblings, 1 reply; 13+ messages in thread
From: David Laight @ 2026-06-23 15:07 UTC (permalink / raw)
  To: Paul Mbewe
  Cc: linux-serial, linux-kernel, gregkh, jirislaby, hvilleneuve,
	stable, tobias.gannert, joachim.knorr

On Tue, 23 Jun 2026 16:01:55 +0200
Paul Mbewe <paultyson.mbewe@ziehl-abegg.de> wrote:

> Hi David,
> 
> Thanks for the detailed review.
> 
> According to the SC16IS7xx datasheet, the TX trigger level is defined
> in terms of free FIFO spaces, not remaining data. So with the default
> configuration (FCR[5:4] = 00), the THRI interrupt fires when the FIFO
> has 8 free entries, i.e. when it still contains 56 bytes.
> 
> While this in theory leaves enough data in the FIFO, in practice the
> system has to service many small refill cycles (~8 bytes per interrupt).
> On slow SPI hosts, each cycle involves threaded interrupt handling and
> multiple SPI transactions, and the cumulative latency plus scheduling
> jitter can exceed the available margin between refills under load.

But that cost/time is much the same regardless of the trigger level.
Changing the level from 8 to 32 significantly reduces the allowable
latency.

> Increasing the trigger level to 32 free spaces reduces the number of
> refill cycles significantly (from ~8 per FIFO load to ~2), and increases
> the amount of data written per cycle. This reduces scheduling pressure
> and, in practice, avoids the FIFO draining to empty between bursts.

But shouldn't it should all catch up.
The isr thread should start finding more than 8 bytes space in the fifo,
write enough bytes to fill it and the next interrupt should happen about
8 byte times after the previous one finishes.
That does rely on nothing 'going wrong' between the hardware interrupt
and the isr thread.

What priority does the isr thread run at?
If it isn't running at an RT priority then the scheduler might decide
to reduce its priority which could easily generate what you are seeing.

I'll agree that changing the threshold reduces system load - so should
give extra time for 'other work'.
But I don't really see why it be the correct way to stop underruns.

	David

> 
> The current commit message focuses too much on the "refill window" and
> does not explain this aspect clearly. I can rephrase it to better
> describe the interaction between trigger level, refill granularity and
> system latency.
> 
> Thanks,
> Paul
> 
> _______________________________________ 
> 
> ZIEHL-ABEGG 
> 
> Executive Board: Joachim Ley (Chairman), Marco Altherr, Wolfgang Mayer
> Supervisory Board: Dennis Ziehl (Chairman) 
> 
> Court of Registry: District Court Stuttgart HRB 746188
> Company Seat: Künzelsau, Germany 
> 
> Der Inhalt dieser E-Mail und/oder jegliche Anhänge können vertrauliche Mitteilungen enthalten und sind ausschließlich für den bezeichneten Adressaten bestimmt.
> Wenn Sie nicht der vorgesehene Adressat dieser E-Mail oder dessen Vertreter sein sollten, so beachten Sie bitte, dass jede Form der Kenntnisnahme, Veröffentlichung,
> Vervielfältigung oder Weitergabe des Inhalts dieser E-Mail einschließlich Anhängen unzulässig ist.
> Wir bitten Sie, sich in diesem Fall mit dem Absender der E-Mail in Verbindung zu setzen. 
> 
> The content of this e-mail and/or attachments may contain confidential information and is intended solely for the named recipient.
> If you are not the intended recipient of this e-mail or on its distribution list, please note that any type of disclosure, publication,
> copying or distribution of the content of this e-mail including attachments is strictly forbidden.
> In this case, we would kindly ask you to notify the sender of the e-mail.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] serial: sc16is7xx: set TX FIFO trigger level to half FIFO to prevent underruns
  2026-06-23 15:07       ` David Laight
@ 2026-06-23 17:13         ` Paul Mbewe
  2026-06-23 18:42           ` David Laight
  0 siblings, 1 reply; 13+ messages in thread
From: Paul Mbewe @ 2026-06-23 17:13 UTC (permalink / raw)
  To: david.laight.linux
  Cc: linux-serial, linux-kernel, gregkh, jirislaby, hvilleneuve,
	stable, tobias.gannert, joachim.knorr, Paul Mbewe

Hi David,

On the test system, the relevant threads already run with RT priority:

  irq/134-spi2.0     SCHED_FIFO priority 50
  sc16is7xx          SCHED_FIFO priority 50

So this is not caused by the IRQ thread running as a normal SCHED_OTHER
task. That does not remove all latency sources, of course; on this
single-core SPI system the threaded IRQ can still be affected by other
RT/kernel activity, IRQ/preemption-disabled sections, SPI transfer time,
or lock contention.

I agree, changing the TX trigger from 8 to 32 free spaces reduces the
per-interrupt time-to-empty margin. With an 8-space trigger the FIFO
still contains 56 bytes when THRI asserts, while with a 32-space trigger
it contains 32 bytes. So the v1 commit message is misleading when it
describes this as increasing the refill window.

The observed effect is instead that the 8-space trigger causes many
small TX refill events. Each event has roughly the same cost, as you
said. If the handler runs in time, it can catch up by seeing more than
8 free spaces and writing more data. The failure happens when one event
is delayed long enough for the FIFO to drain.

Using a 32-space trigger reduces the number of refill events and the
associated IRQ/SPI load. It also reduces the chance that one delayed
event lets the FIFO drain completely. On the tested setup this reduced
irq/134-spi2.0 CPU usage from about 15-17% to about 5%, sys CPU from
about 51-61% to about 19-28%, and load average from about 2.0-2.2 to
about 0.65-1.3. With that change, the observed TX gaps disappeared.

So I agree the commit message should be reworked to describe this as
reducing TX refill events and IRQ/SPI load, not as increasing the
per-interrupt latency margin.

If changing the default trigger globally is considered too broad, I can
also look at making the TX trigger configurable or limiting the change to
SPI-backed devices.

Thanks
Paul

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] serial: sc16is7xx: set TX FIFO trigger level to half FIFO to prevent underruns
  2026-06-23 17:13         ` Paul Mbewe
@ 2026-06-23 18:42           ` David Laight
  2026-06-25 12:59             ` Maarten Brock
  0 siblings, 1 reply; 13+ messages in thread
From: David Laight @ 2026-06-23 18:42 UTC (permalink / raw)
  To: Paul Mbewe
  Cc: linux-serial, linux-kernel, gregkh, jirislaby, hvilleneuve,
	stable, tobias.gannert, joachim.knorr

On Tue, 23 Jun 2026 19:13:28 +0200
Paul Mbewe <paultyson.mbewe@ziehl-abegg.de> wrote:

> Hi David,
> 
> On the test system, the relevant threads already run with RT priority:
> 
>   irq/134-spi2.0     SCHED_FIFO priority 50
>   sc16is7xx          SCHED_FIFO priority 50
> 
> So this is not caused by the IRQ thread running as a normal SCHED_OTHER
> task. That does not remove all latency sources, of course; on this
> single-core SPI system the threaded IRQ can still be affected by other
> RT/kernel activity, IRQ/preemption-disabled sections, SPI transfer time,
> or lock contention.
> 
> I agree, changing the TX trigger from 8 to 32 free spaces reduces the
> per-interrupt time-to-empty margin. With an 8-space trigger the FIFO
> still contains 56 bytes when THRI asserts, while with a 32-space trigger
> it contains 32 bytes. So the v1 commit message is misleading when it
> describes this as increasing the refill window.
> 
> The observed effect is instead that the 8-space trigger causes many
> small TX refill events. Each event has roughly the same cost, as you
> said. If the handler runs in time, it can catch up by seeing more than
> 8 free spaces and writing more data. The failure happens when one event
> is delayed long enough for the FIFO to drain.
> 
> Using a 32-space trigger reduces the number of refill events and the
> associated IRQ/SPI load. It also reduces the chance that one delayed
> event lets the FIFO drain completely. On the tested setup this reduced
> irq/134-spi2.0 CPU usage from about 15-17% to about 5%, sys CPU from
> about 51-61% to about 19-28%, and load average from about 2.0-2.2 to
> about 0.65-1.3. With that change, the observed TX gaps disappeared.

Even with those figures (and the cpu % differences seem reasonable)
I still don't see why it stops the underruns.

An interesting exercise would be to call to ftrace_stop() when the
tx fifo is unexpectedly nearly empty.
Then enable all the ftrace scheduler and interrupt events (they don't
normally slow things down that much) and run the test with ftrace enabled.
When the trace gets stopped (you can just 'cat tracing_on' to find out
when it has been stopped) just 'less trace' to see what happened.
I think that will show that the isr thread is scheduled but doesn't
run for a while because something else is 'hogging' the cpu.

I've used this set of events - seemed to be useful.
    Clear old events
echo >set_event

    System call entry and exit:
echo 'syscalls:*' >>set_event

    IRQ entry and exit:
echo irq:irq_handler_entry irq:irq_handler_exit >>set_event
echo irq:softirq_entry irq:softirq_exit >>set_event

    Scheduler process switch events:
echo sched:sched_switch >>set_event
echo sched:sched_wakeup sched:sched_wakeup_new >>set_event
echo sched:sched_migrate_task >>set_event

	David
  
> 
> So I agree the commit message should be reworked to describe this as
> reducing TX refill events and IRQ/SPI load, not as increasing the
> per-interrupt latency margin.
> 
> If changing the default trigger globally is considered too broad, I can
> also look at making the TX trigger configurable or limiting the change to
> SPI-backed devices.
> 
> Thanks
> Paul
> 
> _______________________________________ 
> 
> ZIEHL-ABEGG 
> 
> Executive Board: Joachim Ley (Chairman), Marco Altherr, Wolfgang Mayer
> Supervisory Board: Dennis Ziehl (Chairman) 
> 
> Court of Registry: District Court Stuttgart HRB 746188
> Company Seat: Künzelsau, Germany 
> 
> Der Inhalt dieser E-Mail und/oder jegliche Anhänge können vertrauliche Mitteilungen enthalten und sind ausschließlich für den bezeichneten Adressaten bestimmt.
> Wenn Sie nicht der vorgesehene Adressat dieser E-Mail oder dessen Vertreter sein sollten, so beachten Sie bitte, dass jede Form der Kenntnisnahme, Veröffentlichung,
> Vervielfältigung oder Weitergabe des Inhalts dieser E-Mail einschließlich Anhängen unzulässig ist.
> Wir bitten Sie, sich in diesem Fall mit dem Absender der E-Mail in Verbindung zu setzen. 
> 
> The content of this e-mail and/or attachments may contain confidential information and is intended solely for the named recipient.
> If you are not the intended recipient of this e-mail or on its distribution list, please note that any type of disclosure, publication,
> copying or distribution of the content of this e-mail including attachments is strictly forbidden.
> In this case, we would kindly ask you to notify the sender of the e-mail.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [PATCH 2/2] serial: sc16is7xx: set TX FIFO trigger level to half FIFO to prevent underruns
  2026-06-23 18:42           ` David Laight
@ 2026-06-25 12:59             ` Maarten Brock
  2026-07-01 16:40               ` Paul Mbewe
  0 siblings, 1 reply; 13+ messages in thread
From: Maarten Brock @ 2026-06-25 12:59 UTC (permalink / raw)
  To: David Laight, Paul Mbewe
  Cc: linux-serial@vger.kernel.org, linux-kernel@vger.kernel.org,
	gregkh@linuxfoundation.org, jirislaby@kernel.org,
	hvilleneuve@dimonoff.com, stable@vger.kernel.org,
	tobias.gannert@ziehl-abegg.de, joachim.knorr@ziehl-abegg.de

> From: David Laight <david.laight.linux@gmail.com>
> Sent: Tuesday 23 June 2026 20:42
> 
> On Tue, 23 Jun 2026 19:13:28 +0200
> Paul Mbewe <paultyson.mbewe@ziehl-abegg.de> wrote:
> 
> > Hi David,
> >
> > On the test system, the relevant threads already run with RT priority:
> >
> >   irq/134-spi2.0     SCHED_FIFO priority 50
> >   sc16is7xx          SCHED_FIFO priority 50
> >
> > So this is not caused by the IRQ thread running as a normal SCHED_OTHER
> > task. That does not remove all latency sources, of course; on this
> > single-core SPI system the threaded IRQ can still be affected by other
> > RT/kernel activity, IRQ/preemption-disabled sections, SPI transfer time,
> > or lock contention.
> >
> > I agree, changing the TX trigger from 8 to 32 free spaces reduces the
> > per-interrupt time-to-empty margin. With an 8-space trigger the FIFO
> > still contains 56 bytes when THRI asserts, while with a 32-space trigger
> > it contains 32 bytes. So the v1 commit message is misleading when it
> > describes this as increasing the refill window.

I would say plain wrong and as such cannot be accepted.

> > The observed effect is instead that the 8-space trigger causes many
> > small TX refill events. Each event has roughly the same cost, as you
> > said. If the handler runs in time, it can catch up by seeing more than
> > 8 free spaces and writing more data. The failure happens when one event
> > is delayed long enough for the FIFO to drain.

True.

> > Using a 32-space trigger reduces the number of refill events and the
> > associated IRQ/SPI load. It also reduces the chance that one delayed
> > event lets the FIFO drain completely.

Not true. The necessary delay to trigger a drained FIFO is reduced. So the
chance that such a delay occurs is increased.

> > On the tested setup this reduced
> > irq/134-spi2.0 CPU usage from about 15-17% to about 5%, sys CPU from
> > about 51-61% to about 19-28%, and load average from about 2.0-2.2 to
> > about 0.65-1.3. With that change, the observed TX gaps disappeared.
> 
> Even with those figures (and the cpu % differences seem reasonable)
> I still don't see why it stops the underruns.

It probably stops the underruns because the lower CPU load from the IRQ
gives the CPU more time to handle the overhead and other high priority
tasks. It seems like the scheduling overhead is bigger than the actual
interrupt handling.
 
> >
> > So I agree the commit message should be reworked to describe this as
> > reducing TX refill events and IRQ/SPI load, not as increasing the
> > per-interrupt latency margin.
> >
> > If changing the default trigger globally is considered too broad, I can
> > also look at making the TX trigger configurable or limiting the change to
> > SPI-backed devices.

I consider this to be the right approach. The user is the one that knows
about the platform and its properties (#cores, cpu freq, scheduling latency,
spi freq, uart baudrate).

Still, I think it is good to change the default to halfway the fifo depth.
It reduces the interrupt frequency by a factor 4 (8/32) at the cost of a
less than halved allowed latency (32/56).

And this is probably equally useful for the RX FIFO when data comes in at
full speed. With a halved allowed latency before overruns happen the
interrupt frequency can also be divided by 4 resulting in an similar lower
cpu load.

> >
> > Thanks
> > Paul

Kind regards,
Maarten


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] serial: sc16is7xx: set TX FIFO trigger level to half FIFO to prevent underruns
  2026-06-25 12:59             ` Maarten Brock
@ 2026-07-01 16:40               ` Paul Mbewe
  2026-07-01 17:41                 ` David Laight
  0 siblings, 1 reply; 13+ messages in thread
From: Paul Mbewe @ 2026-07-01 16:40 UTC (permalink / raw)
  To: david.laight.linux
  Cc: Maarten.Brock, linux-serial, linux-kernel, gregkh, jirislaby,
	hvilleneuve, tobias.gannert, joachim.knorr

Hi David, Maarten,

I rechecked this with ftrace, and I think the right explanation is
narrower than my v1 commit message.

The underrun trace shows the SC16IS7xx refill path running with the
hardware TX FIFO already empty while the tty xmit FIFO still had data
queued, for example:

  sc16is7xx delayed refill: line=0 txlvl=64 pending=172

So the conclusions for me are:

  - The driver can catch up if the FIFO is only partially drained when
    the THRI path runs. TXLVL will report more than the nominal trigger
    amount and the driver can write more bytes.

  - Once TXLVL is 64, the FIFO has already emptied and the TX gap has
    already occurred.

  - The v1 explanation was wrong to describe a 32-free-space trigger as
    increasing the per-interrupt refill window. It does not. With the
    default trigger, the FIFO has about 56 character times left when THRI
    fires; with a 32-free-space trigger, it has about 32 character times
    left.

  - As Maarten pointed out, the benefit is the other side of the
    trade-off: the refill event rate is reduced by 4x, from roughly
    1 refill per 8 transmitted bytes to 1 refill per 32 transmitted bytes.
    
  - On the tested SPI-backed system, that lower refill rate reduces the
    IRQ/SPI load enough to avoid the observed underruns.

The change is limited to SPI-attached parts so the I2C case is
unaffected.

For v2 I will:

  - rework the commit message around Maarten's trade-off calculation:
    4x fewer TX refill events, at the cost of reducing the time-to-empty
    margin from 56 to 32 character times;
  - limit the TX trigger change to SPI-backed SC16IS7xx devices;
  - leave the RX trigger level unchanged.

If maintainers would prefer this to be board-configurable instead, I can
look at a follow-up DT binding/property, but I would prefer to keep v2
focused on the TX/SPI case.

Thanks,
Paul

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] serial: sc16is7xx: set TX FIFO trigger level to half FIFO to prevent underruns
  2026-07-01 16:40               ` Paul Mbewe
@ 2026-07-01 17:41                 ` David Laight
  2026-07-02 12:17                   ` Maarten Brock
  0 siblings, 1 reply; 13+ messages in thread
From: David Laight @ 2026-07-01 17:41 UTC (permalink / raw)
  To: Paul Mbewe
  Cc: Maarten.Brock, linux-serial, linux-kernel, gregkh, jirislaby,
	hvilleneuve, tobias.gannert, joachim.knorr

On Wed, 1 Jul 2026 18:40:56 +0200
Paul Mbewe <paultyson.mbewe@ziehl-abegg.de> wrote:

> Hi David, Maarten,
> 
> I rechecked this with ftrace, and I think the right explanation is
> narrower than my v1 commit message.
> 
> The underrun trace shows the SC16IS7xx refill path running with the
> hardware TX FIFO already empty while the tty xmit FIFO still had data
> queued, for example:
> 
>   sc16is7xx delayed refill: line=0 txlvl=64 pending=172
> 
> So the conclusions for me are:
> 
>   - The driver can catch up if the FIFO is only partially drained when
>     the THRI path runs. TXLVL will report more than the nominal trigger
>     amount and the driver can write more bytes.
> 
>   - Once TXLVL is 64, the FIFO has already emptied and the TX gap has
>     already occurred.

Right, but there must be something happening after the previous time the
fifo was filled that makes this one happen too late.
Either the hardware ISR was very late or it didn't schedule the thread.

Thinks.....
Maybe you are getting a fifo empty interrupt (rather than the fifo threshold)
interrupt, with the threshold one getting lost.
So what happens if the threshold is reached while the threaded ISR is still
running?
When that happens the ISR needs to loop, either in software or by re-enabling
a level-sensitive interrupt so that another hardware interrupt happens
immediately.
I'm not at all sure how that works with threaded ISR on the SPI interface.
Especially since you really don't want to read any more SPI registers
than absolutely necessary (you need to avoid PCIe reads as well!).

For a 'normal' ethernet driver rx path it is important to check the next
rx ring entry a second time after enabling the hardware interrupt.
Otherwise you miss a wakeup.
I think you are getting (effectively) the same problem.

You might need to use (say) ktime_get_ns() to work out how many bytes
were transmitted between the fifo space being read and the end of the
fifo writes to see if you need to do a second refill.
(Or, rather, to detect that you definitely don't need to do one.)

With the threshold set to 32 it is much less likely that it will be hit
during the fifo writes, so you are much less likely to lose an interrupt.
But scheduling delays could still make it happen. 

	David

> 
>   - The v1 explanation was wrong to describe a 32-free-space trigger as
>     increasing the per-interrupt refill window. It does not. With the
>     default trigger, the FIFO has about 56 character times left when THRI
>     fires; with a 32-free-space trigger, it has about 32 character times
>     left.
> 
>   - As Maarten pointed out, the benefit is the other side of the
>     trade-off: the refill event rate is reduced by 4x, from roughly
>     1 refill per 8 transmitted bytes to 1 refill per 32 transmitted bytes.
>    
>   - On the tested SPI-backed system, that lower refill rate reduces the
>     IRQ/SPI load enough to avoid the observed underruns.
> 
> The change is limited to SPI-attached parts so the I2C case is
> unaffected.
> 
> For v2 I will:
> 
>   - rework the commit message around Maarten's trade-off calculation:
>     4x fewer TX refill events, at the cost of reducing the time-to-empty
>     margin from 56 to 32 character times;
>   - limit the TX trigger change to SPI-backed SC16IS7xx devices;
>   - leave the RX trigger level unchanged.
> 
> If maintainers would prefer this to be board-configurable instead, I can
> look at a follow-up DT binding/property, but I would prefer to keep v2
> focused on the TX/SPI case.
> 
> Thanks,
> Paul
> 
> _______________________________________ 
> 
> ZIEHL-ABEGG 
> 
> Executive Board: Joachim Ley (Chairman), Marco Altherr, Wolfgang Mayer
> Supervisory Board: Dennis Ziehl (Chairman) 
> 
> Court of Registry: District Court Stuttgart HRB 746188
> Company Seat: Künzelsau, Germany 
> 
> Der Inhalt dieser E-Mail und/oder jegliche Anhänge können vertrauliche Mitteilungen enthalten und sind ausschließlich für den bezeichneten Adressaten bestimmt.
> Wenn Sie nicht der vorgesehene Adressat dieser E-Mail oder dessen Vertreter sein sollten, so beachten Sie bitte, dass jede Form der Kenntnisnahme, Veröffentlichung,
> Vervielfältigung oder Weitergabe des Inhalts dieser E-Mail einschließlich Anhängen unzulässig ist.
> Wir bitten Sie, sich in diesem Fall mit dem Absender der E-Mail in Verbindung zu setzen. 
> 
> The content of this e-mail and/or attachments may contain confidential information and is intended solely for the named recipient.
> If you are not the intended recipient of this e-mail or on its distribution list, please note that any type of disclosure, publication,
> copying or distribution of the content of this e-mail including attachments is strictly forbidden.
> In this case, we would kindly ask you to notify the sender of the e-mail.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [PATCH 2/2] serial: sc16is7xx: set TX FIFO trigger level to half FIFO to prevent underruns
  2026-07-01 17:41                 ` David Laight
@ 2026-07-02 12:17                   ` Maarten Brock
  2026-07-02 21:24                     ` David Laight
  0 siblings, 1 reply; 13+ messages in thread
From: Maarten Brock @ 2026-07-02 12:17 UTC (permalink / raw)
  To: David Laight, Paul Mbewe, Crescent Hsieh
  Cc: linux-serial@vger.kernel.org, linux-kernel@vger.kernel.org,
	gregkh@linuxfoundation.org, jirislaby@kernel.org,
	hvilleneuve@dimonoff.com, tobias.gannert@ziehl-abegg.de,
	joachim.knorr@ziehl-abegg.de

> From: David Laight <david.laight.linux@gmail.com>
> Sent: Wednesday 1 July 2026 19:42
> 
> On Wed, 1 Jul 2026 18:40:56 +0200
> Paul Mbewe <paultyson.mbewe@ziehl-abegg.de> wrote:
> 
> > Hi David, Maarten,
> >
> > I rechecked this with ftrace, and I think the right explanation is
> > narrower than my v1 commit message.
> >
> > The underrun trace shows the SC16IS7xx refill path running with the
> > hardware TX FIFO already empty while the tty xmit FIFO still had data
> > queued, for example:
> >
> >   sc16is7xx delayed refill: line=0 txlvl=64 pending=172
> >
> > So the conclusions for me are:
> >
> >   - The driver can catch up if the FIFO is only partially drained when
> >     the THRI path runs. TXLVL will report more than the nominal trigger
> >     amount and the driver can write more bytes.
> >
> >   - Once TXLVL is 64, the FIFO has already emptied and the TX gap has
> >     already occurred.
> 
> Right, but there must be something happening after the previous time the
> fifo was filled that makes this one happen too late.
> Either the hardware ISR was very late or it didn't schedule the thread.
> 
> Thinks.....
> Maybe you are getting a fifo empty interrupt (rather than the fifo threshold)
> interrupt, with the threshold one getting lost.
> So what happens if the threshold is reached while the threaded ISR is still
> running?
> When that happens the ISR needs to loop, either in software or by re-enabling
> a level-sensitive interrupt so that another hardware interrupt happens
> immediately.

The loop is already present in sc16is7xx_irq() with 
do { ... } while (keep_polling);

> I'm not at all sure how that works with threaded ISR on the SPI interface.
> Especially since you really don't want to read any more SPI registers
> than absolutely necessary (you need to avoid PCIe reads as well!).
> 
> For a 'normal' ethernet driver rx path it is important to check the next
> rx ring entry a second time after enabling the hardware interrupt.
> Otherwise you miss a wakeup.
> I think you are getting (effectively) the same problem.
> 
> You might need to use (say) ktime_get_ns() to work out how many bytes
> were transmitted between the fifo space being read and the end of the
> fifo writes to see if you need to do a second refill.
> (Or, rather, to detect that you definitely don't need to do one.)
> 
> With the threshold set to 32 it is much less likely that it will be hit
> during the fifo writes, so you are much less likely to lose an interrupt.
> But scheduling delays could still make it happen.
> 
> 	David
> 
> >
> >   - The v1 explanation was wrong to describe a 32-free-space trigger as
> >     increasing the per-interrupt refill window. It does not. With the
> >     default trigger, the FIFO has about 56 character times left when THRI
> >     fires; with a 32-free-space trigger, it has about 32 character times
> >     left.
> >
> >   - As Maarten pointed out, the benefit is the other side of the
> >     trade-off: the refill event rate is reduced by 4x, from roughly
> >     1 refill per 8 transmitted bytes to 1 refill per 32 transmitted bytes.
> >
> >   - On the tested SPI-backed system, that lower refill rate reduces the
> >     IRQ/SPI load enough to avoid the observed underruns.
> >
> > The change is limited to SPI-attached parts so the I2C case is
> > unaffected.
> >
> > For v2 I will:
> >
> >   - rework the commit message around Maarten's trade-off calculation:
> >     4x fewer TX refill events, at the cost of reducing the time-to-empty
> >     margin from 56 to 32 character times;
> >   - limit the TX trigger change to SPI-backed SC16IS7xx devices;
> >   - leave the RX trigger level unchanged.

I see no reason to limit this to SPI or to the transmit case only.

> > If maintainers would prefer this to be board-configurable instead, I can
> > look at a follow-up DT binding/property, but I would prefer to keep v2
> > focused on the TX/SPI case.
> >
> > Thanks,
> > Paul

Further, I think this should be combined with this recent patch proposal from Crescent Hsieh:
[PATCH v2 14/15] serial: 8250: allow UART drivers to override rx_trig_bytes handling

which introduces a way to set/get the rxtrig count per port. A similar thing can be added for txtrig.

Kind regards,
Maarten Brock


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] serial: sc16is7xx: set TX FIFO trigger level to half FIFO to prevent underruns
  2026-07-02 12:17                   ` Maarten Brock
@ 2026-07-02 21:24                     ` David Laight
  0 siblings, 0 replies; 13+ messages in thread
From: David Laight @ 2026-07-02 21:24 UTC (permalink / raw)
  To: Maarten Brock
  Cc: Paul Mbewe, Crescent Hsieh, linux-serial@vger.kernel.org,
	linux-kernel@vger.kernel.org, gregkh@linuxfoundation.org,
	jirislaby@kernel.org, hvilleneuve@dimonoff.com,
	tobias.gannert@ziehl-abegg.de, joachim.knorr@ziehl-abegg.de

On Thu, 2 Jul 2026 12:17:01 +0000
Maarten Brock <Maarten.Brock@sttls.nl> wrote:

> > From: David Laight <david.laight.linux@gmail.com>
> > Sent: Wednesday 1 July 2026 19:42
> > 
> > On Wed, 1 Jul 2026 18:40:56 +0200
> > Paul Mbewe <paultyson.mbewe@ziehl-abegg.de> wrote:
> >   
> > > Hi David, Maarten,
> > >
> > > I rechecked this with ftrace, and I think the right explanation is
> > > narrower than my v1 commit message.
> > >
> > > The underrun trace shows the SC16IS7xx refill path running with the
> > > hardware TX FIFO already empty while the tty xmit FIFO still had data
> > > queued, for example:
> > >
> > >   sc16is7xx delayed refill: line=0 txlvl=64 pending=172
> > >
> > > So the conclusions for me are:
> > >
> > >   - The driver can catch up if the FIFO is only partially drained when
> > >     the THRI path runs. TXLVL will report more than the nominal trigger
> > >     amount and the driver can write more bytes.
> > >
> > >   - Once TXLVL is 64, the FIFO has already emptied and the TX gap has
> > >     already occurred.  
> > 
> > Right, but there must be something happening after the previous time the
> > fifo was filled that makes this one happen too late.
> > Either the hardware ISR was very late or it didn't schedule the thread.
> > 
> > Thinks.....
> > Maybe you are getting a fifo empty interrupt (rather than the fifo threshold)
> > interrupt, with the threshold one getting lost.
> > So what happens if the threshold is reached while the threaded ISR is still
> > running?
> > When that happens the ISR needs to loop, either in software or by re-enabling
> > a level-sensitive interrupt so that another hardware interrupt happens
> > immediately.  
> 
> The loop is already present in sc16is7xx_irq() with 
> do { ... } while (keep_polling);

That loop also processes multiple interrupt sources.

I've been looking at the code (but don't have the data sheet).
I'd look at what:
	sc16is7xx_port_read(port, SC16IS7XX_TXLVL_REG);
returns both at the bottom of sc16is7xx_handle_tx() and much later
when the IER is re-written by the other work item.
You might find trace_printf() useful - writes into the ftrace buffer.

Did ftrace show you a hardware interrupt at the time when the fifo would
have been expected to have 8 free entries?
Also how many byte times later was the interrupt (real and threaded) that
found the fifo empty?

I also noticed that the code re-enabled the tx interrupt at the end of
the threaded isr - is this actually needed? or is it just there because
the same code is called to send the first buffer.
If this is needed then it is possible that it won't generate an interrupt
if the fifo already has enough space.

There is also a question of how the hardware interrupt itself is handled.
I think there will be a single level-sensitive output from the device
itself.
This will trigger the hardware interrupt which then schedules the threaded
isr, but the hardware interrupt has to be masked until the threaded isr
finishes.
When the hardware interrupt is re-enabled you need the hardware ISR
to run if the irq line is still (or again) asserted.
If that is edge triggered things will go wrong in the way you are seeing.

	David


> 
> > I'm not at all sure how that works with threaded ISR on the SPI interface.
> > Especially since you really don't want to read any more SPI registers
> > than absolutely necessary (you need to avoid PCIe reads as well!).
> > 
> > For a 'normal' ethernet driver rx path it is important to check the next
> > rx ring entry a second time after enabling the hardware interrupt.
> > Otherwise you miss a wakeup.
> > I think you are getting (effectively) the same problem.
> > 
> > You might need to use (say) ktime_get_ns() to work out how many bytes
> > were transmitted between the fifo space being read and the end of the
> > fifo writes to see if you need to do a second refill.
> > (Or, rather, to detect that you definitely don't need to do one.)
> > 
> > With the threshold set to 32 it is much less likely that it will be hit
> > during the fifo writes, so you are much less likely to lose an interrupt.
> > But scheduling delays could still make it happen.
> > 
> > 	David
> >   
> > >
> > >   - The v1 explanation was wrong to describe a 32-free-space trigger as
> > >     increasing the per-interrupt refill window. It does not. With the
> > >     default trigger, the FIFO has about 56 character times left when THRI
> > >     fires; with a 32-free-space trigger, it has about 32 character times
> > >     left.
> > >
> > >   - As Maarten pointed out, the benefit is the other side of the
> > >     trade-off: the refill event rate is reduced by 4x, from roughly
> > >     1 refill per 8 transmitted bytes to 1 refill per 32 transmitted bytes.
> > >
> > >   - On the tested SPI-backed system, that lower refill rate reduces the
> > >     IRQ/SPI load enough to avoid the observed underruns.
> > >
> > > The change is limited to SPI-attached parts so the I2C case is
> > > unaffected.
> > >
> > > For v2 I will:
> > >
> > >   - rework the commit message around Maarten's trade-off calculation:
> > >     4x fewer TX refill events, at the cost of reducing the time-to-empty
> > >     margin from 56 to 32 character times;
> > >   - limit the TX trigger change to SPI-backed SC16IS7xx devices;
> > >   - leave the RX trigger level unchanged.  
> 
> I see no reason to limit this to SPI or to the transmit case only.
> 
> > > If maintainers would prefer this to be board-configurable instead, I can
> > > look at a follow-up DT binding/property, but I would prefer to keep v2
> > > focused on the TX/SPI case.
> > >
> > > Thanks,
> > > Paul  
> 
> Further, I think this should be combined with this recent patch proposal from Crescent Hsieh:
> [PATCH v2 14/15] serial: 8250: allow UART drivers to override rx_trig_bytes handling
> 
> which introduces a way to set/get the rxtrig count per port. A similar thing can be added for txtrig.
> 
> Kind regards,
> Maarten Brock
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2026-07-02 21:24 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-23 11:22 [PATCH 0/2] serial: sc16is7xx: fix TX inter-frame gaps on SPI UARTs Paul Mbewe
2026-06-23 11:22 ` [PATCH 1/2] serial: sc16is7xx: fix TX gap caused by kfifo circular buffer wrap-around Paul Mbewe
2026-06-23 11:22 ` [PATCH 2/2] serial: sc16is7xx: set TX FIFO trigger level to half FIFO to prevent underruns Paul Mbewe
2026-06-23 12:45   ` David Laight
2026-06-23 14:01     ` Paul Mbewe
2026-06-23 15:07       ` David Laight
2026-06-23 17:13         ` Paul Mbewe
2026-06-23 18:42           ` David Laight
2026-06-25 12:59             ` Maarten Brock
2026-07-01 16:40               ` Paul Mbewe
2026-07-01 17:41                 ` David Laight
2026-07-02 12:17                   ` Maarten Brock
2026-07-02 21:24                     ` David Laight

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox