Linux Serial subsystem development
* 8250_dw system pause due to IRQ load
From: Craig McQueen @ 2026-06-12  5:38 UTC (permalink / raw)
  To: linux-serial@vger.kernel.org

I have a Rockchip RK3328 based embedded Linux system, using the 8250_dw driver (device tree "snps,dw-apb-uart") for serial console and other serial ports. I'm using Yocto scarthgap with kernel v6.6.123.

It is talking to a microprocessor via a serial protocol at 921600 bps. Multiple times per hour, I see the serial protocol TX pause for 100 to 4500 ms. Usually the whole Linux system pauses during this time (realtime and monotonic clocks don't tick). mpstat shows high irq load. /proc/interrupts shows the 8250_dw interrupt count increasing significantly during this time.

I'm also seeing complete system lock-ups occur every 1 to 72 hours, with no diagnostic information shown in the kernel serial console output.

Are there any known issues with the 8250_dw interrupt handler causing high CPU load, that I should try backporting to kernel v6.6?

I've written some kernel drivers, but I have no experience debugging interrupt handler issues, especially when it's an issue that prevents the kernel doing console output. I would appreciate any advice on kernel facilities that are suitable to debug this type of bug.

-- 
Craig McQueen



* Re: 8250_dw system pause due to IRQ load
From: Craig McQueen @ 2026-06-16  7:44 UTC (permalink / raw)
  To: linux-serial

I have been able to diagnose serial TX pauses, using trace_printk() in the interrupt handler. The cause of TX pauses is many repeated `UART_IIR_RX_TIMEOUT` interrupts. The serial device appears to randomly get out of this state.

I see the 8250_dw interrupt handler has a work-around to stop these `UART_IIR_RX_TIMEOUT` interrupts when the FIFO is empty, but it is only enabled for non-DMA mode. On the Rockchip RK3328, the serial device is configured for DMA mode, and in our usage we still see this issue appear at random.

I have modified the 8250_dw interrupt handler to do the work-around even in DMA mode. This seems to resolve the repeated `UART_IIR_RX_TIMEOUT` interrupts, and eliminate the TX pauses.
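
For illustration only, the decision that this work-around makes can be modelled in user space roughly as below. This is a sketch under stated assumptions, not the driver code: `needs_dummy_rx_read()`, its `apply_in_dma_mode` flag and `example_checks()` are hypothetical names, the register constants are the values from include/uapi/linux/serial_reg.h, and the 0x3f IIR mask is an assumption about how the handler decodes the interrupt ID. The real handler also takes the port lock and reads the hardware registers itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Register bit values as defined in include/uapi/linux/serial_reg.h. */
#define UART_IIR_RX_TIMEOUT 0x0c /* IIR: RX character timeout */
#define UART_LSR_DR         0x01 /* LSR: receive data ready */
#define UART_LSR_BI         0x10 /* LSR: break interrupt */

/*
 * Hypothetical helper, not a kernel function. Returns true when the
 * interrupt handler should do a dummy read of the RX register to clear
 * an RX timeout interrupt that was raised with no data pending.
 *
 * apply_in_dma_mode == false models the stock behaviour described above
 * (work-around skipped when DMA is in use); true models the modification.
 */
static bool needs_dummy_rx_read(uint8_t iir, uint8_t lsr,
                                bool dma_in_use, bool apply_in_dma_mode)
{
    /* Assumption: the interrupt ID lives in the low six bits of IIR. */
    bool rx_timeout = (iir & 0x3f) == UART_IIR_RX_TIMEOUT;

    if (!rx_timeout)
        return false;
    if (dma_in_use && !apply_in_dma_mode)
        return false;

    /* Only read when neither data nor a break is actually pending. */
    return !(lsr & (UART_LSR_DR | UART_LSR_BI));
}

/* Usage example: the cases discussed in this message. */
static void example_checks(void)
{
    /* Non-DMA, timeout with empty FIFO: work-around applies. */
    assert(needs_dummy_rx_read(0xcc, 0x00, false, false));
    /* DMA, stock behaviour: work-around skipped, interrupt keeps firing. */
    assert(!needs_dummy_rx_read(0xcc, 0x00, true, false));
    /* DMA, modified behaviour: work-around applies. */
    assert(needs_dummy_rx_read(0xcc, 0x00, true, true));
    /* Timeout with real data pending: normal RX path handles it. */
    assert(!needs_dummy_rx_read(0xcc, UART_LSR_DR, true, true));
}
```

The only difference between the two behaviours in this model is the `dma_in_use && !apply_in_dma_mode` early return, which corresponds to the non-DMA restriction mentioned above.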

My testing shows that this doesn't fix my other problem, of complete system lock-ups. I don't yet know if that is also 8250_dw related.

-- 
Craig McQueen



* Re: 8250_dw system pause due to IRQ load
From: Greg KH @ 2026-06-16  7:52 UTC (permalink / raw)
  To: Craig McQueen; +Cc: linux-serial

On Tue, Jun 16, 2026 at 05:44:39PM +1000, Craig McQueen wrote:
> I have been able to diagnose serial TX pauses, using trace_printk() in the interrupt handler. The cause of TX pauses is many repeated `UART_IIR_RX_TIMEOUT` interrupts. The serial device appears to randomly get out of this state.
> 
> I see the 8250_dw interrupt handler has a work-around to stop these `UART_IIR_RX_TIMEOUT` interrupts when the FIFO is empty, but it is only enabled for non-DMA mode. On the Rockchip RK3328, the serial device is configured for DMA mode, and in our usage we still see this issue appear at random.
> 
> I have modified the 8250_dw interrupt handler to do the work-around even in DMA mode. This seems to resolve the repeated `UART_IIR_RX_TIMEOUT` interrupts, and eliminate the TX pauses.
> 
> My testing shows that this doesn't fix my other problem, of complete system lock-ups. I don't yet know if that is also 8250_dw related.

Note, many changes have happened in this driver since 6.6.y. Can you try
the 7.1 release to see if it has been resolved there?

thanks,

greg k-h
