public inbox for linux-serial@vger.kernel.org
 help / color / mirror / Atom feed
From: "Ilpo Järvinen" <ilpo.jarvinen@linux.intel.com>
To: Adriana Nicolae <adriana@arista.com>
Cc: linux-serial <linux-serial@vger.kernel.org>,
	 Andy Shevchenko <andriy.shevchenko@linux.intel.com>,
	prasad@arista.com
Subject: Re: [PATCH] serial: 8250 dw: clear FIFO before writting LCR
Date: Thu, 22 Jan 2026 15:18:28 +0200 (EET)	[thread overview]
Message-ID: <d93b7a48-c20f-5af5-080c-a5920304d60d@linux.intel.com> (raw)
In-Reply-To: <CAERbo5z2370GPWUNuSJuwZxga0se7CNH2XAnsnF-vtB7t6XREQ@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 15950 bytes --]

On Wed, 21 Jan 2026, Adriana Nicolae wrote:
> On Wed, Jan 21, 2026 at 4:31 PM Ilpo Järvinen
> <ilpo.jarvinen@linux.intel.com> wrote:
> > On Fri, 31 Oct 2025, Adriana Nicolae wrote:
> > > On Wed, Oct 8, 2025 at 4:12 PM Ilpo Järvinen
> > > <ilpo.jarvinen@linux.intel.com> wrote:
> > > > On Wed, 8 Oct 2025, Adriana Nicolae wrote:
> > > > > Hello, Sorry for missing an update, the exact root cause is not clear, currently the
> > > > > brute force method of draining FIFO right before setting or clearing DLAB was stable
> > > > > during tests.
> > > > >
> > > > > The serial stuck seems to be a failed attempt to clear the DLAB.
> > > > > This operation fails because the USR indicates the hardware is
> > > > > still busy, even though the UART is in loopback mode and should
> > > > > be inactive.
> > > > >
> > > > > To isolate the issue, I tried the following without success:
> > > > > - Added delays: I inserted 100 repeated ndelay(p->frame_time)
> > > > > calls before and after enabling loopback mode to allow the FIFO
> > > > > to clear.
> > > > > - Cleared FIFO: I explicitly cleared the FIFO in addition to
> > > > > adding the delay.
> > > > > - Checked status: I printed the LSR just before the DLAB clear
> > > > > attempt and checked the USB busy bit.
> > > >
> > > > Okay, so the BUSY must be stuck asserted.
> > > >
> > > > Another idea, maybe test tx + rx over loopback to see if that manages to
> > > > de-freeze the BUSY bit. A patch to that effect below.
> > > >
> > > > (I've only added the new block into dw8250_idle_enter() compared with the
> > > > old patch but rebasing to tty-next resulted in some other changes due to
> > > > conflicts.)
> > >
> > > I've tested the new dw8250_idle_enter() sequence, and you're right,
> > > the BUSY bit remains set after entering loopback mode.
> > >
> > > However, the sequence in the patch (including the single loopback
> > > tx/rx) wasn't quite enough. I didn't see any kernel panics or console
> > > stuck anymore, but I've monitored the traces and there were cases when
> > > the trace after "p->serial_out(p, UART_LCR, up->lcr);" showed both
> > > BUSY bit set and DLAB bit still enabled.
> > >
> > > >
> > > > Only thing besides BUSY being stuck asserted is something Tx'ing after the
> > > > idle enter sequence. I think we could trap/check that too in
> > > > dw8250_serial_out() by using something along these lines:
> > > >
> > > >         if (d->in_idle && offset == UART_TX) {
> > > >                 WARN_ON_ONCE(1);
> > > >                 return;
> > > >         }
> > > >
> > > > (I think that should catch even console writes but I'm not 100% sure of
> > > > that and it will should get us where the rogue Tx originates from).
> > >
> > > I also added the WARN_ON_ONCE check you suggested in
> > > dw8250_serial_out(). The warning has not triggered, so it seems we
> > > don't have a rogue Tx firing while in_idle is set.
> > >
> > > >
> > > > > The critical finding was that immediately before the DLAB clear
> > > > > operation (p->serial_out(p, UART_LCR, up->lcr);), the LSR value
> > > > > was 0x6a and the USR busy bit [0] was set. This confirms the UART
> > > > > is busy, which blocks the DLAB modification.
> > > > >
> > > > > This is occurring on a device with a single UART console. The setup
> > > > > does not use DMA or modem control; only the Rx/Tx lines are connected.
> > > > >
> > > > > The trace below, from a single process, shows one successful DLAB
> > > > > clear followed by a failing one. The second attempt repeatedly logs
> > > > > "USR still busy" before eventually succeeding. This can lead to
> > > > > subsequent failures in dw8250_check_lcr: dw8250_idle_entered.
> > > > >
> > > > > Trace Log:
> > > > >
> > > > > <...>-15440  8583.592533: dw8250_idle_enter: in_idle = 1
> > > > > login-15440  8583.713817: dw8250_idle_enter: in loopback mode
> > > > > login-15440  8583.835099: dw8250_idle_enter: LSR in loopback mode is 0x63
> > > > > login-15440  8583.835103: dw8250_set_divisor: UART_LCR_DLAB cleared 13
> > > > > login-15440  8583.835104: dw8250_idle_exit: idle exit
> > > > > login-15440  8583.835105: dw8250_idle_exit: out of loopback mode
> > > > > login-15440  8583.835105: dw8250_idle_exit: in_idle = 0
> > > > > login-15440  8583.835352: dw8250_idle_enter: in_idle = 1
> > > > > login-15440  8583.956633: dw8250_idle_enter: in loopback mode
> > > > > login-15440  8583.956638: dw8250_idle_enter: LSR in loopback mode is 0x6a
> > > > > login-15440  8583.963918: dw8250_set_divisor: USR still busy dl_write
> > > > > login-15440  8584.000332: dw8250_set_divisor: USR still busy dl_write
> > > > > login-15440  8584.040385: dw8250_set_divisor: USR still busy dl_write
> > > > > login-15440  8584.078012: dw8250_set_divisor: UART_LCR_DLAB cleared 93
> > > > > login-15440  8584.078013: dw8250_idle_exit: idle exit
> > > > > login-15440  8584.078014: dw8250_idle_exit: out of loopback mode
> > > > > login-15440  8584.078015: dw8250_idle_exit: in_idle = 0
> > > >
> > > >
> > > >
> > > > --
> > > > From 01df58736a10f7f34aca895ef08e5519953f8572 Mon Sep 17 00:00:00 2001
> > > > From: =?UTF-8?q?Ilpo=20J=C3=A4rvinen?= <ilpo.jarvinen@linux.intel.com>
> > > > Date: Wed, 8 Oct 2025 15:40:19 +0300
> > > > Subject: [PATCH 1/1] serial: 8250_dw: Ensure BUSY is deasserted
> > > > MIME-Version: 1.0
> > > > Content-Type: text/plain; charset=UTF-8
> > > > Content-Transfer-Encoding: 8bit
> > > >
> > > > DW UART cannot write to LCR, DLL, and DLH while BUSY is asserted.
> > > > Existance of BUSY depends on uart_16550_compatible, if UART HW is
> > > > configured with 16550 compatible those registers can always be
> > > > written.
> > > >
> > > > There currently is dw8250_force_idle() which attempts to archive
> > > > non-BUSY state by disabling FIFO, however, the solution is unreliable
> > > > when Rx keeps getting more and more characters.
> > > >
> > > > Create a sequence of operations to enforce that ensures UART cannot
> > > > keep BUSY asserted indefinitely. The new sequence relies on enabling
> > > > loopback mode temporarily to prevent incoming Rx characters keeping
> > > > UART BUSY.
> > > >
> > > > Use the new dw8250_idle_enter/exit() to do divisor writes and LCR
> > > > writes.
> > > >
> > > > This issue was reported by qianfan Zhao who put lots of debugging
> > > > effort into understanding the solution space.
> > > >
> > > > Reported-by: qianfan Zhao <qianfanguijin@163.com>
> > > > Reported-by: Adriana Nicolae <adriana@arista.com>
> > > > Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> > > > ---
> > > >  drivers/tty/serial/8250/8250_dw.c | 159 +++++++++++++++++++++---------
> > > >  1 file changed, 115 insertions(+), 44 deletions(-)
> > > >
> > > > diff --git a/drivers/tty/serial/8250/8250_dw.c b/drivers/tty/serial/8250/8250_dw.c
> > > > index a53ba04d9770..8e25dfe8e653 100644
> > > > --- a/drivers/tty/serial/8250/8250_dw.c
> > > > +++ b/drivers/tty/serial/8250/8250_dw.c
> > > > @@ -42,6 +42,8 @@
> > > >  /* DesignWare specific register fields */
> > > >  #define DW_UART_MCR_SIRE               BIT(6)
> > > >
> > > > +#define DW_UART_USR_BUSY               BIT(0)
> > > > +
> > > >  /* Renesas specific register fields */
> > > >  #define RZN1_UART_xDMACR_DMA_EN                BIT(0)
> > > >  #define RZN1_UART_xDMACR_1_WORD_BURST  (0 << 1)
> > > > @@ -77,6 +79,7 @@ struct dw8250_data {
> > > >
> > > >         unsigned int            skip_autocfg:1;
> > > >         unsigned int            uart_16550_compatible:1;
> > > > +       unsigned int            in_idle:1;
> > > >  };
> > > >
> > > >  static inline struct dw8250_data *to_dw8250_data(struct dw8250_port_data *data)
> > > > @@ -108,36 +111,103 @@ static inline u32 dw8250_modify_msr(struct uart_port *p, unsigned int offset, u3
> > > >  }
> > > >
> > > >  /*
> > > > - * This function is being called as part of the uart_port::serial_out()
> > > > - * routine. Hence, it must not call serial_port_out() or serial_out()
> > > > - * against the modified registers here, i.e. LCR.
> > > > + * Ensure BUSY is not asserted. If DW UART is configured with
> > > > + * !uart_16550_compatible, the writes to LCR, DLL, and DLH fail while
> > > > + * BUSY is asserted.
> > > > + *
> > > > + * Context: port's lock must be held
> > > >   */
> > > > -static void dw8250_force_idle(struct uart_port *p)
> > > > +static int dw8250_idle_enter(struct uart_port *p)
> > > >  {
> > > > +       struct dw8250_data *d = to_dw8250_data(p->private_data);
> > > >         struct uart_8250_port *up = up_to_u8250p(p);
> > > > -       unsigned int lsr;
> > > > +       u32 lsr;
> > > >
> > > > -       /*
> > > > -        * The following call currently performs serial_out()
> > > > -        * against the FCR register. Because it differs to LCR
> > > > -        * there will be no infinite loop, but if it ever gets
> > > > -        * modified, we might need a new custom version of it
> > > > -        * that avoids infinite recursion.
> > > > -        */
> > > > -       serial8250_clear_and_reinit_fifos(up);
> > > > +       if (d->uart_16550_compatible)
> > > > +               return 0;
> > > > +
> > > > +       d->in_idle = 1;
> > > > +
> > > > +       /* Prevent triggering interrupt from RBR filling */
> > > > +       p->serial_out(p, UART_IER, 0);
> > > > +
> > > > +       serial8250_rx_dma_flush(up);
> > > > +       // What about Tx DMA? Should probably pause that too and resume
> > > > +       // afterwards.
> > > > +
> > > > +       p->serial_out(p, UART_MCR, up->mcr | UART_MCR_LOOP);
> > > > +       if (up->capabilities & UART_CAP_FIFO)
> > > > +               p->serial_out(p, UART_FCR, 0);
> > >
> > > Changing this to repeatedly clear the FIFO turned out to reliably
> > > clear the BUSY bit , also no kernel panic or device stuck in busy
> > > mode.
> > >
> > > On the device I tested the first clear is not always enough, under
> > > high load I saw it cleared on the second iteration. I'm thinking it
> > > might be some particular issue with the device I'm using where the
> > > first FIFO clear might fail. I never encountered more than 2
> > > iterations with a "ndelay(p->frame_time);" in between here to get out
> > > of BUSY state.
> >
> > Hi,
> >
> > I seem to have missed this email until now (I'm sorry about that, though
> > to my defence, IIRC, I was quite sick around that timeframe it was sent
> > and clear the email backlog isn't ever fun and may end up missing
> > something).
> >
> > Do you mean changing this to a simple loop or writing something else than
> > just 0 to FCR (or perhaps calling serial8250_clear_fifos())?
> >
> > What is the exact code that you found working?
> >
> Yes, everything worked ok for me after changing the dw8250_idle_enter
> function with the one below. From traces added in the function, it
> sometimes reported iterations_in_busy = 2 but never higher than that.
> The function only has the prepended "while(p->serial_in(p,
> d->pdata->usr_reg) & DW_UART_USR_BUSY) {" to iterate forever, although
> it was at most 2 iterations when serial was stressed:
> static int dw8250_idle_enter(struct uart_port *p)
> {
>     struct dw8250_data *d = to_dw8250_data(p->private_data);
>     struct uart_8250_port *up = up_to_u8250p(p);
>     u32 lsr, iterations_in_busy = 0;
> 
>     if (d->uart_16550_compatible)
>         return 0;
> 
>     d->in_idle = 1;
> 
>     /* Prevent triggering interrupt from RBR filling */
>     p->serial_out(p, UART_IER, 0);
> 
>     serial8250_rx_dma_flush(up);
>     // What about Tx DMA? Should probably pause that too and resume
>     // afterwards.
>     p->serial_out(p, UART_MCR, up->mcr | UART_MCR_LOOP);
> 
>     while(p->serial_in(p, d->pdata->usr_reg) & DW_UART_USR_BUSY) {
>         if (up->capabilities & UART_CAP_FIFO) {
>             p->serial_out(p, UART_FCR, UART_FCR_ENABLE_FIFO);
>             p->serial_out(p, UART_FCR, UART_FCR_ENABLE_FIFO |
>                 UART_FCR_CLEAR_RCVR | UART_FCR_CLEAR_XMIT);
>             p->serial_out(p, UART_FCR, 0);

Thanks for the information!

Okay, so this is same as calling serial8250_clear_fifos() which is what 
I'll do here.

>         }
>         ndelay(p->frame_time);
>         iterations_in_busy++;
>     }
> 
>     trace_printk("Not busy got after %d\n", iterations_in_busy);
>     lsr = serial_lsr_in(up);
>     if (lsr & UART_LSR_DR) {
>         p->serial_in(p, UART_RX);
>         up->lsr_saved_flags = 0;
>     }
> 
>         /*
>      * BUSY might still be frozen to asserted, try to de-freeze it by
>      * sending a character over the loopback and receiving it.
>          */
>     if (p->serial_in(p, d->pdata->usr_reg) & DW_UART_USR_BUSY) {
>         trace_printk("Serial USR still busy\n");
>         p->serial_out(p, UART_TX, 0);
>         ndelay(1000);
>         lsr = serial_lsr_in(up);
> 
>         if (lsr & UART_LSR_DR) {
>             p->serial_in(p, UART_RX);
>             up->lsr_saved_flags = 0;
>         }
>         }

So I think this entire tx+rx de-freezing wasn't required at all? Adding 
this was based on my guess how the driver could try to force BUSY 
deassertion but if the FIFOs were the real culprit to the BUSY remaining 
asserted, I'd prefer to remove this block entirely to not add random 
complexity just for the sake of doing everything imaginable.

>      /* Now guaranteed to have BUSY deasserted? Just sanity check */
>     if (p->serial_in(p, d->pdata->usr_reg) & DW_UART_USR_BUSY) {
>         trace_printk("BUSY\n");
>         return -EBUSY;
>     }
> 
>     return 0;
> }
> > So when you fixed this FIFO clearing thing, everything seemed to work okay
> > after that?
> >
> > In the meantime, this issue has once again been reported to me by somebody
> > else, and I've done improvements to shutdown code as well to address a
> > few BUSY related problems (I'll be posting a series that solved that
> > case soon but I suppose this patch needs amendments based on input from
> > your case).
> >
> > --
> >  i.
> >
> > > > +       if (p->serial_in(p, d->pdata->usr_reg) & DW_UART_USR_BUSY)
> > > > +               ndelay(p->frame_time);
> > > > +
> > > > +       lsr = serial_lsr_in(up);
> > > > +       if (lsr & UART_LSR_DR) {
> > > > +               p->serial_in(p, UART_RX);
> > > > +               up->lsr_saved_flags = 0;
> > > > +       }
> > > >
> > > >         /*
> > > > -        * With PSLVERR_RESP_EN parameter set to 1, the device generates an
> > > > -        * error response when an attempt to read an empty RBR with FIFO
> > > > -        * enabled.
> > > > +        * BUSY might still be frozen to asserted, try to de-freeze it by
> > > > +        * sending a character over the loopback and receiving it.
> > > >          */
> > > > -       if (up->fcr & UART_FCR_ENABLE_FIFO) {
> > > > -               lsr = serial_port_in(p, UART_LSR);
> > > > -               if (!(lsr & UART_LSR_DR))
> > > > -                       return;
> > > > +       if (p->serial_in(p, d->pdata->usr_reg) & DW_UART_USR_BUSY) {
> > > > +               p->serial_out(p, UART_TX, 0);
> > > > +               ndelay(p->frame_time);
> > > > +               lsr = serial_lsr_in(up);
> > > > +               if (lsr & UART_LSR_DR) {
> > > > +                       p->serial_in(p, UART_RX);
> > > > +                       up->lsr_saved_flags = 0;
> > > > +               }
> > > >         }
> > > >
> > > > -       serial_port_in(p, UART_RX);
> > > > +       /* Now guaranteed to have BUSY deasserted? Just sanity check */
> > > > +       if (p->serial_in(p, d->pdata->usr_reg) & DW_UART_USR_BUSY)
> > > > +               return -EBUSY;
> > > > +
> > > > +       return 0;
> > > > +}


-- 
 i.

  reply	other threads:[~2026-01-22 13:18 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-08-19 18:23 [PATCH] serial: 8250 dw: clear FIFO before writting LCR adriana
2025-08-19 18:31 ` Greg KH
2025-08-19 18:32 ` Greg KH
2025-08-20 10:02 ` Ilpo Järvinen
2025-08-22  8:09   ` Adriana Nicolae
2025-08-22  9:06     ` Ilpo Järvinen
2025-10-08 10:39     ` Ilpo Järvinen
     [not found]       ` <CAERbo5xOg4OegoWYid3ZBCX1Fi+MBUMb59EtaQLrYkZy9MzSJA@mail.gmail.com>
2025-10-08 13:11         ` Ilpo Järvinen
2025-10-31 19:32           ` Adriana Nicolae
2026-01-21 14:31             ` Ilpo Järvinen
2026-01-21 16:55               ` Adriana Nicolae
2026-01-22 13:18                 ` Ilpo Järvinen [this message]
  -- strict thread matches above, loose matches on Subject: below --
2025-08-19 15:06 Adriana Nicolae
2025-08-19 15:28 ` Greg KH

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d93b7a48-c20f-5af5-080c-a5920304d60d@linux.intel.com \
    --to=ilpo.jarvinen@linux.intel.com \
    --cc=adriana@arista.com \
    --cc=andriy.shevchenko@linux.intel.com \
    --cc=linux-serial@vger.kernel.org \
    --cc=prasad@arista.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox