* UART_IIR_BUSY set for 16550A
From: Prasad Koya @ 2014-05-23 17:08 UTC (permalink / raw)
To: linux-serial; +Cc: gregkh
Hi,
I don't see anything in the kernel using the UART_IIR_BUSY bit except the
DesignWare serial driver. We are using the 8250 driver for our 16550A, and
occasionally we see UART_IIR_BUSY set; soon after that the console is
hosed. In what situations is this bit set? I don't see much
documentation for this.
I'd appreciate any help.
Thank you.
* Re: UART_IIR_BUSY set for 16550A
From: Theodore Ts'o @ 2014-05-25 0:36 UTC (permalink / raw)
To: Prasad Koya; +Cc: linux-serial, gregkh
On Fri, May 23, 2014 at 10:08:49AM -0700, Prasad Koya wrote:
>
> I don't see anyone in kernel using UART_IIR_BUSY bit except Designware
> serial driver. We are using 8250 driver for our 16550A and
> occasionally we see UART_IIR_BUSY set and soon after that console is
> hosed. In what situations is this bit set? I don't see much
> documentation for this.
UART_IIR_BUSY is not a bit; it's a magic bit pattern, which I
believe is a DesignWare-specific hack. As far as standard
8250-compatible UARTs are concerned, if the low bit (bit 0) is set in
the IIR register, there are no interrupts pending, and so you
shouldn't need to check the 0x06 bits (i.e., bits 1 and 2).
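A minimal sketch of the distinction, for illustration only. The two constants use the values given in this thread and in the kernel's serial register definitions; the helper names are hypothetical, and the masked comparison against the whole pattern is an assumption about how a DesignWare-style check would be written, not a quote from any driver.

```c
#include <stdbool.h>
#include <stdint.h>

#define UART_IIR_NO_INT 0x01 /* No interrupts pending */
#define UART_IIR_BUSY   0x07 /* DesignWare APB Busy Detect */

/* Standard 8250 semantics: bit 0 clear means an interrupt is pending. */
static bool iir_interrupt_pending(uint8_t iir)
{
    return (iir & UART_IIR_NO_INT) == 0;
}

/*
 * Assumed DesignWare-style test: the three low bits must all match the
 * pattern, so it is a comparison against a value, not a single-bit test.
 */
static bool iir_matches_busy_pattern(uint8_t iir)
{
    return (iir & UART_IIR_BUSY) == UART_IIR_BUSY;
}
```

Note that a value matching the busy pattern has bit 0 set, so a standard 8250 reading of it is simply "no interrupt pending".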
- Ted
* Re: UART_IIR_BUSY set for 16550A
From: Prasad Koya @ 2014-05-25 1:22 UTC (permalink / raw)
To: Theodore Ts'o; +Cc: linux-serial, gregkh
Thanks for looking into this.
With a 16550A, I'm seeing this weird issue with the 3.4 kernel. At random
times the 8250 driver reads 0xcc out of IIR. I'm not sure why bit 2 is
set.
#define UART_IIR_BUSY 0x07 /* DesignWare APB Busy Detect */
Soon after this I'm running into "serial8250: too much work for irq4".
This is printed after iterating 512 times in the 8250 interrupt
handler. The message is printed one more time right after this, and it
appears that the console does not work after those messages. I was
suspicious about that 'busy detect' bit. I am trying to reproduce this
and see what is in LCR when this hits. Can I (or how do I) reset the
device if I see this bit set?
* Re: UART_IIR_BUSY set for 16550A
From: Theodore Ts'o @ 2014-05-25 2:44 UTC (permalink / raw)
To: Prasad Koya; +Cc: linux-serial, gregkh
On Sat, May 24, 2014 at 06:22:02PM -0700, Prasad Koya wrote:
> Thanks for looking into this.
>
> With 16550A, I'm seeing this weird issue with 3.4 kernel. At random
> times 8250 driver reads 0xcc out of IIR. I'm not sure why bit 2 is
> set.
The high two bits mean the FIFOs are enabled -- so that's the 0xCX bits.
The 0x0C bits mean that there is an interrupt pending (the low bit is
0). Bit 2 means that data is available in the FIFO:
#define UART_IIR_RDI 0x04 /* Receiver data interrupt */
Not that this matters; in the 8250 driver we simply check to see if
the UART_IIR_NO_INT bit is not set, and then instead of actually
checking the rest of the IIR register, we just check (a) if there are
incoming characters to read, (b) if the transmit FIFO has room
available and we have characters waiting to be sent, or (c) if the
modem status lines have changed and we care about that.
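For reference, a small sketch of how an IIR value such as 0xcc splits into fields on a 16550A-style part. The source encodings in bits 3..1 follow the usual 16550 datasheet table; the mask and function names are illustrative and are not kernel identifiers.

```c
#include <stdint.h>

#define UART_IIR_NO_INT 0x01 /* No interrupts pending */
#define IIR_ID_MASK     0x0e /* bits 3..1: interrupt source (illustrative name) */
#define IIR_FIFO_MASK   0xc0 /* bits 7..6: FIFOs enabled (illustrative name) */

/* Name the highest-priority pending source encoded in an IIR value. */
static const char *iir_source(uint8_t iir)
{
    if (iir & UART_IIR_NO_INT)
        return "none";
    switch (iir & IIR_ID_MASK) {
    case 0x00: return "modem status";
    case 0x02: return "transmitter holding register empty";
    case 0x04: return "received data available";
    case 0x06: return "receiver line status";
    case 0x0c: return "character timeout";
    default:   return "unknown";
    }
}

/* Both high bits are set when the FIFOs are enabled. */
static int iir_fifos_enabled(uint8_t iir)
{
    return (iir & IIR_FIFO_MASK) == IIR_FIFO_MASK;
}
```

With these helpers, 0xcc reads as "FIFOs enabled, character timeout pending", which is a receive-side condition, and 0xc2 reads as "FIFOs enabled, transmitter holding register empty".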
> Soon after this I'm running into "serial8250: too much work for irq4".
> And this is printed after iterating 512 times in 8250_interrupt
> handler. This message is printed one more time right after this and it
> appears that console does not work after those messages. I was
> suspicious about that 'busy detect' bit. Am trying to reproduce this
> and see what is in LCR when this hits. Can I (or how do I) reset the
> device if I see this bit set?
So what this means is that the serial port is apparently continuously
active. Because legacy ISA bus interrupts were edge triggered, we
needed to make sure that all of the sources of interrupts for that irq
have been cleared before we return. To do this, we check all of the
UARTs associated with the irq (you should check and see if you have
more than one serial port associated with the irq) and only return
once all of the UARTs report that they are not ready (i.e., that
we've serviced all possible receive, transmit, and modem status
register changes). But if the UARTs are constantly reporting lots of
work, as a safety measure so that we don't completely hang the kernel,
we check the PASS_LIMIT and if that gets exceeded we print the "too
much work" message and break out. On ISA bus systems, this could
cause the interrupt to no longer signal. To prevent this, there was a
backup serial timeout that would allow the system to automatically recover.
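The loop described above can be sketched roughly as follows. This is a simplified user-space model, not the kernel's handler: struct fake_port and both functions are hypothetical stand-ins, and PASS_LIMIT uses the 512 figure reported earlier in this thread for 3.4.

```c
#include <stdbool.h>
#include <stdio.h>

#define PASS_LIMIT 512 /* figure reported in this thread for the 3.4 kernel */

/* Hypothetical stand-in for one UART on the irq; not a kernel structure. */
struct fake_port {
    int pending_events; /* negative models a UART that never goes idle */
};

/* Service one event on a port; returns true if the port had work. */
static bool fake_handle_port(struct fake_port *p)
{
    if (p->pending_events == 0)
        return false;
    if (p->pending_events > 0)
        p->pending_events--;
    return true;
}

/*
 * Keep polling every port on the irq until all report idle, but bail
 * out after PASS_LIMIT passes. Returns the number of passes that found work.
 */
static int fake_irq_handler(struct fake_port *ports, int nports, int irq)
{
    int pass = 0;
    bool work;

    do {
        work = false;
        for (int i = 0; i < nports; i++)
            work |= fake_handle_port(&ports[i]);
        if (work && ++pass >= PASS_LIMIT) {
            printf("serial8250: too much work for irq%d\n", irq);
            break;
        }
    } while (work);

    return pass;
}
```

A port with a finite backlog leaves the loop once it is idle; a port that always claims work only leaves through the PASS_LIMIT check, which is the case being discussed here.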
None of this should be necessary on modern systems. I do see this
message using KVM, with a virtual serial console which is faster than
any real RS-232 port, so it's possible to trigger the "too much work"
message. But since any modern/sane bus uses level-triggered
interrupts, and KVM emulates a sane bus, the fact that we exit via the
"too much work" path doesn't cause the interrupt to go dead.
If you are seeing the serial console go dead after this message, it
implies that you might have an edge-triggered interrupt. But if that's
true, I'd call this a case of "the 1980's are calling and they want
their crappy ISA bus back"....
- Ted
* Re: UART_IIR_BUSY set for 16550A
From: Prasad Koya @ 2014-05-25 6:21 UTC (permalink / raw)
To: Theodore Ts'o; +Cc: linux-serial, gregkh
In our systems, the serial port interrupt is not shared with any other device.
In the first iteration, I see
[ 480.972099] BUG1027: I0: 1571:0xc2 1551:0x21 1449:2 1492:1
IIR is 0xc2 and LSR is 0x21; it read 2 chars in that iteration and
sent 1 byte of data.
Since the interrupt handler services all ports before it returns, in
the next iteration it sees:
[ 480.972102] BUG1027: I1: 1571:0xcc 1551:0x0
and it continues to see that until iteration 349. Nothing was read
from the FIFO or transmitted from iteration 1 to 349.
[ 480.972525] BUG1027: I349: 1571:0xcc 1551:0x0
At the next iteration LSR had 0x60, and again nothing is read or sent
out. This continues until we see "too much work".
[ 480.972526] BUG1027: I350: 1571:0xcc 1551:0x60
:
[ 480.972737] serial8250: too much work for irq4
#define UART_LSR_TEMT 0x40 /* Transmitter empty */
#define UART_LSR_THRE 0x20 /* Transmit-hold-register empty */
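As a quick illustration, the LSR values in these log lines can be decoded with the two definitions above plus UART_LSR_DR (0x01). The helper lsr_describe() is a hypothetical name used only for this sketch.

```c
#include <stdint.h>
#include <stdio.h>

#define UART_LSR_TEMT 0x40 /* Transmitter empty */
#define UART_LSR_THRE 0x20 /* Transmit-hold-register empty */
#define UART_LSR_DR   0x01 /* Receiver data ready */

/* Write the names of the set flags into buf, each followed by a space. */
static void lsr_describe(uint8_t lsr, char *buf, size_t len)
{
    snprintf(buf, len, "%s%s%s",
             (lsr & UART_LSR_DR)   ? "DR "   : "",
             (lsr & UART_LSR_THRE) ? "THRE " : "",
             (lsr & UART_LSR_TEMT) ? "TEMT " : "");
}
```

So 0x21 is DR plus THRE (data to read, room to transmit), 0x0 has none of these flags set, and 0x60 is THRE plus TEMT with DR clear (transmitter idle, nothing reported in the receive FIFO).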
After it exits the interrupt handler above, on the next invocation
IIR_NO_INT is still 0 and LSR reads 0x60 for the whole PASS_LIMIT
iterations.
[ 480.975458] BUG1027: I0: 1571:0xcc 1551:0x60
So the "too much work" message appears twice back to back, and this
happens only once, at a random time.
In our case the serial console ports on our systems are connected to a
serial concentrator. Like the KVM situation you mentioned, is it
possible our serial port concentrator is behaving badly? In 2.6.38 this
PASS_LIMIT is 256. I'll also check with our h/w lab admin to see if
there is anything special about the serial port concentrator.
Thanks again.
* Re: UART_IIR_BUSY set for 16550A
From: Theodore Ts'o @ 2014-05-25 12:04 UTC (permalink / raw)
To: Prasad Koya; +Cc: linux-serial, gregkh
On Sat, May 24, 2014 at 11:21:38PM -0700, Prasad Koya wrote:
> IIR as 0xc2 and LSR as 0x21 and it read 2 chars in that iteration and
> sent 1 byte of data.
>
> Since the interrupt handler services all ports before it returns, in
> next iteration it sees:
>
> [ 480.972102] BUG1027: I1: 1571:0xcc 1551:0x0
>
> and it continues to see that till iteration 349. and nothing was read
> from FIFO or transmitted from iteration 1 to 349.
If there's nothing in the receive FIFO to read, and assuming the
serial driver is correctly determining that fact (i.e., UART_LSR_DR is
zero), then something is buggy with your UART. If IIR is 0xcc, that
means that the UART is signalling that there is a receive interrupt
pending, with the FIFO level being below the trigger level but that
the characters have been in the FIFO long enough that they should get
picked up.
When the serial driver reads all of the characters from the receive
buffer, by checking UART_LSR for the UART_LSR_DR bit, and if it is
set, reading from the receive buffer via serial_in(up, UART_RX), that
should clear the IIR register of the receive interrupt.
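A rough model of that receive path, for illustration only. The struct and the fake_* accessors are hypothetical stand-ins for the real register reads (serial_in(up, UART_LSR) and serial_in(up, UART_RX)); the max argument is an assumed safety cap for the sketch.

```c
#include <stdint.h>

#define UART_LSR_DR 0x01 /* Receiver data ready */

/* Hypothetical in-memory model of a 16-byte receive FIFO; not kernel code. */
struct fake_uart {
    uint8_t fifo[16];
    int head;
    int count;
};

/* Model of reading LSR: DR is set only while the FIFO holds data. */
static uint8_t fake_read_lsr(const struct fake_uart *u)
{
    return u->count > 0 ? UART_LSR_DR : 0;
}

/* Model of reading the RX register: pops one character from the FIFO. */
static uint8_t fake_read_rx(struct fake_uart *u)
{
    uint8_t c = u->fifo[u->head];

    u->head = (u->head + 1) % 16;
    u->count--;
    return c;
}

/* Read characters while DR is set, up to max; returns how many were read. */
static int drain_rx(struct fake_uart *u, uint8_t *out, int max)
{
    int n = 0;

    while (n < max && (fake_read_lsr(u) & UART_LSR_DR))
        out[n++] = fake_read_rx(u);
    return n;
}
```

The point of the model is that DR is the only signal the loop has for "FIFO empty". A UART that keeps a receive interrupt asserted while DR stays clear gives this loop nothing to read, so the interrupt is never cleared.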
> At next iteration it had 0x60 in LSR and again nothing is read or sent
> out. This continues till we see that "too much work".
>
> [ 480.972526] BUG1027: I350: 1571:0xcc 1551:0x60
Yep, buggy UART. Unfortunately there are a lot of crappy
reimplementations of the 8250/16550A UARTs out there. :-(
Which is precisely why we have the "too much work" safety check. If
we didn't, your system would be locked up forever, looping in the
serial driver. There is, alas, a lot of crappy hardware out there,
which is why the serial driver was coded so defensively.
In any case, the combination of LSR with the UART_LSR_DR bit clear and IIR
set to 0xcc is one of these "this should never happen, under any
circumstances, with a correctly implemented 8250/16550A compatible
UART" situations. Which is why I can't really tell you how to reset the UART,
since at least in theory, this should never happen, and if it does
happen, without having access to the hardware and doing a lot of
frustrating testing to reverse engineer the exact nature of the buggy
hardware, it's hard to figure out how to work around the brain damage.
Which is one of the reasons I was happy to get out of the serial
driver maintenance business. :-)
Cheers,
- Ted
* Re: UART_IIR_BUSY set for 16550A
From: Prasad Koya @ 2014-05-27 23:33 UTC (permalink / raw)
To: Theodore Ts'o; +Cc: linux-serial, gregkh
Hi Ted,
http://www.lammertbies.nl/comm/info/serial-uart.html says IIR of 0xXC
=> "Character timeout (16550)".
http://en.wikibooks.org/wiki/Serial_Programming/8250_UART_Programming
says those bits mean "timeout interrupt pending".
From the Intel manual:
http://www.intel.com/content/dam/www/public/us/en/documents/datasheets/intel-communications-chipset-89xx-series-datasheet.pdf
section 4.13.4.4.3 is about "Character Timeout Interrupt".
=====
When the receiver FIFO and receiver time out interrupt are enabled, a character
timeout interrupt occurs when all of the following conditions exist:
• At least one character is in the FIFO.
• The last received character was longer than four continuous character times ago
  (if two stop bits are programmed the second one is included in this time delay).
• The most recent processor read of the FIFO was longer than four continuous
  character times ago.
• The receive FIFO trigger level is greater than one.
The maximum time between a received character and a timeout interrupt is 160 ms at
300 baud with a 12-bit receive character (for example, one start, eight data, one
parity, and two stop bits).
When a time out interrupt occurs, it is cleared and the timer is reset when the
processor reads one character from the receiver FIFO. If a timeout interrupt has not
occurred, the timeout timer is reset after a new character is received or after the
processor reads the receiver FIFO.
====
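The 160 ms figure in the excerpt follows from four character times of 12 bits each at 300 baud: 4 * 12 / 300 = 0.16 s. A small sketch of that arithmetic, with a hypothetical helper name and the result in microseconds so that faster rates do not round to zero:

```c
/*
 * Character-timeout delay in microseconds: four character times at the
 * given baud rate, where bits_per_char counts start, data, parity and
 * stop bits.
 */
static unsigned long char_timeout_us(unsigned long bits_per_char,
                                     unsigned long baud)
{
    return (4UL * bits_per_char * 1000000UL) / baud;
}
```

For comparison, a 10-bit character (8N1) at 115200 baud gives about 347 microseconds by the same formula.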
From what I understand, there is a character in the recv FIFO and no other
characters have been received in the last 4 character times. So why wouldn't
that set UART_LSR_DR?
#define UART_LSR_DR 0x01 /* Receiver data ready */
If my understanding is correct and there is a byte in the recv FIFO to be
read, why isn't the driver coded to pick up that byte if IIR is 0xcc?
Maybe not all 16550A-compatible UARTs do this, and that's why it's
left out?
In fact, if I type a char on the console and let it go idle, I'm seeing the IIR
register as 0xCC and LSR as 0x61. Since bit 0 of LSR is set, that byte
is getting picked up. So I wonder why, at a random time, the UART sets IIR
to 0xCC and leaves LSR as 0, and then LSR becomes 0x60 after about 350
iterations in that loop and stays that way. For a buggy UART like
that, it sounds like one could use that condition as an exception to go
ahead and read the receive buffer. What do you say?
Thank you.
* Re: UART_IIR_BUSY set for 16550A
From: Theodore Ts'o @ 2014-05-28 5:03 UTC (permalink / raw)
To: Prasad Koya; +Cc: linux-serial, gregkh
On Tue, May 27, 2014 at 04:33:58PM -0700, Prasad Koya wrote:
>
> From what I understand, there is a character in recv FIFO and no other
> characters have been received in last 4 chars time. So why wouldn't
> that set UART_LSR_DR?
>
> #define UART_LSR_DR 0x01 /* Receiver data ready */
It is supposed to set UART_LSR_DR. But your own debugging printk's
have shown that it doesn't in some cases. Hence my assertion that you
have a buggy UART.
> If my understanding is correct and there is a byte in recv FIFO to be
> read, why isn't the driver coded to pick up that byte if IIR is 0xcc.
> maybe not all 16550A compatible UARTs don't do this and thats why its
> left out?
The problem is you have to sample UART_LSR_DR to determine when the
FIFO is empty, because the receive interrupt bit being set in the IIR
doesn't tell you how many characters are in the receive FIFO. So you
have to trust the UART_LSR_DR to tell you when the receive FIFO is empty.
Now, I suppose that the first time through the loop, if UART_LSR_DR is
clear, you could try reading anyway, but that means adding a lot of
extra complexity that historically has never been needed.
What UART is this, and is there some way we can shame the manufacturer
into fixing it? It would be a shame to have to put in even more hair
just for one outlier. Now, if some major manufacturer is shipping
huge numbers of buggy UARTs, maybe we should work around it --- but
at this point, I think it would be useful to understand who the guilty
party might be, and whether this is a systemic problem, or whether
your specific chip is buggy.
In general, making changes to the uart core is always fragile, because
a workaround for one buggy manufacturer could introduce problems for
other buggy UARTs....
> Infact, if i type a char on console and let it go idle, I'm seeing IIR
> register as 0xCC and LSR as 0x61. Since bit 0 of LSR is set, that byte
> is getting picked up. So I wonder why at a random time, UART sets IIR
> as 0xCC and leaves LSR as 0 and LSR becomes 0x60 after about 350
> iterations in that loop and stays that way. For a buggy UART like
> that, sounds like one could use that condition as exception to go
> ahead and read the receive buffer. What do you say?
The other question is we don't know whether it's the IIR which is
buggy, or the DR bit which is buggy. Maybe the receive FIFO really is
empty, but it's the transmit interrupt which is stuck. So we don't
know whether the right thing to do is to read from the RX register and
put it into the buffer. It could be that this doesn't help at all, and
just causes a stream of nulls, or garbage, or repeats of the last
character read from the FIFO to jam up the incoming tty receive buffer.
So if you were going to implement something which says, "ignore the DR
bit being clear, just read from the FIFO anyway, because the IIR tells
me so", there had better be some limiter where if the IIR doesn't
change even after you try reading from the receive buffer, at some
point you really do want to give up.
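One way such a limiter could look, as a sketch under stated assumptions rather than a proposed patch. Everything here is hypothetical: the fake_uart model, the FORCED_READ_LIMIT value, and the read_clears_iir flag, which stands in for the unknown question of whether a forced read actually helps on the real hardware.

```c
#include <stdbool.h>
#include <stdint.h>

#define UART_LSR_DR       0x01 /* Receiver data ready */
#define IIR_RX_TIMEOUT    0x0c /* bits 3..1 = 110, bit 0 clear (illustrative name) */
#define FORCED_READ_LIMIT 4    /* hypothetical cap on forced reads */

/* Hypothetical register model; not a kernel structure. */
struct fake_uart {
    uint8_t iir;
    uint8_t lsr;
    bool read_clears_iir; /* models whether a forced RX read helps */
    int forced_reads;     /* counts reads issued despite DR being clear */
};

/* Model of a forced RX read; the returned character may be garbage. */
static uint8_t fake_read_rx(struct fake_uart *u)
{
    u->forced_reads++;
    if (u->read_clears_iir)
        u->iir = 0xc1; /* FIFOs enabled, no interrupt pending */
    return 0;
}

/*
 * While IIR claims a character timeout and DR is clear, try a bounded
 * number of forced reads. Returns true if the condition went away,
 * false if the limit was reached and the caller should give up.
 */
static bool try_forced_rx(struct fake_uart *u)
{
    for (int i = 0; i < FORCED_READ_LIMIT; i++) {
        if ((u->iir & 0x0f) != IIR_RX_TIMEOUT || (u->lsr & UART_LSR_DR))
            return true;
        (void)fake_read_rx(u);
    }
    return (u->iir & 0x0f) != IIR_RX_TIMEOUT;
}
```

The design choice is that the number of forced reads is bounded regardless of what the hardware reports, so a UART whose IIR never changes costs a handful of reads and then falls back to giving up.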
Basically, the best way to program the serial driver is very
defensively. Assume that the UART firmware is written by monkeys, and
malicious monkeys at that. Because sooner or later, you will come
across some UART which really is crappier than you know or can imagine....
Cheers,
- Ted
* Re: UART_IIR_BUSY set for 16550A
From: Prasad Koya @ 2014-05-28 5:47 UTC (permalink / raw)
To: Theodore Ts'o; +Cc: linux-serial, gregkh
Hi Ted,
This UART is on an Intel board. Here is the data sheet I referred to earlier:
http://www.intel.com/content/dam/www/public/us/en/documents/datasheets/intel-communications-chipset-89xx-series-datasheet.pdf
We also ship AMD-based boards that have an integrated UART, but we
haven't seen this on the AMD boards.
Yes, the code I am talking about could be something like this, enabled
only as a last resort:
    if (pass_counter == (PASS_LIMIT - 1)) {
        /* last pass: IIR reports 0xcc but DR is clear, so force a read */
        if (iir == 0xcc && !(status & UART_LSR_DR))
            status |= UART_LSR_DR;
    }
and let it fall through:
    if (status & (UART_LSR_DR | UART_LSR_BI))
        status = serial8250_rx_chars(up, status);
If the character received is garbage, I'm thinking it should be caught by
our test scripts.
It's hacky. I am trying to be optimistic: I'll give that a try and see if it helps.
If the char received is garbage then, as you hinted, it must be that
transmit ready interrupt that is not getting set. In the first
iteration of this interrupt handling (at the random time where it
eventually hits PASS_LIMIT), LSR is 0x21 and it always received 2
chars and sent 1 char. This has been consistent in all occurrences of
the bug.
When this happens I'm also wondering if 'stty' can help in resetting the UART.
Thanks again.
* Re: UART_IIR_BUSY set for 16550A
From: Theodore Ts'o @ 2014-05-29 6:18 UTC (permalink / raw)
To: Prasad Koya; +Cc: linux-serial, gregkh
On Tue, May 27, 2014 at 10:47:05PM -0700, Prasad Koya wrote:
>
> if char received is garbage then as you hinted it must be that
> transmit ready interrupt that is not getting set.
This was a typo on my part. What I meant to say was that it is the
"receive ready interrupt" (not transmit) getting set when it shouldn't
be.
That is, the fact that DR bit was clear was indeed correct --- the
receive FIFO really was empty, but for some reason some other part of
the UART was convinced that it should be trying to trigger a receive
interrupt, hence the value in IIR.
In that case, if you force the DR bit to be set, then you'll be
looping forever loading garbage into the tty flip buffer until it
overflows. This is why I said you should put in a counter that only
does this workaround a limited number of times, at which point you
emit a printk saying "Buggy Intel UART, abandon all hope", and
then submit a patch to the kernel cc'ing someone from Intel, in the
hopes that they will fix the bug....
Cheers,
- Ted