* UART_IIR_BUSY set for 16550A @ 2014-05-23 17:08 Prasad Koya 2014-05-25 0:36 ` Theodore Ts'o 0 siblings, 1 reply; 10+ messages in thread From: Prasad Koya @ 2014-05-23 17:08 UTC (permalink / raw) To: linux-serial; +Cc: gregkh Hi, I don't see anyone in the kernel using the UART_IIR_BUSY bit except the DesignWare serial driver. We are using the 8250 driver for our 16550A, and occasionally we see UART_IIR_BUSY set; soon after that the console is hosed. In what situations is this bit set? I don't see much documentation for this. I'd appreciate any help. Thank you.
* Re: UART_IIR_BUSY set for 16550A 2014-05-23 17:08 UART_IIR_BUSY set for 16550A Prasad Koya @ 2014-05-25 0:36 ` Theodore Ts'o 2014-05-25 1:22 ` Prasad Koya 0 siblings, 1 reply; 10+ messages in thread From: Theodore Ts'o @ 2014-05-25 0:36 UTC (permalink / raw) To: Prasad Koya; +Cc: linux-serial, gregkh On Fri, May 23, 2014 at 10:08:49AM -0700, Prasad Koya wrote: > > I don't see anyone in kernel using UART_IIR_BUSY bit except Designware > serial driver. We are using 8250 driver for our 16550A and > occasionally we see UART_IIR_BUSY set and soon after that console is > hosed. In what situations is this bit set? I don't see much > documentation for this. UART_IIR_BUSY is not a bit; it's a magic bit pattern, which I believe is a DesignWare-specific hack. As far as standard 8250-compatible UARTs are concerned, if the low bit (bit 0) is set in the IIR register, there are no interrupts pending, and so you shouldn't need to check the 0x06 bits (i.e., bits 1 and 2). - Ted
* Re: UART_IIR_BUSY set for 16550A 2014-05-25 0:36 ` Theodore Ts'o @ 2014-05-25 1:22 ` Prasad Koya 2014-05-25 2:44 ` Theodore Ts'o 0 siblings, 1 reply; 10+ messages in thread From: Prasad Koya @ 2014-05-25 1:22 UTC (permalink / raw) To: Theodore Ts'o; +Cc: linux-serial, gregkh Thanks for looking into this. With 16550A, I'm seeing this weird issue with 3.4 kernel. At random times 8250 driver reads 0xcc out of IIR. I'm not sure why bit 2 is set. #define UART_IIR_BUSY 0x07 /* DesignWare APB Busy Detect */ Soon after this I'm running into "serial8250: too much work for irq4". And this is printed after iterating 512 times in 8250_interrupt handler. This message is printed one more time right after this and it appears that console does not work after those messages. I was suspicious about that 'busy detect' bit. Am trying to reproduce this and see what is in LCR when this hits. Can I (or how do I) reset the device if I see this bit set? On Sat, May 24, 2014 at 5:36 PM, Theodore Ts'o <tytso@mit.edu> wrote: > On Fri, May 23, 2014 at 10:08:49AM -0700, Prasad Koya wrote: >> >> I don't see anyone in kernel using UART_IIR_BUSY bit except Designware >> serial driver. We are using 8250 driver for our 16550A and >> occasionally we see UART_IIR_BUSY set and soon after that console is >> hosed. In what situations is this bit set? I don't see much >> documentation for this. > > UART_IIR_BUSY is not a bit, it's a magic bit pattern, which is I > believe a Designware-specific hack. As far as standard > 8250-compatible UART's are concerned, if the low bit (bit 0) is set in > the IIR register, there are no interrupts pending, and so you > shouldn't need to check the 0x06 bits (i.e., bits 1 and 2). > > - Ted ^ permalink raw reply [flat|nested] 10+ messages in thread
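The distinction between a single bit and a bit pattern can be checked directly. Below is a minimal sketch in C, using only the two values discussed in this thread (bit 0 as the "no interrupt pending" flag and 0x07 as UART_IIR_BUSY). The function names are hypothetical, and the whole-pattern comparison is an assumption about how a DesignWare-style driver would test for busy, not code copied from the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

#define UART_IIR_NO_INT 0x01 /* bit 0 set: no interrupt pending */
#define UART_IIR_BUSY   0x07 /* DesignWare APB Busy Detect (a pattern, not a bit) */

/* Standard 8250 rule from the thread: an interrupt is pending only if bit 0 is clear. */
bool iir_interrupt_pending(uint8_t iir)
{
    return (iir & UART_IIR_NO_INT) == 0;
}

/* Assumed busy test: all three low bits must match the pattern at once. */
bool iir_is_dw_busy(uint8_t iir)
{
    return (iir & UART_IIR_BUSY) == UART_IIR_BUSY;
}
```

With the reported value, iir_interrupt_pending(0xcc) is true and iir_is_dw_busy(0xcc) is false: 0xcc & 0x07 is 0x04, so bit 2 on its own does not form the busy pattern.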
* Re: UART_IIR_BUSY set for 16550A 2014-05-25 1:22 ` Prasad Koya @ 2014-05-25 2:44 ` Theodore Ts'o 2014-05-25 6:21 ` Prasad Koya 0 siblings, 1 reply; 10+ messages in thread From: Theodore Ts'o @ 2014-05-25 2:44 UTC (permalink / raw) To: Prasad Koya; +Cc: linux-serial, gregkh On Sat, May 24, 2014 at 06:22:02PM -0700, Prasad Koya wrote: > Thanks for looking into this. > > With 16550A, I'm seeing this weird issue with 3.4 kernel. At random > times 8250 driver reads 0xcc out of IIR. I'm not sure why bit 2 is > set. The high two bits mean the FIFO is enabled -- so that's the 0xCX bits. The 0x0C bits mean that there is an interrupt pending (the low bit is 0). Bit 2 means that data is available in the FIFO: #define UART_IIR_RDI 0x04 /* Receiver data interrupt */ Not that this matters; in the 8250 driver we simply check to see if the UART_IIR_NO_INT bit is not set, and then instead of actually checking the rest of the IIR register, we just check (a) if there are incoming characters to read, (b) if the transmit FIFO has room available and we have characters waiting to be sent, or (c) if the modem status lines have changed and we care about that. > Soon after this I'm running into "serial8250: too much work for irq4". > And this is printed after iterating 512 times in 8250_interrupt > handler. This message is printed one more time right after this and it > appears that console does not work after those messages. I was > suspicious about that 'busy detect' bit. Am trying to reproduce this > and see what is in LCR when this hits. Can I (or how do I) reset the > device if I see this bit set? So what this means is that the serial port is apparently continuously active. Because legacy ISA bus interrupts were edge triggered, we needed to make sure that all of the sources of interrupts for that irq have been cleared before we return.
To do this, we check all of the UARTs associated with the irq (you should check and see if you have more than one serial port associated with the irq) and only return once all of the UARTs report that they are not ready (i.e., that we've serviced all possible receive, transmit, and modem status register changes). But if the UARTs are constantly reporting lots of work, as a safety measure so that we don't completely hang the kernel, we check the PASS_LIMIT and if that gets exceeded we print the "too much work" message and break out. On ISA bus systems, this could cause the interrupt to no longer signal. To prevent this, there was a backup serial timeout that would allow the system to automatically recover. None of this should be necessary on modern systems. I do see this message using KVM, with a virtual serial console which is faster than any real RS-232 port, so it's possible to trigger the "too much work" message. But since any modern/sane bus uses level-triggered interrupts, and KVM emulates a sane bus, the fact that we exit via the "too much work" path doesn't cause the interrupt to go dead. If you are seeing the serial console go dead after this message, it implies that you might have an edge-triggered interrupt. But if that's true, I'd call this a case of "the 1980's are calling and they want their crappy ISA bus back".... - Ted
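The pass-limit safety check described above can be modelled in a few lines of C. This is a simplified sketch, not the kernel's handler: it covers a single port, the read_iir_fn callback and both fake UARTs are hypothetical stand-ins for real register reads, and only the 512-iteration cap and the meaning of bit 0 come from this thread.

```c
#include <stdbool.h>
#include <stdint.h>

#define PASS_LIMIT 512   /* iteration cap quoted in the thread for the 3.4 kernel */
#define IIR_NO_INT 0x01  /* bit 0 set: nothing pending */

/* Hypothetical stand-in for reading the IIR register on a given pass. */
typedef uint8_t (*read_iir_fn)(int pass);

/* Loop until the UART reports idle or the cap is hit; returns passes made. */
int service_irq(read_iir_fn read_iir, bool *gave_up)
{
    int pass = 0;

    *gave_up = false;
    for (;;) {
        uint8_t iir = read_iir(pass);

        if (iir & IIR_NO_INT)
            break;               /* UART is idle, safe to return */
        /* a real handler would service rx, tx and modem status here */
        if (++pass >= PASS_LIMIT) {
            *gave_up = true;     /* the "too much work" case */
            break;
        }
    }
    return pass;
}

/* Well-behaved model: one pending interrupt, then idle. */
uint8_t healthy_uart(int pass) { return pass < 1 ? 0xc2 : 0xc1; }

/* Stuck model: claims a pending interrupt forever. */
uint8_t stuck_uart(int pass) { (void)pass; return 0xcc; }
```

Calling service_irq(healthy_uart, &gave_up) returns after one pass, while service_irq(stuck_uart, &gave_up) runs all PASS_LIMIT passes and sets gave_up. Without the cap, the second call would never return, which is the hang the safety check exists to prevent.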
* Re: UART_IIR_BUSY set for 16550A 2014-05-25 2:44 ` Theodore Ts'o @ 2014-05-25 6:21 ` Prasad Koya 2014-05-25 12:04 ` Theodore Ts'o 0 siblings, 1 reply; 10+ messages in thread From: Prasad Koya @ 2014-05-25 6:21 UTC (permalink / raw) To: Theodore Ts'o; +Cc: linux-serial, gregkh In our systems, the serial port interrupt is not shared with any other device. In the first iteration, I see [ 480.972099] BUG1027: I0: 1571:0xc2 1551:0x21 1449:2 1492:1 That is, IIR is 0xc2 and LSR is 0x21, and it read 2 chars in that iteration and sent 1 byte of data. Since the interrupt handler services all ports before it returns, in the next iteration it sees: [ 480.972102] BUG1027: I1: 1571:0xcc 1551:0x0 and it continues to see that until iteration 349. Nothing was read from the FIFO or transmitted from iteration 1 to 349. [ 480.972525] BUG1027: I349: 1571:0xcc 1551:0x0 At the next iteration it had 0x60 in LSR and again nothing is read or sent out. This continues until we see that "too much work". [ 480.972526] BUG1027: I350: 1571:0xcc 1551:0x60 : [ 480.972737] serial8250: too much work for irq4 #define UART_LSR_TEMT 0x40 /* Transmitter empty */ #define UART_LSR_THRE 0x20 /* Transmit-hold-register empty */ After it exits the interrupt handler above, on the next invocation IIR_NO_INT is still 0 and LSR reads 0x60 for the whole PASS_LIMIT iterations. [ 480.975458] BUG1027: I0: 1571:0xcc 1551:0x60 So the "too much work" message happens twice back to back, and only once, at a random time. In our case the serial console ports on our systems are connected to a serial concentrator. Like the KVM situation you mentioned, is it possible our serial port concentrator is behaving badly? In 2.6.38 this PASS_LIMIT is 256. I'll also check with our h/w lab admin to see if there is anything special about the serial port concentrator. Thanks again. On Sat, May 24, 2014 at 7:44 PM, Theodore Ts'o <tytso@mit.edu> wrote: > On Sat, May 24, 2014 at 06:22:02PM -0700, Prasad Koya wrote: >> Thanks for looking into this.
>> >> With 16550A, I'm seeing this weird issue with 3.4 kernel. At random >> times 8250 driver reads 0xcc out of IIR. I'm not sure why bit 2 is >> set. > > The high two bits mean the FIFO enabled -- so that's the 0xCX bits. > The 0x0C bits means that there is an interrupt pending (the low bit is > 0). Bit 2 means that data is available in the FIFO: > > #define UART_IIR_RDI 0x04 /* Receiver data interrupt */ > > Not that this matters; in the 8250 driver we simply check to see if > the UART_IIR_NO_INT bit is not set, and then instead of actually > checking the rest of the IIR register, we just check (a) if there is > incoming characters to read, (b) if the transmit FIFO has room > available and we have characters waiting to be sent, or (c) if the > modem status lines have changed and we care about that. > >> Soon after this I'm running into "serial8250: too much work for irq4". >> And this is printed after iterating 512 times in 8250_interrupt >> handler. This message is printed one more time right after this and it >> appears that console does not work after those messages. I was >> suspicious about that 'busy detect' bit. Am trying to reproduce this >> and see what is in LCR when this hits. Can I (or how do I) reset the >> device if I see this bit set? > > So what this means is that the serial port is apparently continuously > active. Because legacy ISA bus interrupts were edge triggered we > needed to make sure the all of the sources of interrupts for that irq > have been cleared before we return. To do this, we check all of the > UART's assocated with the irq (you should check and see if you have > more than one serial port associated with the irq) and only return > once all of the UART's report that they are not ready (i.e., that > we've serviced all possible receive, transmit, and modem status > register changes). 
But if the UART's are constantly reporting lots of > work, as a safety measure so that we don't completely hang the kernel, > we check the PASS_LIMIT and if that gets exceeded we print the "too > much work" message and break out. On ISA bus systems, this could > cause the interrupt to no longer signal. To prevent this, there was a > backup serial timeout that would allow the system to automatically recover. > > None of this should be necessary on modern systems. I do see this > message using KVM, with a virtual serial console which is faster than > any real RS-232 port, so it's possible to trigger the "too much work" > message. But since any modern/sane bus uses level-triggered > interrupts, and KVM emulates a sane bus, the fact that we exit via the > "too much work" interrupt doesn't cause the interrupt to go dead. > > If you are seeing the serial console go dead after this message, it > implies that you might have an edge-triggered interupt. But if that's > true, I'd call this a case of "the 1980's are calling and they want > their crappy ISA bus back".... > > - Ted ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: UART_IIR_BUSY set for 16550A 2014-05-25 6:21 ` Prasad Koya @ 2014-05-25 12:04 ` Theodore Ts'o 2014-05-27 23:33 ` Prasad Koya 0 siblings, 1 reply; 10+ messages in thread From: Theodore Ts'o @ 2014-05-25 12:04 UTC (permalink / raw) To: Prasad Koya; +Cc: linux-serial, gregkh On Sat, May 24, 2014 at 11:21:38PM -0700, Prasad Koya wrote: > IIR as 0xc2 and LSR as 0x21 and it read 2 chars in that iteration and > sent 1 byte of data. > > Since the interrupt handler services all ports before it returns, in > next iteration it sees: > > [ 480.972102] BUG1027: I1: 1571:0xcc 1551:0x0 > > and it continues to see that till iteration 349. and nothing was read > from FIFO or transmitted from iteration 1 to 349. If there's nothing in the receive FIFO to read, and assuming the serial driver is correctly determining that fact (i.e., UART_LSR_DR is zero), then something is buggy with your UART. If IIR is 0xcc, that means that the UART is signalling that there is a receive interrupt pending, with the FIFO level being below the trigger level but that the characters have been in the FIFO long enough that they should get picked up. When the serial driver reads all of the characters from the receive buffer, by checking UART_LSR for the UART_LSR_DR bit and, while it is set, reading from the receive buffer via serial_in(up, UART_RX), that should clear the IIR register of the receive interrupt. > At next iteration it had 0x60 in LSR and again nothing is read or sent > out. This continues till we see that "too much work". > > [ 480.972526] BUG1027: I350: 1571:0xcc 1551:0x60 Yep, buggy UART. Unfortunately there are a lot of crappy reimplementations of the 8250/16550A UARTs out there. :-( Which is precisely why we have the "too much work" safety check. If we didn't, your system would be locked up forever, looping in the serial driver. There is, alas, a lot of crappy hardware out there, which is why the serial driver was coded so defensively.
In any case, the combination of LSR with the UART_LSR_DR bit clear and IIR set to 0xcc is one of these "this should never happen, under any circumstances, with a correctly implemented 8250/16550A compatible UART". Which is why I can't really tell you how to reset the UART, since at least in theory, this should never happen, and if it does happen, without having access to the hardware and doing a lot of frustrating testing to reverse engineer the exact nature of the buggy hardware, it's hard to figure out how to work around the brain damage. Which is one of the reasons I was happy to get out of the serial driver maintenance business. :-) Cheers, - Ted
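The receive path described in this message, and the "should never happen" state, can be sketched against a toy model. Everything named fake_* below is a hypothetical stand-in for real register access; the only values taken from the thread are UART_LSR_DR (0x01) and the low nibble 0x0c seen in the reported IIR value 0xcc. The inconsistency check deliberately looks only at that one pattern.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define UART_LSR_DR 0x01 /* Receiver data ready (value quoted in the thread) */

/* Hypothetical model of a receive FIFO; not a kernel structure. */
struct fake_uart {
    const uint8_t *fifo;
    size_t len;
    size_t pos;
};

/* DR is set exactly while unread bytes remain, as a correct UART would do. */
uint8_t fake_read_lsr(const struct fake_uart *u)
{
    return (u->pos < u->len) ? UART_LSR_DR : 0;
}

uint8_t fake_read_rx(struct fake_uart *u)
{
    return u->fifo[u->pos++];
}

/* Drain: read RX only while LSR reports data ready. Returns bytes copied. */
size_t drain_rx(struct fake_uart *u, uint8_t *out, size_t cap)
{
    size_t n = 0;

    while (n < cap && (fake_read_lsr(u) & UART_LSR_DR))
        out[n++] = fake_read_rx(u);
    return n;
}

/* The contradictory pairing from the logs: IIR low nibble 0x0c with DR clear. */
bool rx_state_inconsistent(uint8_t iir, uint8_t lsr)
{
    return (iir & 0x0f) == 0x0c && !(lsr & UART_LSR_DR);
}
```

Against the logged values, rx_state_inconsistent(0xcc, 0x00) and rx_state_inconsistent(0xcc, 0x60) are both true, while the first iteration (0xc2, 0x21) is not flagged.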
* Re: UART_IIR_BUSY set for 16550A 2014-05-25 12:04 ` Theodore Ts'o @ 2014-05-27 23:33 ` Prasad Koya 2014-05-28 5:03 ` Theodore Ts'o 0 siblings, 1 reply; 10+ messages in thread From: Prasad Koya @ 2014-05-27 23:33 UTC (permalink / raw) To: Theodore Ts'o; +Cc: linux-serial, gregkh Hi Ted, http://www.lammertbies.nl/comm/info/serial-uart.html says IIR of 0xXC => "Character timeout (16550)". http://en.wikibooks.org/wiki/Serial_Programming/8250_UART_Programming says those bits mean "timeout interrupt pending". from intel manual: http://www.intel.com/content/dam/www/public/us/en/documents/datasheets/intel-communications-chipset-89xx-series-datasheet.pdf section 4.13.4.4.3 is about "Character Timeout Interrupt". ===== When the receiver FIFO and receiver time out interrupt are enabled, a character timeout interrupt occurs when all of the following conditions exist: • At least one character is in the FIFO. • The last received character was longer than four continuous character times ago (if two stop bits are programmed the second one is included in this time delay). • The most recent processor read of the FIFO was longer than four continuous character times ago. • The receive FIFO trigger level is greater than one. The maximum time between a received character and a timeout interrupt is 160 ms at 300 baud with a 12-bit receive character (for example, one start, eight data, one parity, and two stop bits). When a time out interrupt occurs, it is cleared and the timer is reset when the processor reads one character from the receiver FIFO. If a timeout interrupt has not occurred, the timeout timer is reset after a new character is received or after the processor reads the receiver FIFO. ==== From what I understand, there is a character in recv FIFO and no other characters have been received in last 4 chars time. So why wouldn't that set UART_LSR_DR? 
#define UART_LSR_DR 0x01 /* Receiver data ready */ If my understanding is correct and there is a byte in the recv FIFO to be read, why isn't the driver coded to pick up that byte if IIR is 0xcc? Maybe not all 16550A-compatible UARTs do this, and that's why it's left out? In fact, if I type a char on the console and let it go idle, I'm seeing the IIR register as 0xCC and LSR as 0x61. Since bit 0 of LSR is set, that byte is getting picked up. So I wonder why, at a random time, the UART sets IIR to 0xCC and leaves LSR as 0, and LSR becomes 0x60 after about 350 iterations in that loop and stays that way. For a buggy UART like that, it sounds like one could use that condition as an exception to go ahead and read the receive buffer. What do you say? Thank you. On Sun, May 25, 2014 at 5:04 AM, Theodore Ts'o <tytso@mit.edu> wrote: > On Sat, May 24, 2014 at 11:21:38PM -0700, Prasad Koya wrote: >> IIR as 0xc2 and LSR as 0x21 and it read 2 chars in that iteration and >> sent 1 byte of data. >> >> Since the interrupt handler services all ports before it returns, in >> next iteration it sees: >> >> [ 480.972102] BUG1027: I1: 1571:0xcc 1551:0x0 >> >> and it continues to see that till iteration 349. and nothing was read >> from FIFO or transmitted from iteration 1 to 349. > > If there's nothing in the receive FIFO to read, and assuming the > serial driver is correctly determining that fact (i.e., UART_LSR_DR is > zero), then something is buggy with your UART. If IIR is 0xcc, that > means that the UART is signalling that there is a receive interrupt > pending, with the FIFO level being below the trigger level but that > the characters have been in the FIFO long enough that they should get > picked up. > > When the serial driver reads all of the characters from the receive > buffer, by checking UART_LSR for the UART_LSR_DR bit, and if it is > set, reading from the receive buffer via serial_in(up, UART_RX), that > should clear the IIR register of the receive interrupt.
> >> At next iteration it had 0x60 in LSR and again nothing is read or sent >> out. This continues till we see that "too much work". >> >> [ 480.972526] BUG1027: I350: 1571:0xcc 1551:0x60 > > Yep, buggy UART. Unfortunately there are lot of crappy > reimplementations of the 8250/16550A UART's out there. :-( > > Which is precisely why we have the "too much work" safety check. If > we didn't, your system would be locked up forever, looping in the > serial driver. There is, alas, a lot of crappy hardware out there, > which is why the serial driver was coded so defensively. > > In any case, combination of LSR with the UART_LSR_DR bit clear and IIR > set to 0xcc is one of these "this should never happen, under any > circumstances, with a correctly implemented 8250/16550A compatible > UART". Which is why I can't really tell you how to reset the UART, > since at least in theory, this should never happen, and if it does > happen, without having access to the hardware and doing a lot of > frustating testing to reverse engineer the exact nature of the buggy > hardware, it's hard to figure out how to work around the brain damage. > Which is one of the reasons I was happy to get out of the serial > driver maintenance business. :-) > > Cheers, > > - Ted -- To unsubscribe from this list: send the line "unsubscribe linux-serial" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 10+ messages in thread
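The 160 ms figure in the datasheet excerpt quoted in this message follows from "four character times": 4 x 12 bits / 300 baud = 0.16 s. A small worked sketch in C, where the function name is hypothetical and any baud rate other than the datasheet's 300 is purely illustrative (the thread does not say what rate the console runs at):

```c
#include <assert.h>

/* Character-timeout delay in milliseconds: four character times on the wire. */
double char_timeout_ms(unsigned baud, unsigned bits_per_char)
{
    assert(baud > 0); /* a zero baud rate has no defined character time */
    return 4.0 * bits_per_char * 1000.0 / baud;
}
```

char_timeout_ms(300, 12) gives 160.0, matching the datasheet. At an assumed 9600 baud with 10-bit characters (8N1) the same formula gives roughly 4.17 ms, so on a faster link a character that sits unread in the FIFO would raise the timeout interrupt far sooner.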
* Re: UART_IIR_BUSY set for 16550A 2014-05-27 23:33 ` Prasad Koya @ 2014-05-28 5:03 ` Theodore Ts'o 2014-05-28 5:47 ` Prasad Koya 0 siblings, 1 reply; 10+ messages in thread From: Theodore Ts'o @ 2014-05-28 5:03 UTC (permalink / raw) To: Prasad Koya; +Cc: linux-serial, gregkh On Tue, May 27, 2014 at 04:33:58PM -0700, Prasad Koya wrote: > > From what I understand, there is a character in recv FIFO and no other > characters have been received in last 4 chars time. So why wouldn't > that set UART_LSR_DR? > > #define UART_LSR_DR 0x01 /* Receiver data ready */ It is supposed to set UART_LSR_DR. But your own debugging printks have shown that it doesn't in some cases. Hence my assertion that you have a buggy UART. > If my understanding is correct and there is a byte in recv FIFO to be > read, why isn't the driver coded to pick up that byte if IIR is 0xcc. > maybe not all 16550A compatible UARTs don't do this and thats why its > left out? The problem is you have to sample UART_LSR_DR to determine when the FIFO is empty, because even when the receive interrupt bit is set in the IIR, you don't know how many characters are in the receive FIFO. So you have to trust the UART_LSR_DR to tell you when the receive FIFO is empty. Now, I suppose that if UART_LSR_DR is clear the first time through the loop, you could try reading anyway, but that means adding a lot of extra complexity that historically has never been needed. What UART is this, and is there some way we can shame the manufacturer into fixing it? It would be a shame to have to put in even more hair just for one outlier. Now, if some major manufacturer is shipping huge numbers of buggy UARTs, maybe we should work around it --- but at this point, I think it would be useful to understand who the guilty party might be, and whether this is a systemic problem, or whether your specific chip is buggy.
In general, making changes to the uart core is always fragile, because a workaround for one buggy manufacturer could introduce problems for other buggy UARTs.... > Infact, if i type a char on console and let it go idle, I'm seeing IIR > register as 0xCC and LSR as 0x61. Since bit 0 of LSR is set, that byte > is getting picked up. So I wonder why at a random time, UART sets IIR > as 0xCC and leaves LSR as 0 and LSR becomes 0x60 after about 350 > iterations in that loop and stays that way. For a buggy UART like > that, sounds like one could use that condition as exception to go > ahead and read the receive buffer. What do you say? The other question is we don't know whether it's the IIR which is buggy, or the DR bit which is buggy. Maybe the receive FIFO really is empty, but it's the transmit interrupt which is stuck. So we don't know whether the right thing to do is to read from the RX register and put it into the buffer. It could be that it might not do anything, and just cause a stream of nulls, or garbage, or the last character read from the FIFO to jam up the incoming tty receive buffer. So if you were going to implement something which says, "ignore the DR bit being clear, just read from the FIFO anyway, because the IIR tells me so", there had better be some limiter where if the IIR doesn't change even after you try reading from the receive buffer, at some point you really do want to give up. Basically, the best way to program the serial driver is very defensively. Assume that the UART firmware is written by monkeys, and malicious monkeys at that. Because sooner or later, you will come across some UART which really is crappier than you know or can imagine.... Cheers, - Ted
* Re: UART_IIR_BUSY set for 16550A 2014-05-28 5:03 ` Theodore Ts'o @ 2014-05-28 5:47 ` Prasad Koya 2014-05-29 6:18 ` Theodore Ts'o 0 siblings, 1 reply; 10+ messages in thread From: Prasad Koya @ 2014-05-28 5:47 UTC (permalink / raw) To: Theodore Ts'o; +Cc: linux-serial, gregkh Hi Ted, This UART is from an Intel board. Here is the data sheet I referred to earlier: http://www.intel.com/content/dam/www/public/us/en/documents/datasheets/intel-communications-chipset-89xx-series-datasheet.pdf We ship AMD-based boards as well that have an integrated UART, but we haven't seen this on AMD boards. Yes, the code I am talking about could be something like this, enabled only as a last resort:

    /* on the final pass, force the rx path if IIR is 0xcc but DR is clear */
    if (pass_counter == (PASS_LIMIT - 1)) {
            if (iir == 0xcc && !(status & UART_LSR_DR))
                    status |= UART_LSR_DR;
    }

and let it fall through:

    if (status & (UART_LSR_DR | UART_LSR_BI))
            status = serial8250_rx_chars(up, status);

If the character received is garbage, I'm thinking it should be caught in our test scripts. It's hacky. I am trying to be optimistic: give that a try and see if it helps. If the char received is garbage then, as you hinted, it must be that transmit ready interrupt that is not getting set. In the first iteration of this interrupt handling (at the random time where it eventually hits PASS_LIMIT), LSR is 0x21 and it always received 2 chars and sent 1 char. This has been consistent in all occurrences of the bug. When this happens, I'm also wondering if 'stty' can help in resetting the UART. Thanks again. On Tue, May 27, 2014 at 10:03 PM, Theodore Ts'o <tytso@mit.edu> wrote: > On Tue, May 27, 2014 at 04:33:58PM -0700, Prasad Koya wrote: >> >> From what I understand, there is a character in recv FIFO and no other >> characters have been received in last 4 chars time. So why wouldn't >> that set UART_LSR_DR? >> >> #define UART_LSR_DR 0x01 /* Receiver data ready */ > > It is supposed to set UART_LSR_DR. But your own debugging printk's > have shown that it doesn't in some cases.
Hence my assertion that you > have a buggy UART. > >> If my understanding is correct and there is a byte in recv FIFO to be >> read, why isn't the driver coded to pick up that byte if IIR is 0xcc. >> maybe not all 16550A compatible UARTs don't do this and thats why its >> left out? > > The problem is you have to sample UART_LSR_DR to determine when the > FIFO is empty, because just because the receive interrupt bit is set > in the IIR, you don't know how many characters are in the Receive > FIFO. So you have to trust the UART_LSR_DR to tell you when the > receive FIFO is empty. > > Now, I suppose you could check to see if the first time through the > loop, if UART_LSR_DR is clear, maybe you should try anyway, but that > means adding a lot of extra complexity that historically has never > been needed. > > What UART is this, and is there some way we can shame the manufacturer > into fixing it? It would be a shame to have to put in even more hair > just for one outlier. Now, if some major manufacturer is shippping > huge numbers of buggy UART's, maybe we should work around it --- but > at this point, I think it would be useful to understand who the guilty > party might be, and whether this is a systemic problem, or whether > your specific chip is buggy. > > In general, making chages to the uart core is always fragile, because > a workaround for one buggy manufacturer could introduce problems for > other buggy UARTs.... > >> Infact, if i type a char on console and let it go idle, I'm seeing IIR >> register as 0xCC and LSR as 0x61. Since bit 0 of LSR is set, that byte >> is getting picked up. So I wonder why at a random time, UART sets IIR >> as 0xCC and leaves LSR as 0 and LSR becomes 0x60 after about 350 >> iterations in that loop and stays that way. For a buggy UART like >> that, sounds like one could use that condition as exception to go >> ahead and read the receive buffer. What do you say? 
> > The other question is we don't know whether it's the IIR which is > buggy, or the DR bit which is buggy. Maybe the receive FIFO really is > empty, but it's the transmit interrupt which is stuck. So we don't > know whether the right thing to do is to read from the RX register and > put it into the buffer. It could be that might not do anything, and > just cause a stream of null's, or garbage, or the last character read > from the FIFO to be jam up the incoming tty receive buffer. > > So if you were going to implement something which says, "ignore the DR > bit being clear, just read from the FIFO anyway, because the IIR tells > me so, there had better be some limiter where if the IIR doesn't > change even after you try reading from the receive buffer, at some > point you really do want to give up." > > Basically, the best way to program the serial driver is very > defensively. Assume that the UART firmware is written by monkeys, and > malicious monkeys at that. Because sooner or later, you will come > across some UART which really is crappier than you know or can imagine.... > > Cheers, > > - Ted ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: UART_IIR_BUSY set for 16550A 2014-05-28 5:47 ` Prasad Koya @ 2014-05-29 6:18 ` Theodore Ts'o 0 siblings, 0 replies; 10+ messages in thread From: Theodore Ts'o @ 2014-05-29 6:18 UTC (permalink / raw) To: Prasad Koya; +Cc: linux-serial, gregkh On Tue, May 27, 2014 at 10:47:05PM -0700, Prasad Koya wrote: > > if char received is garbage then as you hinted it must be that > transmit ready interrupt that is not getting set. This was a typo on my part. What I meant to say was that it is the "receive ready interrupt" (not transmit) getting set when it shouldn't be. That is, the fact that the DR bit was clear was indeed correct --- the receive FIFO really was empty, but for some reason some other part of the UART was convinced that it should be trying to trigger a receive interrupt, hence the value in IIR. In that case, if you force the DR bit to be set, then you'll be looping forever, loading garbage into the tty flip buffer until it overflows. This is why I said you should put in a counter that only does this workaround a limited number of times, at which point you emit a printk saying "Buggy Intel UART, abandon all hope", and then submit a patch to the kernel cc'ing someone from Intel, in the hopes that they will fix the bug.... Cheers, - Ted
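The limiter suggested in this message could look roughly like the following. This is a sketch under stated assumptions, not a tested kernel patch: struct quirk_state, apply_rx_quirk() and the cap of 4 are all hypothetical, and whether forcing a read actually recovers this particular UART is exactly what the thread leaves open.

```c
#include <stdbool.h>
#include <stdint.h>

#define UART_LSR_DR      0x01 /* Receiver data ready (value quoted in the thread) */
#define FORCE_READ_LIMIT 4    /* hypothetical cap on forced reads */

/* Hypothetical per-port bookkeeping for the workaround. */
struct quirk_state {
    unsigned forced_reads;
    bool gave_up;
};

/*
 * Returns the LSR value the caller should act on. While the UART shows the
 * contradictory state (IIR low nibble 0x0c, DR clear), DR is forced on for
 * at most FORCE_READ_LIMIT consecutive calls; after that the workaround is
 * abandoned for good, so a stuck IIR cannot feed garbage in forever.
 */
uint8_t apply_rx_quirk(struct quirk_state *q, uint8_t iir, uint8_t lsr)
{
    bool stuck = (iir & 0x0f) == 0x0c && !(lsr & UART_LSR_DR);

    if (!stuck) {
        q->forced_reads = 0;   /* UART behaved: re-arm the counter */
        return lsr;
    }
    if (q->gave_up)
        return lsr;
    if (q->forced_reads >= FORCE_READ_LIMIT) {
        q->gave_up = true;     /* a real driver would printk once here */
        return lsr;
    }
    q->forced_reads++;
    return lsr | UART_LSR_DR;  /* caller falls through to the rx path */
}
```

The design choice is that giving up is sticky: once the cap is reached, later calls pass the LSR through unchanged, and the existing "too much work" check remains the last line of defence.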