public inbox for linux-serial@vger.kernel.org
 help / color / mirror / Atom feed
* RM9000 code broken in 8250.c
@ 2007-12-21 11:50 Thomas Koeller
  2007-12-21 14:23 ` Alex Williamson
  0 siblings, 1 reply; 9+ messages in thread
From: Thomas Koeller @ 2007-12-21 11:50 UTC (permalink / raw)
  To: Alex Williamson; +Cc: linux-serial, Aristeu Sergio Rozanski Filho

Hi,

while investigating breakage of the RM9000 code in
drivers/serial/8250.c, I found that my problems are
caused by commit 40b36daad0ac704e6d5c1b75789f371ef5b053c1.
My device has the characteristic of not generating a
'transmitter empty' interrupt immediately if interrupts
are enabled while the transmitter is empty, it needs to
be kicked by transmitting a character first. Before
commit 40b36daad0ac704e6d5c1b75789f371ef5b053c1 this was
taken care of by code in serial8250_startup() that probed
the hardware for this bug an set the UART_BUG_TXEN flag,
which then modified the behavior of serial8250_start_tx()
to take the bug into account. This used to work well for
my device.

The commit introduced another check for the same condition
which is done earlier in serial8250_startup(), and causes
the driver to use a periodic timer instead of the hardware
interrupt. Now my device uses that timer, while it was
perfectly capable of using the serial port transmit
interrupt before.

I wonder why a new workaround has been introduced for a
problem for which there had already been a (far superior)
solution. Didn't this existing solution solve the problem
for the hardware in question? If so, wouldn't it have been
possible to modify the existing solution instead of adding
another one that conflicts with existing code that depends
on the old code?

regards,
Thomas
-- 
Thomas Koeller
thomas@koeller.dyndns.org

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: RM9000 code broken in 8250.c
  2007-12-21 11:50 RM9000 code broken in 8250.c Thomas Koeller
@ 2007-12-21 14:23 ` Alex Williamson
  2008-02-16 17:18   ` Thomas Koeller
  0 siblings, 1 reply; 9+ messages in thread
From: Alex Williamson @ 2007-12-21 14:23 UTC (permalink / raw)
  To: Thomas Koeller; +Cc: linux-serial, Aristeu Sergio Rozanski Filho

Hi Thomas,

   Sorry this code is causing you trouble.  I tried the existing TXEN
bug code, it seems to be fighting a similar, but still quite different
bug.  The UART bug the backup timer code is trying to fix is more
asynchronous.  It might need a kick in the middle of a transmit, not
just at the beginning.  Can we re-arrange the tests such that they both
work?  It would be fine with me if we don't run the test for the backup
timer on ports with UART_BUG_TXEN already enabled.  The easiest
mechanism might be to test port.type for PORT_RM9000, and not test those
for the backup timer.  Then we don't need to re-order the code too much.
Thanks,

  Alex

On Fri, 2007-12-21 at 12:50 +0100, Thomas Koeller wrote:
> Hi,
> 
> while investigating breakage of the RM9000 code in
> drivers/serial/8250.c, I found that my problems are
> caused by commit 40b36daad0ac704e6d5c1b75789f371ef5b053c1.
> My device has the characteristic of not generating a
> 'transmitter empty' interrupt immediately if interrupts
> are enabled while the transmitter is empty, it needs to
> be kicked by transmitting a character first. Before
> commit 40b36daad0ac704e6d5c1b75789f371ef5b053c1 this was
> taken care of by code in serial8250_startup() that probed
> the hardware for this bug an set the UART_BUG_TXEN flag,
> which then modified the behavior of serial8250_start_tx()
> to take the bug into account. This used to work well for
> my device.
> 
> The commit introduced another check for the same condition
> which is done earlier in serial8250_startup(), and causes
> the driver to use a periodic timer instead of the hardware
> interrupt. Now my device uses that timer, while it was
> perfectly capable of using the serial port transmit
> interrupt before.
> 
> I wonder why a new workaround has been introduced for a
> problem for which there had already been a (far superior)
> solution. Didn't this existing solution solve the problem
> for the hardware in question? If so, wouldn't it have been
> possible to modify the existing solution instead of adding
> another one that conflicts with existing code that depends
> on the old code?
> 
> regards,
> Thomas
-- 
Alex Williamson                             HP Open Source & Linux Org.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: RM9000 code broken in 8250.c
  2007-12-21 14:23 ` Alex Williamson
@ 2008-02-16 17:18   ` Thomas Koeller
  2008-02-17  6:09     ` Alex Williamson
  0 siblings, 1 reply; 9+ messages in thread
From: Thomas Koeller @ 2008-02-16 17:18 UTC (permalink / raw)
  To: Alex Williamson; +Cc: linux-serial, Aristeu Sergio Rozanski Filho

Hi Alex,

I apologize for the long delay, I had to sort out a couple of things first,
but I am now ready to return to the issue.

>    Sorry this code is causing you trouble.  I tried the existing TXEN
> bug code, it seems to be fighting a similar, but still quite different
> bug.  The UART bug the backup timer code is trying to fix is more
> asynchronous.  It might need a kick in the middle of a transmit, not
> just at the beginning.

Is the test code you added really capable of detecting this kind of behavior?
To me it does not seem to be - it looks as if it only checks whether the
transmitter empty interrupt is raised when that interrupt is enabled
while the transmitter holding register is empty. That is the same
condition the UART_BUG_TXEN code checks for.

> Can we re-arrange the tests such that they both 
> work?  It would be fine with me if we don't run the test for the backup
> timer on ports with UART_BUG_TXEN already enabled.  The easiest
> mechanism might be to test port.type for PORT_RM9000, and not test those
> for the backup timer.  Then we don't need to re-order the code too much.

I do not think that is an option. The UART_BUG_TXEN code has been there
before I added support for the RM9000, so it probably has more users.

Looking at your description of the bug you are trying to deal with
(interrupts cease to be generated randomly in the middle of a transmission),
I wonder if it is at all possible to write a runtime check that reliably
can detect it. Wouldn't a configuration option be more appropriate here?
This would also have the added advantage of being able to leave out your
code in cases where it is not needed, reducing bloat.

thanks,
Thomas

-- 
Thomas Koeller
thomas@koeller.dyndns.org

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: RM9000 code broken in 8250.c
  2008-02-16 17:18   ` Thomas Koeller
@ 2008-02-17  6:09     ` Alex Williamson
  2008-02-17 12:47       ` Thomas Koeller
  0 siblings, 1 reply; 9+ messages in thread
From: Alex Williamson @ 2008-02-17  6:09 UTC (permalink / raw)
  To: Thomas Koeller; +Cc: linux-serial, Aristeu Sergio Rozanski Filho

Hi Thomas,

On Sat, 2008-02-16 at 18:18 +0100, Thomas Koeller wrote:
> >    Sorry this code is causing you trouble.  I tried the existing TXEN
> > bug code, it seems to be fighting a similar, but still quite different
> > bug.  The UART bug the backup timer code is trying to fix is more
> > asynchronous.  It might need a kick in the middle of a transmit, not
> > just at the beginning.
> 
> Is the test code you added really capable of detecting this kind of behavior?
> To me it does not seem to be - it looks as if it only checks whether the
> transmitter empty interrupt is raised when that interrupt is enabled
> while the transmitter holding register is empty. That is the same
> condition the UART_BUG_TXEN code checks for.

   No, it's slightly different.  Note that the test code enables the
transmit empty interrupt, reads the IIR to clear the interrupt, then
disables and re-enables the transmit empty interrupt.  At this point a
normal UART will re-assert the transmitter empty interrupt.  The UART
I'm looking for doesn't.

> > Can we re-arrange the tests such that they both 
> > work?  It would be fine with me if we don't run the test for the backup
> > timer on ports with UART_BUG_TXEN already enabled.  The easiest
> > mechanism might be to test port.type for PORT_RM9000, and not test those
> > for the backup timer.  Then we don't need to re-order the code too much.
> 
> I do not think that is an option. The UART_BUG_TXEN code has been there
> before I added support for the RM9000, so it probably has more users.

   If there are more TXEN bug users, I haven't heard complaints.  I'm
really not sure what uses the TXEN bug code, which makes it difficult to
test or modify it's behavior.  Why do you think it's not an option to
have both tests?  Is your UART testing positive for both the backup
timer and the TXEN bug?  If so, perhaps my test needs to store the first
IIR we read and make sure NO_INT is only asserted on the second read.
Maybe we could even combine the tests then.

> Looking at your description of the bug you are trying to deal with
> (interrupts cease to be generated randomly in the middle of a transmission),
> I wonder if it is at all possible to write a runtime check that reliably
> can detect it. Wouldn't a configuration option be more appropriate here?
> This would also have the added advantage of being able to leave out your
> code in cases where it is not needed, reducing bloat.

   The existing test seems to be able to reliably distinguish between
UARTs exhibiting normal transmitter interrupt behavior and those that
can effectively drop an interrupt.  Even with a configuration option, we
need a runtime test because these systems need to be supported by
standard distribution kernels and we can't interfere with fully
functional UARTs.  I'd much rather pursue an option that can allow both
workarounds.  Thanks,

	Alex

-- 
Alex Williamson                             HP Open Source & Linux Org.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: RM9000 code broken in 8250.c
  2008-02-17  6:09     ` Alex Williamson
@ 2008-02-17 12:47       ` Thomas Koeller
  2008-02-20 17:56         ` Alex Williamson
  0 siblings, 1 reply; 9+ messages in thread
From: Thomas Koeller @ 2008-02-17 12:47 UTC (permalink / raw)
  To: Alex Williamson; +Cc: linux-serial, Aristeu Sergio Rozanski Filho

On Sonntag, 17. Februar 2008, Alex Williamson wrote:
> On Sat, 2008-02-16 at 18:18 +0100, Thomas Koeller wrote:
> > >    Sorry this code is causing you trouble.  I tried the existing TXEN
> > > bug code, it seems to be fighting a similar, but still quite different
> > > bug.  The UART bug the backup timer code is trying to fix is more
> > > asynchronous.  It might need a kick in the middle of a transmit, not
> > > just at the beginning.
> >
> > Is the test code you added really capable of detecting this kind of
> > behavior? To me it does not seem to be - it looks as if it only checks
> > whether the transmitter empty interrupt is raised when that interrupt is
> > enabled while the transmitter holding register is empty. That is the same
> > condition the UART_BUG_TXEN code checks for.
>
>    No, it's slightly different.  Note that the test code enables the
> transmit empty interrupt, reads the IIR to clear the interrupt, then
> disables and re-enables the transmit empty interrupt.  At this point a
> normal UART will re-assert the transmitter empty interrupt.  The UART
> I'm looking for doesn't.

O.K., let me try to get this straight. While a real 8250 or 16550 compatible
UART raises a transmitter empty interrupt if this interrupt is enabled while
the transmitter holding register is empty, there are buggy implementations
that do not. Your test code checks for this condition, and, in a less robust
and less elaborate way, so does the UART_BUG_TXEN test code.

The way to deal with this bug is to kick the transmitter by transmitting
the first character upon start of every transmission, instead of relying on
the interrupt handler to do this. If this is done, the RM9000 UART works
normally, by generating transmitter empty interrupts as expected when the
transmitter holding register becomes empty again, so it does not need any
special treatment after the initial kick. Also, as I did not introduce
this code, there must be (or have been) at least one more UART that behaves
this way.

Your hardware seems to be even buggier than that, because, to quote you
"It might need a kick in the middle of a transmit, not just at the beginning".
Is this to say that the transmitter interrupt is not working at all, or that
it is not generated reliably, but works to some extent? I assumed the latter
was the case, because of the phrase "might need a kick" which seemed to imply
that this only occurs at random.

>
> > > Can we re-arrange the tests such that they both
> > > work?  It would be fine with me if we don't run the test for the backup
> > > timer on ports with UART_BUG_TXEN already enabled.  The easiest
> > > mechanism might be to test port.type for PORT_RM9000, and not test
> > > those for the backup timer.  Then we don't need to re-order the code
> > > too much.
> >
> > I do not think that is an option. The UART_BUG_TXEN code has been there
> > before I added support for the RM9000, so it probably has more users.
>
>    If there are more TXEN bug users, I haven't heard complaints.  I'm
> really not sure what uses the TXEN bug code, which makes it difficult to
> test or modify it's behavior.  Why do you think it's not an option to
> have both tests?  Is your UART testing positive for both the backup
> timer and the TXEN bug?  If so, perhaps my test needs to store the first
> IIR we read and make sure NO_INT is only asserted on the second read.
> Maybe we could even combine the tests then.

Yes, my hardware fails your test and therefore the backup timer is used,
although its interrupt is working perfectly after kicking it once at the
beginning of the transmission. I still do not see how your existing test code
would discriminate between a hardware that behaves this way and one like
yours that looses interrupts later on even if kicked once at the beginning,
or has no working interrupt at all.

thanks,
Thomas


-- 
Thomas Koeller
thomas@koeller.dyndns.org

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: RM9000 code broken in 8250.c
  2008-02-17 12:47       ` Thomas Koeller
@ 2008-02-20 17:56         ` Alex Williamson
  2008-02-26  0:59           ` Thomas Koeller
  0 siblings, 1 reply; 9+ messages in thread
From: Alex Williamson @ 2008-02-20 17:56 UTC (permalink / raw)
  To: Thomas Koeller; +Cc: linux-serial, Aristeu Sergio Rozanski Filho

Hi Thomas,

On Sun, 2008-02-17 at 13:47 +0100, Thomas Koeller wrote:
> 
> O.K., let me try to get this straight. While a real 8250 or 16550 compatible
> UART raises a transmitter empty interrupt if this interrupt is enabled while
> the transmitter holding register is empty, there are buggy implementations
> that do not. Your test code checks for this condition, and, in a less robust
> and less elaborate way, so does the UART_BUG_TXEN test code.

   UART_BUG_TXEN does not provide the kick my UART needs to keep
running.  I just re-tested enabling it and the UART hangs repeatedly.  I
think the patch below might clarify the logic of the backup timer test
and might be a potential compromise to allow both to exist.  Please
check.

> The way to deal with this bug is to kick the transmitter by transmitting
> the first character upon start of every transmission, instead of relying on
> the interrupt handler to do this. If this is done, the RM9000 UART works
> normally, by generating transmitter empty interrupts as expected when the
> transmitter holding register becomes empty again, so it does not need any
> special treatment after the initial kick. Also, as I did not introduce
> this code, there must be (or have been) at least one more UART that behaves
> this way.

   Certainly, your UART needs a different fix, and I don't think one
precludes the other.

> Your hardware seems to be even buggier than that, because, to quote you
> "It might need a kick in the middle of a transmit, not just at the beginning".
> Is this to say that the transmitter interrupt is not working at all, or that
> it is not generated reliably, but works to some extent? I assumed the latter
> was the case, because of the phrase "might need a kick" which seemed to imply
> that this only occurs at random.

   My UART generates the first interrupt, but then fails to reassert the
interrupt again later.  Thus the kick approach.  The TXEN bug doesn't
handle this

> Yes, my hardware fails your test and therefore the backup timer is used,
> although its interrupt is working perfectly after kicking it once at the
> beginning of the transmission. I still do not see how your existing test code
> would discriminate between a hardware that behaves this way and one like
> yours that looses interrupts later on even if kicked once at the beginning,
> or has no working interrupt at all.

   The key is that I'm looking at the 2nd IIR read and expecting NO_INT.
My UART passes the TXEN bug test, so the first interrupt is generated
correctly.  Does this patch help at all?  Thanks,

	Alex

Signed-off-by: Alex Williamson <alex.williamson@hp.com>
--

diff -r 3ecfcae9a8f9 drivers/serial/8250.c
--- a/drivers/serial/8250.c	Fri Jan 25 00:00:31 2008 +0000
+++ b/drivers/serial/8250.c	Wed Feb 20 10:35:39 2008 -0700
@@ -1813,6 +1813,7 @@ static int serial8250_startup(struct uar
 	}
 
 	if (is_real_interrupt(up->port.irq)) {
+		unsigned char iir1;
 		/*
 		 * Test for UARTs that do not reassert THRE when the
 		 * transmitter is idle and the interrupt has already
@@ -1826,7 +1827,7 @@ static int serial8250_startup(struct uar
 		wait_for_xmitr(up, UART_LSR_THRE);
 		serial_out_sync(up, UART_IER, UART_IER_THRI);
 		udelay(1); /* allow THRE to set */
-		serial_in(up, UART_IIR);
+		iir1 = serial_in(up, UART_IIR);
 		serial_out(up, UART_IER, 0);
 		serial_out_sync(up, UART_IER, UART_IER_THRI);
 		udelay(1); /* allow a working UART time to re-assert THRE */
@@ -1839,7 +1840,7 @@ static int serial8250_startup(struct uar
 		 * If the interrupt is not reasserted, setup a timer to
 		 * kick the UART on a regular basis.
 		 */
-		if (iir & UART_IIR_NO_INT) {
+		if (!(iir1 & UART_IIR_NO_INT) && iir & UART_IIR_NO_INT) {
 			pr_debug("ttyS%d - using backup timer\n", port->line);
 			up->timer.function = serial8250_backup_timeout;
 			up->timer.data = (unsigned long)up;




^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: RM9000 code broken in 8250.c
  2008-02-20 17:56         ` Alex Williamson
@ 2008-02-26  0:59           ` Thomas Koeller
  2008-02-26 15:37             ` Alex Williamson
  0 siblings, 1 reply; 9+ messages in thread
From: Thomas Koeller @ 2008-02-26  0:59 UTC (permalink / raw)
  To: Alex Williamson; +Cc: linux-serial, Aristeu Sergio Rozanski Filho

On Mittwoch, 20. Februar 2008, Alex Williamson wrote:
>
>    The key is that I'm looking at the 2nd IIR read and expecting NO_INT.
> My UART passes the TXEN bug test, so the first interrupt is generated
> correctly.  Does this patch help at all?

The patch seems to be good. After applying it, my UART passes the
modified test. I never tried enabling the interrupt twice, as your
modified test does, and was not aware that in this case my hardware
would raise an interrupt. How did you know? Can we be sure that possible
other UARTS affected by UART_BUG_TXEN will also do this?

thanks,
Thomas


-- 
Thomas Koeller
thomas@koeller.dyndns.org

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: RM9000 code broken in 8250.c
  2008-02-26  0:59           ` Thomas Koeller
@ 2008-02-26 15:37             ` Alex Williamson
  2008-03-02 16:03               ` Thomas Koeller
  0 siblings, 1 reply; 9+ messages in thread
From: Alex Williamson @ 2008-02-26 15:37 UTC (permalink / raw)
  To: Thomas Koeller; +Cc: linux-serial, Aristeu Sergio Rozanski Filho


On Tue, 2008-02-26 at 01:59 +0100, Thomas Koeller wrote:
> On Mittwoch, 20. Februar 2008, Alex Williamson wrote:
> >
> >    The key is that I'm looking at the 2nd IIR read and expecting NO_INT.
> > My UART passes the TXEN bug test, so the first interrupt is generated
> > correctly.  Does this patch help at all?
> 
> The patch seems to be good. After applying it, my UART passes the
> modified test. I never tried enabling the interrupt twice, as your
> modified test does, and was not aware that in this case my hardware
> would raise an interrupt. How did you know? Can we be sure that possible
> other UARTS affected by UART_BUG_TXEN will also do this?

Hi Thomas,

   What I expect to happen is that your UART will not raise the transmit
empty interrupt either time (but most importantly the first time).  My
UART does interrupt on the first, but not the second.  Thus the failure
to *reassert* the interrupt as described in the comment.  Are you sure
your hardware is raising the interrupt?  I'm hoping that it doesn't, and
will fall out of my test and get caught by the original TXEN test.

   I placed the backup timer test before the TXEN bug test to make sure
I get a clean shot at the interrupts.  I was assuming that TXEN bug
UARTs never assert the transmitter empty interrupt, but I either forgot
to add the check that this patch does, or was hoping the backup timer
would keep them running too.  We could consider consolidating the tests,
as they're very similar, but frankly, I'm afraid to mess with the TXEN
test without having at least a couple UARTs with the issue to test.

   If your UART is detected correctly now and you don't have any
suggestions for further improvements, I'll go ahead and submit this.
Let me know.  Thanks,

	Alex

-- 
Alex Williamson                             HP Open Source & Linux Org.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: RM9000 code broken in 8250.c
  2008-02-26 15:37             ` Alex Williamson
@ 2008-03-02 16:03               ` Thomas Koeller
  0 siblings, 0 replies; 9+ messages in thread
From: Thomas Koeller @ 2008-03-02 16:03 UTC (permalink / raw)
  To: Alex Williamson; +Cc: linux-serial, Aristeu Sergio Rozanski Filho

On Dienstag, 26. Februar 2008, Alex Williamson wrote:
>    If your UART is detected correctly now and you don't have any
> suggestions for further improvements, I'll go ahead and submit this.
> Let me know.  Thanks,

Yes, please go ahead and commit your changes. I ran some more tests
and everything seemed to work as expected.

thanks,
Thomas


-- 
Thomas Koeller
thomas@koeller.dyndns.org

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2008-03-02 16:02 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-12-21 11:50 RM9000 code broken in 8250.c Thomas Koeller
2007-12-21 14:23 ` Alex Williamson
2008-02-16 17:18   ` Thomas Koeller
2008-02-17  6:09     ` Alex Williamson
2008-02-17 12:47       ` Thomas Koeller
2008-02-20 17:56         ` Alex Williamson
2008-02-26  0:59           ` Thomas Koeller
2008-02-26 15:37             ` Alex Williamson
2008-03-02 16:03               ` Thomas Koeller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox