linuxppc-dev.lists.ozlabs.org archive mirror
* ZILOG serial port broken in 2.6.32
@ 2009-12-06  7:01 Rob Landley
  2009-12-07  1:10 ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 8+ messages in thread
From: Rob Landley @ 2009-12-06  7:01 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: paulus

Trying again with a few likely-looking cc's from the MAINTAINERS file:

Summary: 

The PMACZILOG serial driver last worked in 2.6.28.  It was broken by commit 
f751928e0ddf54ea4fe5546f35e99efc5b5d9938 by Alan Cox making bits of the tty 
layer dynamically allocated.  The PMACZILOG driver wasn't properly converted: 
it works with interrupts disabled (for boot messages), but as soon as 
interrupts are enabled (PID 1 spawns) the next write to the serial console 
panics the kernel.

Up through 2.6.31 I could fix it by reverting that patch (which isn't a proper 
fix but it made it work).  In 2.6.32 the patch no longer cleanly reverts.

I reported the issue here (with a cut and paste of the panic trace):

  http://lists.ozlabs.org/pipermail/linuxppc-dev/2009-October/076727.html

And reported the results of bisecting the issue here:

  http://lists.ozlabs.org/pipermail/linuxppc-dev/2009-October/077059.html

I noted that 2.6.32-pre had broken my workaround here:

  http://lists.ozlabs.org/pipermail/linuxppc-dev/2009-December/078498.html

Background:

I have a project that builds the same native Linux development environment for 
multiple hardware targets.  It aims to support all the targets QEMU system 
emulation can boot Linux under, although I'm still a few short.  It creates a 
cross compiler and uses it to build a root filesystem from uClibc and busybox, 
adds a native toolchain, and packages it up into a system image (squashfs, 
ext2, or initramfs depending on the config you selected).

Anyone can then boot the resulting system image under qemu and use it to wget
source and compile stuff natively.  (If the cross compiler is in the $PATH on 
the host, it will even configure distcc to call out to that cross compiler to 
speed up the builds to merely "painfully slow", with some pretense of SMP 
scalability).

Prebuilt binaries of all the targets I had working last release are at 
http://impactlinux.com/fwl/downloads/binaries (with obligatory screenshots at 
http://impactlinux.com/fwl/screenshots/ even).  They use the 2.6.31 kernel.

It supports powerpc.  If you look at system-image-powerpc.tar.bz2 you'll see 
that the run-emulator.sh script has been using qemu's "g3beige" target board 
emulation, which provides all the hardware I need for a development 
environment (hard drive, network card, at least 256 megs of memory, working 
clock chip, and of course a serial console).  Userspace doesn't care what I 
use: it's the same processor instruction set and same C library either way.  
The board emulation's just something to boot it on; only the kernel .config 
really cares, as long as the appropriate resources are there.

Unfortunately, g3beige seems to have bit-rotted, and thus the serial console 
is now panicking.  This is a regression, and thus blocking a release of my 
project using the 2.6.32 kernel.  Is this of interest to anyone other than me?

Rob
-- 
Latency is more important than throughput. It's that simple. - Linus Torvalds

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: ZILOG serial port broken in 2.6.32
  2009-12-06  7:01 ZILOG serial port broken in 2.6.32 Rob Landley
@ 2009-12-07  1:10 ` Benjamin Herrenschmidt
  2009-12-07 20:25   ` Rob Landley
  2009-12-08 12:42   ` [PATCH] " Rob Landley
  0 siblings, 2 replies; 8+ messages in thread
From: Benjamin Herrenschmidt @ 2009-12-07  1:10 UTC (permalink / raw)
  To: Rob Landley; +Cc: paulus, linuxppc-dev

On Sun, 2009-12-06 at 01:01 -0600, Rob Landley wrote:
> Trying again with a few likely-looking cc's from the MAINTAINERS file:
> 
> Summary: 
> 
> The PMACZILOG serial driver last worked in 2.6.28.  It was broken by commit 
> f751928e0ddf54ea4fe5546f35e99efc5b5d9938 by Alan Cox making bits of the tty 
> layer dynamically allocated.  The PMACZILOG driver wasn't properly converted, 
> it works with interrupts disabled (for boot messages), but as soon as 
> interrupts are enabled (PID 1 spawns) the next write to the serial console 
> panics the kernel.

Ah looks like I missed that... I'll dig. Thanks for the report.

Cheers,
Ben.

* Re: ZILOG serial port broken in 2.6.32
  2009-12-07  1:10 ` Benjamin Herrenschmidt
@ 2009-12-07 20:25   ` Rob Landley
  2009-12-08 12:42   ` [PATCH] " Rob Landley
  1 sibling, 0 replies; 8+ messages in thread
From: Rob Landley @ 2009-12-07 20:25 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: paulus, linuxppc-dev

On Sunday 06 December 2009 19:10:48 Benjamin Herrenschmidt wrote:
> On Sun, 2009-12-06 at 01:01 -0600, Rob Landley wrote:
> > Trying again with a few likely-looking cc's from the MAINTAINERS file:
> >
> > Summary:
> >
> > The PMACZILOG serial driver last worked in 2.6.28.  It was broken by
> > commit f751928e0ddf54ea4fe5546f35e99efc5b5d9938 by Alan Cox making bits
> > of the tty layer dynamically allocated.  The PMACZILOG driver wasn't
> > properly converted, it works with interrupts disabled (for boot
> > messages), but as soon as interrupts are enabled (PID 1 spawns) the next
> > write to the serial console panics the kernel.
>
> Ah looks like I missed that... I'll dig. Thanks for the report.

Ooh, thanks!

I've been digging into it myself this weekend, but I don't understand any of 
this code, so it's going slowly...

Rob
-- 
Latency is more important than throughput. It's that simple. - Linus Torvalds

* [PATCH] Re: ZILOG serial port broken in 2.6.32
  2009-12-07  1:10 ` Benjamin Herrenschmidt
  2009-12-07 20:25   ` Rob Landley
@ 2009-12-08 12:42   ` Rob Landley
  2010-01-08  3:00     ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 8+ messages in thread
From: Rob Landley @ 2009-12-08 12:42 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: paulus, linuxppc-dev

On Sunday 06 December 2009 19:10:48 Benjamin Herrenschmidt wrote:
> On Sun, 2009-12-06 at 01:01 -0600, Rob Landley wrote:
> > Trying again with a few likely-looking cc's from the MAINTAINERS file:
> >
> > Summary:
> >
> > The PMACZILOG serial driver last worked in 2.6.28.  It was broken by
> > commit f751928e0ddf54ea4fe5546f35e99efc5b5d9938 by Alan Cox making bits
> > of the tty layer dynamically allocated.  The PMACZILOG driver wasn't
> > properly converted, it works with interrupts disabled (for boot
> > messages), but as soon as interrupts are enabled (PID 1 spawns) the next
> > write to the serial console panics the kernel.
>
> Ah looks like I missed that... I'll dig. Thanks for the report.
>
> Cheers,
> Ben.

Ok, here's the fix.  It's not the _right_ fix, but it Works For Me (tm) and I'll
leave it to you guys to figure out what this _means_:

Signed-off-by: Rob Landley <rob@landley.net>

diff -ru build/packages/linux/drivers/serial/serial_core.c build/packages/linux2/drivers/serial/serial_core.c
--- build/packages/linux/drivers/serial/serial_core.c	2009-12-02 21:51:21.000000000 -0600
+++ build/packages/linux2/drivers/serial/serial_core.c	2009-12-08 06:17:06.000000000 -0600
@@ -113,7 +113,7 @@
 static void uart_tasklet_action(unsigned long data)
 {
 	struct uart_state *state = (struct uart_state *)data;
-	tty_wakeup(state->port.tty);
+	if (state->port.tty) tty_wakeup(state->port.tty);
 }
 
 static inline void

That one line workaround makes the panic go away, and things seem to work fine from there.

I note that pmac_zilog.c function pmz_receive_chars() has the following chunk:

        /* Sanity check, make sure the old bug is no longer happening */
        if (uap->port.state == NULL || uap->port.state->port.tty == NULL) {
                WARN_ON(1);
                (void)read_zsdata(uap);
                return NULL;
        }

Which doesn't catch this because it's the write code path (not the read code
path) that's running into this.

Rob
-- 
Latency is more important than throughput. It's that simple. - Linus Torvalds

* Re: [PATCH] Re: ZILOG serial port broken in 2.6.32
  2009-12-08 12:42   ` [PATCH] " Rob Landley
@ 2010-01-08  3:00     ` Benjamin Herrenschmidt
  2010-01-09  8:17       ` Rob Landley
  0 siblings, 1 reply; 8+ messages in thread
From: Benjamin Herrenschmidt @ 2010-01-08  3:00 UTC (permalink / raw)
  To: Rob Landley; +Cc: paulus, linuxppc-dev


> Ok, here's the fix.  It's not the _right_ fix, but it Works For Me (tm) and I'll
> leave it to you guys to figure out what this _means_:

I've failed to reproduce so far on both a Wallstreet powerbook (similar
generation and chipset as your beige G3) and a G5 with an added serial
port using current upstream...

Can you verify it's still there ? I might be able to reproduce on a
Beige G3 as well next week.

Cheers,
Ben.

* Re: [PATCH] Re: ZILOG serial port broken in 2.6.32
  2010-01-08  3:00     ` Benjamin Herrenschmidt
@ 2010-01-09  8:17       ` Rob Landley
  2010-01-11  3:02         ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 8+ messages in thread
From: Rob Landley @ 2010-01-09  8:17 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: paulus, linuxppc-dev

On Thursday 07 January 2010 21:00:43 Benjamin Herrenschmidt wrote:
> > Ok, here's the fix.  It's not the _right_ fix, but it Works For Me (tm)
> > and I'll leave it to you guys to figure out what this _means_:
>
> I've failed to reproduce so far on both a Wallstreet powerbook (similar
> generation and chipset as your beige G3) and a G5 with an added serial
> port using current upstream...
>
> Can you verify it's still there ? I might be able to reproduce on a
> Beige G3 as well next week.

It's still there on qemu 0.11.0's "g3beige" emulation when you use 
CONFIG_SERIAL_PMACZILOG as the serial console.  (QEMU 0.10.x used a 16550 
serial chip for its g3beige emulation instead of the actual ZILOG one.)  Still 
dunno if it's a qemu bug or a kernel bug, I just know that kernel patch 
fixes it for me, and it comes back without the patch.

I tested 2.6.32.  Haven't tried 2.6.32.3 but don't see why it would change 
this...

Rob
-- 
Latency is more important than throughput. It's that simple. - Linus Torvalds

* Re: [PATCH] Re: ZILOG serial port broken in 2.6.32
  2010-01-09  8:17       ` Rob Landley
@ 2010-01-11  3:02         ` Benjamin Herrenschmidt
  2010-01-11  7:22           ` Rob Landley
  0 siblings, 1 reply; 8+ messages in thread
From: Benjamin Herrenschmidt @ 2010-01-11  3:02 UTC (permalink / raw)
  To: Rob Landley; +Cc: paulus, linuxppc-dev

On Sat, 2010-01-09 at 02:17 -0600, Rob Landley wrote:
> On Thursday 07 January 2010 21:00:43 Benjamin Herrenschmidt wrote:
> > > Ok, here's the fix.  It's not the _right_ fix, but it Works For Me (tm)
> > > and I'll leave it to you guys to figure out what this _means_:
> >
> > I've failed to reproduce so far on both a Wallstreet powerbook (similar
> > generation and chipset as your beige G3) and a G5 with an added serial
> > port using current upstream...
> >
> > Can you verify it's still there ? I might be able to reproduce on a
> > Beige G3 as well next week.
> 
> It's still there on qemu 0.11.0's "g3beige" emulation when you use 
> CONFIG_SERIAL_PMACZILOG as the serial console.  (QEMU 0.10.x used a 16550 
> serial chip for its g3beige emulation instead of the actual ZILOG one.)  Still 
> dunno if it's a qemu or bug or a kernel bug, I just know that kernel patch 
> fixes it for me, and it comes back without the patch.
> 
> I tested 2.6.32.  Haven't tried the 2.6.32.3 but don't see why it would change 
> this...

Ok so I compiled qemu and things are a bit strange.

How do you get the output of both channels of the serial port with it ?

If I use -nographic, what happens is that OpenBIOS, for some reason,
tells qemu that the console is on the second channel of the ESCC.

I see my kernel messages in the console if I do console=ttyPZ0 but the
debug stuff goes where udbg initializes it, which is where OpenBIOS says
the FW console is, which is channel B and I don't know how to "see" that
with qemu.

I do see it crash due to a message from the kernel but I can't get into
xmon which is a pain.

If I modify the kernel to force udbg on channel A (same channel as the
console), then the problem doesn't appear (it doesn't crash) :-)

Cheers
Ben.

* Re: [PATCH] Re: ZILOG serial port broken in 2.6.32
  2010-01-11  3:02         ` Benjamin Herrenschmidt
@ 2010-01-11  7:22           ` Rob Landley
  0 siblings, 0 replies; 8+ messages in thread
From: Rob Landley @ 2010-01-11  7:22 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: paulus, linuxppc-dev

On Sunday 10 January 2010 21:02:16 Benjamin Herrenschmidt wrote:
> On Sat, 2010-01-09 at 02:17 -0600, Rob Landley wrote:
> > On Thursday 07 January 2010 21:00:43 Benjamin Herrenschmidt wrote:
> > > > Ok, here's the fix.  It's not the _right_ fix, but it Works For Me
> > > > (tm) and I'll leave it to you guys to figure out what this _means_:
> > >
> > > I've failed to reproduce so far on both a Wallstreet powerbook (similar
> > > generation and chipset as your beige G3) and a G5 with an added serial
> > > port using current upstream...
> > >
> > > Can you verify it's still there ? I might be able to reproduce on a
> > > Beige G3 as well next week.
> >
> > It's still there on qemu 0.11.0's "g3beige" emulation when you use
> > CONFIG_SERIAL_PMACZILOG as the serial console.  (QEMU 0.10.x used a 16550
> > serial chip for its g3beige emulation instead of the actual ZILOG one.) 
> > Still dunno if it's a qemu or bug or a kernel bug, I just know that
> > kernel patch fixes it for me, and it comes back without the patch.
> >
> > I tested 2.6.32.  Haven't tried the 2.6.32.3 but don't see why it would
> > change this...
>
> Ok so I compiled qemu and things are a bit strange.
>
> How do you get the output of both channels of the serial port with it ?
>
> If I use -nographics, what happens is that OpenBIOS, for some reason,
> tells qemu that the console on the second channel of the ESCC.

Instead of "-nographic", you could try "-serial stdio" instead?

> I see my kernel messages in the console if I do console=ttyPZ0 but the
> debug stuff goes where udbg initializes it, which is where OpenBIOS says
> the FW console is, which is channel B and I don't know how to "see" that
> with qemu.

I'm just trying to get a serial console, which is why I'm booting the sucker 
with:

qemu-system-ppc -M g3beige -nographic -no-reboot -kernel zImage-powerpc \
  -hda image-powerpc.sqf \
  -append "root=/dev/hda rw init=/usr/sbin/init.sh panic=1 PATH=/usr/bin console=ttyS0"

I didn't even know there were more debug messages...

I have CONFIG_SERIAL_PMACZILOG_TTYS=y of course:

pmac_zilog: 0.6 (Benjamin Herrenschmidt <benh@kernel.crashing.org>)
ttyS0 at MMIO 0x80813020 (irq = 16) is a Z85c30 ESCC - Serial port
ttyS1 at MMIO 0x80813000 (irq = 17) is a Z85c30 ESCC - Serial port

CONFIG_SERIO=y
CONFIG_SERIAL_PMACZILOG=y
CONFIG_SERIAL_PMACZILOG_TTYS=y
CONFIG_SERIAL_PMACZILOG_CONSOLE=y

> I do see it crash due to a message from the kernel but I can't get into
> xmon which is a pain.

Does the -serial stdio thing help?

(I know how to switch between screens in the qemu x11 window: it's
ctrl-alt-number, so ctrl-alt-1, ctrl-alt-2, and so on.  I really don't use 'em
much, though.)

> If I modify the kernel to force udbg on channel A (same channel as the
> console), then the problem doesn't appear (it doesn't crash) :-)

You can attach gdb to qemu via the "qemu -s" option and then in gdb use the 
"target remote" stuff like you would with gdbserver.  It acts a bit like you've 
connected it to a jtag through openocd, if that helps...

(I know qemu has many, many options I don't really use much.)

> Cheers
> Ben.

Rob
-- 
Latency is more important than throughput. It's that simple. - Linus Torvalds
