* ZILOG serial port broken in 2.6.32
@ 2009-12-06 7:01 Rob Landley
2009-12-07 1:10 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 8+ messages in thread
From: Rob Landley @ 2009-12-06 7:01 UTC (permalink / raw)
To: linuxppc-dev; +Cc: paulus
Trying again with a few likely-looking cc's from the MAINTAINERS file:
Summary:
The PMACZILOG serial driver last worked in 2.6.28. It was broken by commit
f751928e0ddf54ea4fe5546f35e99efc5b5d9938 by Alan Cox making bits of the tty
layer dynamically allocated. The PMACZILOG driver wasn't properly converted,
it works with interrupts disabled (for boot messages), but as soon as
interrupts are enabled (PID 1 spawns) the next write to the serial console
panics the kernel.
Up through 2.6.31 I could fix it by reverting that patch (which isn't a proper
fix but it made it work). In 2.6.32 the patch no longer cleanly reverts.
I reported the issue here (with a cut and paste of the panic trace):
http://lists.ozlabs.org/pipermail/linuxppc-dev/2009-October/076727.html
And reported the results of bisecting the issue here:
http://lists.ozlabs.org/pipermail/linuxppc-dev/2009-October/077059.html
I noted that 2.6.32-pre had broken my workaround here:
http://lists.ozlabs.org/pipermail/linuxppc-dev/2009-December/078498.html
Background:
I have a project that builds the same native Linux development environment for
multiple hardware targets. It aims to support all the targets QEMU system
emulation can boot Linux under, although I'm still a few short. It creates a
cross compiler and uses it to build a root filesystem from uClibc and busybox,
adds a native toolchain, and packages it up into a system image (squashfs,
ext2, or initramfs depending on the config you selected).
Anyone can then boot the resulting system image under qemu and use it to wget
source and compile stuff natively. (If the cross compiler is in the $PATH on
the host, it will even configure distcc to call out to that cross compiler to
speed up the builds to merely "painfully slow", with some pretense of SMP
scalability).
Prebuilt binaries of all the targets I had working last release are at
http://impactlinux.com/fwl/downloads/binaries (with obligatory screenshots at
http://impactlinux.com/fwl/screenshots/ even). They use the 2.6.31 kernel.
It supports powerpc. If you look at system-image-powerpc.tar.bz2 you'll see
that the run-emulator.sh script has been using qemu's "g3beige" target board
emulation, which provides all the hardware I need for a development
environment (hard drive, network card, at least 256 megs of memory, working
clock chip, and of course a serial console). Userspace doesn't care what I
use, it's the same processor instruction set and same C library either way,
the board emulation's just something to boot it on, only the kernel .config
really cares as long as the appropriate resources are there.
Unfortunately, g3beige seems to have bit-rotted, and thus the serial console
is now panicking. This is a regression, and thus blocking a release of my
project using the 2.6.32 kernel. Is this of interest to anyone other than me?
Rob
--
Latency is more important than throughput. It's that simple. - Linus Torvalds
* Re: ZILOG serial port broken in 2.6.32
2009-12-06 7:01 ZILOG serial port broken in 2.6.32 Rob Landley
@ 2009-12-07 1:10 ` Benjamin Herrenschmidt
2009-12-07 20:25 ` Rob Landley
2009-12-08 12:42 ` [PATCH] " Rob Landley
0 siblings, 2 replies; 8+ messages in thread
From: Benjamin Herrenschmidt @ 2009-12-07 1:10 UTC (permalink / raw)
To: Rob Landley; +Cc: paulus, linuxppc-dev
On Sun, 2009-12-06 at 01:01 -0600, Rob Landley wrote:
> Trying again with a few likely-looking cc's from the MAINTAINERS file:
>
> Summary:
>
> The PMACZILOG serial driver last worked in 2.6.28. It was broken by commit
> f751928e0ddf54ea4fe5546f35e99efc5b5d9938 by Alan Cox making bits of the tty
> layer dynamically allocated. The PMACZILOG driver wasn't properly converted,
> it works with interrupts disabled (for boot messages), but as soon as
> interrupts are enabled (PID 1 spawns) the next write to the serial console
> panics the kernel.
Ah looks like I missed that... I'll dig. Thanks for the report.
Cheers,
Ben.
* Re: ZILOG serial port broken in 2.6.32
2009-12-07 1:10 ` Benjamin Herrenschmidt
@ 2009-12-07 20:25 ` Rob Landley
2009-12-08 12:42 ` [PATCH] " Rob Landley
1 sibling, 0 replies; 8+ messages in thread
From: Rob Landley @ 2009-12-07 20:25 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: paulus, linuxppc-dev
On Sunday 06 December 2009 19:10:48 Benjamin Herrenschmidt wrote:
> On Sun, 2009-12-06 at 01:01 -0600, Rob Landley wrote:
> > Trying again with a few likely-looking cc's from the MAINTAINERS file:
> >
> > Summary:
> >
> > The PMACZILOG serial driver last worked in 2.6.28. It was broken by
> > commit f751928e0ddf54ea4fe5546f35e99efc5b5d9938 by Alan Cox making bits
> > of the tty layer dynamically allocated. The PMACZILOG driver wasn't
> > properly converted, it works with interrupts disabled (for boot
> > messages), but as soon as interrupts are enabled (PID 1 spawns) the next
> > write to the serial console panics the kernel.
>
> Ah looks like I missed that... I'll dig. Thanks for the report.
Ooh, thanks!
I've been digging into it myself this weekend, but I don't understand any of
this code, so it's going slowly...
Rob
--
Latency is more important than throughput. It's that simple. - Linus Torvalds
* [PATCH] Re: ZILOG serial port broken in 2.6.32
2009-12-07 1:10 ` Benjamin Herrenschmidt
2009-12-07 20:25 ` Rob Landley
@ 2009-12-08 12:42 ` Rob Landley
2010-01-08 3:00 ` Benjamin Herrenschmidt
1 sibling, 1 reply; 8+ messages in thread
From: Rob Landley @ 2009-12-08 12:42 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: paulus, linuxppc-dev
On Sunday 06 December 2009 19:10:48 Benjamin Herrenschmidt wrote:
> On Sun, 2009-12-06 at 01:01 -0600, Rob Landley wrote:
> > Trying again with a few likely-looking cc's from the MAINTAINERS file:
> >
> > Summary:
> >
> > The PMACZILOG serial driver last worked in 2.6.28. It was broken by
> > commit f751928e0ddf54ea4fe5546f35e99efc5b5d9938 by Alan Cox making bits
> > of the tty layer dynamically allocated. The PMACZILOG driver wasn't
> > properly converted, it works with interrupts disabled (for boot
> > messages), but as soon as interrupts are enabled (PID 1 spawns) the next
> > write to the serial console panics the kernel.
>
> Ah looks like I missed that... I'll dig. Thanks for the report.
>
> Cheers,
> Ben.
Ok, here's the fix. It's not the _right_ fix, but it Works For Me (tm) and I'll
leave it to you guys to figure out what this _means_:
Signed-off-by: Rob Landley <rob@landley.net>
diff -ru build/packages/linux/drivers/serial/serial_core.c build/packages/linux2/drivers/serial/serial_core.c
--- build/packages/linux/drivers/serial/serial_core.c 2009-12-02 21:51:21.000000000 -0600
+++ build/packages/linux2/drivers/serial/serial_core.c 2009-12-08 06:17:06.000000000 -0600
@@ -113,7 +113,7 @@
 static void uart_tasklet_action(unsigned long data)
 {
 	struct uart_state *state = (struct uart_state *)data;
-	tty_wakeup(state->port.tty);
+	if (state->port.tty) tty_wakeup(state->port.tty);
 }
 
 static inline void
That one-line workaround makes the panic go away, and things seem to work fine from there.
I note that the pmac_zilog.c function pmz_receive_chars() has the following chunk:
	/* Sanity check, make sure the old bug is no longer happening */
	if (uap->port.state == NULL || uap->port.state->port.tty == NULL) {
		WARN_ON(1);
		(void)read_zsdata(uap);
		return NULL;
	}
Which doesn't catch this because it's the write code path (not the read code path) that's running into
this.
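To show the shape of that guard outside the kernel, here is a minimal user-space sketch. Everything named fake_* is a hypothetical stand-in invented for illustration, not a kernel API, and the sketch says nothing about why the tty pointer ends up NULL on the write path. It only shows that the callee cannot tolerate a NULL tty and that the guarded caller skips it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's tty and uart state structures. */
struct fake_tty { int wakeups; };
struct fake_port { struct fake_tty *tty; };
struct fake_uart_state { struct fake_port port; };

/* Stand-in for tty_wakeup(): it dereferences its argument, so a NULL
 * pointer here models the panic seen in the unpatched kernel. */
static void fake_tty_wakeup(struct fake_tty *tty)
{
	assert(tty != NULL);
	tty->wakeups++;
}

/* Mirrors the patched tasklet body: skip the wakeup when no tty is
 * attached. The unsigned long argument follows the kernel's tasklet
 * convention and assumes a pointer fits in an unsigned long. */
static void fake_tasklet_action(unsigned long data)
{
	struct fake_uart_state *state = (struct fake_uart_state *)data;

	if (state->port.tty)
		fake_tty_wakeup(state->port.tty);
}

/* Runs the tasklet once with no tty and once with a tty attached, then
 * returns how many wakeups were actually delivered. */
static int fake_count_wakeups_demo(void)
{
	struct fake_tty tty = { 0 };
	struct fake_uart_state state = { { NULL } };

	fake_tasklet_action((unsigned long)&state);  /* no tty: skipped */
	state.port.tty = &tty;
	fake_tasklet_action((unsigned long)&state);  /* tty attached: delivered */

	return tty.wakeups;
}
```

Calling fake_count_wakeups_demo() from a main() exercises both cases. Removing the if in fake_tasklet_action() makes the first call trip the assert, which is the user-space analogue of the panic.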
Rob
--
Latency is more important than throughput. It's that simple. - Linus Torvalds
* Re: [PATCH] Re: ZILOG serial port broken in 2.6.32
2009-12-08 12:42 ` [PATCH] " Rob Landley
@ 2010-01-08 3:00 ` Benjamin Herrenschmidt
2010-01-09 8:17 ` Rob Landley
0 siblings, 1 reply; 8+ messages in thread
From: Benjamin Herrenschmidt @ 2010-01-08 3:00 UTC (permalink / raw)
To: Rob Landley; +Cc: paulus, linuxppc-dev
> Ok, here's the fix. It's not the _right_ fix, but it Works For Me (tm) and I'll
> leave it to you guys to figure out what this _means_:
I've failed to reproduce so far on both a Wallstreet powerbook (similar
generation and chipset as your beige G3) and a G5 with an added serial
port using current upstream...
Can you verify it's still there ? I might be able to reproduce on a
Beige G3 as well next week.
Cheers,
Ben.
> Signed-off-by: Rob Landley <rob@landley.net>
>
> diff -ru build/packages/linux/drivers/serial/serial_core.c build/packages/linux2/drivers/serial/serial_core.c
> --- build/packages/linux/drivers/serial/serial_core.c 2009-12-02 21:51:21.000000000 -0600
> +++ build/packages/linux2/drivers/serial/serial_core.c 2009-12-08 06:17:06.000000000 -0600
> @@ -113,7 +113,7 @@
> static void uart_tasklet_action(unsigned long data)
> {
> struct uart_state *state = (struct uart_state *)data;
> - tty_wakeup(state->port.tty);
> + if (state->port.tty) tty_wakeup(state->port.tty);
> }
>
> static inline void
>
> That one line workaround makes the panic go away, and things seem to work fine from there.
>
> I note that pmac_zilog.c function pmz_receiv_chars() has the following chunk:
>
> /* Sanity check, make sure the old bug is no longer happening */
> if (uap->port.state == NULL || uap->port.state->port.tty == NULL) {
> WARN_ON(1);
> (void)read_zsdata(uap);
> return NULL;
> }
>
> Which doesn't catch this because it's the write code path (not the read code path) that's running into
> this.
>
> Rob
* Re: [PATCH] Re: ZILOG serial port broken in 2.6.32
2010-01-08 3:00 ` Benjamin Herrenschmidt
@ 2010-01-09 8:17 ` Rob Landley
2010-01-11 3:02 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 8+ messages in thread
From: Rob Landley @ 2010-01-09 8:17 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: paulus, linuxppc-dev
On Thursday 07 January 2010 21:00:43 Benjamin Herrenschmidt wrote:
> > Ok, here's the fix. It's not the _right_ fix, but it Works For Me (tm)
> > and I'll leave it to you guys to figure out what this _means_:
>
> I've failed to reproduce so far on both a Wallstreet powerbook (similar
> generation and chipset as your beige G3) and a G5 with an added serial
> port using current upstream...
>
> Can you verify it's still there ? I might be able to reproduce on a
> Beige G3 as well next week.
It's still there on qemu 0.11.0's "g3beige" emulation when you use
CONFIG_SERIAL_PMACZILOG as the serial console. (QEMU 0.10.x used a 16550
serial chip for its g3beige emulation instead of the actual ZILOG one.) Still
dunno if it's a qemu bug or a kernel bug, I just know that kernel patch
fixes it for me, and it comes back without the patch.
I tested 2.6.32. I haven't tried 2.6.32.3, but I don't see why it would change
this...
Rob
--
Latency is more important than throughput. It's that simple. - Linus Torvalds
* Re: [PATCH] Re: ZILOG serial port broken in 2.6.32
2010-01-09 8:17 ` Rob Landley
@ 2010-01-11 3:02 ` Benjamin Herrenschmidt
2010-01-11 7:22 ` Rob Landley
0 siblings, 1 reply; 8+ messages in thread
From: Benjamin Herrenschmidt @ 2010-01-11 3:02 UTC (permalink / raw)
To: Rob Landley; +Cc: paulus, linuxppc-dev
On Sat, 2010-01-09 at 02:17 -0600, Rob Landley wrote:
> On Thursday 07 January 2010 21:00:43 Benjamin Herrenschmidt wrote:
> > > Ok, here's the fix. It's not the _right_ fix, but it Works For Me (tm)
> > > and I'll leave it to you guys to figure out what this _means_:
> >
> > I've failed to reproduce so far on both a Wallstreet powerbook (similar
> > generation and chipset as your beige G3) and a G5 with an added serial
> > port using current upstream...
> >
> > Can you verify it's still there ? I might be able to reproduce on a
> > Beige G3 as well next week.
>
> It's still there on qemu 0.11.0's "g3beige" emulation when you use
> CONFIG_SERIAL_PMACZILOG as the serial console. (QEMU 0.10.x used a 16550
> serial chip for its g3beige emulation instead of the actual ZILOG one.) Still
> dunno if it's a qemu bug or a kernel bug, I just know that kernel patch
> fixes it for me, and it comes back without the patch.
>
> I tested 2.6.32. Haven't tried the 2.6.32.3 but don't see why it would change
> this...
Ok so I compiled qemu and things are a bit strange.
How do you get the output of both channels of the serial port with it ?
If I use -nographic, what happens is that OpenBIOS, for some reason,
tells qemu that the console is on the second channel of the ESCC.
I see my kernel messages in the console if I do console=ttyPZ0 but the
debug stuff goes where udbg initializes it, which is where OpenBIOS says
the FW console is, which is channel B and I don't know how to "see" that
with qemu.
I do see it crash due to a message from the kernel but I can't get into
xmon which is a pain.
If I modify the kernel to force udbg on channel A (same channel as the
console), then the problem doesn't appear (it doesn't crash) :-)
Cheers
Ben.
* Re: [PATCH] Re: ZILOG serial port broken in 2.6.32
2010-01-11 3:02 ` Benjamin Herrenschmidt
@ 2010-01-11 7:22 ` Rob Landley
0 siblings, 0 replies; 8+ messages in thread
From: Rob Landley @ 2010-01-11 7:22 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: paulus, linuxppc-dev
On Sunday 10 January 2010 21:02:16 Benjamin Herrenschmidt wrote:
> On Sat, 2010-01-09 at 02:17 -0600, Rob Landley wrote:
> > On Thursday 07 January 2010 21:00:43 Benjamin Herrenschmidt wrote:
> > > > Ok, here's the fix. It's not the _right_ fix, but it Works For Me
> > > > (tm) and I'll leave it to you guys to figure out what this _means_:
> > >
> > > I've failed to reproduce so far on both a Wallstreet powerbook (similar
> > > generation and chipset as your beige G3) and a G5 with an added serial
> > > port using current upstream...
> > >
> > > Can you verify it's still there ? I might be able to reproduce on a
> > > Beige G3 as well next week.
> >
> > It's still there on qemu 0.11.0's "g3beige" emulation when you use
> > CONFIG_SERIAL_PMACZILOG as the serial console. (QEMU 0.10.x used a 16550
> > serial chip for its g3beige emulation instead of the actual ZILOG one.)
> > Still dunno if it's a qemu bug or a kernel bug, I just know that
> > kernel patch fixes it for me, and it comes back without the patch.
> >
> > I tested 2.6.32. Haven't tried the 2.6.32.3 but don't see why it would
> > change this...
>
> Ok so I compiled qemu and things are a bit strange.
>
> How do you get the output of both channels of the serial port with it ?
>
> If I use -nographic, what happens is that OpenBIOS, for some reason,
> tells qemu that the console is on the second channel of the ESCC.
Instead of "-nographic", you could try "-serial stdio" instead?
> I see my kernel messages in the console if I do console=ttyPZ0 but the
> debug stuff goes where udbg initializes it, which is where OpenBIOS says
> the FW console is, which is channel B and I don't know how to "see" that
> with qemu.
I'm just trying to get a serial console, which is why I'm booting the sucker
with:
  qemu-system-ppc -M g3beige -nographic -no-reboot -kernel zImage-powerpc \
    -hda image-powerpc.sqf \
    -append "root=/dev/hda rw init=/usr/sbin/init.sh panic=1 PATH=/usr/bin console=ttyS0"
I didn't even know there were more debug messages...
I have CONFIG_SERIAL_PMACZILOG_TTYS=y of course:
pmac_zilog: 0.6 (Benjamin Herrenschmidt <benh@kernel.crashing.org>)
ttyS0 at MMIO 0x80813020 (irq = 16) is a Z85c30 ESCC - Serial port
ttyS1 at MMIO 0x80813000 (irq = 17) is a Z85c30 ESCC - Serial port
CONFIG_SERIO=y
CONFIG_SERIAL_PMACZILOG=y
CONFIG_SERIAL_PMACZILOG_TTYS=y
CONFIG_SERIAL_PMACZILOG_CONSOLE=y
> I do see it crash due to a message from the kernel but I can't get into
> xmon which is a pain.
Does the -serial stdio thing help?
(I know that to switch between screens in the qemu x11 window, it's
ctrl-alt-number, so ctrl-alt-1, ctrl-alt-2, and so on. I really don't use 'em
much, though.)
> If I modify the kernel to force udbg on channel A (same channel as the
> console), then the problem doesn't appear (it doesn't crash) :-)
You can attach gdb to qemu via the "qemu -s" option and then in gdb use the
"target remote" stuff like you would with gdbserver. It acts a bit like you've
connected it to a jtag through openocd, if that helps...
(I know qemu has many, many options I don't really use much.)
> Cheers
> Ben.
Rob
--
Latency is more important than throughput. It's that simple. - Linus Torvalds