* [PATCH] TTY: serial_core: Fix crash if agetty() run on non-console serial port
@ 2011-10-07 21:59 Doug Anderson
2011-10-08 8:49 ` Jiri Slaby
2011-10-12 0:25 ` [PATCH v2] TTY: serial_core: Fix crash if DCD drop during suspend Doug Anderson
0 siblings, 2 replies; 8+ messages in thread
From: Doug Anderson @ 2011-10-07 21:59 UTC (permalink / raw)
To: Greg Kroah-Hartman, Alan Cox; +Cc: linux-serial, linux-kernel, Doug Anderson
The symptoms of the bug showed up if you were running agetty on ttyS0
but not using ttyS0 as the console= port on the kernel command
line. In this case, if you suspended the system you'd get a crash
on resume.
The specific order of operations that were running:
* uart_suspend_port() would be called to put the uart in suspend mode
* a tty hangup would be processed
* the hangup would call uart_shutdown()
* ... suspend / resume happens ...
* uart_resume_port() would be called and run the code in the
(port->flags & ASYNC_SUSPENDED) block, which would startup the port
(and enable tx again).
* Since the UART would be available for tx, we'd immediately get
an interrupt, eventually calling transmit_chars()
* The transmit_chars() function would crash. The first crash would
be a dereference of a NULL tty member, but since the port has been
shutdown that was just a symptom.
Signed-off-by: Doug Anderson <dianders@chromium.org>
---
drivers/tty/serial/serial_core.c | 8 ++++++++
1 files changed, 8 insertions(+), 0 deletions(-)
diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
index a3efbea..668e56a 100644
--- a/drivers/tty/serial/serial_core.c
+++ b/drivers/tty/serial/serial_core.c
@@ -255,6 +255,14 @@ static void uart_shutdown(struct tty_struct *tty, struct uart_state *state)
}
/*
+ * It's possible for shutdown to be called after suspend. Specifically
+ * if agetty() is listening to the serial port we get a HUP after the
+ * suspend happend (and HUP calls shutdown). Clear suspended bit so
+ * we don't try to resume a port that has been shutdown.
+ */
+ clear_bit(ASYNCB_SUSPENDED, &port->flags);
+
+ /*
* kill off our tasklet
*/
tasklet_kill(&state->tlet);
--
1.7.3.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] TTY: serial_core: Fix crash if agetty() run on non-console serial port
2011-10-07 21:59 [PATCH] TTY: serial_core: Fix crash if agetty() run on non-console serial port Doug Anderson
@ 2011-10-08 8:49 ` Jiri Slaby
2011-10-08 15:55 ` Doug Anderson
2011-10-12 0:25 ` [PATCH v2] TTY: serial_core: Fix crash if DCD drop during suspend Doug Anderson
1 sibling, 1 reply; 8+ messages in thread
From: Jiri Slaby @ 2011-10-08 8:49 UTC (permalink / raw)
To: Doug Anderson; +Cc: Greg Kroah-Hartman, Alan Cox, linux-serial, linux-kernel
On 10/07/2011 11:59 PM, Doug Anderson wrote:
> The symptoms of the bug showed up if you were running agetty on ttyS0
> but not using ttyS0 as the console= port on the kernel command
> line. In this case, if you suspended the system you'd get a crash
> on resume.
>
> The specific order of operations that were running:
> * uart_suspend_port() would be called to put the uart in suspend mode
> * a tty hangup would be processed
> * the hangup would call uart_shutdown()
> * ... suspend / resume happens ...
> * uart_resume_port() would be called and run the code in the
> (port->flags & ASYNC_SUSPENDED) block, which would startup the port
> (and enable tx again).
> * Since the UART would be available for tx, we'd immediately get
> an interrupt, eventually calling transmit_chars()
> * The transmit_chars() function would crash. The first crash would
> be a dereference of a NULL tty member, but since the port has been
> shutdown that was just a symptom.
I cannot reproduce this. What uart driver is this with? And where does
it call uart_suspend_port from? Basically, it would mean that it calls
uart_suspend_port while userspace is still running? Or who HUPs the port
(the second point in your list)?
> Signed-off-by: Doug Anderson <dianders@chromium.org>
> ---
> drivers/tty/serial/serial_core.c | 8 ++++++++
> 1 files changed, 8 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
> index a3efbea..668e56a 100644
> --- a/drivers/tty/serial/serial_core.c
> +++ b/drivers/tty/serial/serial_core.c
> @@ -255,6 +255,14 @@ static void uart_shutdown(struct tty_struct *tty, struct uart_state *state)
> }
>
> /*
> + * It's possible for shutdown to be called after suspend. Specifically
> + * if agetty() is listening to the serial port we get a HUP after the
> + * suspend happend (and HUP calls shutdown). Clear suspended bit so
> + * we don't try to resume a port that has been shutdown.
> + */
> + clear_bit(ASYNCB_SUSPENDED, &port->flags);
> +
> + /*
> * kill off our tasklet
> */
> tasklet_kill(&state->tlet);
thanks,
--
js
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] TTY: serial_core: Fix crash if agetty() run on non-console serial port
2011-10-08 8:49 ` Jiri Slaby
@ 2011-10-08 15:55 ` Doug Anderson
2011-10-10 20:28 ` Doug Anderson
0 siblings, 1 reply; 8+ messages in thread
From: Doug Anderson @ 2011-10-08 15:55 UTC (permalink / raw)
To: Jiri Slaby; +Cc: Greg Kroah-Hartman, Alan Cox, linux-serial, linux-kernel
On Sat, Oct 8, 2011 at 1:49 AM, Jiri Slaby <jirislaby@gmail.com> wrote:
> I cannot reproduce this. What uart driver is this with? And where does
> it call uart_suspend_port from? Basically, it would mean that it calls
> uart_suspend_port while userspace is still running? Or who HUPs the port
> (the second point in your list)?
Thank you for testing! I am working with the 8250 driver controlling
the onboard UART on an nVidia Tegra T25 (ARM).
You raise very good questions, and I'm happy to keep tracking down to
continue to look for a deeper root cause. I continued to do some more
testing of this patch (with the help of a few others) after posting
the list. We tried the same setup on an x86-based board and the HUP
doesn't show up. ...so my patch wasn't needed on that board (although
my patch didn't hurt). ...that means, as you already pointed out,
that the HUP must _not_ be coming from userspace.
Probably the true bug in my system is that a spurious HUP is being
generated as a result of the suspend. That is probably platform
dependent and would explain why this hasn't been a bigger problem for
others.
...however, even though my patch shouldn't be needed for any hardware
that doesn't generate the spurious HUP, it might still be worthwhile
to consider including it? ...or perhaps changing it to a BUG_ON or
WARN_ON to at least detect the case of running the shutdown code after
the suspend code? Definitely the case that I ran into (shutdown after
suspend) will definitely cause a crash during resume.
I will spend time on Monday digging into the source of the HUP and
will post a followup.
-Doug
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] TTY: serial_core: Fix crash if agetty() run on non-console serial port
2011-10-08 15:55 ` Doug Anderson
@ 2011-10-10 20:28 ` Doug Anderson
0 siblings, 0 replies; 8+ messages in thread
From: Doug Anderson @ 2011-10-10 20:28 UTC (permalink / raw)
To: Jiri Slaby
Cc: Greg Kroah-Hartman, Alan Cox, linux-serial, linux-kernel,
Olof Johansson, Tom Warren, Allen Martin
OK, so I've tracked down the source of the HUP. Unfortunately, it
looks to be hardware that I have no control over.
Specifically, on the nVidia Tegra 250 on UARTB (and probably UARTs C,
D, and E), the DCD / DSR / DTR / RI lines are not pinned out.
However, some of the lines seem to be connected internally.
Specifically, the DTR line appears to be tied to the DCD and DSR lines
for some reason.
What does that mean for us? It means that when we call this in
uart_suspend_port():
ops->set_mctrl(uport, 0);
...we'll drop DTR. ...and because the lines are tied, we'll get a
drop of DCD and DSR. ...and a drop of DCD is a hangup.
So where does that leave us?
1. I'd like to get rid of the HUP, and it looks like I can use
mcr_mask / mcr_force to do this. I could do this in a nice way
(adding it as a flag in the serial8250_config structure) so that I
don't need to use the ugly ALPHA_KLUDGE_MCR. I'll submit a separate
patch for this and make sure to include everyone on this thread.
2. I still think that my original patch has some merit to it. It
shouldn't hurt anything, and there's always the (very unlikely) case
that someone else could see this if they get a DCD drop at just the
right time during suspend. If you agree, I'd love to get an ack on
it. :)
Look for my mcr_mask / mcr_force soon. I look forward to your
thoughts on the matter.
Thanks again,
-Doug
--
P.S. For the truly interested, here's the writeup I did for
<http://crosbug.com/21461>, which is the bug entry I created in the
Chromium OS project...
Specifically, here's the stack trace of the HUP being thrown into the system:
<7>[ 52.882844] ttyS0 hangup...
<5>[ 52.885986] Backtrace:
<5>[ 52.888770] [<c0053fa4>] (unwind_backtrace+0x0/0xec) from
[<c042ab14>] (dump_stack+0x28/0x30)
<5>[ 52.897853] [<c042ab14>] (dump_stack+0x28/0x30) from
[<c0266f78>] (tty_hangup+0x40/0x6c)
<5>[ 52.906486] [<c0266f78>] (tty_hangup+0x40/0x6c) from
[<c02828fc>] (check_modem_status+0xf4/0x1dc)
<5>[ 52.915914] [<c02828fc>] (check_modem_status+0xf4/0x1dc) from
[<c0283e08>] (serial8250_handle_port+0x2d0/0x300)
<5>[ 52.926591] [<c0283e08>] (serial8250_handle_port+0x2d0/0x300)
from [<c0283e98>] (serial8250_interrupt+0x60/0x134)
<5>[ 52.937449] [<c0283e98>] (serial8250_interrupt+0x60/0x134)
from [<c00c0904>] (handle_IRQ_event+0x88/0x198)
<5>[ 52.947681] [<c00c0904>] (handle_IRQ_event+0x88/0x198) from
[<c00c27d8>] (handle_level_irq+0xf4/0x188)
<5>[ 52.957565] [<c00c27d8>] (handle_level_irq+0xf4/0x188) from
[<c0047320>] (asm_do_IRQ+0x98/0xd8)
<5>[ 52.966810] [<c0047320>] (asm_do_IRQ+0x98/0xd8) from
[<c004cf38>] (__irq_svc+0x38/0xc0)
<5>[ 52.975316] Exception stack(0xf4fe5d28 to 0xf4fe5d70)
<5>[ 52.980800] 5d20: c060aac4 a0000093 00000000
00000000 c0698988 f5000800
<5>[ 52.989504] 5d40: c05f7cc0 f5000840 c05f7c08 00000000 f5000838
f4fe5da4 f4fe5d28 f4fe5d70
<5>[ 52.998192] 5d60: c005b2b4 c0281000 60000013 ffffffff
<5>[ 53.003688] [<c004cf38>] (__irq_svc+0x38/0xc0) from
[<c0281000>] (uart_suspend_port+0x18c/0x2e8)
<5>[ 53.013023] [<c0281000>] (uart_suspend_port+0x18c/0x2e8) from
[<c0282a70>] (serial8250_suspend+0x4c/0x68)
<5>[ 53.023176] [<c0282a70>] (serial8250_suspend+0x4c/0x68) from
[<c0292b40>] (platform_pm_suspend+0x58/0x6c)
<5>[ 53.033321] [<c0292b40>] (platform_pm_suspend+0x58/0x6c) from
[<c029654c>] (pm_op+0x60/0xbc)
<5>[ 53.042297] [<c029654c>] (pm_op+0x60/0xbc) from [<c0296d8c>]
(__device_suspend+0x138/0x1c8)
<5>[ 53.051184] [<c0296d8c>] (__device_suspend+0x138/0x1c8) from
[<c029715c>] (dpm_suspend_start+0x2ec/0x3dc)
<5>[ 53.061333] [<c029715c>] (dpm_suspend_start+0x2ec/0x3dc) from
[<c00b89cc>] (suspend_devices_and_enter+0xa8/0x2e0)
<5>[ 53.072189] [<c00b89cc>]
(suspend_devices_and_enter+0xa8/0x2e0) from [<c00b8cdc>]
(enter_state+0xd8/0x140)
<5>[ 53.082415] [<c00b8cdc>] (enter_state+0xd8/0x140) from
[<c00b80c4>] (state_store+0xb4/0xc8)
<5>[ 53.091306] [<c00b80c4>] (state_store+0xb4/0xc8) from
[<c02258ac>] (kobj_attr_store+0x1c/0x28)
<5>[ 53.100474] [<c02258ac>] (kobj_attr_store+0x1c/0x28) from
[<c0170628>] (sysfs_write_file+0x118/0x14c)
<5>[ 53.110272] [<c0170628>] (sysfs_write_file+0x118/0x14c) from
[<c011e050>] (vfs_write+0xc4/0x140)
<5>[ 53.119608] [<c011e050>] (vfs_write+0xc4/0x140) from
[<c011e2cc>] (sys_write+0x4c/0x78)
<5>[ 53.128136] [<c011e2cc>] (sys_write+0x4c/0x78) from
[<c004d380>] (ret_fast_syscall+0x0/0x30)
The check_modem_status+0xf4 is actually the line:
uart_handle_dcd_change(&up->port, status & UART_MSR_DCD);
The uart_suspend_port()+0x18c is the unlocking of the IRQ following the line:
ops->set_mctrl(uport, 0);
I've gone digging and it looks like the set_mctrl() ends up changing
MCR from 0x0b to 0x00. In the progress, the MSR goes from 0xb0 to
0x1a. Decoding that...
MCR:
* RTS: disable
* CTS: disable
* LOOKBK: disable
* OUT2: enable => disable
* OUT1: disable
* RTS: FORCE_RTS_LOW => FORCE_RTS_HI
* DTR: FORCE_DTR_LOW => FORCE_DTR_HI
MSR:
* CD: 1 => 0
* RI: 0
* DSR: 1 => 0
* CTS: 1
* DCD: 0 => 1
* DRI: 0
* DDSR: 0 => 1
* DCTS: 0
I tracked the changes to MCR down one bit at a time, and it appears
the the DTR bit change is the magic bit. I'm guessing that DTR is
routed internally back to the CD and DSR lines (the lines don't appear
to be externally available on the CPU according to the tech reference
manual).
<6>[ 41.302967] set_mctrl MSR starts at b0
<6>[ 41.307640] set_mctrl MCR 0b => 03, MSR became b0
<6>[ 41.313289] set_mctrl MCR 03 => 01, MSR became b0
<6>[ 41.318937] set_mctrl MCR 01 => 00, MSR became 1a
...and just to confirm I didn't mess up, I tried the other order:
<6>[ 54.522910] set_mctrl MSR starts at b0
<6>[ 54.527577] set_mctrl MCR 0b => 0a, MSR became 1a
<6>[ 54.533226] set_mctrl MCR 0a => 08, MSR became 1a
<6>[ 54.538870] set_mctrl MCR 08 => 00, MSR became 1a
---
On Sat, Oct 8, 2011 at 8:55 AM, Doug Anderson <dianders@chromium.org> wrote:
>
> On Sat, Oct 8, 2011 at 1:49 AM, Jiri Slaby <jirislaby@gmail.com> wrote:
> > I cannot reproduce this. What uart driver is this with? And where does
> > it call uart_suspend_port from? Basically, it would mean that it calls
> > uart_suspend_port while userspace is still running? Or who HUPs the port
> > (the second point in your list)?
>
> Thank you for testing! I am working with the 8250 driver controlling
> the onboard UART on an nVidia Tegra T25 (ARM).
>
> You raise very good questions, and I'm happy to keep tracking down to
> continue to look for a deeper root cause. I continued to do some more
> testing of this patch (with the help of a few others) after posting
> the list. We tried the same setup on an x86-based board and the HUP
> doesn't show up. ...so my patch wasn't needed on that board (although
> my patch didn't hurt). ...that means, as you already pointed out,
> that the HUP must _not_ be coming from userspace.
>
> Probably the true bug in my system is that a spurious HUP is being
> generated as a result of the suspend. That is probably platform
> dependent and would explain why this hasn't been a bigger problem for
> others.
>
>
> ...however, even though my patch shouldn't be needed for any hardware
> that doesn't generate the spurious HUP, it might still be worthwhile
> to consider including it? ...or perhaps changing it to a BUG_ON or
> WARN_ON to at least detect the case of running the shutdown code after
> the suspend code? Definitely the case that I ran into (shutdown after
> suspend) will definitely cause a crash during resume.
>
>
> I will spend time on Monday digging into the source of the HUP and
> will post a followup.
>
> -Doug
--
To unsubscribe from this list: send the line "unsubscribe linux-serial" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH v2] TTY: serial_core: Fix crash if DCD drop during suspend
2011-10-07 21:59 [PATCH] TTY: serial_core: Fix crash if agetty() run on non-console serial port Doug Anderson
2011-10-08 8:49 ` Jiri Slaby
@ 2011-10-12 0:25 ` Doug Anderson
2011-10-12 11:57 ` Alan Cox
2011-10-18 21:21 ` Greg KH
1 sibling, 2 replies; 8+ messages in thread
From: Doug Anderson @ 2011-10-12 0:25 UTC (permalink / raw)
To: Jiri Slaby, Stephen Warren, Olof Johansson
Cc: Greg Kroah-Hartman, Alan Cox, Allen Martin, Tom Warren,
linux-serial, linux-kernel, Doug Anderson
This crash was showing up 100% of the time on Tegra CPUs when an
agetty was running on the serial port and the console was not running
on the serial port. The reason the Tegra saw it so reliably is that
the Tegra CPU internally ties DTR to DCD/DSR. That means when we
dropped DTR during suspend we would get always get an immediate DCD
drop.
The specific order of operations that were running:
* uart_suspend_port() would be called to put the uart in suspend mode
* we'd drop DTR (ops->set_mctrl(uport, 0)).
* the DTR drop would be looped back in the CPU to be a DCD drop.
* the DCD drop would look to the serial driver as a hangup
* the hangup would call uart_shutdown()
* ... suspend / resume happens ...
* uart_resume_port() would be called and run the code in the
(port->flags & ASYNC_SUSPENDED) block, which would startup the port
(and enable tx again).
* Since the UART would be available for tx, we'd immediately get
an interrupt, eventually calling transmit_chars()
* The transmit_chars() function would crash. The first crash would
be a dereference of a NULL tty member, but since the port has been
shutdown that was just a symptom.
I have proposed a patch that would fix the Tegra CPUs here (see
https://lkml.org/lkml/2011/10/11/444 - tty/serial: Prevent drop of DCD
on suspend for Tegra UARTs). However, even with that fix it is still
possible for systems that have an externally visible DCD line to see a
crash if the DCD drops at just the right time during suspend: thus
this patch is still useful.
Signed-off-by: Doug Anderson <dianders@chromium.org>
---
v2: Updated change description and comments based on deeper digging.
drivers/tty/serial/serial_core.c | 7 +++++++
1 files changed, 7 insertions(+), 0 deletions(-)
diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
index a3efbea..7a86b39 100644
--- a/drivers/tty/serial/serial_core.c
+++ b/drivers/tty/serial/serial_core.c
@@ -255,6 +255,13 @@ static void uart_shutdown(struct tty_struct *tty, struct uart_state *state)
}
/*
+ * It's possible for shutdown to be called after suspend if we get
+ * a DCD drop (hangup) at just the right time. Clear suspended bit so
+ * we don't try to resume a port that has been shutdown.
+ */
+ clear_bit(ASYNCB_SUSPENDED, &port->flags);
+
+ /*
* kill off our tasklet
*/
tasklet_kill(&state->tlet);
--
1.7.3.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH v2] TTY: serial_core: Fix crash if DCD drop during suspend
2011-10-12 0:25 ` [PATCH v2] TTY: serial_core: Fix crash if DCD drop during suspend Doug Anderson
@ 2011-10-12 11:57 ` Alan Cox
2011-10-18 21:21 ` Greg KH
1 sibling, 0 replies; 8+ messages in thread
From: Alan Cox @ 2011-10-12 11:57 UTC (permalink / raw)
To: Doug Anderson
Cc: Jiri Slaby, Stephen Warren, Olof Johansson, Greg Kroah-Hartman,
Alan Cox, Allen Martin, Tom Warren, linux-serial, linux-kernel
On Tue, 11 Oct 2011 17:25:44 -0700
Doug Anderson <dianders@chromium.org> wrote:
> This crash was showing up 100% of the time on Tegra CPUs when an
> agetty was running on the serial port and the console was not running
> on the serial port. The reason the Tegra saw it so reliably is that
> the Tegra CPU internally ties DTR to DCD/DSR. That means when we
> dropped DTR during suspend we would get always get an immediate DCD
> drop.
>
> The specific order of operations that were running:
> * uart_suspend_port() would be called to put the uart in suspend mode
> * we'd drop DTR (ops->set_mctrl(uport, 0)).
> * the DTR drop would be looped back in the CPU to be a DCD drop.
> * the DCD drop would look to the serial driver as a hangup
> * the hangup would call uart_shutdown()
> * ... suspend / resume happens ...
> * uart_resume_port() would be called and run the code in the
> (port->flags & ASYNC_SUSPENDED) block, which would startup the port
> (and enable tx again).
> * Since the UART would be available for tx, we'd immediately get
> an interrupt, eventually calling transmit_chars()
> * The transmit_chars() function would crash. The first crash would
> be a dereference of a NULL tty member, but since the port has been
> shutdown that was just a symptom.
>
> I have proposed a patch that would fix the Tegra CPUs here (see
> https://lkml.org/lkml/2011/10/11/444 - tty/serial: Prevent drop of DCD
> on suspend for Tegra UARTs). However, even with that fix it is still
> possible for systems that have an externally visible DCD line to see a
> crash if the DCD drops at just the right time during suspend: thus
> this patch is still useful.
>
> Signed-off-by: Doug Anderson <dianders@chromium.org>
Acked-by: Alan Cox <alan@linux.intel.com>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2] TTY: serial_core: Fix crash if DCD drop during suspend
2011-10-12 0:25 ` [PATCH v2] TTY: serial_core: Fix crash if DCD drop during suspend Doug Anderson
2011-10-12 11:57 ` Alan Cox
@ 2011-10-18 21:21 ` Greg KH
2011-10-19 18:52 ` [PATCH v3] " Doug Anderson
1 sibling, 1 reply; 8+ messages in thread
From: Greg KH @ 2011-10-18 21:21 UTC (permalink / raw)
To: Doug Anderson
Cc: Jiri Slaby, Stephen Warren, Olof Johansson, Greg Kroah-Hartman,
Alan Cox, Allen Martin, Tom Warren, linux-serial, linux-kernel
On Tue, Oct 11, 2011 at 05:25:44PM -0700, Doug Anderson wrote:
> This crash was showing up 100% of the time on Tegra CPUs when an
> agetty was running on the serial port and the console was not running
> on the serial port. The reason the Tegra saw it so reliably is that
> the Tegra CPU internally ties DTR to DCD/DSR. That means when we
> dropped DTR during suspend we would get always get an immediate DCD
> drop.
>
> The specific order of operations that were running:
> * uart_suspend_port() would be called to put the uart in suspend mode
> * we'd drop DTR (ops->set_mctrl(uport, 0)).
> * the DTR drop would be looped back in the CPU to be a DCD drop.
> * the DCD drop would look to the serial driver as a hangup
> * the hangup would call uart_shutdown()
> * ... suspend / resume happens ...
> * uart_resume_port() would be called and run the code in the
> (port->flags & ASYNC_SUSPENDED) block, which would startup the port
> (and enable tx again).
> * Since the UART would be available for tx, we'd immediately get
> an interrupt, eventually calling transmit_chars()
> * The transmit_chars() function would crash. The first crash would
> be a dereference of a NULL tty member, but since the port has been
> shutdown that was just a symptom.
>
> I have proposed a patch that would fix the Tegra CPUs here (see
> https://lkml.org/lkml/2011/10/11/444 - tty/serial: Prevent drop of DCD
> on suspend for Tegra UARTs). However, even with that fix it is still
> possible for systems that have an externally visible DCD line to see a
> crash if the DCD drops at just the right time during suspend: thus
> this patch is still useful.
>
> Signed-off-by: Doug Anderson <dianders@chromium.org>
> Acked-by: Alan Cox <alan@linux.intel.com>
> ---
> v2: Updated change description and comments based on deeper digging.
This patch doesn't apply at all to the linux-next tree, can you redo it
so that I know it is correct, and resend it? (remember to keep Alan's
Ack on it.)
thanks,
greg k-h
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH v3] TTY: serial_core: Fix crash if DCD drop during suspend
2011-10-18 21:21 ` Greg KH
@ 2011-10-19 18:52 ` Doug Anderson
0 siblings, 0 replies; 8+ messages in thread
From: Doug Anderson @ 2011-10-19 18:52 UTC (permalink / raw)
To: Greg KH
Cc: Jiri Slaby, Stephen Warren, Olof Johansson, Greg Kroah-Hartman,
Alan Cox, Allen Martin, Tom Warren, linux-serial, linux-kernel,
Doug Anderson
This crash was showing up 100% of the time on Tegra CPUs when an
agetty was running on the serial port and the console was not running
on the serial port. The reason the Tegra saw it so reliably is that
the Tegra CPU internally ties DTR to DCD/DSR. That means when we
dropped DTR during suspend we would get always get an immediate DCD
drop.
The specific order of operations that were running:
* uart_suspend_port() would be called to put the uart in suspend mode
* we'd drop DTR (ops->set_mctrl(uport, 0)).
* the DTR drop would be looped back in the CPU to be a DCD drop.
* the DCD drop would look to the serial driver as a hangup
* the hangup would call uart_shutdown()
* ... suspend / resume happens ...
* uart_resume_port() would be called and run the code in the
(port->flags & ASYNC_SUSPENDED) block, which would startup the port
(and enable tx again).
* Since the UART would be available for tx, we'd immediately get
an interrupt, eventually calling transmit_chars()
* The transmit_chars() function would crash. The first crash would
be a dereference of a NULL tty member, but since the port has been
shutdown that was just a symptom.
I have proposed a patch that would fix the Tegra CPUs here (see
https://lkml.org/lkml/2011/10/11/444 - tty/serial: Prevent drop of DCD
on suspend for Tegra UARTs). However, even with that fix it is still
possible for systems that have an externally visible DCD line to see a
crash if the DCD drops at just the right time during suspend: thus
this patch is still useful.
Signed-off-by: Doug Anderson <dianders@chromium.org>
Acked-by: Alan Cox <alan@linux.intel.com>
---
v3: Applies against linux-next (needed after 6a3e492b)
drivers/tty/serial/serial_core.c | 7 +++++++
1 files changed, 7 insertions(+), 0 deletions(-)
diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
index 9dc44a5..0406d7f 100644
--- a/drivers/tty/serial/serial_core.c
+++ b/drivers/tty/serial/serial_core.c
@@ -249,6 +249,13 @@ static void uart_shutdown(struct tty_struct *tty, struct uart_state *state)
}
/*
+ * It's possible for shutdown to be called after suspend if we get
+ * a DCD drop (hangup) at just the right time. Clear suspended bit so
+ * we don't try to resume a port that has been shutdown.
+ */
+ clear_bit(ASYNCB_SUSPENDED, &port->flags);
+
+ /*
* Free the transmit buffer page.
*/
if (state->xmit.buf) {
--
1.7.3.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
end of thread, other threads:[~2011-10-19 19:01 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-10-07 21:59 [PATCH] TTY: serial_core: Fix crash if agetty() run on non-console serial port Doug Anderson
2011-10-08 8:49 ` Jiri Slaby
2011-10-08 15:55 ` Doug Anderson
2011-10-10 20:28 ` Doug Anderson
2011-10-12 0:25 ` [PATCH v2] TTY: serial_core: Fix crash if DCD drop during suspend Doug Anderson
2011-10-12 11:57 ` Alan Cox
2011-10-18 21:21 ` Greg KH
2011-10-19 18:52 ` [PATCH v3] " Doug Anderson
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).