From: Greg KH <gregkh@linuxfoundation.org>
To: Mark Roszko <mark.roszko@gmail.com>
Cc: Nicolas Ferre <nicolas.ferre@atmel.com>,
Leilei Zhao <leilei.zhao@atmel.com>,
mdeneen@gmail.com, linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, linux-serial@vger.kernel.org,
stable@vger.kernel.org
Subject: Re: [PATCH 3/4] tty/serial: at91: prevent null dereference in tasklet function
Date: Tue, 7 Jan 2014 19:44:53 -0800 [thread overview]
Message-ID: <20140108034453.GA8664@kroah.com> (raw)
In-Reply-To: <CAJjB1qLDd6JFb4LwD5iokg5=O8Bwp5MkKsrr45QDodZkZrx75g@mail.gmail.com>
A: No.
Q: Should I include quotations after my reply?
http://daringfireball.net/2007/07/on_top
On Tue, Jan 07, 2014 at 08:52:34PM -0500, Mark Roszko wrote:
> This patch I was somewhat hesitant to pushing to Atmel when I did the other two
> patches.
Then why did you? :)
> The main issue is the use of systemd causes the serial driver to somehow get
> out of sync on startups randomly. i.e. on one bootup it's fine, and on other
> it'll kernel panic.
> It occurs because systemd causes the startup and shutdown driver ops to be
> fired in rapid succession.
How does systemd cause this? Is this when the serial port is being used
as a console or something else?
> Every single service start message causes the _startup callback first, then the
> message prints and _shutdown callbacks fires.
So something is opening and closing the port quickly? That should be
easy to test without systemd.
Any console involved?
> And a kernel panic always occurs somewhere during the service status output,
> never before when it's just the kernel startup and never after once systemd
> finishes and getty for example takes over.
>
> The stacktrace looked like this:
> Unable to handle kernel NULL pointer dereference at virtual address 00000088
> pgd = c0004000
> [00000088] *pgd=00000000
> Internal error: Oops: 17 [#1] ARM
> Modules linked in: autofs4
> CPU: 0 Not tainted (3.6.9-yocto-standard #1)
3.6.9 is _very_ old, loads of things have happened in the tty layer
since then, can you duplicate this on 3.12 or 3.13-rc?
> PC is at tty_wakeup+0x8/0x58
> LR is at atmel_tasklet_func+0x238/0x80c
> pc : [<c01efd84>] lr : [<c0208fc0>] psr: a00f0013
> sp : df84ff28 ip : 00000000 fp : 00000100
> r10: c05d1ec0 r9 : 04208040 r8 : c05d1e80
> r7 : 0000000a r6 : 00000000 r5 : dedf7c00 r4 : 00000000
> r3 : dedf7c00 r2 : 00000000 r1 : 600f0013 r0 : 00000000
> Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel
> Control: 10c53c7d Table: 3fb0c059 DAC: 00000015
> Process ksoftirqd/0 (pid: 3, stack limit = 0xdf84e2f0)
> Stack: (0xdf84ff28 to 0xdf850000)
> ff20: c05e4378 dedf7c00 00000000 c0208fc0 c05aa458 df8496b0
> ff40: df849680 c05aa458 df8496b0 c00394a8 c0039420 c05a8d04 00000000 00000000
> ff60: 0000000a c05d1e80 04208040 c05d1ec0 00000100 c001fd34 00000009 00000018
> ff80: df84e000 c001f60c df84ffbc c03f8e60 00000013 00000006 00000000 c05d1e80
> ffa0: df84e000 00000000 00000013 00000000 00000000 00000000 00000000 c001f728
> ffc0: df84bf3c 00000000 c001f6c0 c0030870 00000000 00000000 00000000 00000000
> ffe0: df84ffe0 df84ffe0 df84bf3c c00307f0 c000e348 c000e348 fff73fbf 3fbeffff
> [<c01efd84>] (tty_wakeup+0x8/0x58) from [<c0208fc0>] (atmel_tasklet_func+0x238/
> 0x80c)
> [<c0208fc0>] (atmel_tasklet_func+0x238/0x80c) from [<c001fd34>]
> (tasklet_action+0x70/0xa8)
> [<c001fd34>] (tasklet_action+0x70/0xa8) from [<c001f60c>] (__do_softirq+0x90/
> 0x144)
> [<c001f60c>] (__do_softirq+0x90/0x144) from [<c001f728>] (run_ksoftirqd+0x68/
> 0x104)
> [<c001f728>] (run_ksoftirqd+0x68/0x104) from [<c0030870>] (kthread+0x80/0x90)
> [<c0030870>] (kthread+0x80/0x90) from [<c000e348>] (kernel_thread_exit+0x0/0x8)
> Code: e8bd8070 c05ac0b8 e92d4070 e1a04000 (e5903088)
> ---[ end trace 15b8a9aeaf7b457f ]---
>
>
> Testing wise I originally used a BUG_ON statement in atmel_tasklet_func to
> panic before the null deference hit. BUG_ON confirmed that tty was NULL at the
> very start of the tasklet being called.
And did you test that this patch actually fixed it?
> The atmel_shutdown function should be killing the tasklet (after patch "Handle
> shutdown more safely") and does disable interrupts so I've been at a loss at
> where the race condition was occurring that a tasklet could escape beyond
> shutdown.
Are you sure you aren't just racing with shutdown? Need a lock
somewhere?
greg k-h
WARNING: multiple messages have this Message-ID (diff)
From: gregkh@linuxfoundation.org (Greg KH)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH 3/4] tty/serial: at91: prevent null dereference in tasklet function
Date: Tue, 7 Jan 2014 19:44:53 -0800 [thread overview]
Message-ID: <20140108034453.GA8664@kroah.com> (raw)
In-Reply-To: <CAJjB1qLDd6JFb4LwD5iokg5=O8Bwp5MkKsrr45QDodZkZrx75g@mail.gmail.com>
A: No.
Q: Should I include quotations after my reply?
http://daringfireball.net/2007/07/on_top
On Tue, Jan 07, 2014 at 08:52:34PM -0500, Mark Roszko wrote:
> This patch I was somewhat hesitant to pushing to Atmel when I did the other two
> patches.
Then why did you? :)
> The main issue is the use of systemd causes the serial driver to somehow get
> out of sync on startups randomly. i.e. on one bootup it's fine, and on other
> it'll kernel panic.?
> It occurs because systemd causes the startup and shutdown driver ops to be
> fired in rapid succession.?
How does systemd cause this? Is this when the serial port is being used
as a console or something else?
> Every single service start message causes the _startup callback first, then the
> message prints and _shutdown callbacks fires.?
So something is opening and closing the port quickly? That should be
easy to test without systemd.
Any console involved?
> And a kernel panic always occurs somewhere during the service status output,
> never before when it's just the kernel startup and never after once systemd
> finishes and getty for example takes over.?
>
> The stacktrace looked like this:
> Unable to handle kernel NULL pointer dereference at virtual address 00000088
> pgd = c0004000
> [00000088] *pgd=00000000
> Internal error: Oops: 17 [#1] ARM
> Modules linked in: autofs4
> CPU: 0 ? ?Not tainted ?(3.6.9-yocto-standard #1)
3.6.9 is _very_ old, loads of things have happened in the tty layer
since then, can you duplicate this on 3.12 or 3.13-rc?
> PC is at tty_wakeup+0x8/0x58
> LR is at atmel_tasklet_func+0x238/0x80c
> pc : [<c01efd84>] ? ?lr : [<c0208fc0>] ? ?psr: a00f0013
> sp : df84ff28 ?ip : 00000000 ?fp : 00000100
> r10: c05d1ec0 ?r9 : 04208040 ?r8 : c05d1e80
> r7 : 0000000a ?r6 : 00000000 ?r5 : dedf7c00 ?r4 : 00000000
> r3 : dedf7c00 ?r2 : 00000000 ?r1 : 600f0013 ?r0 : 00000000
> Flags: NzCv ?IRQs on ?FIQs on ?Mode SVC_32 ?ISA ARM ?Segment kernel
> Control: 10c53c7d ?Table: 3fb0c059 ?DAC: 00000015
> Process ksoftirqd/0 (pid: 3, stack limit = 0xdf84e2f0)
> Stack: (0xdf84ff28 to 0xdf850000)
> ff20: ? ? ? ? ? ? ? ? ? c05e4378 dedf7c00 00000000 c0208fc0 c05aa458 df8496b0
> ff40: df849680 c05aa458 df8496b0 c00394a8 c0039420 c05a8d04 00000000 00000000
> ff60: 0000000a c05d1e80 04208040 c05d1ec0 00000100 c001fd34 00000009 00000018
> ff80: df84e000 c001f60c df84ffbc c03f8e60 00000013 00000006 00000000 c05d1e80
> ffa0: df84e000 00000000 00000013 00000000 00000000 00000000 00000000 c001f728
> ffc0: df84bf3c 00000000 c001f6c0 c0030870 00000000 00000000 00000000 00000000
> ffe0: df84ffe0 df84ffe0 df84bf3c c00307f0 c000e348 c000e348 fff73fbf 3fbeffff
> [<c01efd84>] (tty_wakeup+0x8/0x58) from [<c0208fc0>] (atmel_tasklet_func+0x238/
> 0x80c)
> [<c0208fc0>] (atmel_tasklet_func+0x238/0x80c) from [<c001fd34>]
> (tasklet_action+0x70/0xa8)
> [<c001fd34>] (tasklet_action+0x70/0xa8) from [<c001f60c>] (__do_softirq+0x90/
> 0x144)
> [<c001f60c>] (__do_softirq+0x90/0x144) from [<c001f728>] (run_ksoftirqd+0x68/
> 0x104)
> [<c001f728>] (run_ksoftirqd+0x68/0x104) from [<c0030870>] (kthread+0x80/0x90)
> [<c0030870>] (kthread+0x80/0x90) from [<c000e348>] (kernel_thread_exit+0x0/0x8)
> Code: e8bd8070 c05ac0b8 e92d4070 e1a04000 (e5903088)
> ---[ end trace 15b8a9aeaf7b457f ]---
>
>
> Testing wise I originally used a BUG_ON statement in atmel_tasklet_func to
> panic before the null deference hit. BUG_ON confirmed that tty was NULL at the
> very start of the tasklet being called.?
And did you test that this patch actually fixed it?
> The atmel_shutdown function should be killing the tasklet (after patch "Handle
> shutdown more safely") and does disable interrupts so I've been at a loss at
> where the race condition was occurring that a tasklet could escape beyond
> shutdown.
Are you sure you aren't just racing with shutdown? Need a lock
somewhere?
greg k-h
WARNING: multiple messages have this Message-ID (diff)
From: Greg KH <gregkh@linuxfoundation.org>
To: Mark Roszko <mark.roszko@gmail.com>
Cc: Nicolas Ferre <nicolas.ferre@atmel.com>,
Leilei Zhao <leilei.zhao@atmel.com>,
mdeneen@gmail.com, linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, linux-serial@vger.kernel.org,
stable@vger.kernel.org
Subject: Re: [PATCH 3/4] tty/serial: at91: prevent null dereference in tasklet function
Date: Tue, 7 Jan 2014 19:44:53 -0800 [thread overview]
Message-ID: <20140108034453.GA8664@kroah.com> (raw)
In-Reply-To: <CAJjB1qLDd6JFb4LwD5iokg5=O8Bwp5MkKsrr45QDodZkZrx75g@mail.gmail.com>
A: No.
Q: Should I include quotations after my reply?
http://daringfireball.net/2007/07/on_top
On Tue, Jan 07, 2014 at 08:52:34PM -0500, Mark Roszko wrote:
> This patch I was somewhat hesitant to pushing to Atmel when I did the other two
> patches.
Then why did you? :)
> The main issue is the use of systemd causes the serial driver to somehow get
> out of sync on startups randomly. i.e. on one bootup it's fine, and on other
> it'll kernel panic.�
> It occurs because systemd causes the startup and shutdown driver ops to be
> fired in rapid succession.�
How does systemd cause this? Is this when the serial port is being used
as a console or something else?
> Every single service start message causes the _startup callback first, then the
> message prints and _shutdown callbacks fires.�
So something is opening and closing the port quickly? That should be
easy to test without systemd.
Any console involved?
> And a kernel panic always occurs somewhere during the service status output,
> never before when it's just the kernel startup and never after once systemd
> finishes and getty for example takes over.�
>
> The stacktrace looked like this:
> Unable to handle kernel NULL pointer dereference at virtual address 00000088
> pgd = c0004000
> [00000088] *pgd=00000000
> Internal error: Oops: 17 [#1] ARM
> Modules linked in: autofs4
> CPU: 0 � �Not tainted �(3.6.9-yocto-standard #1)
3.6.9 is _very_ old, loads of things have happened in the tty layer
since then, can you duplicate this on 3.12 or 3.13-rc?
> PC is at tty_wakeup+0x8/0x58
> LR is at atmel_tasklet_func+0x238/0x80c
> pc : [<c01efd84>] � �lr : [<c0208fc0>] � �psr: a00f0013
> sp : df84ff28 �ip : 00000000 �fp : 00000100
> r10: c05d1ec0 �r9 : 04208040 �r8 : c05d1e80
> r7 : 0000000a �r6 : 00000000 �r5 : dedf7c00 �r4 : 00000000
> r3 : dedf7c00 �r2 : 00000000 �r1 : 600f0013 �r0 : 00000000
> Flags: NzCv �IRQs on �FIQs on �Mode SVC_32 �ISA ARM �Segment kernel
> Control: 10c53c7d �Table: 3fb0c059 �DAC: 00000015
> Process ksoftirqd/0 (pid: 3, stack limit = 0xdf84e2f0)
> Stack: (0xdf84ff28 to 0xdf850000)
> ff20: � � � � � � � � � c05e4378 dedf7c00 00000000 c0208fc0 c05aa458 df8496b0
> ff40: df849680 c05aa458 df8496b0 c00394a8 c0039420 c05a8d04 00000000 00000000
> ff60: 0000000a c05d1e80 04208040 c05d1ec0 00000100 c001fd34 00000009 00000018
> ff80: df84e000 c001f60c df84ffbc c03f8e60 00000013 00000006 00000000 c05d1e80
> ffa0: df84e000 00000000 00000013 00000000 00000000 00000000 00000000 c001f728
> ffc0: df84bf3c 00000000 c001f6c0 c0030870 00000000 00000000 00000000 00000000
> ffe0: df84ffe0 df84ffe0 df84bf3c c00307f0 c000e348 c000e348 fff73fbf 3fbeffff
> [<c01efd84>] (tty_wakeup+0x8/0x58) from [<c0208fc0>] (atmel_tasklet_func+0x238/
> 0x80c)
> [<c0208fc0>] (atmel_tasklet_func+0x238/0x80c) from [<c001fd34>]
> (tasklet_action+0x70/0xa8)
> [<c001fd34>] (tasklet_action+0x70/0xa8) from [<c001f60c>] (__do_softirq+0x90/
> 0x144)
> [<c001f60c>] (__do_softirq+0x90/0x144) from [<c001f728>] (run_ksoftirqd+0x68/
> 0x104)
> [<c001f728>] (run_ksoftirqd+0x68/0x104) from [<c0030870>] (kthread+0x80/0x90)
> [<c0030870>] (kthread+0x80/0x90) from [<c000e348>] (kernel_thread_exit+0x0/0x8)
> Code: e8bd8070 c05ac0b8 e92d4070 e1a04000 (e5903088)
> ---[ end trace 15b8a9aeaf7b457f ]---
>
>
> Testing wise I originally used a BUG_ON statement in atmel_tasklet_func to
> panic before the null deference hit. BUG_ON confirmed that tty was NULL at the
> very start of the tasklet being called.�
And did you test that this patch actually fixed it?
> The atmel_shutdown function should be killing the tasklet (after patch "Handle
> shutdown more safely") and does disable interrupts so I've been at a loss at
> where the race condition was occurring that a tasklet could escape beyond
> shutdown.
Are you sure you aren't just racing with shutdown? Need a lock
somewhere?
greg k-h
next prev parent reply other threads:[~2014-01-08 3:44 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-01-07 10:41 [PATCH 0/4] tty/serial: at91: fixes dealing with port shutdown Nicolas Ferre
2014-01-07 10:41 ` Nicolas Ferre
2014-01-07 10:41 ` Nicolas Ferre
2014-01-07 10:45 ` [PATCH 1/4] tty/serial: at91: Handle shutdown more safely Nicolas Ferre
2014-01-07 10:45 ` Nicolas Ferre
2014-01-07 10:45 ` Nicolas Ferre
2014-01-07 10:45 ` [PATCH 2/4] tty/serial: at91: fix race condition in atmel_serial_remove Nicolas Ferre
2014-01-07 10:45 ` Nicolas Ferre
2014-01-07 10:45 ` Nicolas Ferre
2014-01-07 10:45 ` [PATCH 3/4] tty/serial: at91: prevent null dereference in tasklet function Nicolas Ferre
2014-01-07 10:45 ` Nicolas Ferre
2014-01-07 10:45 ` Nicolas Ferre
2014-01-08 1:11 ` Greg KH
2014-01-08 1:11 ` Greg KH
[not found] ` <CAJjB1qLDd6JFb4LwD5iokg5=O8Bwp5MkKsrr45QDodZkZrx75g@mail.gmail.com>
2014-01-08 3:44 ` Greg KH [this message]
2014-01-08 3:44 ` Greg KH
2014-01-08 3:44 ` Greg KH
2014-01-07 10:45 ` [PATCH 4/4] tty/serial: at91: reset rx_ring when port is shutdown Nicolas Ferre
2014-01-07 10:45 ` Nicolas Ferre
2014-01-07 10:45 ` Nicolas Ferre
2014-01-07 10:48 ` [PATCH 0/4] tty/serial: at91: fixes dealing with port shutdown Nicolas Ferre
2014-01-07 10:48 ` Nicolas Ferre
2014-01-07 10:48 ` Nicolas Ferre
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140108034453.GA8664@kroah.com \
--to=gregkh@linuxfoundation.org \
--cc=leilei.zhao@atmel.com \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-serial@vger.kernel.org \
--cc=mark.roszko@gmail.com \
--cc=mdeneen@gmail.com \
--cc=nicolas.ferre@atmel.com \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.