* Kernel Oops caused by high uart write loads
From: Belser Florian @ 2013-01-30 7:53 UTC (permalink / raw)
To: 'linux-rt-users@vger.kernel.org'
Hi all,
I'm running 3.4.17-rt28 on my mpc5200-based system. The complete system works pretty well until I select the "Fully Preemptible Kernel" option in the kernel settings.
In that case, if I generate a high uart write load (sending a lot of data via Bluetooth), I get the following kernel Oops:
# ------------[ cut here ]------------
Kernel BUG at c03d1728 [verbose debug info unavailable]
Oops: Exception in kernel mode, sig: 5 [#1]
PREEMPT mpc5200-simple-platform
Modules linked in:
NIP: c03d1728 LR: c03d170c CTR: c01efc78
REGS: c716fd30 TRAP: 0700 Not tainted (3.4.17-rt28/STW-V3.00r0+)
MSR: 00029032 <EE,ME,IR,DR,RI> CR: 88002022 XER: 00000000
TASK = c7125880[633] 'irq/129-mpc52xx' THREAD: c716e000
GPR00: 00000001 c716fde0 c7125880 00000000 c7125880 00000000 00000001 00000000
GPR08: c7125880 c7125880 c7125880 c7125881 88002022 fbfdffff 07fff000 00000004
GPR16: 00000024 00000000 000000c0 00000000 c0537e70 00000004 00000000 00000004
GPR24: c716ad5c c0537e8c c7a73000 00000000 c7bb2a00 c7125880 c78b9800 c0537e8c
NIP [c03d1728] rt_spin_lock_slowlock+0x78/0x1e0
LR [c03d170c] rt_spin_lock_slowlock+0x5c/0x1e0
Call Trace:
[c716fde0] [c03d170c] rt_spin_lock_slowlock+0x5c/0x1e0 (unreliable)
[c716fe40] [c01efcd4] uart_write+0x5c/0x114
[c716fe70] [c028f3f4] hci_uart_tx_wakeup+0xe0/0x1fc
[c716fea0] [c01d3398] tty_wakeup+0x78/0xac
[c716feb0] [c01ee9e0] uart_write_wakeup+0x24/0x34
[c716fec0] [c01f1c38] mpc52xx_psc_handle_irq+0x3f8/0x4b0
[c716ff20] [c01f13e4] mpc52xx_uart_int+0x38/0x60
[c716ff30] [c005f660] irq_forced_thread_fn+0x38/0x9c
[c716ff50] [c005f42c] irq_thread+0x13c/0x1c0
[c716ff90] [c00391d4] kthread+0x8c/0x90
[c716fff0] [c000dd4c] kernel_thread+0x4c/0x68
Instruction dump:
7fe3fb78 7fa4eb78 38a00000 38c00001 4bc86591 2f830000 409e0134 801f0018
5400003c 7fa00278 7c000034 5400d97e <0f000000> 3bdd0418 3b810008 7fc3f378
If I switch the preemption model to "Preemptible Kernel (Basic RT)", everything works fine.
I hope someone has already had the same or a similar problem and can help me solve it.
Maybe an update to 3.4.27-rt39 would help?
Thanks a lot
Florian Belser
--
To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: Kernel Oops caused by high uart write loads
From: Thomas Gleixner @ 2013-02-04 11:44 UTC (permalink / raw)
To: Belser Florian
Cc: 'linux-rt-users@vger.kernel.org', linux-serial,
linux-bluetooth
On Wed, 30 Jan 2013, Belser Florian wrote:
> I'm running 3.4.17-rt28 on my mpc5200 based system. The complete
> system works pretty good until I select the "Fully Preemptible
> Kernel" option in the kernel settings. In that case, if I generate
> a high uart write load (sending a lot of stuff via Bluetooth) I get
> the following kernel Oops:
> # ------------[ cut here ]------------
> Kernel BUG at c03d1728 [verbose debug info unavailable]
I bet this is: BUG_ON(rt_mutex_owner(lock) == self);
> Oops: Exception in kernel mode, sig: 5 [#1]
> PREEMPT mpc5200-simple-platform
> Modules linked in:
> NIP: c03d1728 LR: c03d170c CTR: c01efc78
> REGS: c716fd30 TRAP: 0700 Not tainted (3.4.17-rt28/STW-V3.00r0+)
> MSR: 00029032 <EE,ME,IR,DR,RI> CR: 88002022 XER: 00000000
> TASK = c7125880[633] 'irq/129-mpc52xx' THREAD: c716e000
> GPR00: 00000001 c716fde0 c7125880 00000000 c7125880 00000000 00000001 00000000
> GPR08: c7125880 c7125880 c7125880 c7125881 88002022 fbfdffff 07fff000 00000004
> GPR16: 00000024 00000000 000000c0 00000000 c0537e70 00000004 00000000 00000004
> GPR24: c716ad5c c0537e8c c7a73000 00000000 c7bb2a00 c7125880 c78b9800 c0537e8c
> NIP [c03d1728] rt_spin_lock_slowlock+0x78/0x1e0
> LR [c03d170c] rt_spin_lock_slowlock+0x5c/0x1e0
> Call Trace:
> [c716fde0] [c03d170c] rt_spin_lock_slowlock+0x5c/0x1e0 (unreliable)
> [c716fe40] [c01efcd4] uart_write+0x5c/0x114
> [c716fe70] [c028f3f4] hci_uart_tx_wakeup+0xe0/0x1fc
> [c716fea0] [c01d3398] tty_wakeup+0x78/0xac
> [c716feb0] [c01ee9e0] uart_write_wakeup+0x24/0x34
> [c716fec0] [c01f1c38] mpc52xx_psc_handle_irq+0x3f8/0x4b0
> [c716ff20] [c01f13e4] mpc52xx_uart_int+0x38/0x60
> [c716ff30] [c005f660] irq_forced_thread_fn+0x38/0x9c
> [c716ff50] [c005f42c] irq_thread+0x13c/0x1c0
> [c716ff90] [c00391d4] kthread+0x8c/0x90
> [c716fff0] [c000dd4c] kernel_thread+0x4c/0x68
> Instruction dump:
> 7fe3fb78 7fa4eb78 38a00000 38c00001 4bc86591 2f830000 409e0134 801f0018
> 5400003c 7fa00278 7c000034 5400d97e <0f000000> 3bdd0418 3b810008 7fc3f378
> If I switch the preemption model to "Preemptible Kernel (Basic RT)"
> everything works fine.
By some definition of works. It works w/o RT_FULL because locks are
NOPs on uniprocessor, unless you enable lock debugging.
This is a classic recursive deadlock. If you enable
CONFIG_PROVE_LOCKING, then you should see the same issue even on a
completely unpatched mainline kernel.
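[Editorial illustration, not part of the original message.] A rough user-space sketch of the point above, under stated assumptions: a lock that keeps no state cannot notice that the same context takes it twice, while a lock that records its owner can. The names nop_lock, owner_lock, take_twice_nop() and take_twice_owner() are made up for this sketch and are not kernel APIs. Only a single context is modelled, so contention between different owners is ignored.

```c
#include <stdbool.h>
#include <stddef.h>

/* Models a UP spinlock without lock debugging: it keeps no state. */
struct nop_lock { char unused; };

static void nop_lock_acquire(struct nop_lock *l) { (void)l; }
static void nop_lock_release(struct nop_lock *l) { (void)l; }

/* Models a lock that remembers its owner (hypothetical, simplified). */
struct owner_lock {
    const void *owner;          /* NULL when the lock is free */
};

/* Returns false if 'self' already owns the lock (recursive acquisition). */
static bool owner_lock_acquire(struct owner_lock *l, const void *self)
{
    if (l->owner == self)
        return false;           /* an owner check like BUG_ON() would fire here */
    l->owner = self;            /* single-context model: no contention handling */
    return true;
}

static void owner_lock_release(struct owner_lock *l)
{
    l->owner = NULL;
}

/* Take the stateless lock twice; returns the number of detected recursions. */
static int take_twice_nop(void)
{
    struct nop_lock l = { 0 };

    nop_lock_acquire(&l);
    nop_lock_acquire(&l);       /* recursion, silently ignored */
    nop_lock_release(&l);
    nop_lock_release(&l);
    return 0;                   /* nothing can be detected */
}

/* Take the owner-tracking lock twice; returns the number of detected recursions. */
static int take_twice_owner(void)
{
    struct owner_lock l = { NULL };
    int self = 0;               /* its address serves as the context identity */
    int detected = 0;

    if (!owner_lock_acquire(&l, &self))
        detected++;
    if (!owner_lock_acquire(&l, &self))
        detected++;             /* the second acquisition is caught */
    owner_lock_release(&l);
    return detected;
}
```

In this model take_twice_nop() reports no recursion and take_twice_owner() reports one, which mirrors why the bug only becomes visible once the lock implementation tracks ownership.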
> Hope someone already had the same or similar problem and can help me solving it.
> Maybe a update to 3.4.27-rt39 helps?
No, won't help.
The problem is:

mpc52xx_uart_int()
  lock(port->lock);
    mpc52xx_psc_handle_irq()
      mpc52xx_uart_int_tx_chars()
        uart_write_wakeup()
          tty_wakeup()
            hci_uart_tx_wakeup()
              len = tty->ops->write(tty, skb->data, skb->len);

              The associated write function is uart_write

              uart_write()
                lock(port->lock)  --> deadlock
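
[Editorial illustration, not part of the original message.] The call chain above can be modelled in a few lines of user-space C. This is only a sketch: struct port, struct tty_model, port_lock() and the model_* functions are hypothetical stand-ins for the kernel structures and functions named in the trace, and the lock counts a recursive attempt instead of deadlocking.

```c
#include <stdbool.h>
#include <stddef.h>

struct port {
    bool locked;            /* models port->lock, single context only */
    int recursion_hits;     /* attempts to take the lock while already held */
};

struct tty_model {
    struct port *port;
    /* models tty->ops->write */
    int (*write)(struct tty_model *tty, const char *buf, int len);
};

/* Returns true if acquired, false if the lock is already held. */
static bool port_lock(struct port *p)
{
    if (p->locked) {
        p->recursion_hits++;    /* a real lock would deadlock or BUG here */
        return false;
    }
    p->locked = true;
    return true;
}

static void port_unlock(struct port *p)
{
    p->locked = false;
}

/* Models uart_write(): takes the port lock around the buffer update. */
static int model_uart_write(struct tty_model *tty, const char *buf, int len)
{
    (void)buf;
    if (!port_lock(tty->port))
        return -1;              /* recursion: give up instead of hanging */
    port_unlock(tty->port);
    return len;
}

/* Models hci_uart_tx_wakeup(): pushes pending data through the write op. */
static int model_tx_wakeup(struct tty_model *tty)
{
    static const char data[] = "hci";

    return tty->write(tty, data, (int)sizeof(data) - 1);
}

/* Models mpc52xx_uart_int(): the wakeup runs with the port lock held. */
static int model_uart_int(struct tty_model *tty)
{
    int ret;

    port_lock(tty->port);
    ret = model_tx_wakeup(tty); /* uart_write_wakeup() -> tty_wakeup() -> ... */
    port_unlock(tty->port);
    return ret;
}

/* Runs the chain once; returns the number of recursive lock attempts. */
static int run_irq_chain(void)
{
    struct port p = { false, 0 };
    struct tty_model tty = { &p, model_uart_write };

    model_uart_int(&tty);
    return p.recursion_hits;
}
```

Calling run_irq_chain() walks the same path as the backtrace and records one recursive attempt on the port lock, which is the point where the real kernel stops.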
I have no idea how that bluetooth "uart" gets connected to the
physical uart, but the backtrace is pretty obvious. What are you doing
to reproduce this?
Thanks,
tglx
* Re: Kernel Oops caused by high uart write loads
From: Belser Florian @ 2013-02-06 6:36 UTC (permalink / raw)
To: 'linux-rt-users@vger.kernel.org'; +Cc: Waibel Georg
Hi,
here is the solution for the kernel oops. As tglx mentioned, the kernel crashed because of recursive spin locking.
Here's a patch to fix this issue in the mpc52xx_uart driver. It just releases the lock before calling uart_write_wakeup and takes it again afterwards.
Our kernel version is 3.4.17-rt28.
What would we have to do to get this bug fix into mainline?
Regards,
Florian
Index: drivers/tty/serial/mpc52xx_uart.c
===================================================================
--- drivers/tty/serial/mpc52xx_uart.c
+++ drivers/tty/serial/mpc52xx_uart.c
@@ -1053,8 +1053,11 @@
 	}
 
 	/* Wake up */
-	if (uart_circ_chars_pending(xmit) < WAKEUP_CHARS)
+	if (uart_circ_chars_pending(xmit) < WAKEUP_CHARS) {
+		spin_unlock(&port->lock);
 		uart_write_wakeup(port);
+		spin_lock(&port->lock);
+	}
 
 	/* Maybe we're done after all */
 	if (uart_circ_empty(xmit)) {
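
[Editorial illustration, not part of the original message.] The effect of dropping the lock around the wakeup call can be sketched with a small user-space model. Everything here is hypothetical: struct port, port_lock(), model_write() and model_tx_irq() only imitate the shape of the driver code, and the drop_lock flag selects between the unpatched and the patched behaviour. One general caveat with this pattern is that state protected by the lock may change while it is released, so the model re-reads the pending count after taking the lock again.

```c
#include <stdbool.h>

struct port {
    bool locked;            /* models port->lock, single context only */
    int recursion_hits;     /* attempts to take the lock while already held */
    int pending;            /* models characters pending in the xmit buffer */
};

/* Returns true if acquired, false if the lock is already held. */
static bool port_lock(struct port *p)
{
    if (p->locked) {
        p->recursion_hits++;
        return false;
    }
    p->locked = true;
    return true;
}

static void port_unlock(struct port *p)
{
    p->locked = false;
}

/* Models the write path reached through the wakeup callback. */
static void model_write(struct port *p)
{
    if (!port_lock(p))
        return;             /* recursion: nothing gets queued */
    p->pending += 4;        /* queue more data under the lock */
    port_unlock(p);
}

/* Models the TX interrupt path; returns the pending count seen afterwards. */
static int model_tx_irq(struct port *p, bool drop_lock)
{
    int pending;

    port_lock(p);
    if (drop_lock) {
        port_unlock(p);     /* patched behaviour: release around the callback */
        model_write(p);
        port_lock(p);
    } else {
        model_write(p);     /* unpatched behaviour: callback runs under the lock */
    }
    pending = p->pending;   /* re-read state: the callback may have changed it */
    port_unlock(p);
    return pending;
}
```

With drop_lock set to false the model records one recursive lock attempt and no data is queued; with drop_lock set to true the write path takes the lock cleanly and the interrupt path sees the newly queued data.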
* Re: Kernel Oops caused by high uart write loads
From: Sven-Thorsten Dietrich @ 2013-02-06 6:51 UTC (permalink / raw)
To: Belser Florian; +Cc: 'linux-rt-users@vger.kernel.org', Waibel Georg
On Feb 5, 2013, at 10:36 PM, Belser Florian <Florian.Belser@sensor-technik.de> wrote:
> Hi,
> here is the solution for the kernel oops. As tglx mentioned, the kernel crashed because of recursive spin_locking.
> Here's a patch to fix this issue in the mpc52xx_uart driver. It just releases the lock before calling uart_write_wakeup.
> Our kernel version 3.4.17-rt28.
> What would we have to do to get this bug fix mainline?
IIUC this should also trigger a lockdep warning in mainline.
If you can reproduce the lockdep warning in mainline, then send that and the patch.
Sven