* Oops with 2.6.29-rc7 on POWER5
From: Josh Boyer @ 2009-03-10 0:05 UTC
To: linuxppc-dev
I get the following oops on a ppc64 machine running a Fedora rawhide kernel,
which is very close to 2.6.29-rc7.
The machine is a POWER5, pSeries CHRP IBM,9123-710.
I haven't looked into it yet, but I found it interesting and was
wondering whether anyone has seen anything like this or can reproduce it.
josh
BUG: sleeping function called from invalid context at kernel/mutex.c:207
in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper
------------[ cut here ]------------
Badness at kernel/mutex.c:135
NIP: c0000000005fe54c LR: c0000000005fe530 CTR: 0000000000000001
REGS: c00000000fffb5b0 TRAP: 0700 Not tainted (2.6.29-0.215.rc7.fc11.ppc64)
MSR: 8000000000021032 <ME,CE,IR,DR> CR: 28000082 XER: 0000000f
TASK = c000000000f69e90[0] 'swapper' THREAD: c000000001010000 CPU: 0
GPR00: 0000000000000000 c00000000fffb830 c0000000010106b8 0000000000000001
GPR04: c000000000f69e90 0000000000000070 0000000000000000 0000000000000002
GPR08: 0000000000000000 c00000000179a3b8 c00000000104cb58 c000000001086a10
GPR12: 000000000000003c c000000001058400 c00000006f09b4d0 c00000006f09b270
GPR16: c00000006f09b408 c00000000fffba60 0000000000000001 c00000006f09b3c8
GPR20: 0000000000000001 c00000006e570129 0000000000000000 c00000006f09b8c0
GPR24: 0000000000000000 c00000000039d520 c00000006f09b248 c000000000f69e90
GPR28: c00000006f09b8c0 c00000006f09b8c0 c000000000fa15c0 c00000000fffb830
NIP [c0000000005fe54c] .mutex_lock_nested+0xc0/0x4b0
LR [c0000000005fe530] .mutex_lock_nested+0xa4/0x4b0
Call Trace:
[c00000000fffb830] [c0000000005fe504] .mutex_lock_nested+0x78/0x4b0 (unreliable)
[c00000000fffb950] [c00000000039d520] .echo_char_raw+0x40/0x98
[c00000000fffb9f0] [c00000000039fd68] .n_tty_receive_buf+0xb48/0x1104
[c00000000fffbbb0] [c0000000003a3a08] .flush_to_ldisc+0x160/0x244
[c00000000fffbc80] [c0000000003a3b5c] .tty_flip_buffer_push+0x70/0x9c
[c00000000fffbd10] [c0000000003b9e94] .hvsi_interrupt+0x464/0x590
[c00000000fffbe50] [c000000000119168] .handle_IRQ_event+0x60/0xdc
[c00000000fffbef0] [c00000000011baf0] .handle_fasteoi_irq+0x108/0x1a8
[c00000000fffbf90] [c00000000002f1c4] .call_handle_irq+0x1c/0x2c
[c000000001013970] [c00000000000e0ac] .do_IRQ+0x144/0x258
[c000000001013a30] [c000000000004d28] hardware_interrupt_entry+0x28/0x2c
--- Exception: 501 at .raw_local_irq_restore+0xa4/0xc0
LR = .cpu_idle+0x13c/0x1e0
[c000000001013d20] [c000000000f9af28] mv88e6131_switch_driver+0x8d08/0x275f8 (unreliable)
[c000000001013dc0] [c000000000014d34] .cpu_idle+0x13c/0x1e0
[c000000001013e60] [c0000000006062b8] .rest_init+0x94/0xb0
[c000000001013ee0] [c00000000088bd08] .start_kernel+0x4a4/0x4c8
[c000000001013f90] [c000000000008408] .start_here_common+0x2c/0xa4
Instruction dump:
78290464 80090014 5409012f 41a20028 4bcb199d 60000000 2fa30000 419e0018
e93e8008 80090000 2f800000 409e0008 <0fe00000> 38000000 8b8d01da 980d01da
* Re: Oops with 2.6.29-rc7 on POWER5
From: Benjamin Herrenschmidt @ 2009-03-10 0:36 UTC
To: Josh Boyer; +Cc: linuxppc-dev, Alan Cox
On Mon, 2009-03-09 at 20:05 -0400, Josh Boyer wrote:
> [c00000000fffb830] [c0000000005fe504] .mutex_lock_nested+0x78/0x4b0
> (unreliable)
> [c00000000fffb950] [c00000000039d520] .echo_char_raw+0x40/0x98
> [c00000000fffb9f0] [c00000000039fd68] .n_tty_receive_buf+0xb48/0x1104
> [c00000000fffbbb0] [c0000000003a3a08] .flush_to_ldisc+0x160/0x244
> [c00000000fffbc80] [c0000000003a3b5c] .tty_flip_buffer_push+0x70/0x9c
> [c00000000fffbd10] [c0000000003b9e94] .hvsi_interrupt+0x464/0x590
> [c00000000fffbe50] [c000000000119168] .handle_IRQ_event+0x60/0xdc
> [c00000000fffbef0] [c00000000011baf0] .handle_fasteoi_irq+0x108/0x1a8
>
Does that patch help?
Alan, any comment about the races talked about in those comments ? Are
they still something I should worry about ?
hvc: Remove tty->low_latency on pseries backends
The hvcs and hvsi backends both set tty->low_latency to one, along
with more or less scary comments regarding bugs or races that would
happen if not doing so.
However, they also both call tty_flip_buffer_push() in contexts where
it is illegal to do so when low_latency is set (i.e. in hard interrupt
context, or with a spinlock held and irqs disabled). This has been the
case since some recent tty changes; it may always have been illegal,
but it now blows up.
This removes the setting for now to get them back to working condition.
We'll have to address the races described in the comments separately
if they are still an issue (some of this might have been fixed already).
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
Index: linux-work/drivers/char/hvcs.c
===================================================================
--- linux-work.orig/drivers/char/hvcs.c 2009-03-10 11:28:03.000000000 +1100
+++ linux-work/drivers/char/hvcs.c 2009-03-10 11:28:08.000000000 +1100
@@ -1139,15 +1139,6 @@ static int hvcs_open(struct tty_struct *
hvcsd->tty = tty;
tty->driver_data = hvcsd;
- /*
- * Set this driver to low latency so that we actually have a chance at
- * catching a throttled TTY after we flip_buffer_push. Otherwise the
- * flush_to_async may not execute until after the kernel_thread has
- * yielded and resumed the next flip_buffer_push resulting in data
- * loss.
- */
- tty->low_latency = 1;
-
memset(&hvcsd->buffer[0], 0x00, HVCS_BUFF_LEN);
/*
Index: linux-work/drivers/char/hvsi.c
===================================================================
--- linux-work.orig/drivers/char/hvsi.c 2009-03-10 11:27:19.000000000 +1100
+++ linux-work/drivers/char/hvsi.c 2009-03-10 11:27:22.000000000 +1100
@@ -810,7 +810,6 @@ static int hvsi_open(struct tty_struct *
hp = &hvsi_ports[line];
tty->driver_data = hp;
- tty->low_latency = 1; /* avoid throttle/tty_flip_buffer_push race */
mb();
if (hp->state == HVSI_FSP_DIED)
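
For readers following the trace, here is a minimal user-space sketch of why
clearing low_latency avoids the oops. It assumes, as the call trace above
suggests, that tty_flip_buffer_push() runs flush_to_ldisc() directly when
low_latency is set and otherwise defers it to process context. Every name
prefixed with model_ is hypothetical and exists only for this illustration;
none of it is the kernel's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the fields that matter here; not struct tty_struct. */
struct model_tty {
    bool low_latency;   /* mirrors tty->low_latency */
    int  pending_work;  /* pushes deferred to process context */
    int  atomic_sleeps; /* sleeping path entered while atomic */
};

/* Stands in for "in hard interrupt, irqs disabled" (in_atomic(): 1 above). */
static bool model_in_hard_irq;

/* Models the line discipline path that takes a mutex (echo_char_raw in the
 * trace). Taking a mutex may sleep, which is not allowed in atomic context. */
static void model_flush_to_ldisc(struct model_tty *tty)
{
    assert(tty != NULL);
    if (model_in_hard_irq)
        tty->atomic_sleeps++; /* the case might_sleep() complains about */
}

/* Assumed behaviour: direct call when low_latency, deferred otherwise. */
static void model_flip_buffer_push(struct model_tty *tty)
{
    if (tty->low_latency)
        model_flush_to_ldisc(tty);
    else
        tty->pending_work++;
}

/* Models the deferred work running later, outside interrupt context. */
static void model_run_deferred(struct model_tty *tty)
{
    model_in_hard_irq = false;
    while (tty->pending_work > 0) {
        tty->pending_work--;
        model_flush_to_ldisc(tty);
    }
}

/* Models an interrupt handler pushing received data, as hvsi_interrupt does
 * in the trace. Returns how often the sleeping path ran in atomic context. */
static int model_interrupt(bool low_latency)
{
    struct model_tty tty = { .low_latency = low_latency };

    model_in_hard_irq = true;
    model_flip_buffer_push(&tty);
    model_in_hard_irq = false;
    model_run_deferred(&tty);
    return tty.atomic_sleeps;
}
```

In this model, model_interrupt(true) reaches the sleeping path from interrupt
context, while model_interrupt(false) only queues work that runs later. The
model says nothing about the throttle race mentioned in the removed comments.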
* Re: Oops with 2.6.29-rc7 on POWER5
From: Josh Boyer @ 2009-03-10 11:43 UTC
To: Benjamin Herrenschmidt; +Cc: linuxppc-dev, Alan Cox
On Tue, Mar 10, 2009 at 11:36:15AM +1100, Benjamin Herrenschmidt wrote:
>On Mon, 2009-03-09 at 20:05 -0400, Josh Boyer wrote:
>> [c00000000fffb830] [c0000000005fe504] .mutex_lock_nested+0x78/0x4b0
>> (unreliable)
>> [c00000000fffb950] [c00000000039d520] .echo_char_raw+0x40/0x98
>> [c00000000fffb9f0] [c00000000039fd68] .n_tty_receive_buf+0xb48/0x1104
>> [c00000000fffbbb0] [c0000000003a3a08] .flush_to_ldisc+0x160/0x244
>> [c00000000fffbc80] [c0000000003a3b5c] .tty_flip_buffer_push+0x70/0x9c
>> [c00000000fffbd10] [c0000000003b9e94] .hvsi_interrupt+0x464/0x590
>> [c00000000fffbe50] [c000000000119168] .handle_IRQ_event+0x60/0xdc
>> [c00000000fffbef0] [c00000000011baf0] .handle_fasteoi_irq+0x108/0x1a8
>>
>Does that patch help?
Yes. No more oopses on the hvc console when I hit 'enter'.
>Alan, any comment about the races talked about in those comments ? Are
>they still something I should worry about ?
If not, we should look at getting this into 2.6.29. I know it's getting
very late, but having kernels oops whenever someone hits enter on the
console probably isn't very good either.
josh