* Re: T1000 CPU lockups
2006-11-01 20:49 T1000 CPU lockups Dennis Gilmore
@ 2006-11-02 0:12 ` David Miller
2006-11-02 17:05 ` Fabio Massimo Di Nitto
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: David Miller @ 2006-11-02 0:12 UTC
To: sparclinux
From: Dennis Gilmore <dennis@ausil.us>
Date: Wed, 1 Nov 2006 14:49:37 -0600
> I have had my T1000 lock up intermittently. I'm running a vanilla 2.6.18
> kernel. For quite a few of the CPUs, I get the following in the system logs:
>
> Oct 31 20:58:19 daedalus kernel: BUG: soft lockup detected on CPU#21!
> Oct 31 20:58:19 daedalus kernel: Call Trace:
> Oct 31 20:58:20 daedalus kernel: [00000000004319f4] smp_percpu_timer_interrupt+0xd4/0x144
> Oct 31 20:58:20 daedalus kernel: [00000000004109d4] tl0_irq14+0x1c/0x20
> Oct 31 20:58:20 daedalus kernel: [0000000000468f28] futex_lock_pi+0x130/0x7cc
> Oct 31 20:58:20 daedalus kernel: [000000000046a154] do_futex+0xb90/0xbbc
> Oct 31 20:58:20 daedalus kernel: [000000000046a6f4] compat_sys_futex+0x11c/0x130
> Oct 31 20:58:20 daedalus kernel: [0000000000406c94] linux_sparc_syscall32+0x3c/0x40
> Oct 31 20:58:20 daedalus kernel: [00000000701004b0] 0x701004b8
>
> After these, while the system stays alive, I can't actually do anything.
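For readers unfamiliar with the "soft lockup detected" line in the trace: it generally means a CPU kept running kernel code for a long stretch without letting the scheduler run anything else. The top frame is the timer interrupt handler, which suggests the check fired from the periodic tick, and the frames under it show what was interrupted, here futex_lock_pi. The following is only a simplified model of such a detector, not the kernel's implementation; touch_watchdog, soft_lockup_check, NR_CPUS_MODEL and the 10-second threshold are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS_MODEL    32  /* hypothetical CPU count for this sketch */
#define LOCKUP_THRESHOLD 10  /* seconds; assumed value for illustration */

/* Last time, in seconds, that each CPU's watchdog thread got to run. */
static uint64_t touch_timestamp[NR_CPUS_MODEL];

/* Called by a per-CPU watchdog thread each time the scheduler runs it. */
static void touch_watchdog(int cpu, uint64_t now)
{
    touch_timestamp[cpu] = now;
}

/*
 * Called from the periodic timer interrupt. If the watchdog thread has
 * not run for longer than the threshold, the CPU is presumed to be stuck
 * in kernel code and a report would be printed.
 */
static bool soft_lockup_check(int cpu, uint64_t now)
{
    return now - touch_timestamp[cpu] > LOCKUP_THRESHOLD;
}
```

In this model, a CPU spinning inside a system call never runs touch_watchdog, so soft_lockup_check eventually returns true for it even though interrupts are still being serviced. That is consistent with the report that the machine stays alive but becomes unusable.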
I think Fabbione was seeing this one too, CC:'d for more testing.
This patch below should fix it, thanks for the most excellent bug
report.
diff --git a/include/asm-sparc64/futex.h b/include/asm-sparc64/futex.h
index dee4020..7392fc4 100644
--- a/include/asm-sparc64/futex.h
+++ b/include/asm-sparc64/futex.h
@@ -87,24 +87,22 @@ static inline int
 futex_atomic_cmpxchg_inatomic(int __user *uaddr, int oldval, int newval)
 {
 	__asm__ __volatile__(
-	"\n1: lduwa [%2] %%asi, %0\n"
-	"2: casa [%2] %%asi, %0, %1\n"
-	"3:\n"
+	"\n1: casa [%3] %%asi, %2, %0\n"
+	"2:\n"
 	" .section .fixup,#alloc,#execinstr\n"
 	" .align 4\n"
-	"4: ba 3b\n"
-	" mov %3, %0\n"
+	"3: ba 2b\n"
+	" mov %4, %0\n"
 	" .previous\n"
 	" .section __ex_table,\"a\"\n"
 	" .align 4\n"
-	" .word 1b, 4b\n"
-	" .word 2b, 4b\n"
+	" .word 1b, 3b\n"
 	" .previous\n"
-	: "=&r" (oldval)
-	: "r" (newval), "r" (uaddr), "i" (-EFAULT)
+	: "=r" (newval)
+	: "0" (newval), "r" (oldval), "r" (uaddr), "i" (-EFAULT)
 	: "memory");
 
-	return oldval;
+	return newval;
 }
 
 #endif /* !(_SPARC64_FUTEX_H) */
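To see what the patch changes, it can help to model the operands in plain C. The sketch below is an illustration and not kernel code: casa_model, cmpxchg_before, cmpxchg_after and MODEL_EFAULT are hypothetical names, a NULL pointer stands in for a faulting user address, and the model is single-threaded, so it ignores the atomicity that casa provides. It assumes the SPARC V9 behaviour of casa, where the destination register always receives the previous memory contents and the store happens only if the comparison matches.

```c
#include <stddef.h>

#define MODEL_EFAULT 14  /* stand-in for the kernel's EFAULT in this sketch */

/*
 * Model of "casa [addr] asi, cmp, rd": compare the word at addr with cmp;
 * if they match, store rd there. Either way, rd ends up holding the value
 * that was in memory before the instruction.
 */
static void casa_model(int *addr, int cmp, int *rd)
{
    int old = *addr;
    if (old == cmp)
        *addr = *rd;
    *rd = old;
}

/* Pre-patch logic: the caller's oldval is overwritten before the compare. */
static int cmpxchg_before(int *uaddr, int oldval, int newval)
{
    if (uaddr == NULL)
        return -MODEL_EFAULT;
    oldval = *uaddr;                    /* lduwa: the expected value is lost */
    casa_model(uaddr, oldval, &newval); /* compares against the value just
                                           loaded, so it matches unless
                                           another CPU intervenes */
    return oldval;
}

/* Post-patch logic: compare against the caller's oldval, return old memory. */
static int cmpxchg_after(int *uaddr, int oldval, int newval)
{
    if (uaddr == NULL)
        return -MODEL_EFAULT;           /* fixup path: mov -EFAULT into %0 */
    casa_model(uaddr, oldval, &newval);
    return newval;                      /* previous contents of *uaddr */
}
```

With the word holding 5 and a caller that expects 7, cmpxchg_after leaves the word untouched and returns 5, so the caller can tell that the exchange did not happen. cmpxchg_before also returns 5 but has already stored the new value, which is not what a compare-and-exchange caller is entitled to assume.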
* Re: T1000 CPU lockups
From: Fabio Massimo Di Nitto @ 2006-11-02 17:05 UTC
To: sparclinux
Hi David,
the patch does indeed fix the problem. I was able to build glibc-2.5 all the way
through on Niagara. I am CC'ing Ben and Martin because we need this in .17 kernels
too. A similar bug has been found on PPC, and it has been escalated to a local DoS,
since any user can actually kill a system just by running the test suite.
Thanks
Fabio
--
I'm going to make him an offer he can't refuse.
* Re: T1000 CPU lockups
From: Dennis Gilmore @ 2006-11-02 19:26 UTC
To: sparclinux
On Wednesday 01 November 2006 18:12, David Miller wrote:
> I think Fabbione was seeing this one too, CC:'d for more testing.
>
> This patch below should fix it, thanks for the most excellent bug
> report.
Thanks for the patch. It's applied and built. The problem has usually taken a
few days to manifest, so I'll watch it and report back later.
--
Dennis Gilmore, RHCE
Proud Australian
* Re: T1000 CPU lockups
From: David Miller @ 2006-11-03 0:03 UTC
To: sparclinux
From: Fabio Massimo Di Nitto <fabbione@ubuntu.com>
Date: Thu, 02 Nov 2006 18:05:54 +0100
> the patch does indeed fix the problem. I was able to build glibc-2.5
> all the way through on Niagara. I am CC'ing Ben and Martin because we
> need this in .17 kernels too. A similar bug has been found on PPC
> and it has been escalated to local DoS since any user can actually
> kill a system just running the test suite.
Thanks for the testing and confirmation.
Indeed, 2.6.17 has the problem too. I have already submitted this
patch to the -stable folks for 2.6.17 and 2.6.18 -stable inclusion.