* Fix for sparc64 cpu hangs.
From: David Miller @ 2007-11-07 4:34 UTC
To: torvalds; +Cc: linux-kernel, sparclinux, linux-arch, bernd, joy, fabbione, arnd

[ Bernd, Josip, and Fabio, I think I finally nailed this
  cpu hang bug we were all seeing on sparc64. ]

[FUTEX]: Fix address computation in compat code.

compat_exit_robust_list() computes a pointer to the
futex entry in userspace as follows:

	(void __user *)entry + futex_offset

'entry' is a 'struct robust_list __user *', and
'futex_offset' is a 'compat_long_t' (typically a 's32').

Things explode if the 32-bit sign bit is set in futex_offset.

Type promotion sign extends futex_offset to a 64-bit value before
adding it to 'entry'.

This triggered a problem on sparc64 running 32-bit applications which
would lock up a cpu looping forever in the fault handling for the
userspace load in handle_futex_death().

Compat userspace runs with address masking (wherein the cpu zeros out
the top 32-bits of every effective address given to a memory operation
instruction) so the sparc64 fault handler accounts for this by
zero'ing out the top 32-bits of the fault address too.

Since the kernel properly uses the compat_uptr interfaces, kernel side
accesses to compat userspace work too since they will only use
addresses with the top 32-bits clear.

Because of this compat futex layer bug we get into the following loop
when executing the get_user() load near the top of handle_futex_death():

1) load from address '0xfffffffff7f16bd8', FAULT
2) fault handler clears upper 32-bits, processes fault
   for address '0xf7f16bd8' which succeeds
3) goto #1

I want to thank Bernd Zeimetz, Josip Rodin, and Fabio Massimo Di Nitto
for their tireless efforts helping me track down this bug.

Signed-off-by: David S. Miller <davem@davemloft.net>

diff --git a/kernel/futex_compat.c b/kernel/futex_compat.c
index 00b5726..8089e7e 100644
--- a/kernel/futex_compat.c
+++ b/kernel/futex_compat.c
@@ -76,11 +76,16 @@ void compat_exit_robust_list(struct task_struct *curr)
 		 * A pending lock might already be on the list, so
 		 * dont process it twice:
 		 */
-		if (entry != pending)
-			if (handle_futex_death((void __user *)entry + futex_offset,
-						curr, pi))
-				return;
+		if (entry != pending) {
+			void __user *uaddr;
+			compat_uptr_t base;
+
+			base = ptr_to_compat(entry);
+			uaddr = compat_ptr(base + futex_offset);
+			if (handle_futex_death(uaddr, curr, pi))
+				return;
+		}
 
 		if (rc)
 			return;
 		uentry = next_uentry;
* Re: Fix for sparc64 cpu hangs.
From: David Miller @ 2007-11-07 5:13 UTC
To: torvalds; +Cc: linux-kernel, sparclinux, linux-arch, bernd, joy, fabbione, arnd

From: David Miller <davem@davemloft.net>
Date: Tue, 06 Nov 2007 20:34:33 -0800 (PST)

> [FUTEX]: Fix address computation in compat code.

Sorry, I just noticed there is a second handle_futex_death()
call in compat_exit_robust_list() which has the same
address computation bug.

Here is an updated patch:

[FUTEX]: Fix address computation in compat code.

compat_exit_robust_list() computes a pointer to the
futex entry in userspace as follows:

	(void __user *)entry + futex_offset

'entry' is a 'struct robust_list __user *', and
'futex_offset' is a 'compat_long_t' (typically a 's32').

Things explode if the 32-bit sign bit is set in futex_offset.

Type promotion sign extends futex_offset to a 64-bit value before
adding it to 'entry'.

This triggered a problem on sparc64 running 32-bit applications which
would lock up a cpu looping forever in the fault handling for the
userspace load in handle_futex_death().

Compat userspace runs with address masking (wherein the cpu zeros out
the top 32-bits of every effective address given to a memory operation
instruction) so the sparc64 fault handler accounts for this by
zero'ing out the top 32-bits of the fault address too.

Since the kernel properly uses the compat_uptr interfaces, kernel side
accesses to compat userspace work too since they will only use
addresses with the top 32-bits clear.

Because of this compat futex layer bug we get into the following loop
when executing the get_user() load near the top of handle_futex_death():

1) load from address '0xfffffffff7f16bd8', FAULT
2) fault handler clears upper 32-bits, processes fault
   for address '0xf7f16bd8' which succeeds
3) goto #1

I want to thank Bernd Zeimetz, Josip Rodin, and Fabio Massimo Di Nitto
for their tireless efforts helping me track down this bug.

Signed-off-by: David S. Miller <davem@davemloft.net>

diff --git a/kernel/futex_compat.c b/kernel/futex_compat.c
index 00b5726..1931457 100644
--- a/kernel/futex_compat.c
+++ b/kernel/futex_compat.c
@@ -30,6 +30,15 @@ fetch_robust_entry(compat_uptr_t *uentry, struct robust_list __user **entry,
 	return 0;
 }
 
+static void __user *futex_uaddr(struct robust_list *entry,
+				compat_long_t futex_offset)
+{
+	compat_uptr_t base = ptr_to_compat(entry);
+	void __user *uaddr = compat_ptr(base + futex_offset);
+
+	return uaddr;
+}
+
 /*
  * Walk curr->robust_list (very carefully, it's a userspace list!)
  * and mark any locks found there dead, and notify any waiters.
@@ -76,11 +85,13 @@ void compat_exit_robust_list(struct task_struct *curr)
 		 * A pending lock might already be on the list, so
 		 * dont process it twice:
 		 */
-		if (entry != pending)
-			if (handle_futex_death((void __user *)entry + futex_offset,
-						curr, pi))
-				return;
+		if (entry != pending) {
+			void __user *uaddr = futex_uaddr(entry,
+							 futex_offset);
+			if (handle_futex_death(uaddr, curr, pi))
+				return;
+		}
 
 		if (rc)
 			return;
 		uentry = next_uentry;
@@ -94,9 +105,11 @@ void compat_exit_robust_list(struct task_struct *curr)
 
 		cond_resched();
 	}
-	if (pending)
-		handle_futex_death((void __user *)pending + futex_offset,
-				   curr, pip);
+	if (pending) {
+		void __user *uaddr = futex_uaddr(pending, futex_offset);
+
+		handle_futex_death(uaddr, curr, pip);
+	}
 }
 
 asmlinkage long
* Re: Fix for sparc64 cpu hangs.
From: Andrew Morton @ 2007-11-09 20:22 UTC
To: David Miller
Cc: torvalds, linux-kernel, sparclinux, linux-arch, bernd, joy,
fabbione, arnd, stable
On Tue, 06 Nov 2007 21:13:56 -0800 (PST)
David Miller <davem@davemloft.net> wrote:
> From: David Miller <davem@davemloft.net>
> Date: Tue, 06 Nov 2007 20:34:33 -0800 (PST)
>
> > [FUTEX]: Fix address computation in compat code.
>
> Sorry, I just noticed there is a second handle_futex_death()
> call in compat_exit_robust_list() which has the same
> address computation bug.
>
> Here is an updated patch:
>
> [FUTEX]: Fix address computation in compat code.
>
> compat_exit_robust_list() computes a pointer to the
> futex entry in userspace as follows:
>
> (void __user *)entry + futex_offset
>
> 'entry' is a 'struct robust_list __user *', and
> 'futex_offset' is a 'compat_long_t' (typically a 's32').
>
> Things explode if the 32-bit sign bit is set in futex_offset.
>
> Type promotion sign extends futex_offset to a 64-bit value before
> adding it to 'entry'.
>
> This triggered a problem on sparc64 running 32-bit applications which
> would lock up a cpu looping forever in the fault handling for the
> userspace load in handle_futex_death().
>
> Compat userspace runs with address masking (wherein the cpu zeros out
> the top 32-bits of every effective address given to a memory operation
> instruction) so the sparc64 fault handler accounts for this by
> zero'ing out the top 32-bits of the fault address too.
>
> Since the kernel properly uses the compat_uptr interfaces, kernel side
> accesses to compat userspace work too since they will only use
> addresses with the top 32-bit clear.
>
> Because of this compat futex layer bug we get into the following loop
> when executing the get_user() load near the top of handle_futex_death():
>
> 1) load from address '0xfffffffff7f16bd8', FAULT
> 2) fault handler clears upper 32-bits, processes fault
> for address '0xf7f16bd8' which succeeds
> 3) goto #1
>
> I want to thank Bernd Zeimetz, Josip Rodin, and Fabio Massimo Di Nitto
> for their tireless efforts helping me track down this bug.
>
I tagged this as needed-in-2.6.23.x. Please let me know if that is not
appropriate.
* Re: Fix for sparc64 cpu hangs.
From: David Miller @ 2007-11-09 22:14 UTC
To: akpm
Cc: torvalds, linux-kernel, sparclinux, linux-arch, bernd, joy,
fabbione, arnd, stable
From: Andrew Morton <akpm@linux-foundation.org>
Date: Fri, 9 Nov 2007 12:22:08 -0800
> I tagged this as needed-in-2.6.23.x. Please let me know if that is not
> appropriate.
It is. I have it queued up for -stable already.

I'm just waiting for Linus to get back from wherever he has been
the past few days so he can suck it in and it's upstream before I
submit it to -stable.