* Fix for sparc64 cpu hangs.
From: David Miller @ 2007-11-07  4:34 UTC
  To: torvalds; +Cc: linux-kernel, sparclinux, linux-arch, bernd, joy, fabbione, arnd

[ Bernd, Josip, and Fabio, I think I finally nailed this cpu hang bug
  we were all seeing on sparc64. ]

[FUTEX]: Fix address computation in compat code.

compat_exit_robust_list() computes a pointer to the
futex entry in userspace as follows:

	(void __user *)entry + futex_offset

'entry' is a 'struct robust_list __user *', and
'futex_offset' is a 'compat_long_t' (typically a 's32').

Things explode if the 32-bit sign bit is set in futex_offset.

Type promotion sign extends futex_offset to a 64-bit value before
adding it to 'entry'.

This triggered a problem on sparc64 running 32-bit applications which
would lock up a cpu looping forever in the fault handling for the
userspace load in handle_futex_death().

Compat userspace runs with address masking (wherein the cpu zeros out
the top 32-bits of every effective address given to a memory operation
instruction) so the sparc64 fault handler accounts for this by
zeroing out the top 32-bits of the fault address too.

Since the kernel properly uses the compat_uptr interfaces, kernel side
accesses to compat userspace work too since they will only use
addresses with the top 32-bits clear.

Because of this compat futex layer bug we get into the following loop
when executing the get_user() load near the top of handle_futex_death():

1) load from address '0xfffffffff7f16bd8', FAULT
2) fault handler clears upper 32-bits, processes fault
   for address '0xf7f16bd8' which succeeds
3) goto #1

I want to thank Bernd Zeimetz, Josip Rodin, and Fabio Massimo Di Nitto
for their tireless efforts helping me track down this bug.

Signed-off-by: David S. Miller <davem@davemloft.net>

diff --git a/kernel/futex_compat.c b/kernel/futex_compat.c
index 00b5726..8089e7e 100644
--- a/kernel/futex_compat.c
+++ b/kernel/futex_compat.c
@@ -76,11 +76,16 @@ void compat_exit_robust_list(struct task_struct *curr)
 		 * A pending lock might already be on the list, so
 		 * dont process it twice:
 		 */
-		if (entry != pending)
-			if (handle_futex_death((void __user *)entry + futex_offset,
-						curr, pi))
-				return;
+		if (entry != pending) {
+			void __user *uaddr;
+			compat_uptr_t base;
+
+			base = ptr_to_compat(entry);
+			uaddr = compat_ptr(base + futex_offset);
+			if (handle_futex_death(uaddr, curr, pi))
+				return;
+		}
 
 		if (rc)
 			return;
 		uentry = next_uentry;
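The sign extension described in the message above can be reproduced in user space. What follows is a minimal sketch, not kernel code: buggy_uaddr() and fixed_uaddr() are hypothetical names, pointers are modeled as plain uint64_t values, and the casts only imitate what ptr_to_compat() and compat_ptr() are described as doing in the patch. The input values used in the note below are assumptions picked to reproduce the faulting address from the report; the thread does not say what 'entry' and 'futex_offset' actually were.

```c
#include <stdint.h>

/*
 * Hypothetical user-space model of the address computation.
 * 64-bit "pointers" are plain uint64_t values here.
 */

/* Buggy form: the s32 offset is sign-extended to 64 bits before the add. */
static uint64_t buggy_uaddr(uint64_t entry, int32_t futex_offset)
{
	return entry + (uint64_t)(int64_t)futex_offset;
}

/*
 * Fixed form: do the addition in 32 bits so it wraps modulo 2^32,
 * then zero-extend the result back to a 64-bit address.
 */
static uint64_t fixed_uaddr(uint64_t entry, int32_t futex_offset)
{
	uint32_t base = (uint32_t)entry;		/* models ptr_to_compat() */
	uint32_t sum = base + (uint32_t)futex_offset;	/* 32-bit wraparound */

	return (uint64_t)sum;				/* models compat_ptr() */
}
```

With the assumed inputs entry = 0x00010000 and futex_offset = (int32_t)0xf7f06bd8, buggy_uaddr() gives 0xfffffffff7f16bd8 while fixed_uaddr() gives 0xf7f16bd8. For offsets without the sign bit set, both forms agree.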
* Re: Fix for sparc64 cpu hangs.
From: David Miller @ 2007-11-07  5:13 UTC
  To: torvalds; +Cc: linux-kernel, sparclinux, linux-arch, bernd, joy, fabbione, arnd

From: David Miller <davem@davemloft.net>
Date: Tue, 06 Nov 2007 20:34:33 -0800 (PST)

> [FUTEX]: Fix address computation in compat code.

Sorry, I just noticed there is a second handle_futex_death()
call in compat_exit_robust_list() which has the same
address computation bug.

Here is an updated patch:

[FUTEX]: Fix address computation in compat code.

compat_exit_robust_list() computes a pointer to the
futex entry in userspace as follows:

	(void __user *)entry + futex_offset

'entry' is a 'struct robust_list __user *', and
'futex_offset' is a 'compat_long_t' (typically a 's32').

Things explode if the 32-bit sign bit is set in futex_offset.

Type promotion sign extends futex_offset to a 64-bit value before
adding it to 'entry'.

This triggered a problem on sparc64 running 32-bit applications which
would lock up a cpu looping forever in the fault handling for the
userspace load in handle_futex_death().

Compat userspace runs with address masking (wherein the cpu zeros out
the top 32-bits of every effective address given to a memory operation
instruction) so the sparc64 fault handler accounts for this by
zeroing out the top 32-bits of the fault address too.

Since the kernel properly uses the compat_uptr interfaces, kernel side
accesses to compat userspace work too since they will only use
addresses with the top 32-bits clear.

Because of this compat futex layer bug we get into the following loop
when executing the get_user() load near the top of handle_futex_death():

1) load from address '0xfffffffff7f16bd8', FAULT
2) fault handler clears upper 32-bits, processes fault
   for address '0xf7f16bd8' which succeeds
3) goto #1

I want to thank Bernd Zeimetz, Josip Rodin, and Fabio Massimo Di Nitto
for their tireless efforts helping me track down this bug.

Signed-off-by: David S. Miller <davem@davemloft.net>

diff --git a/kernel/futex_compat.c b/kernel/futex_compat.c
index 00b5726..1931457 100644
--- a/kernel/futex_compat.c
+++ b/kernel/futex_compat.c
@@ -30,6 +30,15 @@ fetch_robust_entry(compat_uptr_t *uentry, struct robust_list __user **entry,
 	return 0;
 }
 
+static void __user *futex_uaddr(struct robust_list *entry,
+				compat_long_t futex_offset)
+{
+	compat_uptr_t base = ptr_to_compat(entry);
+	void __user *uaddr = compat_ptr(base + futex_offset);
+
+	return uaddr;
+}
+
 /*
  * Walk curr->robust_list (very carefully, it's a userspace list!)
  * and mark any locks found there dead, and notify any waiters.
@@ -76,11 +85,13 @@ void compat_exit_robust_list(struct task_struct *curr)
 		 * A pending lock might already be on the list, so
 		 * dont process it twice:
 		 */
-		if (entry != pending)
-			if (handle_futex_death((void __user *)entry + futex_offset,
-						curr, pi))
-				return;
+		if (entry != pending) {
+			void __user *uaddr = futex_uaddr(entry,
+							 futex_offset);
+			if (handle_futex_death(uaddr, curr, pi))
+				return;
+		}
 
 		if (rc)
 			return;
 		uentry = next_uentry;
@@ -94,9 +105,11 @@ void compat_exit_robust_list(struct task_struct *curr)
 
 		cond_resched();
 	}
-	if (pending)
-		handle_futex_death((void __user *)pending + futex_offset,
-				   curr, pip);
+	if (pending) {
+		void __user *uaddr = futex_uaddr(pending, futex_offset);
+
+		handle_futex_death(uaddr, curr, pip);
+	}
 }
 
 asmlinkage long
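The three-step fault loop from the patch description can be modeled in a few lines. This is a rough sketch under stated assumptions, not a description of the real sparc64 trap code: load_faults(), fault_handler_resolves(), and retry_load() are hypothetical names, the model simply treats any address with non-zero upper 32 bits as unmapped, and a retry budget stands in for the cpu spinning forever.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: for a compat task only addresses below 4 GB are mapped. */
static bool load_faults(uint64_t addr)
{
	return (addr >> 32) != 0;
}

/*
 * Models the handler behaviour described in the thread: clear the upper
 * 32 bits of the fault address, then process the fault for that address.
 * In this model the masked address is always valid, so it "succeeds".
 */
static bool fault_handler_resolves(uint64_t fault_addr)
{
	uint64_t masked = fault_addr & 0xffffffffULL;

	return !load_faults(masked);
}

/*
 * Retry the load until it stops faulting. Returns the number of faults
 * taken, or -1 once max_retries is exhausted (a real cpu has no such
 * budget and would simply keep looping).
 */
static int retry_load(uint64_t addr, int max_retries)
{
	int faults = 0;

	while (load_faults(addr)) {
		if (faults >= max_retries)
			return -1;
		/* The handler reports success, but addr itself is unchanged. */
		(void)fault_handler_resolves(addr);
		faults++;
	}
	return faults;
}
```

In this model, retry_load() with the sign-extended address 0xfffffffff7f16bd8 exhausts any retry budget, because the handler only ever inspects the masked address, while the same call with 0xf7f16bd8 completes without a single fault.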
* Re: Fix for sparc64 cpu hangs.
From: Andrew Morton @ 2007-11-09 20:22 UTC
  To: David Miller
  Cc: torvalds, linux-kernel, sparclinux, linux-arch, bernd, joy, fabbione, arnd, stable

On Tue, 06 Nov 2007 21:13:56 -0800 (PST)
David Miller <davem@davemloft.net> wrote:

> From: David Miller <davem@davemloft.net>
> Date: Tue, 06 Nov 2007 20:34:33 -0800 (PST)
>
> > [FUTEX]: Fix address computation in compat code.
>
> Sorry, I just noticed there is a second handle_futex_death()
> call in compat_exit_robust_list() which has the same
> address computation bug.
>
> Here is an updated patch:
>
> [FUTEX]: Fix address computation in compat code.
>
> compat_exit_robust_list() computes a pointer to the
> futex entry in userspace as follows:
>
> 	(void __user *)entry + futex_offset
>
> 'entry' is a 'struct robust_list __user *', and
> 'futex_offset' is a 'compat_long_t' (typically a 's32').
>
> Things explode if the 32-bit sign bit is set in futex_offset.
>
> Type promotion sign extends futex_offset to a 64-bit value before
> adding it to 'entry'.
>
> This triggered a problem on sparc64 running 32-bit applications which
> would lock up a cpu looping forever in the fault handling for the
> userspace load in handle_futex_death().
>
> Compat userspace runs with address masking (wherein the cpu zeros out
> the top 32-bits of every effective address given to a memory operation
> instruction) so the sparc64 fault handler accounts for this by
> zeroing out the top 32-bits of the fault address too.
>
> Since the kernel properly uses the compat_uptr interfaces, kernel side
> accesses to compat userspace work too since they will only use
> addresses with the top 32-bits clear.
>
> Because of this compat futex layer bug we get into the following loop
> when executing the get_user() load near the top of handle_futex_death():
>
> 1) load from address '0xfffffffff7f16bd8', FAULT
> 2) fault handler clears upper 32-bits, processes fault
>    for address '0xf7f16bd8' which succeeds
> 3) goto #1
>
> I want to thank Bernd Zeimetz, Josip Rodin, and Fabio Massimo Di Nitto
> for their tireless efforts helping me track down this bug.
>

I tagged this as needed-in-2.6.23.x.  Please let me know if that is not
appropriate.
* Re: Fix for sparc64 cpu hangs.
From: David Miller @ 2007-11-09 22:14 UTC
  To: akpm
  Cc: torvalds, linux-kernel, sparclinux, linux-arch, bernd, joy, fabbione, arnd, stable

From: Andrew Morton <akpm@linux-foundation.org>
Date: Fri, 9 Nov 2007 12:22:08 -0800

> I tagged this as needed-in-2.6.23.x.  Please let me know if that is not
> appropriate.

It is. I have it queued up for -stable already.

I'm just waiting for Linus to get back from wherever he has been the
past few days so he can suck it in and it's upstream before I submit
it to -stable.