Fix for sparc64 cpu hangs.

public inbox for linux-kernel@vger.kernel.org

* Fix for sparc64 cpu hangs.
From: David Miller @ 2007-11-07  4:34 UTC
  To: torvalds; +Cc: linux-kernel, sparclinux, linux-arch, bernd, joy, fabbione, arnd

[ Bernd, Josip, and Fabio, I think I finally nailed this
  cpu hang bug we were all seeing on sparc64.  ]

[FUTEX]: Fix address computation in compat code.

compat_exit_robust_list() computes a pointer to the
futex entry in userspace as follows:

	(void __user *)entry + futex_offset

'entry' is a 'struct robust_list __user *', and
'futex_offset' is a 'compat_long_t' (typically a 's32').

Things explode if the 32-bit sign bit is set in futex_offset.

Type promotion sign extends futex_offset to a 64-bit value before
adding it to 'entry'.
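
[ Illustration only, not part of the patch: a minimal userspace sketch
  of the two computations. The function names and sample values are
  hypothetical; fixed_uaddr() only models what ptr_to_compat() and
  compat_ptr() do on a 64-bit kernel. ]

```c
#include <stdint.h>

/*
 * Userspace model of the two address computations. The function names
 * and the sample values used with them are illustrative, not kernel code.
 */

/* Old behaviour: the s32 offset is sign-extended to 64 bits before the add. */
static uint64_t buggy_uaddr(uint64_t entry, int32_t futex_offset)
{
	return entry + (uint64_t)(int64_t)futex_offset;
}

/* Patched behaviour: add in 32 bits (wraps mod 2^32), then zero-extend. */
static uint64_t fixed_uaddr(uint64_t entry, int32_t futex_offset)
{
	uint32_t base = (uint32_t)entry;	/* models ptr_to_compat() */
	uint32_t sum = base + (uint32_t)futex_offset;

	return (uint64_t)sum;			/* models compat_ptr() */
}
```

With entry = 0x18 and futex_offset = -0x080e9440 (values picked only to
reproduce the address quoted in this message), buggy_uaddr() yields
0xfffffffff7f16bd8 while fixed_uaddr() yields 0xf7f16bd8. When the
64-bit add happens to carry out of the low 32 bits, both agree.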

This triggered a problem on sparc64 running 32-bit applications which
would lock up a cpu looping forever in the fault handling for the
userspace load in handle_futex_death().

Compat userspace runs with address masking (wherein the cpu zeros out
the top 32-bits of every effective address given to a memory operation
instruction) so the sparc64 fault handler accounts for this by
zero'ing out the top 32-bits of the fault address too.

Because the kernel properly uses the compat_uptr interfaces, kernel-side
accesses to compat userspace normally work too: they only ever use
addresses with the top 32 bits clear.

Because of this compat futex layer bug we get into the following loop
when executing the get_user() load near the top of handle_futex_death():

1) load from address '0xfffffffff7f16bd8', FAULT
2) fault handler clears upper 32-bits, processes fault
   for address '0xf7f16bd8' which succeeds
3) goto #1
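
[ Illustration only: a toy model of the loop above, assuming a single
  mapping and ignoring page granularity. struct toy_mm and the toy_*
  helpers are made up for this sketch and are not sparc64 code; the
  retry cap exists only so the model terminates. ]

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy address space: remembers the one address the fault handler set up. */
struct toy_mm {
	uint64_t mapped;
	bool valid;
};

/* A kernel-side load uses the full 64-bit address, with no masking. */
static bool toy_load_ok(const struct toy_mm *mm, uint64_t addr)
{
	return mm->valid && mm->mapped == addr;
}

/* The fault handler clears the upper 32 bits, then services that address. */
static void toy_fault(struct toy_mm *mm, uint64_t fault_addr)
{
	mm->mapped = fault_addr & 0xffffffffULL;
	mm->valid = true;
}

/* Retry the load until it succeeds or max_tries faults were taken. */
static unsigned int toy_get_user(struct toy_mm *mm, uint64_t addr,
				 unsigned int max_tries)
{
	unsigned int faults = 0;

	while (!toy_load_ok(mm, addr)) {
		if (faults == max_tries)
			break;		/* the real cpu never gives up */
		toy_fault(mm, addr);
		faults++;
	}
	return faults;
}
```

In this model a load from 0xf7f16bd8 succeeds after one fault, whereas
a load from 0xfffffffff7f16bd8 faults until the cap is reached, since
the handler only ever resolves the masked address.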

I want to thank Bernd Zeimetz, Josip Rodin, and Fabio Massimo Di Nitto
for their tireless efforts helping me track down this bug.

Signed-off-by: David S. Miller <davem@davemloft.net>

diff --git a/kernel/futex_compat.c b/kernel/futex_compat.c
index 00b5726..8089e7e 100644
--- a/kernel/futex_compat.c
+++ b/kernel/futex_compat.c
@@ -76,11 +76,16 @@ void compat_exit_robust_list(struct task_struct *curr)
 		 * A pending lock might already be on the list, so
 		 * dont process it twice:
 		 */
-		if (entry != pending)
-			if (handle_futex_death((void __user *)entry + futex_offset,
-						curr, pi))
-				return;
+		if (entry != pending) {
+			void __user *uaddr;
+			compat_uptr_t base;
+
+			base = ptr_to_compat(entry);
+			uaddr = compat_ptr(base + futex_offset);

+			if (handle_futex_death(uaddr, curr, pi))
+				return;
+		}
 		if (rc)
 			return;
 		uentry = next_uentry;


* Re: Fix for sparc64 cpu hangs.
From: David Miller @ 2007-11-07  5:13 UTC
  To: torvalds; +Cc: linux-kernel, sparclinux, linux-arch, bernd, joy, fabbione, arnd

From: David Miller <davem@davemloft.net>
Date: Tue, 06 Nov 2007 20:34:33 -0800 (PST)

> [FUTEX]: Fix address computation in compat code.

Sorry, I just noticed there is a second handle_futex_death()
call in compat_exit_robust_list() which has the same
address computation bug.

Here is an updated patch:

[FUTEX]: Fix address computation in compat code.

compat_exit_robust_list() computes a pointer to the
futex entry in userspace as follows:

	(void __user *)entry + futex_offset

'entry' is a 'struct robust_list __user *', and
'futex_offset' is a 'compat_long_t' (typically a 's32').

Things explode if the 32-bit sign bit is set in futex_offset.

Type promotion sign extends futex_offset to a 64-bit value before
adding it to 'entry'.

This triggered a problem on sparc64 running 32-bit applications which
would lock up a cpu looping forever in the fault handling for the
userspace load in handle_futex_death().

Compat userspace runs with address masking (wherein the cpu zeros out
the top 32-bits of every effective address given to a memory operation
instruction) so the sparc64 fault handler accounts for this by
zero'ing out the top 32-bits of the fault address too.

Because the kernel properly uses the compat_uptr interfaces, kernel-side
accesses to compat userspace normally work too: they only ever use
addresses with the top 32 bits clear.

Because of this compat futex layer bug we get into the following loop
when executing the get_user() load near the top of handle_futex_death():

1) load from address '0xfffffffff7f16bd8', FAULT
2) fault handler clears upper 32-bits, processes fault
   for address '0xf7f16bd8' which succeeds
3) goto #1

I want to thank Bernd Zeimetz, Josip Rodin, and Fabio Massimo Di Nitto
for their tireless efforts helping me track down this bug.

Signed-off-by: David S. Miller <davem@davemloft.net>

diff --git a/kernel/futex_compat.c b/kernel/futex_compat.c
index 00b5726..1931457 100644
--- a/kernel/futex_compat.c
+++ b/kernel/futex_compat.c
@@ -30,6 +30,15 @@ fetch_robust_entry(compat_uptr_t *uentry, struct robust_list __user **entry,
 	return 0;
 }

+static void __user *futex_uaddr(struct robust_list __user *entry,
+				compat_long_t futex_offset)
+{
+	compat_uptr_t base = ptr_to_compat(entry);
+	void __user *uaddr = compat_ptr(base + futex_offset);
+
+	return uaddr;
+}
+
 /*
  * Walk curr->robust_list (very carefully, it's a userspace list!)
  * and mark any locks found there dead, and notify any waiters.
@@ -76,11 +85,13 @@ void compat_exit_robust_list(struct task_struct *curr)
 		 * A pending lock might already be on the list, so
 		 * dont process it twice:
 		 */
-		if (entry != pending)
-			if (handle_futex_death((void __user *)entry + futex_offset,
-						curr, pi))
-				return;
+		if (entry != pending) {
+			void __user *uaddr = futex_uaddr(entry,
+							 futex_offset);

+			if (handle_futex_death(uaddr, curr, pi))
+				return;
+		}
 		if (rc)
 			return;
 		uentry = next_uentry;
@@ -94,9 +105,11 @@ void compat_exit_robust_list(struct task_struct *curr)

 		cond_resched();
 	}
-	if (pending)
-		handle_futex_death((void __user *)pending + futex_offset,
-				   curr, pip);
+	if (pending) {
+		void __user *uaddr = futex_uaddr(pending, futex_offset);
+
+		handle_futex_death(uaddr, curr, pip);
+	}
 }

 asmlinkage long


* Re: Fix for sparc64 cpu hangs.
From: Andrew Morton @ 2007-11-09 20:22 UTC
  To: David Miller
  Cc: torvalds, linux-kernel, sparclinux, linux-arch, bernd, joy,
	fabbione, arnd, stable

On Tue, 06 Nov 2007 21:13:56 -0800 (PST)
David Miller <davem@davemloft.net> wrote:

> From: David Miller <davem@davemloft.net>
> Date: Tue, 06 Nov 2007 20:34:33 -0800 (PST)
> 
> > [FUTEX]: Fix address computation in compat code.
> 
> Sorry, I just noticed there is a second handle_futex_death()
> call in compat_exit_robust_list() which has the same
> address computation bug.
> 
> Here is an updated patch:
> 

I tagged this as needed-in-2.6.23.x.  Please let me know if that is not
appropriate.



* Re: Fix for sparc64 cpu hangs.
From: David Miller @ 2007-11-09 22:14 UTC
  To: akpm
  Cc: torvalds, linux-kernel, sparclinux, linux-arch, bernd, joy,
	fabbione, arnd, stable

From: Andrew Morton <akpm@linux-foundation.org>
Date: Fri, 9 Nov 2007 12:22:08 -0800

> I tagged this as needed-in-2.6.23.x.  Please let me know if that is not
> appropriate.

It is.  I have it queued up for -stable already.

I'm just waiting for Linus to get back from wherever he has been
the past few days so he can suck it in and it's upstream before I
submit it to -stable.

