linux-rt-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [patch -rt] Fix infinite loop with 2.6.31.4-rt14
@ 2009-10-23 13:47 Dinakar Guniguntala
  2009-10-23 16:21 ` Darren Hart
  0 siblings, 1 reply; 6+ messages in thread
From: Dinakar Guniguntala @ 2009-10-23 13:47 UTC (permalink / raw)
  To: tglx; +Cc: Darren Hart, linux-kernel, linux-rt-users

Hi Thomas,

I see an application hang in 2.6.31.4-rt14 when running some java tests.

The kernel seems to be continuously looping in 
       futex_wait_requeue_pi -> futex_wait_setup ->
       ret -EAGAIN -> goto retry -> futex_wait_setup -> on and on

===============================================================================

    java-5544  [001] 79682.800631: __might_sleep <-rt_spin_lock_fastlock
    java-5544  [001] 79682.800631: get_futex_value_locked <-futex_wait_setup
    java-5544  [001] 79682.800632: pagefault_disable <-get_futex_value_locked
    java-5544  [001] 79682.800632: pagefault_enable <-get_futex_value_locked
    java-5544  [001] 79682.800632: queue_unlock <-futex_wait_setup
    java-5544  [001] 79682.800632: rt_spin_unlock <-queue_unlock
    java-5544  [001] 79682.800633: rt_spin_lock_fastunlock <-rt_spin_unlock
    java-5544  [001] 79682.800633: drop_futex_key_refs <-queue_unlock
    java-5544  [001] 79682.800633: put_futex_key <-futex_wait_setup
    java-5544  [001] 79682.800633: drop_futex_key_refs <-put_futex_key
    java-5544  [001] 79682.800633: put_futex_key <-do_futex
    java-5544  [001] 79682.800634: drop_futex_key_refs <-put_futex_key
    java-5544  [001] 79682.800634: get_futex_key <-do_futex
    java-5544  [001] 79682.800634: get_futex_key_refs <-get_futex_key
    java-5544  [001] 79682.800634: futex_wait_setup <-do_futex
    java-5544  [001] 79682.800635: get_futex_key <-futex_wait_setup
    java-5544  [001] 79682.800635: get_futex_key_refs <-get_futex_key
    java-5544  [001] 79682.800635: queue_lock <-futex_wait_setup
    java-5544  [001] 79682.800635: get_futex_key_refs <-queue_lock
    java-5544  [001] 79682.800635: hash_futex <-queue_lock
    java-5544  [001] 79682.800636: rt_spin_lock <-queue_lock
    java-5544  [001] 79682.800636: rt_spin_lock_fastlock <-rt_spin_lock
    java-5544  [001] 79682.800636: __might_sleep <-rt_spin_lock_fastlock
    java-5544  [001] 79682.800636: get_futex_value_locked <-futex_wait_setup
    java-5544  [001] 79682.800637: pagefault_disable <-get_futex_value_locked
    java-5544  [001] 79682.800637: pagefault_enable <-get_futex_value_locked
    java-5544  [001] 79682.800637: queue_unlock <-futex_wait_setup
    java-5544  [001] 79682.800637: rt_spin_unlock <-queue_unlock
    java-5544  [001] 79682.800637: rt_spin_lock_fastunlock <-rt_spin_unlock
    java-5544  [001] 79682.800638: drop_futex_key_refs <-queue_unlock
    java-5544  [001] 79682.800638: put_futex_key <-futex_wait_setup
    java-5544  [001] 79682.800638: drop_futex_key_refs <-put_futex_key
    java-5544  [001] 79682.800638: put_futex_key <-do_futex
    java-5544  [001] 79682.800639: drop_futex_key_refs <-put_futex_key
    java-5544  [001] 79682.800639: get_futex_key <-do_futex
    java-5544  [001] 79682.800639: get_futex_key_refs <-get_futex_key
    java-5544  [001] 79682.800639: futex_wait_setup <-do_futex
    java-5544  [001] 79682.800639: get_futex_key <-futex_wait_setup
    java-5544  [001] 79682.800640: get_futex_key_refs <-get_futex_key
    java-5544  [001] 79682.800640: queue_lock <-futex_wait_setup
    java-5544  [001] 79682.800640: get_futex_key_refs <-queue_lock
    java-5544  [001] 79682.800640: hash_futex <-queue_lock
    java-5544  [001] 79682.800640: rt_spin_lock <-queue_lock
    java-5544  [001] 79682.800641: rt_spin_lock_fastlock <-rt_spin_lock
    java-5544  [001] 79682.800641: __might_sleep <-rt_spin_lock_fastlock
    java-5544  [001] 79682.800641: get_futex_value_locked <-futex_wait_setup
    java-5544  [001] 79682.800641: pagefault_disable <-get_futex_value_locked
    java-5544  [001] 79682.800642: pagefault_enable <-get_futex_value_locked
    java-5544  [001] 79682.800642: queue_unlock <-futex_wait_setup
    java-5544  [001] 79682.800642: rt_spin_unlock <-queue_unlock
    java-5544  [001] 79682.800642: rt_spin_lock_fastunlock <-rt_spin_unlock


===============================================================================

This looks to be caused by the patch below
      -> http://patchwork.kernel.org/patch/53483/

Not sure if this the best way to go here, but the patch below seems to resolve
the problem for me

If this is fine, I'll send a separate patch for mainline. Currently mainline
seems to be missing the earlier patch referenced above as well

Signed-off-by: Dinakar Guniguntala <dino@in.ibm.com>

	-Dinakar

---
 kernel/futex.c |   84 +++++++++++++++++++++------------------------------------
 1 file changed, 32 insertions(+), 52 deletions(-)

Index: linux-2.6.31.4-rt14-lbf-f1/kernel/futex.c
===================================================================
--- linux-2.6.31.4-rt14-lbf-f1.orig/kernel/futex.c
+++ linux-2.6.31.4-rt14-lbf-f1/kernel/futex.c
@@ -2048,54 +2048,6 @@ pi_faulted:
 }
 
 /**
- * handle_early_requeue_pi_wakeup() - Detect early wakeup on the initial futex
- * @hb:		the hash_bucket futex_q was original enqueued on
- * @q:		the futex_q woken while waiting to be requeued
- * @key2:	the futex_key of the requeue target futex
- * @timeout:	the timeout associated with the wait (NULL if none)
- *
- * Detect if the task was woken on the initial futex as opposed to the requeue
- * target futex.  If so, determine if it was a timeout or a signal that caused
- * the wakeup and return the appropriate error code to the caller.  Must be
- * called with the hb lock held.
- *
- * Returns
- *  0 - no early wakeup detected
- * <0 - -ETIMEDOUT or -ERESTARTNOINTR
- */
-static inline
-int handle_early_requeue_pi_wakeup(struct futex_hash_bucket *hb,
-				   struct futex_q *q, union futex_key *key2,
-				   struct hrtimer_sleeper *timeout)
-{
-	int ret = 0;
-
-	/*
-	 * With the hb lock held, we avoid races while we process the wakeup.
-	 * We only need to hold hb (and not hb2) to ensure atomicity as the
-	 * wakeup code can't change q.key from uaddr to uaddr2 if we hold hb.
-	 * It can't be requeued from uaddr2 to something else since we don't
-	 * support a PI aware source futex for requeue.
-	 */
-	if (!match_futex(&q->key, key2)) {
-		WARN_ON(q->lock_ptr && (&hb->lock != q->lock_ptr));
-		/*
-		 * We were woken prior to requeue by a timeout or a signal.
-		 * Unqueue the futex_q and determine which it was.
-		 */
-		plist_del(&q->list, &q->list.plist);
-
-		/* Handle spurious wakeups gracefully */
-		ret = -EAGAIN;
-		if (timeout && !timeout->task)
-			ret = -ETIMEDOUT;
-		else if (signal_pending(current))
-			ret = -ERESTARTNOINTR;
-	}
-	return ret;
-}
-
-/**
  * futex_wait_requeue_pi() - Wait on uaddr and take uaddr2
  * @uaddr:	the futex we initialyl wait on (non-pi)
  * @fshared:	whether the futexes are shared (1) or not (0).  They must be
@@ -2186,8 +2138,39 @@ retry:
 	futex_wait_queue_me(hb, &q, to);
 
 	spin_lock(&hb->lock);
-	ret = handle_early_requeue_pi_wakeup(hb, &q, &key2, to);
+	/*
+	 * Detect if the task was woken on the initial futex as opposed to the requeue
+	 * target futex.  If so, determine if it was a timeout or a signal that caused
+	 * the wakeup and return the appropriate error code to the caller.  Must be
+	 * called with the hb lock held.
+	 * With the hb lock held, we avoid races while we process the wakeup.
+	 * We only need to hold hb (and not hb2) to ensure atomicity as the
+	 * wakeup code can't change q.key from uaddr to uaddr2 if we hold hb.
+	 * It can't be requeued from uaddr2 to something else since we don't
+	 * support a PI aware source futex for requeue.
+	 */
+	if (!match_futex(&q.key, &key2)) {
+		WARN_ON(q.lock_ptr && (&hb->lock != q.lock_ptr));
+		/*
+		 * We were woken prior to requeue by a timeout or a signal.
+		 * Unqueue the futex_q and determine which it was.
+		 */
+		plist_del(&q.list, &q.list.plist);
+
+		/* Handle spurious wakeups gracefully */
+		ret = -EAGAIN;
+		if (to && !to->task)
+			ret = -ETIMEDOUT;
+		else if (signal_pending(current))
+			ret = -ERESTARTNOINTR;
+	}
 	spin_unlock(&hb->lock);
+	if (ret == -EAGAIN) {
+		/* Retry on spurious wakeup */
+		put_futex_key(fshared, &q.key);
+		put_futex_key(fshared, &key2);
+		goto retry;
+	}
 	if (ret)
 		goto out_put_keys;
 
@@ -2264,9 +2247,6 @@ out_put_keys:
 out_key2:
 	put_futex_key(fshared, &key2);
 
-	/* Spurious wakeup ? */
-	if (ret == -EAGAIN)
-		goto retry;
 out:
 	if (to) {
 		hrtimer_cancel(&to->timer);

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [patch -rt] Fix infinite loop with 2.6.31.4-rt14
  2009-10-23 13:47 [patch -rt] Fix infinite loop with 2.6.31.4-rt14 Dinakar Guniguntala
@ 2009-10-23 16:21 ` Darren Hart
  2009-10-23 20:08   ` [patch -rt] Fix infinite loop with 2.6.31.4-rt14 V2 Dinakar Guniguntala
  0 siblings, 1 reply; 6+ messages in thread
From: Darren Hart @ 2009-10-23 16:21 UTC (permalink / raw)
  To: dino; +Cc: tglx, linux-kernel, linux-rt-users

Hey Dino,

Dinakar Guniguntala wrote:
> 
> Not sure if this the best way to go here, but the patch below seems to resolve
> the problem for me
> 
> If this is fine, I'll send a separate patch for mainline. Currently mainline
> seems to be missing the earlier patch referenced above as well

So... something else sets ret to -EAGAIN which should be returned to 
userspace, rather than used to retry.


>  /**
> - * handle_early_requeue_pi_wakeup() - Detect early wakeup on the initial futex
> - * @hb:		the hash_bucket futex_q was original enqueued on
> - * @q:		the futex_q woken while waiting to be requeued
> - * @key2:	the futex_key of the requeue target futex
> - * @timeout:	the timeout associated with the wait (NULL if none)
> - *
> - * Detect if the task was woken on the initial futex as opposed to the requeue
> - * target futex.  If so, determine if it was a timeout or a signal that caused
> - * the wakeup and return the appropriate error code to the caller.  Must be
> - * called with the hb lock held.
> - *
> - * Returns
> - *  0 - no early wakeup detected
> - * <0 - -ETIMEDOUT or -ERESTARTNOINTR
> - */
> -static inline
> -int handle_early_requeue_pi_wakeup(struct futex_hash_bucket *hb,
> -				   struct futex_q *q, union futex_key *key2,
> -				   struct hrtimer_sleeper *timeout)
> -{

Cramming this entire function into futex_wait_requeue_pi() doesn't 
appear relevant to the problem. Please don't make 
futex_wait_requeue_pi() any longer than it already is. My fault that, 
but let's not make it worse!

>  	spin_unlock(&hb->lock);
> +	if (ret == -EAGAIN) {
> +		/* Retry on spurious wakeup */
> +		put_futex_key(fshared, &q.key);
> +		put_futex_key(fshared, &key2);
> +		goto retry;
> +	}

The above is the core of the change yes?

>  	if (ret)
>  		goto out_put_keys;
> 
> @@ -2264,9 +2247,6 @@ out_put_keys:
>  out_key2:
>  	put_futex_key(fshared, &key2);
> 
> -	/* Spurious wakeup ? */
> -	if (ret == -EAGAIN)
> -		goto retry;

I did a little digging trying to see what else might be returning EAGAIN 
after the handle_early..() call.  I only see fixup_pi_state_owner() and 
rt_mutex_finish_proxy_lock().  Neither of those has an obvious return of 
EAGAIN.  I'll keep looking.

If this turns out to be the proper fix, please reduce it down to simply 
check for EAGAIN after handle_early..() and return from there, with a 
comment, and avoid moving the logic into this overly large function.

Thanks,

-- 
Darren Hart
IBM Linux Technology Center
Real-Time Linux Team

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [patch -rt] Fix infinite loop with 2.6.31.4-rt14 V2
  2009-10-23 16:21 ` Darren Hart
@ 2009-10-23 20:08   ` Dinakar Guniguntala
  2009-10-23 20:41     ` Darren Hart
  0 siblings, 1 reply; 6+ messages in thread
From: Dinakar Guniguntala @ 2009-10-23 20:08 UTC (permalink / raw)
  To: Darren Hart; +Cc: tglx, linux-kernel, linux-rt-users


Application threads calling futex_wait_requeue_pi run in an infinite loop
in the kernel if the futex value changes during the call. The following
patch fixes the problem.

Signed-off-by: Dinakar Guniguntala <dino@in.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Darren Hart <dvhltc@us.ibm.com>

---
 kernel/futex.c |    9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

Index: linux-2.6.31.4-rt14-lbf-f1/kernel/futex.c
===================================================================
--- linux-2.6.31.4-rt14-lbf-f1.orig/kernel/futex.c
+++ linux-2.6.31.4-rt14-lbf-f1/kernel/futex.c
@@ -2188,6 +2188,12 @@ retry:
 	spin_lock(&hb->lock);
 	ret = handle_early_requeue_pi_wakeup(hb, &q, &key2, to);
 	spin_unlock(&hb->lock);
+	if (ret == -EAGAIN) {
+		/* Retry on spurious wakeup */
+		put_futex_key(fshared, &q.key);
+		put_futex_key(fshared, &key2);
+		goto retry;
+	}
 	if (ret)
 		goto out_put_keys;
 
@@ -2264,9 +2270,6 @@ out_put_keys:
 out_key2:
 	put_futex_key(fshared, &key2);
 
-	/* Spurious wakeup ? */
-	if (ret == -EAGAIN)
-		goto retry;
 out:
 	if (to) {
 		hrtimer_cancel(&to->timer);

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [patch -rt] Fix infinite loop with 2.6.31.4-rt14 V2
  2009-10-23 20:08   ` [patch -rt] Fix infinite loop with 2.6.31.4-rt14 V2 Dinakar Guniguntala
@ 2009-10-23 20:41     ` Darren Hart
  2009-10-23 23:29       ` Darren Hart
  0 siblings, 1 reply; 6+ messages in thread
From: Darren Hart @ 2009-10-23 20:41 UTC (permalink / raw)
  To: dino
  Cc: tglx, linux-kernel, linux-rt-users, Peter Zijlstra, Ingo Molnar,
	Eric Dumazet, John Stultz, stable

Dinakar Guniguntala wrote:
 > Application threads calling futex_wait_requeue_pi run in an infinite loop
 > in the kernel if the futex value changes during the call. The following
 > patch fixes the problem.

The key bit here being that EAGAIN == EWOULDBLOCK - who thought that was 
a good idea?

 >
 > Signed-off-by: Dinakar Guniguntala <dino@in.ibm.com>
 > Cc: Thomas Gleixner <tglx@linutronix.de>
 > Cc: Darren Hart <dvhltc@us.ibm.com>

Signed-off-by: Darren Hart <dvhltc@us.ibm.com>

Adding the usual CC list for futexes as well as stable:

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@elte.hu>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: John Stultz <johnstul@us.ibm.com>
CC: stable@kernel.org

 >
 > ---
 >  kernel/futex.c |    9 ++++++---
 >  1 file changed, 6 insertions(+), 3 deletions(-)
 >
 > Index: linux-2.6.31.4-rt14-lbf-f1/kernel/futex.c
 > ===================================================================
 > --- linux-2.6.31.4-rt14-lbf-f1.orig/kernel/futex.c
 > +++ linux-2.6.31.4-rt14-lbf-f1/kernel/futex.c
 > @@ -2188,6 +2188,12 @@ retry:
 >  	spin_lock(&hb->lock);
 >  	ret = handle_early_requeue_pi_wakeup(hb, &q, &key2, to);
 >  	spin_unlock(&hb->lock);
 > +	if (ret == -EAGAIN) {
 > +		/* Retry on spurious wakeup */
 > +		put_futex_key(fshared, &q.key);
 > +		put_futex_key(fshared, &key2);
 > +		goto retry;
 > +	}
 >  	if (ret)
 >  		goto out_put_keys;
 >
 > @@ -2264,9 +2270,6 @@ out_put_keys:
 >  out_key2:
 >  	put_futex_key(fshared, &key2);
 >
 > -	/* Spurious wakeup ? */
 > -	if (ret == -EAGAIN)
 > -		goto retry;
 >  out:
 >  	if (to) {
 >  		hrtimer_cancel(&to->timer);


-- 
Darren Hart
IBM Linux Technology Center
Real-Time Linux Team

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [patch -rt] Fix infinite loop with 2.6.31.4-rt14 V2
  2009-10-23 20:41     ` Darren Hart
@ 2009-10-23 23:29       ` Darren Hart
  2009-10-26 19:01         ` Darren Hart
  0 siblings, 1 reply; 6+ messages in thread
From: Darren Hart @ 2009-10-23 23:29 UTC (permalink / raw)
  To: dino
  Cc: tglx, linux-kernel, linux-rt-users, Peter Zijlstra, Ingo Molnar,
	Eric Dumazet, John Stultz

Darren Hart wrote:
> Dinakar Guniguntala wrote:
>  > Application threads calling futex_wait_requeue_pi run in an infinite 
> loop
>  > in the kernel if the futex value changes during the call. The following
>  > patch fixes the problem.
> 
> The key bit here being that EAGAIN == EWOULDBLOCK - who thought that was 
> a good idea?

And now that I think about it, when I reviewed this original patch I
remember mentioning that this isn't even needed for
futex_wait_requeue_pi() because we don't have the same wake-up race as
futex_wait() suffers from - since we don't use the same lock_ptr == NULL
test (nor do we use the wake_list in the requeue code). I suspect the
only case where -EAGAIN is being used here is when the uval doesn't
match val - no spurious wakeups.

Dino, can you try with the following patch which just reverts the
spurious wakeup handling for the requeue_pi path.


>From c21e762bf384e0a559fdf964e0ba7576550d90ec Mon Sep 17 00:00:00 2001
From: Darren Hart <dvhltc@us.ibm.com>
Date: Fri, 23 Oct 2009 16:18:48 -0700
Subject: [PATCH] futex: revert spurious wakeup fix for requeue_pi

The requeue_pi path doesn't use unqueue_me() (and the racy lock_ptr ==
NULL test) nor does it use the wake_list of futex_wake() which led to
the following fix.

41890f2... futex: Handle spurious wake up

See debugging discussing on LKML Message-ID: <4AD4080C.20703@us.ibm.com>

The changes in this fix to the requeue_pi path were considered to be a
likely unecessary, but harmless safety net. Since they are in fact
causing a problem, just remove them and insert a warning in their place.
We can remove the warning later, or even in this commit if folks would
rather.

Signed-off-by: Darren Hart <dvhltc@us.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@elte.hu>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Dinakar Guniguntala <dino@in.ibm.com>
CC: John Stultz <johnstul@us.ibm.com>

Witholding CC to stable for further discussion.
---
 kernel/futex.c |   15 +++++++++------
 1 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/kernel/futex.c b/kernel/futex.c
index 7c4a6ac..7e4e8b2 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -2085,12 +2085,19 @@ int handle_early_requeue_pi_wakeup(struct futex_hash_bucket *hb,
 		 */
 		plist_del(&q->list, &q->list.plist);
 
-		/* Handle spurious wakeups gracefully */
-		ret = -EAGAIN;
 		if (timeout && !timeout->task)
 			ret = -ETIMEDOUT;
 		else if (signal_pending(current))
 			ret = -ERESTARTNOINTR;
+		else {
+			/*
+			 * We don't use the racy unqueue_me() path with the
+			 * q.lock_ptr NULL test, nor does requeue use a
+			 * wake_list. All wakeups here should be accounted for.
+			 */
+			printk(KERN_ERR "Spurious wakeup in %s\n",
+			       __FUNCTION__);
+		}
 	}
 	return ret;
 }
@@ -2171,7 +2178,6 @@ static int futex_wait_requeue_pi(u32 __user *uaddr, int fshared,
 	q.bitset = bitset;
 	q.rt_waiter = &rt_waiter;
 
-retry:
 	key2 = FUTEX_KEY_INIT;
 	ret = get_futex_key(uaddr2, fshared, &key2, VERIFY_WRITE);
 	if (unlikely(ret != 0))
@@ -2264,9 +2270,6 @@ out_put_keys:
 out_key2:
 	put_futex_key(fshared, &key2);
 
-	/* Spurious wakeup ? */
-	if (ret == -EAGAIN)
-		goto retry;
 out:
 	if (to) {
 		hrtimer_cancel(&to->timer);
-- 
1.6.0.4


-- 
Darren Hart
IBM Linux Technology Center
Real-Time Linux Team

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [patch -rt] Fix infinite loop with 2.6.31.4-rt14 V2
  2009-10-23 23:29       ` Darren Hart
@ 2009-10-26 19:01         ` Darren Hart
  0 siblings, 0 replies; 6+ messages in thread
From: Darren Hart @ 2009-10-26 19:01 UTC (permalink / raw)
  To: dino
  Cc: tglx, linux-kernel, linux-rt-users, Peter Zijlstra, Ingo Molnar,
	Eric Dumazet, John Stultz

Darren Hart wrote:
> Darren Hart wrote:
>> Dinakar Guniguntala wrote:
>>  > Application threads calling futex_wait_requeue_pi run in an 
>> infinite loop
>>  > in the kernel if the futex value changes during the call. The 
>> following
>>  > patch fixes the problem.
>>
>> The key bit here being that EAGAIN == EWOULDBLOCK - who thought that 
>> was a good idea?
> 
> And now that I think about it, when I reviewed this original patch I
> remember mentioning that this isn't even needed for
> futex_wait_requeue_pi() because we don't have the same wake-up race as
> futex_wait() suffers from - since we don't use the same lock_ptr == NULL
> test (nor do we use the wake_list in the requeue code). I suspect the
> only case where -EAGAIN is being used here is when the uval doesn't
> match val - no spurious wakeups.
> 
> Dino, can you try with the following patch which just reverts the
> spurious wakeup handling for the requeue_pi path.


Dino mentioned in IRC that this is basically what he tried originally 
and that it worked fine. Thomas, any objections to this patch?

--
Darren

>  From c21e762bf384e0a559fdf964e0ba7576550d90ec Mon Sep 17 00:00:00 2001
> From: Darren Hart <dvhltc@us.ibm.com>
> Date: Fri, 23 Oct 2009 16:18:48 -0700
> Subject: [PATCH] futex: revert spurious wakeup fix for requeue_pi
> 
> The requeue_pi path doesn't use unqueue_me() (and the racy lock_ptr ==
> NULL test) nor does it use the wake_list of futex_wake() which led to
> the following fix.
> 
> 41890f2... futex: Handle spurious wake up
> 
> See debugging discussing on LKML Message-ID: <4AD4080C.20703@us.ibm.com>
> 
> The changes in this fix to the requeue_pi path were considered to be a
> likely unecessary, but harmless safety net. Since they are in fact
> causing a problem, just remove them and insert a warning in their place.
> We can remove the warning later, or even in this commit if folks would
> rather.
> 
> Signed-off-by: Darren Hart <dvhltc@us.ibm.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@elte.hu>
> CC: Eric Dumazet <eric.dumazet@gmail.com>
> CC: Dinakar Guniguntala <dino@in.ibm.com>
> CC: John Stultz <johnstul@us.ibm.com>
> 
> Witholding CC to stable for further discussion.
> ---
> kernel/futex.c |   15 +++++++++------
> 1 files changed, 9 insertions(+), 6 deletions(-)
> 
> diff --git a/kernel/futex.c b/kernel/futex.c
> index 7c4a6ac..7e4e8b2 100644
> --- a/kernel/futex.c
> +++ b/kernel/futex.c
> @@ -2085,12 +2085,19 @@ int handle_early_requeue_pi_wakeup(struct 
> futex_hash_bucket *hb,
>          */
>         plist_del(&q->list, &q->list.plist);
> 
> -        /* Handle spurious wakeups gracefully */
> -        ret = -EAGAIN;
>         if (timeout && !timeout->task)
>             ret = -ETIMEDOUT;
>         else if (signal_pending(current))
>             ret = -ERESTARTNOINTR;
> +        else {
> +            /*
> +             * We don't use the racy unqueue_me() path with the
> +             * q.lock_ptr NULL test, nor does requeue use a
> +             * wake_list. All wakeups here should be accounted for.
> +             */
> +            printk(KERN_ERR "Spurious wakeup in %s\n",
> +                   __FUNCTION__);
> +        }
>     }
>     return ret;
> }
> @@ -2171,7 +2178,6 @@ static int futex_wait_requeue_pi(u32 __user 
> *uaddr, int fshared,
>     q.bitset = bitset;
>     q.rt_waiter = &rt_waiter;
> 
> -retry:
>     key2 = FUTEX_KEY_INIT;
>     ret = get_futex_key(uaddr2, fshared, &key2, VERIFY_WRITE);
>     if (unlikely(ret != 0))
> @@ -2264,9 +2270,6 @@ out_put_keys:
> out_key2:
>     put_futex_key(fshared, &key2);
> 
> -    /* Spurious wakeup ? */
> -    if (ret == -EAGAIN)
> -        goto retry;
> out:
>     if (to) {
>         hrtimer_cancel(&to->timer);


-- 
Darren Hart
IBM Linux Technology Center
Real-Time Linux Team

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2009-10-26 19:01 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-10-23 13:47 [patch -rt] Fix infinite loop with 2.6.31.4-rt14 Dinakar Guniguntala
2009-10-23 16:21 ` Darren Hart
2009-10-23 20:08   ` [patch -rt] Fix infinite loop with 2.6.31.4-rt14 V2 Dinakar Guniguntala
2009-10-23 20:41     ` Darren Hart
2009-10-23 23:29       ` Darren Hart
2009-10-26 19:01         ` Darren Hart

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).