linux-rt-users.vger.kernel.org archive mirror
From: Dinakar Guniguntala <dino@in.ibm.com>
To: tglx@linutronix.de
Cc: Darren Hart <dvhltc@us.ibm.com>,
	linux-kernel@vger.kernel.org, linux-rt-users@vger.kernel.org
Subject: [patch -rt] Fix infinite loop with 2.6.31.4-rt14
Date: Fri, 23 Oct 2009 19:17:00 +0530
Message-ID: <20091023134700.GA5578@in.ibm.com>

Hi Thomas,

I see an application hang in 2.6.31.4-rt14 when running some Java tests.

The kernel seems to be looping continuously in
       futex_wait_requeue_pi -> futex_wait_setup ->
       returns -EAGAIN -> goto retry -> futex_wait_setup -> and so on

===============================================================================

    java-5544  [001] 79682.800631: __might_sleep <-rt_spin_lock_fastlock
    java-5544  [001] 79682.800631: get_futex_value_locked <-futex_wait_setup
    java-5544  [001] 79682.800632: pagefault_disable <-get_futex_value_locked
    java-5544  [001] 79682.800632: pagefault_enable <-get_futex_value_locked
    java-5544  [001] 79682.800632: queue_unlock <-futex_wait_setup
    java-5544  [001] 79682.800632: rt_spin_unlock <-queue_unlock
    java-5544  [001] 79682.800633: rt_spin_lock_fastunlock <-rt_spin_unlock
    java-5544  [001] 79682.800633: drop_futex_key_refs <-queue_unlock
    java-5544  [001] 79682.800633: put_futex_key <-futex_wait_setup
    java-5544  [001] 79682.800633: drop_futex_key_refs <-put_futex_key
    java-5544  [001] 79682.800633: put_futex_key <-do_futex
    java-5544  [001] 79682.800634: drop_futex_key_refs <-put_futex_key
    java-5544  [001] 79682.800634: get_futex_key <-do_futex
    java-5544  [001] 79682.800634: get_futex_key_refs <-get_futex_key
    java-5544  [001] 79682.800634: futex_wait_setup <-do_futex
    java-5544  [001] 79682.800635: get_futex_key <-futex_wait_setup
    java-5544  [001] 79682.800635: get_futex_key_refs <-get_futex_key
    java-5544  [001] 79682.800635: queue_lock <-futex_wait_setup
    java-5544  [001] 79682.800635: get_futex_key_refs <-queue_lock
    java-5544  [001] 79682.800635: hash_futex <-queue_lock
    java-5544  [001] 79682.800636: rt_spin_lock <-queue_lock
    java-5544  [001] 79682.800636: rt_spin_lock_fastlock <-rt_spin_lock
    java-5544  [001] 79682.800636: __might_sleep <-rt_spin_lock_fastlock
    java-5544  [001] 79682.800636: get_futex_value_locked <-futex_wait_setup
    java-5544  [001] 79682.800637: pagefault_disable <-get_futex_value_locked
    java-5544  [001] 79682.800637: pagefault_enable <-get_futex_value_locked
    java-5544  [001] 79682.800637: queue_unlock <-futex_wait_setup
    java-5544  [001] 79682.800637: rt_spin_unlock <-queue_unlock
    java-5544  [001] 79682.800637: rt_spin_lock_fastunlock <-rt_spin_unlock
    java-5544  [001] 79682.800638: drop_futex_key_refs <-queue_unlock
    java-5544  [001] 79682.800638: put_futex_key <-futex_wait_setup
    java-5544  [001] 79682.800638: drop_futex_key_refs <-put_futex_key
    java-5544  [001] 79682.800638: put_futex_key <-do_futex
    java-5544  [001] 79682.800639: drop_futex_key_refs <-put_futex_key
    java-5544  [001] 79682.800639: get_futex_key <-do_futex
    java-5544  [001] 79682.800639: get_futex_key_refs <-get_futex_key
    java-5544  [001] 79682.800639: futex_wait_setup <-do_futex
    java-5544  [001] 79682.800639: get_futex_key <-futex_wait_setup
    java-5544  [001] 79682.800640: get_futex_key_refs <-get_futex_key
    java-5544  [001] 79682.800640: queue_lock <-futex_wait_setup
    java-5544  [001] 79682.800640: get_futex_key_refs <-queue_lock
    java-5544  [001] 79682.800640: hash_futex <-queue_lock
    java-5544  [001] 79682.800640: rt_spin_lock <-queue_lock
    java-5544  [001] 79682.800641: rt_spin_lock_fastlock <-rt_spin_lock
    java-5544  [001] 79682.800641: __might_sleep <-rt_spin_lock_fastlock
    java-5544  [001] 79682.800641: get_futex_value_locked <-futex_wait_setup
    java-5544  [001] 79682.800641: pagefault_disable <-get_futex_value_locked
    java-5544  [001] 79682.800642: pagefault_enable <-get_futex_value_locked
    java-5544  [001] 79682.800642: queue_unlock <-futex_wait_setup
    java-5544  [001] 79682.800642: rt_spin_unlock <-queue_unlock
    java-5544  [001] 79682.800642: rt_spin_lock_fastunlock <-rt_spin_unlock


===============================================================================

This looks to be caused by the following earlier patch:
      -> http://patchwork.kernel.org/patch/53483/

Not sure if this is the best way to go here, but the patch below seems to
resolve the problem for me.

If this is fine, I'll send a separate patch for mainline. Currently mainline
seems to be missing the earlier patch referenced above as well.
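
One possible reading of the trace, stated as an assumption rather than
something confirmed in this mail: futex_wait_setup fails with -EWOULDBLOCK
when the futex value does not match, and on Linux EWOULDBLOCK has the same
numeric value as EAGAIN, so a retry check placed at the shared exit label
cannot tell that failure apart from a spurious wakeup. The user-space sketch
below models only that control flow. Every name in it (model_wait_setup,
model_old_flow, model_new_flow, struct model_result, the max_passes cap) is
hypothetical and is not kernel code.

```c
#include <errno.h>

/* Result of one modelled call: final error code and number of passes. */
struct model_result {
	int ret;
	int passes;
};

/*
 * Hypothetical stand-in for futex_wait_setup(): fails with -EWOULDBLOCK
 * when the user-space value differs from the expected one.
 */
static int model_wait_setup(int uval, int val)
{
	return (uval != val) ? -EWOULDBLOCK : 0;
}

/*
 * Old flow (assumed): the retry check sits on the shared exit path, so it
 * also sees the error returned by the setup step. max_passes only exists
 * so that the model terminates.
 */
static struct model_result model_old_flow(int uval, int val, int max_passes)
{
	struct model_result r = { 0, 0 };

retry:
	r.passes++;
	r.ret = model_wait_setup(uval, val);
	if (r.ret)
		goto out;
	/* ... the real code would sleep and classify the wakeup here ... */
out:
	if (r.ret == -EAGAIN && r.passes < max_passes)
		goto retry;
	return r;
}

/*
 * New flow (as in the patch): retry only right after the wakeup has been
 * classified as spurious. 'spurious' is the number of spurious wakeups to
 * simulate before a genuine one.
 */
static struct model_result model_new_flow(int uval, int val, int spurious,
					  int max_passes)
{
	struct model_result r = { 0, 0 };

retry:
	r.passes++;
	r.ret = model_wait_setup(uval, val);
	if (r.ret)
		goto out;
	if (spurious > 0) {
		spurious--;
		r.ret = -EAGAIN;	/* spurious wakeup */
	}
	if (r.ret == -EAGAIN && r.passes < max_passes)
		goto retry;
out:
	return r;
}
```

With a mismatching value, model_old_flow(1, 0, 1000) keeps retrying until it
hits the cap, while model_new_flow(1, 0, 0, 1000) returns -EWOULDBLOCK after
a single pass and still retries when a wakeup really is spurious.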

Signed-off-by: Dinakar Guniguntala <dino@in.ibm.com>

	-Dinakar

---
 kernel/futex.c |   84 +++++++++++++++++++++------------------------------------
 1 file changed, 32 insertions(+), 52 deletions(-)

Index: linux-2.6.31.4-rt14-lbf-f1/kernel/futex.c
===================================================================
--- linux-2.6.31.4-rt14-lbf-f1.orig/kernel/futex.c
+++ linux-2.6.31.4-rt14-lbf-f1/kernel/futex.c
@@ -2048,54 +2048,6 @@ pi_faulted:
 }
 
 /**
- * handle_early_requeue_pi_wakeup() - Detect early wakeup on the initial futex
- * @hb:		the hash_bucket futex_q was original enqueued on
- * @q:		the futex_q woken while waiting to be requeued
- * @key2:	the futex_key of the requeue target futex
- * @timeout:	the timeout associated with the wait (NULL if none)
- *
- * Detect if the task was woken on the initial futex as opposed to the requeue
- * target futex.  If so, determine if it was a timeout or a signal that caused
- * the wakeup and return the appropriate error code to the caller.  Must be
- * called with the hb lock held.
- *
- * Returns
- *  0 - no early wakeup detected
- * <0 - -ETIMEDOUT or -ERESTARTNOINTR
- */
-static inline
-int handle_early_requeue_pi_wakeup(struct futex_hash_bucket *hb,
-				   struct futex_q *q, union futex_key *key2,
-				   struct hrtimer_sleeper *timeout)
-{
-	int ret = 0;
-
-	/*
-	 * With the hb lock held, we avoid races while we process the wakeup.
-	 * We only need to hold hb (and not hb2) to ensure atomicity as the
-	 * wakeup code can't change q.key from uaddr to uaddr2 if we hold hb.
-	 * It can't be requeued from uaddr2 to something else since we don't
-	 * support a PI aware source futex for requeue.
-	 */
-	if (!match_futex(&q->key, key2)) {
-		WARN_ON(q->lock_ptr && (&hb->lock != q->lock_ptr));
-		/*
-		 * We were woken prior to requeue by a timeout or a signal.
-		 * Unqueue the futex_q and determine which it was.
-		 */
-		plist_del(&q->list, &q->list.plist);
-
-		/* Handle spurious wakeups gracefully */
-		ret = -EAGAIN;
-		if (timeout && !timeout->task)
-			ret = -ETIMEDOUT;
-		else if (signal_pending(current))
-			ret = -ERESTARTNOINTR;
-	}
-	return ret;
-}
-
-/**
  * futex_wait_requeue_pi() - Wait on uaddr and take uaddr2
  * @uaddr:	the futex we initialyl wait on (non-pi)
  * @fshared:	whether the futexes are shared (1) or not (0).  They must be
@@ -2186,8 +2138,39 @@ retry:
 	futex_wait_queue_me(hb, &q, to);
 
 	spin_lock(&hb->lock);
-	ret = handle_early_requeue_pi_wakeup(hb, &q, &key2, to);
+	/*
+	 * Detect if the task was woken on the initial futex as opposed to the requeue
+	 * target futex.  If so, determine if it was a timeout or a signal that caused
+	 * the wakeup and return the appropriate error code to the caller.  Must be
+	 * called with the hb lock held.
+	 * With the hb lock held, we avoid races while we process the wakeup.
+	 * We only need to hold hb (and not hb2) to ensure atomicity as the
+	 * wakeup code can't change q.key from uaddr to uaddr2 if we hold hb.
+	 * It can't be requeued from uaddr2 to something else since we don't
+	 * support a PI aware source futex for requeue.
+	 */
+	if (!match_futex(&q.key, &key2)) {
+		WARN_ON(q.lock_ptr && (&hb->lock != q.lock_ptr));
+		/*
+		 * We were woken prior to requeue by a timeout or a signal.
+		 * Unqueue the futex_q and determine which it was.
+		 */
+		plist_del(&q.list, &q.list.plist);
+
+		/* Handle spurious wakeups gracefully */
+		ret = -EAGAIN;
+		if (to && !to->task)
+			ret = -ETIMEDOUT;
+		else if (signal_pending(current))
+			ret = -ERESTARTNOINTR;
+	}
 	spin_unlock(&hb->lock);
+	if (ret == -EAGAIN) {
+		/* Retry on spurious wakeup */
+		put_futex_key(fshared, &q.key);
+		put_futex_key(fshared, &key2);
+		goto retry;
+	}
 	if (ret)
 		goto out_put_keys;
 
@@ -2264,9 +2247,6 @@ out_put_keys:
 out_key2:
 	put_futex_key(fshared, &key2);
 
-	/* Spurious wakeup ? */
-	if (ret == -EAGAIN)
-		goto retry;
 out:
 	if (to) {
 		hrtimer_cancel(&to->timer);
