All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: Ilya Dryomov <ilya.dryomov@inktank.com>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Ingo Molnar <mingo@kernel.org>,
	Ceph Development <ceph-devel@vger.kernel.org>,
	davidlohr@hp.com, jason.low2@hp.com
Subject: Re: [PATCH] locking/mutexes: Revert "locking/mutexes: Add extra reschedule point"
Date: Thu, 31 Jul 2014 15:13:31 +0200	[thread overview]
Message-ID: <20140731131331.GT19379@twins.programming.kicks-ass.net> (raw)
In-Reply-To: <CALFYKtBGQpxaZ0EFXbdvuEHeDeASeU-7XxQ_1a6ZDM=aJS651w@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 2107 bytes --]

On Thu, Jul 31, 2014 at 04:37:29PM +0400, Ilya Dryomov wrote:

> This didn't make sense to me at first too, and I'll be happy to be
> proven wrong, but we can reproduce this with rbd very reliably under
> higher than usual load, and the revert makes it go away.  What we are
> seeing in the rbd scenario is the following.

This is drivers/block/rbd.c ? I can find but a single mutex_lock() in
there.

> Suppose foo needs mutexes A and B, bar needs mutex B.  foo acquires
> A and then wants to acquire B, but B is held by bar.  foo spins
> a little and ends up calling schedule_preempt_disabled() on line 484
> above, but that call never returns, even though a hundred usecs later
> bar releases B.  foo ends up stuck in mutex_lock() indefinitely, but
> still holds A and everybody else who needs A gets behind A.  Given that
> this A happens to be a central libceph mutex all rbd activity halts.
> Deadlock may not be the best term for this, but never returning from
> mutex_lock(&B) even though B has been unlocked is *a* problem.
> 
> This obviously doesn't happen every time schedule_preempt_disabled() on
> line 484 is called, so there must be some sort of race here.  I'll send
> along the actual rbd stack traces shortly.

Smells like maybe current->state != TASK_RUNNING, does the below
trigger?

If so, you've wrecked something in whatever...

---
 kernel/locking/mutex.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index ae712b25e492..3d726fdaa764 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -473,8 +473,12 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
 	 * reschedule now, before we try-lock the mutex. This avoids getting
 	 * scheduled out right after we obtained the mutex.
 	 */
-	if (need_resched())
+	if (need_resched()) {
+		if (WARN_ON_ONCE(current->state != TASK_RUNNING))
+			__set_current_state(TASK_RUNNING);
+
 		schedule_preempt_disabled();
+	}
 #endif
 	spin_lock_mutex(&lock->wait_lock, flags);
 

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

  reply	other threads:[~2014-07-31 13:13 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-07-31 10:16 [PATCH] locking/mutexes: Revert "locking/mutexes: Add extra reschedule point" Ilya Dryomov
2014-07-31 11:57 ` Peter Zijlstra
2014-07-31 12:37   ` Ilya Dryomov
2014-07-31 13:13     ` Peter Zijlstra [this message]
2014-07-31 13:25       ` Ilya Dryomov
2014-07-31 13:44       ` Ingo Molnar
2014-07-31 13:56         ` Peter Zijlstra
2014-08-02 20:04           ` [RFC][PATCH] locking: Debug nested wait/locking primitives Peter Zijlstra
2014-07-31 14:30       ` [PATCH] locking/mutexes: Revert "locking/mutexes: Add extra reschedule point" Mike Galbraith
2014-07-31 14:37         ` Ilya Dryomov
2014-07-31 14:39         ` Peter Zijlstra
2014-08-01 12:56           ` Ilya Dryomov
2014-08-01 13:27             ` Peter Zijlstra
2014-08-01 13:50               ` Ilya Dryomov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140731131331.GT19379@twins.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=ceph-devel@vger.kernel.org \
    --cc=davidlohr@hp.com \
    --cc=ilya.dryomov@inktank.com \
    --cc=jason.low2@hp.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.