From: Peter Zijlstra <peterz@infradead.org>
To: Jason Low <jason.low2@hp.com>
Cc: Ingo Molnar <mingo@redhat.com>,
	Paul McKenney <paulmck@linux.vnet.ibm.com>,
	Waiman Long <Waiman.Long@hp.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Rik van Riel <riel@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Davidlohr Bueso <davidlohr@hp.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Andi Kleen <andi@firstfloor.org>,
	"Chandramouleeswaran, Aswin" <aswin@hp.com>,
	"Norton, Scott J" <scott.norton@hp.com>,
	chegu_vinod@hp.com
Subject: Re: [RFC][PATCH v2 5/5] mutex: Give spinners a chance to spin_on_owner if need_resched() triggered while queued
Date: Mon, 3 Feb 2014 20:25:25 +0100
Message-ID: <20140203192525.GN8874@twins.programming.kicks-ass.net>
In-Reply-To: <1391452760.7498.26.camel@j-VirtualBox>

On Mon, Feb 03, 2014 at 10:39:20AM -0800, Jason Low wrote:
> > To avoid the xchg on every loop.
> 
> Ah yes, we want to use xchg() on &node->next.
> 
> Since the cmpxchg() is now in a loop in the unlock function, an
> additional (*lock == node) check before the cmpxchg() would also be nice
> to avoid spinning on cmpxchg() there too.

Right, I have the below; you can find the patches this depends upon
here:

  http://programming.kicks-ass.net/sekrit/patches.tar.bz2
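
As an aside, the check-before-cmpxchg() idea discussed above can be shown
in isolation. This is only a userspace sketch using C11 atomics, not kernel
code; struct node and try_move_tail() are made-up names for illustration.
The plain load filters out attempts that cannot succeed, so a spinning CPU
does not keep hammering the cacheline with a cmpxchg() that is going to fail.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical queue node; only its address matters in this sketch. */
struct node {
	int id;
};

/*
 * Hypothetical helper: move *lock from @old to @new, but only issue the
 * atomic compare-and-swap after a cheap plain load says it can succeed.
 * Returns true if the swap happened.
 */
static bool try_move_tail(_Atomic(struct node *) *lock,
			  struct node *old, struct node *new)
{
	struct node *expected = old;

	/* Cheap read first; a mismatch here means the cmpxchg would fail. */
	if (atomic_load_explicit(lock, memory_order_relaxed) != old)
		return false;

	/* The value can still change in between, so the CAS may fail too. */
	return atomic_compare_exchange_strong(lock, &expected, new);
}
```

A caller would invoke try_move_tail() from its wait loop and relax the CPU
between attempts, much like m_spin_wait_next() does in the patch.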

---
Subject: locking, mutex: Cancelable MCS lock for adaptive spinning
From: Peter Zijlstra <peterz@infradead.org>
Date: Wed, 29 Jan 2014 12:51:42 +0100

Since we want a task waiting for a mutex_lock() to go to sleep and
reschedule on need_resched(), we must be able to abort the
mcs_spin_lock() around the adaptive spin.

Therefore implement a cancelable mcs lock.

XXX: anybody got a better name than m_spinlock?

Cc: paulmck@linux.vnet.ibm.com
Cc: Waiman.Long@hp.com
Cc: torvalds@linux-foundation.org
Cc: tglx@linutronix.de
Cc: riel@redhat.com
Cc: akpm@linux-foundation.org
Cc: davidlohr@hp.com
Cc: hpa@zytor.com
Cc: andi@firstfloor.org
Cc: aswin@hp.com
Cc: scott.norton@hp.com
Cc: chegu_vinod@hp.com
Cc: mingo@redhat.com
Cc: Jason Low <jason.low2@hp.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-7jr68p4f447w2e0ck7y1yl06@git.kernel.org
---
 include/linux/mutex.h         |    4 -
 kernel/locking/Makefile       |    2 
 kernel/locking/mcs_spinlock.c |  156 ++++++++++++++++++++++++++++++++++++++++++
 kernel/locking/mcs_spinlock.h |   18 ++++
 kernel/locking/mutex.c        |   10 +-
 5 files changed, 183 insertions(+), 7 deletions(-)

--- a/include/linux/mutex.h
+++ b/include/linux/mutex.h
@@ -46,7 +46,7 @@
  * - detects multi-task circular deadlocks and prints out all affected
  *   locks and tasks (and only those tasks)
  */
-struct mcs_spinlock;
+struct m_spinlock;
 struct mutex {
 	/* 1: unlocked, 0: locked, negative: locked, possible waiters */
 	atomic_t		count;
@@ -56,7 +56,7 @@ struct mutex {
 	struct task_struct	*owner;
 #endif
 #ifdef CONFIG_MUTEX_SPIN_ON_OWNER
-	struct mcs_spinlock	*mcs_lock;	/* Spinner MCS lock */
+	struct m_spinlock	*m_lock;	/* Spinner MCS lock */
 #endif
 #ifdef CONFIG_DEBUG_MUTEXES
 	const char 		*name;
--- a/kernel/locking/Makefile
+++ b/kernel/locking/Makefile
@@ -1,5 +1,5 @@
 
-obj-y += mutex.o semaphore.o rwsem.o lglock.o
+obj-y += mutex.o semaphore.o rwsem.o lglock.o mcs_spinlock.o
 
 ifdef CONFIG_FUNCTION_TRACER
 CFLAGS_REMOVE_lockdep.o = -pg
--- /dev/null
+++ b/kernel/locking/mcs_spinlock.c
@@ -0,0 +1,156 @@
+
+#include <linux/percpu.h>
+#include <linux/mutex.h>
+#include <linux/sched.h>
+#include "mcs_spinlock.h"
+
+#ifdef CONFIG_SMP
+
+/*
+ * Using a single mcs node per CPU is safe because mutex_lock() should not be
+ * called from interrupt context and we have preemption disabled over the mcs
+ * lock usage.
+ */
+static DEFINE_PER_CPU_SHARED_ALIGNED(struct m_spinlock, m_node);
+
+/*
+ * Get a stable @node->next pointer, either for unlock() or unqueue() purposes.
+ * Can return NULL in case we were the last queued and we updated @lock instead.
+ */
+static inline struct m_spinlock *
+m_spin_wait_next(struct m_spinlock **lock, struct m_spinlock *node,
+		 struct m_spinlock *prev)
+{
+	struct m_spinlock *next = NULL;
+
+	for (;;) {
+		if (*lock == node && cmpxchg(lock, node, prev) == node) {
+			/*
+			 * We were the last queued, we moved @lock back. @prev
+			 * will now observe @lock and will complete its
+			 * unlock()/unqueue().
+			 */
+			break;
+		}
+
+		/*
+		 * We must xchg() the @node->next value, because if we were to
+		 * leave it in, a concurrent unlock()/unqueue() from
+		 * @node->next might complete Step-A and think its @prev is
+		 * still valid.
+		 *
+		 * If the concurrent unlock()/unqueue() wins the race, we'll
+		 * wait for either @lock to point to us, through its Step-B, or
+		 * wait for a new @node->next from its Step-C.
+		 */
+		if (node->next) {
+			next = xchg(&node->next, NULL);
+			if (next)
+				break;
+		}
+
+		arch_mutex_cpu_relax();
+	}
+
+	return next;
+}
+
+bool m_spin_lock(struct m_spinlock **lock)
+{
+	struct m_spinlock *node = this_cpu_ptr(&m_node);
+	struct m_spinlock *prev, *next;
+
+	node->locked = 0;
+	node->next = NULL;
+
+	node->prev = prev = xchg(lock, node);
+	if (likely(prev == NULL))
+		return true;
+
+	ACCESS_ONCE(prev->next) = node;
+
+	/*
+	 * Normally @prev is untouchable after the above store; because at that
+	 * moment unlock can proceed and wipe the node element from stack.
+	 *
+	 * However, since our nodes are static per-cpu storage, we're
+	 * guaranteed their existence -- this allows us to apply
+	 * cmpxchg in an attempt to undo our queueing.
+	 */
+
+	while (!smp_load_acquire(&node->locked)) {
+		if (need_resched())
+			goto unqueue;
+		arch_mutex_cpu_relax();
+	}
+	return true;
+
+unqueue:
+	/*
+	 * Step - A  -- stabilize @prev
+	 *
+	 * Undo our @prev->next assignment; this will make @prev's
+	 * unlock()/unqueue() wait for a next pointer since @lock points to us
+	 * (or later).
+	 */
+
+	for (;;) {
+		if (prev->next == node &&
+		    cmpxchg(&prev->next, node, NULL) == node)
+			break;
+
+		/*
+		 * We can only fail the cmpxchg() racing against an unlock(),
+		 * in which case we should observe @node->locked becoming
+		 * true.
+		 */
+		if (smp_load_acquire(&node->locked))
+			return true;
+
+		/*
+		 * Or we race against a concurrent unqueue()'s step-B, in which
+		 * case its step-C will write us a new @node->prev pointer.
+		 */
+		prev = ACCESS_ONCE(node->prev);
+	}
+
+	/*
+	 * Step - B -- stabilize @next
+	 *
+	 * Similar to unlock(), wait for @node->next or move @lock from @node
+	 * back to @prev.
+	 */
+
+	next = m_spin_wait_next(lock, node, prev);
+	if (!next)
+		return false;
+
+	/*
+	 * Step - C -- unlink
+	 *
+	 * @prev is stable because it's still waiting for a new @prev->next
+	 * pointer, @next is stable because our @node->next pointer is NULL and
+	 * it will wait in Step-A.
+	 */
+
+	ACCESS_ONCE(next->prev) = prev;
+	ACCESS_ONCE(prev->next) = next;
+
+	return false;
+}
+
+void m_spin_unlock(struct m_spinlock **lock)
+{
+	struct m_spinlock *node = this_cpu_ptr(&m_node);
+	struct m_spinlock *next;
+
+	if (likely(cmpxchg(lock, node, NULL) == node))
+		return;
+
+	next = m_spin_wait_next(lock, node, NULL);
+	if (next)
+		ACCESS_ONCE(next->locked) = 1;
+}
+
+#endif
+
--- a/kernel/locking/mcs_spinlock.h
+++ b/kernel/locking/mcs_spinlock.h
@@ -109,4 +109,22 @@ void mcs_spin_unlock(struct mcs_spinlock
 	arch_mcs_spin_unlock_contended(&next->locked);
 }
 
+/*
+ * Cancelable version of the MCS lock above.
+ *
+ * This version can fail acquisition and unqueue a spinner; it assumes no
+ * nesting.
+ *
+ * Intended for adaptive spinning of sleeping locks:
+ * mutex_lock()/rwsem_down_{read,write}() etc.
+ */
+
+struct m_spinlock {
+	struct m_spinlock *next, *prev;
+	int locked; /* 1 if lock acquired */
+};
+
+extern bool m_spin_lock(struct m_spinlock **lock);
+extern void m_spin_unlock(struct m_spinlock **lock);
+
 #endif /* __LINUX_MCS_SPINLOCK_H */
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -53,7 +53,7 @@ __mutex_init(struct mutex *lock, const c
 	INIT_LIST_HEAD(&lock->wait_list);
 	mutex_clear_owner(lock);
 #ifdef CONFIG_MUTEX_SPIN_ON_OWNER
-	lock->mcs_lock = NULL;
+	lock->m_lock = NULL;
 #endif
 
 	debug_mutex_init(lock, name, key);
@@ -403,7 +403,9 @@ __mutex_lock_common(struct mutex *lock,
 	if (!mutex_can_spin_on_owner(lock))
 		goto slowpath;
 
-	mcs_spin_lock(&lock->mcs_lock);
+	if (!m_spin_lock(&lock->m_lock))
+		goto slowpath;
+
 	for (;;) {
 		struct task_struct *owner;
 
@@ -442,7 +444,7 @@ __mutex_lock_common(struct mutex *lock,
 			}
 
 			mutex_set_owner(lock);
-			mcs_spin_unlock(&lock->mcs_lock);
+			m_spin_unlock(&lock->m_lock);
 			preempt_enable();
 			return 0;
 		}
@@ -464,7 +466,7 @@ __mutex_lock_common(struct mutex *lock,
 		 */
 		arch_mutex_cpu_relax();
 	}
-	mcs_spin_unlock(&lock->mcs_lock);
+	m_spin_unlock(&lock->m_lock);
 slowpath:
 #endif
 	spin_lock_mutex(&lock->wait_lock, flags);
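
For readers who want the calling convention without the queueing details,
here is a minimal userspace sketch in C11. It assumes a plain test-and-set
lock in place of the MCS queue, so it says nothing about the Step-A/B/C
unqueue logic. All names in it (yield_requested, tas_lock,
cancelable_spin_lock(), spin_unlock(), lock_common()) are hypothetical;
yield_requested stands in for need_resched(). It only shows the contract
that a false return from the spin lock means "canceled, take the slowpath".

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for need_resched(): set when a spinner must give up. */
static atomic_bool yield_requested;

/* Swapped-in test-and-set lock, NOT the queued MCS lock from the patch. */
struct tas_lock {
	atomic_flag held;
};

/* Spin until the lock is taken (true) or a yield is requested (false). */
static bool cancelable_spin_lock(struct tas_lock *l)
{
	while (atomic_flag_test_and_set_explicit(&l->held,
						 memory_order_acquire)) {
		if (atomic_load_explicit(&yield_requested,
					 memory_order_relaxed))
			return false;	/* canceled; we hold nothing */
	}
	return true;
}

static void spin_unlock(struct tas_lock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

enum path { FAST_PATH, SLOW_PATH };

/* Caller-side shape, loosely mirroring the __mutex_lock_common() hunk. */
static enum path lock_common(struct tas_lock *spinner_lock)
{
	if (!cancelable_spin_lock(spinner_lock))
		return SLOW_PATH;	/* like "goto slowpath" */

	/* ... the adaptive spin on the mutex owner would go here ... */

	spin_unlock(spinner_lock);
	return FAST_PATH;
}
```

The point of the bool return is that a canceled spinner never holds the
lock, so the caller must not unlock on that path.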
