* [PATCH -tip v2 1/2] locking/rtmutex: Support spin on owner
@ 2015-07-01 20:29 Davidlohr Bueso
  2015-07-01 20:29 ` [PATCH 2/2] rtmutex: Delete scriptable tester Davidlohr Bueso
  2015-07-01 22:27 ` [PATCH -tip v2 1/2] locking/rtmutex: Support spin on owner Thomas Gleixner
  0 siblings, 2 replies; 5+ messages in thread
From: Davidlohr Bueso @ 2015-07-01 20:29 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Thomas Gleixner
  Cc: Darren Hart, Steven Rostedt, Mike Galbraith, Paul E. McKenney,
	Sebastian Andrzej Siewior, Davidlohr Bueso, linux-kernel,
	Davidlohr Bueso

Similar to what we have in other locks, particularly regular mutexes, the
idea is that as long as the owner is running, there is a fair chance it'll
release the lock soon, and thus a task trying to acquire the rtmutex will
be better off spinning instead of blocking immediately after the fastpath.
Conditions to stop spinning and enter the slowpath are simple:

(1) Upon need_resched()
(2) Current lock owner blocks

Because rtmutexes track the lock owner atomically, we can extend the fastpath
to continue polling on the lock owner via cmpxchg(lock->owner, NULL, current).
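
To illustrate, the core of the idea boils down to something like the sketch
below (over-simplified, not the patch code itself; it ignores the waiter
check, preempt_disable() and the RCU protection that the real
rt_mutex_spin_on_owner() needs, and spin_for_lock() is just a made-up name):

	static bool spin_for_lock(struct rt_mutex *lock)
	{
		/* poll the owner field, trying to grab the lock when free */
		while (!rt_mutex_cmpxchg(lock, NULL, current)) {
			struct task_struct *owner = rt_mutex_owner(lock);

			/* conditions (1) and (2) above: give up and block */
			if (need_resched() || (owner && !owner->on_cpu))
				return false;	/* caller takes the slowpath */

			cpu_relax_lowlatency();
		}
		return true;	/* got the lock via the fastpath cmpxchg */
	}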

However, this is a conservative approach: if there are any waiters queued,
we stop spinning and immediately take the traditional slowpath. This allows
priority boosting to take precedence over spinning, as otherwise we could
starve a higher priority queued-up task (ie: top waiter) if spinners
constantly steal the lock. This is done by locklessly checking the rbtree
for any waiters (which is safe, as if we race all we do is iterate again).
Obviously this means that spinners ignore the "has waiters" flag/bit and thus
tasks can extend their spinning time until the first task queues itself into
the tree. In addition, tasks that failed their spinning and are in the
slowpath still sync with unlocking tasks, respecting try_to_take_rt_mutex()
-- spinning only allows possible lock stealing. Because of its rt nature,
there are only limited benefits to this approach, as most systems that
really need real-time locks will commonly end up boosting prio, which means
we have queued waiters.
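
In sketch form, the bail-out is nothing more than a lockless peek at the
waiter tree before each acquisition attempt (in the patch this is simply the
existing rt_mutex_has_waiters() helper):

	/*
	 * Real waiters live in lock->waiters; the "has waiters" bit in
	 * lock->owner can be set without anybody actually being queued,
	 * so look at the rbtree root instead.
	 */
	if (!RB_EMPTY_ROOT(&lock->waiters))
		return false;	/* don't steal, defer to the top waiter */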

Passes futextests and was seen to improve total inversions performed on a
60-core box running pi_stress (with a 10 minute window) by ~5% -- which,
given the program characteristics, is a non-trivial speedup. More
importantly, it shows that we continue to avoid the deadlock scenarios the
benchmark purposely introduces and then avoids via prio boosting.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---
Changes from v1 (from tglx):
 o Removed the osq usage, too bad we cannot have it as is
   due to queue order.
 o Changed waiter check to use rbtree instead of has waiters bit.
 o Comment cleanups.
 o Rebased & more testing.

 kernel/Kconfig.locks     |   4 ++
 kernel/locking/rtmutex.c | 114 ++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 117 insertions(+), 1 deletion(-)

diff --git a/kernel/Kconfig.locks b/kernel/Kconfig.locks
index ebdb004..38ea896 100644
--- a/kernel/Kconfig.locks
+++ b/kernel/Kconfig.locks
@@ -227,6 +227,10 @@ config MUTEX_SPIN_ON_OWNER
 	def_bool y
 	depends on SMP && !DEBUG_MUTEXES && ARCH_SUPPORTS_ATOMIC_RMW
 
+config RT_MUTEX_SPIN_ON_OWNER
+	def_bool y
+	depends on SMP && RT_MUTEXES && !DEBUG_RT_MUTEXES
+
 config RWSEM_SPIN_ON_OWNER
        def_bool y
        depends on SMP && RWSEM_XCHGADD_ALGORITHM && ARCH_SUPPORTS_ATOMIC_RMW
diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index 5674b07..66b8aa3 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1150,6 +1150,116 @@ static void rt_mutex_handle_deadlock(int res, int detect_deadlock,
 	}
 }
 
+#ifdef CONFIG_RT_MUTEX_SPIN_ON_OWNER
+static noinline
+bool rt_mutex_spin_on_owner(struct rt_mutex *lock, struct task_struct *owner)
+{
+	bool ret = true;
+
+	rcu_read_lock();
+	while (rt_mutex_owner(lock) == owner) {
+		/*
+		 * Ensure we emit the owner->on_cpu dereference _after_
+		 * checking lock->owner still matches owner. If that fails,
+		 * owner might point to freed memory. If it still matches,
+		 * the rcu_read_lock() ensures the memory stays valid.
+		 */
+		barrier();
+
+		if (!owner->on_cpu || need_resched()) {
+			ret = false;
+			break;
+		}
+
+		cpu_relax_lowlatency();
+	}
+	rcu_read_unlock();
+
+	return ret;
+}
+
+/*
+ * Initial check for entering the mutex spinning loop.
+ * If no owner, the lock is free.
+ */
+static inline bool rt_mutex_can_spin_on_owner(struct rt_mutex *lock)
+{
+	struct task_struct *owner;
+	bool ret = true;
+
+	if (need_resched())
+		return false;
+
+	rcu_read_lock();
+	owner = rt_mutex_owner(lock);
+	if (owner)
+		ret = owner->on_cpu;
+	rcu_read_unlock();
+
+	return ret;
+}
+
+static bool rt_mutex_optimistic_spin(struct rt_mutex *lock)
+{
+	bool taken = false;
+
+	preempt_disable();
+
+	if (!rt_mutex_can_spin_on_owner(lock))
+		goto done;
+
+	while (true) {
+		struct task_struct *owner;
+
+		/*
+		 * When another task is competing for the lock in the
+		 * slowpath avoid the cmpxchg and immediately block.
+		 *
+		 * This is a lockless check against the waiters rb
+		 * root, meaning that only _real_ waiters are
+		 * acknowledged. The sometimes-phony "has waiters"
+		 * bit, in contrast, is set unconditionally in the
+		 * slowpath acquire path, merely to sync up with any
+		 * owner releasing the lock.
+		 */
+		if (rt_mutex_has_waiters(lock))
+			goto done;
+
+		/*
+		 * If there's an owner, wait for it to either
+		 * release the lock or go to sleep.
+		 */
+		owner = rt_mutex_owner(lock);
+		if (owner && !rt_mutex_spin_on_owner(lock, owner))
+			break;
+
+		/* Try to acquire the lock, if it is unlocked. */
+		if (rt_mutex_cmpxchg(lock, NULL, current)) {
+			taken = true;
+			goto done;
+		}
+
+		/*
+		 * The cpu_relax() call is a compiler barrier which forces
+		 * everything in this loop to be re-loaded. We don't need
+		 * memory barriers as we'll eventually observe the right
+		 * values at the cost of a few extra spins.
+		 */
+		cpu_relax_lowlatency();
+	}
+
+done:
+	preempt_enable();
+	return taken;
+}
+
+#else
+static bool rt_mutex_optimistic_spin(struct rt_mutex *lock)
+{
+	return false;
+}
+#endif
+
 /*
  * Slow path lock function:
  */
@@ -1161,6 +1271,9 @@ rt_mutex_slowlock(struct rt_mutex *lock, int state,
 	struct rt_mutex_waiter waiter;
 	int ret = 0;
 
+	if (rt_mutex_optimistic_spin(lock))
+		return ret;
+
 	debug_rt_mutex_init_waiter(&waiter);
 	RB_CLEAR_NODE(&waiter.pi_tree_entry);
 	RB_CLEAR_NODE(&waiter.tree_entry);
@@ -1524,7 +1637,6 @@ void __rt_mutex_init(struct rt_mutex *lock, const char *name)
 	raw_spin_lock_init(&lock->wait_lock);
 	lock->waiters = RB_ROOT;
 	lock->waiters_leftmost = NULL;
-
 	debug_rt_mutex_init(lock, name);
 }
 EXPORT_SYMBOL_GPL(__rt_mutex_init);
-- 
2.1.4

