From: Gregory Haskins <ghaskins@novell.com>
To: mingo@elte.hu
Cc: rostedt@goodmis.org, peterz@infradead.org,
linux-kernel@vger.kernel.org, linux-rt-users@vger.kernel.org,
npiggin@suse.de, gregory.haskins@gmail.com
Subject: [TIP/SCHED/DEVEL PATCH v3 3/6] sched: make double-lock-balance fair
Date: Thu, 04 Sep 2008 08:55:15 -0400 [thread overview]
Message-ID: <20080904125514.23633.91459.stgit@dev.haskins.net> (raw)
In-Reply-To: <20080904125029.23633.88210.stgit@dev.haskins.net>
double_lock balance() currently favors logically lower cpus since they
often do not have to release their own lock to acquire a second lock.
The result is that logically higher cpus can get starved when there is
a lot of pressure on the RQs. This can result in higher latencies on
higher cpu-ids.
This patch makes the algorithm more fair by forcing all paths to have
to release both locks before acquiring them again. Since callsites to
double_lock_balance already consider it a potential preemption/reschedule
point, they have the proper logic to recheck for atomicity violations.
Signed-off-by: Gregory Haskins <ghaskins@novell.com>
---
kernel/sched.c | 52 +++++++++++++++++++++++++++++++++++++++++++++-------
1 files changed, 45 insertions(+), 7 deletions(-)
diff --git a/kernel/sched.c b/kernel/sched.c
index 35e1f21..af4c6fa 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -2782,21 +2782,43 @@ static void double_rq_unlock(struct rq *rq1, struct rq *rq2)
__release(rq2->lock);
}
+#ifdef CONFIG_PREEMPT
+
/*
- * double_lock_balance - lock the busiest runqueue, this_rq is locked already.
+ * fair double_lock_balance: Safely acquires both rq->locks in a fair
+ * way at the expense of forcing extra atomic operations in all
+ * invocations. This assures that the double_lock is acquired using the
+ * same underlying policy as the spinlock_t on this architecture, which
+ * reduces latency compared to the unfair variant below. However, it
+ * also adds more overhead and therefore may reduce throughput.
*/
-static int double_lock_balance(struct rq *this_rq, struct rq *busiest)
+static inline int _double_lock_balance(struct rq *this_rq, struct rq *busiest)
+ __releases(this_rq->lock)
+ __acquires(busiest->lock)
+ __acquires(this_rq->lock)
+{
+ spin_unlock(&this_rq->lock);
+ double_rq_lock(this_rq, busiest);
+
+ return 1;
+}
+
+#else
+
+/*
+ * Unfair double_lock_balance: Optimizes throughput at the expense of
+ * latency by eliminating extra atomic operations when the locks are
+ * already in proper order on entry. This favors lower cpu-ids and will
+ * grant the double lock to lower cpus over higher ids under contention,
+ * regardless of entry order into the function.
+ */
+static inline int _double_lock_balance(struct rq *this_rq, struct rq *busiest)
__releases(this_rq->lock)
__acquires(busiest->lock)
__acquires(this_rq->lock)
{
int ret = 0;
- if (unlikely(!irqs_disabled())) {
- /* printk() doesn't work good under rq->lock */
- spin_unlock(&this_rq->lock);
- BUG_ON(1);
- }
if (unlikely(!spin_trylock(&busiest->lock))) {
if (busiest < this_rq) {
spin_unlock(&this_rq->lock);
@@ -2809,6 +2831,22 @@ static int double_lock_balance(struct rq *this_rq, struct rq *busiest)
return ret;
}
+#endif /* CONFIG_PREEMPT */
+
+/*
+ * double_lock_balance - lock the busiest runqueue, this_rq is locked already.
+ */
+static int double_lock_balance(struct rq *this_rq, struct rq *busiest)
+{
+ if (unlikely(!irqs_disabled())) {
+ /* printk() doesn't work good under rq->lock */
+ spin_unlock(&this_rq->lock);
+ BUG_ON(1);
+ }
+
+ return _double_lock_balance(this_rq, busiest);
+}
+
static void double_unlock_balance(struct rq *this_rq, struct rq *busiest)
__releases(busiest->lock)
{
next prev parent reply other threads:[~2008-09-04 12:57 UTC|newest]
Thread overview: 51+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-08-25 20:15 [PATCH 0/5] sched: misc rt fixes for tip/sched/devel Gregory Haskins
2008-08-25 20:15 ` [PATCH 1/5] sched: only try to push a task on wakeup if it is migratable Gregory Haskins
2008-08-25 20:15 ` [PATCH 2/5] sched: pull only one task during NEWIDLE balancing to limit critical section Gregory Haskins
2008-08-26 6:21 ` Nick Piggin
2008-08-26 11:36 ` Gregory Haskins
2008-08-27 6:41 ` Nick Piggin
2008-08-27 11:50 ` Gregory Haskins
2008-08-27 11:57 ` Nick Piggin
2008-08-25 20:15 ` [PATCH 3/5] sched: make double-lock-balance fair Gregory Haskins
2008-08-26 6:14 ` Nick Piggin
2008-08-26 12:23 ` Gregory Haskins
2008-08-27 6:36 ` Nick Piggin
2008-08-27 11:41 ` Gregory Haskins
2008-08-27 11:53 ` Nick Piggin
2008-08-27 12:10 ` Gregory Haskins
2008-08-25 20:15 ` [PATCH 4/5] sched: add sched_class->needs_post_schedule() member Gregory Haskins
2008-08-25 20:15 ` [PATCH 5/5] sched: create "pushable_tasks" list to limit pushing to one attempt Gregory Haskins
2008-08-26 17:34 ` [PATCH v2 0/6] Series short description Gregory Haskins
2008-08-26 17:34 ` [PATCH v2 1/6] sched: only try to push a task on wakeup if it is migratable Gregory Haskins
2008-08-26 17:34 ` [PATCH v2 2/6] sched: pull only one task during NEWIDLE balancing to limit critical section Gregory Haskins
2008-08-26 17:35 ` [PATCH v2 3/6] sched: make double-lock-balance fair Gregory Haskins
2008-08-27 8:21 ` Peter Zijlstra
2008-08-27 8:25 ` Peter Zijlstra
2008-08-27 10:26 ` Nick Piggin
2008-08-27 10:41 ` Peter Zijlstra
2008-08-27 10:56 ` Nick Piggin
2008-08-27 10:57 ` Nick Piggin
2008-08-27 12:03 ` Gregory Haskins
2008-08-27 11:07 ` Peter Zijlstra
2008-08-27 11:17 ` Russell King
2008-08-27 12:00 ` Gregory Haskins
2008-08-29 12:49 ` Ralf Baechle
2008-08-27 12:13 ` Gregory Haskins
2008-08-27 12:02 ` Gregory Haskins
2008-08-26 17:35 ` [PATCH v2 4/6] sched: add sched_class->needs_post_schedule() member Gregory Haskins
2008-08-26 17:35 ` [PATCH v2 5/6] plist: fix PLIST_NODE_INIT to work with debug enabled Gregory Haskins
2008-08-26 17:35 ` [PATCH v2 6/6] sched: create "pushable_tasks" list to limit pushing to one attempt Gregory Haskins
2008-08-29 13:24 ` Gregory Haskins
2008-08-26 18:16 ` [PATCH v2 0/6] sched: misc rt fixes for tip/sched/devel (was: Series short description) Gregory Haskins
2008-08-27 8:33 ` [PATCH v2 0/6] Series short description Peter Zijlstra
2008-09-04 12:54 ` [TIP/SCHED/DEVEL PATCH v3 0/6] sched: misc rt fixes Gregory Haskins
2008-09-04 12:55 ` [TIP/SCHED/DEVEL PATCH v3 1/6] sched: only try to push a task on wakeup if it is migratable Gregory Haskins
2008-09-04 12:55 ` [TIP/SCHED/DEVEL PATCH v3 2/6] sched: pull only one task during NEWIDLE balancing to limit critical section Gregory Haskins
2008-09-04 12:55 ` Gregory Haskins [this message]
2008-09-04 12:55 ` [TIP/SCHED/DEVEL PATCH v3 4/6] sched: add sched_class->needs_post_schedule() member Gregory Haskins
2008-09-04 20:36 ` Steven Rostedt
2008-09-04 20:36 ` Gregory Haskins
2008-09-04 12:55 ` [TIP/SCHED/DEVEL PATCH v3 5/6] plist: fix PLIST_NODE_INIT to work with debug enabled Gregory Haskins
2008-09-04 12:55 ` [TIP/SCHED/DEVEL PATCH v3 6/6] sched: create "pushable_tasks" list to limit pushing to one attempt Gregory Haskins
2008-09-04 21:16 ` Steven Rostedt
2008-09-04 21:26 ` Gregory Haskins
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20080904125514.23633.91459.stgit@dev.haskins.net \
--to=ghaskins@novell.com \
--cc=gregory.haskins@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-rt-users@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=npiggin@suse.de \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).