From: Chris Mason <mason@suse.com>
To: Con Kolivas <kernel@kolivas.org>, linux-kernel@vger.kernel.org
Subject: [PATCH RFC] smt nice introduces significant lock contention
Date: Thu, 1 Jun 2006 18:55:37 -0400
Message-ID: <200606011855.38110.mason@suse.com>

Hello everyone,

Recent benchmarks showed some performance regressions between 2.6.5 and
2.6.16.  We tracked down one of the regressions to lock contention in
schedule-heavy workloads (~70,000 context switches per second).

kernel/sched.c:dependent_sleeper() was responsible for most of the lock
contention, hammering on the run queue locks.  The patch below is more of
a discussion point than a suggested fix (although it does reduce lock
contention significantly).  The dependent_sleeper code looks very expensive
to me, especially because it uses a spinlock to bounce control between two
siblings on the same CPU.

--- a/kernel/sched.c	Thu May 18 15:55:43 2006 -0400
+++ b/kernel/sched.c	Tue May 23 21:13:52 2006 -0400
@@ -2630,6 +2630,27 @@ static inline void wakeup_busy_runqueue(
 		resched_task(rq->idle);
 }
 
+static int trylock_smt_cpus(cpumask_t sibling_map)
+{
+	int ret = 1;
+	int numlocked = 0;
+	int i;
+	for_each_cpu_mask(i, sibling_map) {
+		ret = spin_trylock(&cpu_rq(i)->lock);
+		if (!ret)
+			break;
+		numlocked++;
+	}
+	if (ret || !numlocked)
+		return ret;
+	for_each_cpu_mask(i, sibling_map) {
+		spin_unlock(&cpu_rq(i)->lock);
+		if (--numlocked == 0)
+			break;
+	}
+	return 0;
+}
+
 static void wake_sleeping_dependent(int this_cpu, runqueue_t *this_rq)
 {
 	struct sched_domain *tmp, *sd = NULL;
@@ -2643,22 +2664,16 @@ static void wake_sleeping_dependent(int 
 	if (!sd)
 		return;
 
-	/*
-	 * Unlock the current runqueue because we have to lock in
-	 * CPU order to avoid deadlocks. Caller knows that we might
-	 * unlock. We keep IRQs disabled.
-	 */
-	spin_unlock(&this_rq->lock);
-
 	sibling_map = sd->span;
 
-	for_each_cpu_mask(i, sibling_map)
-		spin_lock(&cpu_rq(i)->lock);
 	/*
 	 * We clear this CPU from the mask. This both simplifies the
 	 * inner loop and keps this_rq locked when we exit:
 	 */
 	cpu_clear(this_cpu, sibling_map);
+
+	if (!trylock_smt_cpus(sibling_map))
+		return;
 
 	for_each_cpu_mask(i, sibling_map) {
 		runqueue_t *smt_rq = cpu_rq(i);
@@ -2703,11 +2718,10 @@ static int dependent_sleeper(int this_cp
 	 * The same locking rules and details apply as for
 	 * wake_sleeping_dependent():
 	 */
-	spin_unlock(&this_rq->lock);
 	sibling_map = sd->span;
-	for_each_cpu_mask(i, sibling_map)
-		spin_lock(&cpu_rq(i)->lock);
 	cpu_clear(this_cpu, sibling_map);
+	if (!trylock_smt_cpus(sibling_map))
+		return 0;
 
 	/*
 	 * Establish next task to be run - it might have gone away because
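
The all-or-nothing trylock used by trylock_smt_cpus() in the patch above can
be tried out in user space. What follows is only a minimal sketch under stated
assumptions: it uses POSIX mutexes in place of run queue spinlocks, a plain
bitmask in place of cpumask_t, and the names queue_lock, trylock_all and
unlock_all are hypothetical and do not appear in the kernel. It takes every
lock in the mask or, on the first busy lock, releases what it already took and
reports failure so the caller can skip the work instead of spinning.

```c
#include <pthread.h>

#define NR_QUEUES 4

/* Hypothetical stand-ins for the per-CPU run queue locks. */
static pthread_mutex_t queue_lock[NR_QUEUES] = {
	PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER,
	PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER,
};

/*
 * Try to take every lock whose bit is set in mask.
 * Returns 1 with all of them held, or 0 with none of them held.
 */
static int trylock_all(unsigned int mask)
{
	int i, j;

	for (i = 0; i < NR_QUEUES; i++) {
		if (!(mask & (1u << i)))
			continue;
		if (pthread_mutex_trylock(&queue_lock[i]) != 0) {
			/* Busy: release only the locks taken so far. */
			for (j = 0; j < i; j++)
				if (mask & (1u << j))
					pthread_mutex_unlock(&queue_lock[j]);
			return 0;
		}
	}
	return 1;
}

/* Release every lock in mask; only valid after trylock_all() returned 1. */
static void unlock_all(unsigned int mask)
{
	int i;

	for (i = 0; i < NR_QUEUES; i++)
		if (mask & (1u << i))
			pthread_mutex_unlock(&queue_lock[i]);
}
```

A caller would wrap its critical section as "if (!trylock_all(mask)) return;"
followed by the work and unlock_all(mask). The trade-off is the same as in the
patch: under contention the work is skipped rather than delayed, so it only
suits work that is safe to skip.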

