public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Nick Piggin <nickpiggin@yahoo.com.au>
To: Andrew Morton <akpm@osdl.org>,
	linux-kernel <linux-kernel@vger.kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>,
	"Siddha, Suresh B" <suresh.b.siddha@intel.com>
Subject: [patch 1/5] sched: remove degenerate domains
Date: Wed, 06 Apr 2005 09:44:32 +1000	[thread overview]
Message-ID: <425322E0.9070307@yahoo.com.au> (raw)

[-- Attachment #1: Type: text/plain, Size: 76 bytes --]

This is Suresh's patch with some modifications.

-- 
SUSE Labs, Novell Inc.

[-- Attachment #2: sched-remove-degenerate-domains.patch --]
[-- Type: text/plain, Size: 3953 bytes --]

Remove degenerate scheduler domains during the sched-domain init.

For example on x86_64, we always have NUMA configured in. On Intel EM64T
systems, top most sched domain will be of NUMA and with only one sched_group in
it. 

With fork/exec balances(recent Nick's fixes in -mm tree), we always endup 
taking wrong decisions because of this topmost domain (as it contains only 
one group and find_idlest_group always returns NULL). We will endup loading 
HT package completely first, letting active load balance kickin and correct it.

In general, this patch also makes sense with out recent Nick's fixes
in -mm.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>

Modified to account for more than just sched_groups when scanning for
degenerate domains by Nick Piggin. Allow a runqueue's sd to go NULL, which
required small changes to the smtnice code.

Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au>


Index: linux-2.6/kernel/sched.c
===================================================================
--- linux-2.6.orig/kernel/sched.c	2005-04-05 16:38:21.000000000 +1000
+++ linux-2.6/kernel/sched.c	2005-04-05 18:39:09.000000000 +1000
@@ -2583,11 +2583,15 @@ out:
 #ifdef CONFIG_SCHED_SMT
 static inline void wake_sleeping_dependent(int this_cpu, runqueue_t *this_rq)
 {
-	struct sched_domain *sd = this_rq->sd;
+	struct sched_domain *tmp, *sd = NULL;
 	cpumask_t sibling_map;
 	int i;
+	
+	for_each_domain(this_cpu, tmp)
+		if (tmp->flags & SD_SHARE_CPUPOWER)
+			sd = tmp;
 
-	if (!(sd->flags & SD_SHARE_CPUPOWER))
+	if (!sd)
 		return;
 
 	/*
@@ -2628,13 +2632,17 @@ static inline void wake_sleeping_depende
 
 static inline int dependent_sleeper(int this_cpu, runqueue_t *this_rq)
 {
-	struct sched_domain *sd = this_rq->sd;
+	struct sched_domain *tmp, *sd = NULL;
 	cpumask_t sibling_map;
 	prio_array_t *array;
 	int ret = 0, i;
 	task_t *p;
 
-	if (!(sd->flags & SD_SHARE_CPUPOWER))
+	for_each_domain(this_cpu, tmp)
+		if (tmp->flags & SD_SHARE_CPUPOWER)
+			sd = tmp;
+
+	if (!sd)
 		return 0;
 
 	/*
@@ -4604,6 +4612,11 @@ static void sched_domain_debug(struct sc
 {
 	int level = 0;
 
+	if (!sd) {
+		printk(KERN_DEBUG "CPU%d attaching NULL sched-domain.\n", cpu);
+		return;
+	}
+		
 	printk(KERN_DEBUG "CPU%d attaching sched-domain:\n", cpu);
 
 	do {
@@ -4809,6 +4822,50 @@ static void init_sched_domain_sysctl(voi
 }
 #endif
 
+static int __devinit sd_degenerate(struct sched_domain *sd)
+{
+	if (cpus_weight(sd->span) == 1)
+		return 1;
+
+	/* Following flags need at least 2 groups */
+	if (sd->flags & (SD_LOAD_BALANCE |
+			 SD_BALANCE_NEWIDLE |
+			 SD_BALANCE_FORK |
+			 SD_BALANCE_EXEC)) {
+		if (sd->groups != sd->groups->next)
+			return 0;
+	}
+
+	/* Following flags don't use groups */
+	if (sd->flags & (SD_WAKE_IDLE |
+			 SD_WAKE_AFFINE |
+			 SD_WAKE_BALANCE))
+		return 0;
+
+	return 1;
+}
+
+static int __devinit sd_parent_degenerate(struct sched_domain *sd,
+						struct sched_domain *parent)
+{
+	unsigned long cflags = sd->flags, pflags = parent->flags;
+
+	if (sd_degenerate(parent))
+		return 1;
+
+	if (!cpus_equal(sd->span, parent->span))
+		return 0;
+
+	/* Does parent contain flags not in child? */
+	/* WAKE_BALANCE is a subset of WAKE_AFFINE */
+	if (cflags & SD_WAKE_AFFINE)
+		pflags &= ~SD_WAKE_BALANCE;
+	if ((~sd->flags) & parent->flags)
+		return 0;
+
+	return 1;
+}
+
 /*
  * Attach the domain 'sd' to 'cpu' as its base domain.  Callers must
  * hold the hotplug lock.
@@ -4819,6 +4876,19 @@ void __devinit cpu_attach_domain(struct 
 	unsigned long flags;
 	runqueue_t *rq = cpu_rq(cpu);
 	int local = 1;
+	struct sched_domain *tmp;
+
+	/* Remove the sched domains which do not contribute to scheduling. */
+	for (tmp = sd; tmp; tmp = tmp->parent) {
+		struct sched_domain *parent = tmp->parent;
+		if (!parent)
+			break;
+		if (sd_parent_degenerate(tmp, parent))
+			tmp->parent = parent->parent;
+	}
+
+	if (sd_degenerate(sd))
+		sd = sd->parent;
 
 	sched_domain_debug(sd, cpu);
 

             reply	other threads:[~2005-04-05 23:44 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2005-04-05 23:44 Nick Piggin [this message]
2005-04-05 23:45 ` [patch 2/5] sched: NULL domains Nick Piggin
2005-04-05 23:46   ` [patch 3/5] sched: multilevel sbe and sbf Nick Piggin
2005-04-05 23:47     ` [patch 4/5] sched: RCU sched domains Nick Piggin
2005-04-05 23:49       ` [patch 5/5] sched: consolidate sbe sbf Nick Piggin
2005-04-06  6:27         ` Ingo Molnar
2005-04-06  8:09           ` Nick Piggin
2005-04-06  8:16             ` Nick Piggin
2005-04-07  7:17               ` Ingo Molnar
2005-04-07  7:15             ` Ingo Molnar
2005-04-06  6:18       ` [patch 4/5] sched: RCU sched domains Ingo Molnar
2005-04-06  8:01         ` Nick Piggin
2005-04-07  7:11           ` Ingo Molnar
2005-04-07  7:58             ` Nick Piggin
2005-04-11 22:15               ` Paul E. McKenney
2005-04-12  0:03                 ` Nick Piggin
2005-04-06  5:54     ` [patch 3/5] sched: multilevel sbe and sbf Ingo Molnar
2005-04-06  7:53       ` Nick Piggin
2005-04-06  5:45   ` [patch 2/5] sched: NULL domains Ingo Molnar
2005-04-06  5:48     ` Ingo Molnar
2005-04-06  7:51       ` Nick Piggin
2005-04-06  5:44 ` [patch 1/5] sched: remove degenerate domains Ingo Molnar
2005-04-06  7:10   ` Siddha, Suresh B
2005-04-06  7:13     ` Ingo Molnar
2005-04-06  8:12       ` Nick Piggin
2005-04-06  7:49   ` Nick Piggin
2005-04-07  7:00     ` Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=425322E0.9070307@yahoo.com.au \
    --to=nickpiggin@yahoo.com.au \
    --cc=akpm@osdl.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=suresh.b.siddha@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox