SMP Panic caused by [PATCH] sched: consolidate sched domains

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* SMP Panic caused by [PATCH] sched: consolidate sched domains
@ 2004-08-29 13:39 James Bottomley
  2004-08-29 16:48 ` Jesse Barnes
  0 siblings, 1 reply; 16+ messages in thread
From: James Bottomley @ 2004-08-29 13:39 UTC (permalink / raw)
  To: Andrew Morton, Linus Torvalds, Matthew Dobson, Nick Piggin; +Cc: Linux Kernel

This patch causes an immediate panic when the secondary processors come
on-line because sd->next is NULL.

The fix is to use cpu_possible_map instead of nodemask (which expands,
probably erroneously, to cpu_online_map in the non-numa case).

Any use of cpu_online_map in initialisation code is almost invariably
wrong, so please don't do it in future.

I know I'm sounding like a broken record, but it would be a lot easier
to spot mistakes like this immediately if every arch used the hotplug
paths to bring SMP up.

Anyway, the attached fixes our panic.

James

===== kernel/sched.c 1.329 vs edited =====
--- 1.329/kernel/sched.c	2004-08-24 02:08:09 -07:00
+++ edited/kernel/sched.c	2004-08-29 06:17:26 -07:00
@@ -3756,7 +3756,7 @@
 		sd = &per_cpu(phys_domains, i);
 		group = cpu_to_phys_group(i);
 		*sd = SD_CPU_INIT;
-		sd->span = nodemask;
+		sd->span = cpu_possible_map;
 		sd->parent = p;
 		sd->groups = &sched_group_phys[group];

@@ -3790,7 +3790,7 @@
 		if (cpus_empty(nodemask))
 			continue;

-		init_sched_build_groups(sched_group_phys, nodemask,
+		init_sched_build_groups(sched_group_phys, cpu_possible_map,
 						&cpu_to_phys_group);
 	}

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: SMP Panic caused by [PATCH] sched: consolidate sched domains
  2004-08-29 13:39 SMP Panic caused by [PATCH] sched: consolidate sched domains James Bottomley
@ 2004-08-29 16:48 ` Jesse Barnes
  2004-08-29 16:58   ` James Bottomley
  2004-08-29 17:03   ` William Lee Irwin III
  0 siblings, 2 replies; 16+ messages in thread
From: Jesse Barnes @ 2004-08-29 16:48 UTC (permalink / raw)
  To: James Bottomley
  Cc: Andrew Morton, Linus Torvalds, Matthew Dobson, Nick Piggin,
	Linux Kernel

On Sunday, August 29, 2004 6:39 am, James Bottomley wrote:
> This patch causes an immediate panic when the secondary processors come
> on-line because sd->next is NULL.
>
> The fix is to use cpu_possible_map instead of nodemask (which expands,
> probably erroneously, to cpu_online_map in the non-numa case).
>
> Any use of cpu_online_map in initialisation code is almost invariably
> wrong, so please don't do it in future.
>
> I know I'm sounding like a broken record, but it would be a lot easier
> to spot mistakes like this immediately if every arch used the hotplug
> paths to bring SMP up.
>
> Anyway, the attached fixes our panic.
>
> James
>
> ===== kernel/sched.c 1.329 vs edited =====
> --- 1.329/kernel/sched.c 2004-08-24 02:08:09 -07:00
> +++ edited/kernel/sched.c 2004-08-29 06:17:26 -07:00
> @@ -3756,7 +3756,7 @@
>    sd = &per_cpu(phys_domains, i);
>    group = cpu_to_phys_group(i);
>    *sd = SD_CPU_INIT;
> -  sd->span = nodemask;
> +  sd->span = cpu_possible_map;
>    sd->parent = p;
>    sd->groups = &sched_group_phys[group];
>
> @@ -3790,7 +3790,7 @@
>    if (cpus_empty(nodemask))
>     continue;
>
> -  init_sched_build_groups(sched_group_phys, nodemask,
> +  init_sched_build_groups(sched_group_phys, cpu_possible_map,
>        &cpu_to_phys_group);
>   }

But I think this breaks what the code is supposed to do.  You're right that we 
shouldn't use cpu_online_map, but we should leave the nodemask in there and 
fix the code that sets it in the non-NUMA case instead.

Jesse

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: SMP Panic caused by [PATCH] sched: consolidate sched domains
  2004-08-29 16:48 ` Jesse Barnes
@ 2004-08-29 16:58   ` James Bottomley
  2004-08-29 17:07     ` Jesse Barnes
  2004-08-29 17:03   ` William Lee Irwin III
  1 sibling, 1 reply; 16+ messages in thread
From: James Bottomley @ 2004-08-29 16:58 UTC (permalink / raw)
  To: Jesse Barnes
  Cc: Andrew Morton, Linus Torvalds, Matthew Dobson, Nick Piggin,
	Linux Kernel

On Sun, 2004-08-29 at 12:48, Jesse Barnes wrote:
> But I think this breaks what the code is supposed to do.  You're right that we 
> shouldn't use cpu_online_map, but we should leave the nodemask in there and 
> fix the code that sets it in the non-NUMA case instead.

Well, let's say it puts back the original behaviour. If you look at even
the NUMA code before these changes, it had cpu_possible_map in there.

I totally agree about fixing NUMA, it looks completely broken to me in
the way it handles cpu maps because node_to_cpumask(i) needs to expand
to cpu_possible_map for initialisation and cpu_online_map for
operation.  Has anyone ever checked NUMA for hotplug CPU?

James

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: SMP Panic caused by [PATCH] sched: consolidate sched domains
  2004-08-29 16:48 ` Jesse Barnes
  2004-08-29 16:58   ` James Bottomley
@ 2004-08-29 17:03   ` William Lee Irwin III
  2004-08-29 17:09     ` James Bottomley
  1 sibling, 1 reply; 16+ messages in thread
From: William Lee Irwin III @ 2004-08-29 17:03 UTC (permalink / raw)
  To: Jesse Barnes
  Cc: James Bottomley, Andrew Morton, Linus Torvalds, Matthew Dobson,
	Nick Piggin, Linux Kernel

On Sunday, August 29, 2004 6:39 am, James Bottomley wrote:
>> This patch causes an immediate panic when the secondary processors come
>> on-line because sd->next is NULL.
>> The fix is to use cpu_possible_map instead of nodemask (which expands,
>> probably erroneously, to cpu_online_map in the non-numa case).
>> Any use of cpu_online_map in initialisation code is almost invariably
>> wrong, so please don't do it in future.
>> I know I'm sounding like a broken record, but it would be a lot easier
>> to spot mistakes like this immediately if every arch used the hotplug
>> paths to bring SMP up.
>> Anyway, the attached fixes our panic.

On Sun, Aug 29, 2004 at 09:48:06AM -0700, Jesse Barnes wrote:
> But I think this breaks what the code is supposed to do.  You're
> right that we shouldn't use cpu_online_map, but we should leave the
> nodemask in there and fix the code that sets it in the non-NUMA case
> instead.

Index: wait-2.6.9-rc1-mm1/include/asm-generic/topology.h
===================================================================
--- wait-2.6.9-rc1-mm1.orig/include/asm-generic/topology.h	2004-08-24 00:01:54.000000000 -0700
+++ wait-2.6.9-rc1-mm1/include/asm-generic/topology.h	2004-08-29 10:02:01.513753008 -0700
@@ -36,7 +36,7 @@
 #define parent_node(node)	(0)
 #endif
 #ifndef node_to_cpumask
-#define node_to_cpumask(node)	(cpu_online_map)
+#define node_to_cpumask(node)	(cpu_possible_map)
 #endif
 #ifndef node_to_first_cpu
 #define node_to_first_cpu(node)	(0)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: SMP Panic caused by [PATCH] sched: consolidate sched domains
  2004-08-29 16:58   ` James Bottomley
@ 2004-08-29 17:07     ` Jesse Barnes
  2004-08-29 17:16       ` James Bottomley
  2004-08-29 17:24       ` James Bottomley
  0 siblings, 2 replies; 16+ messages in thread
From: Jesse Barnes @ 2004-08-29 17:07 UTC (permalink / raw)
  To: James Bottomley
  Cc: Andrew Morton, Linus Torvalds, Matthew Dobson, Nick Piggin,
	Linux Kernel

[-- Attachment #1: Type: text/plain, Size: 881 bytes --]

On Sunday, August 29, 2004 9:58 am, James Bottomley wrote:
> On Sun, 2004-08-29 at 12:48, Jesse Barnes wrote:
> > But I think this breaks what the code is supposed to do.  You're right
> > that we shouldn't use cpu_online_map, but we should leave the nodemask in
> > there and fix the code that sets it in the non-NUMA case instead.
>
> Well, let's say it puts back the original behaviour. If you look at even
> the NUMA code before these changes, it had cpu_possible_map in there.
>
> I totally agree about fixing NUMA, it looks completely broken to me in
> the way it handles cpu maps because node_to_cpumask(i) needs to expand
> to cpu_possible_map for initialisation and cpu_online_map for
> operation.  Has anyone ever checked NUMA for hotplug CPU?

I've up and downed a few CPUs on an Altix, and it seems to work ok, but that's 
a pretty basic test.  How about this?

Jesse


[-- Attachment #2: node-to-cpumask-fix.patch --]
[-- Type: text/x-diff, Size: 634 bytes --]

===== include/asm-generic/topology.h 1.6 vs edited =====
--- 1.6/include/asm-generic/topology.h	2004-02-03 21:35:17 -08:00
+++ edited/include/asm-generic/topology.h	2004-08-29 10:06:17 -07:00
@@ -36,13 +36,13 @@
 #define parent_node(node)	(0)
 #endif
 #ifndef node_to_cpumask
-#define node_to_cpumask(node)	(cpu_online_map)
+#define node_to_cpumask(node)	(cpu_possible_map)
 #endif
 #ifndef node_to_first_cpu
 #define node_to_first_cpu(node)	(0)
 #endif
 #ifndef pcibus_to_cpumask
-#define pcibus_to_cpumask(bus)	(cpu_online_map)
+#define pcibus_to_cpumask(bus)	(cpu_possible_map)
 #endif
 
 /* Cross-node load balancing interval. */

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: SMP Panic caused by [PATCH] sched: consolidate sched domains
  2004-08-29 17:03   ` William Lee Irwin III
@ 2004-08-29 17:09     ` James Bottomley
  2004-08-29 17:22       ` William Lee Irwin III
  0 siblings, 1 reply; 16+ messages in thread
From: James Bottomley @ 2004-08-29 17:09 UTC (permalink / raw)
  To: William Lee Irwin III
  Cc: Jesse Barnes, Andrew Morton, Linus Torvalds, Matthew Dobson,
	Nick Piggin, Linux Kernel

On Sun, 2004-08-29 at 13:03, William Lee Irwin III wrote:
> -#define node_to_cpumask(node)	(cpu_online_map)
> +#define node_to_cpumask(node)	(cpu_possible_map)

I really don't think so.  This macro is also used at runtime, so there
it would return CPUs that aren't online.

It does look like all runtime uses in sched.c pass node_to_cpumask
through any_online_cpu(), so at least for the scheduler, the change may
be safe, but you'd have to audit all other runtime uses.

James

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: SMP Panic caused by [PATCH] sched: consolidate sched domains
  2004-08-29 17:07     ` Jesse Barnes
@ 2004-08-29 17:16       ` James Bottomley
  2004-08-30 19:11         ` Matthew Dobson
  2004-08-29 17:24       ` James Bottomley
  1 sibling, 1 reply; 16+ messages in thread
From: James Bottomley @ 2004-08-29 17:16 UTC (permalink / raw)
  To: Jesse Barnes
  Cc: Andrew Morton, Linus Torvalds, Matthew Dobson, Nick Piggin,
	Linux Kernel

On Sun, 2004-08-29 at 13:07, Jesse Barnes wrote:
> I've up and downed a few CPUs on an Altix, and it seems to work ok, but that's 
> a pretty basic test.  How about this?

Well, like I told Bill.  It's not a priori correct because now you're
altering runtime behaviour.

It may, in fact, work because if the runtime users have an additional
restriction to online cpus, but that's not a given ... have you audited
the code for this?

James



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: SMP Panic caused by [PATCH] sched: consolidate sched domains
  2004-08-29 17:09     ` James Bottomley
@ 2004-08-29 17:22       ` William Lee Irwin III
  2004-08-29 17:29         ` William Lee Irwin III
  0 siblings, 1 reply; 16+ messages in thread
From: William Lee Irwin III @ 2004-08-29 17:22 UTC (permalink / raw)
  To: James Bottomley
  Cc: Jesse Barnes, Andrew Morton, Linus Torvalds, Matthew Dobson,
	Nick Piggin, Linux Kernel

On Sun, 2004-08-29 at 13:03, William Lee Irwin III wrote:
>> -#define node_to_cpumask(node)	(cpu_online_map)
>> +#define node_to_cpumask(node)	(cpu_possible_map)

On Sun, Aug 29, 2004 at 01:09:49PM -0400, James Bottomley wrote:
> I really don't think so.  This macro is also used at runtime, so there
> it would return CPUs that aren't online.
> It does look like all runtime uses in sched.c pass node_to_cpumask
> through any_online_cpu(), so at least for the scheduler, the change may
> be safe, but you'd have to audit all other runtime uses.

./drivers/base/node.c:22:	cpumask_t mask = node_to_cpumask(node_dev->sysdev.id);
./arch/ia64/sn/io/sn2/ml_SN_intr.c:63:		cpu = first_cpu(node_to_cpumask(cnode));
./arch/ia64/sn/io/sn2/ml_SN_intr.c:69:		cpu = first_cpu(node_to_cpumask(node));
./arch/x86_64/mm/numa.c:30:cpumask_t     node_to_cpumask[MAXNODE]; 
./arch/x86_64/mm/numa.c:163:	set_bit(0, &node_to_cpumask[cpu_to_node(0)]);
./arch/x86_64/mm/numa.c:234:	node_to_cpumask[0] = cpumask_of_cpu(0);
./arch/x86_64/mm/numa.c:242:		set_bit(cpu, &node_to_cpumask[cpu_to_node(cpu)]);
./arch/x86_64/mm/numa.c:277:EXPORT_SYMBOL(node_to_cpumask);
./arch/x86_64/pci/k8-bus.c:57:						node_to_cpumask(NODE_ID(nid));
./include/asm-ppc64/topology.h:24:static inline cpumask_t node_to_cpumask(int node)
./include/asm-ppc64/topology.h:32:	tmp = node_to_cpumask(node);
./include/asm-x86_64/topology.h:16:extern cpumask_t     node_to_cpumask[];
./include/asm-x86_64/topology.h:21:#define node_to_first_cpu(node) 	(__ffs(node_to_cpumask[node]))
./include/asm-x86_64/topology.h:22:#define node_to_cpumask(node)		(node_to_cpumask[node])
./include/asm-generic/topology.h:38:#ifndef node_to_cpumask
./include/asm-generic/topology.h:39:#define node_to_cpumask(node)	(cpu_possible_map)
./include/asm-mips/mach-ip27/topology.h:8:#define node_to_cpumask(node)	(HUB_DATA(node)->h_cpus)
./include/asm-mips/mach-ip27/topology.h:9:#define node_to_first_cpu(node)	(first_cpu(node_to_cpumask(node)))
./include/asm-ia64/topology.h:29:#define node_to_cpumask(node) (node_to_cpu_mask[node])
./include/asm-ia64/topology.h:41:#define node_to_first_cpu(node) (__ffs(node_to_cpumask(node)))
./include/linux/topology.h:41:		__tmp__ = node_to_cpumask(node);				\
./include/asm-alpha/topology.h:25:static inline cpumask_t node_to_cpumask(int node)
./include/asm-i386/topology.h:51:static inline cpumask_t node_to_cpumask(int node)
./include/asm-i386/topology.h:59:	cpumask_t mask = node_to_cpumask(node);
./include/asm-i386/topology.h:66:	return node_to_cpumask(mp_bus_id_to_node[bus]);
./kernel/sched.c:3810:		mask = node_to_cpumask(node);
./kernel/sched.c:4059:		nodemask = node_to_cpumask(next_node);
./kernel/sched.c:4187:		cpumask_t nodemask = node_to_cpumask(cpu_to_node(i));
./kernel/sched.c:4267:		cpumask_t nodemask = node_to_cpumask(i);
./mm/page_alloc.c:1324:		tmp = node_to_cpumask(n);
./mm/vmscan.c:1127:	cpumask = node_to_cpumask(pgdat->node_id);
./mm/vmscan.c:1214:			mask = node_to_cpumask(pgdat->node_id);

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: SMP Panic caused by [PATCH] sched: consolidate sched domains
  2004-08-29 17:07     ` Jesse Barnes
  2004-08-29 17:16       ` James Bottomley
@ 2004-08-29 17:24       ` James Bottomley
  2004-08-29 17:48         ` Nathan Lynch
  1 sibling, 1 reply; 16+ messages in thread
From: James Bottomley @ 2004-08-29 17:24 UTC (permalink / raw)
  To: Jesse Barnes
  Cc: Andrew Morton, Linus Torvalds, Matthew Dobson, Nick Piggin,
	Linux Kernel

On Sun, 2004-08-29 at 13:07, Jesse Barnes wrote:
> I've up and downed a few CPUs on an Altix, and it seems to work ok, but that's 
> a pretty basic test.  How about this?

Incidentally, down and up tests won't pick up these initialisation
problems because the SMP paths will already have created the start of
day data structures for these CPUs.  You need to boot with the CPU down
and bring it up after boot to see the issues.

James



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: SMP Panic caused by [PATCH] sched: consolidate sched domains
  2004-08-29 17:22       ` William Lee Irwin III
@ 2004-08-29 17:29         ` William Lee Irwin III
  2004-08-29 17:40           ` William Lee Irwin III
  0 siblings, 1 reply; 16+ messages in thread
From: William Lee Irwin III @ 2004-08-29 17:29 UTC (permalink / raw)
  To: James Bottomley
  Cc: Jesse Barnes, Andrew Morton, Linus Torvalds, Matthew Dobson,
	Nick Piggin, Linux Kernel

> On Sun, 2004-08-29 at 13:03, William Lee Irwin III wrote:
> >> -#define node_to_cpumask(node)	(cpu_online_map)
> >> +#define node_to_cpumask(node)	(cpu_possible_map)
> 
> On Sun, Aug 29, 2004 at 01:09:49PM -0400, James Bottomley wrote:
> > I really don't think so.  This macro is also used at runtime, so there
> > it would return CPUs that aren't online.
> > It does look like all runtime uses in sched.c pass node_to_cpumask
> > through any_online_cpu(), so at least for the scheduler, the change may
> > be safe, but you'd have to audit all other runtime uses.

On Sun, Aug 29, 2004 at 10:22:50AM -0700, William Lee Irwin III wrote:
> ./drivers/base/node.c:22:	cpumask_t mask = node_to_cpumask(node_dev->sysdev.id);
> ./arch/ia64/sn/io/sn2/ml_SN_intr.c:63:		cpu = first_cpu(node_to_cpumask(cnode));

Okay, how about:


Index: wait-2.6.9-rc1-mm1/kernel/sched.c
===================================================================
--- wait-2.6.9-rc1-mm1.orig/kernel/sched.c	2004-08-28 11:41:47.000000000 -0700
+++ wait-2.6.9-rc1-mm1/kernel/sched.c	2004-08-29 10:25:00.468546856 -0700
@@ -4262,16 +4262,21 @@
 						&cpu_to_isolated_group);
 	}
 
-	/* Set up physical groups */
-	for (i = 0; i < MAX_NUMNODES; i++) {
-		cpumask_t nodemask = node_to_cpumask(i);
-
-		cpus_and(nodemask, nodemask, cpu_default_map);
-		if (cpus_empty(nodemask))
-			continue;
+	if (MAX_NUMNODES == 1)
+		init_sched_build_groups(sched_group_phys, cpu_possible_map,
+							&cpu_to_phys_group);
+	else {
+		/* Set up physical groups */
+		for (i = 0; i < MAX_NUMNODES; i++) {
+			cpumask_t nodemask = node_to_cpumask(i);
+
+			cpus_and(nodemask, nodemask, cpu_default_map);
+			if (cpus_empty(nodemask))
+				continue;
 
-		init_sched_build_groups(sched_group_phys, nodemask,
+			init_sched_build_groups(sched_group_phys, nodemask,
 						&cpu_to_phys_group);
+		}
 	}
 
 #ifdef CONFIG_NUMA

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: SMP Panic caused by [PATCH] sched: consolidate sched domains
  2004-08-29 17:29         ` William Lee Irwin III
@ 2004-08-29 17:40           ` William Lee Irwin III
  2004-08-29 17:50             ` William Lee Irwin III
  0 siblings, 1 reply; 16+ messages in thread
From: William Lee Irwin III @ 2004-08-29 17:40 UTC (permalink / raw)
  To: James Bottomley
  Cc: Jesse Barnes, Andrew Morton, Linus Torvalds, Matthew Dobson,
	Nick Piggin, Linux Kernel

On Sun, Aug 29, 2004 at 10:29:23AM -0700, William Lee Irwin III wrote:
> Okay, how about:

Okay, if you prefer the #ifdef:


Index: wait-2.6.9-rc1-mm1/kernel/sched.c
===================================================================
--- wait-2.6.9-rc1-mm1.orig/kernel/sched.c	2004-08-28 11:41:47.000000000 -0700
+++ wait-2.6.9-rc1-mm1/kernel/sched.c	2004-08-29 10:37:04.210521352 -0700
@@ -4262,6 +4262,7 @@
 						&cpu_to_isolated_group);
 	}
 
+#ifdef CONFIG_NUMA
 	/* Set up physical groups */
 	for (i = 0; i < MAX_NUMNODES; i++) {
 		cpumask_t nodemask = node_to_cpumask(i);
@@ -4273,6 +4274,10 @@
 		init_sched_build_groups(sched_group_phys, nodemask,
 						&cpu_to_phys_group);
 	}
+#else
+	init_sched_build_groups(sched_group_phys, cpu_possible_map,
+							&cpu_to_phys_group);
+#endif
 
 #ifdef CONFIG_NUMA
 	/* Set up node groups */

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: SMP Panic caused by [PATCH] sched: consolidate sched domains
  2004-08-29 17:24       ` James Bottomley
@ 2004-08-29 17:48         ` Nathan Lynch
  2004-08-29 22:59           ` James Bottomley
  0 siblings, 1 reply; 16+ messages in thread
From: Nathan Lynch @ 2004-08-29 17:48 UTC (permalink / raw)
  To: James Bottomley
  Cc: Jesse Barnes, Andrew Morton, Linus Torvalds, Matthew Dobson,
	Nick Piggin, Linux Kernel

On Sun, 2004-08-29 at 12:24, James Bottomley wrote:
> On Sun, 2004-08-29 at 13:07, Jesse Barnes wrote:
> > I've up and downed a few CPUs on an Altix, and it seems to work ok, but that's 
> > a pretty basic test.  How about this?
> 
> Incidentally, down and up tests won't pick up these initialisation
> problems because the SMP paths will already have created the start of
> day data structures for these CPUs.  You need to boot with the CPU down
> and bring it up after boot to see the issues.

I've got a patch which reinitializes sched domains at cpu hotplug time. 
We need something like this on ppc64 for partitioned systems (we run
into the same issue when adding a cpu which wasn't present at boot).  I
had been waiting to post it until some cpu hotplug issues with preempt
were solved, but it seems it would help the case of hotplugging
secondary cpus at boot, so I'll submit that soon.

Nathan



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: SMP Panic caused by [PATCH] sched: consolidate sched domains
  2004-08-29 17:40           ` William Lee Irwin III
@ 2004-08-29 17:50             ` William Lee Irwin III
  2004-08-29 23:13               ` James Bottomley
  0 siblings, 1 reply; 16+ messages in thread
From: William Lee Irwin III @ 2004-08-29 17:50 UTC (permalink / raw)
  To: James Bottomley
  Cc: Jesse Barnes, Andrew Morton, Linus Torvalds, Matthew Dobson,
	Nick Piggin, Linux Kernel

On Sun, Aug 29, 2004 at 10:40:39AM -0700, William Lee Irwin III wrote:
> Okay, if you prefer the #ifdef:

And for the other half of it:


Index: wait-2.6.9-rc1-mm1/kernel/sched.c
===================================================================
--- wait-2.6.9-rc1-mm1.orig/kernel/sched.c	2004-08-28 11:41:47.000000000 -0700
+++ wait-2.6.9-rc1-mm1/kernel/sched.c	2004-08-29 10:46:52.543081208 -0700
@@ -4224,7 +4224,11 @@
 		sd = &per_cpu(phys_domains, i);
 		group = cpu_to_phys_group(i);
 		*sd = SD_CPU_INIT;
+#ifdef CONFIG_NUMA
 		sd->span = nodemask;
+#else
+		sd->span = cpu_possible_map;
+#endif
 		sd->parent = p;
 		sd->groups = &sched_group_phys[group];
 
@@ -4262,6 +4266,7 @@
 						&cpu_to_isolated_group);
 	}
 
+#ifdef CONFIG_NUMA
 	/* Set up physical groups */
 	for (i = 0; i < MAX_NUMNODES; i++) {
 		cpumask_t nodemask = node_to_cpumask(i);
@@ -4273,6 +4278,10 @@
 		init_sched_build_groups(sched_group_phys, nodemask,
 						&cpu_to_phys_group);
 	}
+#else
+	init_sched_build_groups(sched_group_phys, cpu_possible_map,
+							&cpu_to_phys_group);
+#endif
 
 #ifdef CONFIG_NUMA
 	/* Set up node groups */

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: SMP Panic caused by [PATCH] sched: consolidate sched domains
  2004-08-29 17:48         ` Nathan Lynch
@ 2004-08-29 22:59           ` James Bottomley
  0 siblings, 0 replies; 16+ messages in thread
From: James Bottomley @ 2004-08-29 22:59 UTC (permalink / raw)
  To: Nathan Lynch
  Cc: Jesse Barnes, Andrew Morton, Linus Torvalds, Matthew Dobson,
	Nick Piggin, Linux Kernel

On Sun, 2004-08-29 at 13:48, Nathan Lynch wrote:
> I've got a patch which reinitializes sched domains at cpu hotplug time. 
> We need something like this on ppc64 for partitioned systems (we run
> into the same issue when adding a cpu which wasn't present at boot).  I
> had been waiting to post it until some cpu hotplug issues with preempt
> were solved, but it seems it would help the case of hotplugging
> secondary cpus at boot, so I'll submit that soon.

Well, but the only time you should need to alter a priori knowledge like
this is if you're actually altering the NUMA topology, isn't it?  Simply
bringing up a CPU in a known numa system shouldn't need to alter
scheduling domain information on CPU hotplug because the cpu
automatically becomes part of an existing domain (where it was
originally accounted for as missing).

However, if you hotplug a numa node, then you're adding to the
scheduling domain and would thus need to initialise the new node and its
CPUs.

James

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: SMP Panic caused by [PATCH] sched: consolidate sched domains
  2004-08-29 17:50             ` William Lee Irwin III
@ 2004-08-29 23:13               ` James Bottomley
  0 siblings, 0 replies; 16+ messages in thread
From: James Bottomley @ 2004-08-29 23:13 UTC (permalink / raw)
  To: William Lee Irwin III
  Cc: Jesse Barnes, Andrew Morton, Linus Torvalds, Matthew Dobson,
	Nick Piggin, Linux Kernel

On Sun, 2004-08-29 at 13:50, William Lee Irwin III wrote:
> On Sun, Aug 29, 2004 at 10:40:39AM -0700, William Lee Irwin III wrote:
> > Okay, if you prefer the #ifdef:
> 
> And for the other half of it:

Much better ... this one looks fine to me and boots OK on an SMP parisc
box.

James



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: SMP Panic caused by [PATCH] sched: consolidate sched domains
  2004-08-29 17:16       ` James Bottomley
@ 2004-08-30 19:11         ` Matthew Dobson
  0 siblings, 0 replies; 16+ messages in thread
From: Matthew Dobson @ 2004-08-30 19:11 UTC (permalink / raw)
  To: James Bottomley
  Cc: Jesse Barnes, Andrew Morton, Linus Torvalds, Nick Piggin,
	Linux Kernel

On Sun, 2004-08-29 at 10:16, James Bottomley wrote:
> On Sun, 2004-08-29 at 13:07, Jesse Barnes wrote:
> > I've up and downed a few CPUs on an Altix, and it seems to work ok, but that's 
> > a pretty basic test.  How about this?
> 
> Well, like I told Bill.  It's not a priori correct because now you're
> altering runtime behaviour.
> 
> It may, in fact, work because if the runtime users have an additional
> restriction to online cpus, but that's not a given ... have you audited
> the code for this?
> 
> James

Yeah, I don't think that patch will work so well, unless we're very
careful to mask the return values of node_to_cpumask() and
pcibus_to_cpumask() against cpu_online_map where appropriate.  There are
certainly callers of these function who are expecting a map containing
online cpus.

-Matt


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2004-08-30 19:12 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-08-29 13:39 SMP Panic caused by [PATCH] sched: consolidate sched domains James Bottomley
2004-08-29 16:48 ` Jesse Barnes
2004-08-29 16:58   ` James Bottomley
2004-08-29 17:07     ` Jesse Barnes
2004-08-29 17:16       ` James Bottomley
2004-08-30 19:11         ` Matthew Dobson
2004-08-29 17:24       ` James Bottomley
2004-08-29 17:48         ` Nathan Lynch
2004-08-29 22:59           ` James Bottomley
2004-08-29 17:03   ` William Lee Irwin III
2004-08-29 17:09     ` James Bottomley
2004-08-29 17:22       ` William Lee Irwin III
2004-08-29 17:29         ` William Lee Irwin III
2004-08-29 17:40           ` William Lee Irwin III
2004-08-29 17:50             ` William Lee Irwin III
2004-08-29 23:13               ` James Bottomley

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox