public inbox for linux-kernel@vger.kernel.org
* sched_domains + NUMA issue
@ 2004-08-29 11:18 Anton Blanchard
  2004-08-29 16:40 ` Jesse Barnes
  2004-08-29 16:48 ` William Lee Irwin III
  0 siblings, 2 replies; 4+ messages in thread
From: Anton Blanchard @ 2004-08-29 11:18 UTC (permalink / raw)
  To: linux-kernel; +Cc: nickpiggin, nathanl, jbarnes


Hi,

We are seeing errors in the sched domains debug code when SMT + NUMA is
enabled. Nathan pointed out that the recent change to limit the number
of nodes in a scheduling group may be causing this - in particular
sched_domain_node_span.

It looks like ia64 is the only architecture implementing a reasonable
node_distance; the others just do:

#define node_distance(from,to) (from != to)

On these architectures I wonder if we should disable the
sched_domain_node_span code since we will just get a random grouping of
cpus.
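To illustrate the point (a user-space sketch, not kernel code — the helper
name `nearest_remote_node` is invented here, it only stands in for the
nearest-node search that sched_domain_node_span has to perform): with the
fallback macro every remote node ties at distance 1, so whichever node is
scanned first wins, regardless of the real topology.

```c
#include <assert.h>
#include <limits.h>

/* Fallback used by most architectures: 0 to self, 1 to any other node. */
#define node_distance(from,to) (from != to)

/* Hypothetical stand-in for the nearest-node search in
 * sched_domain_node_span.  With the fallback macro all remote nodes tie
 * at distance 1, so the first candidate scanned always wins and the
 * resulting grouping is effectively arbitrary. */
static int nearest_remote_node(int node, int numnodes)
{
	int n, best = -1, best_dist = INT_MAX;

	for (n = 0; n < numnodes; n++) {
		if (n == node)
			continue;
		if (node_distance(node, n) < best_dist) {
			best_dist = node_distance(node, n);
			best = n;
		}
	}
	return best;
}
```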

Anton

CPU0:  online
 domain 0: span 00000000,00000000,00000000,00000003
  groups: 00000000,00000000,00000000,00000001 00000000,00000000,00000000,00000002
  domain 1: span 00000000,00000000,00000000,00000003
   groups: 00000000,00000000,00000000,00000003
   domain 2: span 00000000,00000000,00000000,000f0003
    groups: 00000000,00000000,00000000,00000003 00000000,00000000,00000000,000f0000
CPU1:  online
 domain 0: span 00000000,00000000,00000000,00000003
  groups: 00000000,00000000,00000000,00000002 00000000,00000000,00000000,00000001
  domain 1: span 00000000,00000000,00000000,00000003
   groups: 00000000,00000000,00000000,00000003
ERROR parent span is not a superset of domain->span
   domain 2: span 00000000,00000000,00000000,000f0000
ERROR domain->span does not contain CPU1
    groups: 00000000,00000000,00000000,00000003 00000000,00000000,00000000,000f0000
ERROR groups don't span domain->span
CPU16:  online
 domain 0: span 00000000,00000000,00000000,00030000
  groups: 00000000,00000000,00000000,00010000 00000000,00000000,00000000,00020000
  domain 1: span 00000000,00000000,00000000,000f0000
   groups: 00000000,00000000,00000000,00030000 00000000,00000000,00000000,000c0000
   domain 2: span 00000000,00000000,00000000,000f0003
    groups: 00000000,00000000,00000000,000f0000 00000000,00000000,00000000,00000003
CPU17:  online
 domain 0: span 00000000,00000000,00000000,00030000
  groups: 00000000,00000000,00000000,00020000 00000000,00000000,00000000,00010000
  domain 1: span 00000000,00000000,00000000,000f0000
   groups: 00000000,00000000,00000000,00030000 00000000,00000000,00000000,000c0000
   domain 2: span 00000000,00000000,00000000,000f0000
    groups: 00000000,00000000,00000000,000f0000 00000000,00000000,00000000,00000003
ERROR groups don't span domain->span
CPU18:  online
 domain 0: span 00000000,00000000,00000000,000c0000
  groups: 00000000,00000000,00000000,00040000 00000000,00000000,00000000,00080000
  domain 1: span 00000000,00000000,00000000,000f0000
   groups: 00000000,00000000,00000000,000c0000 00000000,00000000,00000000,00030000
ERROR parent span is not a superset of domain->span
   domain 2: span 00000000,00000000,00000000,00000000
ERROR domain->span does not contain CPU18
    groups: 00000000,00000000,00000000,000f0000 00000000,00000000,00000000,00000003
ERROR groups don't span domain->span
CPU19:  online
 domain 0: span 00000000,00000000,00000000,000c0000
  groups: 00000000,00000000,00000000,00080000 00000000,00000000,00000000,00040000
  domain 1: span 00000000,00000000,00000000,000f0000
   groups: 00000000,00000000,00000000,000c0000 00000000,00000000,00000000,00030000
ERROR parent span is not a superset of domain->span
   domain 2: span 00000000,00000000,00000000,00000000
ERROR domain->span does not contain CPU19
    groups: 00000000,00000000,00000000,000f0000 00000000,00000000,00000000,00000003
ERROR groups don't span domain->span

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: sched_domains + NUMA issue
  2004-08-29 11:18 sched_domains + NUMA issue Anton Blanchard
@ 2004-08-29 16:40 ` Jesse Barnes
  2004-08-29 16:48 ` William Lee Irwin III
  1 sibling, 0 replies; 4+ messages in thread
From: Jesse Barnes @ 2004-08-29 16:40 UTC (permalink / raw)
  To: Anton Blanchard; +Cc: linux-kernel, nickpiggin, nathanl

On Sunday, August 29, 2004 4:18 am, Anton Blanchard wrote:
> Hi,
>
> We are seeing errors in the sched domains debug code when SMT + NUMA is
> enabled. Nathan pointed out that the recent change to limit the number
> of nodes in a scheduling group may be causing this - in particular
> sched_domain_node_span.
>
> It looks like ia64 is the only architecture implementing a reasonable
> node_distance; the others just do:
>
> #define node_distance(from,to) (from != to)
>
> On these architectures I wonder if we should disable the
> sched_domain_node_span code since we will just get a random grouping of
> cpus.

Hmm... for now that's probably a good idea.  There's no CONFIG_NUMA_* value we 
could key off of to figure out if node_distance is sane, so it's probably our 
only option.

Jesse


* Re: sched_domains + NUMA issue
  2004-08-29 11:18 sched_domains + NUMA issue Anton Blanchard
  2004-08-29 16:40 ` Jesse Barnes
@ 2004-08-29 16:48 ` William Lee Irwin III
  2004-08-30 21:34   ` NUMA for abstract bus-masters (was Re: sched_domains + NUMA issue) Guennadi Liakhovetski
  1 sibling, 1 reply; 4+ messages in thread
From: William Lee Irwin III @ 2004-08-29 16:48 UTC (permalink / raw)
  To: Anton Blanchard; +Cc: linux-kernel, nickpiggin, nathanl, jbarnes

On Sun, Aug 29, 2004 at 09:18:55PM +1000, Anton Blanchard wrote:
> We are seeing errors in the sched domains debug code when SMT + NUMA is
> enabled. Nathan pointed out that the recent change to limit the number
> of nodes in a scheduling group may be causing this - in particular
> sched_domain_node_span.
> It looks like ia64 is the only architecture implementing a reasonable
> node_distance; the others just do:
> #define node_distance(from,to) (from != to)
> On these architectures I wonder if we should disable the
> sched_domain_node_span code since we will just get a random grouping of
> cpus.

For fsck's sake... macro writers need to exercise more discipline.


Index: wait-2.6.9-rc1-mm1/include/linux/topology.h
===================================================================
--- wait-2.6.9-rc1-mm1.orig/include/linux/topology.h	2004-08-24 00:03:18.000000000 -0700
+++ wait-2.6.9-rc1-mm1/include/linux/topology.h	2004-08-29 09:44:35.932705488 -0700
@@ -55,7 +55,7 @@
 	for (node = 0; node < numnodes; node = __next_node_with_cpus(node))
 
 #ifndef node_distance
-#define node_distance(from,to)	(from != to)
+#define node_distance(from,to)	((from) != (to))
 #endif
 #ifndef PENALTY_FOR_NODE_WITH_CPUS
 #define PENALTY_FOR_NODE_WITH_CPUS	(1)
Index: wait-2.6.9-rc1-mm1/include/asm-ia64/numa.h
===================================================================
--- wait-2.6.9-rc1-mm1.orig/include/asm-ia64/numa.h	2004-08-24 00:02:26.000000000 -0700
+++ wait-2.6.9-rc1-mm1/include/asm-ia64/numa.h	2004-08-29 09:45:07.223948496 -0700
@@ -59,7 +59,7 @@
  */
 
 extern u8 numa_slit[MAX_NUMNODES * MAX_NUMNODES];
-#define node_distance(from,to) (numa_slit[from * numnodes + to])
+#define node_distance(from,to) (numa_slit[(from) * numnodes + (to)])
 
 extern int paddr_to_nid(unsigned long paddr);
 
Index: wait-2.6.9-rc1-mm1/include/asm-i386/topology.h
===================================================================
--- wait-2.6.9-rc1-mm1.orig/include/asm-i386/topology.h	2004-08-24 00:02:20.000000000 -0700
+++ wait-2.6.9-rc1-mm1/include/asm-i386/topology.h	2004-08-29 09:45:24.973250192 -0700
@@ -67,7 +67,7 @@
 }
 
 /* Node-to-Node distance */
-#define node_distance(from, to) (from != to)
+#define node_distance(from, to) ((from) != (to))
 
 /* Cross-node load balancing interval. */
 #define NODE_BALANCE_RATE 100
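A user-space sketch of the bug class the patch guards against (the toy SLIT
contents and the `_old`/`_new` macro names are invented for illustration):
with the ia64 form unparenthesized, passing an expression like `a + 1` as
`from` lets `*` bind before `+` and indexes the wrong slot.

```c
#include <assert.h>

#define NUMNODES 4

/* Toy SLIT: each slot stores its own (row, column) as row*10+col, so a
 * wrong index is easy to spot. */
static unsigned char numa_slit[NUMNODES * NUMNODES];

static void init_slit(void)
{
	int f, t;

	for (f = 0; f < NUMNODES; f++)
		for (t = 0; t < NUMNODES; t++)
			numa_slit[f * NUMNODES + t] = (unsigned char)(f * 10 + t);
}

/* Before the patch: arguments substituted verbatim, precedence bites. */
#define node_distance_old(from,to) (numa_slit[from * NUMNODES + to])
/* After the patch: arguments parenthesized, expansion is safe. */
#define node_distance_new(from,to) (numa_slit[(from) * NUMNODES + (to)])
```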


* NUMA for abstract bus-masters (was Re: sched_domains + NUMA issue)
  2004-08-29 16:48 ` William Lee Irwin III
@ 2004-08-30 21:34   ` Guennadi Liakhovetski
  0 siblings, 0 replies; 4+ messages in thread
From: Guennadi Liakhovetski @ 2004-08-30 21:34 UTC (permalink / raw)
  To: linux-kernel

Hi

I recently raised this discussion on the ARM kernel mailing list, so I just
want to check whether it would raise any interest here.

I was thinking about architectures where multiple memory (RAM) pools exist
on different buses, and the various bus-masters in the system have
different distances to the various RAMs. Distance could be defined, say,
as the number of bridges to cross if that RAM is accessible at all, or
infinity otherwise.

The API would, on the one hand, allow registering such RAM pools at their
different locations; on the other hand, requests for RAM that carry a
device pointer ([dma|pci]_alloc_*) would try to find the nearest RAM
available.

And one would have to optimise the allocated buffers for n (typically 2)
bus-masters (e.g., a CPU and a device) with various weights...

And the idea would be to re-use (some of) the NUMA framework.
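A minimal user-space sketch of what such an API might look like — every
name and structure here (`ram_pool`, `register_ram_pool`, `nearest_pool`)
is invented for illustration and is not an existing kernel interface:

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

#define MAX_POOLS	4
#define MAX_MASTERS	4
#define DIST_INF	INT_MAX	/* pool not reachable from that master */

/* One RAM pool with its per-bus-master distance (bridges to cross). */
struct ram_pool {
	const char *name;
	int dist[MAX_MASTERS];
};

static struct ram_pool pools[MAX_POOLS];
static int npools;

/* Register a pool and its distance vector; returns 0 on success. */
static int register_ram_pool(const char *name, const int dist[MAX_MASTERS])
{
	if (npools >= MAX_POOLS)
		return -1;
	pools[npools].name = name;
	memcpy(pools[npools].dist, dist, sizeof(pools[npools].dist));
	npools++;
	return 0;
}

/* What a dma_alloc_*-style request could do with its device pointer:
 * pick the reachable pool with the smallest distance to the master. */
static const char *nearest_pool(int master)
{
	int i, best = DIST_INF;
	const char *name = NULL;

	for (i = 0; i < npools; i++) {
		if (pools[i].dist[master] < best) {
			best = pools[i].dist[master];
			name = pools[i].name;
		}
	}
	return name;	/* NULL if nothing is reachable */
}
```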

This would be another approach to tackling the problems addressed by
James' dma_declare_coherent_memory patch.

So, does this sound at all reasonable? Does anybody find it useful?

Thanks
Guennadi
---
Guennadi Liakhovetski


