* [RFC] rethinking the topology functions
@ 2002-11-29 19:18 James Bottomley
2002-11-29 20:18 ` Martin J. Bligh
2002-12-05 2:02 ` Matthew Dobson
0 siblings, 2 replies; 3+ messages in thread
From: James Bottomley @ 2002-11-29 19:18 UTC
To: linux-kernel; +Cc: Martin J. Bligh, wli, mochel, James.Bottomley
[-- Attachment #1: Type: text/plain, Size: 1126 bytes --]
The attached represents an initial stab at implementing topology functions (or
actually indirecting topology through the subarchitecture features).
Getting this far made me realise that the current topology infrastructure is
rather inadequate (being geared towards the needs of NUMA machines).
All I really need for voyager is the concept of cpu_nodes (voyager CPU cards
have huge L3 caches and up to 4 CPUs each, so scheduling between CPU cards can
end up rather expensive in terms of cache invalidation). I have no use for
memory affinities since the voyager memory map is uniform.
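As a rough illustration of the cpu_nodes idea described above, the sketch below keeps two small maps: one from CPU card to a bitmask of its CPUs, and one from CPU back to its card. All names here (card_to_cpu_mask, cpu_to_card, map_cpu_to_card and so on) and both limits are hypothetical and chosen only for this example; they are not taken from the patch. Note that POSIX ffs() counts bits from 1 (0 means no bit set) while the kernel's __ffs() counts from 0, so the helper below returns a 0-based index explicitly.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_CARDS 8	/* hypothetical limit for this sketch */
#define MAX_CPUS  32	/* one bit per CPU in a 32-bit mask */

/* Zero-initialised, so an unmapped CPU reads as being on card 0. */
static uint32_t card_to_cpu_mask[MAX_CARDS];
static uint8_t cpu_to_card[MAX_CPUS];

/* Record that 'cpu' sits on processor card 'card'. */
static void map_cpu_to_card(int cpu, int card)
{
	assert(cpu >= 0 && cpu < MAX_CPUS);
	assert(card >= 0 && card < MAX_CARDS);
	card_to_cpu_mask[card] |= UINT32_C(1) << cpu;
	cpu_to_card[cpu] = (uint8_t)card;
}

/* Return the 0-based index of the lowest set bit, or -1 if mask is 0. */
static int lowest_set_bit(uint32_t mask)
{
	int bit;

	for (bit = 0; bit < 32; bit++)
		if (mask & (UINT32_C(1) << bit))
			return bit;
	return -1;
}

static int card_of_cpu(int cpu)
{
	return cpu_to_card[cpu];
}

/* First CPU on 'card', or -1 if the card has no CPUs. */
static int first_cpu_on_card(int card)
{
	return lowest_set_bit(card_to_cpu_mask[card]);
}

/* A scheduler could treat migration between CPUs on one card as cheap. */
static int same_card(int a, int b)
{
	return cpu_to_card[a] == cpu_to_card[b];
}
```

Mapping CPUs 0 and 4 to card 0 and CPU 1 to card 1 would then make same_card(0, 4) true and same_card(0, 1) false, which is the distinction a cache-aware scheduler would care about.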
I'd like to rework the current sysfs cpu/node pieces to provide two separate
topologies (one for CPU and one for memory).
Ultimately, the scheduler could be tuned to use the topologies to make
scheduling decisions. When that happens, we can probably fold the current
Pentium Hyperthreading stuff into a simple topology map as well.
I believe Martin Bligh and Bill Irwin are working (or at least thinking)
somewhat along these lines, so I thought I'd gather feedback before jumping
into a wholesale rewrite.
James Bottomley
[-- Attachment #2: tmp.diff --]
[-- Type: text/plain , Size: 14297 bytes --]
# This is a BitKeeper generated patch for the following project:
# Project Name: Linux kernel tree
# This patch format is intended for GNU patch command version 2.5 or higher.
# This patch includes the following deltas:
# ChangeSet 1.934 -> 1.935
# include/asm-i386/voyager.h 1.3 -> 1.4
# arch/i386/Kconfig 1.16 -> 1.17
# drivers/base/node.c 1.3 -> 1.4
# include/asm-i386/topology.h 1.2 -> 1.3
# include/asm-i386/numnodes.h 1.2 -> 1.3
# arch/i386/mach-voyager/voyager_cat.c 1.7 -> 1.8
# include/asm-i386/vic.h 1.4 -> 1.5
# arch/i386/mach-voyager/Makefile 1.9 -> 1.10
# drivers/base/Makefile 1.16 -> 1.17
# (new) -> 1.1 arch/i386/mach-generic/machine_topology.h
# (new) -> 1.1 arch/i386/mach-voyager/topology.c
# (new) -> 1.1 arch/i386/mach-voyager/machine_topology.h
#
# The following is the BitKeeper ChangeSet Log
# --------------------------------------------
# 02/11/29 jejb@malley.(none) 1.935
# add topology to voyager
# --------------------------------------------
#
diff -Nru a/arch/i386/Kconfig b/arch/i386/Kconfig
--- a/arch/i386/Kconfig Fri Nov 29 13:01:29 2002
+++ b/arch/i386/Kconfig Fri Nov 29 13:01:29 2002
@@ -1698,3 +1698,13 @@
bool
depends on SMP
default y
+
+config BASE_NODE
+ bool
+ depends on NUMA || VOYAGER
+ default y
+
+config X86_NUMNODES
+ int
+ default "8" if VOYAGER
+ default "1" if !VOYAGER
diff -Nru a/arch/i386/mach-generic/machine_topology.h b/arch/i386/mach-generic/machine_topology.h
--- /dev/null Wed Dec 31 16:00:00 1969
+++ b/arch/i386/mach-generic/machine_topology.h Fri Nov 29 13:01:29 2002
@@ -0,0 +1,96 @@
+/*
+ * linux/include/asm-i386/topology.h
+ *
+ * Written by: Matthew Dobson, IBM Corporation
+ *
+ * Copyright (C) 2002, IBM Corp.
+ *
+ * All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or
+ * NON INFRINGEMENT. See the GNU General Public License for more
+ * details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ *
+ * Send feedback to <colpatch@us.ibm.com>
+ */
+#ifndef _MACHINE_TOPOLOGY_H
+#define _MACHINE_TOPOLOGY_H
+
+#ifdef CONFIG_X86_NUMAQ
+
+#include <asm/smpboot.h>
+
+/* Returns the number of the node containing CPU 'cpu' */
+#define __cpu_to_node(cpu) (cpu_to_logical_apicid(cpu) >> 4)
+
+/* Returns the number of the node containing MemBlk 'memblk' */
+#define __memblk_to_node(memblk) (memblk)
+
+/* Returns the number of the node containing Node 'node'. This architecture is flat,
+ so it is a pretty simple function! */
+#define __parent_node(node) (node)
+
+/* Returns the number of the first CPU on Node 'node'.
+ * This should be changed to a set of cached values
+ * but this will do for now.
+ */
+static inline int __node_to_first_cpu(int node)
+{
+ int i, cpu, logical_apicid = node << 4;
+
+ for(i = 1; i < 16; i <<= 1)
+ /* check to see if the cpu is in the system */
+ if ((cpu = logical_apicid_to_cpu(logical_apicid | i)) >= 0)
+ /* if yes, return it to caller */
+ return cpu;
+
+ BUG(); /* couldn't find a cpu on given node */
+ return -1;
+}
+
+/* Returns a bitmask of CPUs on Node 'node'.
+ * This should be changed to a set of cached bitmasks
+ * but this will do for now.
+ */
+static inline unsigned long __node_to_cpu_mask(int node)
+{
+ int i, cpu, logical_apicid = node << 4;
+ unsigned long mask = 0UL;
+
+ if (sizeof(unsigned long) * 8 < NR_CPUS)
+ BUG();
+
+ for(i = 1; i < 16; i <<= 1)
+ /* check to see if the cpu is in the system */
+ if ((cpu = logical_apicid_to_cpu(logical_apicid | i)) >= 0)
+ /* if yes, add to bitmask */
+ mask |= 1 << cpu;
+
+ return mask;
+}
+
+/* Returns the number of the first MemBlk on Node 'node' */
+#define __node_to_memblk(node) (node)
+
+#else /* !CONFIG_X86_NUMAQ */
+/*
+ * Other i386 platforms should define their own version of the
+ * above macros here.
+ */
+
+#include <asm-generic/topology.h>
+
+#endif /* CONFIG_X86_NUMAQ */
+
+#endif
diff -Nru a/arch/i386/mach-voyager/Makefile b/arch/i386/mach-voyager/Makefile
--- a/arch/i386/mach-voyager/Makefile Fri Nov 29 13:01:29 2002
+++ b/arch/i386/mach-voyager/Makefile Fri Nov 29 13:01:29 2002
@@ -10,7 +10,7 @@
EXTRA_CFLAGS += -I../kernel
export-objs :=
-obj-y := setup.o voyager_basic.o voyager_thread.o
+obj-y := setup.o voyager_basic.o voyager_thread.o topology.o
obj-$(CONFIG_SMP) += voyager_smp.o voyager_cat.o
diff -Nru a/arch/i386/mach-voyager/machine_topology.h b/arch/i386/mach-voyager/machine_topology.h
--- /dev/null Wed Dec 31 16:00:00 1969
+++ b/arch/i386/mach-voyager/machine_topology.h Fri Nov 29 13:01:29 2002
@@ -0,0 +1,45 @@
+/* -*- mode: c; c-basic-offset: 8 -*- */
+
+/* Copyright (C) 1999,2001
+ *
+ * Author: J.E.J.Bottomley@HansenPartnership.com
+ *
+ * linux/arch/i386/mach-voyager/machine_topology.h
+ */
+#include <asm/voyager.h>
+
+extern u32 voyager_node_to_cpu_mask[MAX_PROCESSOR_BOARDS];
+extern u8 voyager_cpu_to_node[NR_CPUS];
+extern u8 voyager_num_nodes;
+
+static inline u8 num_online_nodes(void)
+{
+ return voyager_num_nodes;
+}
+
+static inline int __cpu_to_node(int cpu)
+{
+ return voyager_cpu_to_node[cpu];
+}
+
+static inline int __node_to_first_cpu(int node)
+{
+ return ffs(voyager_node_to_cpu_mask[node]);
+}
+
+static inline unsigned long __node_to_cpu_mask(int node)
+{
+ return voyager_node_to_cpu_mask[node];
+}
+
+static inline int __parent_node(int node)
+{
+ return node;
+}
+
+/* FIXME: these are useless defines just to get topology to compile
+ * The code needs to be NUMA cleaned up to separate node from what
+ * NUMA thinks of as a node */
+#define __node_to_memblk(node) (node)
+#define si_meminfo_node(i, j)
+
diff -Nru a/arch/i386/mach-voyager/topology.c b/arch/i386/mach-voyager/topology.c
--- /dev/null Wed Dec 31 16:00:00 1969
+++ b/arch/i386/mach-voyager/topology.c Fri Nov 29 13:01:29 2002
@@ -0,0 +1,51 @@
+/* -*- mode: c; c-basic-offset: 8 -*- */
+
+/* Copyright (C) 2002
+ *
+ * Author: J.E.J.Bottomley@HansenPartnership.com
+ *
+ * voyager topology functions
+ */
+
+#include <linux/init.h>
+#include <linux/string.h>
+#include <asm/cpu.h>
+#include <linux/smp.h>
+#include <asm/topology.h>
+
+/* Topology mapping functions */
+u32 voyager_node_to_cpu_mask[MAX_PROCESSOR_BOARDS] = { 0 };
+u8 voyager_cpu_to_node[NR_CPUS] = { 0 };
+u8 voyager_num_nodes = 0;
+
+struct i386_cpu cpu_devices[NR_CPUS];
+struct i386_node node_devices[MAX_NUMNODES];
+
+static int __init topology_init(void)
+{
+ int i;
+
+ printk("VOYAGER BEGINNING TOPOLOGY INITIALISATION %d\n", MAX_NUMNODES);
+ memset(cpu_devices, 0, sizeof(cpu_devices));
+ memset(node_devices, 0, sizeof(node_devices));
+
+ for (i = 0; i < num_online_nodes(); i++)
+ arch_register_node(i);
+ for (i = 0; i < NR_CPUS; i++)
+ if (cpu_possible(i)) arch_register_cpu(i);
+
+ printk("NODES %d\n", num_online_nodes());
+ printk("CPU TO NODE: ");
+ for(i=0; i<NR_CPUS; i++)
+ if(cpu_possible(i))
+ printk("%d->%d ", i, voyager_cpu_to_node[i]);
+ printk("\nNODE TO CPU MASK: ");
+ for(i=0; i<voyager_num_nodes; i++)
+ printk("%d->0x%04x ", i, voyager_node_to_cpu_mask[i]);
+ printk("\n");
+
+
+ return 0;
+}
+
+subsys_initcall(topology_init);
diff -Nru a/arch/i386/mach-voyager/voyager_cat.c b/arch/i386/mach-voyager/voyager_cat.c
--- a/arch/i386/mach-voyager/voyager_cat.c Fri Nov 29 13:01:29 2002
+++ b/arch/i386/mach-voyager/voyager_cat.c Fri Nov 29 13:01:29 2002
@@ -26,6 +26,7 @@
#include <linux/init.h>
#include <linux/slab.h>
#include <linux/delay.h>
+#include <asm/topology.h>
#include <asm/io.h>
#ifdef VOYAGER_CAT_DEBUG
@@ -575,6 +576,7 @@
__u8 qabc_data[0x20];
__u8 num_submodules, val;
voyager_eprom_hdr_t *eprom_hdr = (voyager_eprom_hdr_t *)&eprom_buf[0];
+ __u8 processor_cards = 0;
__u8 cmos[4];
unsigned long addr;
@@ -721,8 +723,13 @@
printk("Module \"%s\": Dyadic Processor Card\n",
cat_module_name(i));
voyager_extended_vic_processors |= (1<<cpu);
+ voyager_node_to_cpu_mask[processor_cards] |= (1<<cpu);
+ voyager_cpu_to_node[cpu] = processor_cards;
cpu += 4;
voyager_extended_vic_processors |= (1<<cpu);
+ voyager_node_to_cpu_mask[processor_cards] |= (1<<cpu);
+ voyager_cpu_to_node[cpu] = processor_cards;
+ processor_cards++;
outb(VOYAGER_CAT_END, CAT_CMD);
continue;
}
@@ -884,18 +891,20 @@
voyager_quad_processors |= (1<<cpu);
voyager_quad_cpi_addr[cpu] = (struct voyager_qic_cpi *)
(qic_addr+(j<<8));
+ voyager_cpu_to_node[cpu] = processor_cards;
+ voyager_node_to_cpu_mask[processor_cards] |= (1<<cpu);
CDEBUG(("CPU%d: CPI address 0x%lx\n", cpu,
(unsigned long)voyager_quad_cpi_addr[cpu]));
}
outb(VOYAGER_CAT_END, CAT_CMD);
-
-
+ processor_cards++;
*asicpp = NULL;
modpp = &((*modpp)->next);
}
*modpp = NULL;
printk("CAT Bus Initialisation finished: extended procs 0x%x, quad procs 0x%x, allowed vic boot = 0x%x\n", voyager_extended_vic_processors, voyager_quad_processors, voyager_allowed_boot_processors);
+ voyager_num_nodes = processor_cards;
request_resource(&ioport_resource, &vic_res);
if(voyager_quad_processors)
request_resource(&ioport_resource, &qic_res);
diff -Nru a/drivers/base/Makefile b/drivers/base/Makefile
--- a/drivers/base/Makefile Fri Nov 29 13:01:29 2002
+++ b/drivers/base/Makefile Fri Nov 29 13:01:29 2002
@@ -4,7 +4,8 @@
driver.o class.o intf.o platform.o \
cpu.o firmware.o
-obj-$(CONFIG_NUMA) += node.o memblk.o
+obj-$(CONFIG_BASE_NODE) += node.o
+obj-$(CONFIG_NUMA) += memblk.o
obj-y += fs/
diff -Nru a/drivers/base/node.c b/drivers/base/node.c
--- a/drivers/base/node.c Fri Nov 29 13:01:29 2002
+++ b/drivers/base/node.c Fri Nov 29 13:01:29 2002
@@ -93,7 +93,8 @@
static int __init register_node_type(void)
{
+ devclass_register(&node_devclass);
driver_register(&node_driver);
- return devclass_register(&node_devclass);
+ return 0; //devclass_register(&node_devclass);
}
postcore_initcall(register_node_type);
diff -Nru a/include/asm-i386/numnodes.h b/include/asm-i386/numnodes.h
--- a/include/asm-i386/numnodes.h Fri Nov 29 13:01:29 2002
+++ b/include/asm-i386/numnodes.h Fri Nov 29 13:01:29 2002
@@ -6,7 +6,7 @@
#ifdef CONFIG_X86_NUMAQ
#include <asm/numaq.h>
#else
-#define MAX_NUMNODES 1
+#define MAX_NUMNODES CONFIG_X86_NUMNODES
#endif /* CONFIG_X86_NUMAQ */
#endif /* _ASM_MAX_NUMNODES_H */
diff -Nru a/include/asm-i386/topology.h b/include/asm-i386/topology.h
--- a/include/asm-i386/topology.h Fri Nov 29 13:01:29 2002
+++ b/include/asm-i386/topology.h Fri Nov 29 13:01:29 2002
@@ -27,70 +27,7 @@
#ifndef _ASM_I386_TOPOLOGY_H
#define _ASM_I386_TOPOLOGY_H
-#ifdef CONFIG_X86_NUMAQ
-
-#include <asm/smpboot.h>
-
-/* Returns the number of the node containing CPU 'cpu' */
-#define __cpu_to_node(cpu) (cpu_to_logical_apicid(cpu) >> 4)
-
-/* Returns the number of the node containing MemBlk 'memblk' */
-#define __memblk_to_node(memblk) (memblk)
-
-/* Returns the number of the node containing Node 'node'. This architecture is flat,
- so it is a pretty simple function! */
-#define __parent_node(node) (node)
-
-/* Returns the number of the first CPU on Node 'node'.
- * This should be changed to a set of cached values
- * but this will do for now.
- */
-static inline int __node_to_first_cpu(int node)
-{
- int i, cpu, logical_apicid = node << 4;
-
- for(i = 1; i < 16; i <<= 1)
- /* check to see if the cpu is in the system */
- if ((cpu = logical_apicid_to_cpu(logical_apicid | i)) >= 0)
- /* if yes, return it to caller */
- return cpu;
-
- BUG(); /* couldn't find a cpu on given node */
- return -1;
-}
-
-/* Returns a bitmask of CPUs on Node 'node'.
- * This should be changed to a set of cached bitmasks
- * but this will do for now.
- */
-static inline unsigned long __node_to_cpu_mask(int node)
-{
- int i, cpu, logical_apicid = node << 4;
- unsigned long mask = 0UL;
-
- if (sizeof(unsigned long) * 8 < NR_CPUS)
- BUG();
-
- for(i = 1; i < 16; i <<= 1)
- /* check to see if the cpu is in the system */
- if ((cpu = logical_apicid_to_cpu(logical_apicid | i)) >= 0)
- /* if yes, add to bitmask */
- mask |= 1 << cpu;
-
- return mask;
-}
-
-/* Returns the number of the first MemBlk on Node 'node' */
-#define __node_to_memblk(node) (node)
-
-#else /* !CONFIG_X86_NUMAQ */
-/*
- * Other i386 platforms should define their own version of the
- * above macros here.
- */
-
-#include <asm-generic/topology.h>
-
-#endif /* CONFIG_X86_NUMAQ */
+/* get the machine specific topology file */
+#include "machine_topology.h"
#endif /* _ASM_I386_TOPOLOGY_H */
diff -Nru a/include/asm-i386/vic.h b/include/asm-i386/vic.h
--- a/include/asm-i386/vic.h Fri Nov 29 13:01:29 2002
+++ b/include/asm-i386/vic.h Fri Nov 29 13:01:29 2002
@@ -3,6 +3,8 @@
* Author: J.E.J.Bottomley@HansenPartnership.com
*
* Standard include definitions for the NCR Voyager Interrupt Controller */
+#ifndef _ASM_VIC_H
+#define _ASM_VIC_H
/* The eight CPI vectors. To activate a CPI, you write a bit mask
* corresponding to the processor set to be interrupted into the
@@ -59,3 +61,5 @@
#define VIC_BOOT_INTERRUPT_MASK 0xfe
extern void smp_vic_timer_interrupt(struct pt_regs *regs);
+
+#endif
diff -Nru a/include/asm-i386/voyager.h b/include/asm-i386/voyager.h
--- a/include/asm-i386/voyager.h Fri Nov 29 13:01:29 2002
+++ b/include/asm-i386/voyager.h Fri Nov 29 13:01:29 2002
@@ -3,6 +3,8 @@
* Author: J.E.J.Bottomley@HansenPartnership.com
*
* Standard include definitions for the NCR Voyager system */
+#ifndef _ASM_VOYAGER_H_
+#define _ASM_VOYAGER_H_
#undef VOYAGER_DEBUG
#undef VOYAGER_CAT_DEBUG
@@ -519,3 +521,5 @@
#define VOYAGER_PSI_SUBREAD 2
#define VOYAGER_PSI_SUBWRITE 3
extern void voyager_cat_psi(__u8, __u16, __u8 *);
+
+#endif
* Re: [RFC] rethinking the topology functions
2002-11-29 19:18 [RFC] rethinking the topology functions James Bottomley
@ 2002-11-29 20:18 ` Martin J. Bligh
2002-12-05 2:02 ` Matthew Dobson
1 sibling, 0 replies; 3+ messages in thread
From: Martin J. Bligh @ 2002-11-29 20:18 UTC
To: James Bottomley, linux-kernel
Cc: wli, mochel, johnstul, Erich Focht, colpatch
[-- Attachment #1: Type: text/plain, Size: 2183 bytes --]
> I'd like to rework the current sysfs cpu/node pieces to provide
> two separate topologies (one for CPU and one for memory).
There are two other sets of patches that are floating around that
will interact with this. The first (from Matt) makes the topo
functions cache results, instead of recalculating each time.
I'm thinking that we may want the generic front end to do that and
read from arrays, whilst only the *_init backend gets moved into
the subarch stuff ? (looking back at the patch right now, I can't
quite see what it's doing in 30s, but that's what it was meant to do).
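One way to picture the split suggested above (a generic front end that reads from arrays, with only the init step living in the subarch) is the sketch below. It is an assumption-laden illustration, not the code from either patch: topology_cache_init, subarch_cpu_to_node_fn, example_lookup and the two size constants are hypothetical names introduced here.

```c
#include <assert.h>

#define NR_CPUS_SKETCH  16	/* hypothetical sizes for this sketch */
#define NR_NODES_SKETCH 4

/* Generic cache: filled once at init, read-only afterwards. */
static int cached_cpu_to_node[NR_CPUS_SKETCH];
static unsigned long cached_node_to_cpu_mask[NR_NODES_SKETCH];

/* Subarch backend: returns the node of 'cpu', or -1 if the CPU is absent. */
typedef int (*subarch_cpu_to_node_fn)(int cpu);

/* Run the (possibly slow) subarch lookup once per CPU and cache the result. */
static void topology_cache_init(subarch_cpu_to_node_fn lookup)
{
	int cpu, node;

	for (node = 0; node < NR_NODES_SKETCH; node++)
		cached_node_to_cpu_mask[node] = 0UL;

	for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++) {
		node = lookup(cpu);
		if (node < 0 || node >= NR_NODES_SKETCH) {
			cached_cpu_to_node[cpu] = -1;
			continue;
		}
		cached_cpu_to_node[cpu] = node;
		cached_node_to_cpu_mask[node] |= 1UL << cpu;
	}
}

/* Generic front end: plain array reads, no per-call recalculation. */
static int cpu_to_node(int cpu)
{
	return cached_cpu_to_node[cpu];
}

static unsigned long node_to_cpu_mask(int node)
{
	return cached_node_to_cpu_mask[node];
}

/* Example backend: eight CPUs present, four per node. */
static int example_lookup(int cpu)
{
	return cpu < 8 ? cpu / 4 : -1;
}
```

With this shape, each subarch would only have to supply its own lookup (or fill the arrays directly during boot), and every later query becomes an array read.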
The second is the start of breaking things out into the numaq
subarch - the attached bits seem to work & are written with the
intent of being mergeable. It's just pieces torn off the big patch
by James Cleverdon and John Stultz (which isn't really mergeable
as is) and tweaked around by me. I'm going to test them a little
more before submission, but they're pretty much there, I think.
They're based on top of John's subarch reshuffle (the stuff you had
under generic is really NUMA-Q)
> Ultimately, the scheduler could be tuned to use the topologies to
> make scheduling decisions.
We kind of have that already. There are NUMA scheduler patches that
do this kind of thing in 2.5.47-mjb3 ... earlier versions of Erich's
scheduler had a pooling abstraction - this got ripped out for
simplicity in the hope of getting something merged before the 2.5
freeze, but it'd be nice to put them back if that's not going to
happen (still vaguely hoping).
> When that happens, we can probably fold
> the current Pentium Hyperthreading stuff into a simple topology map
> as well.
Not sure about that ... the HT stuff I believe created one queue
per pair of CPUs, which isn't going to work well for multiple real
CPUs per nodes, though is kind of a nice trick for a shared cache
SMT like HT .... things like the PPC64 chip multi-chip-on-1-die
may feel differently about that.
> I believe Martin Bligh and Bill Irwin are working (or at least
> thinking) somewhat along these lines, so I thought I'd gather
> feedback before jumping into a wholesale rewrite.
cc'ed John, Erich, Matt ...
M.
[-- Attachment #2: 21-i386_topo-11 --]
[-- Type: text/plain, Size: 5254 bytes --]
diff -purN -X /home/mbligh/.diff.exclude 11-noearlyirq-01/arch/i386/kernel/smpboot.c 21-i386_topo-11/arch/i386/kernel/smpboot.c
--- 11-noearlyirq-01/arch/i386/kernel/smpboot.c Tue Nov 19 16:32:31 2002
+++ 21-i386_topo-11/arch/i386/kernel/smpboot.c Tue Nov 19 16:33:30 2002
@@ -502,6 +502,46 @@ static struct task_struct * __init fork_
return do_fork(CLONE_VM|CLONE_IDLETASK, 0, &regs, 0, NULL);
}
+#ifdef CONFIG_X86_NUMAQ
+/* which logical CPUs are on which nodes */
+volatile unsigned long node_2_cpu_mask[MAX_NR_NODES];
+/* which node each logical CPU is on */
+volatile int cpu_2_node[NR_CPUS];
+
+/* Initialize all maps between cpu number and node */
+static inline void init_cpu_to_node_mapping(void)
+{
+ int node, cpu;
+
+ for (node = 0; node < MAX_NR_NODES; node++) {
+ node_2_cpu_mask[node] = 0;
+ }
+ for (cpu = 0; cpu < NR_CPUS; cpu++) {
+ cpu_2_node[cpu] = -1;
+ }
+}
+
+/* set up a mapping between cpu and node. */
+static inline void map_cpu_to_node(int cpu, int node)
+{
+ node_2_cpu_mask[node] |= (1 << cpu);
+ cpu_2_node[cpu] = node;
+}
+
+/* undo a mapping between cpu and node. */
+static inline void unmap_cpu_to_node(int cpu, int node)
+{
+ node_2_cpu_mask[node] &= ~(1 << cpu);
+ cpu_2_node[cpu] = -1;
+}
+#else /* !CONFIG_X86_NUMAQ */
+
+#define init_cpu_to_node_mapping() ({})
+#define map_cpu_to_node(cpu, node) ({})
+#define unmap_cpu_to_node(cpu, node) ({})
+
+#endif /* CONFIG_X86_NUMAQ */
+
/* which physical APIC ID maps to which logical CPU number */
volatile int physical_apicid_2_cpu[MAX_APICID];
/* which logical CPU number maps to which physical APIC ID */
@@ -525,6 +565,7 @@ static inline void init_cpu_to_apicid(vo
cpu_2_physical_apicid[cpu] = -1;
cpu_2_logical_apicid[cpu] = -1;
}
+ init_cpu_to_node_mapping();
}
static inline void map_cpu_to_boot_apicid(int cpu, int apicid)
@@ -536,6 +577,7 @@ static inline void map_cpu_to_boot_apici
if (clustered_apic_mode) {
logical_apicid_2_cpu[apicid] = cpu;
cpu_2_logical_apicid[cpu] = apicid;
+ map_cpu_to_node(cpu, apicid >> 4);
} else {
physical_apicid_2_cpu[apicid] = cpu;
cpu_2_physical_apicid[cpu] = apicid;
@@ -551,6 +593,7 @@ static inline void unmap_cpu_to_boot_api
if (clustered_apic_mode) {
logical_apicid_2_cpu[apicid] = -1;
cpu_2_logical_apicid[cpu] = -1;
+ unmap_cpu_to_node(cpu, apicid >> 4);
} else {
physical_apicid_2_cpu[apicid] = -1;
cpu_2_physical_apicid[cpu] = -1;
diff -purN -X /home/mbligh/.diff.exclude 11-noearlyirq-01/include/asm-i386/smpboot.h 21-i386_topo-11/include/asm-i386/smpboot.h
--- 11-noearlyirq-01/include/asm-i386/smpboot.h Sun Nov 10 19:28:31 2002
+++ 21-i386_topo-11/include/asm-i386/smpboot.h Tue Nov 19 16:33:30 2002
@@ -23,6 +23,14 @@
#define boot_cpu_apicid boot_cpu_physical_apicid
#endif /* CONFIG_CLUSTERED_APIC */
+#ifdef CONFIG_X86_NUMAQ
+/*
+ * Mappings between logical cpu number and node number
+ */
+extern volatile unsigned long node_2_cpu_mask[];
+extern volatile int cpu_2_node[];
+#endif /* CONFIG_X86_NUMAQ */
+
/*
* Mappings between logical cpu number and logical / physical apicid
* The first four macros are trivial, but it keeps the abstraction consistent
diff -purN -X /home/mbligh/.diff.exclude 11-noearlyirq-01/include/asm-i386/topology.h 21-i386_topo-11/include/asm-i386/topology.h
--- 11-noearlyirq-01/include/asm-i386/topology.h Sun Nov 10 19:28:05 2002
+++ 21-i386_topo-11/include/asm-i386/topology.h Tue Nov 19 16:33:30 2002
@@ -32,7 +32,7 @@
#include <asm/smpboot.h>
/* Returns the number of the node containing CPU 'cpu' */
-#define __cpu_to_node(cpu) (cpu_to_logical_apicid(cpu) >> 4)
+#define __cpu_to_node(cpu) (cpu_2_node[cpu])
/* Returns the number of the node containing MemBlk 'memblk' */
#define __memblk_to_node(memblk) (memblk)
@@ -41,44 +41,11 @@
so it is a pretty simple function! */
#define __parent_node(node) (node)
-/* Returns the number of the first CPU on Node 'node'.
- * This should be changed to a set of cached values
- * but this will do for now.
- */
-static inline int __node_to_first_cpu(int node)
-{
- int i, cpu, logical_apicid = node << 4;
+/* Returns a bitmask of CPUs on Node 'node'. */
+#define __node_to_cpu_mask(node) (node_2_cpu_mask[node])
- for(i = 1; i < 16; i <<= 1)
- /* check to see if the cpu is in the system */
- if ((cpu = logical_apicid_to_cpu(logical_apicid | i)) >= 0)
- /* if yes, return it to caller */
- return cpu;
-
- BUG(); /* couldn't find a cpu on given node */
- return -1;
-}
-
-/* Returns a bitmask of CPUs on Node 'node'.
- * This should be changed to a set of cached bitmasks
- * but this will do for now.
- */
-static inline unsigned long __node_to_cpu_mask(int node)
-{
- int i, cpu, logical_apicid = node << 4;
- unsigned long mask = 0UL;
-
- if (sizeof(unsigned long) * 8 < NR_CPUS)
- BUG();
-
- for(i = 1; i < 16; i <<= 1)
- /* check to see if the cpu is in the system */
- if ((cpu = logical_apicid_to_cpu(logical_apicid | i)) >= 0)
- /* if yes, add to bitmask */
- mask |= 1 << cpu;
-
- return mask;
-}
+/* Returns the number of the first CPU on Node 'node'. */
+#define __node_to_first_cpu(node) (__ffs(__node_to_cpu_mask(node)))
/* Returns the number of the first MemBlk on Node 'node' */
#define __node_to_memblk(node) (node)
[-- Attachment #3: numaq_makefile --]
[-- Type: text/plain, Size: 2969 bytes --]
diff -urpN -X /home/fletch/.diff.exclude 01-subarch_reorg/arch/i386/Makefile 11-numaq_makefile-01/arch/i386/Makefile
--- 01-subarch_reorg/arch/i386/Makefile Tue Nov 26 08:07:12 2002
+++ 11-numaq_makefile-01/arch/i386/Makefile Tue Nov 26 08:27:58 2002
@@ -49,6 +49,9 @@ CFLAGS += $(cflags-y)
#VISWS subarch support
mflags-$(CONFIG_VISWS) := -Iinclude/asm-i386/mach-visws
mcore-$(CONFIG_VISWS) := mach-visws
+#NUMAQ subarch support
+mflags-$(CONFIG_X86_NUMAQ) := -Iinclude/asm-i386/mach-numaq
+mcore-$(CONFIG_X86_NUMAQ) := mach-default
#default subarch support
mflags-y += -Iinclude/asm-i386/mach-default
ifndef mcore-y
diff -urpN -X /home/fletch/.diff.exclude 01-subarch_reorg/include/asm-i386/mach-default/mach_apic.h 11-numaq_makefile-01/include/asm-i386/mach-default/mach_apic.h
--- 01-subarch_reorg/include/asm-i386/mach-default/mach_apic.h Tue Nov 26 08:07:22 2002
+++ 11-numaq_makefile-01/include/asm-i386/mach-default/mach_apic.h Tue Nov 26 08:53:37 2002
@@ -12,7 +12,7 @@ static inline unsigned long calculate_ld
#define APIC_DFR_VALUE (APIC_DFR_FLAT)
#ifdef CONFIG_SMP
- #define TARGET_CPUS (clustered_apic_mode ? 0xf : cpu_online_map)
+ #define TARGET_CPUS (cpu_online_map)
#else
#define TARGET_CPUS 0x01
#endif
@@ -27,15 +27,12 @@ static inline void summit_check(char *oe
static inline void clustered_apic_check(void)
{
printk("Enabling APIC mode: %s. Using %d I/O APICs\n",
- (clustered_apic_mode ? "NUMA-Q" : "Flat"), nr_ioapics);
+ "Flat", nr_ioapics);
}
static inline int cpu_present_to_apicid(int mps_cpu)
{
- if (clustered_apic_mode)
- return ( ((mps_cpu/4)*16) + (1<<(mps_cpu%4)) );
- else
- return mps_cpu;
+ return mps_cpu;
}
static inline unsigned long apicid_to_cpu_present(int apicid)
diff -urpN -X /home/fletch/.diff.exclude 01-subarch_reorg/include/asm-i386/mach-numaq/mach_apic.h 11-numaq_makefile-01/include/asm-i386/mach-numaq/mach_apic.h
--- 01-subarch_reorg/include/asm-i386/mach-numaq/mach_apic.h Wed Dec 31 16:00:00 1969
+++ 11-numaq_makefile-01/include/asm-i386/mach-numaq/mach_apic.h Tue Nov 26 08:52:30 2002
@@ -0,0 +1,39 @@
+#ifndef __ASM_MACH_APIC_H
+#define __ASM_MACH_APIC_H
+
+static inline unsigned long calculate_ldr(unsigned long old)
+{
+ unsigned long id;
+
+ id = 1UL << smp_processor_id();
+ return ((old & ~APIC_LDR_MASK) | SET_APIC_LOGICAL_ID(id));
+}
+
+#define APIC_DFR_VALUE (APIC_DFR_FLAT)
+
+#define TARGET_CPUS (0xf)
+
+#define APIC_BROADCAST_ID 0x0F
+#define check_apicid_used(bitmap, apicid) (bitmap & (1 << apicid))
+
+static inline void summit_check(char *oem, char *productid)
+{
+}
+
+static inline void clustered_apic_check(void)
+{
+ printk("Enabling APIC mode: %s. Using %d I/O APICs\n",
+ "NUMA-Q", nr_ioapics);
+}
+
+static inline int cpu_present_to_apicid(int mps_cpu)
+{
+ return ( ((mps_cpu/4)*16) + (1<<(mps_cpu%4)) );
+}
+
+static inline unsigned long apicid_to_cpu_present(int apicid)
+{
+ return (1ul << apicid);
+}
+
+#endif /* __ASM_MACH_APIC_H */
[-- Attachment #4: numaq_apic --]
[-- Type: text/plain, Size: 6557 bytes --]
diff -urpN -X /home/fletch/.diff.exclude 11-numaq_makefile-01/arch/i386/kernel/apic.c 12-numaq_apic-11/arch/i386/kernel/apic.c
--- 11-numaq_makefile-01/arch/i386/kernel/apic.c Tue Nov 26 08:07:12 2002
+++ 12-numaq_apic-11/arch/i386/kernel/apic.c Tue Nov 26 09:09:42 2002
@@ -311,11 +311,9 @@ void __init setup_local_APIC (void)
__error_in_apic_c();
/*
- * Double-check wether this APIC is really registered.
- * This is meaningless in clustered apic mode, so we skip it.
+ * Double-check whether this APIC is really registered.
*/
- if (!clustered_apic_mode &&
- !test_bit(GET_APIC_ID(apic_read(APIC_ID)), &phys_cpu_present_map))
+ if (!apic_id_registered())
BUG();
/*
@@ -323,21 +321,7 @@ void __init setup_local_APIC (void)
* an APIC. See e.g. "AP-388 82489DX User's Manual" (Intel
* document number 292116). So here it goes...
*/
-
- if (!clustered_apic_mode) {
- /*
- * In clustered apic mode, the firmware does this for us
- * Put the APIC into flat delivery mode.
- * Must be "all ones" explicitly for 82489DX.
- */
- apic_write_around(APIC_DFR, APIC_DFR_VALUE);
-
- /*
- * Set up the logical destination ID.
- */
- value = apic_read(APIC_LDR);
- apic_write_around(APIC_LDR, calculate_ldr(value));
- }
+ init_apic_ldr();
/*
* Set Task Priority to 'accept all'. We never change this
diff -urpN -X /home/fletch/.diff.exclude 11-numaq_makefile-01/arch/i386/kernel/io_apic.c 12-numaq_apic-11/arch/i386/kernel/io_apic.c
--- 11-numaq_makefile-01/arch/i386/kernel/io_apic.c Tue Nov 26 08:07:12 2002
+++ 12-numaq_apic-11/arch/i386/kernel/io_apic.c Tue Nov 26 09:06:30 2002
@@ -256,7 +256,7 @@ static inline void balance_irq(int irq)
irq_balance_t *entry = irq_balance + irq;
unsigned long now = jiffies;
- if (clustered_apic_mode)
+ if (no_balance_irq)
return;
if (unlikely(time_after(now, entry->timestamp + IRQ_BALANCE_INTERVAL))) {
@@ -272,7 +272,7 @@ static inline void balance_irq(int irq)
new_cpu = move(entry->cpu, allowed_mask, now, random_number);
if (entry->cpu != new_cpu) {
entry->cpu = new_cpu;
- set_ioapic_affinity(irq, 1 << new_cpu);
+ set_ioapic_affinity(irq, cpu_present_to_apicid(new_cpu));
}
}
}
@@ -734,7 +734,6 @@ void __init setup_IO_APIC_irqs(void)
if (irq_trigger(idx)) {
entry.trigger = 1;
entry.mask = 1;
- entry.dest.logical.logical_dest = TARGET_CPUS;
}
irq = pin_2_irq(idx, apic, pin);
@@ -742,7 +741,7 @@ void __init setup_IO_APIC_irqs(void)
* skip adding the timer int on secondary nodes, which causes
* a small but painful rift in the time-space continuum
*/
- if (clustered_apic_mode && (apic != 0) && (irq == 0))
+ if (multi_timer_check(apic, irq))
continue;
else
add_pin_to_irq(irq, apic, pin);
diff -urpN -X /home/fletch/.diff.exclude 11-numaq_makefile-01/include/asm-i386/mach-default/mach_apic.h 12-numaq_apic-11/include/asm-i386/mach-default/mach_apic.h
--- 11-numaq_makefile-01/include/asm-i386/mach-default/mach_apic.h Tue Nov 26 08:53:37 2002
+++ 12-numaq_apic-11/include/asm-i386/mach-default/mach_apic.h Tue Nov 26 09:31:11 2002
@@ -17,6 +17,8 @@ static inline unsigned long calculate_ld
#define TARGET_CPUS 0x01
#endif
+#define no_balance_irq (0)
+
#define APIC_BROADCAST_ID 0x0F
#define check_apicid_used(bitmap, apicid) (bitmap & (1 << apicid))
@@ -24,10 +26,38 @@ static inline void summit_check(char *oe
{
}
+static inline int apic_id_registered(void)
+{
+ return (test_bit(GET_APIC_ID(apic_read(APIC_ID)),
+ &phys_cpu_present_map));
+}
+
+/*
+ * Set up the logical destination ID.
+ *
+ * Intel recommends to set DFR, LDR and TPR before enabling
+ * an APIC. See e.g. "AP-388 82489DX User's Manual" (Intel
+ * document number 292116). So here it goes...
+ */
+static inline void init_apic_ldr(void)
+{
+ unsigned long val;
+
+ apic_write_around(APIC_DFR, APIC_DFR_VALUE);
+ val = apic_read(APIC_LDR) & ~APIC_LDR_MASK;
+ val |= SET_APIC_LOGICAL_ID(1UL << smp_processor_id());
+ apic_write_around(APIC_LDR, val);
+}
+
static inline void clustered_apic_check(void)
{
printk("Enabling APIC mode: %s. Using %d I/O APICs\n",
"Flat", nr_ioapics);
+}
+
+static inline int multi_timer_check(int apic, int irq)
+{
+ return 0;
}
static inline int cpu_present_to_apicid(int mps_cpu)
diff -urpN -X /home/fletch/.diff.exclude 11-numaq_makefile-01/include/asm-i386/mach-numaq/mach_apic.h 12-numaq_apic-11/include/asm-i386/mach-numaq/mach_apic.h
--- 11-numaq_makefile-01/include/asm-i386/mach-numaq/mach_apic.h Tue Nov 26 08:52:30 2002
+++ 12-numaq_apic-11/include/asm-i386/mach-numaq/mach_apic.h Tue Nov 26 09:32:21 2002
@@ -13,6 +13,8 @@ static inline unsigned long calculate_ld
#define TARGET_CPUS (0xf)
+#define no_balance_irq (1)
+
#define APIC_BROADCAST_ID 0x0F
#define check_apicid_used(bitmap, apicid) (bitmap & (1 << apicid))
@@ -20,10 +22,25 @@ static inline void summit_check(char *oe
{
}
+static inline int apic_id_registered(void)
+{
+ return (1);
+}
+
+static inline void init_apic_ldr(void)
+{
+ /* Already done in NUMA-Q firmware */
+}
+
static inline void clustered_apic_check(void)
{
printk("Enabling APIC mode: %s. Using %d I/O APICs\n",
"NUMA-Q", nr_ioapics);
+}
+
+static inline int multi_timer_check(int apic, int irq)
+{
+ return (apic != 0 && irq == 0);
}
static inline int cpu_present_to_apicid(int mps_cpu)
diff -urpN -X /home/fletch/.diff.exclude 11-numaq_makefile-01/include/asm-i386/mach-summit/mach_apic.h 12-numaq_apic-11/include/asm-i386/mach-summit/mach_apic.h
--- 11-numaq_makefile-01/include/asm-i386/mach-summit/mach_apic.h Tue Nov 26 08:07:22 2002
+++ 12-numaq_apic-11/include/asm-i386/mach-summit/mach_apic.h Tue Nov 26 09:31:41 2002
@@ -23,6 +23,8 @@ static inline unsigned long calculate_ld
#define APIC_DFR_VALUE (x86_summit ? APIC_DFR_CLUSTER : APIC_DFR_FLAT)
#define TARGET_CPUS (x86_summit ? XAPIC_DEST_CPUS_MASK : cpu_online_map)
+#define no_balance_irq (1)
+
#define APIC_BROADCAST_ID (x86_summit ? 0xFF : 0x0F)
#define check_apicid_used(bitmap, apicid) (0)
@@ -32,10 +34,25 @@ static inline void summit_check(char *oe
x86_summit = 1;
}
+static inline int apic_id_registered(void)
+{
+ return (1);
+}
+
+
+static inline void init_apic_ldr(void)
+{
+}
+
static inline void clustered_apic_check(void)
{
printk("Enabling APIC mode: %s. Using %d I/O APICs\n",
(x86_summit ? "Summit" : "Flat"), nr_ioapics);
+}
+
+static inline int multi_timer_check(int apic, int irq)
+{
+ return 0;
}
static inline int cpu_present_to_apicid(int mps_cpu)
* Re: [RFC] rethinking the topology functions
2002-11-29 19:18 [RFC] rethinking the topology functions James Bottomley
2002-11-29 20:18 ` Martin J. Bligh
@ 2002-12-05 2:02 ` Matthew Dobson
1 sibling, 0 replies; 3+ messages in thread
From: Matthew Dobson @ 2002-12-05 2:02 UTC (permalink / raw)
To: James Bottomley; +Cc: linux-kernel, Martin J. Bligh, wli, mochel
James Bottomley wrote:
> The attached represents an initial stab at implementing topology functions (or
> actually indirecting topology through the subarchitecture features).
Hmm.. I think this is actually a really good goal. I've got some
comments on your implementation below, but I think this goes in the
right direction.
> Getting this far made me realise that the current topology infrastructure is
> rather inadequate (being geared towards the needs of NUMA machines).
>
> All I really need for voyager is the concept of cpu_nodes (voyager CPU cards
> have huge L3 caches and up to 4 CPUs each, so scheduling between CPU cards can
> end up rather expensive in terms of cache invalidation). I have no use for
> memory affinities since the voyager memory map is uniform.
Inadequate? It sounds as though the current topology infrastructure
does everything you need plus *more*. In that case, simply don't use
that part you don't need (memory affinity), and you get the part you do
need (CPU affinity) for free. No?
> I'd like to rework the current sysfs cpu/node pieces to provide two separate
> topologies (one for CPU and one for memory).
If you don't want the memblk stuff in for voyager, just edit the
makefile, and make sure drivers/base/memblk.c isn't compiled in for you.
Or, even more simply, when you write the topology init for voyager,
don't initialize any memblks... problem solved! ;)
> Ultimately, the scheduler could be tuned to use the topologies to make
> scheduling decisions. When that happens, we can probably fold the current
> Pentium Hyperthreading stuff into a simple topology map as well.
>
> I believe Martin Bligh and Bill Irwin are working (or at least thinking)
> somewhat along these lines, so I thought I'd gather feedback before jumping
> into a wholesale rewrite.
Martin already responded to this bit, but I'll reiterate a piece. The
scheduler already does use some of the topology macros. We (I?) don't
want to split the in-kernel topology into multiple different topology
infrastructures: one for VM, one for scheduling, one for I/O, etc. It
would be best if we could all use the same infrastructure, and just use
the information it provides for each subsystem. For example, the
scheduler could cache some of the cpu<->node mappings in local per-pool
arrays or something, rather than inventing new cpu<->pool topology.
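For illustration only (this is not code from the thread): a minimal userspace sketch of the caching idea, assuming a 4-CPUs-per-node layout. The names cpu_to_node(), pool_cpu_mask, sched_cache_topology() and same_pool() are stand-ins invented for the sketch, as are the array sizes; the point is only that the scheduler-side array is filled from the one shared topology lookup rather than from a second mapping.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS      32	/* assumed size for this sketch */
#define MAX_NUMNODES 8	/* assumed size for this sketch */

/* Stand-in for the shared topology lookup; assumes 4 CPUs per node. */
static int cpu_to_node(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return cpu / 4;
}

/* Hypothetical scheduler-private cache: one CPU bitmask per pool. */
static uint32_t pool_cpu_mask[MAX_NUMNODES];

/* Fill the cache once from the shared topology, instead of defining
 * a separate cpu<->pool mapping. */
static void sched_cache_topology(int ncpus)
{
	int node, cpu;

	assert(ncpus >= 0 && ncpus <= NR_CPUS);
	for (node = 0; node < MAX_NUMNODES; node++)
		pool_cpu_mask[node] = 0;
	for (cpu = 0; cpu < ncpus; cpu++)
		pool_cpu_mask[cpu_to_node(cpu)] |= UINT32_C(1) << cpu;
}

/* Fast path: do CPUs a and b sit in the same pool (node)? */
static int same_pool(int a, int b)
{
	return (pool_cpu_mask[cpu_to_node(a)] >> b) & 1;
}
```

Calling sched_cache_topology() once at init and then consulting same_pool() keeps the hot path to an array read and a shift, while the topology itself stays in a single place.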
> James Bottomley
>
>
> ------------------------------------------------------------------------
>
> # This is a BitKeeper generated patch for the following project:
> # Project Name: Linux kernel tree
> # This patch format is intended for GNU patch command version 2.5 or higher.
> # This patch includes the following deltas:
> # ChangeSet 1.934 -> 1.935
> # include/asm-i386/voyager.h 1.3 -> 1.4
> # arch/i386/Kconfig 1.16 -> 1.17
> # drivers/base/node.c 1.3 -> 1.4
> # include/asm-i386/topology.h 1.2 -> 1.3
> # include/asm-i386/numnodes.h 1.2 -> 1.3
> # arch/i386/mach-voyager/voyager_cat.c 1.7 -> 1.8
> # include/asm-i386/vic.h 1.4 -> 1.5
> # arch/i386/mach-voyager/Makefile 1.9 -> 1.10
> # drivers/base/Makefile 1.16 -> 1.17
> # (new) -> 1.1 arch/i386/mach-generic/machine_topology.h
> # (new) -> 1.1 arch/i386/mach-voyager/topology.c
> # (new) -> 1.1 arch/i386/mach-voyager/machine_topology.h
> #
> # The following is the BitKeeper ChangeSet Log
> # --------------------------------------------
> # 02/11/29 jejb@malley.(none) 1.935
> # add topology to voyager
> # --------------------------------------------
> #
> diff -Nru a/arch/i386/Kconfig b/arch/i386/Kconfig
> --- a/arch/i386/Kconfig Fri Nov 29 13:01:29 2002
> +++ b/arch/i386/Kconfig Fri Nov 29 13:01:29 2002
> @@ -1698,3 +1698,13 @@
> bool
> depends on SMP
> default y
> +
> +config BASE_NODE
> + bool
> + depends on NUMA || VOYAGER
> + default y
> +
> +config X86_NUMNODES
> + int
> + default "8" if VOYAGER
> + default "1" if !VOYAGER
Ok.. I *really* don't think that we need *another* maximum number of
nodes counter. We already have MAX_NR_NODES, and MAX_NUMNODES. The
last thing we need is one more.
> diff -Nru a/arch/i386/mach-generic/machine_topology.h b/arch/i386/mach-generic/machine_topology.h
> --- /dev/null Wed Dec 31 16:00:00 1969
> +++ b/arch/i386/mach-generic/machine_topology.h Fri Nov 29 13:01:29 2002
> @@ -0,0 +1,96 @@
> +/*
> + * linux/include/asm-i386/topology.h
> + *
> + * Written by: Matthew Dobson, IBM Corporation
> + *
> + * Copyright (C) 2002, IBM Corp.
> + *
> + * All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful, but
> + * WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or
> + * NON INFRINGEMENT. See the GNU General Public License for more
> + * details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write to the Free Software
> + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
> + *
> + * Send feedback to <colpatch@us.ibm.com>
> + */
> +#ifndef _MACHINE_TOPOLOGY_H
> +#define _MACHINE_TOPOLOGY_H
> +
> <snip>
> +
> +#endif
I like this. I wanted to get around to doing this anyhow. John Stultz
is doing this for summit, and I planned to follow his lead with the
sub-arch breakup. Again, we're all trying for the same thing, so one of
us should succeed.
> diff -Nru a/arch/i386/mach-voyager/Makefile b/arch/i386/mach-voyager/Makefile
> --- a/arch/i386/mach-voyager/Makefile Fri Nov 29 13:01:29 2002
> +++ b/arch/i386/mach-voyager/Makefile Fri Nov 29 13:01:29 2002
> @@ -10,7 +10,7 @@
> EXTRA_CFLAGS += -I../kernel
> export-objs :=
>
> -obj-y := setup.o voyager_basic.o voyager_thread.o
> +obj-y := setup.o voyager_basic.o voyager_thread.o topology.o
>
> obj-$(CONFIG_SMP) += voyager_smp.o voyager_cat.o
>
> diff -Nru a/arch/i386/mach-voyager/machine_topology.h b/arch/i386/mach-voyager/machine_topology.h
> --- /dev/null Wed Dec 31 16:00:00 1969
> +++ b/arch/i386/mach-voyager/machine_topology.h Fri Nov 29 13:01:29 2002
> @@ -0,0 +1,45 @@
> +/* -*- mode: c; c-basic-offset: 8 -*- */
> +
> +/* Copyright (C) 1999,2001
> + *
> + * Author: J.E.J.Bottomley@HansenPartnership.com
> + *
> + * linux/arch/i386/mach-voyager/machine_topology.h
> + */
> +#include <asm/voyager.h>
> +
> <SNIP>
> +
> +/* FIXME: these are useless defines just to get topology to compile
> + * The code needs to be NUMA cleaned up to separate node from what
> + * NUMA thinks of as a node */
> +#define __node_to_memblk(node) (node)
> +#define si_meminfo_node(i, j)
> +
If voyager has a flat memory space, I would highly recommend
#define __node_to_memblk(node) (0)
instead of (node). Returning the node number makes it look like each
node has its own memblk with a 1-1 mapping between them. Also, your
si_meminfo_node should be able to more or less just call the non-node
version.
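A rough userspace sketch of both suggestions, for illustration only. struct sysinfo_sketch and si_meminfo_sketch() are made-up stand-ins for the kernel's struct sysinfo and si_meminfo(), and the memory figures are invented; only the shape of the two macros is the point.

```c
#include <assert.h>

/* Stand-in for the kernel's struct sysinfo; only two fields kept. */
struct sysinfo_sketch {
	unsigned long totalram;
	unsigned long freeram;
};

/* Stand-in for the machine-wide si_meminfo(); the values are made up. */
static void si_meminfo_sketch(struct sysinfo_sketch *val)
{
	assert(val);
	val->totalram = 262144;
	val->freeram  = 65536;
}

/* Flat memory map: every node resolves to the single memblk 0. */
#define __node_to_memblk(node)	(0)

/* With one memblk, the per-node query is just the global query;
 * the node id is ignored. */
#define si_meminfo_node(val, nid)	si_meminfo_sketch(val)
```

With these definitions any node id yields memblk 0, and a per-node meminfo query returns the same numbers as the global one.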
> diff -Nru a/arch/i386/mach-voyager/topology.c b/arch/i386/mach-voyager/topology.c
> --- /dev/null Wed Dec 31 16:00:00 1969
> +++ b/arch/i386/mach-voyager/topology.c Fri Nov 29 13:01:29 2002
> @@ -0,0 +1,51 @@
> +/* -*- mode: c; c-basic-offset: 8 -*- */
> +
> +/* Copyright (C) 2002
> + *
> + * Author: J.E.J.Bottomley@HansenPartnership.com
> + *
> + * voyager topology functions
> + */
> +
> +#include <linux/init.h>
> +#include <linux/string.h>
> +#include <asm/cpu.h>
> +#include <linux/smp.h>
> +#include <asm/topology.h>
> +
> +/* Topology mapping functions */
> +u32 voyager_node_to_cpu_mask[MAX_PROCESSOR_BOARDS] = { 0 };
> +u8 voyager_cpu_to_node[NR_CPUS] = { 0 };
> +u8 voyager_num_nodes = 0;
> +
> +struct i386_cpu cpu_devices[NR_CPUS];
> +struct i386_node node_devices[MAX_NUMNODES];
> +
> +static int __init topology_init(void)
> +{
> + int i;
> +
> + printk("VOYAGER BEGINNING TOPOLOGY INITIALISATION %d\n", MAX_NUMNODES);
> + memset(cpu_devices, 0, sizeof(cpu_devices));
> + memset(node_devices, 0, sizeof(node_devices));
> +
> + for (i = 0; i < num_online_nodes(); i++)
> + arch_register_node(i);
> + for (i = 0; i < NR_CPUS; i++)
> + if (cpu_possible(i)) arch_register_cpu(i);
> +
> + printk("NODES %d\n", num_online_nodes());
> + printk("CPU TO NODE: ");
> + for(i=0; i<NR_CPUS; i++)
> + if(cpu_possible(i))
> + printk("%d->%d ", i, voyager_cpu_to_node[i]);
> + printk("\nNODE TO CPU MASK: ");
> + for(i=0; i<voyager_num_nodes; i++)
> + printk("%d->0x%04x ", i, voyager_node_to_cpu_mask[i]);
> + printk("\n");
> +
> +
> + return 0;
> +}
> +
> +subsys_initcall(topology_init);
As I mentioned above, since you only have the one memblk, you could
easily call arch_register_memblk once, or just leave it alone like
you've done.
>
> <snip>
>
> diff -Nru a/drivers/base/Makefile b/drivers/base/Makefile
> --- a/drivers/base/Makefile Fri Nov 29 13:01:29 2002
> +++ b/drivers/base/Makefile Fri Nov 29 13:01:29 2002
> @@ -4,7 +4,8 @@
> driver.o class.o intf.o platform.o \
> cpu.o firmware.o
>
> -obj-$(CONFIG_NUMA) += node.o memblk.o
> +obj-$(CONFIG_BASE_NODE) += node.o
> +obj-$(CONFIG_NUMA) += memblk.o
>
> obj-y += fs/
>
Mentioned this above. You could leave it in and just register the
singular memblk if you liked.
> diff -Nru a/drivers/base/node.c b/drivers/base/node.c
> --- a/drivers/base/node.c Fri Nov 29 13:01:29 2002
> +++ b/drivers/base/node.c Fri Nov 29 13:01:29 2002
> @@ -93,7 +93,8 @@
>
> static int __init register_node_type(void)
> {
> + devclass_register(&node_devclass);
> driver_register(&node_driver);
> - return devclass_register(&node_devclass);
> + return 0; //devclass_register(&node_devclass);
> }
> postcore_initcall(register_node_type);
I've submitted this fix separately. It really needs to go into
mainline; without it, NUMA boxen panic.
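For illustration only: the quoted hunk registers the class before the driver but then returns 0 unconditionally. The sketch below keeps that ordering and still passes a failure back to the caller. It is not the fix that was actually submitted; devclass_register_sketch() and driver_register_sketch() are hypothetical stand-ins that only record call order.

```c
#include <assert.h>

/* Hypothetical stand-ins for devclass_register() and driver_register();
 * each one only records that it was called, and in which position. */
static int call_order[2];
static int ncalls;

static int devclass_register_sketch(void)
{
	assert(ncalls < 2);
	call_order[ncalls++] = 'c';
	return 0;
}

static int driver_register_sketch(void)
{
	assert(ncalls < 2);
	call_order[ncalls++] = 'd';
	return 0;
}

/* Register the class first, then the driver, and report the first
 * failure instead of always returning 0. */
static int register_node_type_sketch(void)
{
	int err = devclass_register_sketch();

	if (err)
		return err;
	return driver_register_sketch();
}
```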
> diff -Nru a/include/asm-i386/numnodes.h b/include/asm-i386/numnodes.h
> --- a/include/asm-i386/numnodes.h Fri Nov 29 13:01:29 2002
> +++ b/include/asm-i386/numnodes.h Fri Nov 29 13:01:29 2002
> @@ -6,7 +6,7 @@
> #ifdef CONFIG_X86_NUMAQ
> #include <asm/numaq.h>
> #else
> -#define MAX_NUMNODES 1
> +#define MAX_NUMNODES CONFIG_X86_NUMNODES
> #endif /* CONFIG_X86_NUMAQ */
>
> #endif /* _ASM_MAX_NUMNODES_H */
I'd like to see some of the confusion with MAX_NR_NODES and MAX_NUMNODES
disappear, but apparently no one liked my patch. I'll dust it off and
resubmit it. I really think it'd be beneficial to only have one
variable for this. I also suppose that there's no reason this has to be
X86 specific. We could have this for most arch's and just default to 1.
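A minimal sketch of what a single constant with a default of 1 could look like, for illustration only. ARCH_MAX_NUMNODES is a hypothetical override name, not an existing kernel symbol, and node_valid()/set_node_mask() exist only to show the constant in use.

```c
#include <assert.h>

/* Hypothetical per-arch override: an arch with more than one node
 * would define this before the generic fallback below is reached. */
/* #define ARCH_MAX_NUMNODES 8 */

#ifdef ARCH_MAX_NUMNODES
#define MAX_NUMNODES ARCH_MAX_NUMNODES
#else
#define MAX_NUMNODES 1	/* generic default: a single node */
#endif

/* Everything else sizes itself from the one constant. */
static unsigned long node_cpu_mask[MAX_NUMNODES];

static int node_valid(int node)
{
	return node >= 0 && node < MAX_NUMNODES;
}

static void set_node_mask(int node, unsigned long mask)
{
	assert(node_valid(node));
	node_cpu_mask[node] = mask;
}
```

Uncommenting the override changes every array and bounds check at once, which is the benefit of having only one such constant.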
>
><snip>
>
> diff -Nru a/include/asm-i386/vic.h b/include/asm-i386/vic.h
> --- a/include/asm-i386/vic.h Fri Nov 29 13:01:29 2002
> +++ b/include/asm-i386/vic.h Fri Nov 29 13:01:29 2002
> @@ -3,6 +3,8 @@
> * Author: J.E.J.Bottomley@HansenPartnership.com
> *
> * Standard include definitions for the NCR Voyager Interrupt Controller */
> +#ifndef _ASM_VIC_H
> +#define _ASM_VIC_H
>
> /* The eight CPI vectors. To activate a CPI, you write a bit mask
> * corresponding to the processor set to be interrupted into the
> @@ -59,3 +61,5 @@
> #define VIC_BOOT_INTERRUPT_MASK 0xfe
>
> extern void smp_vic_timer_interrupt(struct pt_regs *regs);
> +
> +#endif
> diff -Nru a/include/asm-i386/voyager.h b/include/asm-i386/voyager.h
> --- a/include/asm-i386/voyager.h Fri Nov 29 13:01:29 2002
> +++ b/include/asm-i386/voyager.h Fri Nov 29 13:01:29 2002
> @@ -3,6 +3,8 @@
> * Author: J.E.J.Bottomley@HansenPartnership.com
> *
> * Standard include definitions for the NCR Voyager system */
> +#ifndef _ASM_VOYAGER_H_
> +#define _ASM_VOYAGER_H_
>
> #undef VOYAGER_DEBUG
> #undef VOYAGER_CAT_DEBUG
> @@ -519,3 +521,5 @@
> #define VOYAGER_PSI_SUBREAD 2
> #define VOYAGER_PSI_SUBWRITE 3
> extern void voyager_cat_psi(__u8, __u16, __u8 *);
> +
> +#endif
This is voyager specific and I'm sure it's fine.
I'll have a go at a counter-patch tomorrow... ;)
Cheers!
-Matt