Well, after lots of aggravation I finally got the dusty old numa box to boot
up, and the upstream linux-2.6 kernel works fine.  (I've attached the startup
log showing 8 cores on 4 nodes.)

The big difference appears to be where the memory is located.  Your box has
all of its memory on node 0, whereas my box has memory on all the nodes...

From the working log:

SRAT: PXM 0 -> APIC 0 -> Node 0
SRAT: PXM 0 -> APIC 1 -> Node 0
SRAT: PXM 1 -> APIC 2 -> Node 1
SRAT: PXM 1 -> APIC 3 -> Node 1
SRAT: PXM 2 -> APIC 4 -> Node 2
SRAT: PXM 2 -> APIC 5 -> Node 2
SRAT: PXM 3 -> APIC 6 -> Node 3
SRAT: PXM 3 -> APIC 7 -> Node 3
SRAT: Node 0 PXM 0 0-a0000
SRAT: Node 0 PXM 0 0-e4000000
SRAT: Node 0 PXM 0 0-200000000
SRAT: Node 1 PXM 1 200000000-400000000
SRAT: Node 2 PXM 2 400000000-600000000
SRAT: Node 3 PXM 3 600000000-800000000

From the failing log:

SRAT: PXM 0 -> APIC 0 -> Node 0
SRAT: PXM 0 -> APIC 1 -> Node 0
SRAT: PXM 1 -> APIC 2 -> Node 1
SRAT: PXM 1 -> APIC 3 -> Node 1
SRAT: Node 0 PXM 0 0-40000000

The message that cpus 2 & 3 have no node is therefore misleading; it should
say that cpus 2 & 3 have no "node local memory".  But other than that, it
should allocate the PERCPU memory on node 0 and everything's fine.

Apparently, then, the only way to debug this is to fake a setup where some
cpus have no node local memory and go from there.  Unfortunately, the box I'm
testing on has no working remote access.  I'll see if I can fake it out on a
non-numa box until the lab is back open on Tuesday.

Thanks,
Mike

Mel Gorman wrote:
> On (14/02/08 12:41), Mike Travis didst pronounce:
>> Mel Gorman wrote:
...
>>>>
>>> According to git-bisect, the problem patch is below. It doesn't back out
>>> cleanly so I haven't verified for sure the bisect is correct yet.
>> This might make sense.  This code is in preparation for the extended
>> apic's available on the new processors.
>> I've tested the code with
>> our simulator (with no errors) and I'm setting up to test on a real
>> machine that has multiple numa nodes.  I wonder if maybe BIOS is not
>> providing correct node data, or the ACPI parsing is in error?  You
>> might try adding "apic=debug" to the boot command line.
>>
>
> I tried this, but the dmesg complained about a malformed option. I'll
> check out why tomorrow but it didn't appear particularly helpful.
>
>> For the short term, we can remove this patch if it's causing the
>> problem.  A more complete patch will be available soon that contains
>> the entire set of x2apic changes.
>>
>
> If you send me patches to apply on top of 2.6.25-rc1, I'll give them a spin
> on the machine in question. Reverting didn't work out very well as there are
> too many collisions with patches that were applied later. I eventually got
> the machine booting but it only succeeds because it only brings up one core
> on each processor. The patch, which is pretty brain damaged is below in case
> it helps you guess what the real problem is. dmesg logs are attached of the
> vanilla failure with acpi=debug and the log with the patch applied showing
> "__cpu_up: bad cpu 1" and "__cpu_up: bad cpu3" (i.e. the second cores of
> each machine).
>
>
> diff -ru linux-2.6/arch/x86/kernel/genapic_64.c linux-2.6-working/arch/x86/kernel/genapic_64.c
> --- linux-2.6/arch/x86/kernel/genapic_64.c 2008-02-14 16:32:55.000000000 -0600
> +++ linux-2.6-working/arch/x86/kernel/genapic_64.c 2008-02-14 15:46:18.000000000 -0600
> @@ -25,10 +25,10 @@
>  #endif
>
>  /* which logical CPU number maps to which CPU (physical APIC ID) */
> -u16 x86_cpu_to_apicid_init[NR_CPUS] __initdata
> +u8 x86_cpu_to_apicid_init[NR_CPUS] __initdata
>                                         = { [0 ... NR_CPUS-1] = BAD_APICID };
>  void *x86_cpu_to_apicid_early_ptr;
> -DEFINE_PER_CPU(u16, x86_cpu_to_apicid) = BAD_APICID;
> +DEFINE_PER_CPU(u8, x86_cpu_to_apicid) = BAD_APICID;
>  EXPORT_PER_CPU_SYMBOL(x86_cpu_to_apicid);
>
>  struct genapic __read_mostly *genapic = &apic_flat;
> diff -ru linux-2.6/arch/x86/kernel/mpparse_64.c linux-2.6-working/arch/x86/kernel/mpparse_64.c
> --- linux-2.6/arch/x86/kernel/mpparse_64.c 2008-02-14 16:32:55.000000000 -0600
> +++ linux-2.6-working/arch/x86/kernel/mpparse_64.c 2008-02-14 15:45:44.000000000 -0600
> @@ -67,7 +67,7 @@
>  /* Bitmask of physically existing CPUs */
>  physid_mask_t phys_cpu_present_map = PHYSID_MASK_NONE;
>
> -u16 x86_bios_cpu_apicid_init[NR_CPUS] __initdata
> +u8 x86_bios_cpu_apicid_init[NR_CPUS] __initdata
>                                         = { [0 ... NR_CPUS-1] = BAD_APICID };
>  void *x86_bios_cpu_apicid_early_ptr;
>  DEFINE_PER_CPU(u16, x86_bios_cpu_apicid) = BAD_APICID;
> diff -ru linux-2.6/include/asm-x86/smp_64.h linux-2.6-working/include/asm-x86/smp_64.h
> --- linux-2.6/include/asm-x86/smp_64.h 2008-02-14 16:33:04.000000000 -0600
> +++ linux-2.6-working/include/asm-x86/smp_64.h 2008-02-14 15:43:01.000000000 -0600
> @@ -26,15 +26,16 @@
>  extern int smp_call_function_mask(cpumask_t mask, void (*func)(void *),
>                                    void *info, int wait);
>
> -extern u16 __initdata x86_cpu_to_apicid_init[];
> -extern u16 __initdata x86_bios_cpu_apicid_init[];
> +extern u8 __initdata x86_cpu_to_apicid_init[];
> +extern u8 __initdata x86_bios_cpu_apicid_init[];
>  extern void *x86_cpu_to_apicid_early_ptr;
>  extern void *x86_bios_cpu_apicid_early_ptr;
> +DECLARE_PER_CPU(u8, x86_cpu_to_apicid); /* physical ID */
> +extern u8 bios_cpu_apicid[];
>
>  DECLARE_PER_CPU(cpumask_t, cpu_sibling_map);
>  DECLARE_PER_CPU(cpumask_t, cpu_core_map);
>  DECLARE_PER_CPU(u16, cpu_llc_id);
> -DECLARE_PER_CPU(u16, x86_cpu_to_apicid);
>  DECLARE_PER_CPU(u16, x86_bios_cpu_apicid);
>
>  static inline int cpu_present_to_apicid(int mps_cpu)
>
>
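
For reference, the expected behaviour in the "no node local memory" case from the failing log (cpus 2 & 3 on node 1, all memory on node 0) can be sketched in plain userspace C. This is only an illustration under stated assumptions: the tables and the names cpu_to_node_tbl, node_mem_bytes, node_has_memory and percpu_alloc_node are hypothetical and are not the kernel's data structures or API. The fallback policy shown (first node that has memory) is likewise an assumption, not what the kernel actually does.

```c
#include <assert.h>
#include <stdio.h>

#define NCPUS  4
#define NNODES 2

/* Hypothetical tables mirroring the failing log: APIC 0,1 -> node 0,
 * APIC 2,3 -> node 1, and a single memory range 0-40000000 on node 0. */
const int cpu_to_node_tbl[NCPUS] = { 0, 0, 1, 1 };
const unsigned long long node_mem_bytes[NNODES] = { 0x40000000ULL, 0 };

/* Return nonzero if the node has any memory of its own. */
int node_has_memory(int node)
{
	return node_mem_bytes[node] != 0;
}

/* Pick the node to allocate a cpu's per-cpu area from: the cpu's own
 * node if it has memory, otherwise the first node that does (assumed
 * policy).  Returns -1 if no node has memory at all. */
int percpu_alloc_node(int cpu)
{
	int node, n;

	assert(cpu >= 0 && cpu < NCPUS);
	node = cpu_to_node_tbl[cpu];
	if (node_has_memory(node))
		return node;
	for (n = 0; n < NNODES; n++)
		if (node_has_memory(n))
			return n;
	return -1;
}

/* Print, for every cpu, its node and where its per-cpu area would go. */
void report_percpu_nodes(void)
{
	int cpu;

	for (cpu = 0; cpu < NCPUS; cpu++) {
		int node = cpu_to_node_tbl[cpu];

		printf("cpu %d: node %d%s, per-cpu area on node %d\n",
		       cpu, node,
		       node_has_memory(node) ? "" : " (no node local memory)",
		       percpu_alloc_node(cpu));
	}
}
```

Calling report_percpu_nodes() from a small main() shows cpus 2 & 3 flagged as having no node local memory while still getting their per-cpu area from node 0, which is the behaviour described at the top of this mail.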