* x86_64: 2.6.14 with NUMA panics at boot
@ 2005-10-28 19:26 Janne M O Heikkinen
  2005-10-28 21:06 ` Andi Kleen
  0 siblings, 1 reply; 18+ messages in thread
From: Janne M O Heikkinen @ 2005-10-28 19:26 UTC (permalink / raw)
  To: linux-kernel

With CONFIG_K8_NUMA I get the following right after boot:

PANIC: early exception rip ffffffff8023429f error 0 cr2 0
PANIC: early exception rip ffffffff8011893a error 0 cr2 ffffffffff5fd023

Looking at the System.map 8023429f seems to be find_first_bit
and 8011893a safe_smp_processor_id. When I compile kernel without
K8 NUMA it boots fine but e.g. ATI Radeon driver doesn't work.

Board I'm using is Tyan S2885 with two Opteron 246's and 4GB ram.

^ permalink raw reply	[flat|nested] 18+ messages in thread
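The System.map lookup mentioned above amounts to finding the last symbol whose address is at or below the faulting rip. A minimal sketch of that lookup follows, assuming the usual "address type name" line layout sorted by address. The function name nearest_symbol, the symbol another_symbol and every address in sample_map are made up for illustration; they are not taken from the reporter's System.map.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical System.map excerpt. The start addresses are invented so that
 * the two rip values from the report fall inside the named functions.
 */
static const char sample_map[] =
    "ffffffff80118900 T safe_smp_processor_id\n"
    "ffffffff80118a00 T another_symbol\n"
    "ffffffff80234280 T find_first_bit\n"
    "ffffffff80234300 T find_next_bit\n";

/*
 * Copy into name the last symbol whose address is <= rip.
 * map_text holds lines of the form "<hex addr> <type> <name>", sorted by
 * address. Returns 0 on success, -1 if no symbol starts at or below rip.
 */
static int nearest_symbol(const char *map_text, unsigned long long rip,
                          char *name, size_t name_len)
{
    const char *p = map_text;
    int found = -1;

    assert(name_len > 0);

    while (*p) {
        unsigned long long addr;
        char type;
        char sym[128];
        const char *eol = strchr(p, '\n');

        if (sscanf(p, "%llx %c %127s", &addr, &type, sym) == 3) {
            if (addr > rip)
                break;          /* sorted input: no later line can match */
            snprintf(name, name_len, "%s", sym);
            found = 0;
        }
        if (!eol)
            break;
        p = eol + 1;
    }
    return found;
}
```

Called with sample_map and 0xffffffff8023429f, the sketch reports find_first_bit; a real lookup would read the System.map that belongs to the running kernel build.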
* Re: x86_64: 2.6.14 with NUMA panics at boot
  2005-10-28 19:26 x86_64: 2.6.14 with NUMA panics at boot Janne M O Heikkinen
@ 2005-10-28 21:06 ` Andi Kleen
  2005-10-28 22:06   ` Janne M O Heikkinen
  0 siblings, 1 reply; 18+ messages in thread
From: Andi Kleen @ 2005-10-28 21:06 UTC (permalink / raw)
  To: Janne M O Heikkinen; +Cc: linux-kernel

Janne M O Heikkinen <jmoheikk@cc.helsinki.fi> writes:

> With CONFIG_K8_NUMA I get the following right after boot:
>
> PANIC: early exception rip ffffffff8023429f error 0 cr2 0
> PANIC: early exception rip ffffffff8011893a error 0 cr2 ffffffffff5fd023
>
> Looking at the System.map 8023429f seems to be find_first_bit
> and 8011893a safe_smp_processor_id. When I compile kernel without
> K8 NUMA it boots fine but e.g. ATI Radeon driver doesn't work.
>
> Board I'm using is Tyan S2885 with two Opteron 246's and 4GB ram.

Did earlier kernels work? Please post full log with earlyprintk=vga
or earlyprintk=serial,ttySx,baud

-Andi

^ permalink raw reply	[flat|nested] 18+ messages in thread
* Re: x86_64: 2.6.14 with NUMA panics at boot
  2005-10-28 21:06 ` Andi Kleen
@ 2005-10-28 22:06   ` Janne M O Heikkinen
  2005-10-28 22:21     ` Janne M O Heikkinen
  2005-10-29 10:01     ` Andi Kleen
  0 siblings, 2 replies; 18+ messages in thread
From: Janne M O Heikkinen @ 2005-10-28 22:06 UTC (permalink / raw)
  To: Andi Kleen; +Cc: linux-kernel

On Sat, 28 Oct 2005, Andi Kleen wrote:

> Janne M O Heikkinen <jmoheikk@cc.helsinki.fi> writes:
>
>> With CONFIG_K8_NUMA I get the following right after boot:
>> PANIC: early exception rip ffffffff8023429f error 0 cr2 0
>> PANIC: early exception rip ffffffff8011893a error 0 cr2 ffffffffff5fd023
> Did earlier kernels work? Please post full log with earlyprintk=vga
> or earlyprintk=serial,ttySx,baud

2.6.13.4 works just fine, this is what I got with earlyprintk=vga:

Loading K-2.6.14
Bootdata ok (command line is auto BOOT_IMAGE=K-2.6.14 ro root=901 resume=/dev/md0 selinux=0 splash=verbose console=tty0 earlyprintk=vga)
Linux version 2.6.14-smp (jamse@linux) (gcc version 4.0.2) #3 SMP PREEMPT Fri Oct 28 20:49:34 EEST 2005one.
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
 BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 00000000bfff0000 (usable)
 BIOS-e820: 00000000bfff0000 - 00000000bffff000 (ACPI data)
 BIOS-e820: 00000000bffff000 - 00000000c0000000 (ACPI NVS)
 BIOS-e820: 00000000ff780000 - 0000000100000000 (reserved)
 BIOS-e820: 0000000100000000 - 0000000140000000 (usable)
SRAT: PXM 0 -> APIC 0 -> CPU 0 -> Node 0
SRAT: PXM 1 -> APIC 1 -> CPU 1 -> Node 1
SRAT: Node 0 PXM 0 100000-7fffffff
SRAT: Node 1 PXM 1 80000000-bfffffff
SRAT: Node 1 PXM 1 80000000-13fffffff
SRAT: Node 0 PXM 0 0-7fffffff
Bootmem setup node 0 0000000000000000-000000007fffffff
PANIC: early exception rip ffffffff8023429f error 0 cr2 0
PANIC: early exception rip ffffffff8011893a error 0 cr2 ffffffffff5fd023

^ permalink raw reply	[flat|nested] 18+ messages in thread
* Re: x86_64: 2.6.14 with NUMA panics at boot
  2005-10-28 22:06   ` Janne M O Heikkinen
@ 2005-10-28 22:21     ` Janne M O Heikkinen
  2005-10-29 10:01     ` Andi Kleen
  1 sibling, 0 replies; 18+ messages in thread
From: Janne M O Heikkinen @ 2005-10-28 22:21 UTC (permalink / raw)
  To: Andi Kleen; +Cc: linux-kernel

On Sat, 29 Oct 2005, Janne M O Heikkinen wrote:

There was one line missing from that post, it should have been:

...
 BIOS-e820: 0000000100000000 - 0000000140000000 (usable)
Kernel direct mapping tables upto ffff810140000000 @ 8000 - e0000
SRAT: PXM 0 -> APIC 0 -> CPU 0 -> Node 0
...

^ permalink raw reply	[flat|nested] 18+ messages in thread
* Re: x86_64: 2.6.14 with NUMA panics at boot 2005-10-28 22:06 ` Janne M O Heikkinen 2005-10-28 22:21 ` Janne M O Heikkinen @ 2005-10-29 10:01 ` Andi Kleen 2005-10-29 11:08 ` Janne M O Heikkinen 1 sibling, 1 reply; 18+ messages in thread From: Andi Kleen @ 2005-10-29 10:01 UTC (permalink / raw) To: Janne M O Heikkinen; +Cc: linux-kernel On Saturday 29 October 2005 00:06, Janne M O Heikkinen wrote: > On Sat, 28 Oct 2005, Andi Kleen wrote: > > Janne M O Heikkinen <jmoheikk@cc.helsinki.fi> writes: > >> With CONFIG_K8_NUMA I get the following right after boot: > >> PANIC: early exception rip ffffffff8023429f error 0 cr2 0 > >> PANIC: early exception rip ffffffff8011893a error 0 cr2 ffffffffff5fd023 > > > > Did earlier kernels work? Please post full log with earlyprintk=vga > > or earlyprintk=serial,ttySx,baud > > 2.6.13.4 works just fine, this is what I got with earlyprintk=vga: > > Loading K-2.6.14 > Bootdata ok (command line is auto BOOT_IMAGE=K-2.6.14 ro root=901 > resume=/dev/md0 selinux=0 splash=verbose console=tty0 earlyprintk=vga) > Linux version 2.6.14-smp (jamse@linux) (gcc version 4.0.2) #3 SMP > PREEMPT Fri Oct 28 20:49:34 EEST 2005one. 
> BIOS-provided physical RAM map: > BIOS-e820: 0000000000000000 - 000000000009fc00 (usable) > BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved) > BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved) > BIOS-e820: 0000000000100000 - 00000000bfff0000 (usable) > BIOS-e820: 00000000bfff0000 - 00000000bffff000 (ACPI data) > BIOS-e820: 00000000bffff000 - 00000000c0000000 (ACPI NVS) > BIOS-e820: 00000000ff780000 - 0000000100000000 (reserved) > BIOS-e820: 0000000100000000 - 0000000140000000 (usable) > SRAT: PXM 0 -> APIC 0 -> CPU 0 -> Node 0 > SRAT: PXM 1 -> APIC 1 -> CPU 1 -> Node 1 > SRAT: Node 0 PXM 0 100000-7fffffff > SRAT: Node 1 PXM 1 80000000-bfffffff > SRAT: Node 1 PXM 1 80000000-13fffffff > SRAT: Node 0 PXM 0 0-7fffffff > Bootmem setup node 0 0000000000000000-000000007fffffff > PANIC: early exception rip ffffffff8023429f error 0 cr2 0 > PANIC: early exception rip ffffffff8011893a error 0 cr2 ffffffffff5fd02 And it boots with numa=noacpi ? -Andi ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: x86_64: 2.6.14 with NUMA panics at boot
  2005-10-29 10:01     ` Andi Kleen
@ 2005-10-29 11:08       ` Janne M O Heikkinen
  2005-10-29 14:54         ` Janne M O Heikkinen
  0 siblings, 1 reply; 18+ messages in thread
From: Janne M O Heikkinen @ 2005-10-29 11:08 UTC (permalink / raw)
  To: Andi Kleen; +Cc: linux-kernel

On Sat, 29 Oct 2005, Andi Kleen wrote:

> On Saturday 29 October 2005 00:06, Janne M O Heikkinen wrote:
>> PANIC: early exception rip ffffffff8023429f error 0 cr2 0
>> PANIC: early exception rip ffffffff8011893a error 0 cr2 ffffffffff5fd023
>
> And it boots with numa=noacpi ?

No, I get the same panics with numa=noacpi or even with numa=off. If I compile
the 2.6.14 kernel without CONFIG_ACPI_NUMA it does boot.

^ permalink raw reply	[flat|nested] 18+ messages in thread
* Re: x86_64: 2.6.14 with NUMA panics at boot
  2005-10-29 11:08       ` Janne M O Heikkinen
@ 2005-10-29 14:54         ` Janne M O Heikkinen
  2005-10-29 16:41           ` Andi Kleen
  0 siblings, 1 reply; 18+ messages in thread
From: Janne M O Heikkinen @ 2005-10-29 14:54 UTC (permalink / raw)
  To: Andi Kleen; +Cc: linux-kernel

On Sat, 29 Oct 2005, Janne M O Heikkinen wrote:

> No, I get same panics with numa=noacpi or even with numa=off. If I compile
> 2.6.14 kernel without CONFIG_ACPI_NUMA it does boot.

It wasn't removing CONFIG_ACPI_NUMA that made it boot after all, I had
also changed the memory model from "Sparse" to "Discontiguous". And now
when I recompiled with CONFIG_ACPI_NUMA=y and with the "Discontiguous" memory
model it booted just fine.

^ permalink raw reply	[flat|nested] 18+ messages in thread
* Re: x86_64: 2.6.14 with NUMA panics at boot 2005-10-29 14:54 ` Janne M O Heikkinen @ 2005-10-29 16:41 ` Andi Kleen 2005-10-29 17:30 ` Dave Hansen 0 siblings, 1 reply; 18+ messages in thread From: Andi Kleen @ 2005-10-29 16:41 UTC (permalink / raw) To: Janne M O Heikkinen; +Cc: linux-kernel, haveblue On Saturday 29 October 2005 16:54, Janne M O Heikkinen wrote: > On Sat, 29 Oct 2005, Janne M O Heikkinen wrote: > > > No, I get same panics with numa=noacpi or even with numa=off. If I compile > > 2.6.14 kernel without CONFIG_ACPI_NUMA it does boot. > > It wasn't removing of CONFIG_ACPI_NUMA that made it boot after all, I had > also changed memory model from "Sparse" to "Discontiguous". And now > when I recompiled with CONFIG_ACPI_NUMA=y and with "Discontiguous" memory > model it booted just fine. Ok, that would explain it. I never test sparse, only discontiguous. sparse is only an experimental option that is not really maintained yet. Probably need to disable it if it's broken. Perhaps Dave H. knows what to do with it. -Andi ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: x86_64: 2.6.14 with NUMA panics at boot 2005-10-29 16:41 ` Andi Kleen @ 2005-10-29 17:30 ` Dave Hansen 2005-10-31 0:17 ` Bob Picco 0 siblings, 1 reply; 18+ messages in thread From: Dave Hansen @ 2005-10-29 17:30 UTC (permalink / raw) To: Andi Kleen; +Cc: Janne M O Heikkinen, Linux Kernel Mailing List On Sat, 2005-10-29 at 18:41 +0200, Andi Kleen wrote: > On Saturday 29 October 2005 16:54, Janne M O Heikkinen wrote: > > On Sat, 29 Oct 2005, Janne M O Heikkinen wrote: > > > > > No, I get same panics with numa=noacpi or even with numa=off. If I compile > > > 2.6.14 kernel without CONFIG_ACPI_NUMA it does boot. > > > > It wasn't removing of CONFIG_ACPI_NUMA that made it boot after all, I had > > also changed memory model from "Sparse" to "Discontiguous". And now > > when I recompiled with CONFIG_ACPI_NUMA=y and with "Discontiguous" memory > > model it booted just fine. > > Ok, that would explain it. I never test sparse, only discontiguous. > sparse is only an experimental option that is not really maintained > yet. Probably need to disable it if it's broken. > > Perhaps Dave H. knows what to do with it. I'll try to dig up an Opteron machine on Monday and see what I can do. -- Dave ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: x86_64: 2.6.14 with NUMA panics at boot 2005-10-29 17:30 ` Dave Hansen @ 2005-10-31 0:17 ` Bob Picco 2005-10-31 2:12 ` Andi Kleen 0 siblings, 1 reply; 18+ messages in thread From: Bob Picco @ 2005-10-31 0:17 UTC (permalink / raw) To: Dave Hansen; +Cc: Andi Kleen, Janne M O Heikkinen, Linux Kernel Mailing List Dave Hansen wrote: [Sat Oct 29 2005, 01:30:17PM EDT] > On Sat, 2005-10-29 at 18:41 +0200, Andi Kleen wrote: > > On Saturday 29 October 2005 16:54, Janne M O Heikkinen wrote: > > > On Sat, 29 Oct 2005, Janne M O Heikkinen wrote: > > > > > > > No, I get same panics with numa=noacpi or even with numa=off. If I compile > > > > 2.6.14 kernel without CONFIG_ACPI_NUMA it does boot. > > > > > > It wasn't removing of CONFIG_ACPI_NUMA that made it boot after all, I had > > > also changed memory model from "Sparse" to "Discontiguous". And now > > > when I recompiled with CONFIG_ACPI_NUMA=y and with "Discontiguous" memory > > > model it booted just fine. > > > > Ok, that would explain it. I never test sparse, only discontiguous. > > sparse is only an experimental option that is not really maintained > > yet. Probably need to disable it if it's broken. > > > > Perhaps Dave H. knows what to do with it. > > I'll try to dig up an Opteron machine on Monday and see what I can do. > > -- Dave Dave, This is a slightly modified patch I used on x86_64 for EXTREME testing. The original 2.6.13-rc1-mhp1 patch didn't apply cleanly against 2.6.14. It will apply with this untested patch. The patch needs to have arch_sparse_init which is only active for SPARSEMEM. This patch was just for testing EXTREME on x86_64 NUMA and needs review. I think the bootmem allocator is being used before initialized. This wouldn't have happened before SPARSEMEM_EXTREME became the default. If you feel my analysis is correct, I'll generate a cleaner patch and test on my 4 way. 
bob

Index: linux-2.6.14/arch/x86_64/mm/numa.c
===================================================================
--- linux-2.6.14.orig/arch/x86_64/mm/numa.c	2005-10-28 14:24:58.000000000 -0400
+++ linux-2.6.14/arch/x86_64/mm/numa.c	2005-10-30 18:49:20.000000000 -0500
@@ -94,7 +94,6 @@ void __init setup_node_bootmem(int nodei
 	start_pfn = start >> PAGE_SHIFT;
 	end_pfn = end >> PAGE_SHIFT;
 
-	memory_present(nodeid, start_pfn, end_pfn);
 	nodedata_phys = find_e820_area(start, end, pgdat_size);
 	if (nodedata_phys == -1L)
 		panic("Cannot find memory pgdat in node %d\n", nodeid);
@@ -280,9 +279,14 @@ unsigned long __init numa_free_all_bootm
 void __init paging_init(void)
 {
 	int i;
-	for_each_online_node(i) {
+
+	for_each_online_node(i)
+		memory_present(node_start_pfn(i), node_end_pfn(i));
+
+	sparse_init();
+
+	for_each_online_node(i)
 		setup_node_zones(i);
-	}
 }
 
 /* [numa=off] */
Index: linux-2.6.14/arch/x86_64/kernel/setup.c
===================================================================
--- linux-2.6.14.orig/arch/x86_64/kernel/setup.c	2005-10-28 14:24:58.000000000 -0400
+++ linux-2.6.14/arch/x86_64/kernel/setup.c	2005-10-30 18:50:05.000000000 -0500
@@ -657,8 +657,6 @@ void __init setup_arch(char **cmdline_p)
 	}
 #endif
 
-	sparse_init();
-
 	paging_init();
 
 	check_ioapic();

^ permalink raw reply	[flat|nested] 18+ messages in thread
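The ordering problem Bob describes ("the bootmem allocator is being used before initialized") can be illustrated with a toy model. This is only a sketch under stated assumptions: it presumes that, with SPARSEMEM_EXTREME, memory_present() has to allocate part of the section table from the boot-time allocator, which is how the analysis above reads. Every toy_ name below is hypothetical and none of this is the kernel's bootmem or sparsemem API.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a boot-time allocator; NOT the kernel's bootmem. */
struct toy_bootmem {
    int initialized;            /* set once the allocator has been set up */
    size_t used;                /* bytes handed out so far */
    unsigned char pool[4096];   /* backing storage */
};

static void toy_bootmem_init(struct toy_bootmem *bm)
{
    assert(bm != NULL);
    bm->initialized = 1;
    bm->used = 0;
}

/* Returns NULL if called before toy_bootmem_init() or if the pool is full. */
static void *toy_alloc(struct toy_bootmem *bm, size_t size)
{
    assert(bm != NULL);
    if (!bm->initialized || size > sizeof(bm->pool) - bm->used)
        return NULL;

    void *p = bm->pool + bm->used;
    bm->used += size;
    return p;
}

/*
 * Toy memory_present(): models a section table that is allocated on demand,
 * so it depends on the allocator being ready.
 * Returns 0 on success, -1 if the allocation failed.
 */
static int toy_memory_present(struct toy_bootmem *bm)
{
    return toy_alloc(bm, 64) ? 0 : -1;
}
```

In the model the early call fails cleanly with -1; the real kernel has no such check at that point, which would be consistent with the early exception at cr2 0. Calling toy_memory_present() only after toy_bootmem_init() mirrors what the patch does by moving memory_present() and sparse_init() into paging_init().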
* Re: x86_64: 2.6.14 with NUMA panics at boot 2005-10-31 0:17 ` Bob Picco @ 2005-10-31 2:12 ` Andi Kleen 2005-10-31 1:40 ` Bob Picco 2005-11-02 5:07 ` Martin J. Bligh 0 siblings, 2 replies; 18+ messages in thread From: Andi Kleen @ 2005-10-31 2:12 UTC (permalink / raw) To: Bob Picco; +Cc: Dave Hansen, Janne M O Heikkinen, Linux Kernel Mailing List On Monday 31 October 2005 01:17, Bob Picco wrote: > This is a slightly modified patch I used on x86_64 for EXTREME testing. The > original 2.6.13-rc1-mhp1 patch didn't apply cleanly against 2.6.14. It will > apply with this untested patch. The patch needs to have arch_sparse_init > which is only active for SPARSEMEM. This patch was just for testing EXTREME > on x86_64 NUMA and needs review. > > I think the bootmem allocator is being used before initialized. This > wouldn't have happened before SPARSEMEM_EXTREME became the default. > > If you feel my analysis is correct, I'll generate a cleaner patch and > test on my 4 way. Ok the question is - why did nobody submit this patch in time? When sparse was merged I assumed folks would actually test and maintain it. But that doesn't seem to be the case? Somewhat surprising. I personally don't care much about sparsemem right now because it doesn't have any advantage and if it's unmaintained would consider to mark it CONFIG_BROKEN. That's simply because we can't have highly experimental CONFIGs in a production kernel that unsuspecting users can just set and break their configuration. Dave, is there someone in charge for sparsemem on x86-64? -Andi ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: x86_64: 2.6.14 with NUMA panics at boot 2005-10-31 2:12 ` Andi Kleen @ 2005-10-31 1:40 ` Bob Picco 2005-10-31 3:28 ` Andi Kleen 2005-11-02 5:07 ` Martin J. Bligh 1 sibling, 1 reply; 18+ messages in thread From: Bob Picco @ 2005-10-31 1:40 UTC (permalink / raw) To: Andi Kleen Cc: Bob Picco, Dave Hansen, Janne M O Heikkinen, matthew.e.tolentino, Linux Kernel Mailing List Added Matt to cc: Andi Kleen wrote: [Sun Oct 30 2005, 09:12:17PM EST] > On Monday 31 October 2005 01:17, Bob Picco wrote: > > > This is a slightly modified patch I used on x86_64 for EXTREME testing. The > > original 2.6.13-rc1-mhp1 patch didn't apply cleanly against 2.6.14. It will > > apply with this untested patch. The patch needs to have arch_sparse_init > > which is only active for SPARSEMEM. This patch was just for testing EXTREME > > on x86_64 NUMA and needs review. > > > > I think the bootmem allocator is being used before initialized. This > > wouldn't have happened before SPARSEMEM_EXTREME became the default. > > > > If you feel my analysis is correct, I'll generate a cleaner patch and > > test on my 4 way. > > Ok the question is - why did nobody submit this patch in time? When > sparse was merged I assumed folks would actually test and maintain > it. But that doesn't seem to be the case? Somewhat surprising. Well I did post it on lhms mailing list. However it's incomplete because it doesn't address !NUMA. I used it specifically for looking at performance regression as a result of SPARSEMEM_EXTREME which we were analyzing at that time. It wasn't intended for inclusion. Also EXTREME came later in the initial SPARSEMEM submission to address very sparse arch platforms. So I think it slipped by us; at least me. > > I personally don't care much about sparsemem right now because it doesn't have > any advantage and if it's unmaintained would consider to mark it > CONFIG_BROKEN. 
That's simply because we can't have highly experimental > CONFIGs in a production kernel that unsuspecting users can just set and break > their configuration. > > Dave, is there someone in charge for sparsemem on x86-64? Well I think Matt (matthew.e.tolentino@intel.com) is maintaining but could be wrong. I'll pick it up should Matt not have the time or no other volunteer come forward. > > -Andi bob ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: x86_64: 2.6.14 with NUMA panics at boot
  2005-10-31  1:40                   ` Bob Picco
@ 2005-10-31  3:28                     ` Andi Kleen
  2005-10-31  2:46                       ` Bob Picco
  0 siblings, 1 reply; 18+ messages in thread
From: Andi Kleen @ 2005-10-31 3:28 UTC (permalink / raw)
  To: Bob Picco
  Cc: Dave Hansen, Janne M O Heikkinen, matthew.e.tolentino, Linux Kernel Mailing List

On Monday 31 October 2005 02:40, Bob Picco wrote:
> Added Matt to cc:
> Andi Kleen wrote:	[Sun Oct 30 2005, 09:12:17PM EST]
>
> > On Monday 31 October 2005 01:17, Bob Picco wrote:
> > > This is a slightly modified patch I used on x86_64 for EXTREME testing.
> > > The original 2.6.13-rc1-mhp1 patch didn't apply cleanly against 2.6.14.
> > > It will apply with this untested patch. The patch needs to have
> > > arch_sparse_init which is only active for SPARSEMEM. This patch was
> > > just for testing EXTREME on x86_64 NUMA and needs review.
> > >
> > > I think the bootmem allocator is being used before initialized. This
> > > wouldn't have happened before SPARSEMEM_EXTREME became the default.
> > >
> > > If you feel my analysis is correct, I'll generate a cleaner patch and
> > > test on my 4 way.
> >
> > Ok the question is - why did nobody submit this patch in time? When
> > sparse was merged I assumed folks would actually test and maintain
> > it. But that doesn't seem to be the case? Somewhat surprising.
>
> Well I did post it on lhms mailing list.

Fixes for code that is in mainline need to go to the appropriate mainline
mailing list (for x86-64 that is l-k and discuss@x86-64.org) and maintainers.

> However it's incomplete because
> it doesn't address !NUMA.

So I should not apply it yet?

-Andi

^ permalink raw reply	[flat|nested] 18+ messages in thread
* Re: x86_64: 2.6.14 with NUMA panics at boot 2005-10-31 3:28 ` Andi Kleen @ 2005-10-31 2:46 ` Bob Picco 0 siblings, 0 replies; 18+ messages in thread From: Bob Picco @ 2005-10-31 2:46 UTC (permalink / raw) To: Andi Kleen Cc: Bob Picco, Dave Hansen, Janne M O Heikkinen, matthew.e.tolentino, Linux Kernel Mailing List Andi Kleen wrote: [Sun Oct 30 2005, 10:28:49PM EST] > On Monday 31 October 2005 02:40, Bob Picco wrote: > > Added Matt to cc: > > Andi Kleen wrote: [Sun Oct 30 2005, 09:12:17PM EST] > > > > > On Monday 31 October 2005 01:17, Bob Picco wrote: > > > > This is a slightly modified patch I used on x86_64 for EXTREME testing. > > > > The original 2.6.13-rc1-mhp1 patch didn't apply cleanly against 2.6.14. > > > > It will apply with this untested patch. The patch needs to have > > > > arch_sparse_init which is only active for SPARSEMEM. This patch was > > > > just for testing EXTREME on x86_64 NUMA and needs review. > > > > > > > > I think the bootmem allocator is being used before initialized. This > > > > wouldn't have happened before SPARSEMEM_EXTREME became the default. > > > > > > > > If you feel my analysis is correct, I'll generate a cleaner patch and > > > > test on my 4 way. > > > > > > Ok the question is - why did nobody submit this patch in time? When > > > sparse was merged I assumed folks would actually test and maintain > > > it. But that doesn't seem to be the case? Somewhat surprising. > > > > Well I did post it on lhms mailing list. > > Fixes for code that is in mainline needs to go to the appropiate mainline > mailing list (for x86-64 that is l-k and discuss@x86-64.org) and maintainers. Well it wasn't intended for inclusion. I was just trying to help Dave out by not pursuing an issue which I've already looked at some. > > > However it's incomplete because > > it doesn't address !NUMA. > > So i should not apply it yet? Nope. It's incomplete. I'll wait to see whether Matt is on this. 
Otherwise, I'll put a patch together and test it within the next couple of days. > > -Andi > bob ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: x86_64: 2.6.14 with NUMA panics at boot 2005-10-31 2:12 ` Andi Kleen 2005-10-31 1:40 ` Bob Picco @ 2005-11-02 5:07 ` Martin J. Bligh 2005-11-02 18:08 ` Andy Whitcroft 1 sibling, 1 reply; 18+ messages in thread From: Martin J. Bligh @ 2005-11-02 5:07 UTC (permalink / raw) To: Andi Kleen, Bob Picco, Andy Whitcroft Cc: Dave Hansen, Janne M O Heikkinen, Linux Kernel Mailing List --Andi Kleen <ak@suse.de> wrote (on Monday, October 31, 2005 03:12:17 +0100): > On Monday 31 October 2005 01:17, Bob Picco wrote: > >> This is a slightly modified patch I used on x86_64 for EXTREME testing. The >> original 2.6.13-rc1-mhp1 patch didn't apply cleanly against 2.6.14. It will >> apply with this untested patch. The patch needs to have arch_sparse_init >> which is only active for SPARSEMEM. This patch was just for testing EXTREME >> on x86_64 NUMA and needs review. >> >> I think the bootmem allocator is being used before initialized. This >> wouldn't have happened before SPARSEMEM_EXTREME became the default. >> >> If you feel my analysis is correct, I'll generate a cleaner patch and >> test on my 4 way. > > Ok the question is - why did nobody submit this patch in time? When > sparse was merged I assumed folks would actually test and maintain > it. But that doesn't seem to be the case? Somewhat surprising. > > I personally don't care much about sparsemem right now because it doesn't have > any advantage and if it's unmaintained would consider to mark it > CONFIG_BROKEN. That's simply because we can't have highly experimental > CONFIGs in a production kernel that unsuspecting users can just set and break > their configuration. > > Dave, is there someone in charge for sparsemem on x86-64? Sparsemem is Andy's baby. He is duly cc'ed. M. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: x86_64: 2.6.14 with NUMA panics at boot
  2005-11-02  5:07                 ` Martin J. Bligh
@ 2005-11-02 18:08                   ` Andy Whitcroft
  2005-11-03 14:43                     ` Bob Picco
  0 siblings, 1 reply; 18+ messages in thread
From: Andy Whitcroft @ 2005-11-02 18:08 UTC (permalink / raw)
  To: Martin J. Bligh
  Cc: Andi Kleen, Bob Picco, Dave Hansen, Janne M O Heikkinen, Linux Kernel Mailing List

Martin J. Bligh wrote:
>
> --Andi Kleen <ak@suse.de> wrote (on Monday, October 31, 2005 03:12:17 +0100):
>
>
>>On Monday 31 October 2005 01:17, Bob Picco wrote:

>>Ok the question is - why did nobody submit this patch in time? When
>>sparse was merged I assumed folks would actually test and maintain
>>it. But that doesn't seem to be the case? Somewhat surprising.

We are actively maintaining sparsemem. But we do seem to have fallen
short on the testing front on some of the architectures. I'm looking
right now into getting some automated testing sorted out for SPARSEMEM
specifically so that we catch this stuff much earlier in the pipeline,
as it's much simpler for us to find the earlier a problem appears.

>>I personally don't care much about sparsemem right now because it doesn't have
>>any advantage and if it's unmaintained would consider to mark it
>>CONFIG_BROKEN. That's simply because we can't have highly experimental
>>CONFIGs in a production kernel that unsuspecting users can just set and break
>>their configuration.
>>
>>Dave, is there someone in charge for sparsemem on x86-64?

I had assumed that it was being maintained, but it's not obvious from
this thread that we're all on the same page. But we'll find out and get
that sorted.

-apw

^ permalink raw reply	[flat|nested] 18+ messages in thread
* Re: x86_64: 2.6.14 with NUMA panics at boot
  2005-11-02 18:08                   ` Andy Whitcroft
@ 2005-11-03 14:43                     ` Bob Picco
  2005-11-03 17:06                       ` Andy Whitcroft
  0 siblings, 1 reply; 18+ messages in thread
From: Bob Picco @ 2005-11-03 14:43 UTC (permalink / raw)
  To: Andy Whitcroft
  Cc: Martin J. Bligh, Andi Kleen, Bob Picco, Dave Hansen, Janne M O Heikkinen, matthew.e.tolentino, discuss, Linux Kernel Mailing List

Added Matt and discuss@x86-64.org to cc:
Andy Whitcroft wrote:	[Wed Nov 02 2005, 01:08:03PM EST]
> Martin J. Bligh wrote:
> >
> > --Andi Kleen <ak@suse.de> wrote (on Monday, October 31, 2005 03:12:17 +0100):
> >
> >
> >>On Monday 31 October 2005 01:17, Bob Picco wrote:
>
> >>Ok the question is - why did nobody submit this patch in time? When
> >>sparse was merged I assumed folks would actually test and maintain
> >>it. But that doesn't seem to be the case? Somewhat surprising.
>
> We are actively maintaining sparsemem. But we do seem to have fallen
> short on the testing front on some of the architectures. I'm looking
> right now into getting some automated testing sorted out for SPARSEMEM
> specifically so that we catch this stuff much earlier in the pipeline,
> as it's much simpler for us to find the earlier a problem appears.
>
> >>I personally don't care much about sparsemem right now because it doesn't have
> >>any advantage and if it's unmaintained would consider to mark it
> >>CONFIG_BROKEN. That's simply because we can't have highly experimental
> >>CONFIGs in a production kernel that unsuspecting users can just set and break
> >>their configuration.
> >>
> >>Dave, is there someone in charge for sparsemem on x86-64?
>
> I had assumed that it was being maintained, but it's not obvious from
> this thread that we're all on the same page. But we'll find out and get
> that sorted.
>
> -apw
> -
Matt responded to a private that I posted to Dave and Matt. Matt is
traveling and told me to go ahead and post a fix.

I removed memory_present called from the FLATMEM routine contig_initmem_init.
Otherwise my original quick patch used for testing SPARSEMEM EXTREME
was nearly complete.

I've boot tested all three configurations (SPARSEMEM, DISCONTIGMEM and CONTIG)
on my DL585 (4 node machine).

bob

Signed-off-by: Bob Picco <bob.picco@hp.com>

 arch/x86_64/kernel/setup.c |    3 ---
 arch/x86_64/mm/numa.c      |   18 +++++++++++++++++-
 2 files changed, 17 insertions(+), 4 deletions(-)

Index: linux-2.6.14/arch/x86_64/kernel/setup.c
===================================================================
--- linux-2.6.14.orig/arch/x86_64/kernel/setup.c	2005-10-30 20:14:11.000000000 -0500
+++ linux-2.6.14/arch/x86_64/kernel/setup.c	2005-11-02 14:23:18.000000000 -0500
@@ -412,7 +412,6 @@ contig_initmem_init(unsigned long start_
 {
 	unsigned long bootmap_size, bootmap;
 
-	memory_present(0, start_pfn, end_pfn);
 	bootmap_size = bootmem_bootmap_pages(end_pfn)<<PAGE_SHIFT;
 	bootmap = find_e820_area(0, end_pfn<<PAGE_SHIFT, bootmap_size);
 	if (bootmap == -1L)
@@ -657,8 +656,6 @@ void __init setup_arch(char **cmdline_p)
 	}
 #endif
 
-	sparse_init();
-
 	paging_init();
 
 	check_ioapic();
Index: linux-2.6.14/arch/x86_64/mm/numa.c
===================================================================
--- linux-2.6.14.orig/arch/x86_64/mm/numa.c	2005-10-31 11:31:02.000000000 -0500
+++ linux-2.6.14/arch/x86_64/mm/numa.c	2005-11-02 17:35:02.000000000 -0500
@@ -94,7 +94,6 @@ void __init setup_node_bootmem(int nodei
 	start_pfn = start >> PAGE_SHIFT;
 	end_pfn = end >> PAGE_SHIFT;
 
-	memory_present(nodeid, start_pfn, end_pfn);
 	nodedata_phys = find_e820_area(start, end, pgdat_size);
 	if (nodedata_phys == -1L)
 		panic("Cannot find memory pgdat in node %d\n", nodeid);
@@ -277,9 +276,26 @@ unsigned long __init numa_free_all_bootm
 	return pages;
 }
 
+#ifdef CONFIG_SPARSEMEM
+static void __init arch_sparse_init(void)
+{
+	int i;
+
+	for_each_online_node(i)
+		memory_present(i, node_start_pfn(i), node_end_pfn(i));
+
+	sparse_init();
+}
+#else
+#define arch_sparse_init() do {} while (0)
+#endif
+
 void __init paging_init(void)
 {
 	int i;
+
+	arch_sparse_init();
+
 	for_each_online_node(i) {
 		setup_node_zones(i);
 	}

^ permalink raw reply	[flat|nested] 18+ messages in thread
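The patch above compiles arch_sparse_init() out with the "do {} while (0)" stub when CONFIG_SPARSEMEM is not set, so paging_init() needs no #ifdef of its own. A stand-alone sketch of the same idiom follows; DEMO_SPARSEMEM, demo_sparse_init and demo_paging_init are hypothetical names used only for illustration, not kernel symbols.

```c
#include <assert.h>

static int sparse_init_calls;   /* counts real calls, for illustration only */

#ifdef DEMO_SPARSEMEM            /* hypothetical stand-in for CONFIG_SPARSEMEM */
static void demo_sparse_init(void)
{
    sparse_init_calls++;
}
#else
/* Expands to a single empty statement, so the trailing ';' stays valid. */
#define demo_sparse_init() do {} while (0)
#endif

/*
 * The caller is written once for both configurations. Because the stub is a
 * complete statement, it is safe as the only body of an if with an else.
 */
static int demo_paging_init(int take_branch)
{
    assert(take_branch == 0 || take_branch == 1);

    if (take_branch)
        demo_sparse_init();
    else
        return -1;
    return 0;
}
```

An empty macro would also compile here, but some compilers warn about an empty if body; the do/while form avoids that and always behaves like one ordinary statement.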
* Re: x86_64: 2.6.14 with NUMA panics at boot 2005-11-03 14:43 ` Bob Picco @ 2005-11-03 17:06 ` Andy Whitcroft 0 siblings, 0 replies; 18+ messages in thread From: Andy Whitcroft @ 2005-11-03 17:06 UTC (permalink / raw) To: Bob Picco Cc: Martin J. Bligh, Andi Kleen, Dave Hansen, Janne M O Heikkinen, matthew.e.tolentino, discuss, Linux Kernel Mailing List Bob Picco wrote: > Matt responded to a private that I posted to Dave and Matt. Matt is > traveling and told me to go ahead and post a fix. > > I removed memory_present called from the FLATMEM routine contig_initmem_init. > Otherwise my original quick patch used for testing SPARSEMEM EXTREME > was nearly complete. > > I've boot tested all three configurations (SPARSEMEM, DISCONTIGMEM and CONTIG) > on my DL585 (4 node machine). I'll test on it and let you know if it works for me too. Thanks. -apw ^ permalink raw reply [flat|nested] 18+ messages in thread