* 2.6.23 boot failures on x86-64.
@ 2007-10-29 17:50 Dave Jones
2007-10-29 18:07 ` [stable] " Greg KH
2007-10-29 18:18 ` Andi Kleen
0 siblings, 2 replies; 17+ messages in thread
From: Dave Jones @ 2007-10-29 17:50 UTC (permalink / raw)
To: Linux Kernel
Cc: Martin Ebourne, Zou Nan hai, Suresh Siddha, Andi Kleen, stable,
Andrew Morton, Linus Torvalds
We've had a number of people reporting that their x86-64s stopped booting
when they moved to 2.6.23. It rebooted just after discovering the AGP bridge
as a result of the IOMMU init.
Martin tracked this down to the following commit.
commit 2e1c49db4c640b35df13889b86b9d62215ade4b6
Author: Zou Nan hai <nanhai.zou@intel.com>
Date: Fri Jun 1 00:46:28 2007 -0700
x86_64: allocate sparsemem memmap above 4G
On systems with huge amount of physical memory, VFS cache and memory memmap
may eat all available system memory under 4G, then the system may fail to
allocate swiotlb bounce buffer.
There was a fix for this issue in arch/x86_64/mm/numa.c, but that fix dose
not cover sparsemem model.
This patch add fix to sparsemem model by first try to allocate memmap above
4G.
Signed-off-by: Zou Nan hai <nanhai.zou@intel.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Andi Kleen <ak@suse.de>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This should probably be reverted for 2.6.23-stable, and either fixed
properly in .24, or reverted there too.
More info at https://bugzilla.redhat.com/show_bug.cgi?id=249174
Dave
--
http://www.codemonkey.org.uk
* Re: [stable] 2.6.23 boot failures on x86-64.
2007-10-29 17:50 2.6.23 boot failures on x86-64 Dave Jones
@ 2007-10-29 18:07 ` Greg KH
2007-10-29 18:37 ` Linus Torvalds
2007-10-29 18:18 ` Andi Kleen
1 sibling, 1 reply; 17+ messages in thread
From: Greg KH @ 2007-10-29 18:07 UTC (permalink / raw)
To: Dave Jones, Linux Kernel, Martin Ebourne, Zou Nan hai, Suresh Siddha,
Andi Kleen, stable, Andrew Morton, Linus Torvalds

On Mon, Oct 29, 2007 at 01:50:14PM -0400, Dave Jones wrote:
> We've had a number of people reporting that their x86-64s stopped booting
> when they moved to 2.6.23. It rebooted just after discovering the AGP bridge
> as a result of the IOMMU init.
>
> Martin tracked this down to the following commit.
>
>
> commit 2e1c49db4c640b35df13889b86b9d62215ade4b6
> Author: Zou Nan hai <nanhai.zou@intel.com>
> Date: Fri Jun 1 00:46:28 2007 -0700
>
> x86_64: allocate sparsemem memmap above 4G
>
> On systems with huge amount of physical memory, VFS cache and memory memmap
> may eat all available system memory under 4G, then the system may fail to
> allocate swiotlb bounce buffer.
>
> There was a fix for this issue in arch/x86_64/mm/numa.c, but that fix dose
> not cover sparsemem model.
>
> This patch add fix to sparsemem model by first try to allocate memmap above
> 4G.
>
> Signed-off-by: Zou Nan hai <nanhai.zou@intel.com>
> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
> Cc: Andi Kleen <ak@suse.de>
> Cc: <stable@kernel.org>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
>
>
> This should probably be reverted for 2.6.23-stable, and either fixed
> properly in .24, or reverted there too.

I'll be glad to revert it in -stable, if it's also reverted in Linus's
tree first :)

thanks,

greg k-h
* Re: [stable] 2.6.23 boot failures on x86-64.
2007-10-29 18:07 ` [stable] " Greg KH
@ 2007-10-29 18:37 ` Linus Torvalds
2007-10-29 19:51 ` Christoph Lameter
` (3 more replies)
0 siblings, 4 replies; 17+ messages in thread
From: Linus Torvalds @ 2007-10-29 18:37 UTC (permalink / raw)
To: Greg KH
Cc: Dave Jones, Linux Kernel, Martin Ebourne, Zou Nan hai, Suresh Siddha,
Andi Kleen, stable, Andrew Morton, Christoph Lameter, Andy Whitcroft,
Mel Gorman

On Mon, 29 Oct 2007, Greg KH wrote:
>
> I'll be glad to revert it in -stable, if it's also reverted in Linus's
> tree first :)

We've had some changes since 2.6.23, and afaik, the
"alloc_bootmem_high_node()" code is already effectively dead there. It's
only called if CONFIG_SPARSEMEM_VMEMMAP is *not* enabled, and I *think* we
enable it by force on x86-64 these days.

More people added to Cc, just to clarify whether I'm just confused.

Andy, Christoph, Mel: commit 2e1c49db4c640b35df13889b86b9d62215ade4b6 aka
"x86_64: allocate sparsemem memmap above 4G" is the one that causes the
failures, just fyi.

Martin - it would be great if you could try out your failing machine with
2.6.24-rc1 (or a nightly snapshot or current git.. the more recent the
better).

But if I'm right, that commit should be reverted from 2.6.24 just because
it's pointless (even if the bug itself is gone). And if I'm wrong, it
should be reverted. So something like the appended would make sense
regardless.

Can I get a "tested-by"? And/or ack/nack's on my half-arsed theory above?

Linus

--
From: Linus Torvalds <torvalds@woody.linux-foundation.org>

Revert "x86_64: allocate sparsemem memmap above 4G"

This reverts commit 2e1c49db4c640b35df13889b86b9d62215ade4b6, since
testing in Fedora has shown it to cause boot failures, as per Dave
Jones. Bisected down by Martin Ebourne.

Cc: Dave Jones <davej@redhat.com>
Cc: Martin Ebourne <fedora@ebourne.me.uk>
Cc: Zou Nan hai <nanhai.zou@intel.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 1e3862e..a7308b2 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -728,12 +728,6 @@ int in_gate_area_no_task(unsigned long addr)
 	return (addr >= VSYSCALL_START) && (addr < VSYSCALL_END);
 }
 
-void * __init alloc_bootmem_high_node(pg_data_t *pgdat, unsigned long size)
-{
-	return __alloc_bootmem_core(pgdat->bdata, size,
-			SMP_CACHE_BYTES, (4UL*1024*1024*1024), 0);
-}
-
 const char *arch_vma_name(struct vm_area_struct *vma)
 {
 	if (vma->vm_mm && vma->vm_start == (long)vma->vm_mm->context.vdso)
diff --git a/include/linux/bootmem.h b/include/linux/bootmem.h
index c83534e..0365ec9 100644
--- a/include/linux/bootmem.h
+++ b/include/linux/bootmem.h
@@ -59,7 +59,6 @@ extern void *__alloc_bootmem_core(struct bootmem_data *bdata,
 					 unsigned long align,
 					 unsigned long goal,
 					 unsigned long limit);
-extern void *alloc_bootmem_high_node(pg_data_t *pgdat, unsigned long size);
 
 #ifndef CONFIG_HAVE_ARCH_BOOTMEM_NODE
 extern void reserve_bootmem(unsigned long addr, unsigned long size);
diff --git a/mm/sparse.c b/mm/sparse.c
index 08fb14f..e06f514 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -220,12 +220,6 @@ static int __meminit sparse_init_one_section(struct mem_section *ms,
 	return 1;
 }
 
-__attribute__((weak)) __init
-void *alloc_bootmem_high_node(pg_data_t *pgdat, unsigned long size)
-{
-	return NULL;
-}
-
 static unsigned long usemap_size(void)
 {
 	unsigned long size_bytes;
@@ -267,11 +261,6 @@ struct page __init *sparse_mem_map_populate(unsigned long pnum, int nid)
 	if (map)
 		return map;
 
-	map = alloc_bootmem_high_node(NODE_DATA(nid),
-			sizeof(struct page) * PAGES_PER_SECTION);
-	if (map)
-		return map;
-
 	map = alloc_bootmem_node(NODE_DATA(nid),
 			sizeof(struct page) * PAGES_PER_SECTION);
 	return map;
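For readers following the revert: the removed code paired a weak default for alloc_bootmem_high_node() in mm/sparse.c (returning NULL) with a strong x86-64 definition, and the caller fell through to the ordinary allocator when the high one returned NULL. The following is only a user-space sketch of that "weak default plus fallback" pattern, assuming GCC or Clang on an ELF target. The names alloc_high, alloc_low, populate_map and low_pool are hypothetical stand-ins, not kernel symbols.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical backing store standing in for ordinary low memory. */
static char low_pool[4096];

/*
 * Weak default: used only when no other object file provides a strong
 * definition of alloc_high(). It reports "no high memory available".
 */
__attribute__((weak)) void *alloc_high(size_t size)
{
	(void)size;
	return NULL;
}

/* Ordinary allocator stand-in: succeeds if the request fits the pool. */
static void *alloc_low(size_t size)
{
	return size <= sizeof(low_pool) ? low_pool : NULL;
}

/* Try the preferred (high) allocator first, then fall back. */
void *populate_map(size_t size)
{
	void *map;

	assert(size > 0);
	map = alloc_high(size);
	if (map)
		return map;
	return alloc_low(size);
}
```

With only the weak default linked in, populate_map() always ends up in alloc_low(). Linking a second file that defines a non-weak alloc_high() would replace the default without touching the caller, which is roughly how the architecture hook was wired up in the patch being reverted.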
* Re: [stable] 2.6.23 boot failures on x86-64.
2007-10-29 18:37 ` Linus Torvalds
@ 2007-10-29 19:51 ` Christoph Lameter
2007-10-29 19:52 ` Siddha, Suresh B
` (2 subsequent siblings)
3 siblings, 0 replies; 17+ messages in thread
From: Christoph Lameter @ 2007-10-29 19:51 UTC (permalink / raw)
To: Linus Torvalds
Cc: Greg KH, Dave Jones, Linux Kernel, Martin Ebourne, Zou Nan hai,
Suresh Siddha, Andi Kleen, stable, Andrew Morton, Andy Whitcroft,
Mel Gorman

On Mon, 29 Oct 2007, Linus Torvalds wrote:
> We've had some changes since 2.6.23, and afaik, the
> "alloc_bootmem_high_node()" code is already effectively dead there. It's
> only called if CONFIG_SPARSEMEM_VMEMMAP is *not* enabled, and I *think* we
> enable it by force on x86-64 these days.

CONFIG_SPARSEMEM_VMEMMAP was introduced in 2.6.24-rc1.

If I read this Kconfig.x86_64 correctly then it seems that DISCONTIG is
still the default. Andy?

config ARCH_DISCONTIGMEM_ENABLE
	bool
	depends on NUMA
	default y

config ARCH_DISCONTIGMEM_DEFAULT
	def_bool y
	depends on NUMA

config ARCH_SPARSEMEM_ENABLE
	def_bool y
	depends on (NUMA || EXPERIMENTAL)
	select SPARSEMEM_VMEMMAP_ENABLE

config ARCH_MEMORY_PROBE
	def_bool y
	depends on MEMORY_HOTPLUG

config ARCH_FLATMEM_ENABLE
	def_bool y
	depends on !NUMA
* Re: [stable] 2.6.23 boot failures on x86-64.
2007-10-29 18:37 ` Linus Torvalds
2007-10-29 19:51 ` Christoph Lameter
@ 2007-10-29 19:52 ` Siddha, Suresh B
2007-10-29 20:09 ` Christoph Lameter
2007-10-29 20:23 ` Andy Whitcroft
2007-10-29 20:27 ` Martin Ebourne
3 siblings, 1 reply; 17+ messages in thread
From: Siddha, Suresh B @ 2007-10-29 19:52 UTC (permalink / raw)
To: Linus Torvalds
Cc: Greg KH, Dave Jones, Linux Kernel, Martin Ebourne, Zou Nan hai,
Suresh Siddha, Andi Kleen, stable, Andrew Morton, Christoph Lameter,
Andy Whitcroft, Mel Gorman

On Mon, Oct 29, 2007 at 11:37:40AM -0700, Linus Torvalds wrote:
>
>
> On Mon, 29 Oct 2007, Greg KH wrote:
> >
> > I'll be glad to revert it in -stable, if it's also reverted in Linus's
> > tree first :)
>
> We've had some changes since 2.6.23, and afaik, the
> "alloc_bootmem_high_node()" code is already effectively dead there. It's
> only called if CONFIG_SPARSEMEM_VMEMMAP is *not* enabled, and I *think* we
> enable it by force on x86-64 these days.

If so, we (Nanhai and myself) will take a look at the VMEMMAP changes and
see if the bug that commit 2e1c49db4c640b35df13889b86b9d62215ade4b6
tries to fix is still open in the latest git.

But I can't explain how 2e1c49db4c640b35df13889b86b9d62215ade4b6 can be
the root cause of Dave's issue in 2.6.23.

thanks,
suresh
* Re: [stable] 2.6.23 boot failures on x86-64.
2007-10-29 19:52 ` Siddha, Suresh B
@ 2007-10-29 20:09 ` Christoph Lameter
0 siblings, 0 replies; 17+ messages in thread
From: Christoph Lameter @ 2007-10-29 20:09 UTC (permalink / raw)
To: Siddha, Suresh B
Cc: Linus Torvalds, Greg KH, Dave Jones, Linux Kernel, Martin Ebourne,
Zou Nan hai, Andi Kleen, stable, Andrew Morton, Andy Whitcroft,
Mel Gorman

On Mon, 29 Oct 2007, Siddha, Suresh B wrote:
> But I can't explain how 2e1c49db4c640b35df13889b86b9d62215ade4b6 can be
> the root cause of Dave's issue in 2.6.23.

2.6.23 has no VMEMMAP support for x86_64.
* Re: [stable] 2.6.23 boot failures on x86-64.
2007-10-29 18:37 ` Linus Torvalds
2007-10-29 19:51 ` Christoph Lameter
2007-10-29 19:52 ` Siddha, Suresh B
@ 2007-10-29 20:23 ` Andy Whitcroft
2007-10-29 20:27 ` Martin Ebourne
3 siblings, 0 replies; 17+ messages in thread
From: Andy Whitcroft @ 2007-10-29 20:23 UTC (permalink / raw)
To: Linus Torvalds
Cc: Greg KH, Dave Jones, Linux Kernel, Martin Ebourne, Zou Nan hai,
Suresh Siddha, Andi Kleen, stable, Andrew Morton, Christoph Lameter,
Mel Gorman

On Mon, Oct 29, 2007 at 11:37:40AM -0700, Linus Torvalds wrote:
>
>
> On Mon, 29 Oct 2007, Greg KH wrote:
> >
> > I'll be glad to revert it in -stable, if it's also reverted in Linus's
> > tree first :)
>
> We've had some changes since 2.6.23, and afaik, the
> "alloc_bootmem_high_node()" code is already effectively dead there. It's
> only called if CONFIG_SPARSEMEM_VMEMMAP is *not* enabled, and I *think* we
> enable it by force on x86-64 these days.

CONFIG_SPARSEMEM_VMEMMAP is the default when SPARSEMEM is enabled on
x86_64. The overall default remains DISCONTIGMEM, mainly as a safety
measure while the i386/x86_64 => x86 merge stabilises. But yes, this code
is only used when SPARSEMEM is enabled but VMEMMAP is not. So it is
effectively redundant.

> More people added to Cc, just to clarify whether I'm just confused.
>
> Andy, Christoph, Mel: commit 2e1c49db4c640b35df13889b86b9d62215ade4b6 aka
> "x86_64: allocate sparsemem memmap above 4G" is the one that causes the
> failures, just fyi.

That patch seems to have a laudable goal of trying to push the memory
which backs the sparsemem memmap out to non-dma memory. I would have
expected that call to actually succeed as the bootmem allocator seems to
try at the goal which would likely be outside the node on a small machine,
and then retry without a goal. Which is what the code without the goal
does. Most illogical.

> Martin - it would be great if you could try out your failing machine with
> 2.6.24-rc1 (or a nightly snapshot or current git.. the more recent the
> better).
>
> But if I'm right, that commit should be reverted from 2.6.24 just because
> it's pointless (even if the bug itself is gone). And if I'm wrong, it
> should be reverted. So something like the appended would make sense
> regardless.
>
> Can I get a "tested-by"? And/or ack/nack's on my half-arsed theory above?

This code is definitely only used when SPARSEMEM is enabled and VMEMMAP
is not, which is not a combination we see on x86_64.

Acked-by: Andy Whitcroft <apw@shadowen.org>

> Linus
> --
> From: Linus Torvalds <torvalds@woody.linux-foundation.org>
>
> Revert "x86_64: allocate sparsemem memmap above 4G"
>
> This reverts commit 2e1c49db4c640b35df13889b86b9d62215ade4b6, since
> testing in Fedora has shown it to cause boot failures, as per Dave
> Jones. Bisected down by Martin Ebourne.
>
> Cc: Dave Jones <davej@redhat.com>
> Cc: Martin Ebourne <fedora@ebourne.me.uk>
> Cc: Zou Nan hai <nanhai.zou@intel.com>
> Cc: Suresh Siddha <suresh.b.siddha@intel.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
>
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index 1e3862e..a7308b2 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -728,12 +728,6 @@ int in_gate_area_no_task(unsigned long addr)
>  	return (addr >= VSYSCALL_START) && (addr < VSYSCALL_END);
>  }
>
> -void * __init alloc_bootmem_high_node(pg_data_t *pgdat, unsigned long size)
> -{
> -	return __alloc_bootmem_core(pgdat->bdata, size,
> -			SMP_CACHE_BYTES, (4UL*1024*1024*1024), 0);
> -}
> -
>  const char *arch_vma_name(struct vm_area_struct *vma)
>  {
>  	if (vma->vm_mm && vma->vm_start == (long)vma->vm_mm->context.vdso)
> diff --git a/include/linux/bootmem.h b/include/linux/bootmem.h
> index c83534e..0365ec9 100644
> --- a/include/linux/bootmem.h
> +++ b/include/linux/bootmem.h
> @@ -59,7 +59,6 @@ extern void *__alloc_bootmem_core(struct bootmem_data *bdata,
>  					 unsigned long align,
>  					 unsigned long goal,
>  					 unsigned long limit);
> -extern void *alloc_bootmem_high_node(pg_data_t *pgdat, unsigned long size);
>
>  #ifndef CONFIG_HAVE_ARCH_BOOTMEM_NODE
>  extern void reserve_bootmem(unsigned long addr, unsigned long size);
> diff --git a/mm/sparse.c b/mm/sparse.c
> index 08fb14f..e06f514 100644
> --- a/mm/sparse.c
> +++ b/mm/sparse.c
> @@ -220,12 +220,6 @@ static int __meminit sparse_init_one_section(struct mem_section *ms,
>  	return 1;
>  }
>
> -__attribute__((weak)) __init
> -void *alloc_bootmem_high_node(pg_data_t *pgdat, unsigned long size)
> -{
> -	return NULL;
> -}
> -
>  static unsigned long usemap_size(void)
>  {
>  	unsigned long size_bytes;
> @@ -267,11 +261,6 @@ struct page __init *sparse_mem_map_populate(unsigned long pnum, int nid)
>  	if (map)
>  		return map;
>
> -	map = alloc_bootmem_high_node(NODE_DATA(nid),
> -			sizeof(struct page) * PAGES_PER_SECTION);
> -	if (map)
> -		return map;
> -
>  	map = alloc_bootmem_node(NODE_DATA(nid),
>  			sizeof(struct page) * PAGES_PER_SECTION);
>  	return map;

-apw
* Re: [stable] 2.6.23 boot failures on x86-64.
2007-10-29 18:37 ` Linus Torvalds
` (2 preceding siblings ...)
2007-10-29 20:23 ` Andy Whitcroft
@ 2007-10-29 20:27 ` Martin Ebourne
3 siblings, 0 replies; 17+ messages in thread
From: Martin Ebourne @ 2007-10-29 20:27 UTC (permalink / raw)
To: Linus Torvalds
Cc: Greg KH, Dave Jones, Linux Kernel, Zou Nan hai, Suresh Siddha,
Andi Kleen, stable, Andrew Morton, Christoph Lameter, Andy Whitcroft,
Mel Gorman

On Mon, 2007-10-29 at 11:37 -0700, Linus Torvalds wrote:
> Martin - it would be great if you could try out your failing machine with
> 2.6.24-rc1 (or a nightly snapshot or current git.. the more recent the
> better).
>
> But if I'm right, that commit should be reverted from 2.6.24 just because
> it's pointless (even if the bug itself is gone). And if I'm wrong, it
> should be reverted. So something like the appended would make sense
> regardless.
>
> Can I get a "tested-by"? And/or ack/nack's on my half-arsed theory above?

Current git boots ok as is. I used the config from the fedora 2.6.23
kernel and accepted defaults for all the new options.

Cheers,

Martin.
* Re: 2.6.23 boot failures on x86-64.
2007-10-29 17:50 2.6.23 boot failures on x86-64 Dave Jones
2007-10-29 18:07 ` [stable] " Greg KH
@ 2007-10-29 18:18 ` Andi Kleen
2007-10-29 18:47 ` Dave Jones
1 sibling, 1 reply; 17+ messages in thread
From: Andi Kleen @ 2007-10-29 18:18 UTC (permalink / raw)
To: Dave Jones
Cc: Linux Kernel, Martin Ebourne, Zou Nan hai, Suresh Siddha, stable,
Andrew Morton, Linus Torvalds

On Monday 29 October 2007 18:50:14 Dave Jones wrote:
> We've had a number of people reporting that their x86-64s stopped booting
> when they moved to 2.6.23. It rebooted just after discovering the AGP bridge
> as a result of the IOMMU init.

It's probably the usual "nobody tests sparsemem at all" issue.

But if allocating bootmem >4G doesn't work on these systems
most likely they have more problems anyways. It might be better
to find out what goes wrong exactly.

-Andi
* Re: 2.6.23 boot failures on x86-64.
2007-10-29 18:18 ` Andi Kleen
@ 2007-10-29 18:47 ` Dave Jones
2007-10-29 19:03 ` Andi Kleen
0 siblings, 1 reply; 17+ messages in thread
From: Dave Jones @ 2007-10-29 18:47 UTC (permalink / raw)
To: Andi Kleen
Cc: Linux Kernel, Martin Ebourne, Zou Nan hai, Suresh Siddha, stable,
Andrew Morton, Linus Torvalds

On Mon, Oct 29, 2007 at 07:18:43PM +0100, Andi Kleen wrote:
> On Monday 29 October 2007 18:50:14 Dave Jones wrote:
> > We've had a number of people reporting that their x86-64s stopped booting
> > when they moved to 2.6.23. It rebooted just after discovering the AGP bridge
> > as a result of the IOMMU init.
>
> It's probably the usual "nobody tests sparsemem at all" issue.

We've been using SPARSEMEM in Fedora for a *long* time.
So long in fact, I forget why we moved away from DISCONTIGMEM, so there's
a significant number of users using that configuration for some time.

> But if allocating bootmem >4G doesn't work on these systems
> most likely they have more problems anyways. It might be better
> to find out what goes wrong exactly.

Any ideas on what to instrument ?

Dave
--
http://www.codemonkey.org.uk
* Re: 2.6.23 boot failures on x86-64.
2007-10-29 18:47 ` Dave Jones
@ 2007-10-29 19:03 ` Andi Kleen
2007-10-29 19:43 ` Dave Jones
2007-10-29 20:06 ` Dave Jones
0 siblings, 2 replies; 17+ messages in thread
From: Andi Kleen @ 2007-10-29 19:03 UTC (permalink / raw)
To: Dave Jones
Cc: Linux Kernel, Martin Ebourne, Zou Nan hai, Suresh Siddha, stable,
Andrew Morton, Linus Torvalds

On Monday 29 October 2007 19:47:47 Dave Jones wrote:
> On Mon, Oct 29, 2007 at 07:18:43PM +0100, Andi Kleen wrote:
> > On Monday 29 October 2007 18:50:14 Dave Jones wrote:
> > > We've had a number of people reporting that their x86-64s stopped booting
> > > when they moved to 2.6.23. It rebooted just after discovering the AGP bridge
> > > as a result of the IOMMU init.
> >
> > It's probably the usual "nobody tests sparsemem at all" issue.
>
> We've been using SPARSEMEM in Fedora for a *long* time.
> So long in fact, I forget why we moved away from DISCONTIGMEM, so there's
> a significant number of users using that configuration for some time.

Supposedly you wanted a slower kernel that needs more memory?

Ok I wasn't aware of that. I tended to get sparsemem reports usually
at least 1-2 releases after the fact, so it looked like it was undertested.

>
> > But if allocating bootmem >4G doesn't work on these systems
> > most likely they have more problems anyways. It might be better
> > to find out what goes wrong exactly.
>
> Any ideas on what to instrument ?

See what address the bootmem_alloc_high returns; check if it overlaps
with something etc.

Fill the memory on the system and see if it can access all of its memory.

-Andi
* Re: 2.6.23 boot failures on x86-64.
2007-10-29 19:03 ` Andi Kleen
@ 2007-10-29 19:43 ` Dave Jones
2007-10-29 19:56 ` Andi Kleen
2007-10-29 21:21 ` Martin Ebourne
2007-10-29 20:06 ` Dave Jones
1 sibling, 2 replies; 17+ messages in thread
From: Dave Jones @ 2007-10-29 19:43 UTC (permalink / raw)
To: Andi Kleen
Cc: Linux Kernel, Martin Ebourne, Zou Nan hai, Suresh Siddha, stable,
Andrew Morton, Linus Torvalds

On Mon, Oct 29, 2007 at 08:03:09PM +0100, Andi Kleen wrote:
> > > It's probably the usual "nobody tests sparsemem at all" issue.
> >
> > We've been using SPARSEMEM in Fedora for a *long* time.
> > So long in fact, I forget why we moved away from DISCONTIGMEM, so there's
> > a significant number of users using that configuration for some time.
>
> Supposedly you wanted a slower kernel that needs more memory?
>
> Ok I wasn't aware of that. I tended to get sparsemem reports usually
> at least 1-2 releases after the fact, so it looked like it was undertested.

Looking at cvs history, I can't figure out what the reasoning was,
but every Fedora (and RHEL5) kernel since 2006/07/05 has been that way.

Curious how no-one noticed either of the side-effects you mention.

> > > But if allocating bootmem >4G doesn't work on these systems
> > > most likely they have more problems anyways. It might be better
> > > to find out what goes wrong exactly.
> > Any ideas on what to instrument ?
>
> See what address the bootmem_alloc_high returns; check if it overlaps
> with something etc.
>
> Fill the memory on the system and see if it can access all of its memory.

Martin, as you have one of the affected systems, do you feel up to this?

Dave
--
http://www.codemonkey.org.uk
* Re: 2.6.23 boot failures on x86-64.
2007-10-29 19:43 ` Dave Jones
@ 2007-10-29 19:56 ` Andi Kleen
2007-10-29 21:21 ` Martin Ebourne
1 sibling, 0 replies; 17+ messages in thread
From: Andi Kleen @ 2007-10-29 19:56 UTC (permalink / raw)
To: Dave Jones
Cc: Linux Kernel, Martin Ebourne, Zou Nan hai, Suresh Siddha, stable,
Andrew Morton, Linus Torvalds

On Monday 29 October 2007 20:43:11 Dave Jones wrote:
> On Mon, Oct 29, 2007 at 08:03:09PM +0100, Andi Kleen wrote:
>
> > > > It's probably the usual "nobody tests sparsemem at all" issue.
> > >
> > > We've been using SPARSEMEM in Fedora for a *long* time.
> > > So long in fact, I forget why we moved away from DISCONTIGMEM, so there's
> > > a significant number of users using that configuration for some time.
> >
> > Supposedly you wanted a slower kernel that needs more memory?
> >
> > Ok I wasn't aware of that. I tended to get sparsemem reports usually
> > at least 1-2 releases after the fact, so it looked like it was undertested.
>
> Looking at cvs history, I can't figure out what the reasoning was,
> but every Fedora (and RHEL5) kernel since 2006/07/05 has been that way.
>
> Curious how no-one noticed either of the side-effects you mention.

It's a few percent on a few benchmarks iirc. vmemmap (now in .24) was
supposed to address that.

-Andi
* Re: 2.6.23 boot failures on x86-64.
2007-10-29 19:43 ` Dave Jones
2007-10-29 19:56 ` Andi Kleen
@ 2007-10-29 21:21 ` Martin Ebourne
2007-10-31 6:04 ` Zou Nan hai
1 sibling, 1 reply; 17+ messages in thread
From: Martin Ebourne @ 2007-10-29 21:21 UTC (permalink / raw)
To: Dave Jones
Cc: Andi Kleen, Linux Kernel, Zou Nan hai, Suresh Siddha, stable,
Andrew Morton, Linus Torvalds

On Mon, 2007-10-29 at 15:43 -0400, Dave Jones wrote:
> On Mon, Oct 29, 2007 at 08:03:09PM +0100, Andi Kleen wrote:
> > > > But if allocating bootmem >4G doesn't work on these systems
> > > > most likely they have more problems anyways. It might be better
> > > > to find out what goes wrong exactly.
> > > Any ideas on what to instrument ?
> >
> > See what address the bootmem_alloc_high returns; check if it overlaps
> > with something etc.
> >
> > Fill the memory on the system and see if it can access all of its memory.
>
> Martin, as you have one of the affected systems, do you feel up to this?

Faking a node at 0000000000000000-000000001fff0000
Bootmem setup node 0 0000000000000000-000000001fff0000
sparse_early_mem_map_alloc: returned address ffff81000070b000

My box has 512MB of RAM.

Cheers,

Martin.
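To read the address in the log above, it helps to convert it to a physical address. The sketch below does that arithmetic under two assumptions that are not stated in the thread: that the 2.6.23 x86-64 direct mapping starts at PAGE_OFFSET 0xffff810000000000, and that the ISA DMA zone covers the first 16 MB of physical memory. The helper names virt_to_phys_direct and in_dma_zone are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed start of the kernel direct mapping on 2.6.23 x86-64. */
#define PAGE_OFFSET_ASSUMED	0xffff810000000000ULL
/* Assumed end of the ISA DMA zone: 16 MB of physical memory. */
#define DMA_ZONE_END_ASSUMED	(16ULL << 20)

/* Direct-mapped virtual address -> physical address. */
uint64_t virt_to_phys_direct(uint64_t vaddr)
{
	assert(vaddr >= PAGE_OFFSET_ASSUMED);
	return vaddr - PAGE_OFFSET_ASSUMED;
}

/* Non-zero when the physical address lies inside the assumed DMA zone. */
int in_dma_zone(uint64_t paddr)
{
	return paddr < DMA_ZONE_END_ASSUMED;
}
```

Under those assumptions, ffff81000070b000 maps to physical 0x70b000, a little over 7 MB, which is inside the first 16 MB. That would be consistent with the memmap landing in the DMA region on this 512MB box rather than anywhere near 4G.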
* Re: 2.6.23 boot failures on x86-64.
2007-10-29 21:21 ` Martin Ebourne
@ 2007-10-31 6:04 ` Zou Nan hai
2007-10-31 6:19 ` Zou Nan hai
0 siblings, 1 reply; 17+ messages in thread
From: Zou Nan hai @ 2007-10-31 6:04 UTC (permalink / raw)
To: Martin Ebourne
Cc: Dave Jones, Andi Kleen, Linux Kernel, Suresh Siddha, stable,
Andrew Morton, Linus Torvalds

On Tue, 2007-10-30 at 05:21, Martin Ebourne wrote:
> On Mon, 2007-10-29 at 15:43 -0400, Dave Jones wrote:
> > On Mon, Oct 29, 2007 at 08:03:09PM +0100, Andi Kleen wrote:
> > > > > But if allocating bootmem >4G doesn't work on these systems
> > > > > most likely they have more problems anyways. It might be better
> > > > > to find out what goes wrong exactly.
> > > > Any ideas on what to instrument ?
> > >
> > > See what address the bootmem_alloc_high returns; check if it overlaps
> > > with something etc.
> > >
> > > Fill the memory on the system and see if it can access all of its memory.
> >
> > Martin, as you have one of the affected systems, do you feel up to this?
>
> Faking a node at 0000000000000000-000000001fff0000
> Bootmem setup node 0 0000000000000000-000000001fff0000
> sparse_early_mem_map_alloc: returned address ffff81000070b000
>
> My box has 512MB of RAM.
>
> Cheers,
>
> Martin.

Oops, sorry,
this seems to be my mistake.
I forgot to exclude the DMA range.

Does the following patch fix the issue?

Thanks
Zou Nan hai

--- a/arch/x86/mm/init_64.c	2007-10-31 11:24:11.000000000 +0800
+++ b/arch/x86/mm/init_64.c	2007-10-31 12:31:02.000000000 +0800
@@ -731,7 +731,7 @@ int in_gate_area_no_task(unsigned long a
 void * __init alloc_bootmem_high_node(pg_data_t *pgdat, unsigned long size)
 {
 	return __alloc_bootmem_core(pgdat->bdata, size,
-			SMP_CACHE_BYTES, (4UL*1024*1024*1024), 0);
+			SMP_CACHE_BYTES, (4UL*1024*1024*1024), __pa(MAX_DMA_ADDRESS));
 }
 
 const char *arch_vma_name(struct vm_area_struct *vma)
* Re: 2.6.23 boot failures on x86-64.
2007-10-31 6:04 ` Zou Nan hai
@ 2007-10-31 6:19 ` Zou Nan hai
0 siblings, 0 replies; 17+ messages in thread
From: Zou Nan hai @ 2007-10-31 6:19 UTC (permalink / raw)
To: Martin Ebourne
Cc: Dave Jones, Andi Kleen, Linux Kernel, Suresh Siddha, stable,
Andrew Morton, Linus Torvalds

On Wed, 2007-10-31 at 14:04, Zou Nan hai wrote:
> On Tue, 2007-10-30 at 05:21, Martin Ebourne wrote:
> > On Mon, 2007-10-29 at 15:43 -0400, Dave Jones wrote:
> > > On Mon, Oct 29, 2007 at 08:03:09PM +0100, Andi Kleen wrote:
> > > > > > But if allocating bootmem >4G doesn't work on these systems
> > > > > > most likely they have more problems anyways. It might be better
> > > > > > to find out what goes wrong exactly.
> > > > > Any ideas on what to instrument ?
> > > >
> > > > See what address the bootmem_alloc_high returns; check if it overlaps
> > > > with something etc.
> > > >
> > > > Fill the memory on the system and see if it can access all of its memory.
> > >
> > > Martin, as you have one of the affected systems, do you feel up to this?
> >
> > Faking a node at 0000000000000000-000000001fff0000
> > Bootmem setup node 0 0000000000000000-000000001fff0000
> > sparse_early_mem_map_alloc: returned address ffff81000070b000
> >
> > My box has 512MB of RAM.
> >
> > Cheers,
> >
> > Martin.
>
> Oops, sorry,
> this seems to be my mistake.
> I forgot to exclude the DMA range.
>
> Does the following patch fix the issue?
>
> Thanks
> Zou Nan hai
>
> --- a/arch/x86/mm/init_64.c	2007-10-31 11:24:11.000000000 +0800
> +++ b/arch/x86/mm/init_64.c	2007-10-31 12:31:02.000000000 +0800
> @@ -731,7 +731,7 @@ int in_gate_area_no_task(unsigned long a
>  void * __init alloc_bootmem_high_node(pg_data_t *pgdat, unsigned long size)
>  {
>  	return __alloc_bootmem_core(pgdat->bdata, size,
> -			SMP_CACHE_BYTES, (4UL*1024*1024*1024), 0);
> +			SMP_CACHE_BYTES, (4UL*1024*1024*1024), __pa(MAX_DMA_ADDRESS));
>  }
>
>  const char *arch_vma_name(struct vm_area_struct *vma)
>

Please ignore the patch, the patch is wrong.

However, I think the root cause is that when __alloc_bootmem_core fails to
allocate memory above 4G, it falls back to allocating from the lowest
page, so the allocation sometimes ends up in the DMA region...

Since this code path is dead, I am OK with reverting the patch.

Suresh and I will check the CONFIG_SPARSEMEM_VMEMMAP path.

Thanks
Zou Nan hai
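The fallback described above can be illustrated with a toy model. This is a deliberately simplified sketch of a goal-based first-fit allocator, not the kernel's __alloc_bootmem_core(); struct toy_bootmem, scan_from, toy_alloc and the 32-page node size are all hypothetical, and the model only assumes the behaviour stated in this thread (search from the goal, then retry from the bottom of the node).

```c
#include <assert.h>
#include <stddef.h>

#define NPAGES 32	/* toy node size in pages (assumption) */

/* used[i] != 0 means page i is already allocated. */
struct toy_bootmem {
	unsigned char used[NPAGES];
};

/* First-fit search for `count` contiguous free pages from `start`. */
static long scan_from(const struct toy_bootmem *b, size_t start, size_t count)
{
	size_t run = 0;

	for (size_t i = start; i < NPAGES; i++) {
		run = b->used[i] ? 0 : run + 1;
		if (run == count)
			return (long)(i + 1 - count);
	}
	return -1;
}

/*
 * Allocate `count` pages, preferring pages at or above `goal_page`.
 * If the goal lies outside the node, or nothing fits above it, the
 * search restarts from page 0, i.e. from the lowest free pages.
 */
long toy_alloc(struct toy_bootmem *b, size_t goal_page, size_t count)
{
	long first = -1;

	assert(count > 0);
	if (goal_page < NPAGES)
		first = scan_from(b, goal_page, count);
	if (first < 0)
		first = scan_from(b, 0, count);	/* fallback to the bottom */
	if (first < 0)
		return -1;
	for (size_t i = 0; i < count; i++)
		b->used[(size_t)first + i] = 1;
	return first;
}
```

In this model a goal far beyond the end of the node (the analogue of asking for memory above 4G on a 512MB machine) is not an error. The request is simply satisfied from the lowest free pages, which on real hardware would be where the DMA region sits.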
* Re: 2.6.23 boot failures on x86-64.
2007-10-29 19:03 ` Andi Kleen
2007-10-29 19:43 ` Dave Jones
@ 2007-10-29 20:06 ` Dave Jones
1 sibling, 0 replies; 17+ messages in thread
From: Dave Jones @ 2007-10-29 20:06 UTC (permalink / raw)
To: Andi Kleen
Cc: Linux Kernel, Martin Ebourne, Zou Nan hai, Suresh Siddha, stable,
Andrew Morton, Linus Torvalds

On Mon, Oct 29, 2007 at 08:03:09PM +0100, Andi Kleen wrote:
> On Monday 29 October 2007 19:47:47 Dave Jones wrote:
> > On Mon, Oct 29, 2007 at 07:18:43PM +0100, Andi Kleen wrote:
> > > On Monday 29 October 2007 18:50:14 Dave Jones wrote:
> > > > We've had a number of people reporting that their x86-64s stopped booting
> > > > when they moved to 2.6.23. It rebooted just after discovering the AGP bridge
> > > > as a result of the IOMMU init.
> > >
> > > It's probably the usual "nobody tests sparsemem at all" issue.
> >
> > We've been using SPARSEMEM in Fedora for a *long* time.
> > So long in fact, I forget why we moved away from DISCONTIGMEM, so there's
> > a significant number of users using that configuration for some time.
>
> Supposedly you wanted a slower kernel that needs more memory?

Actually if what you say is true, the Kconfig entry for sparsemem could
use changing as it suggests the opposite...

    This option provides some potential performance benefits, along
    with decreased code complexity, but it is newer, and more
    experimental.

I'm still unclear why exactly we enabled it. The other comment in the
Kconfig..

    This will be the only option for some systems, including
    memory hotplug systems. This is normal.

Sounds unlikely to be the reason, but maybe.
Maybe benchmarking at some point in history showed sparsemem actually
beat out discontigmem. I'm at a loss to explain it thanks to a particularly
unhelpful changelog entry I wrote at the time.

Dave
--
http://www.codemonkey.org.uk