* [RFC 1/1] bootmem: move big allocations behing 4G
From: Jiri Slaby @ 2010-01-18 22:56 UTC
To: linux-mm; +Cc: hannes, linux-kernel, jirislaby

Hi,

I'm fighting a bug where Grub loads the kernel just fine, whereas
isolinux doesn't. I found out, it's due to different addresses of
loaded initrd. On a machine with 128G of memory, grub loads the
initrd at 895M in our case and flat mem_map (2G long) is allocated
above 4G due to 2-4G BIOS reservation.

On the other hand, with isolinux, the 0-2G is free and mem_map is
placed there leaving no space for others, hence kernel panics for
swiotlb which needs to be below 4G.

I use the patch below, but it seems, from the code, like it won't
work out for section allocations.

Any ideas?

--

If there is a large amount of memory (128G) in a machine and 2G of the
low 4 gigs are reserved by the BIOS, the rest of the "low" memory is
consumed by mem_map when flat mapping is enabled.

Subsequent allocations with a 4G limit (e.g. swiotlb) then fail and
the kernel panics.

Try to avoid that situation on 64-bit by placing allocations bigger
than 128M above 4G if possible.

With that, mem_map is allocated above 4G and there is enough space
for others (swiotlb) in the low 4G.
---
 mm/bootmem.c |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/mm/bootmem.c b/mm/bootmem.c
index 7d14868..365a0d1 100644
--- a/mm/bootmem.c
+++ b/mm/bootmem.c
@@ -486,6 +486,11 @@ static void * __init alloc_bootmem_core(struct bootmem_data *bdata,
 
 	step = max(align >> PAGE_SHIFT, 1UL);
 
+	/* on 64-bit: allocate 128M+ at 4G if satisfies limit */
+	if (BITS_PER_LONG == 64 && size >= (128UL << 20) &&
+			(4UL << 30) + size < (max << PAGE_SHIFT))
+		goal = 4UL << (30 - PAGE_SHIFT);
+
 	if (goal && min < goal && goal < max)
 		start = ALIGN(goal, step);
 	else
-- 
1.6.5.7

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .

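The condition added by the RFC hunk can be tried out in userspace. What follows is only a sketch, not the kernel code: the helper name rfc_adjust_goal() is hypothetical, PAGE_SHIFT is assumed to be 12 (4 KiB pages), and the BITS_PER_LONG == 64 test is dropped because the model uses 64-bit integers throughout.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12              /* assumption: 4 KiB pages */
#define BIG_ALLOC  (128ULL << 20)  /* the 128M threshold from the patch */
#define FOUR_GIG   (4ULL << 30)

/*
 * Hypothetical userspace model of the check added to alloc_bootmem_core().
 * size is in bytes; goal_pfn and max_pfn are page frame numbers.
 * Returns the goal (as a pfn) the allocator would start searching from.
 */
static uint64_t rfc_adjust_goal(uint64_t size, uint64_t goal_pfn,
                                uint64_t max_pfn)
{
    /* guard the shift below against overflow in this model */
    assert(max_pfn <= (UINT64_MAX >> PAGE_SHIFT));

    /* big allocation that still fits between 4G and the end of the node */
    if (size >= BIG_ALLOC && FOUR_GIG + size < (max_pfn << PAGE_SHIFT))
        return FOUR_GIG >> PAGE_SHIFT; /* equals 4UL << (30 - PAGE_SHIFT) */

    /* small request, or no room above 4G: keep the caller's goal */
    return goal_pfn;
}
```

With these assumptions, a 2G mem_map on a 128G node gets its goal moved to the 4G boundary, while a 64M request, or a 2G request on a node that ends at 5G, keeps the goal it was called with.
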
* Re: [RFC 1/1] bootmem: move big allocations behing 4G
From: Johannes Weiner @ 2010-01-19 14:33 UTC
To: Jiri Slaby; +Cc: linux-mm, linux-kernel, jirislaby, Ralf Baechle, x86

Hello Jiri,

On Mon, Jan 18, 2010 at 11:56:30PM +0100, Jiri Slaby wrote:
> Hi, I'm fighting a bug where Grub loads the kernel just fine, whereas
> isolinux doesn't. I found out, it's due to different addresses of
> loaded initrd. On a machine with 128G of memory, grub loads the
> initrd at 895M in our case and flat mem_map (2G long) is allocated
> above 4G due to 2-4G BIOS reservation.
>
> On the other hand, with isolinux, the 0-2G is free and mem_map is
> placed there leaving no space for others, hence kernel panics for
> swiotlb which needs to be below 4G.

Bootmem already protects the lower 16MB DMA zone for the obvious reasons,
how about shifting the default bootmem goal above the DMA32 zone if it exists?

I added Ralf and the x86 Team on Cc as this only affects x86 and mips, afaics.

> Any ideas?

I tested the below on a rather dull x86_64 machine and it seems to work. Would
this work in your case as well? The goal for mem_map should now be above 4G.

	Hannes

* Re: [RFC 1/1] bootmem: move big allocations behing 4G
From: Jiri Slaby @ 2010-01-19 22:02 UTC
To: Johannes Weiner; +Cc: linux-mm, linux-kernel, Ralf Baechle, x86

On 01/19/2010 03:33 PM, Johannes Weiner wrote:
> On Mon, Jan 18, 2010 at 11:56:30PM +0100, Jiri Slaby wrote:
>> Hi, I'm fighting a bug where Grub loads the kernel just fine, whereas
>> isolinux doesn't. I found out, it's due to different addresses of
>> loaded initrd. On a machine with 128G of memory, grub loads the
>> initrd at 895M in our case and flat mem_map (2G long) is allocated
>> above 4G due to 2-4G BIOS reservation.
>>
>> On the other hand, with isolinux, the 0-2G is free and mem_map is
>> placed there leaving no space for others, hence kernel panics for
>> swiotlb which needs to be below 4G.
>
> Bootmem already protects the lower 16MB DMA zone for the obvious reasons,
> how about shifting the default bootmem goal above the DMA32 zone if it exists?

Hi, I think it makes sense.

> I tested the below on a rather dull x86_64 machine and it seems to work. Would
> this work in your case as well? The goal for mem_map should now be above 4G.

It seems that it will. I'll give it a try later (it needs to be set up)
and report back.

> From 1c11ce1e82c6209f0eda72e3340ab0c55cd6f330 Mon Sep 17 00:00:00 2001
> From: Johannes Weiner <jw@emlix.com>
> Date: Tue, 19 Jan 2010 14:14:44 +0100
> Subject: [patch] bootmem: avoid DMA32 zone, if any, by default
>
> x86_64 and mips define a DMA32 zone additionally to the old DMA
> zone of 16MB. Bootmem already avoids the old DMA zone if the
> allocation site did not request otherwise.
>
> But since DMA32 is also a limited resource, avoid using it as well
> by default, if defined.
>
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>

So for the time being:
Reviewed-by: Jiri Slaby <jirislaby@gmail.com>

thanks,
-- 
js

* Re: [RFC 1/1] bootmem: move big allocations behing 4G
From: Jiri Slaby @ 2010-01-20 13:50 UTC
To: Johannes Weiner; +Cc: linux-mm, linux-kernel, Ralf Baechle, x86

On 01/19/2010 03:33 PM, Johannes Weiner wrote:
> --- a/include/linux/bootmem.h
> +++ b/include/linux/bootmem.h
> @@ -96,20 +96,26 @@ extern void *__alloc_bootmem_low_node(pg_data_t *pgdat,
>  				       unsigned long align,
>  				       unsigned long goal);
>
> +#ifdef MAX_DMA32_PFN
> +#define BOOTMEM_DEFAULT_GOAL	(__pa(MAX_DMA32_PFN << PAGE_SHIFT))
> +#else
> +#define BOOTMEM_DEFAULT_GOAL	MAX_DMA_ADDRESS

I just noticed this should write:
#define BOOTMEM_DEFAULT_GOAL	__pa(MAX_DMA_ADDRESS)

> +#endif
> +
>  #define alloc_bootmem(x) \
> -	__alloc_bootmem(x, SMP_CACHE_BYTES, __pa(MAX_DMA_ADDRESS))
> +	__alloc_bootmem(x, SMP_CACHE_BYTES, BOOTMEM_DEFAULT_GOAL)

-- 
js

* Re: [RFC 1/1] bootmem: move big allocations behing 4G
From: Johannes Weiner @ 2010-01-20 15:30 UTC
To: Jiri Slaby; +Cc: linux-mm, linux-kernel, Ralf Baechle, x86

Hi Jiri,

On Wed, Jan 20, 2010 at 02:50:13PM +0100, Jiri Slaby wrote:
> On 01/19/2010 03:33 PM, Johannes Weiner wrote:
> > --- a/include/linux/bootmem.h
> > +++ b/include/linux/bootmem.h
> > @@ -96,20 +96,26 @@ extern void *__alloc_bootmem_low_node(pg_data_t *pgdat,
> >  				       unsigned long align,
> >  				       unsigned long goal);
> >
> > +#ifdef MAX_DMA32_PFN
> > +#define BOOTMEM_DEFAULT_GOAL	(__pa(MAX_DMA32_PFN << PAGE_SHIFT))
> > +#else
> > +#define BOOTMEM_DEFAULT_GOAL	MAX_DMA_ADDRESS
>
> I just noticed this should write:
> #define BOOTMEM_DEFAULT_GOAL	__pa(MAX_DMA_ADDRESS)

Pardon my sloppiness, it's all backwards. The other case should be
without the __pa(), of course.

I'll send a fixed and tested version later.

Thanks,
	Hannes

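The mix-up discussed above is one of units: a page frame number shifted by PAGE_SHIFT is already a physical address, while MAX_DMA_ADDRESS is a kernel virtual address and needs __pa() to become one. The sketch below models that in userspace. All names and constants are assumptions for illustration: model_va()/model_pa() stand in for __va()/__pa() as a plain offset by an illustrative PAGE_OFFSET, pages are assumed to be 4 KiB, and the PFN limits are modelled as 16M and 4G.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT    12                     /* assumption: 4 KiB pages */
#define PAGE_OFFSET   0xffff880000000000ULL  /* assumption: illustrative direct-map base */
#define MAX_DMA_PFN   ((16ULL << 20) >> PAGE_SHIFT)  /* modelled 16M DMA limit */
#define MAX_DMA32_PFN ((4ULL << 30) >> PAGE_SHIFT)   /* modelled 4G DMA32 limit */

/* pfn -> physical address */
static uint64_t pfn_to_phys(uint64_t pfn)
{
    assert(pfn <= (UINT64_MAX >> PAGE_SHIFT)); /* no shift overflow */
    return pfn << PAGE_SHIFT;
}

/* simplified stand-ins for __va() and __pa(): a fixed offset, nothing more */
static uint64_t model_va(uint64_t phys) { return phys + PAGE_OFFSET; }
static uint64_t model_pa(uint64_t virt) { return virt - PAGE_OFFSET; }

/* MAX_DMA_ADDRESS is modelled as a virtual address */
#define MAX_DMA_ADDRESS model_va(pfn_to_phys(MAX_DMA_PFN))

/* first draft: __pa() on a physical address, none on the virtual one */
static uint64_t goal_dma32_first_draft(void) { return model_pa(pfn_to_phys(MAX_DMA32_PFN)); }
static uint64_t goal_dma_first_draft(void)   { return MAX_DMA_ADDRESS; }

/* corrected: the pfn shift is already physical, the virtual address gets __pa() */
static uint64_t goal_dma32_fixed(void) { return pfn_to_phys(MAX_DMA32_PFN); }
static uint64_t goal_dma_fixed(void)   { return model_pa(MAX_DMA_ADDRESS); }
```

In this model only the corrected pair yields the physical goals 4G and 16M; the first draft produces a wrapped-around value in one case and a virtual address in the other.
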
* [PATCH] bootmem: avoid DMA32 zone by default
From: Johannes Weiner @ 2010-01-20 22:53 UTC
To: Andrew Morton
Cc: Jiri Slaby, linux-mm, linux-kernel, Ralf Baechle, x86, stable

Bootmem already tries normal allocations above the DMA zone to reserve
it for users that can not cope with higher addresses.

The same principle applies to the DMA32 zone, which is currently not
spared from normal allocations. This can lead to exhaustion of this
limited amount of address space through things that can easily live
elsewhere, like the mem_map e.g.

Raise bootmem's default goal beyond DMA32 for architectures with this
zone defined. For now, these are x86 and mips.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Jiri Slaby <jslaby@suse.cz>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: x86@kernel.org
Cc: stable@kernel.org
---
 include/linux/bootmem.h |   20 +++++++++++++-------
 1 files changed, 13 insertions(+), 7 deletions(-)

I cc'd stable because this affects already released kernels. But since this is
the first report of DMA32 memory exhaustion through bootmem that I hear of,
you guys might want to skip this patch due to the fragile nature of early
memory management.

diff --git a/include/linux/bootmem.h b/include/linux/bootmem.h
index b10ec49..52c8272 100644
--- a/include/linux/bootmem.h
+++ b/include/linux/bootmem.h
@@ -96,20 +96,26 @@ extern void *__alloc_bootmem_low_node(pg_data_t *pgdat,
 				       unsigned long align,
 				       unsigned long goal);
 
+#ifdef MAX_DMA32_PFN
+#define BOOTMEM_DEFAULT_GOAL	(MAX_DMA32_PFN << PAGE_SHIFT)
+#else
+#define BOOTMEM_DEFAULT_GOAL	__pa(MAX_DMA_ADDRESS)
+#endif
+
 #define alloc_bootmem(x) \
-	__alloc_bootmem(x, SMP_CACHE_BYTES, __pa(MAX_DMA_ADDRESS))
+	__alloc_bootmem(x, SMP_CACHE_BYTES, BOOTMEM_DEFAULT_GOAL)
 #define alloc_bootmem_nopanic(x) \
-	__alloc_bootmem_nopanic(x, SMP_CACHE_BYTES, __pa(MAX_DMA_ADDRESS))
+	__alloc_bootmem_nopanic(x, SMP_CACHE_BYTES, BOOTMEM_DEFAULT_GOAL)
 #define alloc_bootmem_pages(x) \
-	__alloc_bootmem(x, PAGE_SIZE, __pa(MAX_DMA_ADDRESS))
+	__alloc_bootmem(x, PAGE_SIZE, BOOTMEM_DEFAULT_GOAL)
 #define alloc_bootmem_pages_nopanic(x) \
-	__alloc_bootmem_nopanic(x, PAGE_SIZE, __pa(MAX_DMA_ADDRESS))
+	__alloc_bootmem_nopanic(x, PAGE_SIZE, BOOTMEM_DEFAULT_GOAL)
 #define alloc_bootmem_node(pgdat, x) \
-	__alloc_bootmem_node(pgdat, x, SMP_CACHE_BYTES, __pa(MAX_DMA_ADDRESS))
+	__alloc_bootmem_node(pgdat, x, SMP_CACHE_BYTES, BOOTMEM_DEFAULT_GOAL)
 #define alloc_bootmem_pages_node(pgdat, x) \
-	__alloc_bootmem_node(pgdat, x, PAGE_SIZE, __pa(MAX_DMA_ADDRESS))
+	__alloc_bootmem_node(pgdat, x, PAGE_SIZE, BOOTMEM_DEFAULT_GOAL)
 #define alloc_bootmem_pages_node_nopanic(pgdat, x) \
-	__alloc_bootmem_node_nopanic(pgdat, x, PAGE_SIZE, __pa(MAX_DMA_ADDRESS))
+	__alloc_bootmem_node_nopanic(pgdat, x, PAGE_SIZE, BOOTMEM_DEFAULT_GOAL)
 #define alloc_bootmem_low(x) \
 	__alloc_bootmem_low(x, SMP_CACHE_BYTES, 0)
-- 
1.6.5.4

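How a raised default goal interacts with the start selection visible in the RFC hunk (`if (goal && min < goal && goal < max) start = ALIGN(goal, step);`) can be sketched in userspace. This is a model under assumptions, not the kernel function: pick_start_pfn() and align_up() are hypothetical names, and the else branch, which the quoted hunk cuts off, is assumed here to fall back to the node's minimum pfn.

```c
#include <assert.h>
#include <stdint.h>

/* Round x up to a multiple of a; a must be a power of two. */
static uint64_t align_up(uint64_t x, uint64_t a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return (x + a - 1) & ~(a - 1);
}

/*
 * Hypothetical model of the start selection: all arguments are page
 * frame numbers except step, which is the alignment in pages.
 */
static uint64_t pick_start_pfn(uint64_t goal_pfn, uint64_t min_pfn,
                               uint64_t max_pfn, uint64_t step)
{
    /* the goal is honoured only if it lies strictly inside the node */
    if (goal_pfn && min_pfn < goal_pfn && goal_pfn < max_pfn)
        return align_up(goal_pfn, step);

    /* assumption: the branch not shown in the hunk starts at the node minimum */
    return align_up(min_pfn, step);
}
```

Under these assumptions a goal at the 4G boundary is used on a node that reaches 128G, whereas on a node that ends at 2G the goal lies outside the node and the search starts from the bottom as before, which suggests the new default should not change behaviour on machines without memory above 4G.
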
* Re: [PATCH] bootmem: avoid DMA32 zone by default
From: Jiri Slaby @ 2010-01-20 23:12 UTC
To: Johannes Weiner
Cc: Andrew Morton, linux-mm, linux-kernel, Ralf Baechle, x86, stable

On 01/20/2010 11:53 PM, Johannes Weiner wrote:
> I cc'd stable because this affects already released kernels. But since this is
> the first report of DMA32 memory exhaustion through bootmem that I hear of,

Just to show what the setup looks like:
128G of RAM, flat mapping
sizeof(struct page)=56

0-1.75G  mem_map
1.75-2G  vfs caches, console and others.
         initrd reservation
2-4G     reserved by BIOS

The kernel panics with out of memory when swiotlb tries to allocate 64M
of "low" bootmem.

-- 
js
suse labs
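The 1.75G figure for mem_map follows from the numbers in the message above: one struct page per page frame, 56 bytes each, for 128G of RAM. A small sketch of that arithmetic, assuming 4 KiB pages (the page size is not stated in the thread) and using a hypothetical helper name:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Size of a flat mem_map in bytes: one struct page per page frame.
 * page_size is an assumption supplied by the caller (4096 on x86_64
 * with regular pages).
 */
static uint64_t mem_map_bytes(uint64_t ram_bytes, uint64_t page_size,
                              uint64_t struct_page_size)
{
    assert(page_size != 0);
    return (ram_bytes / page_size) * struct_page_size;
}
```

For 128G of RAM this gives 33554432 page frames times 56 bytes, i.e. 1879048192 bytes, which is exactly 1.75G. With 2-4G reserved by the BIOS, that leaves only about 256M below 2G for everything else that has to live in low memory.
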