From: Yinghai Lu <yinghai@kernel.org>
To: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Subject: Re: early_node_mem()'s memory allocation policy
Date: Tue, 26 Oct 2010 22:49:33 -0700
Message-ID: <4CC7BD6D.2030104@kernel.org>
In-Reply-To: <4CC753AD.1090403@goop.org>
On 10/26/2010 03:18 PM, Jeremy Fitzhardinge wrote:
> We're seeing problems under Xen where large portions of the memory
> could be reserved (because they're not yet physically present, even
> though they appear in E820), and the 'start' and 'end' early_node_mem()
> is choosing is entirely within that reserved range.
>
> Also, the code seems dubious because it adjusts start and end without
> regarding how much space it is trying to allocate:
>
>     /* extend the search scope */
>     end = max_pfn_mapped << PAGE_SHIFT;
>     if (end > (MAX_DMA32_PFN<<PAGE_SHIFT))
>         start = MAX_DMA32_PFN<<PAGE_SHIFT;
>     else
>         start = MAX_DMA_PFN<<PAGE_SHIFT;
>
> what if max_pfn_mapped is only a few pages larger than MAX_DMA32_PFN,
> and that is smaller than the size it is trying to allocate?
>
> I tried just removing the start and end adjustments in early_node_mem()
> and the kernel booted fine under Xen, but it seemed to allocate at a
> very low address. Should the for_each_active_range_index_in_nid() loop
> in find_memory_core_early() be iterating from high to low addresses? If
> the allocation could be relied on to be top-down, then you wouldn't need
> to adjust start at all, and it would return the highest available memory
> in a natural way.
Please check:
[PATCH] x86, memblock: Fix early_node_mem with big reserved region.
Jeremy said Xen could reserve a huge memory region that still shows up as RAM in E820.
early_node_mem() could not find a range because of the start/end adjustment.
Let's use memblock_find_in_range() instead of the ***_node variant, so we get a real top-down search in the fallback path.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
diff --git a/arch/x86/mm/numa_64.c b/arch/x86/mm/numa_64.c
index 60f4985..7ffc9b7 100644
--- a/arch/x86/mm/numa_64.c
+++ b/arch/x86/mm/numa_64.c
@@ -178,11 +178,8 @@ static void * __init early_node_mem(int nodeid, unsigned long start,
 	/* extend the search scope */
 	end = max_pfn_mapped << PAGE_SHIFT;
-	if (end > (MAX_DMA32_PFN<<PAGE_SHIFT))
-		start = MAX_DMA32_PFN<<PAGE_SHIFT;
-	else
-		start = MAX_DMA_PFN<<PAGE_SHIFT;
-	mem = memblock_x86_find_in_range_node(nodeid, start, end, size, align);
+	start = MAX_DMA_PFN << PAGE_SHIFT;
+	mem = memblock_find_in_range(start, end, size, align);
 	if (mem != MEMBLOCK_ERROR)
 		return __va(mem);
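
To illustrate the case Jeremy raised, here is a simplified user-space model of the fallback path. It is only a sketch, not the kernel code: the struct range type, find_top_down(), FIND_ERROR, and the constant values (4 KiB pages, a 16 MiB DMA limit, a 4 GiB DMA32 limit) are assumptions made for the example. It compares the old choice of 'start' with the new one when max_pfn_mapped sits only a few pages above MAX_DMA32_PFN.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative values only (assumptions, not taken from kernel headers). */
#define PAGE_SHIFT    12
#define MAX_DMA_PFN   ((16ULL << 20) >> PAGE_SHIFT)  /* 16 MiB */
#define MAX_DMA32_PFN ((4ULL << 30) >> PAGE_SHIFT)   /* 4 GiB  */
#define FIND_ERROR    (~0ULL)  /* hypothetical stand-in for MEMBLOCK_ERROR */

/* One free (usable, unreserved) region [start, end). Hypothetical type. */
struct range {
	uint64_t start;
	uint64_t end;
};

/* Old policy: 'start' jumps to the DMA32 limit whenever 'end' is above it. */
uint64_t old_search_start(uint64_t max_pfn_mapped)
{
	uint64_t end = max_pfn_mapped << PAGE_SHIFT;

	if (end > (MAX_DMA32_PFN << PAGE_SHIFT))
		return MAX_DMA32_PFN << PAGE_SHIFT;
	return MAX_DMA_PFN << PAGE_SHIFT;
}

/* New policy from the patch: always start just above the DMA zone. */
uint64_t new_search_start(void)
{
	return MAX_DMA_PFN << PAGE_SHIFT;
}

/*
 * Simplified top-down search. free_r[] must be sorted by address and
 * non-overlapping. Walks the regions from highest to lowest and returns
 * the highest aligned address inside [start, end) where 'size' bytes
 * fit, or FIND_ERROR if no region is large enough.
 */
uint64_t find_top_down(const struct range *free_r, size_t n,
		       uint64_t start, uint64_t end,
		       uint64_t size, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);

	for (size_t i = n; i-- > 0; ) {
		uint64_t lo = free_r[i].start > start ? free_r[i].start : start;
		uint64_t hi = free_r[i].end < end ? free_r[i].end : end;
		uint64_t addr;

		if (hi < lo || hi - lo < size)
			continue;
		/* Place the block as high as possible, then align down. */
		addr = (hi - size) & ~(align - 1);
		if (addr >= lo)
			return addr;
	}
	return FIND_ERROR;
}
```

With free memory at [1 MiB, 2 GiB) and [4 GiB, 4 GiB + 8 KiB), and max_pfn_mapped = MAX_DMA32_PFN + 2, the old window is only 8 KiB wide, so a 64 KiB request fails. With 'start' at MAX_DMA_PFN << PAGE_SHIFT the same request falls through to the lower region and is placed at its top.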
Thread overview: 6+ messages
2010-10-26 22:18 early_node_mem()'s memory allocation policy Jeremy Fitzhardinge
2010-10-27 5:49 ` Yinghai Lu [this message]
2010-10-27 14:28 ` Konrad Rzeszutek Wilk
2010-10-27 20:21 ` Jeremy Fitzhardinge
2010-10-28 16:50 ` [PATCH] x86, memblock: Fix early_node_mem with big reserved region Yinghai Lu
2010-10-28 23:40 ` [tip:x86/urgent] " tip-bot for Yinghai Lu