From: Yinghai Lu <yhlu.kernel@gmail.com>
To: Ingo Molnar <mingo@elte.hu>, Thomas Gleixner <tglx@linutronix.de>,
"H. Peter Anvin" <hpa@zytor.com>,
Andrew Morton <akpm@linux-foundation.org>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: [PATCH] x86: numa32 use find_e820_area to find KVA ram on node
Date: Fri, 6 Jun 2008 18:53:33 -0700 [thread overview]
Message-ID: <200806061853.33968.yhlu.kernel@gmail.com> (raw)
In-Reply-To: <200806061443.57488.yhlu.kernel@gmail.com>
don't assume we can use ram near end of every node.
esp some system has less memory. and they could have
kva address and kva ram all below max_low_pfn
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Index: linux-2.6/arch/x86/mm/discontig_32.c
===================================================================
--- linux-2.6.orig/arch/x86/mm/discontig_32.c
+++ linux-2.6/arch/x86/mm/discontig_32.c
@@ -225,17 +225,21 @@ static unsigned long calculate_numa_rema
{
int nid;
unsigned long size, reserve_pages = 0;
- unsigned long pfn;
for_each_online_node(nid) {
- unsigned old_end_pfn = node_end_pfn[nid];
+ u64 node_end_target;
+ u64 node_end_final;
/*
* The acpi/srat node info can show hot-add memroy zones
* where memory could be added but not currently present.
*/
+ printk("node %d pfn: [%lx - %lx]\n",
+ nid, node_start_pfn[nid], node_end_pfn[nid]);
if (node_start_pfn[nid] > max_pfn)
continue;
+ if (!node_end_pfn[nid])
+ continue;
if (node_end_pfn[nid] > max_pfn)
node_end_pfn[nid] = max_pfn;
@@ -247,37 +251,40 @@ static unsigned long calculate_numa_rema
/* now the roundup is correct, convert to PAGE_SIZE pages */
size = size * PTRS_PER_PTE;
- /*
- * Validate the region we are allocating only contains valid
- * pages.
- */
- for (pfn = node_end_pfn[nid] - size;
- pfn < node_end_pfn[nid]; pfn++)
- if (!page_is_ram(pfn))
- break;
+ node_end_target = round_down(node_end_pfn[nid] - size,
+ PTRS_PER_PTE);
+ node_end_target <<= PAGE_SHIFT;
+ do {
+ node_end_final = find_e820_area(node_end_target,
+ ((u64)node_end_pfn[nid])<<PAGE_SHIFT,
+ ((u64)size)<<PAGE_SHIFT,
+ LARGE_PAGE_BYTES);
+ node_end_target -= LARGE_PAGE_BYTES;
+ } while (node_end_final == -1ULL &&
+ (node_end_target>>PAGE_SHIFT) > (node_start_pfn[nid]));
- if (pfn != node_end_pfn[nid])
- size = 0;
+ if (node_end_final == -1ULL)
+ panic("Can not get kva ram\n");
printk("Reserving %ld pages of KVA for lmem_map of node %d\n",
size, nid);
node_remap_size[nid] = size;
node_remap_offset[nid] = reserve_pages;
reserve_pages += size;
- printk("Shrinking node %d from %ld pages to %ld pages\n",
- nid, node_end_pfn[nid], node_end_pfn[nid] - size);
+ printk("Shrinking node %d from %ld pages to %lld pages\n",
+ nid, node_end_pfn[nid], node_end_final>>PAGE_SHIFT);
- if (node_end_pfn[nid] & (PTRS_PER_PTE-1)) {
- /*
- * Align node_end_pfn[] and node_remap_start_pfn[] to
- * pmd boundary. remap_numa_kva will barf otherwise.
- */
- printk("Shrinking node %d further by %ld pages for proper alignment\n",
- nid, node_end_pfn[nid] & (PTRS_PER_PTE-1));
- size += node_end_pfn[nid] & (PTRS_PER_PTE-1);
- }
+ /*
+ * prevent kva address below max_low_pfn want it on system
+ * with less memory later.
+ * layout will be: KVA address , KVA RAM
+ */
+ if ((node_end_final>>PAGE_SHIFT) < max_low_pfn)
+ reserve_early(node_end_final,
+ node_end_final+(((u64)size)<<PAGE_SHIFT),
+ "KVA RAM");
- node_end_pfn[nid] -= size;
+ node_end_pfn[nid] = node_end_final>>PAGE_SHIFT;
node_remap_start_pfn[nid] = node_end_pfn[nid];
shrink_active_range(nid, node_end_pfn[nid]);
}
next prev parent reply other threads:[~2008-06-07 1:54 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-06-03 17:25 [PATCH] x86: early check if one system is numaq v2 Yinghai Lu
2008-06-04 2:32 ` [PATCH] x86: numa32 make sure get kva space Yinghai Lu
2008-06-04 10:26 ` Ingo Molnar
2008-06-04 2:34 ` [PATCH] x86: move e820_register_active to e820.c Yinghai Lu
2008-06-04 2:35 ` [PATCH] x86: 32 bit use e820_register_active_regions Yinghai Lu
2008-06-04 7:39 ` [PATCH] x86: e820 merge parse mem/memmap Yinghai Lu
2008-06-04 10:27 ` [PATCH] x86: 32 bit use e820_register_active_regions Ingo Molnar
2008-06-04 20:21 ` [PATCH] x86: e820 max_arch_pfn typo fix for 64 bit Yinghai Lu
2008-06-04 22:47 ` H. Peter Anvin
2008-06-06 21:43 ` [PATCH] x86: shrink pages should check all Yinghai Lu
2008-06-07 1:53 ` Yinghai Lu [this message]
2008-06-10 9:53 ` [PATCH] x86: numa32 use find_e820_area to find KVA ram on node Ingo Molnar
2008-06-07 1:54 ` [PATCH] x86: fix fail with 64g above system with numa32 Yinghai Lu
2008-06-10 9:53 ` Ingo Molnar
2008-06-09 2:39 ` [PATCH] x86: shrink pages should check all v2 Yinghai Lu
2008-06-09 10:15 ` Ingo Molnar
2008-06-10 19:55 ` [PATCH] x86: e820 merge parse mem/memmap Yinghai Lu
2008-06-04 10:26 ` [PATCH] x86: move e820_register_active to e820.c Ingo Molnar
2008-06-04 10:25 ` [PATCH] x86: early check if one system is numaq v2 Ingo Molnar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=200806061853.33968.yhlu.kernel@gmail.com \
--to=yhlu.kernel@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=hpa@zytor.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).