From: Yinghai Lu <yhlu.kernel@gmail.com>
To: Ingo Molnar <mingo@elte.hu>, Thomas Gleixner <tglx@linutronix.de>,
"H. Peter Anvin" <hpa@zytor.com>,
Andrew Morton <akpm@linux-foundation.org>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: [PATCH] x86: numa32 use find_e820_area to find KVA ram on node
Date: Fri, 6 Jun 2008 18:53:33 -0700
Message-ID: <200806061853.33968.yhlu.kernel@gmail.com>
In-Reply-To: <200806061443.57488.yhlu.kernel@gmail.com>
Don't assume we can use the RAM near the end of every node,
especially on systems with less memory, where the KVA address
range and the KVA RAM could both lie below max_low_pfn.
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Index: linux-2.6/arch/x86/mm/discontig_32.c
===================================================================
--- linux-2.6.orig/arch/x86/mm/discontig_32.c
+++ linux-2.6/arch/x86/mm/discontig_32.c
@@ -225,17 +225,21 @@ static unsigned long calculate_numa_rema
{
int nid;
unsigned long size, reserve_pages = 0;
- unsigned long pfn;
for_each_online_node(nid) {
- unsigned old_end_pfn = node_end_pfn[nid];
+ u64 node_end_target;
+ u64 node_end_final;
/*
* The acpi/srat node info can show hot-add memroy zones
* where memory could be added but not currently present.
*/
+ printk("node %d pfn: [%lx - %lx]\n",
+ nid, node_start_pfn[nid], node_end_pfn[nid]);
if (node_start_pfn[nid] > max_pfn)
continue;
+ if (!node_end_pfn[nid])
+ continue;
if (node_end_pfn[nid] > max_pfn)
node_end_pfn[nid] = max_pfn;
@@ -247,37 +251,40 @@ static unsigned long calculate_numa_rema
/* now the roundup is correct, convert to PAGE_SIZE pages */
size = size * PTRS_PER_PTE;
- /*
- * Validate the region we are allocating only contains valid
- * pages.
- */
- for (pfn = node_end_pfn[nid] - size;
- pfn < node_end_pfn[nid]; pfn++)
- if (!page_is_ram(pfn))
- break;
+ node_end_target = round_down(node_end_pfn[nid] - size,
+ PTRS_PER_PTE);
+ node_end_target <<= PAGE_SHIFT;
+ do {
+ node_end_final = find_e820_area(node_end_target,
+ ((u64)node_end_pfn[nid])<<PAGE_SHIFT,
+ ((u64)size)<<PAGE_SHIFT,
+ LARGE_PAGE_BYTES);
+ node_end_target -= LARGE_PAGE_BYTES;
+ } while (node_end_final == -1ULL &&
+ (node_end_target>>PAGE_SHIFT) > (node_start_pfn[nid]));
- if (pfn != node_end_pfn[nid])
- size = 0;
+ if (node_end_final == -1ULL)
+ panic("Can not get kva ram\n");
printk("Reserving %ld pages of KVA for lmem_map of node %d\n",
size, nid);
node_remap_size[nid] = size;
node_remap_offset[nid] = reserve_pages;
reserve_pages += size;
- printk("Shrinking node %d from %ld pages to %ld pages\n",
- nid, node_end_pfn[nid], node_end_pfn[nid] - size);
+ printk("Shrinking node %d from %ld pages to %lld pages\n",
+ nid, node_end_pfn[nid], node_end_final>>PAGE_SHIFT);
- if (node_end_pfn[nid] & (PTRS_PER_PTE-1)) {
- /*
- * Align node_end_pfn[] and node_remap_start_pfn[] to
- * pmd boundary. remap_numa_kva will barf otherwise.
- */
- printk("Shrinking node %d further by %ld pages for proper alignment\n",
- nid, node_end_pfn[nid] & (PTRS_PER_PTE-1));
- size += node_end_pfn[nid] & (PTRS_PER_PTE-1);
- }
+ /*
+ * Reserve the KVA RAM if it lies below max_low_pfn, so that the
+ * later search for the KVA address range on systems with less
+ * memory does not pick it. Layout will be: KVA address, KVA RAM
+ */
+ if ((node_end_final>>PAGE_SHIFT) < max_low_pfn)
+ reserve_early(node_end_final,
+ node_end_final+(((u64)size)<<PAGE_SHIFT),
+ "KVA RAM");
- node_end_pfn[nid] -= size;
+ node_end_pfn[nid] = node_end_final>>PAGE_SHIFT;
node_remap_start_pfn[nid] = node_end_pfn[nid];
shrink_active_range(nid, node_end_pfn[nid]);
}
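
For readers who want to try the search outside the kernel, here is a rough user-space sketch of the downward search the patch performs. It is not the kernel code: find_area() is a simplified, hypothetical stand-in for find_e820_area() (it ignores early reservations), the RAM map is made up, PTRS_PER_PTE is assumed to be the non-PAE value 1024, and the wrap-around guard in the loop is an addition that the patch does not have.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT        12
#define PTRS_PER_PTE      1024ULL                       /* non-PAE value, assumed */
#define LARGE_PAGE_BYTES  (PTRS_PER_PTE << PAGE_SHIFT)  /* 4 MiB under that assumption */
#define NOT_FOUND         (~0ULL)                       /* stands in for -1ULL */

/*
 * Hypothetical RAM map, [start, end) in bytes. The top 8 MiB of a
 * 512 MiB node is not RAM here, so "take the end of the node" fails.
 */
struct ram_range { uint64_t start, end; };
static const struct ram_range ram_map[] = {
	{ 0x00000000ULL, 0x0009f000ULL },
	{ 0x00100000ULL, 0x1f800000ULL },
};

static uint64_t round_up_u64(uint64_t x, uint64_t align)
{
	return (x + align - 1) / align * align;
}

/*
 * Simplified stand-in for find_e820_area(): lowest aligned block of
 * `size` bytes that lies inside one RAM range and inside [start, end).
 */
uint64_t find_area(uint64_t start, uint64_t end, uint64_t size, uint64_t align)
{
	for (unsigned i = 0; i < sizeof(ram_map) / sizeof(ram_map[0]); i++) {
		uint64_t addr = round_up_u64(ram_map[i].start, align);

		if (addr < start)
			addr = round_up_u64(start, align);
		if (addr >= ram_map[i].end)
			continue;
		if (addr + size > ram_map[i].end || addr + size > end)
			continue;
		return addr;
	}
	return NOT_FOUND;
}

/*
 * Downward search as in the patch: start one block below the node end
 * and lower the search floor by LARGE_PAGE_BYTES until a block fits.
 */
uint64_t find_kva_ram(uint64_t node_start_pfn, uint64_t node_end_pfn,
		      uint64_t size_pages)
{
	uint64_t target, final;

	/* the patch computes size as a multiple of PTRS_PER_PTE */
	assert(size_pages % PTRS_PER_PTE == 0);
	if (size_pages > node_end_pfn)
		return NOT_FOUND;

	target = (node_end_pfn - size_pages) / PTRS_PER_PTE * PTRS_PER_PTE;
	target <<= PAGE_SHIFT;
	do {
		final = find_area(target, node_end_pfn << PAGE_SHIFT,
				  size_pages << PAGE_SHIFT, LARGE_PAGE_BYTES);
		if (target < LARGE_PAGE_BYTES)
			break;	/* wrap-around guard, not in the patch */
		target -= LARGE_PAGE_BYTES;
	} while (final == NOT_FOUND &&
		 (target >> PAGE_SHIFT) > node_start_pfn);

	return final;
}
```

With the map above, find_kva_ram(0, 0x20000, 1024) returns 0x1f400000: the first two candidates at the top of the node are not RAM, and the third one fits. The old page_is_ram() loop would have set size to 0 for such a node instead of looking lower.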
Thread overview: 19+ messages
2008-06-03 17:25 [PATCH] x86: early check if one system is numaq v2 Yinghai Lu
2008-06-04 2:32 ` [PATCH] x86: numa32 make sure get kva space Yinghai Lu
2008-06-04 10:26 ` Ingo Molnar
2008-06-04 2:34 ` [PATCH] x86: move e820_register_active to e820.c Yinghai Lu
2008-06-04 2:35 ` [PATCH] x86: 32 bit use e820_register_active_regions Yinghai Lu
2008-06-04 7:39 ` [PATCH] x86: e820 merge parse mem/memmap Yinghai Lu
2008-06-04 10:27 ` [PATCH] x86: 32 bit use e820_register_active_regions Ingo Molnar
2008-06-04 20:21 ` [PATCH] x86: e820 max_arch_pfn typo fix for 64 bit Yinghai Lu
2008-06-04 22:47 ` H. Peter Anvin
2008-06-06 21:43 ` [PATCH] x86: shrink pages should check all Yinghai Lu
2008-06-07 1:53 ` Yinghai Lu [this message]
2008-06-10 9:53 ` [PATCH] x86: numa32 use find_e820_area to find KVA ram on node Ingo Molnar
2008-06-07 1:54 ` [PATCH] x86: fix fail with 64g above system with numa32 Yinghai Lu
2008-06-10 9:53 ` Ingo Molnar
2008-06-09 2:39 ` [PATCH] x86: shrink pages should check all v2 Yinghai Lu
2008-06-09 10:15 ` Ingo Molnar
2008-06-10 19:55 ` [PATCH] x86: e820 merge parse mem/memmap Yinghai Lu
2008-06-04 10:26 ` [PATCH] x86: move e820_register_active to e820.c Ingo Molnar
2008-06-04 10:25 ` [PATCH] x86: early check if one system is numaq v2 Ingo Molnar