From: Yinghai Lu <yinghai@kernel.org>
To: Ingo Molnar <mingo@elte.hu>, Thomas Gleixner <tglx@linutronix.de>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Jesse Barnes <jbarnes@virtuousgeek.org>,
	Christoph Lameter <cl@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org,
	Yinghai Lu <yinghai@kernel.org>
Subject: [PATCH 10/36] x86: make early_node_mem get mem > 4g if possible
Date: Wed, 20 Jan 2010 22:27:57 -0800
Message-ID: <1264055303-15123-11-git-send-email-yinghai@kernel.org>
In-Reply-To: <1264055303-15123-1-git-send-email-yinghai@kernel.org>

Make early_node_mem allocate above 4G when possible, so that the pgdat
for the node is placed high and the later sparse vmemmap allocation gets
the section nr that it needs.

With this patch, RAM below 4G is no longer used by sparse vmemmap.

Before this patch, the free list just before swiotlb tries to get bootmem is:
[    0.000000] nid=1 start=0 end=2080000 aligned=1
[    0.000000]   free [10 - 96]
[    0.000000]   free [b12 - 1000]
[    0.000000]   free [359f - 38a3]
[    0.000000]   free [38b5 - 3a00]
[    0.000000]   free [41e01 - 42000]
[    0.000000]   free [73dde - 73e00]
[    0.000000]   free [73fdd - 74000]
[    0.000000]   free [741dd - 74200]
[    0.000000]   free [743dd - 74400]
[    0.000000]   free [745dd - 74600]
[    0.000000]   free [747dd - 74800]
[    0.000000]   free [749dd - 74a00]
[    0.000000]   free [74bdd - 74c00]
[    0.000000]   free [74ddd - 74e00]
[    0.000000]   free [74fdd - 75000]
[    0.000000]   free [751dd - 75200]
[    0.000000]   free [753dd - 75400]
[    0.000000]   free [755dd - 75600]
[    0.000000]   free [757dd - 75800]
[    0.000000]   free [759dd - 75a00]
[    0.000000]   free [75bdd - 7bf5f]
[    0.000000]   free [7f730 - 7f750]
[    0.000000]   free [100000 - 2080000]
[    0.000000]   total free 1f87170
[   93.301474] Placing 64MB software IO TLB between ffff880075bdd000 - ffff880079bdd000
[   93.311814] software IO TLB at phys 0x75bdd000 - 0x79bdd000

With this patch, the free list just before swiotlb tries to get bootmem is:
[    0.000000] nid=1 start=0 end=2080000 aligned=1
[    0.000000]   free [a - 96]
[    0.000000]   free [702 - 1000]
[    0.000000]   free [359f - 3600]
[    0.000000]   free [37de - 3800]
[    0.000000]   free [39dd - 3a00]
[    0.000000]   free [3bdd - 3c00]
[    0.000000]   free [3ddd - 3e00]
[    0.000000]   free [3fdd - 4000]
[    0.000000]   free [41dd - 4200]
[    0.000000]   free [43dd - 4400]
[    0.000000]   free [45dd - 4600]
[    0.000000]   free [47dd - 4800]
[    0.000000]   free [49dd - 4a00]
[    0.000000]   free [4bdd - 4c00]
[    0.000000]   free [4ddd - 4e00]
[    0.000000]   free [4fdd - 5000]
[    0.000000]   free [51dd - 5200]
[    0.000000]   free [53dd - 5400]
[    0.000000]   free [55dd - 7bf5f]
[    0.000000]   free [7f730 - 7f750]
[    0.000000]   free [100428 - 100600]
[    0.000000]   free [13ea01 - 13ec00]
[    0.000000]   free [170800 - 2080000]
[    0.000000]   total free 1f87170

[   92.689485] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
[   92.699799] Placing 64MB software IO TLB between ffff8800055dd000 - ffff8800095dd000
[   92.710916] software IO TLB at phys 0x55dd000 - 0x95dd000

So enough space stays free below 4G, i.e. below pfn 0x100000.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
 arch/x86/mm/numa_64.c |   23 ++++++++++++++++++-----
 1 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/arch/x86/mm/numa_64.c b/arch/x86/mm/numa_64.c
index 3232148..02f13cb 100644
--- a/arch/x86/mm/numa_64.c
+++ b/arch/x86/mm/numa_64.c
@@ -163,14 +163,27 @@ static void * __init early_node_mem(int nodeid, unsigned long start,
 				    unsigned long end, unsigned long size,
 				    unsigned long align)
 {
-	unsigned long mem = find_e820_area(start, end, size, align);
+	unsigned long mem;
 
+	/*
+	 * put it as high as possible
+	 * something will go with NODE_DATA
+	 */
+	if (start < (MAX_DMA_PFN<<PAGE_SHIFT))
+		start = MAX_DMA_PFN<<PAGE_SHIFT;
+	if (start < (MAX_DMA32_PFN<<PAGE_SHIFT) &&
+	    end > (MAX_DMA32_PFN<<PAGE_SHIFT))
+		start = MAX_DMA32_PFN<<PAGE_SHIFT;
+	mem = find_e820_area(start, end, size, align);
 	if (mem != -1L)
 		return __va(mem);
 
-
-	start = __pa(MAX_DMA_ADDRESS);
-	end = max_low_pfn_mapped << PAGE_SHIFT;
+	/* extend the search scope */
+	end = max_pfn_mapped << PAGE_SHIFT;
+	if (end > (MAX_DMA32_PFN<<PAGE_SHIFT))
+		start = MAX_DMA32_PFN<<PAGE_SHIFT;
+	else
+		start = MAX_DMA_PFN<<PAGE_SHIFT;
 	mem = find_e820_area(start, end, size, align);
 	if (mem != -1L)
 		return __va(mem);
-- 
1.6.4.2
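
The search-window selection in the hunk above can be modelled in user
space. This is only a sketch: PAGE_SHIFT, MAX_DMA_PFN and MAX_DMA32_PFN are
assumed to have their usual x86_64 values (4 KiB pages, 16 MiB and 4 GiB
limits), and first_pass_start(), fallback_start() and the mapped_end
parameter are hypothetical names that do not exist in the kernel.

```c
#include <stdint.h>

/* Assumed x86_64 values; the kernel takes these from its own headers. */
#define PAGE_SHIFT     12                             /* 4 KiB pages */
#define MAX_DMA_PFN    ((16ULL << 20) >> PAGE_SHIFT)  /* 16 MiB boundary */
#define MAX_DMA32_PFN  ((4ULL << 30) >> PAGE_SHIFT)   /* 4 GiB boundary, pfn 0x100000 */

/*
 * Hypothetical helper: lowest physical address the first
 * find_e820_area() call is allowed to search from, given the
 * node's [start, end) range in bytes.
 */
static uint64_t first_pass_start(uint64_t start, uint64_t end)
{
	/* never search below the 16 MiB DMA boundary */
	if (start < (MAX_DMA_PFN << PAGE_SHIFT))
		start = MAX_DMA_PFN << PAGE_SHIFT;
	/* if the node extends above 4 GiB, begin the search there */
	if (start < (MAX_DMA32_PFN << PAGE_SHIFT) &&
	    end > (MAX_DMA32_PFN << PAGE_SHIFT))
		start = MAX_DMA32_PFN << PAGE_SHIFT;
	return start;
}

/*
 * Hypothetical helper: lowest address for the fallback search.
 * mapped_end stands in for max_pfn_mapped << PAGE_SHIFT.
 */
static uint64_t fallback_start(uint64_t mapped_end)
{
	if (mapped_end > (MAX_DMA32_PFN << PAGE_SHIFT))
		return MAX_DMA32_PFN << PAGE_SHIFT;
	return MAX_DMA_PFN << PAGE_SHIFT;
}
```

For the node in the log above (start=0, end=0x2080000 as pfns),
first_pass_start(0, 0x2080000ULL << PAGE_SHIFT) yields 0x100000000, i.e.
pfn 0x100000, so the first search only looks above 4G. A node that ends
below 4G keeps a window starting at 16 MiB.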

